Data Analysis II (MADII)
Spring 2022
Course description:
The course focuses on more advanced algorithms of network analysis. The lectures explain the principles behind the individual algorithms so that students can assess which method suits a given task. In seminars, students run experiments with selected datasets and tools.
- Ongoing activities: 19-36 points
- Implementation: 12-24 points
- Data analysis: 10-20 points
- Written exam: 10-20 points
- DATA ANALYSIS II Tasks.pdf
Lectures and Seminars:
Literature and sources:
Literature
- Ian H. Witten, Eibe Frank, Mark A. Hall. Data Mining: Practical Machine Learning Tools and Techniques (Third Edition). The Morgan Kaufmann Series in Data Management Systems, 2011. ISBN 978-0123748560.
- Zaki, M. J., Meira Jr, W. (2014). Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press.
- Bramer, M. (2013). Principles of data mining. Springer.
- Albert-Laszlo Barabasi. Network Science.
- Mark Newman. Networks: An Introduction. Oxford University Press, 2010. ISBN 978-0199206650.
Tools for network analysis and visualization
- Pajek - Program for Large Network Analysis
- NodeXL - Template for Excel
- SNAP - Stanford Network Analysis Project
- Gephi, Graphviz, etc.
- Visual Complexity
- D3.js - JavaScript library for manipulating documents based on data
Course Outline:
- Data for data mining, types and sources of data
- Attributes and their types, sparse data, incomplete and inaccurate data
- Algebraic and geometric interpretation of data
- Probabilistic interpretation of data
- Numerical and categorical attributes, the basic analytical approaches
- Data mining, pre-processing and data cleaning
- Data representation
- Foundations of data analysis (classification, clustering)
- Networks and their properties
- Types of networks and their representation
- Basic measures and metrics
- Structure and global properties of networks
- Basic data structures for network representation
- Basic algorithms for network analysis