
Data Analysis II (MADII)

Spring 2022

Course description:

The course focuses on more advanced algorithms for network analysis. The lectures explain the principles behind the individual algorithms so that students can assess which methods are suitable for a given task. In the seminars, students run experiments with selected datasets and tools.



Lectures and Seminars:


Literature and sources:

  • Literature    
  •     Ian H. Witten, Eibe Frank, Mark A. Hall. Data Mining: Practical Machine Learning Tools and Techniques (Third Edition). The Morgan Kaufmann Series in Data Management Systems, 2011. ISBN 978-0123748560.
  •     Zaki, M. J., Meira Jr, W. (2014). Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press.
  •     Bramer, M. (2013). Principles of data mining. Springer.
  •     Albert-László Barabási. Network Science.
  •     Mark Newman. Networks: An Introduction. Oxford University Press, 2010. ISBN 978-0199206650.
  • Tools for network analysis and visualization
  •     Pajek - Program for Large Network Analysis
  •     NodeXL - Template for Excel
  •     SNAP - Stanford Network Analysis Project
  •     Gephi, Graphviz, etc.
  •     Visual Complexity
  •     D3.js - JavaScript library for manipulating documents based on data
 

Course Outline:

  1. Data for data mining, types and sources of data
  2. Attributes and their types, sparse data, incomplete and inaccurate data
  3. Algebraic and geometric interpretation of data
  4. Probabilistic interpretation of data
  5. Numerical and categorical attributes, basic analytical approaches
  6. Data mining, pre-processing and data cleaning
  7. Data representation
  8. Foundations of data analysis (classification, clustering)
  9. Networks and their properties
  10. Types of networks and their representation
  11. Basic measures and metrics
  12. Structure and global properties of networks
  13. Basic data structures for network representation
  14. Basic algorithms for network analysis
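
As a rough illustration of outline items 11, 13, and 14, the following minimal Python sketch stores a small undirected network as an adjacency list, computes two basic measures (node degree and density), and runs breadth-first search to obtain shortest-path distances. The function names and the toy edge list are hypothetical and are not taken from the course materials. The sketch assumes a simple graph with no self-loops or parallel edges.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency list (dict of sets) from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degrees(adj):
    """Degree of each node: the number of its neighbours."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def density(adj):
    """Share of possible edges that are present: 2m / (n (n - 1))."""
    n = len(adj)
    # Each undirected edge appears in two neighbour sets.
    m = sum(len(neighbours) for neighbours in adj.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def bfs_distances(adj, source):
    """Shortest-path distance (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Hypothetical toy network used only for illustration.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]
adj = build_adjacency(edges)

print(degrees(adj))
print(density(adj))
print(bfs_distances(adj, "a"))
```

A dict of sets keeps neighbour lookups and insertions cheap for sparse networks. An adjacency matrix would be an alternative for small, dense networks.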