Big data course 2019

Authors: Marc & Liubov (network theory) | Anirudh & Felix (Big data in mental health) | Loic (data management)

Let us know more about you in this form!

This is the repository of the CRI Digital Master "Big data" course for Fall 2019, taught at CRI (cri-paris.org).

The repository for the whole course is here: https://github.com/Big-data-course-CRI/materials_big_data_cri_2019

Course description on network theory

This course will provide an introduction to the field of big data, with a focus on network data and data for mental health. Topics will cover data project management, infrastructure of big data, data analysis and visualisation, and mental health data. The course will be divided into two parts: big data and network data.

Why focus on network data? Over the past century, network studies have had a significant impact in disciplines as varied as mathematics, sociology, physics, biology, computer science and quantitative geography, giving birth to Network Science as a field in its own right. With the rise of social networks over the last decade, their use has become widespread in the digital world. Here we will provide an introduction to the field of Network Science, from the theoretical foundations (generating, analysing, perturbing networks) to the practical hands-on part (analysis and visualisation of real-world networks).

Network topics will cover

(Marc, Liubov)

  1. How to construct networks from real data?
  2. How to analyze networks? (centrality measures, community detection, statistical analyses etc.)
  3. How to visualise networks?
  4. Dynamics and spreading phenomena on networks (epidemics / information spreading, diffusion)
  5. How does network wiring change over time? (network robustness, temporal networks)
  6. How to represent more complex network data? Multilayer, multiplex networks.
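
As a rough illustration of topics 1, 2 and 5, the sketch below builds an undirected network from a small made-up edge list, computes degree centrality, and checks how the largest connected component shrinks when the most central node is removed. The edge list and the function names (`build_graph`, `degree_centrality`, `largest_component_size`) are hypothetical and use only the Python standard library; for real projects a dedicated graph library would likely be the more practical choice.

```python
from collections import defaultdict, deque

# Hypothetical toy edge list (undirected); real data would come from a file.
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e"), ("e", "f")]


def build_graph(edges):
    """Return an adjacency dict {node: set(neighbours)} for an undirected graph."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return dict(graph)


def degree_centrality(graph):
    """Degree of each node divided by (n - 1), the maximum possible degree."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def largest_component_size(graph, removed=frozenset()):
    """Size of the largest connected component, ignoring nodes in `removed`."""
    seen = set(removed)
    best = 0
    for start in graph:
        if start in seen:
            continue
        # Breadth-first search from each node not yet visited.
        queue = deque([start])
        seen.add(start)
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        best = max(best, size)
    return best


graph = build_graph(EDGES)
centrality = degree_centrality(graph)
hub = max(centrality, key=centrality.get)  # node with the highest degree centrality
print(hub, centrality[hub])
print(largest_component_size(graph))
print(largest_component_size(graph, removed={hub}))
```

In this toy example the hub is node `a`, and removing it reduces the largest connected component from 6 nodes to 3, which is a very simple form of the robustness question in topic 5.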

Students will select, analyse and present a network of their choice as part of a personal project for the course. They will also choose an advanced topic in network science & big data, on which they will give a presentation in a flipped classroom setting. In particular, they will contribute to a Wikipedia page about that topic.

Data Efforts in the Mental Health part:

In this part, students will be presented with topics related to the infrastructure of ‘big data’. They will be introduced to the barriers, current trends, types, protocols and importance of ‘big data’ collection in the sphere of mental health, specifically through (i) the Healthy Brain Network project, which collects and shares neuroimaging & phenotypic data for 10,000 children. Students will also contribute to the development of (ii) a Linked Semantic Mental Health Database and scientific framework mapping signs, symptoms and behaviors to subjective and objective measures, projects and technologies (https://github.com/ChildMindInstitute/mhdb/wiki), and (iii) the MindLogger Data Collection Platform & App, which aims to dramatically improve the convenience, consistency, efficiency, accuracy & analysis of widely distributed data efforts (https://mindlogger.org/).

Students will then spend the last part of the course working on a research project developing and applying digital tools related to ‘big data’ and mental health, using the skills obtained from the first part of the course.

Resources

Introductory material on networks:

Big Data & Mental Health:

(Felix, Anirudh)

How To Use Github

Hello World guide: https://guides.github.com/activities/hello-world/

Executing notebooks from GitHub: https://mybinder.org/

Data

Network databases

Index of Complex Networks (ICON): https://icon.colorado.edu/ (5,000+ networks)

Network repository: http://networkrepository.com/ (offers many visualisation tools directly on the website)

On GitHub

Deezer Social Networks, Facebook Page-Page Networks, Wikipedia Article Networks: https://github.com/benedekrozemberczki/datasets

A Repository of Benchmark Graph Datasets for Graph Classification (31 graph datasets in total): https://github.com/shiruipan/graph_datasets

Repository of network repositories: https://github.com/ComplexNetTSP/ComplexNetWiki/wiki/Networks-datasets
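
Network datasets like those listed above are often distributed as plain-text edge lists, one edge per line. The following is a minimal sketch of reading such a file with Python's standard `csv` module and computing a few basic statistics. The sample content, the comma delimiter, the `id_1,id_2` header and the helper names (`read_edge_list`, `summarize`) are assumptions for illustration, so check the actual format of each dataset before reusing it.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded edge-list file.
SAMPLE = """id_1,id_2
0,1
0,2
1,2
2,3
"""


def read_edge_list(handle, has_header=True):
    """Read (source, target) pairs from a comma-separated edge list."""
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the header row
    edges = []
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        edges.append((row[0].strip(), row[1].strip()))
    return edges


def summarize(edges):
    """Return node count, edge count and density of an undirected simple graph."""
    nodes = {node for edge in edges for node in edge}
    # Ignore self-loops and treat (u, v) and (v, u) as the same edge.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    n, m = len(nodes), len(unique_edges)
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    return {"nodes": n, "edges": m, "density": density}


edges = read_edge_list(io.StringIO(SAMPLE))
summary = summarize(edges)
print(summary)
```

For a file on disk, the same function can be called with `open(path, newline="")` in place of the `io.StringIO` wrapper.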

Resources

  1. Introductory interactive textbook by A-L Barabasi: http://networksciencebook.com/
     Chapter 2 for network metrics
     Chapter 9 for community detection
  2. An introduction to network visualisation:
     BASIC
       Gephi: http://www.martingrandjean.ch/gephi-introduction/
     INTERMEDIATE
       Cytoscape: https://github.com/cytoscape/cytoscape-tutorials/wiki
     ADVANCED
       R: https://kateto.net/network-visualization
       Python: https://www.analyticsvidhya.com/blog/2018/04/introduction-to-graph-theory-network-analysis-python-codes/
       Cytoscape.js: https://blog.js.cytoscape.org/2016/05/24/getting-started/
       D3.js: https://www.d3-graph-gallery.com/network