E0 259 : Data Analytics
Ramesh Hariharan (Strand Genomics and CSA, IISc) and Rajesh Sundaresan (ECE, IISc)
30 November 2016, 8:00 am onwards.
Venue: ECE 1.08.
Lectures: Tuesdays and Thursdays : 9:30 - 11:00 am, CSA 254 Seminar Hall
D. Thirumulanathan, Ankit Jauhari
Data sets from astronomy, genomics, visual neuroscience, sports, speech recognition, computational linguistics and social networks will be analysed to answer specific scientific questions.
Statistical tools and modeling techniques will be introduced as needed to analyse the data and eventually address the scientific question.
Data analytics has assumed increasing importance in recent times. Several industries are now built around the use of data for decision making. Several research areas, genomics and neuroscience being notable examples, increasingly rely on large-scale data generation, rather than small-scale experimentation, to generate initial hypotheses. This creates a need for data analytics. This course will develop modern statistical tools and modeling techniques through hands-on data analysis in a variety of application domains.
The course will illustrate the principles of hands-on data analytics through 8 to 10 case studies. On each topic, we will introduce a scientific question and discuss why it should be addressed. Next, we will present the available data and how it was collected. We will then discuss models, provide analyses, and finally touch upon how to address the scientific question using the analyses.
We plan to cover the following case studies.
- Astronomy: From Tycho Brahe's observations to the conclusion that Mars moves in an elliptical orbit.
- Visual Neuroscience: Neural correlates predict search difficulty.
- Genomics: Understanding the causes of cancer.
- Sports: The Duckworth-Lewis-Stern method for setting targets in shortened limited-overs cricket matches.
- Genomics: The basis for red-green colour blindness.
- Genomics: Evolutionary history of Indian caste populations.
- Signal Processing: Video background separation.
- Networks: Community detection.
- Recommendation systems.
- Networks: Functional connectivity patterns of the brain.
- Random Processes (E2 202) OR Probability and Statistics (E0 232) OR equivalent.
There will be about eight assignments, one on each topic. A fair amount of hands-on work is expected. Students may use R, Python, MATLAB, or a similar tool.
- 50/100 : Homeworks
- 20/100 : Scribing of lecture notes
- 30/100 : Course project and presentation
- There is no textbook for this course. Handouts from various sources will be provided.