Causality in Big Data
Project Description

The recent explosion of data in various fields demands new ways of thinking about how data can be understood and how useful relations can be extracted. Causality is a fundamental subject of study in every scientific discipline: it underpins research in physics as well as all kinds of arguments in daily life. In this project we would like to use existing techniques such as Granger causality, information flow, and transfer entropy to analyze data ranging from heartbeats, breathing patterns during sleep, and brain signals to financial time series and communication data on social networks. We hope to draw on statistical physics, information theory, and computer science to approach the problem of finding causal relations in big data.
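
As a rough illustration of one of the techniques named above, the following Python sketch estimates transfer entropy between two discrete time series. It is a minimal plug-in (frequency-count) estimator under simplifying assumptions that are not part of the project description: the series are already discretized into symbols, the history length is 1, and the function name transfer_entropy and the sample data are hypothetical.

```python
import math
from collections import Counter

def transfer_entropy(x, y):
    """Plug-in estimate of transfer entropy from x to y, in bits.

    Assumptions (illustrative only): x and y are equal-length sequences
    of discrete symbols, and a history length of 1 is used.
    TE = sum p(y1, y0, x0) * log2( p(y1 | y0, x0) / p(y1 | y0) )
    """
    n = len(y) - 1                                   # number of transitions
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))    # (y_{t+1}, y_t, x_t)
    pairs_yx = Counter(zip(y[:-1], x[:-1]))          # (y_t, x_t)
    pairs_yy = Counter(zip(y[1:], y[:-1]))           # (y_{t+1}, y_t)
    singles = Counter(y[:-1])                        # y_t
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_joint = c / n
        p_given_both = c / pairs_yx[(y0, x0)]
        p_given_self = pairs_yy[(y1, y0)] / singles[y0]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te

# Hypothetical toy data: y copies x with a delay of one step.
x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1]
y = [0] + x[:-1]
print("TE x -> y:", transfer_entropy(x, y))
print("TE y -> x:", transfer_entropy(y, x))
```

Because y is a delayed copy of x in this toy example, knowing x removes all remaining uncertainty about the next value of y, so the estimate from x to y is positive. On real data, short series and fine discretization bias plug-in estimates, which is one of the trade-offs such a project would need to examine.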

Supervisor
SZETO Kwok Yip
Quota
2
Course type
UROP1000
UROP1100
UROP2100
UROP3100
UROP4100
Applicant's Roles

Students will first learn some basic tools of statistical physics, such as information entropy, before analyzing data. The type of data used depends on its availability in the public domain and on the interest of the student. The purpose is to equip students with mathematical tools of analysis, as well as model-building techniques in physics.

Applicant's Learning Objectives

Students will learn information entropy, its uses, and its relation to physics.
Data analysis techniques will be built from scratch, rather than taken from available packages.
Students will also understand the pros and cons of various techniques used in data mining for causal relations.
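
To give a sense of what building a technique without a ready-made package might look like, here is a minimal Python sketch of Shannon information entropy, H = -sum_i p_i log2 p_i, estimated from symbol frequencies. The function name shannon_entropy and the sample inputs are hypothetical and only illustrate the idea.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Plug-in estimate of Shannon entropy, in bits, of a symbol sequence."""
    counts = Counter(symbols)
    n = len(symbols)
    # Probabilities are estimated as relative frequencies of each symbol.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical examples: a two-symbol sequence with equal frequencies,
# and a sequence with a single repeated symbol.
print(shannon_entropy("aabb"))
print(shannon_entropy("aaaa"))
```

A sequence with two equally frequent symbols carries one bit per symbol, while a constant sequence carries none. The same counting approach extends to the joint and conditional entropies used in causal measures.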

Complexity of the project
Moderate