Data Mining Lab
How do we find useful patterns and anomalies in big data? How do we handle data too large to fit in the memory or on the disks of a single machine? How do we analyze high-velocity data streams? In the Data Mining Lab, we research algorithms, systems, and discoveries for extremely scalable data analysis, with applications to knowledge discovery and anomaly detection. Our main research topics are graph mining, tensor analysis, scalable machine learning, and stream mining.
Graph Mining
How can we find patterns and anomalies in large graphs that do not fit in the memory or on the disks of a single machine? We develop algorithms and systems to analyze large graphs, such as social networks or the Web, and to find important patterns and anomalies in them. Specifically, we work on scalable graph mining platforms, graph compression, triangle analysis, graph analysis, anomaly detection, and random walks on graphs.
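As a small illustration of what triangle analysis computes, the sketch below counts triangles in an undirected graph that fits in memory. It is a textbook single-machine approach and not the lab's scalable method. The function name `count_triangles` and the toy edge list are assumptions made for this example.

```python
def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs.

    Illustrative in-memory version; node ids must be mutually comparable.
    """
    # Build adjacency sets, ignoring self-loops and duplicate edges.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    triangles = 0
    for u, nbrs in adj.items():
        for v in nbrs:
            if u < v:
                # Count common neighbors w with u < v < w, so that each
                # triangle is counted exactly once.
                triangles += sum(1 for w in nbrs & adj[v] if w > v)
    return triangles


# Toy graph (made up for illustration).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4), (4, 5)]
print(count_triangles(edges))
```

Here the adjacency sets of all nodes are held in memory at once, which is exactly what stops working for graphs larger than a single machine.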
Tensor Analysis
How can we analyze multi-dimensional data, such as network intrusion logs (source-ip, target-ip, port-number, timestamp) or social networks over time (sender, receiver, time)? Tensors are well suited to modeling such multi-dimensional data, and we work on scalable tensor analysis algorithms: scalable eigensolvers and scalable tensor decomposition algorithms.
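To make the tensor view concrete, the sketch below turns log records of the form (source-ip, target-ip, port-number, timestamp) into a sparse 4-mode count tensor in coordinate format. The helper name `build_sparse_tensor`, the dictionary-based storage, and the sample records are all assumptions for illustration. A decomposition algorithm would take such a tensor as input, and none is implemented here.

```python
from collections import Counter


def build_sparse_tensor(records):
    """Map records (one field per mode) to a sparse count tensor.

    Returns (entries, shape, index_maps), where entries maps a coordinate
    tuple to the number of records observed at that coordinate.
    """
    n_modes = len(records[0])
    # One value -> index dictionary per mode (e.g. IP address -> row index).
    index_maps = [{} for _ in range(n_modes)]
    entries = Counter()
    for record in records:
        coords = tuple(
            index_maps[m].setdefault(value, len(index_maps[m]))
            for m, value in enumerate(record)
        )
        entries[coords] += 1
    shape = tuple(len(index_map) for index_map in index_maps)
    return entries, shape, index_maps


# Hypothetical intrusion-log records: (source-ip, target-ip, port, timestamp).
records = [
    ("192.0.2.1", "192.0.2.9", 22, "t0"),
    ("192.0.2.1", "192.0.2.9", 22, "t0"),
    ("192.0.2.2", "192.0.2.9", 80, "t0"),
    ("192.0.2.1", "192.0.2.7", 22, "t1"),
]
entries, shape, index_maps = build_sparse_tensor(records)
print(shape, len(entries))
```

Only the non-zero cells are stored, which matters because real log tensors are extremely sparse: most (source, target, port, time) combinations never occur.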
Scalable Machine Learning
How can we learn from massive amounts of data in a scalable way? We work on scaling up machine learning algorithms, including belief propagation, data clustering, similarity calculation in graphs, logistic regression, and large-scale recommendation systems.
Stream Mining
How can we discover useful patterns in time-evolving, high-speed data streams? How can we analyze such streams quickly and accurately, with little space overhead? We work on fast algorithms for stream mining, including finding the top-k frequent items over time, with applications to social networks and health care.
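One classical way to track frequent items with little space is the Misra-Gries summary, sketched below. It is shown only as an example of the space/accuracy trade-off and is not necessarily the algorithm used in the lab. It also summarizes the whole stream and does not track how the top-k items change over time. The function name and the sample stream are assumptions.

```python
def misra_gries(stream, k):
    """Summarize a stream with at most k - 1 counters (Misra-Gries).

    Any item occurring more than n / k times in a stream of length n is
    guaranteed to remain in the summary, and each stored count
    underestimates the true count by at most n / k.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Made-up stream in which "a" is clearly the most frequent item.
stream = "a b a c a b a d a b a e".split()
summary = misra_gries(stream, k=3)
print(summary)
```

The summary never holds more than k - 1 counters no matter how long the stream is. The price is that the reported counts are lower bounds and not exact frequencies.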