Fundamental Data Mining

Synchronization-based Data Mining: Synchronization is a powerful concept regulating a large variety of complex processes, ranging from the metabolism in a cell to opinion formation in a group of individuals. Synchronization phenomena in nature have been widely investigated (e.g., flashing fireflies, crickets, and yeast), and models concisely describing the dynamical synchronization process have been proposed. Inspired by these phenomena, we introduce synchronization into the data mining domain and have proposed several data mining algorithms based on it. These algorithms show several desirable properties compared to state-of-the-art algorithms.
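As a rough illustration of how a synchronization model can drive clustering, the sketch below lets each point interact with its neighbors through a Kuramoto-style sine coupling until nearby points collapse onto a common position. The update rule, the `eps` neighborhood radius, and the function name `sync_cluster` are assumptions made for this example. It is not one of the algorithms referred to above.

```python
import math

def sync_cluster(points, eps=1.0, steps=50, tol=1e-3):
    """Cluster points by letting eps-neighbors synchronize (illustrative sketch)."""
    pts = [list(p) for p in points]
    for _ in range(steps):
        new_pts = []
        for p in pts:
            # Neighbors within eps; p itself is always included, so the list is never empty.
            neigh = [q for q in pts if math.dist(p, q) <= eps]
            # Kuramoto-style coupling: move along the mean sine of the coordinate differences.
            new_pts.append([
                x + sum(math.sin(q[d] - x) for q in neigh) / len(neigh)
                for d, x in enumerate(p)
            ])
        pts = new_pts

    # Points that ended up at (almost) the same position share a cluster label.
    labels, centers = [], []
    for p in pts:
        for k, c in enumerate(centers):
            if math.dist(p, c) <= tol:
                labels.append(k)
                break
        else:
            centers.append(p)
            labels.append(len(centers) - 1)
    return labels

data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = sync_cluster(data, eps=1.0)
```

In this sketch the number of clusters is not specified in advance; it emerges from which points synchronize, and `eps` controls how far the interaction reaches.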

Complex Network Mining: Complex networks appear in many diverse settings, for example social networks, protein-protein interaction networks in biology, and person-account graphs in financial fraud detection. Complex network research is also an interdisciplinary field that combines ideas from mathematics, physics, computer science, and other areas. Here, we focus on the features of networks and discuss functional questions, such as community detection, link prediction, graph similarity, and critical node identification, from a data mining perspective. In addition, we aim to propose data mining algorithms that advance scientific research on complex networks.

Feature Engineering: In many real-world applications, acquiring sufficient labeled data is expensive and time-consuming. For example, in text classification, one may have easy access to a large database of documents by crawling the web, but only a small part of them are classified by hand. Thus, given the shortage of manual labels in data mining, semi-supervised learning, which aims at improving prediction performance by leveraging both limited labeled data and a large amount of available unlabeled data, has gained growing attention and considerable interest from the machine learning and data mining communities in recent years.
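One simple way to combine a few labeled examples with many unlabeled ones is self-training: fit a model on the labeled data, pseudo-label the unlabeled points it is confident about, and refit. The sketch below uses a nearest-centroid classifier for this. The `margin` confidence rule and the names `self_train`, `centroids`, and `predict` are assumptions made for illustration, not a method described above.

```python
import math

def centroids(X, y):
    """Return the mean point of each class label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for d, v in enumerate(x):
            acc[d] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def self_train(X_lab, y_lab, X_unlab, margin=1.0, rounds=10):
    """Grow the labeled set with confidently pseudo-labeled points (illustrative sketch)."""
    X, y = [list(x) for x in X_lab], list(y_lab)
    pool = [list(x) for x in X_unlab]
    for _ in range(rounds):
        cents = centroids(X, y)
        still_unlabeled = []
        for x in pool:
            ranked = sorted((math.dist(x, c), label) for label, c in cents.items())
            # Confident only if the nearest centroid beats the runner-up by at least `margin`.
            if len(ranked) == 1 or ranked[1][0] - ranked[0][0] >= margin:
                X.append(x)
                y.append(ranked[0][1])
            else:
                still_unlabeled.append(x)
        if len(still_unlabeled) == len(pool):
            break  # nothing new was labeled in this round
        pool = still_unlabeled
    return centroids(X, y)

def predict(cents, x):
    """Assign x to the class with the nearest centroid."""
    return min(cents, key=lambda label: math.dist(x, cents[label]))

model = self_train(
    X_lab=[(0.0, 0.0), (4.0, 4.0)],
    y_lab=["a", "b"],
    X_unlab=[(0.5, 0.2), (0.1, 0.9), (3.5, 4.2), (4.4, 3.8), (2.0, 2.1)],
)
```

Ambiguous points, such as one lying roughly midway between the two classes, are left unlabeled rather than forced into a class, which limits the risk of reinforcing early mistakes.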

Spatial-Temporal Data Mining: The development of GPS and mobile computing techniques has generated massive spatial-temporal trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles, and animals. Spatial-temporal data mining is an emerging research area dedicated to the development and application of novel computational techniques for the analysis of large spatial-temporal databases. We propose algorithms for processing and mining spatial-temporal data in a variety of mining tasks (such as spatial-temporal pattern mining, outlier detection, spatial-temporal data clustering, spatial-temporal data forecasting, and group situation analysis based on spatial-temporal data). The aim of this research is to foster a broad range of applications.

Data Stream Mining: A data stream is a dynamic data set that grows continuously and rapidly over time. Data stream mining has gained increasing attention due to its wide practical applications, such as target marketing, email filtering, and network intrusion detection. However, due to the evolving nature of data streams, data stream mining is a non-trivial task. Targeting classification, compression, clustering, outlier detection, and concept drift in data streams, we propose a series of algorithms that effectively address core problems, such as how to adapt to evolving concepts and how to detect concept drift accurately.
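To make the notion of concept drift detection concrete, the sketch below compares a classifier's error rate in a recent sliding window against a reference window and signals drift when the recent rate is clearly higher. The class name `WindowDriftDetector`, the window size, and the threshold are assumptions chosen for illustration. This is a generic baseline, not one of the algorithms mentioned above.

```python
from collections import deque

class WindowDriftDetector:
    """Flag drift when the recent error rate exceeds the reference rate (illustrative sketch)."""

    def __init__(self, window=30, threshold=0.3):
        self.window = window
        self.threshold = threshold
        self.reference = deque(maxlen=window)  # error history under the current concept
        self.recent = deque(maxlen=window)     # sliding window over the latest errors

    def update(self, error):
        """error is 1 if the latest prediction was wrong, else 0. Returns True on drift."""
        if len(self.reference) < self.window:
            self.reference.append(error)
            return False
        self.recent.append(error)
        if len(self.recent) < self.window:
            return False
        ref_rate = sum(self.reference) / self.window
        recent_rate = sum(self.recent) / self.window
        if recent_rate - ref_rate > self.threshold:
            # Treat the recent window as the new reference and start over.
            self.reference = deque(self.recent, maxlen=self.window)
            self.recent.clear()
            return True
        return False

# A stream where the model is always right for 100 items, then always wrong.
detector = WindowDriftDetector(window=30, threshold=0.3)
stream = [0] * 100 + [1] * 100
drift_points = [i for i, e in enumerate(stream) if detector.update(e)]
```

In a full system, a drift signal like this would typically trigger retraining or replacement of the model so that it adapts to the new concept.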

Recommender System

Information filtering, as one of the technologies for addressing information overload, is important in both theory and practice. From a theoretical point of view, it is one of the core issues in data mining. From a practical point of view, it is an important component of many e-commerce websites and brings substantial benefits to online retailers. In recent years, information filtering has attracted many researchers from computer science, statistical physics, economics, and mathematics. However, many problems are still not well solved, such as the diversity of recommendations, the personalized use of algorithms, the sparsity of data sets, and the characteristics of user-item bipartite networks. To address these problems, we have proposed several approaches that draw on theory and techniques from computer science, statistical physics, and mathematics.

Interdisciplinary Research

Brain Network Mining: Mental diseases such as Alzheimer’s disease, schizophrenia, and major depression, which are now receiving extensive attention, have become some of the most common diseases in the world. Research shows that many mental diseases are associated with abnormalities of brain networks, and fMRI, DTI, and other neuroimaging techniques provide a way to acquire the structural and functional connectome networks of the human brain. Our goal is to find pathogenesis patterns or biomarkers of mental diseases in the brain by applying data mining and machine learning methods to brain networks. This helps us understand the internal pathogenesis of mental diseases and helps doctors diagnose them more objectively and accurately.

Environmental Data Mining: Sustainable flood retention basins (SFRBs) have diverse functions, such as drinking water supply, hydropower generation, and drainage. Making full use of these functions in existing flood retention structures is an adaptive measure against climate change. We have developed a series of frameworks, including feature selection, clustering, and classification (single-label and multi-label), to mine the intrinsic patterns (key features) of SFRBs and to analyze the associated uncertainty. The findings provide a scientific foundation and technical support for European flood risk management.