Research
My research is mainly about local pattern discovery that connects explainable subgroups in datasets to machine learning models, which we call Exceptional Model Mining (EMM). My work values both the data and the functions learned from those data, viewed as representations. Hence, I'm interested in exploring search algorithms for pattern discovery and machine learning / probabilistic models for knowledge discovery. Specifically, my research areas include Exceptional Model Mining, Causal Inference, Trustworthy Machine Learning, and Spatio-Temporal Data Mining.
|
|
Trustworthy Machine Learning (On-going)
How much can we trust the predictions of image recognition models backed by deep neural networks such as Transformers and ResNet?
This project proposes a set of perturbation operations that can be applied to the underlying data to generate test samples of different types. The perturbations reflect potential changes in operating environments, and probe properties ranging from the strictly quantitative to the more qualitative.
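As a rough illustration of what such perturbation operations could look like, here is a minimal sketch. The two operations shown (additive Gaussian noise and a brightness shift), the function names, and the parameter values are all assumptions chosen for illustration; they are not the project's actual operation set.

```python
import numpy as np

def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise, then clip back to the valid [0, 1] range."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

def shift_brightness(image, delta):
    """Shift every pixel by a constant offset, then clip to [0, 1]."""
    return np.clip(image + delta, 0.0, 1.0)

def perturbation_suite(image, seed=0):
    """Generate named test variants of one image (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    return {
        "noise_low": add_gaussian_noise(image, 0.05, rng),
        "noise_high": add_gaussian_noise(image, 0.20, rng),
        "darker": shift_brightness(image, -0.3),
        "brighter": shift_brightness(image, 0.3),
    }

# A flat grey 8x8 RGB image with pixel values in [0, 1] stands in for real data.
image = np.full((8, 8, 3), 0.5)
variants = perturbation_suite(image)
```

Each variant can then be passed to the model under test, and the change in its predictions compared against the prediction on the unperturbed image.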
|
|
Beyond Discriminant Patterns: On the Robustness of Decision Rule Ensembles (On-going)
How much can we trust the rules discovered from one dataset when they are applied to others?
This project proposes to discover local decision rules that are robust across different environments.
|
|
Adversarial Balancing-Based Representation Learning for Causal Effect Inference (Finished)
Can we estimate the potential effect of a clinical treatment plan or a policy before applying it?
This project proposes a deep neural network framework to solve the causal inference problem with observational data. The confounding bias is tackled by applying a Generative Adversarial Network.
|
|
Generative Model with Variational Auto-encoder (On-going)
Have you ever encountered mode collapse when training a VAE model? Why does a probabilistic generative model fail to capture the data distribution?
This project explores the KL vanishing problem from the perspectives of both the encoder and the decoder, aiming to build a more generalizable generative model for multiple datasets.
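For readers unfamiliar with the term, KL vanishing can be read off the standard VAE training objective (the evidence lower bound). The notation below is the usual textbook one and is an assumption here; the project's exact objective is not stated on this page.

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
  - \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)
```

KL vanishing refers to the second term being driven to (nearly) zero during training, so that the approximate posterior $q_\phi(z \mid x)$ matches the prior $p(z)$ and the latent code $z$ carries almost no information about the input $x$.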
|
|
Graph Convolutional Neural Network For Link Prediction With Implicit Feedback (On-going)
How can we predict future interactions between customers and products?
Graph data reveals the patterns of interaction between multiple types of entities in the world. How can these interaction patterns be turned into meaningful representations? This project explores link prediction on heterogeneous graphs with multi-modal data using graph convolutional neural networks.
|
|
Exceptional Model Mining on multi-modal data (Finished)
How do the interactions of spatial, temporal, text, or network data reveal specific patterns in sub-populations against the overall distribution?
This project proposes to explore data mining tools to discover interesting subgroups from multi-modal data distributions by employing a probabilistic modeling method.
|
Adversarial Balancing-Based Representation Learning For Causal Effect Inference With Observational Data
Xin Du,
Lei Sun,
Wouter Duivesteijn,
Alexander Nikolaev,
Mykola Pechenizkiy
Data Mining and Knowledge Discovery Special Issue: Mining for Health, 2021
|
Exceptional Spatio-temporal Behavior Mining Through Bayesian Non-Parametric Modeling
Xin Du,
Yulong Pei,
Wouter Duivesteijn,
Mykola Pechenizkiy
Journal Track of ECML PKDD, 2020
|
Fairness in Network Representation by Latent Structural Heterogeneity in Observational Data
Xin Du,
Yulong Pei,
Wouter Duivesteijn,
Mykola Pechenizkiy
AAAI Conference on Artificial Intelligence (AAAI), 2020
|
|