Hey.

Welcome to Nan's homepage! I am currently a Machine Learning Engineer on the Relevance & Personalization team at Airbnb, building deep neural models to improve Airbnb's homes search ranking. Prior to Airbnb, I was a Research Scientist on multiple ML teams at Facebook, where my work spanned conversational automation using NLP and statistical modeling, optimization of human labeling workflows using generative models and model-assisted sampling, and various other predictive modeling projects. Before Facebook, I was a data scientist in the Applied Machine Learning group at Apple and on the Data Products and Research team at Upwork.
I received my Ph.D. in Computer Science from the University of California, Santa Barbara, where I did research in data mining and applied ML. While studying in beautiful Santa Barbara, I was blessed with internship opportunities at Microsoft Research (Cambridge, UK), IBM Research, and Microsoft Bing.
My latest CV is here.

Research Interests

My research interests lie in the general topics of machine learning and data mining. I build a variety of machine learning models to solve interesting data problems, such as learning embeddings, search and ranking, information retrieval and recommender systems, and natural language processing. In my spare time, I enjoy learning about neural networks and deep learning. Some side projects on neural networks I contributed to can be found here.

My projects at Facebook have been in the realm of general applied machine learning, including ranking systems, embedding models, text classification, video feature extraction and sequencing, spatial clustering and image contour detection.

In the past, I worked on various regression/classification algorithms along with feature engineering and regularization mechanisms. I also have experience in topic modeling, regularized mixture models, graphical models, inference, Bayesian online learning, and reinforcement learning through function approximation.

When I was a Ph.D. student, I designed probabilistic models and combinatorial algorithms for solving graph problems, ranging from graph indexing and querying to graph anomaly detection. More specifically, I worked on label-based subgraph matching via density indexing, attribute proximity computation using personalized PageRank aggregation, and vertex classification through graph augmentation and random walks. My research focus later shifted to applying statistical modeling and machine learning to graph problems, such as using a regularized mixture model to uncover anomalous regions of a vertex-attributed graph in a principled manner.

Selected Publications

Please visit my Google Scholar profile for all my publications.

Ph.D. Dissertation

Nan Li, "Uncovering interesting attributed anomalies in large graphs", [pdf].

Archive

Tobias G. Tiecke, Xianming Liu, Amy Zhang, Andreas Gros, Nan Li, Gregory Yetman, Talip Kilic, Siobhan Murray, Brian Blankespoor, Espen B. Prydz, Hai-Anh H. Dang, "Mapping the world population one building at a time", CoRR abs/1712.05839 (2017), [pdf]

Conference Papers

Nan Li, Huan Sun, Kyle Chipman, Jemin George, Xifeng Yan, "A Probabilistic Approach to Uncovering Attributed Graph Anomalies", Proc. of the 2014 SIAM International Conference on Data Mining (SDM'14), Philadelphia, PA, Apr. 24-26, 2014, pp. 82-90, [pdf].

Nan Li, Ziyu Guan, Lijie Ren, Jian Wu, Jiawei Han, Xifeng Yan, "gIceberg: towards iceberg analysis in large graphs", Proc. of the 2013 IEEE International Conference on Data Engineering (ICDE'13), Brisbane, Australia, Apr. 8-12, 2013, pp. 1021-1032, [pdf].

Nan Li, Xifeng Yan, Zhen Wen, Arijit Khan, "Density index and proximity search in large graphs", Proc. of the 2012 ACM International Conference on Information and Knowledge Management (CIKM'12), Maui, HI, USA, Oct. 29-Nov. 2, 2012, pp. 235-244, [pdf].

Arijit Khan, Nan Li, Xifeng Yan, Ziyu Guan, Supriyo Chakraborty and Shu Tao, "Neighborhood based fast graph search in large networks", Proc. of the 2011 International Conference on Management of Data (SIGMOD'11), Athens, Greece, Jun. 12-16, 2011, pp. 901-912.

Charu Aggarwal, Nan Li, "On node classification in dynamic content-based networks", Proc. of the 2011 SIAM International Conference on Data Mining (SDM'11), Phoenix, AZ, USA, Apr. 28-30, 2011, pp. 355-366 [pdf].

Nan Li, Naoki Abe, "Temporal cross-sell optimization using action proxy-driven reinforcement learning", ICDM 2011 Workshop on Optimization Based Methods for Emerging Data Mining Problems (ICDMW'11), Vancouver, Canada, Dec. 2011, pp. 259-266 [pdf].

Nan Li, Yinghui Yang, Xifeng Yan, "Cross-selling optimization for customized product promotion", Proc. of the 2010 SIAM International Conference on Data Mining (SDM'10), Columbus, OH, USA, Apr. 29-May 1, 2010, pp. 918-929 [pdf].

Journal Papers

Charu Aggarwal, Nan Li, "On supervised mining of dynamic content-based networks", Statistical Analysis and Data Mining, Vol. 5, No. 1, 2012, pp. 16-34 [pdf].

Nan Li, Desheng Dash Wu, "Using text mining and sentiment analysis for online forums hotspot detection and forecast", Decision Support Systems, Vol. 48, No. 2, 2010, pp. 354-368 [pdf].

Nan Li, Xun Liang, Xinli Li, Chao Wang, Desheng Dash Wu, "Network environment and financial risk using machine learning and sentiment analysis", Hum. Ecol. Risk Assess., Vol. 15, No. 2, 2009, pp. 227-252 [pdf].

Some Cool Projects I Worked on at UCSB

My past projects include creating probabilistic models and algorithms to solve various graph mining problems, business optimization and customer lifetime value modeling, Bayesian online learning for user ranking, and so on. Below are the ones I find particularly interesting:

Say Hello.

Send me an email by clicking the button below, or add me on LinkedIn!