Welcome to Nan's homepage! I am currently a Research Scientist on the Messenger Ads Ranking team at Facebook, working on solving conversational AI problems using NLP. I received my Ph.D. in Computer Science from the University of California, Santa Barbara. My research interests lie in machine learning and its various applications. While studying in beautiful Santa Barbara, I was also blessed with internship opportunities at Microsoft Research in Cambridge, UK; IBM Research in Yorktown Heights, NY; and Microsoft Bing in Bellevue, WA. Prior to joining Facebook, I worked as a data scientist in the Applied Machine Learning group at Apple, and as a data scientist on the Data Products and Research team at oDesk (now Upwork).
My latest CV is here.

Research Interests

My research interests lie in the general topics of machine learning and data mining. I build a variety of machine learning models to solve interesting data problems, such as learning embeddings, search and ranking, information retrieval and recommender systems, and natural language processing. In my spare time, I enjoy learning about neural networks and deep learning. Some side projects on neural networks I contributed to can be found here.

My projects at Facebook have been in the realm of general applied machine learning, including ranking systems, embedding models, text classification, video feature extraction and sequencing, spatial clustering and image contour detection.

In the past, I worked on various regression/classification algorithms along with feature engineering and regularization mechanisms. I also have experience in topic modeling, regularized mixture models, graphical models, inference, Bayesian online learning and reinforcement learning through function approximation.

When I was a Ph.D. student, I designed probabilistic models and combinatorial algorithms for solving graph problems, ranging from graph indexing and querying to graph anomaly detection. More specifically, I have worked on label-based subgraph matching via density indexing, attribute proximity computation using personalized PageRank aggregation, and vertex classification through graph augmentation and random walks. My research focus later shifted to applying statistical modeling and machine learning to graph problems, such as using a regularized mixture model to uncover anomalous regions of a vertex-attributed graph in a principled manner.


Selected Publications

Please visit my Google Scholar profile for all my publications.

Ph.D. Dissertation

Nan Li, "Uncovering interesting attributed anomalies in large graphs", [pdf].


Tobias G. Tiecke, Xianming Liu, Amy Zhang, Andreas Gros, Nan Li, Gregory Yetman, Talip Kilic, Siobhan Murray, Brian Blankespoor, Espen B. Prydz, Hai-Anh H. Dang, "Mapping the world population one building at a time", CoRR abs/1712.05839 (2017), [pdf]

Conference Papers

Nan Li, Huan Sun, Kyle Chipman, Jemin George, Xifeng Yan, "A Probabilistic Approach to Uncovering Attributed Graph Anomalies", Proc. of the 2014 SIAM International Conference on Data Mining (SDM'14), Philadelphia, PA, Apr. 24-26, 2014, pp. 82-90, [pdf].

Nan Li, Ziyu Guan, Lijie Ren, Jian Wu, Jiawei Han, Xifeng Yan, "gIceberg: towards iceberg analysis in large graphs", Proc. of the 2013 IEEE International Conference on Data Engineering (ICDE'13), Brisbane, Australia, Apr. 8-12, 2013, pp. 1021-1032, [pdf].

Nan Li, Xifeng Yan, Zhen Wen, Arijit Khan, "Density index and proximity search in large graphs", Proc. of the 2012 ACM International Conference on Information and Knowledge Management (CIKM'12), Maui, HI, USA, Oct. 29-Nov. 2, 2012, pp. 235-244, [pdf].

Arijit Khan, Nan Li, Xifeng Yan, Ziyu Guan, Supriyo Chakraborty and Shu Tao, "Neighborhood based fast graph search in large networks", Proc. of the 2011 International Conference on Management of Data (SIGMOD'11), Athens, Greece, Jun. 12-16, 2011, pp. 901-912.

Charu Aggarwal, Nan Li, "On node classification in dynamic content-based networks", Proc. of the 2011 SIAM International Conference on Data Mining (SDM'11), Phoenix, AZ, USA, Apr. 28-30, 2011, pp. 355-366 [pdf].

Nan Li, Naoki Abe, "Temporal cross-sell optimization using action proxy-driven reinforcement learning", ICDM 2011 Workshop on Optimization Based Methods for Emerging Data Mining Problems (ICDMW'11), Vancouver, Canada, Dec. 2011, pp. 259-266 [pdf].

Nan Li, Yinghui Yang, Xifeng Yan, "Cross-selling optimization for customized product promotion", Proc. of the 2010 SIAM International Conference on Data Mining (SDM'10), Columbus, OH, USA, Apr. 29-May 1, 2010, pp. 918-929 [pdf].

Journal Papers

Charu Aggarwal, Nan Li, "On supervised mining of dynamic content-based networks", Statistical Analysis and Data Mining, Vol. 5, No. 1, 2012, pp. 16-34 [pdf].

Nan Li, Desheng Dash Wu, "Using text mining and sentiment analysis for online forums hotspot detection and forecast", Decision Support Systems, Vol. 48, No. 2, 2010, pp. 354-368 [pdf].

Nan Li, Xun Liang, Xinli Li, Chao Wang, Desheng Dash Wu, "Network environment and financial risk using machine learning and sentiment analysis", Hum. Ecol. Risk Assess., Vol. 15, No. 2, 2009, pp. 227-252 [pdf].


Some Cool Projects I Worked On

My past projects include creating probabilistic models and algorithms to solve various graph mining problems, business optimization and customer lifetime value modeling, Bayesian online learning for user ranking, and so on. Below are those I find particularly interesting:

Say Hello.

Send me an email by clicking the button below, or add me on LinkedIn!