PhD in Statistics
I am currently a PhD candidate in Statistics at Iowa State University under the direction of Dr. Yehua Li. My research interests include High Performance Computing, False Discovery Rate Control, Clustering and Missing Data Analysis.
I completed my Master's and Bachelor's degrees at Renmin University of China. My Master's advisor was Dr. Xiaoling Lu. My research at that time was about data mining and matrix factorization.
The programming languages I use most are R and Julia. I have been using R for 8 years and Julia for 4 years. I use R for plotting and reporting as well as for small projects. When I need to do heavy computing, I turn to Julia for higher performance.
- Ph.D. in Statistics, Iowa State University, 2012 – Now.
- Master in Statistics, Renmin University of China, 2010 – 2012.
- Bachelor in Statistics, Renmin University of China, 2006 – 2010.
- 8 years of experience with R
- 4 years of experience with Julia
- 4 years of experience with Linux Shell and git
- Some experience with Python
- Proficient with
PAN, L., LI, Y., HE, K., LI, Y. and LI, Y. (2016). Latent Gaussian Mixture Models For Nationwide Kidney Transplant Center Evaluation. (Submitted, under review). arXiv:1703.03753.
LU, X., SI, J., PAN, L. and ZHAO, Y. (2011). Imputation of missing data using ensemble algorithms. Fuzzy Systems and Knowledge Discovery, 2011 Eighth International Conference on Shanghai. pp. 1312-1315
- First Place in the 15th Annual Data Mining Cup, May 2014.
Predicted the probability that an item would be returned, given customer and item information in an online shopping problem, using an ensemble of C5.0, support vector machines and random forests. I was in charge of the C5.0 model, which gave the best performance.
- Research Assistant, 2014 – Now.
Model the effects of kidney transplant centers on recipients' survival time. Perform clustering and heterogeneity detection on the latent transplant center effects while controlling the false discovery rate.
- Intern at Novartis Pharmaceuticals, NJ, May 2015 – August 2015.
Project 1: Built a Shiny-based user interface for data analysis and visualization on a remote server.
Project 2: Modeled the labor investment of pharmaceutical projects over past decades to predict future labor investment. Also built an interactive visualization app for these data.
- Agriculture Experiment Station Consulting Group, May 2014 – July 2014.
- Teaching Assistant, August 2012 – May 2014.
- PAN, L., LI, Y., HE, K., LI, Y. and LI, Y. (2015). Generalized Linear Mixed Model with Normal Mixture Random Effects. Joint Statistical Meetings. ASA. Seattle, WA, USA, Aug. 2015.
- Data Mining
- High Performance Computing
- Multiple Testing, False Discovery Rate Control
- Clustering, Subgroup Analysis
- Missing Data Analysis
- Data Visualization
- Health Policy
Implements kernel density estimation and kernel regression. In particular, this package handles estimation on bounded domains using beta and gamma kernels and can choose the bandwidth via cross-validation.
Fits a Generalized Linear Mixed Model with Gaussian mixture random effects and decides the number of components in the Gaussian mixture. It then conducts multiple testing to detect heterogeneity while controlling the False Discovery Rate.
Implements many handy R functions in Julia. The purpose is to provide better statistical functions for the Julia language and to make it easy to translate R code into Julia.
Implements the Kasahara-Shimotsu test to decide the number of components in a Gaussian mixture model.
R package to solve the nonnegative matrix factorization problem using coordinate descent.
- Contribute to
library into Julia, significantly speeding up several basic arithmetic operations.
- Contribute to several core statistical packages in Julia community including