Lanfeng Pan

PhD in Statistics

I am currently a PhD candidate in Statistics at Iowa State University under the direction of Dr. Yehua Li. My research interests include High Performance Computing, False Discovery Rate Control, Clustering, and Missing Data Analysis.

I completed my Master's and Bachelor's degrees at Renmin University of China. My Master's advisor was Dr. Xiaoling Lu. My research at that time focused on data mining and matrix factorization.

The programming languages I use most are R and Julia. I have been using R for 8 years and Julia for 4 years. I use R for plotting and reporting as well as for small projects. When I need to do heavy computing, I turn to Julia for higher performance.

View CV in PDF.

Contact Me

Education

Skills

Awards

Predicted the probability that an item would be returned, given customer and item information, in an online shopping problem. I was in charge of tuning the C5.0 algorithm, which gave the best performance. See details and news reports at

Papers

Assessed the service quality of nationwide kidney transplant facilities and provided important guidance for policymakers. To guarantee a fair comparison between facilities with very different numbers of patients, we proposed modeling the facilities as random effects with a Gaussian mixture distribution. Our model avoided estimating a variance for each facility, so it was more stable than fixed-effect models. Furthermore, we compared facilities directly based on their effects, so our identification rule was superior to those based on p-values.

Work experience

Project 1: Built a Shiny-based user interface for data analysis and visualization on a remote server. Project 2: Modeled the labor investment of pharmaceutical projects over decades to predict future labor investment. Also built an interactive visualization app for this data.

Presentation

Research Interests

Software Packages

Implement kernel density estimation and kernel regression. In particular, this package can handle estimation on bounded domains using beta and gamma kernels and can choose the bandwidth via cross-validation.

Fit a Generalized Linear Mixed Model with Gaussian mixture random effects and decide the number of components in the Gaussian mixture. Further conduct multiple testing to detect heterogeneity while controlling the False Discovery Rate.

Implement many handy R functions in Julia. The purpose is to provide better statistical functions for the Julia language and to make it easier to translate R code into Julia.

Implement the Kasahara-Shimotsu test to decide the number of components in a Gaussian Mixture Model.

An R package that solves the nonnegative matrix factorization problem using coordinate descent.

Port the Yeppp! library into Julia, significantly speeding up several basic arithmetic operations.