Title: Statistical Modeling of Omics Data Using Two-Stage-PO2PLS
Abstract: Many studies investigate the relationship between omic variables and outcomes. While omic variables change with age, their genetic components are time-invariant. Our work is motivated by the ORCADES cohort, which provides information on single nucleotide polymorphisms (SNPs), glycomics, and metabolomics, together with the outcome variable body mass index (BMI). To estimate the genetic part of the omics data, polygenic risk scores for each omic variable (Omics-PRS), that is, linear combinations of SNPs weighted by regression coefficients, can be computed and included in a model for BMI. However, this approach ignores the genetic correlation between omic variables. An alternative is to use the joint components of a latent variable model such as the PO2PLS model.
A simulation study is performed to compare the performance of Omics-PRS and PO2PLS. We generate data of various dimensions under different models. To compute the Omics-PRS, we use Lasso and Ridge regression to deal with the correlation between genetic markers. To model the outcome, we apply Ridge regression to deal with the large number of Omics-PRS variables. We evaluate the performance of the methods using R-squared. We will present the results of the simulation study and the data analysis. Preliminary results show that PO2PLS outperforms Omics-PRS.
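The two-stage Omics-PRS pipeline described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the talk: the simulated data, the sample and feature sizes, the penalty values, and the helper names `ridge_fit` and `r_squared`. Only the Ridge variant is shown (the abstract also mentions Lasso for the first stage), and PO2PLS itself is not implemented.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients (X'X + lam*I)^{-1} X'y, no intercept."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SSE / SST."""
    sse = np.sum((y - y_hat) ** 2)
    sst = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - sse / sst


# Toy data (assumed sizes): SNPs drive a genetic part shared by omics and BMI.
rng = np.random.default_rng(0)
n, n_snps, n_omics = 200, 50, 10
snps = rng.binomial(2, 0.3, size=(n, n_snps)).astype(float)
genetic = snps @ rng.normal(scale=0.3, size=(n_snps, n_omics))
omics = genetic + rng.normal(size=(n, n_omics))
bmi = genetic @ rng.normal(size=n_omics) + rng.normal(size=n)

# Hold out the last 50 samples; center everything with training means only.
tr, te = slice(0, 150), slice(150, n)
snp_mean = snps[tr].mean(axis=0)
snps_tr, snps_te = snps[tr] - snp_mean, snps[te] - snp_mean
omics_tr = omics[tr] - omics[tr].mean(axis=0)
bmi_mean = bmi[tr].mean()
bmi_tr, bmi_te = bmi[tr] - bmi_mean, bmi[te] - bmi_mean

# Stage 1: one ridge model per omic variable (all columns solved at once).
# The fitted linear combinations of SNPs play the role of the Omics-PRS.
weights = ridge_fit(snps_tr, omics_tr, lam=10.0)  # shape (n_snps, n_omics)
prs_tr, prs_te = snps_tr @ weights, snps_te @ weights

# Stage 2: ridge regression of BMI on the Omics-PRS.
beta = ridge_fit(prs_tr, bmi_tr, lam=1.0)  # shape (n_omics,)

r2_train = r_squared(bmi_tr, prs_tr @ beta)
r2_test = r_squared(bmi_te, prs_te @ beta)
print(f"training R^2: {r2_train:.3f}, held-out R^2: {r2_test:.3f}")
```

Each omic variable gets its own first-stage model in this sketch, so the correlation between the genetic parts of different omic variables is not used. That is the limitation the abstract raises as motivation for a joint latent variable model such as PO2PLS.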
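Speaker bio: Li He (李赫) received a master's degree in statistics from China University of Mining and Technology (Beijing), with a research focus on functional regression analysis, and then pursued a PhD in biostatistics at Radboud University in the Netherlands, with a research focus on statistical modeling of omics data. During the PhD, Li He participated in one ERC Advanced Grant project and gave an oral presentation at the 9th Channel Network Conference.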