The Statistical Working Group

The EPIC study offers opportunities to conduct methodological research that makes secondary use of available information on a wide range of exposures (dietary, lifestyle, metabolomic, proteomic, genetic, and other -omics data), assessed through questionnaires and laboratory analyses. This is combined with information on the onset of multiple end-points, including overall and cause-specific death, incident cancers, and incident cardiovascular diseases and diabetes. This makes the EPIC study an ideal setting in which to address topical epidemiological research questions while applying increasingly complex statistical modelling. In addition to supporting the analysis of epidemiological data with conventional statistical techniques, the Statistical Working Group has carried out methodological research in statistics, with a focus on the following areas:

  • Measurement errors in dietary exposure. The measurement error structure of self-reported dietary assessments was investigated through comparison with biomarkers of exposure. Calibration models were developed to correct estimates of diet–disease associations for measurement error in a multicentre context.
  • Survival analysis. Standard epidemiological studies of time-to-event outcomes typically report hazard ratios, but other quantities, including attributable fractions and absolute risks, have proven to be very useful, for example from a public health perspective. Methods based on flexible parametric models were implemented and applied to the EPIC data, including extensions that account for competing risks and, more generally, multistate models. Statistical methods were also developed to fully exploit the increasing amount of information available in different nested case–control and case–cohort studies within EPIC.
  • Analysis of complex sets of data. Adapting long-established multivariate techniques to identify unwanted sources of variability (e.g. technical variability) in metabolomic profile data is a valuable strategy. These techniques can also be applied to other large-scale sets of data, such as proteomic, miRNA, epigenetic (CpG island), and genomic data. In addition, L1-penalized methods dedicated to the analysis of large -omics data sets were developed and applied to the EPIC data.
  • Causal mediation analysis. Causal mediation analysis has recently been a very active area of research in the statistical literature. These tools have been applied to assess the biological pathways that may underlie some well-established exposure–cancer relationships (e.g. obesity–breast cancer, alcohol–hepatocellular carcinoma). New methods for mediation analysis were also developed, in particular for time-to-event outcomes, and further methodological work was conducted on the general topic of causal inference.
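
To make the calibration idea in the first area above more concrete, the following is a minimal sketch of classical regression calibration on simulated data. It is an illustration only and not the Working Group's actual models: the linear outcome, the error variances, the sample sizes, and all function and variable names (`ols_slope`, `regression_calibration_demo`, `questionnaire`, `reference`) are assumptions made for this example. A self-reported intake is regressed against a less error-prone reference measurement in a subsample, and the resulting attenuation factor is used to rescale the naive exposure–outcome slope.

```python
import numpy as np


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    return np.polyfit(x, y, 1)[0]


def regression_calibration_demo(n=20000, n_calib=5000, beta=0.5, seed=0):
    """Simulate error-prone intake data and correct the attenuated slope.

    All distributions and parameter values are hypothetical.
    """
    rng = np.random.default_rng(seed)

    # Unobserved true intake and a continuous outcome that depends on it.
    true_intake = rng.normal(0.0, 1.0, n)
    outcome = beta * true_intake + rng.normal(0.0, 1.0, n)

    # Self-reported intake with classical (additive, independent) error.
    questionnaire = true_intake + rng.normal(0.0, 1.0, n)

    # Reference measurement (e.g. a biomarker) with independent error,
    # assumed to be available only in a calibration subsample.
    reference = true_intake[:n_calib] + rng.normal(0.0, 0.5, n_calib)

    # Naive analysis: the slope is biased towards zero by measurement error.
    naive = ols_slope(questionnaire, outcome)

    # Calibration step: regress the reference on the questionnaire value.
    attenuation = ols_slope(questionnaire[:n_calib], reference)

    # Corrected estimate: naive slope divided by the attenuation factor.
    corrected = naive / attenuation
    return naive, attenuation, corrected


if __name__ == "__main__":
    naive, attenuation, corrected = regression_calibration_demo()
    print(f"naive slope:        {naive:.3f}")
    print(f"attenuation factor: {attenuation:.3f}")
    print(f"corrected slope:    {corrected:.3f}")
```

Under these simulated settings the naive slope is expected to be roughly half of the true value of `beta`, and the corrected slope should be close to `beta`. In practice, time-to-event or binary outcomes, covariates, and centre-specific calibration would require more elaborate models than this sketch.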

The Statistical Working Group meets to coordinate ongoing methodological activities, stimulate collaborations among statisticians, promote statistical expertise within the EPIC network, and, where possible, seek funding opportunities. Collaborations with statistical experts worldwide have been established, and new ones are continually sought. This provides a stimulating environment for early-career scientists to develop and implement innovative research ideas with real data.

 

Selected publications

  1. Ballout N, Etievant L, Viallon V (2023). On the use of cross-validation for the calibration of the adaptive lasso. Biom J. e2200047. Epub ahead of print. https://doi.org/10.1002/bimj.202200047 PMID:36960476
  2. Botteri E, Peveri G, Berstad P, Bagnardi V, Chen SLF, Sandanger TM, et al. (2023). Changes in lifestyle and risk of colorectal cancer in the European Prospective Investigation into Cancer and Nutrition. Am J Gastroenterol. 118(4):702–11. https://doi.org/10.14309/ajg.0000000000002065 PMID:36227801
  3. Etievant L, Viallon V (2022). On some limitations of probabilistic models for dimension-reduction: illustration in the case of one particular probabilistic formulation of PLS. Stat Neerl. 76(3):331–46. https://doi.org/10.1111/stan.12262
  4. Breeur M, Ferrari P, Dossus L, Jenab M, Johansson M, Rinaldi S, et al. (2022). Pan-cancer analysis of pre-diagnostic blood metabolite concentrations in the European Prospective Investigation into Cancer and Nutrition. BMC Med. 20(1):351. https://doi.org/10.1186/s12916-022-02553-4 PMID:36258205
  5. Mayén AL, Viallon V, Botteri E, Proust-Lima C, Bagnardi V, Batista V, et al. (2022). A longitudinal evaluation of alcohol intake throughout adulthood and colorectal cancer risk. Eur J Epidemiol. 37(9):915–29. https://doi.org/10.1007/s10654-022-00900-6 PMID:36063305
  6. Étiévant L, Viallon V (2021). Causal inference under over-simplified longitudinal causal models. Int J Biostat. 18(2):421–37. https://doi.org/10.1515/ijb-2020-0081 PMID:34727585
  7. Ballout N, Garcia C, Viallon V (2021). Sparse estimation for case-control studies with multiple disease subtypes. Biostatistics. 22(4):738–55. https://doi.org/10.1093/biostatistics/kxz063 PMID:31977036
  8. Loftfield E, Stepien M, Viallon V, Trijsburg L, Rothwell JA, Robinot N, et al. (2021). Novel biomarkers of habitual alcohol intake and associations with risk of pancreatic and liver cancers and liver disease mortality. J Natl Cancer Inst. 113(11):1542–50. https://doi.org/10.1093/jnci/djab078 PMID:34010397
  9. Viallon V, His M, Rinaldi S, Breeur M, Gicquiau A, Hemon B, et al. (2021). A new pipeline for the normalization and pooling of metabolomics data. Metabolites. 11(9):631. https://doi.org/10.3390/metabo11090631 PMID:34564446
  10. Freisling H, Viallon V, Lennon H, Bagnardi V, Ricci C, Butterworth AS, et al. (2020). Lifestyle factors and risk of multimorbidity of cancer and cardiometabolic diseases: a multinational cohort study. BMC Med. 18(1):5. https://doi.org/10.1186/s12916-019-1474-7 PMID:31918762
  11. Assi N, Rinaldi S, Viallon V, Dashti SG, Dossus L, Fournier A, et al. (2020). Mediation analysis of the alcohol-postmenopausal breast cancer relationship by sex hormones in the EPIC cohort. Int J Cancer. 146(3):759–68. https://doi.org/10.1002/ijc.32324 PMID:30968961
  12. Fasanelli F, Giraudo MT, Ricceri F, Valeri L, Zugna D (2019). Marginal time-dependent causal effects in mediation analysis with survival data. Am J Epidemiol. 188(5):967–74. https://doi.org/10.1093/aje/kwz016 PMID:30689682
  13. Perrier F, Novoloaca A, Ambatipudi S, Baglietto L, Ghantous A, Perduca V, et al. (2018). Identifying and correcting epigenetics measurements for systematic sources of variation. Clin Epigenetics. 10(1):38. https://doi.org/10.1186/s13148-018-0471-6 PMID:29588806
  14. Assi N, Gunter MJ, Thomas DC, Leitzmann M, Stepien M, Chajès V, et al. (2018). Metabolic signature of healthy lifestyle and its relation with risk of hepatocellular carcinoma in a large European cohort. Am J Clin Nutr. 108(1):117–26. https://doi.org/10.1093/ajcn/nqy074 PMID:29924298
  15. Li K, Anderson G, Viallon V, Arveux P, Kvaskoff M, Fournier A, et al. (2018). Risk prediction for estrogen receptor-specific breast cancers in two large prospective cohorts. Breast Cancer Res. 20(1):147. https://doi.org/10.1186/s13058-018-1073-0 PMID:30509329
  16. Assi N, Thomas DC, Leitzmann M, Stepien M, Chajès V, Philip T, et al. (2018). Are metabolic signatures mediating the relationship between lifestyle factors and hepatocellular carcinoma risk? Results from a nested case-control study in EPIC. Cancer Epidemiol Biomarkers Prev. 27(5):531–40. https://doi.org/10.1158/1055-9965.EPI-17-0649 PMID:29563134
  17. Pittavino M, Plummer M, Johansson M, Riboli E, Ferrari P (2022). A Bayesian hierarchical framework to integrate dietary exposure and biomarker measurements into aetiological models. Preprint. https://doi.org/10.1101/2022.03.24.22272838


Contact details/Working Group leader

Vivian Viallon, PhD
Team Leader, Biostatistics and Data Integration Team (BDI), Nutrition and Metabolism Branch (NME)
International Agency for Research on Cancer (IARC/WHO)
25 avenue Tony Garnier
CS 90627
69366 LYON CEDEX 07
France
ViallonV@iarc.who.int