EPIC working groups

The Statistical Working Group

The EPIC study offers information on a variety of exposures, including dietary, lifestyle, metabolomic, genetic, and other -omics data, assessed through questionnaire-based methods and laboratory analyses, together with information on the onset of multiple end-points: overall and cause-specific death and incident cancer, as well as incident cardiovascular diseases and diabetes. This makes the EPIC study an ideal setting to address a variety of topical epidemiological research questions while applying increasingly complex statistical modelling. In addition to supporting the analysis of epidemiological data with conventional statistical techniques, members of the Statistical Working Group have developed statistical methodology, with a main focus on the following areas:

  • Measurement errors in dietary exposure. The measurement error structure of self-reported dietary assessments was investigated through comparison with biomarkers of exposure. Calibration models were developed to correct diet-disease associations for measurement error in a multicentre setting.

  • Survival analysis. Although standard epidemiological studies focusing on time-to-event outcomes are still mostly concerned with the estimation of hazard ratios, other quantities (including attributable fractions and absolute risks) have been shown to be potentially more useful, e.g. from a public health perspective. Methods based on flexible parametric models, possibly accounting for competing risks and, more generally, set in the context of multistate models, were implemented and applied to the EPIC data.

  • Analysis of complex sets of data. Long-established multivariate techniques were adapted to identify unwanted sources of variability (e.g. technical variability) in metabolomic profile data. These techniques can be applied to similar large-scale data sets, such as miRNA, CpG island (epigenetic), and genomic data. In addition, L1-penalized methods dedicated to the analysis of large -omics data were developed and applied to the EPIC data.

  • Causal mediation analysis. Causal mediation analysis has been a very active area of research in the recent statistical literature. These tools have been applied to assess the biological pathways possibly underlying some well-established exposure-cancer relationships (such as obesity and breast cancer, or alcohol and hepatocellular carcinoma). New methods for mediation analysis were also developed, in particular for time-to-event outcomes, and further methodological work on the general topic of causal inference was conducted.

The EPIC Statistical Working Group meets to coordinate ongoing activities, stimulate collaborations among statisticians, promote statistical expertise within the EPIC network, and, where possible, seek funding opportunities. Collaborations with statistical experts worldwide have been established and are continually sought. This provides a stimulating environment for early career scientists to develop and implement innovative research ideas with real data.

Selected publications

  1. Ballout N, Garcia C & Viallon V. Sparse estimation for case-control studies with multiple subtypes of cases. Biostatistics. 2020; To appear.

  2. Freisling H, Viallon V, Lennon H, Bagnardi V, Ricci C, Butterworth AS, ..., Ferrari P. Lifestyle factors and risk of multimorbidity of cancer and cardiometabolic diseases: a multinational cohort study. BMC Med. 2020 Jan 10;18(1):5.

  3. Assi N, Rinaldi S, Viallon V, Dashti SG, Dossus L, Fournier A, ..., Ferrari P. Mediation analysis of the alcohol-postmenopausal breast cancer relationship by sex hormones in the EPIC cohort. Int J Cancer. 2020 Feb 1;146(3):759-768.

  4. Fasanelli F, Giraudo MT, Ricceri F, Valeri L & Zugna D. Marginal Time-Dependent Causal Effects in Mediation Analysis With Survival Data. American Journal of Epidemiology. 2019 May;188(5):967-974.

  5. Perrier F, Novoloaca A, Ambatipudi S, Baglietto L, Ghantous A, Perduca V, ..., Ferrari P. Identifying and correcting epigenetics measurements for systematic sources of variation. Clin Epigenetics. 2018 Mar 21;10:38.

  6. Assi N, Gunter MJ, Thomas DC, Leitzmann M, Stepien M, Chajès V, ..., Ferrari P. Metabolic signature of healthy lifestyle and its relation with risk of hepatocellular carcinoma in a large European cohort. Am J Clin Nutr. 2018 Jul 1;108(1):117-126.

  7. Li K, Anderson G, Viallon V, Arveux P, Kvaskoff M, Fournier A, ..., Ferrari P. Risk prediction for estrogen receptor-specific breast cancers in two large prospective cohorts. Breast Cancer Res. 2018 Dec 3;20(1):147.

  8. Assi N, Thomas DC, Leitzmann M, Stepien M, Chajès V, Philip T, ..., Ferrari P, Viallon V. Are Metabolic Signatures Mediating the Relationship between Lifestyle Factors and Hepatocellular Carcinoma Risk? Results from a Nested Case-Control Study in EPIC. Cancer Epidemiol Biomarkers Prev. 2018 May;27(5):531-540.

  9. Muller DC, Murphy N, Johansson M, Ferrari P, et al. Modifiable causes of premature death in middle-age in Western Europe: results from the EPIC cohort study. BMC Med. 2016;14(1):87.

  10. Assi N, Fages A, Vineis P, Chadeau-Hyam M, Stepien M, Duarte-Salles T, ..., Viallon V, Ferrari P. A statistical framework to model the meeting-in-the-middle principle using metabolomic data: application to hepatocellular carcinoma in the EPIC study. Mutagenesis. 2015 Nov;30(6):743-53.

  11. Sera F & Ferrari P. A multilevel model to estimate the within- and the between-center components of the exposure/disease association in the EPIC study. PLoS One. 2015 Mar 18;10(3):e0117815.

  12. Fages A, Ferrari P, Monni S, Dossus L, Floegel A, Mode N, et al. Investigating sources of variability in metabolomic data in the EPIC study: the Principal Component Partial R-square (PC-PR2) method. Metabolomics. 2014 Dec;10(6):1074-1083.


  1. Etievant L & Viallon V. Which practical interventions does the do-operator refer to in causal inference? Illustration on the example of obesity and cancer. Submitted. Preprint available at arxiv.org/abs/1901.00772

  2. Etievant L & Viallon V. Causal inference under over-simplified longitudinal causal models. Submitted. Preprint available at arxiv.org/abs/1810.01294

Contact details/Working Group leaders

Vivian Viallon, PhD
Nutritional Methodology and Biostatistics Group (NMB)
International Agency for Research on Cancer (IARC/WHO)
150 cours Albert Thomas, 69008 Lyon, France

Pietro Ferrari, PhD
Nutritional Methodology and Biostatistics Group (NMB)
International Agency for Research on Cancer (IARC/WHO)
150 cours Albert Thomas, 69008 Lyon, France