IARC 60th Anniversary - 19-21 May 2026

Session: Mutational Epidemiology for Cancer Prevention

The Confluence Project: Largest multi-ancestry GWAS of breast cancer identifies 469 susceptibility loci in over two million participants

GARCIA-CLOSAS M. 1

1 Institute of Cancer Research, Sutton, United Kingdom

Presented on behalf of The Confluence Project - National Cancer Institute, African-ancestry Breast Cancer Genetic Consortium, Breast Cancer Association Consortium, Consortium of Investigators of Modifiers of BRCA1/2, Latin America Genomics of Breast Cancer Consortium, Male Breast Cancer Genetics Consortium

Background: Genome-wide association studies (GWAS) of breast cancer have identified over 230 susceptibility loci, yet much of its heritability remains unexplained. Moreover, previous large GWAS included mostly European-ancestry samples, restricting the scope of genetic variation studied and limiting the generalizability of polygenic risk scores trained in these studies.

Objectives: The Confluence Project is a large international collaboration of breast cancer consortia (including consortia of BRCA1/2 mutation carriers), biobanks, and individual studies that investigates breast cancer in both females and males. With over 300 studies from 61 countries, Confluence includes over 400,000 breast cancer cases and 1,600,000 controls, nearly tripling the effective sample size of previous GWAS and substantially increasing geographical and ancestral diversity (approximate number of cases / controls by genetic ancestry group: African (AFR) = 24,000 / 67,500; East Asian (EAS) = 49,000 / 429,000; Admixed American (AMR) = 28,500 / 78,000). Confluence is designed as a resource that contributors and the wider scientific community can use to address a range of questions in breast cancer genetics.

Methods: We conducted a multi-ancestry GWAS of overall breast cancer risk (47,790,460 variants analyzed, minimum minor allele frequency = 0.03%). Individual-level genotyping data were available for 322,332 cases and 269,357 controls, including 53,058 BRCA1/2 carriers, 24,883 of whom were breast cancer cases. These data were processed through a harmonized quality-control pipeline and imputed (by array) using the TOPMed reference panel. GWAS were performed separately by array using REGENIE, adjusting for the first ten principal components. The resulting summary statistics were combined with corresponding statistics from 18 external biobanks (101,579 cases and 1,373,259 controls) through fixed-effect meta-analysis. Signals were declared novel if located more than ±1 Mb from any known locus identified in previously published GWAS.
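
As an illustration of the last two steps, the following is a minimal Python sketch of an inverse-variance-weighted fixed-effect meta-analysis for a single variant, together with a ±1 Mb novelty check. It is not the Confluence pipeline: the function names, the representation of a known locus as a single (chromosome, position) point, and the example inputs are all assumptions made for illustration.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-study log-odds ratios for one variant.

    Uses standard inverse-variance weighting (assumed here; the abstract
    only states that a fixed-effect meta-analysis was used).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


def is_novel(chrom, pos, known_loci, window=1_000_000):
    """Return True if (chrom, pos) lies more than `window` bp from every known locus.

    `known_loci` is a hypothetical list of (chromosome, position) pairs.
    """
    return all(
        k_chrom != chrom or abs(pos - k_pos) > window
        for k_chrom, k_pos in known_loci
    )


if __name__ == "__main__":
    # Hypothetical per-array estimates for one variant.
    beta, se, z, p = fixed_effect_meta([0.10, 0.10], [0.02, 0.02])
    print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
    print(is_novel("1", 6_000_001, [("1", 5_000_000)]))
```

In practice the combination would be run per variant across all arrays and biobanks by dedicated meta-analysis software; the sketch only shows the arithmetic behind a single combined estimate.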

Results: We identified 469 independent genome-wide significant loci (P < 5×10⁻⁸), of which 249 (53%) were novel. Among the 240 loci previously reported, 220 (92%) were associated at P < 5×10⁻⁸. Assuming common effect sizes across populations, while accounting for ancestry-specific allele frequency differences, the 249 newly discovered loci increased the logit-scale variance explained, relative to previously known loci, by between approximately 29% (in AFR) and 36% (in EAS).
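
The abstract does not give the formula behind the variance comparison. One common way to compute such a quantity, assumed here rather than taken from the study, is to sum 2p(1 − p)β² over loci, where p is the ancestry-specific effect allele frequency and β the shared per-allele log-odds ratio, under Hardy-Weinberg equilibrium and an additive model. The sketch below uses hypothetical function names and made-up inputs.

```python
def logit_variance(loci):
    """Logit-scale variance explained by a set of loci.

    `loci` is a list of (allele_frequency, log_odds_ratio) pairs.
    Assumes an additive model and Hardy-Weinberg equilibrium, so each
    locus contributes 2 * p * (1 - p) * beta**2.
    """
    return sum(2.0 * p * (1.0 - p) * beta ** 2 for p, beta in loci)


def relative_increase(known, novel):
    """Variance added by novel loci as a fraction of that from known loci."""
    return logit_variance(novel) / logit_variance(known)


if __name__ == "__main__":
    # Made-up values: the same effect sizes paired with two different
    # sets of ancestry-specific allele frequencies.
    known_a, novel_a = [(0.50, 0.10)], [(0.10, 0.05)]
    known_b, novel_b = [(0.30, 0.10)], [(0.25, 0.05)]
    print(relative_increase(known_a, novel_a))
    print(relative_increase(known_b, novel_b))
```

Because the effect sizes are held fixed, any difference between ancestry groups in this calculation comes only from allele frequencies, which is the sense in which a range across groups can arise.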

Conclusions/Implications for practice or policy: This study represents the largest and most ancestrally diverse GWAS of breast cancer. The novel loci identify new candidate genes, substantially increase the proportion of heritability explained, and lay the groundwork for an expanded understanding of breast cancer biology. Ongoing analyses will extend these findings to sex-, ancestry-, and subtype-specific risk, including risk in BRCA1/2 carriers, and to polygenic risk prediction. Work presented at the conference will include novel findings on improvements in risk stratification across ancestry groups. Improved polygenic risk scores generated by Confluence that generalize better across global populations will enable more accurate and equitable risk stratification and prevention strategies.