Welcome to the Han100K Project!
Han100K aims to:
1) facilitate understanding the population structure and history of Han Chinese;
2) screen ancestry informative markers (AIMs) panels for detecting and controlling population stratification in medical and evolutionary studies;
3) create a shared control panel for genotype-phenotype association studies (e.g., GWAS);
4) provide a Han-Chinese-specific reference panel for genotype imputation.
The computational tools are implemented in PGG.Han, and an online user interface is provided for data analysis and visualization of results.
Fine-scale population structure
Han Chinese individuals formed a cluster distinct from the surrounding groups, including minority groups in China and populations in neighboring countries, suggesting that Han Chinese share a common identity in terms of overall genetic make-up. Within Han Chinese, six sub-groups can be distinguished: North, Northeast, Central, South, Southwest, and Southeast. Although connections among the sub-groups are evident, northern Han Chinese have been influenced more by northern Chinese minorities, and southern Han Chinese by their southern neighbors.
A shared control panel for GWAS
A shared control panel, like a reference panel, facilitates complex disease mapping in population-based association studies. Such resources are well established and have demonstrated their power for populations of European ancestry, but are lacking for Han Chinese. Here, taking the sub-population structure into account, we constructed several structure-aware control sets for further population genetic analyses and association studies.
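As an illustration of what "structure-aware" control selection can mean in practice, the sketch below draws controls so that each sub-group is represented in proportion to the cases. It is a minimal example under stated assumptions, not the procedure used by PGG.Han: the function name `select_matched_controls`, the input layout (a label per case and a dictionary of control IDs per sub-group), and the sample IDs are all hypothetical.

```python
import random
from collections import Counter


def select_matched_controls(case_labels, control_pool, per_case=1, seed=0):
    """Pick controls whose sub-group composition mirrors that of the cases.

    case_labels  -- list of sub-group labels, one per case (e.g. "North")
    control_pool -- dict mapping sub-group label -> list of control sample IDs
    per_case     -- number of controls to draw for each case
    All names and the data layout are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    needed = Counter(case_labels)
    selected = []
    for group, n_cases in sorted(needed.items()):
        available = control_pool.get(group, [])
        want = n_cases * per_case
        if want > len(available):
            raise ValueError(
                f"need {want} controls from {group!r}, only {len(available)} available"
            )
        # Sample without replacement within the matching sub-group only.
        selected.extend(rng.sample(available, want))
    return selected
```

For example, two cases labeled "North" and one labeled "South" would yield two controls drawn from the North pool and one from the South pool, leaving the other sub-groups untouched.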
Ancestry Informative Markers (AIMs)
We screened nested AIMs panels for detecting population structure and controlling population stratification to improve association testing and population genetic analysis. Our analysis shows that the AIMs panel has sufficient power to discern and control for population stratification in Han Chinese, which could significantly reduce false-positive rates in both genome-wide association studies (GWAS) and candidate gene association studies (CGAS). We suggest that this AIMs panel be genotyped and used to control for and correct population stratification in the study design or data analysis of future association studies, especially in CGAS, which is the most popular approach for validating previously reported genetic associations with disease in the post-GWAS era.
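The text above does not state how the AIMs were screened, so the following is only a generic sketch of one simple, commonly used idea: rank markers by the absolute allele-frequency difference between two groups, then take the top-ranked markers at several sizes so that each smaller panel is contained in the larger ones. The function names, marker IDs, and frequencies are hypothetical and are not taken from the Han100K data.

```python
def rank_markers_by_delta(freq_a, freq_b):
    """Rank markers by absolute allele-frequency difference between two groups.

    freq_a, freq_b -- dicts mapping marker ID -> allele frequency in each group.
    Only markers present in both groups are ranked; ties are broken by ID.
    This is an illustrative informativeness measure, not the project's method.
    """
    shared = freq_a.keys() & freq_b.keys()
    return sorted(shared, key=lambda m: (-abs(freq_a[m] - freq_b[m]), m))


def nested_panels(ranked, sizes):
    """Build nested panels: the top-k markers for each requested size k."""
    return {k: ranked[:k] for k in sorted(sizes)}


# Hypothetical allele frequencies for three made-up markers.
north = {"snp_a": 0.10, "snp_b": 0.50, "snp_c": 0.80}
south = {"snp_a": 0.15, "snp_b": 0.90, "snp_c": 0.50}

ranked = rank_markers_by_delta(north, south)
panels = nested_panels(ranked, sizes=[1, 2])
```

Because every panel is a prefix of the same ranking, a study that genotyped the smallest panel can later be extended to a larger one without discarding markers.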
Han-Chinese-Specific Reference Panel for Genotype Imputation
A reference panel facilitates complex disease mapping in population-based association studies. Such panels are well established and have demonstrated their power for populations of European ancestry, but one is lacking for Han Chinese, the largest ethnic group in East Asia and the world. We developed a Han-Chinese-specific reference panel and an online server for genotype imputation to facilitate further association studies. Notably, we provide a population-structure-aware reference panel: users can customize the imputation reference by selecting samples from particular sub-populations to account for population stratification.