x statistic (73) by recomputing the statistic for random sets of SNPs in matched 5% derived allele frequency bins (polarized using the chimpanzee reference genome panTro2). For each bootstrap replicate, we kept the original effect sizes but replaced the frequencies of each SNP with those of a SNP randomly sampled from the same bin. Unlike the PRS calculations, we ignored missing data, since the Qx statistic uses only the population-level estimated allele frequencies and not individual-level data. We tested a series of nested sets of SNPs (x axis in Fig. 5), adding SNPs in 100-SNP batches, ordered by increasing P value, down to a P value of 0.1.
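The frequency-matched resampling described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the data layout (tuples of effect size, derived allele frequency, and per-population frequencies), and the empirical P value formula are assumptions, and `score_variance` is a simple stand-in statistic rather than the Qx statistic itself.

```python
import random
from collections import defaultdict


def daf_bin(daf, width=0.05):
    """Return the index of the derived-allele-frequency bin (0..19 for 5% bins)."""
    n_bins = round(1 / width)
    return min(int(daf * n_bins), n_bins - 1)


def score_variance(effects, freqs):
    """Stand-in statistic (NOT Qx): variance across populations of sum(effect * freq)."""
    n_pops = len(freqs[0])
    scores = [sum(b * f[k] for b, f in zip(effects, freqs)) for k in range(n_pops)]
    mean = sum(scores) / n_pops
    return sum((s - mean) ** 2 for s in scores) / n_pops


def matched_null(test_snps, background, statistic, n_reps=1000, width=0.05, seed=0):
    """Recompute `statistic` with frequencies drawn from frequency-matched SNPs.

    test_snps:  list of (effect, daf, pop_freqs) for the SNPs being tested
    background: list of (daf, pop_freqs) for the pool of SNPs to sample from
    Effect sizes are kept; only the population frequencies are replaced.
    Raises IndexError if a test SNP falls in a bin with no background SNPs.
    """
    rng = random.Random(seed)
    by_bin = defaultdict(list)
    for daf, pop_freqs in background:
        by_bin[daf_bin(daf, width)].append(pop_freqs)

    effects = [effect for effect, _, _ in test_snps]
    observed = statistic(effects, [pf for _, _, pf in test_snps])

    null = []
    for _ in range(n_reps):
        sampled = [rng.choice(by_bin[daf_bin(daf, width)]) for _, daf, _ in test_snps]
        null.append(statistic(effects, sampled))

    # Assumed one-sided empirical P value with the usual +1 correction.
    p_value = (1 + sum(s >= observed for s in null)) / (n_reps + 1)
    return observed, null, p_value
```

To mirror the nested SNP sets, one could call `matched_null` repeatedly on the first 100, 200, 300, ... SNPs of a list sorted by increasing P value.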
Artificial GWAS Investigation.
We simulated GWAS, generating causal effects at a subset of around 159,385 SNPs in the intersection of SNPs that passed QC in the UK Biobank GWAS, are part of the 1240k capture, and are in the POBI dataset (84). We assumed that the variance of the effect size of an allele of frequency f was proportional to [f(1 − f)]^α, where the parameter α measures the relationship between frequency and effect size (85). We performed 100 simulations with α = −1 (the most commonly used model, where each SNP explains the same proportion of phenotypic variance) and 100 with α = −0.45 as estimated for height (85).
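As an illustration of the frequency-dependent effect size model, the sketch below draws causal effects whose variance is proportional to f(1 − f) raised to an exponent (called `alpha` here). The normal distribution, the `scale` constant, the number of causal SNPs, and all function names are assumptions made for the example; the text does not specify them.

```python
import random


def effect_variance(f, alpha, scale=1.0):
    """Effect-size variance for an allele of frequency f: scale * [f(1 - f)]**alpha."""
    return scale * (f * (1.0 - f)) ** alpha


def simulate_effects(freqs, alpha, n_causal, scale=1.0, seed=0):
    """Draw effects at `n_causal` randomly chosen SNPs; all other SNPs get 0.

    freqs must lie strictly between 0 and 1, because a negative alpha would
    otherwise divide by zero. Effects are assumed normal with mean 0.
    Returns (effects, sorted list of causal SNP indices).
    """
    rng = random.Random(seed)
    causal = set(rng.sample(range(len(freqs)), n_causal))
    effects = []
    for i, f in enumerate(freqs):
        if i in causal:
            sd = effect_variance(f, alpha, scale) ** 0.5
            effects.append(rng.gauss(0.0, sd))
        else:
            effects.append(0.0)
    return effects, sorted(causal)


# With alpha = -1 the expected variance explained per SNP, 2f(1 - f) * Var(effect),
# is the same for every frequency; with alpha = -0.45 it grows with f(1 - f).
freqs = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5]
effects, causal = simulate_effects(freqs, alpha=-1.0, n_causal=3, seed=42)
```

Under these assumptions, rare alleles receive larger effects on average when the exponent is −1 than when it is −0.45, which is the contrast the two sets of 100 simulations explore.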