SumstatGenCorr: Genetic Correlation Calculation from GWAS Summary Statistics
Source: R/GXwasR_main_functions.R
SumstatGenCorr.Rd
This function calculates the genetic correlation between two traits from their GWAS summary statistics, using a specified reference linkage disequilibrium (LD) matrix from the UK Biobank.
Arguments
- ResultDir
Directory where results should be saved.
- referenceLD
Reference LD matrix identifier. The reference panels are LD matrices and their eigen-decompositions computed from 335,265 genomic British UK Biobank individuals. Two sets of reference panels are provided:
307,519 QCed UK Biobank Axiom Array SNPs. The size is about 7.5 GB after unzipping.
1,029,876 QCed UK Biobank imputed SNPs. The size is about 31 GB after unzipping. Although it takes more time, the imputed panel provides more accurate estimates of genetic correlations. Therefore, if the GWAS includes most of the HapMap3 SNPs, using the imputed reference panel is recommended.
- sumstat1
Data frame for the first set of summary statistics. It should include the following columns: SNP, SNP ID; A1, effect allele; A2, reference allele; N, sample size; Z, z-score. If Z is not given, you may instead provide: b, estimate of the marginal effect in the GWAS; se, standard error of the marginal effect estimate.
- sumstat2
Data frame for the second set of summary statistics. It should include the following columns: SNP, SNP ID; A1, effect allele; A2, reference allele; N, sample size; Z, z-score. If Z is not given, you may instead provide: b, estimate of the marginal effect in the GWAS; se, standard error of the marginal effect estimate.
- Nref
Sample size of the reference sample in which LD is computed. If the default UK Biobank reference sample is used, Nref = 335265.
- N0
Number of individuals included in both cohorts. The estimated genetic correlation is usually robust against misspecified N0. If not given, the default value is set to the minimum sample size across all SNPs in cohort 1 and cohort 2.
- eigen.cut
Which eigenvalues and eigenvectors in each LD score matrix should be used for HDL. Users may specify a numeric value between 0 and 1 for eigen.cut. For example, eigen.cut = 0.99 means using the leading eigenvalues that explain 99% of the variance, together with their corresponding eigenvectors. If the default 'automatic' is used, the eigen.cut that gives the most stable heritability estimates will be used.
- lim
Tolerance limit. The default is lim = exp(-18).
- parallel
Boolean value (TRUE or FALSE) indicating whether to perform parallel computation. The default is FALSE.
- numCores
The number of cores to be used. The default is 2.
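The column requirements for sumstat1 and sumstat2 can be illustrated with a small hand-built data frame. This is only a sketch: the SNP IDs, alleles, effect sizes, and sample sizes are invented for illustration, and the z-score is derived with the standard relation Z = b / se for the case where only b and se are available.

```r
# Hypothetical summary statistics with b and se but no Z column.
# All values below are invented for illustration only.
sumstat <- data.frame(
  SNP = c("rs0000001", "rs0000002", "rs0000003"),
  A1  = c("A", "C", "G"),
  A2  = c("G", "T", "A"),
  N   = c(50000, 50000, 50000),
  b   = c(0.020, -0.015, 0.000),
  se  = c(0.010, 0.005, 0.010),
  stringsAsFactors = FALSE
)

# Standard z-score: marginal effect estimate divided by its standard error.
sumstat$Z <- sumstat$b / sumstat$se

# Check that the columns named in the argument description are present.
required <- c("SNP", "A1", "A2", "N", "Z")
stopifnot(all(required %in% names(sumstat)))
```

Deriving Z up front makes the same data frame usable regardless of which of the two documented input forms the original GWAS output followed.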
Value
A list is returned with:
- rg
The estimated genetic correlation.
- rg.se
The standard error of the estimated genetic correlation.
- P
P-value based on Wald test.
- estimates.df
A detailed matrix that includes the estimates and standard errors of heritabilities, genetic covariance and genetic correlation.
- eigen.use
The eigen.cut used in computation.
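To show how rg, rg.se, and P relate, the sketch below computes a Wald-test P-value from an estimate and its standard error. It assumes the usual two-sided Wald form; the numbers are invented, and the exact computation inside SumstatGenCorr is not confirmed here.

```r
# Hypothetical estimate and standard error, invented for illustration.
rg    <- 0.30
rg_se <- 0.10

# Two-sided Wald test: compare rg / se against a standard normal.
wald_z <- rg / rg_se
p_wald <- 2 * pnorm(-abs(wald_z))

# Equivalent chi-square form with 1 degree of freedom.
p_chisq <- pchisq(wald_z^2, df = 1, lower.tail = FALSE)
stopifnot(isTRUE(all.equal(p_wald, p_chisq)))

print(p_wald)
```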
Details
This function requires access to the reference LD data via an environment variable. You must set one of the following environment variables to the appropriate directory:
- UKB_ARRAY_PATH for the Axiom Array reference (UKB_array_SVD_eigen90_extraction)
- UKB_IMPUTED_PATH for the full imputed reference (UKB_imputed_SVD_eigen99_extraction)
- UKB_IMPUTED_HAPMAP2_PATH for the imputed HapMap2 subset (UKB_imputed_hapmap2_SVD_eigen99_extraction)
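A minimal sketch of setting one of these variables for the current R session and checking which are defined is given below. The directory used here is a placeholder under tempdir(), not a real panel location; substitute the directory where the reference panel was unpacked.

```r
# Placeholder directory (an assumption for illustration); replace with the
# directory that holds the unpacked reference panel on your system.
ld_dir <- file.path(tempdir(), "ld_reference")

# Set the variable for the current R session only.
Sys.setenv(UKB_IMPUTED_HAPMAP2_PATH = ld_dir)

# Report which of the three documented variables are currently set.
ld_vars <- c("UKB_ARRAY_PATH", "UKB_IMPUTED_PATH", "UKB_IMPUTED_HAPMAP2_PATH")
ld_set <- nzchar(Sys.getenv(ld_vars))
names(ld_set) <- ld_vars
print(ld_set)
```

To persist the setting across sessions, the same assignment can be placed in an .Renviron file instead of calling Sys.setenv() each time.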
References
Ning Z, Pawitan Y, Shen X (2020). “High-definition likelihood inference of genetic correlations across human complex traits.” Nature Genetics, 52, 859–864. doi:10.1038/s41588-020-0676-4.
Examples
sumstat1 <- GXwasR:::simulateSumstats()
sumstat2 <- GXwasR:::simulateSumstats()
if (nzchar(Sys.getenv("UKB_IMPUTED_HAPMAP2_PATH"))) {
    res <- SumstatGenCorr(
        ResultDir = tempdir(),
        referenceLD = "UKB_imputed_hapmap2_SVD_eigen99_extraction",
        sumstat1 = sumstat1,
        sumstat2 = sumstat2,
        parallel = TRUE
    )
}
#> Analysis starts on Fri Aug 8 10:45:25 2025
#> ℹ 9276 out of 769306 (1.21%) SNPs in reference panel are available in GWAS 1.
#> ℹ 9458 out of 769306 (1.23%) SNPs in reference panel are available in GWAS 2.
#> ! Warning: More than 1% SNPs in reference panel are missed in GWAS 1. This may generate bias in estimation. Please make sure that you are using the correct reference panel.
#> ! Warning: More than 1% SNPs in reference panel are missed in GWAS 2. This may generate bias in estimation. Please make sure that you are using the correct reference panel.
#>
|======================================================================| 100%
#>
#>
#> Integrating piecewise results
#> Point estimates:
#> • Heritability of phenotype 1: 0.00e+00
#> • Heritability of phenotype 2: 0.00e+00
#> • Genetic Covariance: -1.46e-05
#> • Genetic Correlation: -Inf
#> ! Warning: Heritability of one trait was estimated to be 0, which may be due to:
#> 1) The true heritability is very small;
#> 2) The sample size is too small;
#> 3) Many SNPs in the chosen reference panel are missing in the GWAS;
#> 4) There is a severe mismatch between the GWAS population and the population for computing the reference panel
#> ℹ Continuing computing standard error with jackknife
#>
|= | 2%
#>
#> • Heritability of phenotype 1: 0.00e+00 ( 0.00e+00 )
#> • Heritability of phenotype 2: 0.00e+00 ( 0.00e+00 )
#> • Genetic Covariance: -1.46e-05 ( 0.00e+00 )
#> • Genetic Correlation: -Inf ( NA )
#> • P: NA
#> ! Warning: Heritability of one trait was estimated to be 0, which may be due to:
#> 1) The true heritability is very small;
#> 2) The sample size is too small;
#> 3) Many SNPs in the chosen reference panel are missing in the GWAS.
#>
#> Analysis finished at Fri Aug 8 10:45:58 2025
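The -Inf genetic correlation in the output above follows from the standard definition of genetic correlation as genetic covariance divided by the square root of the product of the two heritabilities. The sketch below reproduces that arithmetic with the printed point estimates and adds a simple check before using the result. Variable names are illustrative, and it is an assumption that the function applies exactly this formula internally.

```r
# Point estimates copied from the printed output above.
h2_1   <- 0.00e+00
h2_2   <- 0.00e+00
gencov <- -1.46e-05

# Standard definition: rg = genetic covariance / sqrt(h2_1 * h2_2).
# With both heritabilities at 0, a negative covariance is divided by zero.
rg <- gencov / sqrt(h2_1 * h2_2)
print(rg)

# Guard before interpreting or reporting the estimate.
rg_usable <- is.finite(rg)
if (!rg_usable) {
  message("Genetic correlation is not estimable for this input.")
}
```

Checking is.finite() before downstream use avoids propagating Inf or NA values into tables or plots.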