
Filters SNPs from input PLINK files that fall in pseudo-autosomal regions (PARs), the X-transposed region (XTR), ampliconic regions, a given chromosome code, or user-defined regions. Only one of three filtering modes is applied per call: by user-defined region (regionfile), by chromosome code (filterCHR), or by any combination of filterPAR, filterXTR and filterAmpliconic.

Usage

FilterRegion(
  DataDir,
  ResultDir,
  finput,
  foutput,
  CHRX = TRUE,
  CHRY = FALSE,
  filterPAR = TRUE,
  filterXTR = TRUE,
  filterAmpliconic = TRUE,
  regionfile = FALSE,
  filterCHR = NULL,
  Hg = "19",
  exclude = TRUE
)

Arguments

DataDir

A character string for the file path of the input PLINK binary files.

ResultDir

A character string for the file path where all output files will be stored. The default is tempdir().

finput

Character string, specifying the prefix of the input PLINK binary files.

foutput

Character string, specifying the prefix of the output PLINK binary files if the filtering option for the SNPs is chosen. The default is "FALSE".

CHRX

Boolean value, TRUE or FALSE, specifying whether to filter/flag regions from chromosome X. The default is TRUE. Note: CHRX takes effect only if at least one of filterPAR, filterXTR or filterAmpliconic is in effect.

CHRY

Boolean value, TRUE or FALSE, specifying whether to filter/flag regions from chromosome Y. The default is FALSE. Note: CHRY takes effect only if at least one of filterPAR, filterXTR or filterAmpliconic is in effect.

filterPAR

Boolean value, TRUE or FALSE to filter out PARs from input PLINK file. The default is TRUE.

filterXTR

Boolean value, TRUE or FALSE to filter out XTRs from input PLINK file. The default is TRUE.

filterAmpliconic

Boolean value, TRUE or FALSE to filter out Ampliconic regions from input PLINK file. The default is TRUE.

regionfile

Character string, specifying the name of the .txt file, in BED format, that contains the user-defined regions to be filtered out from the input PLINK file. The default is FALSE. If regionfile is set (i.e. not FALSE), only this filtering will be in effect. In addition, PAR, XTR and Ampliconic SNPs from the X chromosome will be flagged and returned.

filterCHR

Vector of positive integers, specifying the chromosome code(s) used to filter/flag the SNPs. The default is NULL, meaning no filtering based on chromosome code. When filterCHR is set, the function considers only the chromosome code to filter or flag, and all other filtering is disabled. In addition, PAR, XTR and Ampliconic SNPs from the X chromosome will be flagged and returned.

Hg

Character value, "19" or "38", specifying which genome build to use for the PAR, XTR and Ampliconic regions. The default is Hg = "19".

exclude

Boolean value, TRUE or FALSE. If TRUE, the SNPs are both filtered and flagged; if FALSE, they are only flagged. The default is TRUE.
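
As a rough illustration of the regionfile argument, the sketch below writes a small BED-style text file with base R. The three-column, tab-separated, header-less layout (chromosome, start, end) follows the general BED convention; the exact columns and chromosome coding that FilterRegion() expects, the file name my_regions.txt, the directory it should live in, and the coordinates are all assumptions made for illustration, so check them against the package before use.

```r
# Hypothetical regions to filter; the coordinates are placeholders only.
# The chromosome label "X" is an assumption (a numeric code may be required).
regions <- data.frame(
    chrom = c("X", "X"),
    start = c(1234567L, 5432101L),
    end   = c(2345678L, 6543210L)
)

# Write a tab-separated file with no header and no row names.
region_path <- file.path(tempdir(), "my_regions.txt")
write.table(
    regions,
    file = region_path, sep = "\t",
    quote = FALSE, row.names = FALSE, col.names = FALSE
)

# Read the file back to confirm its layout.
check <- read.table(region_path, sep = "\t", header = FALSE)
print(check)
```

The file name would then be passed as regionfile = "my_regions.txt"; where the function looks for the file is not stated here and should be confirmed.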

Value

A list of three dataframes: PAR, containing SNPs from the PARs; XTR, containing SNPs from the XTR; and Ampliconic, containing SNPs from the ampliconic regions.

When filterCHR is set, a dataframe containing the excluded/flagged SNPs will be returned.

For exclude = TRUE, two sets of PLINK binary files will be produced in ResultDir: one with the SNPs remaining after filtering and the other with the discarded SNPs.
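
One possible way to summarise the returned list is to count the SNPs in each dataframe. The sketch below uses a toy stand-in rather than real output: the element names PAR, XTR and Ampliconic are taken from the description above, while the helper count_flagged(), the column name SNP and the SNP IDs are hypothetical.

```r
# Toy stand-in for the list returned by FilterRegion(); not real output.
toy_result <- list(
    PAR        = data.frame(SNP = character(0)),
    XTR        = data.frame(SNP = character(0)),
    Ampliconic = data.frame(SNP = c("snp_a", "snp_b", "snp_c"))
)

# Hypothetical helper: number of flagged SNPs per region type.
# A NULL element is counted as zero rows.
count_flagged <- function(res) {
    vapply(res, function(d) if (is.null(d)) 0L else nrow(d), integer(1))
}

counts <- count_flagged(toy_result)
print(counts)
```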

Author

Banabithi Bose

Examples

DataDir <- GXwasR:::GXwasR_data()
ResultDir <- tempdir()
finput <- "GXwasR_example"
foutput <- "PostimputeEX_QC1"
x <- FilterRegion(
    DataDir = DataDir, ResultDir = ResultDir,
    finput = finput, foutput = foutput, CHRX = TRUE, CHRY = FALSE,
    filterPAR = TRUE, filterXTR = TRUE, filterAmpliconic = TRUE,
    regionfile = FALSE, filterCHR = NULL, Hg = "38", exclude = TRUE
)
#>  chrX
#>  There is no PAR region in the input data. Argument filterPAR cannot set to be TRUE.
#>  Changing filterPAR to FALSE
#>  There is no XTR region in the input data. Argument filterXTR cannot set to be TRUE.
#>  Changing filterXTR to FALSE
#>  Ampliconic SNPs:9
#>  9 SNPs are discarded.
#>  PLINK files with passed SNPs are in /var/folders/d6/gtwl3_017sj4pp14fbfcbqjh0000gp/T//RtmpO7c0S8 prefixed as PostimputeEX_QC1
#>  PLINK files with discarded SNPs are in /var/folders/d6/gtwl3_017sj4pp14fbfcbqjh0000gp/T//RtmpO7c0S8 prefixed as PostimputeEX_QC1_snps_extracted
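
A flag-only run restricted to one chromosome code might look like the sketch below. It assumes that filterCHR accepts the PLINK numeric code 23 for chromosome X and that exclude = FALSE returns the flagged SNPs as described under Value; the output prefix "FlagChrX" is hypothetical, and the behaviour on the example data has not been verified here. The call is guarded so that the snippet does nothing when GXwasR is not installed.

```r
# Arguments for a flag-only run on a single chromosome code.
# 23 is the PLINK numeric code for chromosome X.
flag_args <- list(
    ResultDir = tempdir(),
    finput = "GXwasR_example",
    foutput = "FlagChrX", # hypothetical output prefix
    regionfile = FALSE,
    filterCHR = 23,
    Hg = "19",
    exclude = FALSE # flag only, do not filter
)

if (requireNamespace("GXwasR", quietly = TRUE)) {
    flag_args$DataDir <- GXwasR:::GXwasR_data()
    flagged <- do.call(GXwasR::FilterRegion, flag_args)
    # Assumed, per the Value section, to be a dataframe of flagged SNPs.
    str(flagged)
}
```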