Detects top discriminators that contribute to group separation based on the fixation index (Fst).
top.discriminator( cluster.obj = NULL, group1, group2, bim.file, use.node.number = FALSE, num.top = 100, percentile = 0.9, use.percentile = FALSE, use.path = FALSE, result.path = NULL )
cluster.obj | The object which is returned from |
---|---|
group1 | To specify the first group number to be compared. (also see
|
group2 | To specify the second group number to be compared. (also see
|
bim.file | Option: In case that SNP information is not provided to
|
use.node.number | To specify whether a group number or a node number is be used. If TRUE, a node nubmer is used instead. Default = FALSE. |
num.top | A number of top Fst SNPs to be returned. This parameter is
used when |
percentile | A percentile for top Fst SNPs to be returned. This parameter
is used when |
use.percentile | A logical value to indicate whether |
use.path | A logical value to indicate whether |
result.path | A path to an result directory of IPCAPS. This parameter is
used when |
The returned value is a data.frame of SNP information sorting by Fst in descending order, which contains 7 columns, chr, SNP, centimorgans, position, allele1, allele2, and Fst. The column 1-6 are SNP information from the bim file. The column Fst contains estimated Fst between group1 and group2.
# Importantly, bed file, bim file, and fam file are required # Use the example files embedded in the package BED.file <- system.file("extdata","ipcaps_example.bed",package="IPCAPS") LABEL.file <- system.file("extdata","ipcaps_example_individuals.txt.gz",package="IPCAPS") my.cluster <- ipcaps(bed=BED.file,label.file=LABEL.file,lab.col=2,out=tempdir())#> Running ... IPCAPS #> output: /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk #> label file: /private/var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T/RtmpYTlCbr/temp_libpath1351142c69ef5/IPCAPS/extdata/ipcaps_example_individuals.txt.gz #> label column: 2 #> threshold: 0.18 #> minimum Fst: 8e-04 #> maximum thread: 1 #> method: mix #> bed: /private/var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T/RtmpYTlCbr/temp_libpath1351142c69ef5/IPCAPS/extdata/ipcaps_example.bed #> minimum in group: 20 #> data type: snp #> missing: NA #> In preprocessing step #> Note: the directory '/var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk' is existed. #> The result files will be saved to this directory: /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8 #> Input data: 103 individuals, 2000 markers #> Start calculating #> Node 1: Start the process #> Node 1: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 1: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/condition.RData #> Node 1: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/rawdata.RData #> Node 1: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node1.RData #> Node 1: Check for splitting #> Node 1: Reducing matrix #> Node 1: EigenFit = 0.652486925995028, Threshold = 0.18, no. significant PCs = 3 #> Node 1: Start clustering #> Node 1: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node1.RData #> Node 1: done for clustering #> Time difference of 0.1103411 secs #> Node 1: Checking Fst of group 1 and group 2 #> Node 1: Checking Fst of group 1 and group 3 #> Node 1: Checking Fst of group 1 and group 4 #> Node 1: Checking Fst of group 2 and group 3 #> Node 1: Checking Fst of group 2 and group 4 #> Node 1: Checking Fst of group 3 and group 4 #> Node 1 times took 0.34123706817627 secs #> Node 1: 103 individuals were splitted into: 4 groups #> Node 1: Return status 0 #> Node 1: Split to sub-nodes #> Node 1: Saving /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node2.RData #> Node 1: Saving /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node3.RData #> Node 1: Saving /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node4.RData #> Node 1: Saving /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node5.RData #> Node 1: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 1: Updating /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 1: Done! #> Time difference of 0.484498 secs #> Node 2: Start the process #> Node 2: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 2: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/condition.RData #> Node 2: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/rawdata.RData #> Node 2: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node2.RData #> Node 2: Check for splitting #> Node 2: A number of node is lower than the minimum number (20), therefore split was not performed #> Node 2: Return status 1 #> Node 2: No split was performed, Status=1 #> Node 2: Updating /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/leafnode.RData #> Node 2: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 2: Updating /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 2: Done! #> Time difference of 0.03994012 secs #> Node 3: Start the process #> Node 3: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 3: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/condition.RData #> Node 3: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/rawdata.RData #> Node 3: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node3.RData #> Node 3: Check for splitting #> Node 3: A number of node is lower than the minimum number (20), therefore split was not performed #> Node 3: Return status 1 #> Node 3: No split was performed, Status=1 #> Node 3: Updating /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/leafnode.RData #> Node 3: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 3: Updating /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 3: Done! #> Time difference of 0.03689599 secs #> Node 4: Start the process #> Node 4: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 4: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/condition.RData #> Node 4: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/rawdata.RData #> Node 4: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node4.RData #> Node 4: Check for splitting #> Node 4: A number of node is lower than the minimum number (20), therefore split was not performed #> Node 4: Return status 1 #> Node 4: No split was performed, Status=1 #> Node 4: Updating /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/leafnode.RData #> Node 4: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 4: Updating /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 4: Done! #> Time difference of 0.03421998 secs #> Node 5: Start the process #> Node 5: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 5: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/condition.RData #> Node 5: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/rawdata.RData #> Node 5: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node5.RData #> Node 5: Check for splitting #> Node 5: Reducing matrix #> Node 5: EigenFit = 0.343228072167861, Threshold = 0.18, no. significant PCs = 3 #> Node 5: Start clustering #> Node 5: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node5.RData #> Node 5: done for clustering #> Time difference of 0.09402514 secs #> Node 5: Checking Fst of group 1 and group 2 #> Node 5 times took 0.126171827316284 secs #> Node 5: 100 individuals were splitted into: 2 groups #> Node 5: Return status 0 #> Node 5: Split to sub-nodes #> Node 5: Saving /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node6.RData #> Node 5: Saving /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node7.RData #> Node 5: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 5: Updating /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 5: Done! #> Time difference of 0.2726109 secs #> Node 6: Start the process #> Node 6: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 6: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/condition.RData #> Node 6: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/rawdata.RData #> Node 6: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node6.RData #> Node 6: Check for splitting #> Node 6: Reducing matrix #> Node 6: EigenFit = 0.0367738545252401, Threshold = 0.18, no. significant PCs = #> Node 6: No split was performed because EigenFit is lower than threshold #> Node 6: Return status 1 #> Node 6: No split was performed, Status=1 #> Node 6: Updating /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/leafnode.RData #> Node 6: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 6: Updating /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 6: Done! #> Time difference of 0.1290462 secs #> Node 7: Start the process #> Node 7: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 7: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/condition.RData #> Node 7: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/rawdata.RData #> Node 7: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/node7.RData #> Node 7: Check for splitting #> Node 7: Reducing matrix #> Node 7: EigenFit = 0.0221170820186458, Threshold = 0.18, no. significant PCs = #> Node 7: No split was performed because EigenFit is lower than threshold #> Node 7: Return status 1 #> Node 7: No split was performed, Status=1 #> Node 7: Updating /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/leafnode.RData #> Node 7: Loading /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 7: Updating /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/RData/tree.RData #> Node 7: Done! #> Time difference of 0.1021471 secs #> In post process step #> Exporting node 2 as group 1 #> Exporting node 3 as group 2 #> Exporting node 4 as group 3 #> Exporting node 6 as group 4 #> Exporting node 7 as group 5 #> Note: save as /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8/groups.txt#> The result files were saved at: /var/folders/sp/hhmj9xvx53z4g4dktf5f503r0000gp/T//Rtmpvv9uwk/cluster_output8 #> Total runtime is 2.11641001701355 sec#> #> 1 2 3 4 5 #> outlier3 1 1 1 0 0 #> pop1 0 0 0 0 50 #> pop2 0 0 0 50 0# 1 2 3 4 5 6 # outlier4 5 4 1 0 0 0 # pop1 0 0 0 0 250 0 # pop2 0 0 0 0 0 250 # pop3 0 0 0 250 0 0 #Identify top discriminators between groups, for example, group 4 and group 5 top.snp <-top.discriminator(my.cluster,4,5) #or, specify the bim file #top.snp <-top.discriminator(my.cluster,4,5,bim.file="ipcaps_example.bim") head(top.snp)#> chr SNP centimorgans position allele1 allele2 Fst #> V1628 1 marker1628 0 16280000 A T 0.2285229 #> V1843 1 marker1843 0 18430000 A T 0.2203918 #> V111 1 marker111 0 1110000 A T 0.2129322 #> V1817 1 marker1817 0 18170000 A T 0.1669118 #> V1350 1 marker1350 0 13500000 A T 0.1665826 #> V1974 1 marker1974 0 19740000 A T 0.1562058# chr SNP centimorgans position allele1 allele2 Fst #V5452 1 marker5452 0 54520000 A T 0.11337260 #V2348 1 marker2348 0 23480000 A T 0.11194490 #V8244 1 marker8244 0 82440000 A T 0.09556580 #V5972 1 marker5972 0 59720000 A T 0.08747794 #V3561 1 marker3561 0 35610000 A T 0.08725860 #V8419 1 marker8419 0 84190000 A T 0.08293494 #Alternatively, specify the previous result directory of IPCAPS and identify #top discriminators between groups, for example, group 4 and group 5 previous.res.path <- my.cluster$output.dir top.snp <-top.discriminator(result.path = previous.res.path, use.path = TRUE, group1 = 4, group2 = 5) head(top.snp)#> chr SNP centimorgans position allele1 allele2 Fst #> V1628 1 marker1628 0 16280000 A T 0.2285229 #> V1843 1 marker1843 0 18430000 A T 0.2203918 #> V111 1 marker111 0 1110000 A T 0.2129322 #> V1817 1 marker1817 0 18170000 A T 0.1669118 #> V1350 1 marker1350 0 13500000 A T 0.1665826 #> V1974 1 marker1974 0 19740000 A T 0.1562058#Identify top discriminators between groups, for example, group 4 and group 5 top.snp <-top.discriminator(my.cluster,4,5) #or, specify the bim file #top.snp <-top.discriminator(my.cluster,4,5,bim.file="ipcaps_example.bim") dim(top.snp)#> [1] 100 7#> chr SNP centimorgans position allele1 allele2 Fst #> V1628 1 marker1628 0 16280000 A T 0.2285229 #> V1843 1 marker1843 0 18430000 A T 0.2203918 #> V111 1 marker111 0 1110000 A T 0.2129322 #> V1817 1 marker1817 0 18170000 A T 0.1669118 #> V1350 1 marker1350 0 13500000 A T 0.1665826 #> V1974 1 marker1974 0 19740000 A T 0.1562058#Here, it is possible to select the top Fst SNPs based on a percentile. top.snp <-top.discriminator(my.cluster,4,5, percentile = 0.9, use.percentile = TRUE) dim(top.snp)#> [1] 200 7#> chr SNP centimorgans position allele1 allele2 Fst #> V1628 1 marker1628 0 16280000 A T 0.2285229 #> V1843 1 marker1843 0 18430000 A T 0.2203918 #> V111 1 marker111 0 1110000 A T 0.2129322 #> V1817 1 marker1817 0 18170000 A T 0.1669118 #> V1350 1 marker1350 0 13500000 A T 0.1665826 #> V1974 1 marker1974 0 19740000 A T 0.1562058#Identify top discriminators between groups, for example, node 7 and node 8 top.snp2 <-top.discriminator(my.cluster,7,8,use.node.number=TRUE) head(top.snp2)#> chr SNP centimorgans position allele1 allele2 Fst #> V1 1 marker1 0 10000 A T NaN #> V2 1 marker2 0 20000 A T NaN #> V3 1 marker3 0 30000 A T NaN #> V4 1 marker4 0 40000 A T NaN #> V5 1 marker5 0 50000 A T NaN #> V6 1 marker6 0 60000 A T NaN# chr SNP centimorgans position allele1 allele2 Fst #V5452 1 marker5452 0 54520000 A T 0.11337260 #V2348 1 marker2348 0 23480000 A T 0.11194490