Title: | Simultaneous Enrichment Analysis |
---|---|
Description: | SEA performs simultaneous feature-set testing for (gen)omics data. It tests the unified null hypothesis and controls the family-wise error rate for all possible pathways. The unified null hypothesis is defined as: "The proportion of true features in the set is less than or equal to a threshold." Family-wise error rate control is provided through use of closed testing with Simes test. There are some practical functions to play around with the pathways of interest. |
Authors: | Mitra Ebrahimpoor |
Maintainer: | Mitra Ebrahimpoor<[email protected]> |
License: | GPL (>= 2) |
Version: | 2.1.2 |
Built: | 2024-10-31 20:27:12 UTC |
Source: | https://github.com/mitra-ep/rsea |
This package uses raw p-values of genomic features as input and evaluates any given list of feature-sets or pathways. For each set the adjusted p-value and TDP lower-bound are calculated. The type of test can be defined by arguments and can be refined as necessary. The p-values are corrected for every possible set of features, making the method flexible in choice of pathway list and test type. For more details see: Ebrahimpoor, M (2019) <doi:10.1093/bib/bbz074>
The unified null hypothesis is tested using the closed testing procedure and all-resolutions inference. It combines the self-contained and competitive approaches in one framework. In short, using the p-values of the individual features as input, the package provides an FWER-adjusted p-value along with a lower bound and a point estimate for the proportion of true discoveries per feature-set. The most important property of SEA is that the choice of feature-sets can be revised without inflating the type I error.
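The description above names the Simes test as the local test used inside closed testing. As a rough illustration only, the plain Simes combination p-value for one set of raw p-values can be sketched in base R. This is not the FWER-adjusted p-value reported by the package, which additionally involves the closed testing procedure; the names simes_p and toy_set are hypothetical and not part of rSEA.

```r
# Plain Simes combination p-value for one set of raw p-values.
# Local test only; this is NOT the closed-testing adjusted p-value.
simes_p <- function(p) {
  stopifnot(is.numeric(p), length(p) > 0, all(p >= 0 & p <= 1))
  n <- length(p)
  # min over i of n * p_(i) / i, where p_(i) is the i-th smallest p-value
  min(n * sort(p) / seq_len(n))
}

# Toy p-values generated the same way as in the package examples
set.seed(159)
m <- 100
pvalues <- runif(m, 0, 1)^5
featureIDs <- as.character(1:m)

# Hypothetical feature-set: the first 20 feature IDs
toy_set <- featureIDs[1:20]
simes_toy <- simes_p(pvalues[featureIDs %in% toy_set])
```

A small Simes p-value suggests that at least one feature in the set is active; the package builds on this local test to obtain results that hold simultaneously for all possible sets.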
Mitra Ebrahimpoor.
Maintainer: Mitra Ebrahimpoor<[email protected]>
Mitra Ebrahimpoor, Pietro Spitali, Kristina Hettne, Roula Tsonaka, Jelle Goeman, Simultaneous Enrichment Analysis of all Possible Gene-sets: Unifying Self-Contained and Competitive Methods, Briefings in Bioinformatics, bbz074, https://doi.org/10.1093/bib/bbz074
Returns a plot of the SEA-chart, which illustrates the proportion of discoveries per pathway.
plotSEA(object, by = "TDP.estimate", threshold = 0.005, n = 20)
object |
A SEA-chart object which is the output of |
by |
The variable to be mapped. It should be either the TDP estimate or the TDP bound. Per the usage above, the default is "TDP.estimate". |
threshold |
A real number between 0 and 1, used as a visual aid to distinguish significant pathways. |
n |
Integer. Number of rows from SEA-chart object to be plotted. |
Returns a plot of the SEA-chart according to the selected arguments.
Mitra Ebrahimpoor
Mitra Ebrahimpoor, Pietro Spitali, Kristina Hettne, Roula Tsonaka, Jelle Goeman, Simultaneous Enrichment Analysis of all Possible Gene-sets: Unifying Self-Contained and Competitive Methods, Briefings in Bioinformatics, bbz074
# See the examples for SEA
Returns the SEA-chart (a data.frame) including the test results and estimates for the specified feature-sets from pathlist.
SEA( pvalue, featureIDs, data, pathlist, select, tdphat = TRUE, selfcontained = TRUE, competitive = TRUE, thresh = NULL, alpha = 0.05 )
pvalue |
Vector of p-values. It can be the name of the covariate representing the Vector of
all raw p-values in the |
featureIDs |
Vector of feature IDs. It can be the name of the covariate representing the IDs in the
|
data |
Optional data frame or matrix containing the variables in |
pathlist |
A list containing pathways defined by |
select |
A vector. Number or names of pathways of interest from the |
tdphat |
Logical. If |
selfcontained |
Logical. If |
competitive |
Logical. If |
thresh |
A real number between 0 and 1. If specified, the competitive null hypothesis will be tested against this threshold for each pathway and the corresponding adj. p-value is returned |
alpha |
The type I error allowed for TDP bound. The default is 0.05. |
A data.frame is returned listing the pathways with the corresponding TDP bound and, if specified, the TDP point estimate and adjusted p-values.
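The pathlist argument is a list of pathways. As a minimal sketch, assuming the annotation is available as a long-format table with one row per (pathway, feature) pair, such a named list can be built in base R with split(). The data frame annot and its column names are hypothetical. The coverage computed at the end is one plausible reading of the term (the fraction of a pathway's features that are among the tested features) and is an assumption, not the package's documented definition.

```r
# Hypothetical long-format annotation: one row per (pathway, feature) pair
annot <- data.frame(
  pathway = c("A", "A", "A", "B", "B", "C"),
  feature = c("1", "2", "3", "2", "4", "9"),
  stringsAsFactors = FALSE
)

# Named list with one character vector of feature IDs per pathway
pathlist <- split(annot$feature, annot$pathway)

# IDs of the features for which p-values are available (toy example)
featureIDs <- as.character(1:5)

# ASSUMPTION: coverage taken as the fraction of each pathway's features
# that appear among the tested features
coverage <- vapply(pathlist, function(s) mean(s %in% featureIDs), numeric(1))
```

A list built this way has the same shape as randpathlist in the examples for SEA: named elements, each holding a character vector of feature IDs.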
Mitra Ebrahimpoor
Mitra Ebrahimpoor, Pietro Spitali, Kristina Hettne, Roula Tsonaka, Jelle Goeman, Simultaneous Enrichment Analysis of all Possible Gene-sets: Unifying Self-Contained and Competitive Methods, Briefings in Bioinformatics, bbz074, https://doi.org/10.1093/bib/bbz074
## Not run:
## Generate a vector of p-values for a toy example
set.seed(159)
m <- 100
pvalues <- runif(m, 0, 1)^5
featureIDs <- as.character(1:m)

# perform a self-contained test for all features
setTest(pvalues, featureIDs, testype = "selfcontained")

# create 3 random pathways of size 60, 20 and 45
randpathlist <- list(A = as.character(sample(1:m, 60)),
                     B = as.character(sample(1:m, 20)),
                     C = as.character(sample(1:m, 45)))

# get the SEA-chart for the whole pathlist
S1 <- SEA(pvalues, featureIDs, pathlist = randpathlist)
S1

# get the SEA-chart for only the first two pathways of randpathlist
S2 <- SEA(pvalues, featureIDs, pathlist = randpathlist, select = 1:2)
S2

# sort the list by competitive p-value and select the top 2
topSEA(S2, by = Comp.adjP, descending = FALSE, n = 2)

# make an enrichment plot based on the TDP estimate of the pathways
plotSEA(S1, n = 3)
## End(Not run)
Estimates the TDP of the specified set of features.
setTDP(pvalue, featureIDs, data, set, alpha = 0.05)
pvalue |
The vector of p-values. It can be the name of the covariate representing the Vector of
raw p-values in the |
featureIDs |
The vector of feature IDs. It can be the name of the covariate representing the IDs in the
|
data |
Optional data frame or matrix containing the variables in |
set |
The selection of features defining the feature-set based on the |
alpha |
The type I error allowed. The default is 0.05. NOTE: this should be consistent across the study. |
A named vector including the lower bound and point estimate for the true discovery proportion (TDP) of the specified test for the feature-set is returned.
Mitra Ebrahimpoor
Mitra Ebrahimpoor, Pietro Spitali, Kristina Hettne, Roula Tsonaka, Jelle Goeman, Simultaneous Enrichment Analysis of all Possible Gene-sets: Unifying Self-Contained and Competitive Methods, Briefings in Bioinformatics, bbz074, https://doi.org/10.1093/bib/bbz074
## Not run:
set.seed(159)

# generate random p-values with pseudo IDs
m <- 100
pvalues <- runif(m, 0, 1)^5
featureIDs <- as.character(1:m)

# perform a self-contained test for all features
setTest(pvalues, featureIDs, testype = "selfcontained")

# estimate the proportion of true discoveries among all m features
setTDP(pvalues, featureIDs)

# create a random pathway of size 60
randset <- as.character(sample(1:m, 60))

# estimate the proportion of true discoveries in the random set of size 60
setTDP(pvalues, featureIDs, set = randset)
## End(Not run)
Calculates the adjusted p-value for the local hypothesis as defined by testype and testvalue.
setTest(pvalue, featureIDs, data, set, testype, testvalue)
pvalue |
The vector of p-values. It can be the name of the covariate representing the Vector of
raw p-values in the |
featureIDs |
The vector of feature IDs. It can be the name of the covariate representing the IDs in the
|
data |
Optional data frame or matrix containing the variables in |
set |
The selection of features defining the feature-set based on the |
testype |
Character, type of the test: "selfcontained" or "competitive". Choosing the self-contained
option will automatically set the threshold to zero and the |
testvalue |
Optional value to test against. Setting this value to c along with
|
The adjusted p-value of the specified test for the feature-set is returned.
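The unified null hypothesis states that the proportion of true features in the set is less than or equal to a threshold, and the self-contained option sets that threshold to zero. As a conceptual sketch only, the hypothesis can be evaluated on a toy example where the truth is known. In real data the truth is unknown and the package tests the hypothesis from the p-values. The vector is_true and the helper unified_null_holds are hypothetical and not part of rSEA.

```r
# Toy ground truth (unknown in practice): TRUE marks a truly active feature
is_true <- c(TRUE, TRUE, FALSE, FALSE, TRUE,
             FALSE, FALSE, FALSE, FALSE, FALSE)

# Hypothetical feature-set: features 1 to 4
set_idx <- 1:4

# True discovery proportion (TDP) of the set: 2 of 4 features are active
tdp_set <- mean(is_true[set_idx])

# Unified null: "the proportion of true features in the set is <= threshold"
unified_null_holds <- function(tdp, threshold) tdp <= threshold

null_selfcontained <- unified_null_holds(tdp_set, 0)    # threshold zero
null_custom        <- unified_null_holds(tdp_set, 0.5)  # custom threshold
```

Here the null with threshold zero is false because the set contains active features, while the null with threshold 0.5 is true because the proportion of active features does not exceed 0.5.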
Mitra Ebrahimpoor
Mitra Ebrahimpoor, Pietro Spitali, Kristina Hettne, Roula Tsonaka, Jelle Goeman, Simultaneous Enrichment Analysis of all Possible Gene-sets: Unifying Self-Contained and Competitive Methods, Briefings in Bioinformatics, bbz074, https://doi.org/10.1093/bib/bbz074
## Not run:
# generate a vector of p-values
set.seed(159)
m <- 100
pvalues <- runif(m, 0, 1)^5
featureIDs <- as.character(1:m)

# perform a self-contained test for all features
setTest(pvalues, featureIDs, testype = "selfcontained")

# create a random pathway of size 60
randset <- as.character(sample(1:m, 60))

# perform a competitive test for the random pathway
setTest(pvalues, featureIDs, set = randset, testype = "competitive")

# perform a unified null hypothesis test against 0.2 for the set of size 60
setTest(pvalues, featureIDs, set = randset, testype = "competitive",
        testvalue = 0.2)
## End(Not run)
Returns a permutation of the SEA-chart in which the feature-sets are rearranged in ascending or descending order of the selected variable.
topSEA(object, by, thresh = NULL, descending = TRUE, n = 20, cover)
object |
A SEA-chart object which is the output of |
by |
Variable name by which the ordering should happen. It should be a column of SEA-chart. The default is TDP_bound. |
thresh |
A real number between 0 and 1. If specified the values of the variable defined in |
descending |
Logical. If |
n |
Integer. Number of rows of the output chart |
cover |
An optional threshold for coverage, which must be a real number between 0 and 1. If specified, feature-sets with a coverage lower than or equal to this value are removed. |
Returns a subset of SEA_chart sorted according to the arguments
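The ordering and filtering described by the arguments can be approximated in base R. The following is a sketch under stated assumptions, not the package's implementation: the data frame chart, its column names (Name, Coverage, TDP.bound) and the helper top_rows are all hypothetical, and the thresh argument is left out.

```r
# Hypothetical SEA-chart-like data frame; column names are illustrative only
chart <- data.frame(
  Name      = c("A", "B", "C", "D"),
  Coverage  = c(1.0, 0.4, 0.9, 0.8),
  TDP.bound = c(0.10, 0.50, 0.30, 0.00),
  stringsAsFactors = FALSE
)

# Sort by one column, optionally drop rows whose coverage is lower than
# or equal to `cover`, and keep the first n rows
top_rows <- function(chart, by, descending = TRUE, n = 20, cover = NULL) {
  if (!is.null(cover)) {
    chart <- chart[chart$Coverage > cover, , drop = FALSE]
  }
  ord <- order(chart[[by]], decreasing = descending)
  head(chart[ord, , drop = FALSE], n)
}

# Two rows with the largest TDP.bound among sets with coverage above 0.5
top2 <- top_rows(chart, by = "TDP.bound", n = 2, cover = 0.5)
```

In this toy chart, pathway B has the largest bound but is dropped by the coverage filter, so the result keeps C and A in that order.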
Mitra Ebrahimpoor
Mitra Ebrahimpoor, Pietro Spitali, Kristina Hettne, Roula Tsonaka, Jelle Goeman, Simultaneous Enrichment Analysis of all Possible Gene-sets: Unifying Self-Contained and Competitive Methods, Briefings in Bioinformatics, bbz074
# See the examples for SEA