Methods
adjustFdr(pvalues, optionsopt) → {Float64Array}
- Description:
Adjust p-values to control the false discovery rate using the Benjamini-Hochberg method. This is primarily intended for use with p-values from testEnrichment, typically using the total number of sets from numberOfSets as totalTests.
- Source:
Parameters:
| Name | Type | Attributes | Default | Description |
|---|---|---|---|---|
| pvalues | Float64Array | | | Array of p-values. |
| options | object | <optional> | {} | Optional parameters. Properties |
Returns:
Array of length equal to pvalues, containing the BH-adjusted p-values.
- Type
- Float64Array
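As an illustration of the Benjamini-Hochberg step-up procedure described above, here is a minimal standalone sketch. The function name `bhAdjust` and its second argument are hypothetical and are not gesel's own implementation; it assumes that any tests not present in `pvalues` have p-values of 1, and it does not handle NaN values.

```javascript
// Standalone sketch of Benjamini-Hochberg adjustment (not gesel's own code).
// `totalTests` defaults to the number of supplied p-values; a larger value
// assumes that the unobserved tests all have p-values of 1.
function bhAdjust(pvalues, totalTests = pvalues.length) {
    const n = pvalues.length;

    // Indices of the p-values, sorted by increasing p-value.
    const order = Array.from({ length: n }, (_, i) => i);
    order.sort((a, b) => pvalues[a] - pvalues[b]);

    const adjusted = new Float64Array(n);
    let running = 1;

    // Walk from the largest p-value to the smallest, taking a running
    // minimum so that the adjusted values are monotonic in the ranks.
    for (let k = n - 1; k >= 0; k--) {
        const idx = order[k];
        const candidate = (pvalues[idx] * totalTests) / (k + 1);
        running = Math.min(running, candidate);
        adjusted[idx] = running;
    }

    return adjusted;
}

const exampleAdjusted = bhAdjust(new Float64Array([0.01, 0.04, 0.03, 0.5]));
console.log(Array.from(exampleAdjusted));
```

The output has the same order as the input, so each adjusted value can be paired with its original test.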
computeEnrichmentCurve(ranking, setMembers, optionsopt) → {object}
- Description:
Compute an enrichment curve from a gene ranking. At each position in the ranking, the value of the curve is defined as the proportion of genes with the same or higher rank that are present in the gene set. This can be used to visualize the change in enrichment as the ranking changes, typically with respect to some kind of decreasing importance.
- Source:
Parameters:
| Name | Type | Attributes | Default | Description |
|---|---|---|---|---|
| ranking | Array \| TypedArray | | | Ranking of genes, where earlier entries are considered to be more highly ranked. Each entry may either be an integer representing a gene (typically a gesel gene ID), or an array of such integers, e.g., as produced by |
| setMembers | Set \| Array \| TypedArray | | | Array of integers specifying the genes (typically gesel gene IDs) belonging to the gene set. A preconstructed Set may also be supplied. |
| options | object | <optional> | {} | Optional parameters. Properties |
Returns:
Object containing the following properties:
- proportions: a Float64Array of length equal to ranking. Each entry contains the proportion of genes with equal or higher ranks that belong to the set.
- found: a Uint32Array containing the indices of ranking corresponding to the genes that were found in the set.
- Type
- object
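The curve definition above can be sketched as a running proportion. This is a standalone illustration, not the library's code: `enrichmentCurve` is a hypothetical name, and treating an array-valued entry as a hit when any of its gene IDs is in the set is an assumption.

```javascript
// Standalone sketch of the enrichment curve (not gesel's own code).
function enrichmentCurve(ranking, setMembers) {
    const members = setMembers instanceof Set ? setMembers : new Set(setMembers);
    const proportions = new Float64Array(ranking.length);
    const found = [];
    let hits = 0;

    for (let i = 0; i < ranking.length; i++) {
        const entry = ranking[i];
        // An entry is either one gene ID or an array of gene IDs.
        // Assumption: an array counts as a hit if any of its IDs is in the set.
        const ids = typeof entry === "number" ? [entry] : Array.from(entry);
        if (ids.some(id => members.has(id))) {
            hits++;
            found.push(i);
        }
        // Proportion of entries at this rank or higher that are in the set.
        proportions[i] = hits / (i + 1);
    }

    return { proportions, found: new Uint32Array(found) };
}

const exampleCurve = enrichmentCurve([5, 2, [7, 9], 1], [2, 9]);
console.log(Array.from(exampleCurve.proportions), Array.from(exampleCurve.found));
```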
countSetOverlaps(setsForSomeGenes) → {Array}
- Description:
This is a utility function that is called internally by
findOverlappingSets. However, it can be used directly to obtain overlap counts if the gene-to-set mappings are manually obtained.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| setsForSomeGenes | Array | Array where each entry corresponds to a gene and contains an array of the set IDs containing that gene. Each inner array is typically the result of calling |
Returns:
An array of objects, where each object corresponds to a set that is present in at least one entry of setsForSomeGenes.
Each object contains:
- id: the ID of the set in fetchAllSets.
- count: the number of genes in setsForSomeGenes that belong to the set.
- Type
- Array
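The counting step can be sketched with a Map that tallies how often each set ID appears across the per-gene arrays. The name `countOverlaps` is hypothetical, and the sketch assumes that each inner array lists a set ID at most once.

```javascript
// Standalone sketch of overlap counting (not gesel's own code).
// Assumes each inner array contains unique set IDs.
function countOverlaps(setsForSomeGenes) {
    const counts = new Map();
    for (const sets of setsForSomeGenes) {
        for (const id of sets) {
            counts.set(id, (counts.get(id) || 0) + 1);
        }
    }
    // One object per set that appeared at least once.
    return Array.from(counts, ([id, count]) => ({ id, count }));
}

console.log(countOverlaps([[0, 2], [2, 3], [2]]));
```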
effectiveNumberOfGenes(species, config) → {number}
- Description:
Count the number of genes in the Gesel database that belong to at least one set.
The return value should be used as the total number of balls when performing a hypergeometric test for gene set enrichment, instead of the length of the array returned by fetchAllGenes. This ensures that uninteresting genes like pseudo-genes or predicted genes are ignored during the calculation. Otherwise, unknown genes would inappropriately increase the number of balls and make the enrichment p-values artificially small.
See also the documentation for fetchSetsForSomeGenes for some comments about caching.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| config | object | Configuration object, see |
Returns:
Number of genes that belong to at least one set for species.
This can be used as a more appropriate universe size in testEnrichment.
- Type
- number
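Conceptually, this count is the number of genes whose list of containing sets is non-empty. A minimal sketch, assuming a gene-to-set array shaped like the output of fetchSetsForAllGenes is already in hand (the helper name `countGenesInAnySet` is hypothetical):

```javascript
// Standalone sketch: count the genes that belong to at least one set.
// `setsForAllGenes` is assumed to hold one array of set IDs per gene.
function countGenesInAnySet(setsForAllGenes) {
    let count = 0;
    for (const sets of setsForAllGenes) {
        if (sets.length > 0) {
            count++;
        }
    }
    return count;
}

console.log(countGenesInAnySet([[0, 1], [], [2], []]));
```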
(async) fetchAllCollections(species, config) → {Array}
- Description:
Fetch information about all gene set collections in the Gesel database.
If this function is called once, the data frame will be cached in memory and re-used in subsequent calls to this function. The cached data will also be used to speed up calls to
fetchSomeCollections.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| config | object | Configuration object, see |
Returns:
Array of objects where each entry corresponds to a gene set collection and contains details about that collection. Each object can be expected to contain:
- title, the title for the collection.
- description, the description for the collection.
- species, the species for all gene identifiers in the collection. This should contain the full scientific name, e.g., "Homo sapiens", "Mus musculus".
- maintainer, the maintainer of this collection.
- source, the source of this set, usually a link to some external resource.
- start, the index for the first set in the collection in the output of sets. All sets from the same collection are stored contiguously.
- size, the number of sets in the collection.
In a gesel context, the identifier for a collection (i.e., the "collection ID") is defined as the index of the collection in this array.
- Type
- Array
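Because the sets of a collection are stored contiguously, the set IDs for a collection follow directly from its start and size. A small sketch, using a hand-written array in place of real fetchAllCollections output (the helper name `setIdsForCollection` is hypothetical):

```javascript
// Standalone sketch: list the set IDs that belong to one collection,
// given objects with `start` and `size` properties as described above.
function setIdsForCollection(collections, collectionId) {
    const { start, size } = collections[collectionId];
    return Array.from({ length: size }, (_, i) => start + i);
}

// Mock collection details for illustration only.
const mockCollections = [
    { title: "first", start: 0, size: 3 },
    { title: "second", start: 3, size: 2 }
];
console.log(setIdsForCollection(mockCollections, 1));
```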
(async) fetchAllGenes(species, config, optionsopt) → {Map}
- Source:
Parameters:
| Name | Type | Attributes | Default | Description |
|---|---|---|---|---|
| species | string | | | The taxonomy ID of the species of interest, e.g., |
| config | object | | | Configuration object, see |
| options | object | <optional> | {} | Optional parameters. Properties |
Returns:
Object where each key is named after an identifier type in types.
Each value is an array where each element corresponds to a gene and is itself an array of strings containing all identifiers of the current type for that gene.
The arrays for different identifier types are all of the same length, and corresponding elements across these arrays describe the same gene. gesel's identifier for each gene (i.e., the "gene ID") is defined as the index of that gene in any of these arrays.
- Type
- Map
(async) fetchAllSets(species, config) → {Array}
- Description:
Fetch information about all gene sets in the Gesel database.
If this function is called once, the data frame will be cached in memory and re-used in subsequent calls to this function. The cached data will also be used to speed up calls to
fetchSomeSets.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| config | object | Configuration object, see |
Returns:
Array of objects where each entry corresponds to a set and contains the details about that set. Each object can be expected to contain:
- name, the name of the set.
- description, the description of the set.
- size, the number of genes in the set.
- collection, the index of the collection containing the set.
- number, the number of the set within the collection.
In a gesel context, the identifier for a set (i.e., the "set ID") is defined as the index of the set in this array.
- Type
- Array
(async) fetchCollectionSizes(species, config) → {Array}
- Description:
Get the size of each gene set collection.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| config | object | Configuration object, see |
Returns:
Number of sets in each collection.
Each value corresponds to a collection in fetchAllCollections.
- Type
- Array
(async) fetchGenesForAllSets(species, config) → {Array}
- Description:
Fetch the gene membership of all sets in the Gesel database.
If this function is called once, the returned list will be cached in memory and re-used in subsequent calls to this function. The cached data will also be used to speed up calls to
fetchGenesForSomeSets.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| config | object | Configuration object, see |
Returns:
Array of length equal to the total number of sets for this species.
Each element corresponds to an entry in fetchAllSets and is a Uint32Array containing the IDs for all genes belonging to that set.
Gene IDs refer to indices in fetchAllGenes.
- Type
- Array
(async) fetchGenesForSomeSets(species, sets, config) → {Array}
- Description:
Fetch the gene membership of some sets in the Gesel database. This can be more efficient than fetchGenesForAllSets if only a few sets are of interest.
Every time this function is called, information from the requested sets will be added to an in-memory cache. Subsequent calls to this function will re-use as many of the cached sets as possible before making new requests to the Gesel database.
If fetchGenesForAllSets was previously called, its cached data will be directly used by fetchGenesForSomeSets to avoid performing extra requests to the database. If sets is large, it may be more efficient to call fetchGenesForAllSets to prepare the cache before calling this function.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| sets | Array | Array of set IDs. Each ID is a row index in the array returned by |
| config | object | Configuration object, see |
Returns:
Array of length equal to sets.
Each entry is a Uint32Array containing the IDs for all genes belonging to the corresponding set in sets.
Gene IDs refer to indices in fetchAllGenes.
- Type
- Array
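The caching behaviour described for the "some" functions can be sketched with a Map keyed by ID, where only the IDs missing from the cache trigger new requests. This is a generic illustration and not how gesel stores its cache; `makeCachedFetcher` and `fetchOne` are hypothetical names.

```javascript
// Standalone sketch of per-ID caching (not gesel's own code).
// `fetchOne` is a hypothetical async function that retrieves one entry by ID.
function makeCachedFetcher(fetchOne) {
    const cache = new Map();

    return async function fetchSome(ids) {
        // Only request the unique IDs that are not cached yet.
        const missing = [...new Set(ids)].filter(id => !cache.has(id));
        const fetched = await Promise.all(missing.map(id => fetchOne(id)));
        missing.forEach((id, i) => cache.set(id, fetched[i]));

        // Answer in the order of the request, entirely from the cache.
        return ids.map(id => cache.get(id));
    };
}
```

With this pattern, a second call that repeats earlier IDs only issues requests for the IDs it has not seen before.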
(async) fetchSetSizes(species, config) → {Array}
- Description:
Get the size of each gene set.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| config | object | Configuration object, see |
Returns:
Number of genes in each set.
Each value corresponds to a set in fetchAllSets.
- Type
- Array
(async) fetchSetsForAllGenes(species, config) → {Array}
- Description:
Fetch the identities of the sets that contain each gene in the Gesel database.
If this function is called once, the returned list will be cached in memory and re-used in subsequent calls to this function. The cached data will also be used to speed up calls to
fetchSetsForSomeGenes.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| config | object | Configuration object, see |
Returns:
Array of length equal to the total number of genes for this species.
Each element corresponds to an entry in fetchAllGenes and is a Uint32Array containing the IDs for all sets containing that gene.
Set IDs refer to indices in fetchAllSets.
- Type
- Array
(async) fetchSetsForSomeGenes(species, genes, config) → {Array}
- Description:
Fetch the identities of sets that contain some genes in the Gesel database. This can be more efficient than fetchSetsForAllGenes if only a few genes are of interest.
Every time this function is called, information from the requested genes will be added to an in-memory cache. Subsequent calls to this function will re-use as many of the cached genes as possible before making new requests to the Gesel database.
If fetchSetsForAllGenes was previously called, its cached data will be directly used by fetchSetsForSomeGenes to avoid extra requests to the database. If genes is large, it may be more efficient to call fetchSetsForAllGenes to prepare the cache before calling this function.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| genes | Array | Array of gene IDs. Each ID is a row index in any of the arrays returned by |
| config | object | Configuration object, see |
Returns:
Array of length equal to genes.
Each entry is a Uint32Array containing the IDs for all sets containing the corresponding gene in genes.
Set IDs refer to indices in fetchAllSets.
- Type
- Array
(async) fetchSomeCollections(species, collections, config) → {Array}
- Description:
Fetch the details of some gene set collections from the Gesel database. This can be more efficient than fetchAllCollections when only a few collections are of interest.
Every time this function is called, information from the requested collections will be added to an in-memory cache. Subsequent calls to this function will re-use as many of the cached collections as possible before making new requests to the Gesel database.
If fetchAllCollections was previously called, its cached data will be used by fetchSomeCollections to avoid extra requests to the database. If collections is large, it may be more efficient to call fetchAllCollections to prepare the cache before calling this function.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| collections | Array | Array of collection IDs. Each entry is a row index into the array returned by |
| config | object | Configuration object, see |
Returns:
Array of length equal to collections.
Each entry is an object containing details about the corresponding collection in collections.
- Type
- Array
(async) fetchSomeSets(species, sets, config) → {Array}
- Description:
Fetch the details of some gene sets from the Gesel database. This can be more efficient than calling fetchAllSets when only a few sets are of interest.
Every time this function is called, information from the requested sets will be added to an in-memory cache. Subsequent calls to this function will re-use as many of the cached sets as possible before making new requests to the Gesel database.
If fetchAllSets was previously called, its cached data will be directly used by fetchSomeSets to avoid performing extra requests to the database. If sets is large, it may be more efficient to call fetchAllSets to prepare the cache before calling this function.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| sets | Array | Array of set IDs. Each ID is a row index in the array returned by |
| config | object | Configuration object, see |
Returns:
Array of length equal to sets.
Each entry is an object containing the set information for the corresponding set in sets.
- Type
- Array
(async) findOverlappingSets(species, genes, config, optionsopt) → {Array}
- Source:
Parameters:
| Name | Type | Attributes | Default | Description |
|---|---|---|---|---|
| species | string | | | The taxonomy ID of the species of interest, e.g., |
| genes | Array | | | Array of unique integers containing user-supplied gene IDs, see |
| config | object | | | Configuration object, see |
| options | object | <optional> | {} | Optional parameters. Properties |
Returns:
An array of objects, where each object corresponds to a set that has non-zero overlaps with genes.
Each object contains:
- id: the ID of the set in fetchAllSets.
- count: the number of genes in the set that overlap with genes in genes.
- size: the size of each set. Only included if includeSize = true.
- pvalue: the enrichment p-value. Only included if testEnrichment = true.
- Type
- Array
flushMemoryCache(config)
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
config |
object | Configuration object, see Flush all cached objects in |
intersect(arrays) → {Array}
Parameters:
| Name | Type | Description |
|---|---|---|
| arrays | Array | Array of arrays over which to compute the intersection. |
Returns:
Intersection of all arrays in arrays.
- Type
- Array
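One common way to intersect several arrays is to keep a Set of surviving values and filter it against each array in turn. The sketch below illustrates that approach; `intersectArrays` is a hypothetical name, and returning an empty array for empty input and leaving the output order unspecified are assumptions rather than documented behaviour.

```javascript
// Standalone sketch of a multi-array intersection (not gesel's own code).
function intersectArrays(arrays) {
    if (arrays.length === 0) {
        return []; // Assumption: no input arrays means an empty intersection.
    }

    let current = new Set(arrays[0]);
    for (let i = 1; i < arrays.length; i++) {
        const next = new Set();
        for (const value of arrays[i]) {
            if (current.has(value)) {
                next.add(value);
            }
        }
        current = next;
    }

    return Array.from(current);
}

console.log(intersectArrays([[1, 2, 3], [2, 3, 4], [3, 2]]));
```

This is useful for combining the results of several searches, e.g., keeping only the sets that satisfy every criterion.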
(async) mapGenesByIdentifier(species, type, config, optionsopt) → {Map}
- Source:
Parameters:
| Name | Type | Attributes | Default | Description |
|---|---|---|---|---|
| species | string | | | The taxonomy ID of the species of interest, e.g., |
| type | string | | | Type of the identifier to use as the key of the map, e.g., |
| config | object | | | Configuration object, see |
| options | object | <optional> | {} | Optional parameters. Properties |
Returns:
Map where each key is a string containing a (possibly lower-cased) identifier of the specified type and each value is an array.
Each array contains the gesel gene IDs associated with the type identifier, see fetchAllGenes for more details.
- Type
- Map
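The returned structure is an inverted index from identifier to gene IDs. A minimal sketch of building one from a per-gene array of identifiers, as described for fetchAllGenes, is shown below; `mapByIdentifier` and its `lowerCase` option are hypothetical names used only for illustration.

```javascript
// Standalone sketch of an identifier-to-gene index (not gesel's own code).
// `identifiersPerGene[i]` holds all identifiers of one type for gene ID `i`.
function mapByIdentifier(identifiersPerGene, { lowerCase = false } = {}) {
    const mapping = new Map();

    identifiersPerGene.forEach((names, geneId) => {
        for (let name of names) {
            if (lowerCase) {
                name = name.toLowerCase();
            }
            if (!mapping.has(name)) {
                mapping.set(name, []);
            }
            const ids = mapping.get(name);
            // Avoid listing the same gene twice for one identifier.
            if (ids[ids.length - 1] !== geneId) {
                ids.push(geneId);
            }
        }
    });

    return mapping;
}

const exampleMapping = mapByIdentifier([["SNAP25"], ["Snap25", "NEUROD6"], []], { lowerCase: true });
console.log(exampleMapping.get("snap25"));
```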
newConfig(fetchGene, fetchFile, fetchRanges, optionsopt) → {object}
- Description:
Create a new configuration object to specify how the Gesel database should be queried. This can be used in each gesel function to point to a different Gesel database from the default.
The configuration object also contains a cache of data structures that can be populated by gesel functions. This avoids unnecessary fetch requests upon repeated calls to the same function. If the cache becomes stale or too large, it can be cleared by calling
flushMemoryCache.
- Source:
Parameters:
| Name | Type | Attributes | Default | Description |
|---|---|---|---|---|
| fetchGene | function | | | Function that accepts the name of a Gesel gene description file and returns an ArrayBuffer of its contents. This may be async. |
| fetchFile | function | | | Function that accepts the name of a Gesel database file and returns an ArrayBuffer of its contents. This may be async. |
| fetchRanges | function | | | Function that accepts three arguments: It should return an array of ArrayBuffers of the same length as |
| options | object | <optional> | {} | Optional parameters. Properties |
Returns:
A configuration object.
- Type
- object
(async) numberOfCollections(species, config) → {number}
- Description:
Get the total number of gene set collections.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| config | object | Configuration object, see |
Returns:
Total number of collections for this species.
- Type
- number
(async) numberOfSets(species, config) → {number}
- Description:
Get the total number of gene sets.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| species | string | The taxonomy ID of the species of interest, e.g., |
| config | object | Configuration object, see |
Returns:
Total number of sets for this species.
- Type
- number
reindexGenesForAllSets(geneMapping, genesForSets) → {Array}
- Description:
Reindex the gene sets for a user-defined gene universe. This is helpful for applications that know their own gene universe and want to convert the gesel gene IDs to indices within that universe.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| geneMapping | Array | Array of length equal to the number of genes in a user-defined gene universe. Each entry corresponds to one gene in the user's universe and should be an array containing the corresponding gesel gene ID(s) (see |
| genesForSets | Array | Array of length equal to the number of reference gene sets. Each entry corresponds to a set and is an array containing gesel gene IDs for all genes in that set. This is typically obtained from |
Returns:
Array of length equal to genesForSets.
Each entry corresponds to a reference gene set and is a Uint32Array where the elements are indices into geneMapping, specifying the genes in the user's universe that belong to that set.
If a gene in geneMapping maps to multiple gesel IDs, it is considered to belong to all sets containing any of its mapped gesel gene IDs.
- Type
- Array
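The reindexing described above can be sketched by first inverting geneMapping (gesel gene ID to user indices) and then translating each set. The name `reindexGenes` is hypothetical, and returning sorted, de-duplicated indices is an assumption of this sketch.

```javascript
// Standalone sketch of reindexing set membership (not gesel's own code).
function reindexGenes(geneMapping, genesForSets) {
    // Invert the mapping: gesel gene ID -> indices in the user's universe.
    const reverse = new Map();
    geneMapping.forEach((geselIds, userIndex) => {
        for (const id of geselIds) {
            if (!reverse.has(id)) {
                reverse.set(id, []);
            }
            reverse.get(id).push(userIndex);
        }
    });

    return genesForSets.map(genes => {
        const hits = new Set();
        for (const gene of genes) {
            for (const userIndex of reverse.get(gene) || []) {
                hits.add(userIndex);
            }
        }
        // Typed arrays sort numerically by default.
        return Uint32Array.from(hits).sort();
    });
}

const exampleReindexed = reindexGenes([[10], [11, 12], [], [10]], [[10, 12], [11], [99]]);
console.log(exampleReindexed.map(x => Array.from(x)));
```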
reindexSetsForAllGenes(geneMapping, setsForGenes) → {Array}
- Description:
Reindex the gene-to-set mappings for a user-defined gene universe. This is helpful for applications that know their own gene universe and want to create a mapping of all sets containing each of their own genes.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| geneMapping | Array | Array of length equal to the number of genes in a user-defined gene universe. Each entry corresponds to one gene in the user's universe and should be an array containing the corresponding gesel gene ID(s) (see |
| setsForGenes | Array | Array of length equal to the number of gesel gene IDs. Each entry corresponds to a gesel gene ID and is an array containing the set IDs for all sets containing that gene. This is typically obtained from |
Returns:
Array of length equal to geneMapping.
Each entry corresponds to a gene in the user-supplied universe and is a Uint32Array where the elements are the gesel set IDs containing that gene.
If a gene in geneMapping maps to multiple gesel IDs, we report all sets containing any of its mapped gesel gene IDs.
- Type
- Array
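For each user gene, the reindexed result is the union of the set IDs across all of its mapped gesel gene IDs. A minimal sketch under that reading follows; `reindexSets` is a hypothetical name, and sorting the output is an assumption.

```javascript
// Standalone sketch of reindexing gene-to-set mappings (not gesel's own code).
function reindexSets(geneMapping, setsForGenes) {
    return geneMapping.map(geselIds => {
        const sets = new Set();
        for (const id of geselIds) {
            // Gene IDs outside the range of setsForGenes contribute nothing.
            for (const setId of setsForGenes[id] || []) {
                sets.add(setId);
            }
        }
        return Uint32Array.from(sets).sort();
    });
}

const exampleSetsPerGene = reindexSets([[0], [1, 2], []], [[5], [5, 6], [7]]);
console.log(exampleSetsPerGene.map(x => Array.from(x)));
```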
(async) searchGenes(species, queries, config, optionsopt) → {Array}
- Source:
Parameters:
| Name | Type | Attributes | Default | Description |
|---|---|---|---|---|
| species | string | | | Taxonomy ID of the species of interest, e.g., |
| queries | Array | | | Array of strings containing gene identifiers of some kind (e.g., Ensembl, symbol, Entrez). |
| config | object | | | Configuration object, see |
| options | object | <optional> | {} | Optional parameters. Properties |
Returns:
An array of length equal to queries.
Each element of the array is an array containing the gesel gene IDs with any identifiers that match the corresponding search string.
See fetchAllGenes for more details on the interpretation of these IDs.
- Type
- Array
(async) searchSetText(species, query, config, optionsopt) → {Array}
- Source:
Parameters:
| Name | Type | Attributes | Default | Description |
|---|---|---|---|---|
| species | string | | | The taxonomy ID of the species of interest, e.g., |
| query | string | | | Query string containing multiple words to search in the names and/or descriptions of each set. Each stretch of alphanumeric characters and dashes is treated as a single word. All other characters are treated as punctuation between words, except for the following wildcards: A set's name and/or description must contain all words in |
| config | object | | | Configuration object, see |
| options | object | <optional> | {} | Optional parameters. Properties |
Returns:
Array of indices of the sets with names and/or descriptions that match query.
- Type
- Array
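The word-splitting rule for the query can be sketched as follows. This is only an illustration of the tokenization and the "all words must match" rule: it deliberately ignores wildcards, and the case-insensitive comparison is an assumption. The names `tokenize` and `matchesAllWords` are hypothetical.

```javascript
// Standalone sketch of query tokenization (not gesel's own code).
// Stretches of alphanumeric characters and dashes form words; everything
// else is treated as punctuation. Wildcards are not handled here.
function tokenize(text) {
    return text
        .toLowerCase() // Assumption: matching is case-insensitive.
        .split(/[^a-z0-9-]+/)
        .filter(word => word.length > 0);
}

// True if every word of the query occurs in the text.
function matchesAllWords(query, text) {
    const available = new Set(tokenize(text));
    return tokenize(query).every(word => available.has(word));
}

console.log(tokenize("Regulation of T-cell activation (GO)"));
console.log(matchesAllWords("T-cell activation", "Regulation of T-cell activation (GO)"));
```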
testEnrichment(overlap, listSize, setSize, universe) → {number}
- Description:
Hypergeometric test for gene set enrichment, based on the overlap between a user-supplied list and the gene set.
- Source:
Parameters:
| Name | Type | Description |
|---|---|---|
| overlap | number | Number of overlapping genes between the user's list and the gene set, typically obtained from |
| listSize | number | Size of the user's list. |
| setSize | number | Size of the gene set, see the |
| universe | number | Size of the gene universe (i.e., the total number of genes for this species). This can either be obtained from the arrays in |
Returns:
P-value for the enrichment of the user's list in the gene set.
This may be NaN if the inputs are inconsistent, e.g., overlap is greater than listSize or setSize.
- Type
- number
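The hypergeometric enrichment p-value is the upper-tail probability of observing at least `overlap` set members when drawing `listSize` genes from a universe containing `setSize` set members. The sketch below computes it from log-factorials; it is a standalone illustration rather than the library's implementation, `hypergeometricUpperTail` is a hypothetical name, and the exact set of inputs that yield NaN is an assumption beyond the two cases mentioned above.

```javascript
// Standalone sketch of an upper-tail hypergeometric test (not gesel's own code).
function hypergeometricUpperTail(overlap, listSize, setSize, universe) {
    // Inconsistent inputs give NaN (the extra checks are assumptions).
    if (overlap < 0 || overlap > listSize || overlap > setSize ||
        listSize > universe || setSize > universe) {
        return NaN;
    }

    // Cumulative log-factorials: logFact[i] = log(i!).
    const logFact = new Float64Array(universe + 1);
    for (let i = 1; i <= universe; i++) {
        logFact[i] = logFact[i - 1] + Math.log(i);
    }
    const logChoose = (n, k) => logFact[n] - logFact[k] - logFact[n - k];

    const denominator = logChoose(universe, listSize);
    // Smallest and largest feasible overlaps at or above the observed one.
    const lower = Math.max(overlap, listSize + setSize - universe);
    const upper = Math.min(listSize, setSize);

    let pvalue = 0;
    for (let k = lower; k <= upper; k++) {
        pvalue += Math.exp(
            logChoose(setSize, k) +
            logChoose(universe - setSize, listSize - k) -
            denominator
        );
    }
    return Math.min(1, pvalue);
}

console.log(hypergeometricUpperTail(5, 5, 5, 10)); // 1/252, up to rounding
```

Recomputing the log-factorials on every call is wasteful for large universes; a real implementation would likely reuse them or use a dedicated statistical routine.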