Fetch the identities of sets that contain some genes in the Gesel database.
This can be more efficient than fetchSetsForAllGenes if only a few genes are of interest.
fetchSetsForSomeGenes(species, genes, config = NULL)
species: String containing the NCBI taxonomy ID of the species of interest.
genes: Integer vector containing gene indices. Each gene index refers to a row of the data frame returned by fetchAllGenes.
config: Configuration list, typically created by newConfig. If NULL, the default configuration is used.
List of integer vectors.
Each vector corresponds to a gene in genes and contains the identities of the sets containing that gene.
Each set is defined by its set index, which refers to a row of the data frame returned by fetchAllSets.
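As a rough illustration of how a list of this shape might be used, the sketch below works on a small hand-made stand-in rather than real Gesel output. The set indices and the names sets.by.gene, shared and overlap are invented for the example.

```r
# Toy stand-in for the return value of fetchSetsForSomeGenes():
# one integer vector of set indices per requested gene.
# These indices are made up for illustration only.
sets.by.gene <- list(
    c(3L, 7L, 12L),
    c(7L, 12L, 40L),
    c(7L, 9L, 12L)
)

# Number of sets containing each requested gene.
n.sets <- lengths(sets.by.gene)

# Set indices that contain every requested gene.
shared <- Reduce(intersect, sets.by.gene)

# Number of requested genes found in each set, most frequent first.
overlap <- sort(table(unlist(sets.by.gene)), decreasing = TRUE)
```

With real output, the indices in shared could be used to subset the rows of the data frame returned by fetchAllSets.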
Every time this function is called, information for the requested genes is added to an in-memory cache.
Subsequent calls to this function will re-use as many of the cached genes as possible before making new requests to the Gesel database.
If fetchSetsForAllGenes was called previously, its cached data will be used directly by this function to avoid extra requests to the database.
If genes is large, it may be more efficient to call fetchSetsForAllGenes to prepare the cache before calling this function.
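The cache-priming pattern described above might look like the following minimal sketch. It assumes the gesel package is installed, that the Gesel database is reachable over the network, and that fetchSetsForAllGenes accepts the species string as its first argument. The 2000-gene selection is arbitrary and chosen only for illustration.

```r
library(gesel)

species <- "9606"

# An arbitrary, large selection of gene indices (illustrative assumption).
many.genes <- seq_len(2000)

# A single bulk request populates the in-memory cache...
invisible(fetchSetsForAllGenes(species))

# ...so this call should be answered from the cache rather than
# issuing separate requests for the individual genes.
sets.by.gene <- fetchSetsForSomeGenes(species, many.genes)
length(sets.by.gene)
```

Whether the bulk request pays off depends on how many genes are needed; for a handful of genes, calling fetchSetsForSomeGenes directly is likely the cheaper option.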
first.gene <- fetchSetsForSomeGenes("9606", 1:5)
str(first.gene)
#> List of 5
#> $ : int [1:68] 1327 2337 2366 3538 6639 8166 13182 13273 14384 17635 ...
#> $ : int [1:205] 413 605 701 920 1999 2000 2127 2311 2337 2366 ...
#> $ : int [1:7] 18984 20717 22134 27718 28006 40230 40391
#> $ : int [1:160] 1512 2483 3071 19193 19377 20087 20344 20669 20741 21035 ...
#> $ : int [1:64] 1512 2311 2483 3071 19193 19377 20680 20717 21388 22215 ...
# Sets containing the first gene.
all.set.info <- fetchAllSets("9606")
head(all.set.info[first.gene[[1]],])
#>            name                  description size collection number
#> 1327 GO:0003674           molecular_function  710          1   1327
#> 2337 GO:0005576         extracellular region 1916          1   2337
#> 2366 GO:0005615          extracellular space 1865          1   2366
#> 3538 GO:0008150           biological_process  561          1   3538
#> 6639 GO:0031093 platelet alpha granule lumen   67          1   6639
#> 8166 GO:0034774      secretory granule lumen  115          1   8166
# Identities of the requested genes.
fetchAllGenes("9606")[1:5,]
#>   symbol entrez      ensembl
#> 1   A1BG      1 ENSG0000....
#> 2    A2M      2 ENSG0000....
#> 3  A2MP1      3 ENSG0000....
#> 4   NAT1      9 ENSG0000....
#> 5   NAT2     10 ENSG0000....