Deprecated in favour of gemma.R
This is an R wrapper for Gemma’s RESTful API.
To cite Gemma, please use: Zoubarev, A., et al., Gemma: A resource for the re-use, sharing and meta-analysis of expression profiling data. Bioinformatics, 2012.
devtools::install_github('PavlidisLab/gemmaAPI.R')
For basic API calls see ?endpointFunctions. These functions return mostly unaltered data from a given API endpoint.
For high-level functions see ?highLevelFunctions. These functions return data compiled from multiple API calls.
Download data for a dataset
library(magrittr) # provides the %>% pipe used in these examples
data =
    datasetInfo('GSE107999',
                request = 'data', # ask this endpoint for expression data; see the documentation
                filter = FALSE,   # the data request accepts a filter argument; we want unfiltered data
                return = TRUE,    # TRUE by default for all functions; if FALSE, nothing is returned
                file = NULL       # NULL by default for all functions; if specified, the output is saved to this file
    )
head(data) %>% knitr::kable(format ='markdown')
Probe | Sequence | GeneSymbol | GeneName | GemmaId | NCBIid | GSE107999_Biomat_9___BioAssayId=427205Name=LUHMEScells,untreated,proliferatingprecursorstaterep4 | GSE107999_Biomat_8___BioAssayId=427206Name=LUHMEScells,untreated,proliferatingprecursorstaterep3 | GSE107999_Biomat_12___BioAssayId=427207Name=LUHMEScells,untreated,proliferatingprecursorstaterep2 | GSE107999_Biomat_10___BioAssayId=427208Name=LUHMEScells,untreated,proliferatingprecursorstaterep1 | GSE107999_Biomat_5___BioAssayId=427201Name=LUHMEScells,untreated,day3ofdifferentiationrep4 | GSE107999_Biomat_4___BioAssayId=427202Name=LUHMEScells,untreated,day3ofdifferentiationrep3 | GSE107999_Biomat_7___BioAssayId=427203Name=LUHMEScells,untreated,day3ofdifferentiationrep2 | GSE107999_Biomat_6___BioAssayId=427204Name=LUHMEScells,untreated,day3ofdifferentiationrep1 | GSE107999_Biomat_11___BioAssayId=427197Name=LUHMEScells,untreated,day6ofdifferentiationrep4 | GSE107999_Biomat_2___BioAssayId=427198Name=LUHMEScells,untreated,day6ofdifferentiationrep3 | GSE107999_Biomat_1___BioAssayId=427199Name=LUHMEScells,untreated,day6ofdifferentiationrep2 | GSE107999_Biomat_3___BioAssayId=427200Name=LUHMEScells,untreated,day6ofdifferentiationrep1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1007_s_at | 1007_s_at_collapsed | DDR1 | discoidin domain receptor tyrosine kinase 1 | 16908 | 780 | 8.360044 | 8.347570 | 8.384220 | 8.631552 | 9.426037 | 9.332862 | 9.556137 | 9.571225 | 9.830016 | 9.534368 | 9.644813 | 9.638160 |
1053_at | 1053_at_collapsed | RFC2 | replication factor C subunit 2 | 139878 | 5982 | 8.321700 | 8.441607 | 8.538243 | 8.223463 | 6.900833 | 7.811239 | 7.362803 | 7.487110 | 6.727149 | 6.781015 | 6.871821 | 6.822983 |
117_at | 117_at_collapsed | HSPA6\|HSPA7 | heat shock protein family A (Hsp70) member 6\|heat shock protein family A (Hsp70) member 7 | 73420\|73442 | 3310\|3311 | 5.640347 | 4.309247 | 4.561608 | 4.412733 | 4.274228 | 4.109736 | 4.466428 | 4.262011 | | | | |
121_at | 121_at_collapsed | PAX8 | paired box 8 | 173107 | 7849 | 6.915072 | 7.001704 | 6.886536 | 6.995852 | 6.789746 | 6.988139 | 6.950670 | 6.897583 | 6.632473 | 6.872863 | 6.892053 | 6.845294 |
1255_g_at | 1255_g_at_collapsed | GUCA1A | guanylate cyclase activator 1A | 58787 | 2978 | 2.328086 | 2.683368 | 2.292127 | 2.395157 | 2.267915 | 2.371985 | 2.148122 | 2.219700 | 2.078340 | 2.243999 | 2.376379 | 2.238994 |
1294_at | 1294_at_collapsed | UBA7 | ubiquitin like modifier activating enzyme 7 | 165857 | 7318 | 4.436209 | 4.315595 | 4.434729 | 4.505724 | 4.182772 | 4.334539 | 4.278525 | 4.204030 | 4.105466 | 4.410392 | 4.382536 | 4.151413 |
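The expression table mixes annotation columns with one column per sample, and a probe that maps to several genes (such as 117_at above) carries pipe-separated values in its gene fields. As a minimal sketch, assuming the column names and separators look exactly as in the output above, the base R snippet below pulls the BioAssayId out of a sample column name and splits a multi-gene symbol. The helper names extractBioAssayId and splitGenes are hypothetical and not part of gemmaAPI.R.

```r
# Hypothetical helpers; not part of gemmaAPI.R.

# Pull the numeric BioAssayId out of a sample column name such as
# "GSE107999_Biomat_9___BioAssayId=427205Name=...".
extractBioAssayId <- function(colNames) {
  as.integer(sub("^.*BioAssayId=([0-9]+)Name=.*$", "\\1", colNames))
}

# Split pipe-separated gene fields into one character vector per probe.
splitGenes <- function(x) {
  strsplit(x, "|", fixed = TRUE)
}

# Two sample column names copied from the table above.
sampleCols <- c(
  "GSE107999_Biomat_9___BioAssayId=427205Name=LUHMEScells,untreated,proliferatingprecursorstaterep4",
  "GSE107999_Biomat_8___BioAssayId=427206Name=LUHMEScells,untreated,proliferatingprecursorstaterep3"
)
ids <- extractBioAssayId(sampleCols)

# One single-gene probe and one multi-gene probe (assumed raw cell contents).
symbols <- splitGenes(c("DDR1", "HSPA6|HSPA7"))
```

Splitting with fixed = TRUE matters here because a bare "|" is the alternation operator in a regular expression and would otherwise split the string into single characters.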
Get metadata for first 10 mouse studies.
mouseStudies = taxonInfo('mouse',request = 'datasets',limit = 0)
studyIDs = mouseStudies %>% purrr::map_int('id')
mouseMetadata = studyIDs[1:10] %>% lapply(compileMetadata,outputType = 'list')
# default outputType is data.frame, which returns a single data frame with study and sample data all together.
mouseMetadata[[1]]$sampleData %>% head %>% knitr::kable(format ='markdown')
| | id | sampleName | accession | sampleBiomaterialID | sampleAnnotCategory | sampleAnnotCategoryOntoID | sampleAnnotCategoryURI | sampleAnnotBroadCategory | sampleAnnotBroadCategoryOntoID | sampleAnnotBroadCategoryURI | sampleAnnotation | sampleAnnotationOntoID | sampleAnnotType | sampleAnnotationURI | otherCharacteristics |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Brain_C57 Wildtype_affs275-1099 | 48 | Brain_C57 Wildtype_affs275-1099 | GSM101416 | 48 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | wild type genotype | EFO_0005168 | factor | http://www.ebi.ac.uk/efo/EFO_0005168 | total RNA |
Brain_C57 Wildtype_affs275-1100 | 47 | Brain_C57 Wildtype_affs275-1100 | GSM101417 | 47 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | wild type genotype | EFO_0005168 | factor | http://www.ebi.ac.uk/efo/EFO_0005168 | total RNA |
Brain_Melanotransferrin Knockout_affs275-1096 | 52 | Brain_Melanotransferrin Knockout_affs275-1096 | GSM101412 | 52 | genotype;genotype | EFO_0000513;EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513;http://www.ebi.ac.uk/efo/EFO_0000513 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | Mfi2 [mouse] antigen p97 (melanoma associated) identified by monoclonal antibodies 133.2 and 96.5;Homozygous negative | GENE_30060;TGEMO_00001 | factor | http://purl.org/commons/record/ncbi_gene/30060;http://purl.obolibrary.org/obo/TGEMO_00001 | total RNA |
Brain_Melanotransferrin Knockout_affs275-1097 | 51 | Brain_Melanotransferrin Knockout_affs275-1097 | GSM101413 | 51 | genotype;genotype | EFO_0000513;EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513;http://www.ebi.ac.uk/efo/EFO_0000513 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | Mfi2 [mouse] antigen p97 (melanoma associated) identified by monoclonal antibodies 133.2 and 96.5;Homozygous negative | GENE_30060;TGEMO_00001 | factor | http://purl.org/commons/record/ncbi_gene/30060;http://purl.obolibrary.org/obo/TGEMO_00001 | total RNA |
Brain_Melanotransferrin Knockout_affs275-1098 | 50 | Brain_Melanotransferrin Knockout_affs275-1098 | GSM101414 | 50 | genotype;genotype | EFO_0000513;EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513;http://www.ebi.ac.uk/efo/EFO_0000513 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | Mfi2 [mouse] antigen p97 (melanoma associated) identified by monoclonal antibodies 133.2 and 96.5;Homozygous negative | GENE_30060;TGEMO_00001 | factor | http://purl.org/commons/record/ncbi_gene/30060;http://purl.obolibrary.org/obo/TGEMO_00001 | total RNA |
Brain_Melanotransferrin Knockout_affs275-1101 | 49 | Brain_Melanotransferrin Knockout_affs275-1101 | GSM101415 | 49 | genotype;genotype | EFO_0000513;EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513;http://www.ebi.ac.uk/efo/EFO_0000513 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | Mfi2 [mouse] antigen p97 (melanoma associated) identified by monoclonal antibodies 133.2 and 96.5;Homozygous negative | GENE_30060;TGEMO_00001 | factor | http://purl.org/commons/record/ncbi_gene/30060;http://purl.obolibrary.org/obo/TGEMO_00001 | Melanotransferrin Knockout Mouse #1101 Brain |
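Samples with more than one annotation pack the values into semicolon-separated strings, as in the knockout rows above where sampleAnnotation, sampleAnnotationOntoID and sampleAnnotationURI each hold two entries. A minimal sketch of unpacking one such row in base R follows. It assumes the fields are always split by ';' and stay in matching order, which the output above suggests but does not guarantee. The helper name splitAnnotations is hypothetical.

```r
# Hypothetical helper; not part of gemmaAPI.R.
# Turn parallel ';'-separated annotation fields into one row per annotation.
splitAnnotations <- function(annotation, ontoID, uri) {
  parts <- lapply(list(annotation, ontoID, uri), function(x) {
    strsplit(x, ";", fixed = TRUE)[[1]]
  })
  # Refuse to pair fields up when the counts disagree.
  stopifnot(length(unique(lengths(parts))) == 1)
  data.frame(
    annotation = parts[[1]],
    ontoID = parts[[2]],
    uri = parts[[3]],
    stringsAsFactors = FALSE
  )
}

# Values copied from one of the knockout rows in the table above.
annots <- splitAnnotations(
  annotation = "Mfi2 [mouse] antigen p97 (melanoma associated) identified by monoclonal antibodies 133.2 and 96.5;Homozygous negative",
  ontoID = "GENE_30060;TGEMO_00001",
  uri = "http://purl.org/commons/record/ncbi_gene/30060;http://purl.obolibrary.org/obo/TGEMO_00001"
)
```

The length check is a deliberate guard: an annotation label that itself contains a semicolon would produce mismatched counts, and stopping is safer than silently pairing the wrong ID with the wrong label.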
Download expression data for a set of studies
studyIDs %>% sapply(function(x){datasetInfo(x,request= 'data',return= FALSE, file = paste0('data/',x))})
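The call above writes each file under data/, so that directory needs to exist if the function does not create it itself (an assumption; this README does not say either way). A small sketch that prepares the directory and builds one path per study ID follows. The helper name outputPaths is hypothetical, and the IDs 1, 2 and 3 are placeholders rather than real study IDs.

```r
# Hypothetical helper; not part of gemmaAPI.R.
# Create the output directory if needed and return one file path per study ID.
outputPaths <- function(ids, outDir = "data") {
  if (!dir.exists(outDir)) {
    dir.create(outDir, recursive = TRUE)
  }
  file.path(outDir, ids)
}

# Demonstrated in a temporary directory so nothing is written to the project.
demoDir <- file.path(tempdir(), "gemma-demo")
paths <- outputPaths(c(1, 2, 3), outDir = demoDir) # placeholder IDs

# With gemmaAPI.R loaded, each path could then be passed on, for example:
# datasetInfo(1, request = 'data', return = FALSE, file = paths[1])
```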
17 September 2018:
- Start writing changelog…
- compileMetadata function now returns all quality information in geeq. Existing column names for batch effect information have been altered to better explain what they are.
- compileMetadata now returns a list instead of a data frame for experiment-specific information if the desired output is a list.
- Endpoint functions are fine if their naming variable is NULL. For most cases this shouldn’t happen, but names are for interactive usage and should not be relied on.
- Started using proper semantic versioning
- TOC added to readme