| makeTranscriptDbFromBiomart {GenomicFeatures} | R Documentation |
The makeTranscriptDbFromBiomart function allows the user
to make a TranscriptDb object from transcript annotations
available on a BioMart database.
getChromInfoFromBiomart(biomart="ensembl",
dataset="hsapiens_gene_ensembl")
makeTranscriptDbFromBiomart(biomart="ensembl",
dataset="hsapiens_gene_ensembl",
transcript_ids=NULL,
circ_seqs=DEFAULT_CIRC_SEQS)
biomart |
which BioMart database to use.
Get the list of all available BioMart databases with the
listMarts function from the biomaRt package.
|
dataset |
which dataset from BioMart. For example: "hsapiens_gene_ensembl" (the default).
|
transcript_ids |
optionally, only retrieve transcript annotation data for the specified set of transcript IDs. If this is used, the metadata displayed for the resulting TranscriptDb object will say 'Full dataset: no'; otherwise it will say 'Full dataset: yes'. |
circ_seqs |
a character vector listing the chromosomes that should be marked as circular. |
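As a rough illustration of the circ_seqs argument, the sketch below extends the default set of circular sequences with one extra name. The name "pToy" and the variable my_circ_seqs are made up for this example. The final call is left commented out because it needs network access to the BioMart server.

```r
library(GenomicFeatures)

## The defaults shipped with the package (a character vector).
DEFAULT_CIRC_SEQS

## Add a hypothetical circular sequence name, "pToy", to the defaults.
## union() avoids duplicating a name that is already present.
my_circ_seqs <- union(DEFAULT_CIRC_SEQS, "pToy")
my_circ_seqs

## Requires a live BioMart connection, so it is not run here:
## txdb <- makeTranscriptDbFromBiomart(circ_seqs=my_circ_seqs)
```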
makeTranscriptDbFromBiomart is a convenience function that feeds
data from a BioMart database to the lower level makeTranscriptDb
function.
See ?makeTranscriptDbFromUCSC for a similar function
that feeds data from the UCSC source.
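To give an idea of what the lower level makeTranscriptDb function expects, here is a minimal offline sketch that builds a TranscriptDb object from two hand-written data frames. All values (chromosome "chrToy", transcript names, coordinates) are invented for illustration and do not come from BioMart. The sketch assumes a GenomicFeatures version that still exports makeTranscriptDb; later releases rename it to makeTxDb.

```r
library(GenomicFeatures)

## Hypothetical toy annotation: two transcripts on one made-up chromosome.
transcripts <- data.frame(
    tx_id     = 1:2,
    tx_name   = c("toy_tx1", "toy_tx2"),
    tx_chrom  = c("chrToy", "chrToy"),
    tx_strand = c("+", "-"),
    tx_start  = c(1L, 301L),
    tx_end    = c(200L, 500L),
    stringsAsFactors = FALSE
)

## One row per exon; exon_rank orders the exons within each transcript.
splicings <- data.frame(
    tx_id      = c(1L, 1L, 2L),
    exon_rank  = c(1L, 2L, 1L),
    exon_start = c(1L, 151L, 301L),
    exon_end   = c(100L, 200L, 500L)
)

## Build the object directly, without contacting any remote database.
txdb_toy <- makeTranscriptDb(transcripts, splicings)
txdb_toy
```

makeTranscriptDbFromBiomart assembles tables of this general kind from the BioMart query results before handing them to makeTranscriptDb.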
BioMart databases that are known to have compatible transcript annotations are:
the most recent ensembl: ENSEMBL GENES (SANGER UK)
the most recent bacterial_mart: ENSEMBL BACTERIA (EBI UK)
the most recent fungal_mart: ENSEMBL FUNGAL (EBI UK)
the most recent metazoa_mart: ENSEMBL METAZOA (EBI UK)
the most recent plant_mart: ENSEMBL PLANT (EBI UK)
the most recent protist_mart: ENSEMBL PROTISTS (EBI UK)
the most recent ensembl_expressionmart: EURATMART (EBI UK)
Not all annotations will have CDS information.
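One possible way to check whether a given TranscriptDb object carries CDS information is to count the ranges returned by the cds extractor. The helper has_cds below is hypothetical (it is not part of GenomicFeatures) and is only a sketch.

```r
library(GenomicFeatures)

## Hypothetical helper: TRUE if the TranscriptDb object holds at least
## one CDS range, FALSE if the cds() extractor returns nothing.
has_cds <- function(txdb) {
    length(cds(txdb)) > 0L
}
```

Call it as has_cds(txdb) on an object returned by makeTranscriptDbFromBiomart.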
A TranscriptDb object.
M. Carlson and H. Pages
listMarts,
useMart,
listDatasets,
DEFAULT_CIRC_SEQS,
makeTranscriptDbFromUCSC,
makeTranscriptDb
## Discover which datasets are available in the "ensembl" BioMart
## database:
library(GenomicFeatures)
library(biomaRt)
listDatasets(useMart("ensembl"))
## Retrieving an incomplete transcript dataset for Human from the
## "ensembl" BioMart database:
transcript_ids <- c(
    "ENST00000268655",
    "ENST00000313243",
    "ENST00000341724",
    "ENST00000400839",
    "ENST00000400840",
    "ENST00000435657",
    "ENST00000478783"
)
txdb <- makeTranscriptDbFromBiomart(transcript_ids=transcript_ids)
txdb # note that these annotations match the GRCh37 genome assembly