I was looking to set the Seqinfo on a DNAStringSet (created by importing a FASTA file) but found some inconsistent/incomplete behaviour, especially when rtracklayer was loaded and attached.
I'm not actually sure whether DNAStringSet is meant to support Seqinfo, but I'd really like it if it did (and had reliable/consistent behaviour).
I think this example should illustrate the issue(s).
Cheers,
Pete
suppressPackageStartupMessages(library(Biostrings))
suppressPackageStartupMessages(library(GenomeInfoDb))
suppressPackageStartupMessages(library(GenomicRanges))
y <- DNAStringSet(x = c("s1" = "CAG", "s2" = "GGGGGT"))
seqinfo(y)
#> Seqinfo object with 2 sequences from an unspecified genome; no seqlengths:
#>   seqnames seqlengths isCircular genome
#>   s1               NA         NA   <NA>
#>   s2               NA         NA   <NA>
seqnames(y) # Shouldn't this give same result as seqnames(seqinfo(y))?
#> Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'seqnames' for signature '"DNAStringSet"'
seqlevels(y)
#> [1] "s1" "s2"
isCircular(y)
#> s1 s2
#> NA NA
genome(y)
#> s1 s2
#> NA NA
seqlengths(y)
#> s1 s2
#> NA NA
seqinfo(y) <- Seqinfo(
  seqnames = as.character(seq_along(y)),
  seqlengths = lengths(y) + 100, # Unsure if changing length should be allowed.
  isCircular = c(FALSE, TRUE),
  genome = rep("fake", length(y)))
seqinfo(y)
#> Seqinfo object with 2 sequences (1 circular) from fake genome:
#>   seqnames seqlengths isCircular genome
#>   1               103      FALSE   fake
#>   2               106       TRUE   fake
seqnames(y)
#> Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'seqnames' for signature '"DNAStringSet"'
seqlevels(y)
#> [1] "1" "2"
isCircular(y)
#>     1     2
#> FALSE  TRUE
genome(y)
#>      1      2
#> "fake" "fake"
seqlengths(y) # Unsure if changing length should be allowed.
#>   1   2
#> 103 106
suppressPackageStartupMessages(library(rtracklayer))
# Now get completely different results!
seqinfo(y)
#> Seqinfo object with 2 sequences from an unspecified genome:
#>   seqnames seqlengths isCircular genome
#>   s1                3         NA   <NA>
#>   s2                6         NA   <NA>
seqnames(y)
#> Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'seqnames' for signature '"DNAStringSet"'
seqlevels(y)
#> [1] "s1" "s2"
isCircular(y)
#> s1 s2
#> NA NA
genome(y)
#> s1 s2
#> NA NA
seqlengths(y)
#> s1 s2
#>  3  6
Cleaning up older bug reports and coming back to this now--sorry for the lack of activity thus far. Some things have changed since this was initially reported, but there are still a few outstanding issues.
seqlengths are now inferred correctly and consistently in the current build (every call returns 3 and 6, whether or not rtracklayer is attached)
seqnames(x) still throws an error when x is a DNAStringSet
changing the lengths is not allowed in seqinfo<-
seqnames(x) should not throw an error for XStringSet objects, so I'll add that to the list of bugs to resolve. The rest of this should be good to go in the latest release.
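Until `seqnames()` gains an XStringSet method, the names can be reached through accessors that the reprex already shows working. The following is only a sketch of a workaround, not the planned fix. It assumes the same packages as the original report are attached and that `seqinfo()` and `seqlevels()` still dispatch on a DNAStringSet as they did there.

```r
suppressPackageStartupMessages(library(Biostrings))
suppressPackageStartupMessages(library(GenomeInfoDb))
suppressPackageStartupMessages(library(GenomicRanges))

# Same toy object as in the original report.
y <- DNAStringSet(c(s1 = "CAG", s2 = "GGGGGT"))

# seqnames() has no DNAStringSet method, so ask the Seqinfo object instead.
via_seqinfo <- seqnames(seqinfo(y))

# seqlevels() does dispatch on a DNAStringSet (see the reprex above).
via_seqlevels <- seqlevels(y)

# Both routes are expected to agree; this is an assumption checked at run time.
stopifnot(identical(via_seqinfo, via_seqlevels))
```

Either call can stand in for `seqnames(y)` in user code for the time being.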
Created on 2018-11-07 by the reprex package (v0.2.1)