Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistency of seqinfo getter/setter on DNAStringSet #18

Open
PeteHaitch opened this issue Nov 7, 2018 · 1 comment
Open

Consistency of seqinfo getter/setter on DNAStringSet #18

PeteHaitch opened this issue Nov 7, 2018 · 1 comment

Comments

@PeteHaitch
Copy link

PeteHaitch commented Nov 7, 2018

Hi @hpages and @lawremi,

I was looking to set the Seqinfo on a DNAStringSet (created by importing a FASTA file) but found some inconsistent/incomplete behaviour, especially when rtracklayer was loaded and attached.
I'm not actually sure whether DNAStringSet is meant to support Seqinfo, but I'd really like it if it did (and had reliable/consistent behaviour).
I think this example should illustrate the issue(s).

Cheers,
Pete

suppressPackageStartupMessages(library(Biostrings))
suppressPackageStartupMessages(library(GenomeInfoDb))
suppressPackageStartupMessages(library(GenomicRanges))

y <- DNAStringSet(x = c("s1" = "CAG", "s2" = "GGGGGT"))
seqinfo(y)
#> Seqinfo object with 2 sequences from an unspecified genome; no seqlengths:
#>   seqnames seqlengths isCircular genome
#>   s1               NA         NA   <NA>
#>   s2               NA         NA   <NA>
seqnames(y) # Shouldn't this give same result as seqnames(seqinfo(y))?
#> Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'seqnames' for signature '"DNAStringSet"'
seqlevels(y)
#> [1] "s1" "s2"
isCircular(y)
#> s1 s2 
#> NA NA
genome(y)
#> s1 s2 
#> NA NA
seqlengths(y)
#> s1 s2 
#> NA NA

seqinfo(y) <- Seqinfo(
  seqnames = as.character(seq_along(y)), 
  seqlengths = lengths(y) + 100, # Unsure if changing length should be allowed.
  isCircular = c(FALSE, TRUE),
  genome = rep("fake", length(y)))
seqinfo(y)
#> Seqinfo object with 2 sequences (1 circular) from fake genome:
#>   seqnames seqlengths isCircular genome
#>   1               103      FALSE   fake
#>   2               106       TRUE   fake
seqnames(y)
#> Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'seqnames' for signature '"DNAStringSet"'
seqlevels(y)
#> [1] "1" "2"
isCircular(y)
#>     1     2 
#> FALSE  TRUE
genome(y)
#>      1      2 
#> "fake" "fake"
seqlengths(y) # Unsure if changing length should be allowed.
#>   1   2 
#> 103 106

suppressPackageStartupMessages(library(rtracklayer)) 
# Now get completely different results!
seqinfo(y)
#> Seqinfo object with 2 sequences from an unspecified genome:
#>   seqnames seqlengths isCircular genome
#>   s1                3         NA   <NA>
#>   s2                6         NA   <NA>
seqnames(y)
#> Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'seqnames' for signature '"DNAStringSet"'
seqlevels(y)
#> [1] "s1" "s2"
isCircular(y)
#> s1 s2 
#> NA NA
genome(y)
#> s1 s2 
#> NA NA
seqlengths(y)
#> s1 s2 
#>  3  6

Created on 2018-11-07 by the reprex package (v0.2.1)

Session info
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.5.1 (2018-07-02)
#>  os       macOS Sierra 10.12.6        
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_AU.UTF-8                 
#>  ctype    en_AU.UTF-8                 
#>  tz       Australia/Adelaide          
#>  date     2018-11-07                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package              * version   date       lib source        
#>  assertthat             0.2.0     2017-04-11 [1] CRAN (R 3.5.0)
#>  backports              1.1.2     2017-12-13 [1] CRAN (R 3.5.0)
#>  base64enc              0.1-3     2015-07-28 [1] CRAN (R 3.5.0)
#>  Biobase                2.42.0    2018-10-30 [1] Bioconductor  
#>  BiocGenerics         * 0.28.0    2018-10-30 [1] Bioconductor  
#>  BiocParallel           1.16.0    2018-10-30 [1] Bioconductor  
#>  Biostrings           * 2.50.0    2018-10-30 [1] Bioconductor  
#>  bitops                 1.0-6     2013-08-17 [1] CRAN (R 3.5.0)
#>  callr                  3.0.0     2018-08-24 [1] CRAN (R 3.5.0)
#>  cli                    1.0.1     2018-09-25 [1] CRAN (R 3.5.0)
#>  crayon                 1.3.4     2017-09-16 [1] CRAN (R 3.5.0)
#>  debugme                1.1.0     2017-10-22 [1] CRAN (R 3.5.0)
#>  DelayedArray           0.8.0     2018-10-30 [1] Bioconductor  
#>  desc                   1.2.0     2018-05-01 [1] CRAN (R 3.5.0)
#>  devtools               2.0.1     2018-10-26 [1] CRAN (R 3.5.1)
#>  digest                 0.6.18    2018-10-10 [1] CRAN (R 3.5.1)
#>  evaluate               0.12      2018-10-09 [1] CRAN (R 3.5.1)
#>  fs                     1.2.6     2018-08-23 [1] CRAN (R 3.5.0)
#>  GenomeInfoDb         * 1.18.0    2018-10-30 [1] Bioconductor  
#>  GenomeInfoDbData       1.2.0     2018-10-06 [1] Bioconductor  
#>  GenomicAlignments      1.18.0    2018-10-30 [1] Bioconductor  
#>  GenomicRanges        * 1.34.0    2018-10-30 [1] Bioconductor  
#>  glue                   1.3.0     2018-07-17 [1] CRAN (R 3.5.0)
#>  htmltools              0.3.6     2017-04-28 [1] CRAN (R 3.5.0)
#>  IRanges              * 2.16.0    2018-10-30 [1] Bioconductor  
#>  knitr                  1.20      2018-02-20 [1] CRAN (R 3.5.0)
#>  lattice                0.20-38   2018-11-04 [1] CRAN (R 3.5.0)
#>  magrittr               1.5       2014-11-22 [1] CRAN (R 3.5.0)
#>  Matrix                 1.2-15    2018-11-01 [1] CRAN (R 3.5.0)
#>  matrixStats            0.54.0    2018-07-23 [1] CRAN (R 3.5.1)
#>  memoise                1.1.0     2017-04-21 [1] CRAN (R 3.5.0)
#>  pkgbuild               1.0.2     2018-10-16 [1] CRAN (R 3.5.1)
#>  pkgload                1.0.2     2018-10-29 [1] CRAN (R 3.5.1)
#>  prettyunits            1.0.2     2015-07-13 [1] CRAN (R 3.5.0)
#>  processx               3.2.0     2018-08-16 [1] CRAN (R 3.5.0)
#>  ps                     1.2.1     2018-11-06 [1] CRAN (R 3.5.1)
#>  R6                     2.3.0     2018-10-04 [1] CRAN (R 3.5.0)
#>  Rcpp                   0.12.19   2018-10-01 [1] CRAN (R 3.5.1)
#>  RCurl                  1.95-4.11 2018-07-15 [1] CRAN (R 3.5.0)
#>  remotes                2.0.2     2018-10-30 [1] CRAN (R 3.5.0)
#>  rlang                  0.3.0.1   2018-10-25 [1] CRAN (R 3.5.0)
#>  rmarkdown              1.10      2018-06-11 [1] CRAN (R 3.5.0)
#>  rprojroot              1.3-2     2018-01-03 [1] CRAN (R 3.5.0)
#>  Rsamtools              1.34.0    2018-10-30 [1] Bioconductor  
#>  rtracklayer          * 1.42.0    2018-10-30 [1] Bioconductor  
#>  S4Vectors            * 0.20.0    2018-10-30 [1] Bioconductor  
#>  sessioninfo            1.1.1     2018-11-05 [1] CRAN (R 3.5.1)
#>  stringi                1.2.4     2018-07-20 [1] CRAN (R 3.5.1)
#>  stringr                1.3.1     2018-05-10 [1] CRAN (R 3.5.0)
#>  SummarizedExperiment   1.12.0    2018-10-30 [1] Bioconductor  
#>  testthat               2.0.1     2018-10-13 [1] CRAN (R 3.5.0)
#>  usethis                1.4.0     2018-08-14 [1] CRAN (R 3.5.0)
#>  withr                  2.1.2     2018-03-15 [1] CRAN (R 3.5.0)
#>  XML                    3.98-1.16 2018-08-19 [1] CRAN (R 3.5.0)
#>  XVector              * 0.22.0    2018-10-30 [1] Bioconductor  
#>  yaml                   2.2.0     2018-07-25 [1] CRAN (R 3.5.1)
#>  zlibbioc               1.28.0    2018-10-30 [1] Bioconductor  
#> 
#> [1] /Library/Frameworks/R.framework/Versions/3.5/Resources/library
@PeteHaitch PeteHaitch changed the title Cleaning up seqinfo getter/setter on DNAStringSet Consistency of seqinfo getter/setter on DNAStringSet Nov 7, 2018
@ahl27
Copy link
Collaborator

ahl27 commented Jun 13, 2024

Cleaning up older bug reports and coming back to this now--sorry for the lack of activity thus far. Some things have changed since this was initially reported, but there are still a few outstanding issues.

  • seqlengths are correctly inferred and consistent in the current build (all results return 3,6 regardless of rtracklayer)
  • seqnames(x) still throws an error when x is a DNAStringSet
  • changes lengths is not allowed on seqinfo<-

seqnames(x) should not throw an error for XStringSet objects, so I'll add that to the list of bugs to resolve. The rest of this should be good to go in the latest release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants