Merge pull request #836 from daisymut/sammyseq
Added test data for SAMMYseq data, mapping on part of human chr22
daisymut authored Oct 17, 2023
2 parents 3b6777b + 18c0ab1 commit 6a78a93
Showing 6 changed files with 26 additions and 41 deletions.
41 changes: 13 additions & 28 deletions README.md
@@ -1,38 +1,23 @@
# ![nfcore/test-datasets](docs/images/test-datasets_logo.png)
Test data to be used for automated testing with the nf-core pipelines

> ⚠️ **Do not merge your test data to `master`! Each pipeline has a dedicated branch (and a special one for modules)**
# test-datasets: `sammyseq`

## Introduction
This branch contains data to be used for automated testing with the [nf-core/sammyseq](https://github.com/daisymut/sammyseq) pipeline.

nf-core is a collection of high quality Nextflow pipelines. This repository contains various files for CI and unit testing of nf-core pipelines and infrastructure.
## Content of this repository

The principle for nf-core test data is as small as possible, as large as necessary. Please see the [guidelines](https://nf-co.re/docs/contributing/test_data_guidelines) for more detailed information. Always ask for guidance on the [nf-core slack](https://nf-co.re/join) before adding new test data.
`testdata/CTRL004_S*_chr22only.fq.gz`: Human fibroblast single-end reads, sub-sampled so that they map to part of chr22, used as test data for the pipeline.
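
As a minimal sketch, the three test files could be fetched individually instead of cloning the branch. The repository name, the branch name `sammyseq`, and the `raw.githubusercontent.com` URL layout are assumptions, not something this README states. The file names follow the binary files added in this commit (`CTRL004_S*_chr22_tinier.fq.gz`), which differ from the glob quoted above. `fraction_url` is a hypothetical helper.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumed values: repository, branch and raw-content host are not given in the README.
REPO="nf-core/test-datasets"
BRANCH="sammyseq"
BASE_URL="https://raw.githubusercontent.com/${REPO}/${BRANCH}/testdata"

# Print the assumed download URL for one SAMMY-seq fraction (S2, S3 or S4).
fraction_url() {
    local fraction="$1"
    printf '%s/CTRL004_%s_chr22_tinier.fq.gz\n' "$BASE_URL" "$fraction"
}

# List the URLs for all three fractions.
for fraction in S2 S3 S4; do
    fraction_url "$fraction"
done
```

Each printed URL can then be passed to a downloader such as `curl -fsSLO "<url>"`; check the branch listing first, since the paths here are illustrative.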

## Documentation
## Minimal test dataset origin

nf-core/test-datasets comes with documentation in the `docs/` directory:
_H. sapiens_ fibroblast, 50bp single-end 3-fraction SAMMY-seq sequences were obtained from:

01. [Add a new test dataset](https://github.com/nf-core/test-datasets/blob/master/docs/ADD_NEW_DATA.md)
02. [Use an existing test dataset](https://github.com/nf-core/test-datasets/blob/master/docs/USE_EXISTING_DATA.md)
> Sebestyén, E., Marullo, F., Lucini, F. et al. SAMMY-seq reveals early alteration of heterochromatin and deregulation of bivalent genes in Hutchinson-Gilford Progeria Syndrome. Nat Commun 11, 6274 (2020). https://doi.org/10.1038/s41467-020-20048-9. [Pubmed](https://pubmed.ncbi.nlm.nih.gov/33293552/) [GEO](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE118633)
## Downloading test data
### Sampling information

Due to the large number of large files in this repository for each pipeline, we highly recommend cloning only the branches you would use.

```bash
git clone <url> --single-branch --branch <pipeline/modules/branch_name>
```

To subsequently clone other branches[^1]

```bash
git remote set-branches --add origin [remote-branch]
git fetch
```

## Support

For further information or help, don't hesitate to get in touch on our [Slack organisation](https://nf-co.re/join/slack) (a tool for instant messaging).

[^1]: From [stackoverflow](https://stackoverflow.com/a/60846265/11502856)
| GEO_sample | run_accession | read_count | SRA_experiment | sample_title |
| ---------- | ------------- | ---------- | -------------- | -------------------- |
| GSM3335763 | SRR7610706 | 78683296 | SRX4475555 | CTRL004 SAMMY-seq S2 |
| GSM3335764 | SRR7610707 | 60438514 | SRX4475554 | CTRL004 SAMMY-seq S3 |
| GSM3335765 | SRR7610708 | 54864540 | SRX4475553 | CTRL004 SAMMY-seq S4 |
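
The `read_count` column above refers to the full SRA runs, so the sub-sampled files will contain far fewer reads. A minimal sketch for checking how many reads a gzipped FASTQ file holds, assuming the standard four-lines-per-record layout; `count_reads` is a hypothetical helper, not part of the pipeline.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Count reads in a gzipped FASTQ file, assuming four lines per record.
count_reads() {
    local fastq_gz="$1"
    local lines
    lines=$(gzip -dc "$fastq_gz" | wc -l)
    echo $(( lines / 4 ))
}

# When file paths are given as arguments, print "<file><TAB><read count>" for each.
if [ "$#" -gt 0 ]; then
    for f in "$@"; do
        printf '%s\t%s\n' "$f" "$(count_reads "$f")"
    done
fi
```

Saved as, say, `count_reads.sh`, it could be run as `bash count_reads.sh testdata/*.fq.gz` from a checkout of the branch.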
20 changes: 10 additions & 10 deletions docs/ADD_NEW_DATA.md
@@ -2,13 +2,13 @@

Please fill in the appropriate checklist below (delete whatever is not relevant). These are the most common things requested when adding a new test dataset.

- [ ] Check [here](https://github.com/nf-core/test-datasets/branches/all) that there isn't already a branch containing data that could be used
- If this is the case, follow the [documentation on how to use an existing test dataset](https://github.com/nf-core/test-datasets/blob/master/docs/USE_EXISTING_DATA.md)
- [ ] Fork the [nf-core/test-datasets repository](https://github.com/nf-core/test-datasets) to your GitHub account
- [ ] Create a new branch on your fork
- [ ] Check your proposed test data follows the [guidelines](https://nf-co.re/docs/contributing/test_data_guidelines)
- [ ] Add your test dataset
- [ ] If you clone it locally use `git clone <url> --branch <branch> --single-branch`
- [ ] Make a PR on a new branch with a relevant name
- [ ] Wait for the PR to be merged
- [ ] Use this newly created branch for your tests
- [ ] Check [here](https://github.com/nf-core/test-datasets/branches/all) that there isn't already a branch containing data that could be used
- If this is the case, follow the [documentation on how to use an existing test dataset](https://github.com/nf-core/test-datasets/blob/master/docs/USE_EXISTING_DATA.md)
- [ ] Fork the [nf-core/test-datasets repository](https://github.com/nf-core/test-datasets) to your GitHub account
- [ ] Create a new branch on your fork
- [ ] Check your proposed test data follows the [guidelines](https://nf-co.re/docs/contributing/test_data_guidelines)
- [ ] Add your test dataset
- [ ] If you clone it locally use `git clone <url> --branch <branch> --single-branch`
- [ ] Make a PR on a new branch with a relevant name
- [ ] Wait for the PR to be merged
- [ ] Use this newly created branch for your tests
6 changes: 3 additions & 3 deletions docs/USE_EXISTING_DATA.md
@@ -2,6 +2,6 @@

Please fill in the appropriate checklist below (delete whatever is not relevant). These are the most common things requested when adding a new test dataset.

- [ ] Check [here](https://github.com/nf-core/test-datasets/branches/all) to find the branch corresponding to the test dataset you want to use
- [ ] Specify in the `conf/test.config` the path to the files from the test dataset
- [ ] Set up your CI tests following the nf-core best practices (cf [.github/workflows/ci.yml template](https://github.com/nf-core/tools/blob/dev/nf_core/pipeline-template/.github/workflows/ci.yml))
- [ ] Check [here](https://github.com/nf-core/test-datasets/branches/all) to find the branch corresponding to the test dataset you want to use
- [ ] Specify in the `conf/test.config` the path to the files from the test dataset
- [ ] Set up your CI tests following the nf-core best practices (cf [.github/workflows/ci.yml template](https://github.com/nf-core/tools/blob/dev/nf_core/pipeline-template/.github/workflows/ci.yml))
Binary file added testdata/CTRL004_S2_chr22_tinier.fq.gz
Binary file not shown.
Binary file added testdata/CTRL004_S3_chr22_tinier.fq.gz
Binary file not shown.
Binary file added testdata/CTRL004_S4_chr22_tinier.fq.gz
Binary file not shown.
