- Fix up CLI tool to take command line options instead of being interactive ✅
- Come up with a suitable architecture for the web service ✅
- Think about the relation between the web service and the CLI tool ✅
- Remove dependence on AWS (for ro-data) ✅
- Create database container or data storage that is permanent to store gene analysis results ✅
- Consider hosting BED files directly on the running machine for UCSC Genome Browser ✅
- Look into `robots.txt` and consider creating one
- Prune suitable requirements for each of the services
- Reduce image size (base from Alpine, remove apt-caches, multi-stage build)
- Create multiple containers for different processes (computation, web serving, database) ✅
- Orchestrate with Compose ✅ (read up on the difference between Compose and Kubernetes)
- Mount Docker volume for large read-only binding site data (separate data from logic) ✅ (Data is now handled by the CLI tool, and supplied through a volume)
- Push to Docker Hub or other suitable registry ✅
- Finish up unit tests
- Test binding site accuracy
- The following should hold for the BED file outputs when analyzing Malat1:
  - Attract: A1CF should have a binding site on `chr11 65503843 65503850`
  - POSTAR: AGGF1 should have a binding site on `chr11 65499702 65499791`
  - RBPDB: ELAVL2 should have a binding site on `chr11 65498029 65498033`
- Implement methods straight from paper methodologies so each data source matches database exactly
- Make use of binding strength; add shade on UCSC
- Make binding density plots a weighted sum based on strengths
- Consider not merging binding sites at all
- Migrate from Heroku to personal server, ensure easy Docker setup ✅
- Modify GitLab CI accordingly
- Figure out daemonization of Django process, fault tolerance, etc.
- Get a daemon for the web service ✅ (`webd` is a process now)
- Add chromosome diagrams for analysis completion view
- Perform background continuous analysis of RNA binding sites
- Create a single UCSC track for binding site densities for each method, host on S3 or server
- Add explanation on "About" section ✅ (Perhaps room for improvement)
- Convert jobs on GitLab to GitHub Actions
- Disable mirroring on GitLab ✅
- If a gene has no binding sites, it could raise errors (at least for POSTAR, e.g. SPRR4) ✅ (SPRR4 case only; check for zero binding sites from all sources)
- Unrecognized 'db' errors (no idea how to reproduce)
- Direct to HTTPS for UCSC Genome Browser ✅
- Make personal server use SSL as well ✅
- Allow generating a track dedicated for a particular RBP showing binding sites for all genes analyzed so far
- Consider using a different genome browser (e.g. [JBrowse](https://github.com/GMOD/jbrowse) or [IGV](https://github.com/igvteam/igv.js/))
- Consider the scalability of the current setup
- Consider security against DoS or other attacks
- Set up production vs development envs ✅ (Refer to Makefile)
- Automate production-push
- Consider serving static files via Apache / nginx instead of Django ✅
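
The binding site accuracy item above (the three expected Malat1 sites) could be turned into an automated check along the following lines. This is only a sketch: `parse_bed`, `has_site`, and `EXPECTED_SITES` are hypothetical names, and it assumes each BED output stores the RBP name in the fourth (name) column and that the expected interval appears unmerged, with exactly the listed coordinates.

```python
from typing import Dict, Iterable, List, NamedTuple


class BedInterval(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str  # assumed to hold the RBP name (BED column 4)


# Expected Malat1 binding sites per data source, taken from the list above.
EXPECTED_SITES: Dict[str, BedInterval] = {
    "Attract": BedInterval("chr11", 65503843, 65503850, "A1CF"),
    "POSTAR": BedInterval("chr11", 65499702, 65499791, "AGGF1"),
    "RBPDB": BedInterval("chr11", 65498029, 65498033, "ELAVL2"),
}


def parse_bed(lines: Iterable[str]) -> List[BedInterval]:
    """Parse BED lines, skipping blank, comment, track, and browser lines."""
    intervals = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        # Columns 1-3 are mandatory in BED; the name column is optional.
        name = fields[3] if len(fields) > 3 else ""
        intervals.append(BedInterval(fields[0], int(fields[1]), int(fields[2]), name))
    return intervals


def has_site(intervals: Iterable[BedInterval], expected: BedInterval) -> bool:
    """Return True if an interval with identical coordinates and name exists."""
    return any(interval == expected for interval in intervals)
```

In a unit test, each source's BED output for Malat1 would be read with `parse_bed(open(path))` and checked with `has_site(intervals, EXPECTED_SITES[source])`. If binding sites stay merged, the exact-match comparison would need to be relaxed to a containment check.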