Skip to content

Final project for 20.440 on TCR diversity in the setting of neo-adjuvant chemotherapy treatment of colorectal cancer liver metastasis

Notifications You must be signed in to change notification settings

tru489/20_440_TCR_Project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Title

TCR repertoire analysis reveals chemotherapy induced clonal expansion in colorectal liver metastasis

Overview

This repo contains code used for analysis of TCRB sequencing data from patients with colorectal liver metastasis (CLM) treated with long or short doses of chemotherapy compared to non-treated patients [1]. The data are deposited on the immuneACCESS database at the following link with DOI: 10.21417/EH2023GS. The goal of this project is to understand the role of chemotherapy in reshaping the adaptive immune response in cancer.

This repo was created by Dennis Gong and Thomas Usherwood as part of the Biological Networks class at MIT (20.440). Please direct questions to [email protected] and [email protected].

Data

The data was generated using the immunoSEQ hsTCRB Kit with samples obtained by Høye et al., with an abstract published in Cancer Research, accessible here. The dataset contains 92 samples, with 35 patients receiving neoadjuvant chemotherapy (NACT) for short interval and 15 patients receiving long interval NACT. An additional 35 patients did not receive NACT. All repertoires are stored in Adaptive ImmunoSEQ format.

NOTE: The data files are large, so they are not included in the repository. When first cloning the repository, generate the required data folder structure using:

$ mkdir data
$ cd data/
$ mkdir analysis
$ mkdir processed
$ mkdir raw
$ cd ../

From the data listed above, populate ./data/raw/ with the sequencing files for each patient, and populate ./data/analysis with the sample overview file (both downloaded from the ImmunoSeq link above).

Folder Structure

20.440-TCR-Analysis/
|__ README.md		
|__ notebooks/						
|__ src/ 						
	|__ data/ 					
	|__ analysis/ 				
	|__ visualization/ 			
	|__ util/ 					
|__ data/						
	|__ raw/					
		|__ ChemoProjTCRs/	
	|__ processed/				
	|__ analysis/				
		|__ SampleOverview.tsv		
|__ fig/ 						
	|__ main_fig/				
	|__ tables/					
	|__ supp_fig/				

'src' contains all scripts for generating results, which include scripts for cleaning data (data/), scripts for analyzing this cleaned data (analysis/), scripts for visualizing results (visualization/), and commonly reused scripts and helper functions (util/). In the data/ folder in the parent directory, raw data are stored in raw/ which contains all ImmunoSEQ formatted repertoires. Processed datasets and any intermediates are stored in processed/, and preprocessed data that are directly used for plotting are stored in analysis/. Exploratory notebooks are stored in the notebook/ directory in the parent directory. In the fig/ directory, there are three subdirectories used to store main figures, tables, and supplemental figures.

Installation

The entire pipeline for analysis and visualization can be run from main.py:

$ python main.py

This will perform all calculations and save plots to the appropriate folders. For example, the figure I generated for pset 4 is in ./fig/main_fig/unique_vdjgenes.jpg. Alt text

Note that in the main function of main.py, there is a setting to toggle whether saved data generated previously (stored in ./data/processed/) should be used to create plots, or whether data should be generated and saved from scratch. Toggle this by changing the load_precalculated variable.

Python package dependencies can be found in "requirements.txt". To intialize a virtual enviroment with these dependencies:

$ python -m venv venv
$ source ./venv/bin/activate
(venv) $ pip install -r requirements.txt

References

  1. Høye, E. et al. Abstract 1346: T cell receptor repertoire sequencing reveals chemotherapy-driven clonal expansion in colorectal liver metastases. Cancer Research 82, 1346 (2022).

About

Final project for 20.440 on TCR diversity in the setting of neo-adjuvant chemotherapy treatment of colorectal cancer liver metastasis

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages