Title

TCR repertoire analysis reveals chemotherapy-induced clonal expansion in colorectal liver metastasis

Overview

This repo contains the code used to analyze TCRB sequencing data from patients with colorectal liver metastasis (CLM) who received long- or short-interval chemotherapy, compared with untreated patients [1]. The data are deposited in the immuneACCESS database (DOI: 10.21417/EH2023GS). The goal of this project is to understand the role of chemotherapy in reshaping the adaptive immune response in cancer.

This repo was created by Dennis Gong and Thomas Usherwood as part of the Biological Networks class at MIT (20.440). Please direct questions to [email protected] and [email protected].

Data

The data were generated using the immunoSEQ hsTCRB Kit from samples obtained by Høye et al., with an abstract published in Cancer Research [1]. The dataset contains 92 samples, with 35 patients receiving short-interval neoadjuvant chemotherapy (NACT) and 15 patients receiving long-interval NACT. An additional 35 patients did not receive NACT. All repertoires are stored in Adaptive immunoSEQ format.

NOTE: The data files are large, so they are not included in the repository. When first cloning the repository, generate the required data folder structure using:

$ mkdir data
$ cd data/
$ mkdir analysis
$ mkdir processed
$ mkdir raw
$ cd ../

From the data listed above, populate ./data/raw/ with the sequencing files for each patient, and populate ./data/analysis/ with the sample overview file (SampleOverview.tsv). Both are downloaded from the immuneACCESS link above.
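The same folder structure can also be created from Python, which may be convenient on systems without a POSIX shell. This is a minimal sketch and not part of the repository; the function name make_data_dirs and the constant DATA_SUBDIRS are hypothetical and chosen only for illustration.

```python
from pathlib import Path

# Subfolders of data/ named in this README.
# DATA_SUBDIRS and make_data_dirs are illustrative names, not repository code.
DATA_SUBDIRS = ("analysis", "processed", "raw")


def make_data_dirs(repo_root="."):
    """Create data/analysis, data/processed and data/raw under repo_root.

    Returns the list of directory paths, whether newly created or already present.
    """
    paths = []
    for name in DATA_SUBDIRS:
        path = Path(repo_root) / "data" / name
        # parents=True also creates data/; exist_ok=True makes reruns harmless.
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths
```

Calling make_data_dirs() from the repository root should leave the same three folders as the mkdir commands above, and calling it again does nothing if they already exist.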

Folder Structure

20.440-TCR-Analysis/
|__ README.md		
|__ notebooks/						
|__ src/ 						
	|__ data/ 					
	|__ analysis/ 				
	|__ visualization/ 			
	|__ util/ 					
|__ data/						
	|__ raw/					
		|__ ChemoProjTCRs/	
	|__ processed/				
	|__ analysis/				
		|__ SampleOverview.tsv		
|__ fig/ 						
	|__ main_fig/				
	|__ tables/					
	|__ supp_fig/

The src/ directory contains all scripts for generating results: scripts for cleaning data (data/), scripts for analyzing the cleaned data (analysis/), scripts for visualizing results (visualization/), and commonly reused scripts and helper functions (util/). In the data/ folder in the parent directory, raw data are stored in raw/, which contains all immunoSEQ-formatted repertoires. Processed datasets and any intermediates are stored in processed/, and preprocessed data that are used directly for plotting are stored in analysis/. Exploratory notebooks are stored in the notebooks/ directory in the parent directory. The fig/ directory has three subdirectories for main figures, tables, and supplemental figures.

Installation

The entire pipeline for analysis and visualization can be run from main.py:

$ python main.py

This will perform all calculations and save plots to the appropriate folders. For example, the figure generated for pset 4 is saved to ./fig/main_fig/unique_vdjgenes.jpg.

Note that the main function of main.py has a setting that controls whether plots are created from previously saved data (stored in ./data/processed/) or whether the data are regenerated and saved from scratch. Toggle this by changing the load_precalculated variable.
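As an illustration of how a toggle like load_precalculated can work, the sketch below caches a computed result on disk and reuses it on later runs. It is not the code in main.py: the helper name load_or_compute, the use of pickle as the storage format, and the example file name are all assumptions.

```python
import pickle
from pathlib import Path


def load_or_compute(cache_path, compute, load_precalculated=True):
    """Return a cached result if allowed and available, else compute and save it.

    cache_path: file under ./data/processed/ (hypothetical location and format).
    compute: zero-argument function that produces the result from raw data.
    load_precalculated: mirrors the toggle described above; False forces a recompute.
    """
    cache_path = Path(cache_path)
    if load_precalculated and cache_path.exists():
        with cache_path.open("rb") as handle:
            return pickle.load(handle)

    # Either the toggle is off or no saved result exists yet.
    result = compute()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as handle:
        pickle.dump(result, handle)
    return result
```

For example, load_or_compute("data/processed/example.pkl", some_function, load_precalculated=False) would rerun some_function and overwrite the saved file, where example.pkl and some_function are placeholders. Falling back to a fresh computation when no saved file exists means the first run after cloning works with either setting.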

Python package dependencies are listed in "requirements.txt". To initialize a virtual environment with these dependencies:

$ python -m venv venv
$ source ./venv/bin/activate
(venv) $ pip install -r requirements.txt

References

[1] Høye, E. et al. Abstract 1346: T cell receptor repertoire sequencing reveals chemotherapy-driven clonal expansion in colorectal liver metastases. Cancer Research 82, 1346 (2022).
