Predicting the orthopoxvirus-mammal network with viral genetic features
This repository contains data and code to reproduce all analyses from "Viral genomic features predict orthopoxvirus reservoir hosts" by Tseng et al (2025).
To run the analysis, you can fork and clone this repository or download the "Tseng2020" folder directly to your local desktop (see File organization (recommended)). The code is separated into two markdown files, each contained within their corresponding folders: one for the host prediction model – HostPrediction_Code.Rmd – and the other for the link prediction model – LinkPrediction_Code.Rmd. Both markdown files are organized similarly into five parts (see Code organization) and draw data from the same source file, Data_raw.RData.
-
PoxHost > data > ... All source data can be found here with the exception of MAMMALS.shp, a large shape file (>1GB) of mammal geographical range required for mapping host distributions in the section titled "3. Mapping Host Distribution" (PoxHost > scripts > 3_Results&Figures.Rmd). We recommend you download this file separatel and save it directly to the the working directory of your local desktop. The file can be obtained from the IUCN Red List Spatial Database https://www.iucnredlist.org/resources/spatial-data-download.
-
PoxHost > figures > ... All figures, tables, and other outputs are saved to this file, which contain the following additional subfolders:
other > hosttrait other > linkpred > pca > ... supplementary > ...
-
PoxHost > renv > ... This folder contains the reproducible environment (package versions) for running the scripts in R.
-
PoxHost > scripts > ... All code for data cleaning, analysis, and visualizations can be found in the markdown files below and progress in the order as listed:
1_HostTraitModel.Rmd 2_LinkPredictionModel.Rmd 3_Results&Figures.Rmd
For HostTraitModel_Code.Rmd:
- Data Preparation
- Input: hosttrait_rawdata.RData
- Output: hosttrait_cleandata.RData
- Phylogenetic analysis
- BRT Model
- Input: hosttrait_cleandata.RData
- Output: pcr_brts.rds, comp_brts.rds, pm_brts.rds
For LinkPredictionModel_Code.Rmd:
- Dimensionality Reduction
- Input: OPVnew_nowwithVirus.xlsx
- Output: wgs_PCs.RData
- Data Preparation
- Input: linkpred_rawdata, wgs_PCs.RData
- Output: linkpred_cleandata.RData
- BRT Model
- Input: LinkData_clean.RData
To run the BRT models on a high performance computing (HPC) cluster (highly recommended), sample scripts and examples are available in the folder ~/PoxHost/hpc.