allelicBalance and checkHetsIndvVCF on WGS data #17

npatel-ah · 2024-04-30T15:53:12Z

Hello,

I came across this informative article, https://speciationgenomics.github.io/allelicBalance/, which led me to the "allelicBalance" and "checkHetsIndvVCF" scripts.

The article suggests that the script is not designed for WGS data, but the presentation at "https://github.com/speciationgenomics/presentations/blob/master/PDFs_2022/AllelicBalance_PCRduplication.pdf" shows an example with WGS data.

What are the downsides of using "checkHetsIndvVCF.sh" for WGS analysis? I am also curious how haploid individuals and difficult regions of the genome, such as repeats, will affect the results.

Regards,
Nihir

joanam · 2025-01-18T18:28:17Z

Hi Nihir,

This tool is absolutely meant for NGS data. I would use it on SNPs only. It takes a VCF file as input. The tool only works for diploid individuals.

Repeats that are collapsed in the reference genome will generate false heterozygous sites and can thus, in theory, affect the results. If two repeat copies map to a single collapsed copy in the reference genome, this creates false heterozygotes with a read frequency of 0.5 at every site where the two repeat copies differ. This would not produce a false signature of contamination. If more than two repeat copies map to the collapsed region in the reference genome, it could in principle cause biased read frequencies. However, I have never seen this be a problem in any of the datasets I have worked with. It could perhaps cause problems if the reference genome is quite poorly assembled, or is from a distant relative, and there has been a recent TE expansion.

Best,
Joana
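
To illustrate the read frequencies described above, here is a minimal Python sketch. It is not the code of checkHetsIndvVCF.sh. The function names are hypothetical, and it assumes that the VCF carries the standard `GT` and `AD` FORMAT fields, that sites are biallelic, and that every collapsed repeat copy contributes the same read depth.

```python
from fractions import Fraction


def expected_alt_fraction(n_copies, n_copies_with_alt):
    """Expected alt-read fraction at a collapsed site.

    Assumption: n_copies repeat copies map to one reference locus with equal
    coverage, and n_copies_with_alt of them carry the alternative base.
    """
    if n_copies < 1 or not 0 <= n_copies_with_alt <= n_copies:
        raise ValueError("need 0 <= n_copies_with_alt <= n_copies and n_copies >= 1")
    return Fraction(n_copies_with_alt, n_copies)


def het_allele_balances(vcf_lines):
    """Yield (chrom, pos, sample_index, alt_fraction) for heterozygous calls.

    Only biallelic 0/1 genotypes with a two-value AD field are used.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and empty lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no genotype columns
        keys = fields[8].split(":")
        if "GT" not in keys or "AD" not in keys:
            continue
        gt_idx, ad_idx = keys.index("GT"), keys.index("AD")
        for sample_index, sample in enumerate(fields[9:]):
            values = sample.split(":")
            if len(values) <= max(gt_idx, ad_idx):
                continue  # trailing fields were dropped for this sample
            genotype = values[gt_idx].replace("|", "/")
            if genotype not in ("0/1", "1/0"):
                continue
            depths = values[ad_idx].split(",")
            if len(depths) != 2 or not all(d.isdigit() for d in depths):
                continue  # missing or multi-allelic depths
            ref_depth, alt_depth = int(depths[0]), int(depths[1])
            if ref_depth + alt_depth == 0:
                continue
            yield fields[0], int(fields[1]), sample_index, alt_depth / (ref_depth + alt_depth)


if __name__ == "__main__":
    # Two collapsed copies that differ at a site look like a true heterozygote.
    print(expected_alt_fraction(2, 1))
    # Three collapsed copies, one differing, give a skewed read frequency.
    print(expected_alt_fraction(3, 1))
```

Under these assumptions, two collapsed copies give the same 0.5 read frequency as a genuine heterozygote, while three or more copies shift the frequency away from 0.5, which is the biased case mentioned above.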
