Opportunity Preview

CGT: COVID-19 Genotyping Tool

Technology

Genotyping the viral evolution of SARS-CoV-2 via machine learning

Background

The COVID-19 global pandemic is the greatest healthcare challenge of our time. Whole-genome sequencing (WGS) has provided a means to track SARS-CoV-2 viral genome evolution, which has direct implications for vaccine and therapeutic development. The unprecedented research community effort to make COVID-19-related data accessible has resulted in over 2,000 publicly available SARS-CoV-2 WGS sequences on GISAID.

Technology Overview

Researchers at UHN sought to analyze these sequences to gain genomic and epidemiological insights, and to deliver those insights through a web-based R Shiny application, the COVID-19 Genotyping Tool (CGT).

CGT not only summarizes complex WGS data through informative visualizations such as UMAP and minimum spanning tree networks, but also allows users to upload in-house SARS-CoV-2 genome sequencing data for concurrent analysis with public data. Using CGT with GISAID data, the team discovered sequence clusters of mixed and homogeneous origin, outbreak hubs in sequence networks, and frequent single-nucleotide polymorphisms in structural protein coding genes of SARS-CoV-2.
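
To make the idea of a minimum spanning tree network over genome sequences concrete, the following is a minimal Python sketch and not CGT's actual implementation (CGT itself is an R Shiny application). It assumes the sequences are already aligned to the same length, counts single-nucleotide differences between each pair, and links the sequences with Prim's algorithm. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
def snp_positions(a, b):
    """Return 0-based positions where two aligned sequences differ.

    Positions holding an ambiguous base (N) or a gap (-) in either
    sequence are skipped, which is an assumption of this sketch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"N", "-"}
    return [
        i
        for i, (x, y) in enumerate(zip(a, b))
        if x != y and x not in skip and y not in skip
    ]


def snp_distance(a, b):
    """Number of single-nucleotide differences between two aligned sequences."""
    return len(snp_positions(a, b))


def minimum_spanning_tree(seqs):
    """Build a minimum spanning tree over pairwise SNP distances (Prim's algorithm).

    seqs maps a sequence name to its aligned sequence.
    Returns a list of (name_in_tree, new_name, distance) edges.
    """
    names = list(seqs)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge from the current tree to an outside sequence.
        d, u, v = min(
            (snp_distance(seqs[u], seqs[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, d))
        in_tree.add(v)
    return edges


# Toy aligned sequences (hypothetical, far shorter than a real genome).
toy_seqs = {
    "ref": "ACGTACGTAC",
    "s1": "ACGTACGTAT",
    "s2": "ACGAACGTAT",
    "s3": "TCGTACGTAC",
}

tree = minimum_spanning_tree(toy_seqs)
for parent, child, dist in tree:
    print(f"{parent} -- {child} (SNPs: {dist})")
```

In a network like this, a sequence connected to many others by short edges would correspond to the kind of hub described above. Real SARS-CoV-2 genomes are roughly 30,000 bases long and must be aligned before a comparison of this kind is meaningful.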

The CGT application provides a user-friendly interactive platform for genotyping and epidemiological surveillance of SARS-CoV-2.