Data-Driven Virus Discovery: Characterizing Viromes and Increasing Pandemic Preparedness
Data-Driven Virus Discovery (DDVD) is revolutionizing how novel viruses are discovered. Because it does not depend on the collection and processing of biological samples, DDVD makes it possible to screen massive amounts of next-generation sequencing (NGS) data for known and unknown viral genome sequences. We use DDVD to analyze more than 1 million public NGS datasets from the Sequence Read Archive (SRA) and find more than 150,000 sequences of viral origin. We use these data to assess the risk of spillover into humans across RNA viruses and to study various aspects of viral evolution over geologic time scales.