Visanu Wanchai, Preecha Patumcharoenpol, Intawat Nookaew, and David Ussery
Background: It is well known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth of sequencing data in public databases allows us to explore ever-larger collections of genome sequences. However, it is less well known that the majority of sequenced genomes in public databases are not complete but are drafts of varying quality. We have calculated quality scores for more than 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database.
Conclusion: dBBQs provides genome quality scores for all available prokaryotic genome sequences through a user-friendly Web interface. These scores can be used as cut-offs to select a high-quality set of genomes for testing bioinformatics tools or improving an analysis. Moreover, all data for the four measurements that were combined to make the quality score for each genome are provided and can potentially be used for further analysis. dBBQs will be updated regularly, as the number of genomes in public databases is growing rapidly, and is free to use for non-commercial purposes. This work is funded in part by the Arkansas Research Alliance and the Helen Adams & Arkansas Research Alliance Professor & Chair.
Arkansas Center for Genomic Epidemiology & Medicine and the Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas 72205