The conclusion of the Human Genome Project in 2003 marked the first time that a human genome had been sequenced, with roughly 3 billion base pairs recorded. Since then, the amount of genomic data has increased dramatically: in the 15 years since, the number of sequenced human genomes has grown into the tens of thousands.
With this explosion of genomic data, the strain on our computational resources has likewise spiked, and bioinformaticians have not always had the technology to cope with the increased demand. The expense of new hardware and flawed development approaches have restricted researchers' ability to handle genomic data efficiently. Further, long wait times have increased both the running and maintenance costs of in-house networks and the time commitment required of each researcher.
More recently, technology capable of handling large datasets has started to become available to researchers. At the same time, the cost of computing has been coming down as a result of improved hardware production and efficiency. The increased availability of this technology has made it easier than ever to move into the genomics space and start working with biological data.
One year ago, we produced the first edition of our Genomic Data 101 to present you with an introduction to the technology and hardware available to facilitate data storage and analysis. Now, we’ve written a new guide with broader, deeper content to help you understand what considerations you need to make when designing a computational genomics workflow or platform.
With help from our sponsors, we've worked to bring you a clear, unbiased introduction to computational genomics and genomic data handling. We hope that you find this guide to be interesting and, most importantly, useful.