It’s safe to say that DNA is an important part of our daily lives. Not only does DNA code for our characteristics as humans, but it also differentiates us from other species and is frequently used to solve crimes. Given its biological role, you might be wondering what our genetic code has to do with Shakespeare and his sonnets. The answer may very well be the future of data storage.

It might sound as if it’s straight from the pages of a science fiction novel, but scientists have recently found a way to synthesize DNA encoded with information of all sorts, including all 154 of Shakespeare’s sonnets. Researcher Nick Goldman, of the European Bioinformatics Institute in Hinxton, England, is one of the leading experts in the field. As Goldman explains, DNA has the potential to replace hard drives and disks because of its long lifespan and durability. DNA can also store enormous amounts of data in a small space. In fact, Goldman estimates that all of the information in the world today (about one billion trillion bytes) would fit “in the back of your station wagon.”

Nick Goldman points to a segment of DNA storing approximately one billion megabytes of data.


It’s a bit difficult to fathom at first, but the idea behind storing data in DNA is relatively simple. The secret lies in the double-helix shape of the DNA molecule. While current technology stores information on essentially two-dimensional surfaces like CDs and microchips, DNA is a tightly wound three-dimensional molecule, which allows more information to be stored in a compact space. The structure of DNA also underlies the synthetic process used to encode it with information. DNA is built from four bases that form two kinds of base pairs: adenine (A) with thymine (T), and guanine (G) with cytosine (C). Each base is a nitrogen-containing compound, and when a base bonds to its partner, it helps hold the double helix together.

The structure of DNA is the key to its ability to store data.

In using DNA as a storage device, scientists like Nick Goldman have taken advantage of the A-T and G-C base pairings. By using the bases to stand for ones and zeros, scientists have been able to carry binary, the system computers use to encode data, over into molecules of DNA. Goldman and his colleagues wrote software to convert between binary and DNA code. Shakespeare’s sonnets, for example, were first represented in binary on a computer. The software then converted the binary into a series of A’s, T’s, G’s, and C’s, which could be synthesized as a strand of DNA. (This is a diagram that makes it a little easier to visualize.)
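For readers who like to see the idea in code, here is a minimal sketch of that binary-to-bases conversion. It is not Goldman's software: the article doesn't give the team's actual mapping, so the two-bits-per-base table below (and the names `BITS_TO_BASE`, `text_to_dna`, and `dna_to_text`) are assumptions made purely for illustration.

```python
# Hypothetical mapping, chosen for illustration only: each base stands
# for two bits. The scheme used by Goldman's team is not described here.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def text_to_dna(text: str) -> str:
    """Turn text into bytes, bytes into bits, and bit pairs into bases."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_text(dna: str) -> str:
    """Reverse the process: bases back to bits, bits to bytes, bytes to text."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


line = "Shall I compare thee to a summer's day?"
strand = text_to_dna(line)
print(strand[:16])          # first few bases of the encoded line
print(dna_to_text(strand))  # the original line, recovered
```

Under this toy mapping, every character of plain English text takes up four bases. A real system would likely also need some form of error checking, which this sketch leaves out.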


Storing information in DNA isn’t without its problems, however. With current technology, synthesizing DNA and encoding information in it is extremely expensive, and therefore isn’t practical outside of a lab. Because it is so dense, DNA is also heavy, making large quantities somewhat difficult to transport. But that’s not to say that there isn’t hope for this field of science. Given the pace at which technology is progressing, it’s probable that software and tools similar to those available to Nick Goldman will one day be accessible to the public. Though the process isn’t perfect now, Goldman speculates that in as little as 10 years, it may be “economically viable.” The decreasing cost of DNA synthesis and the success of the project thus far are good indicators that scientists like Goldman are on to something. Before we know it, we may be storing much more than our genetic code in a double helix.

Nick Goldman was featured on NPR’s Science Friday last month. You can listen to his interview here.

