The Sequences That Make Up Our World


“..151630-54902-05099-56397…” One evening in early May of 1942, a message like this landed on the desk of Captain J.J. Rochefort. The message was written in code and had to be painstakingly deciphered character by character. Rochefort and his team of cryptographers had been deciphering coded messages for years, but this one seemed special. It hinted at an attack on a mysterious location referred to as “AF”. After a few more weeks of work, Rochefort realized that “AF” referred to Midway, an island in the Pacific that was home to an important U.S. naval base. Less than a month later, the legendary Battle of Midway between Japanese and U.S. forces was fought around that very base, shaking the world.

The U.S. victory at Midway wouldn’t have been possible without Rochefort and his team. The crucial information was encoded in JN-25, a Japanese naval cipher used in the 1940s. All information needs to be encoded in some form in order to be stored and transferred, whether in cellphones, inside your computer, or inside cells. Today, new innovations are helping us understand how information is encoded and stored.

From cassettes to modern solid-state drives, the encoding and storage of digital data have evolved over the past century. The first form of digital data storage was the punch card, which encoded data through the presence or absence of holes in the card. In 1956, the hard disk drive was introduced. At first, it weighed over a ton and could hold only 3.75 MB of data, but as the technology developed, drives became smaller and able to store much more. These days, the hard disk drive is the most popular way to store digital data, and a single drive can hold up to 12 terabytes. Approximately 2.7 zettabytes of data (one zettabyte is one billion terabytes) exist in the digital world today, with 90 percent of that stored in the past five years. We generate around 2.5 quintillion bytes of data every day, and will create over 160 zettabytes of data by 2025. Where will we store all this data in the future?

Perhaps we need to learn from biology. In 1944, Avery, MacLeod, and McCarty at Rockefeller University discovered that the molecule that encodes and stores genetic information is DNA. This paved the way for Francis Crick and James Watson to discover the structure of DNA: the double helix. The information in DNA is encoded in sequences of four bases: adenine, thymine, guanine, and cytosine. A gene is a sequence of DNA that encodes the instructions for manufacturing a specific protein. Every living organism on the planet has information stored inside DNA, which shapes nearly everything about that organism. Today, we have not only deciphered the sequence that makes us who we are; there are also plans to revive species that have gone extinct. Thousands of years ago, magnificent creatures called woolly mammoths roamed the Earth. Today, the geneticist George Church plans to revive them.

DNA not only encodes the blueprint for life, it may also hold the solution to our growing need for data storage. Scientists at Harvard University have already encoded digital information in DNA. A single gram of DNA can hold up to 2.2 petabytes (around 2.2 thousand terabytes) of digital data. At that density, the roughly 2.7 zettabytes of digital data in the world today would fit in a little over a thousand kilograms of DNA. Compare this to the 8,000 big data centers that currently exist. Next time you click save on your computer, just think about it.
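
To make the idea of writing digital data in DNA a little more concrete, here is a minimal sketch in Python. It assumes the simplest possible mapping, two bits per base (00 = A, 01 = C, 10 = G, 11 = T). That mapping and the function names are illustrative assumptions, not the scheme the Harvard researchers used; real DNA storage methods add error correction and avoid sequences that are hard to synthesize or read.

```python
# Hypothetical two-bits-per-base mapping, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A, C, G and T (four bases per byte)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(sequence: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"Midway"
strand = encode(message)
print(strand)          # the message written as a DNA sequence
print(decode(strand))  # the original bytes, recovered from the sequence
```

Under this toy mapping, every byte becomes four bases, so a six-letter word like "Midway" becomes a strand of 24 bases.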