A central point of contention in the ongoing big data privacy debate is concern over data being stored forever, and therefore retrievable for unknown future uses.
But given how fast storage media and related reader technologies become obsolete or deteriorate over time—and therefore become unusable—there’s a very good chance that much of the data will eventually be irretrievable. While that may be welcome news for individuals who desire privacy, it’s a challenge for researchers who want to preserve important medical information for future generations.
Medical researchers want to be able to store data for long periods of time, as it could prove invaluable for future cures and treatments—even thousands or millions of years from now. More immediately, healthcare providers would like to see patient data follow the patient—rather than be stored in a multitude of databases, not all of which are digitally connected.
Fortunately, answers have been found, albeit not yet perfected.
There is a data storage medium that will last and will be readable far into the future: DNA. And, storing patient data in living cells on the patient’s body, most likely on the skin, is the ultimate way to keep the information with the patient. Both of these things have already been accomplished.
How the Swiss did it this year
“Just 1 gram of DNA is theoretically capable of holding 455 exabytes – enough for all the data held by Google, Facebook and every other major tech company, with room to spare. It's also incredibly durable: DNA has been extracted and sequenced from 700,000-year-old horse bones. But conditions have to be right for it to last,” writes Jacob Aron in a post in New Scientist.
Aron went on to explain that Robert Grass of the Swiss Federal Institute of Technology in Zurich and his colleagues created DNA strands to store the data, treating the bases A and C as a “0” and G and T as a “1.” To prevent data loss from damage to the DNA, they used a Reed-Solomon code, which adds redundancy for error correction. They also encapsulated the DNA in “microscopic spheres of glass” to prevent corruption of the data or deterioration of the DNA strands from moisture or other environmental factors.
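The bit-to-base mapping described above can be sketched in a few lines. This is an illustrative toy, not the published encoder: the choice between the two candidate bases for each bit is made here with a seeded pseudo-random generator, and the Reed-Solomon redundancy layer the team used is omitted.

```python
import random

ZERO_BASES = "AC"  # either base encodes a binary 0
ONE_BASES = "GT"   # either base encodes a binary 1

def encode_bits(bits, seed=0):
    """Map a bit sequence to a DNA strand; the choice between the two
    candidate bases is pseudo-random (seeded for reproducibility)."""
    rng = random.Random(seed)
    return "".join(rng.choice(ONE_BASES if b else ZERO_BASES) for b in bits)

def decode_strand(strand):
    """Recover the original bit sequence from a strand."""
    return [1 if base in ONE_BASES else 0 for base in strand]

bits = [0, 1, 1, 0, 1, 0, 0, 1]
strand = encode_bits(bits)
assert decode_strand(strand) == bits  # lossless round trip
```

Because two bases stand in for each bit value, decoding is unambiguous regardless of which base was chosen at encoding time—which is also why the real scheme can spend that freedom on avoiding sequences that are hard to synthesize.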
Then they decided to test the longevity of DNA storage handled in this way.
“To test how long this storage system might last, they encoded two venerable documents, totalling 83 kilobytes: the Swiss federal charter from 1291, and the Archimedes Palimpsest, a 10th-century version of ancient Greek texts. DNA versions of these texts were kept at 60, 65 and 70 °C for a week to simulate ageing. They remained readable without any errors (Angewandte Chemie, doi.org/f23gmf),” Aron reported.
“The results suggest that data in DNA form could last 2000 years if kept at a temperature of around 10 °C. The Global Seed Vault in the Arctic could preserve it for over 2 million years at a chilly -18 °C, offering truly long-term storage.”
How Harvard did it in 2012
George Church and Sri Kosuri at Harvard demonstrated DNA storage at a density of 5.5 petabits (roughly 700 terabytes) per gram in 2012. They too treated A and C as “0” and T and G as “1”—the binary code.
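A quick sanity check on the arithmetic: the widely reported figure is 5.5 petabits per gram, not petabytes, which is what makes it consistent with “roughly 700 terabytes.”

```python
# 5.5 petabits converted to terabytes (decimal SI units)
petabits = 5.5
bits = petabits * 1e15        # petabits -> bits
terabytes = bits / 8 / 1e12   # bits -> bytes -> terabytes
print(terabytes)              # 687.5, commonly rounded to "roughly 700 TB"
```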
“To read the data stored in DNA, you simply sequence it — just as if you were sequencing the human genome — and convert each of the TGAC bases back into binary. To aid with sequencing, each strand of DNA has a 19-bit address block at the start — so a whole vat of DNA can be sequenced out of order, and then sorted into usable data using the addresses,” reported Sebastian Anthony in a post in ExtremeTech that year.
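The addressing idea Anthony describes can be sketched as follows. This is a minimal illustration under assumed conventions (a fixed 0→A, 1→G mapping for the address block and base-string payloads); the real scheme also has to contend with sequencing errors.

```python
ADDRESS_BITS = 19  # address-block length in the scheme described above

def bits_to_bases(bits):
    # illustrative fixed mapping for the address block: 0 -> A, 1 -> G
    return "".join("G" if b else "A" for b in bits)

def bases_to_bits(strand):
    return [1 if base in "GT" else 0 for base in strand]

def make_strand(address, payload_bases):
    """Prefix a payload with its 19-bit address, encoded as bases."""
    addr_bits = [int(b) for b in format(address, f"0{ADDRESS_BITS}b")]
    return bits_to_bases(addr_bits) + payload_bases

def reassemble(strands):
    """Sort strands by decoded address, then strip the address blocks."""
    def address_of(strand):
        bits = bases_to_bits(strand[:ADDRESS_BITS])
        return int("".join(map(str, bits)), 2)
    return "".join(s[ADDRESS_BITS:] for s in sorted(strands, key=address_of))

# Strands can be sequenced in any order and still reassembled correctly:
strands = [make_strand(i, p) for i, p in enumerate(["ACGT", "TTAA", "GGCC"])]
assert reassemble(list(reversed(strands))) == "ACGTTTAAGGCC"
```

The point of the address block is exactly this: because order is recovered from the data itself, a vat of DNA never needs to be kept physically sorted.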
I recommend you read that entire post and note the chart and video within to understand their work in more detail. I will, however, highlight one more thing from that post here, as it too is apropos of storing data in organic media:
“It’s also worth noting that it’s possible to store data in the DNA of living cells — though only for a short time. Storing data in your skin would be a fantastic way of transferring data securely.”
As work progresses along this front by many scientists in several countries, you can expect the costs of storing data in DNA and in living cells to become less expensive and eventually commonplace.
For a quick glance at storing data in living cells, consider this excerpt from a published research paper in PNAS by researchers at the University of Washington:
“Our core [rewriteable recombinase addressable data] RAD memory element is capable of passive information storage in the absence of heterologous gene expression for over 100 cell divisions and can be switched repeatedly without performance degradation, as is required to support combinatorial data storage. We also demonstrate how programmed stochasticity in RAD system performance arising from bidirectional recombination can be achieved and tuned by varying the synthesis and degradation rates of recombinase proteins. The serine recombinase functions used here do not require cell-specific cofactors and should be useful in extending computing and control methods to the study and engineering of many biological systems.”
You can expect data storage in living cells to improve over time.
How the European Bioinformatics Institute did it in 2013
Nick Goldman, a molecular biologist at the European Bioinformatics Institute in Hinxton, England, and his colleagues also successfully stored data on DNA and, in addition, developed a new technique that included error-correction software.
“We encoded computer files totaling 739 kilobytes of hard-disk storage and with an estimated Shannon information of 5.2 × 10⁶ bits into a DNA code, synthesized this DNA, sequenced it and reconstructed the original files with 100% accuracy,” wrote Goldman in his research paper published in Nature.
“Theoretical analysis indicates that our DNA-based storage scheme could be scaled far beyond current global information volumes and offers a realistic technology for large-scale, long-term and infrequently accessed digital archiving. In fact, current trends in technological advances are reducing DNA synthesis costs at a pace that should make our scheme cost-effective for sub-50-year archiving within a decade.”
Feel free to read the entire research report for more technical details. For those who would rather see the information in a less technical, more easily digested form, watch Dr. Goldman’s TEDxPrague video.
What data storage on DNA will likely mean to healthcare
In the long term, data stored on DNA means that little to no information will be lost to the ravages of time and the passing of modern-day technologies. This will prove invaluable to future medical research, not only in finding new treatments and cures, but also in finding stopgap treatments when the therapies of that era fail.
For example, consider our current worldwide antibiotic crisis. As antimicrobial resistance continues to rise, physicians are turning to much older antibiotics to treat patients. This approach is working, at least for now, because the older antibiotics have been out of circulation for so long that microbes are no longer prepared to counter them. However, the older antibiotics tend to be more toxic and are therefore best used only as a stopgap until new and better antibiotics are found.
Future generations may encounter a similar situation. Having massive amounts of data available to them—even ancient data—could prove to be the saving grace for their predicaments.
What data storage in living cells will likely mean to healthcare
Shorter-term data storage in living cells could prove to be an effective and cost-efficient means for patients to carry their medical data with them. In this way, physicians and healthcare providers can simply read the data off the patient’s body to get a complete medical history in seconds. This would be particularly helpful in successfully treating unconscious or mentally impaired patients.
When coupled with other technologies, such as nanodiagnostics—which can do numerous things, including examining and reporting on the body’s condition—the correct treatments could be applied rapidly, as there will be no need to delay for testing through conventional means to find the source of the problem. This means that patient outcomes will be vastly improved and medical costs will dramatically fall (at least in theory), since many of the testing costs no longer apply.
Stay tuned for more news on what’s happening—and what to expect down the road—with healthcare data.
The nuviun industry network is intended to contribute to discussion and stimulate debate on important issues in global digital health. The views are solely those of the author.