
DNA Data Storage

Xiuhai Mao, Fan Li, Xiaolei Zuo

Citation:  Xiuhai MAO, Fan LI, Xiaolei ZUO. DNA Data Storage[J]. Journal of Electronics and Information Technology, doi: 10.11999/JEIT190852


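    About the authors: Xiuhai Mao: male, born 1986, Associate Research Fellow; research interest: DNA nanotechnology;
    Fan Li: male, born 1983, Associate Research Fellow; research interests: molecular medicine and DNA nanotechnology;
    Xiaolei Zuo: male, born 1980, Research Fellow; research interests: DNA electrochemical sensors, 3D DNA probes, and early cancer diagnosis
    Corresponding author: Xiaolei Zuo
  • Funding: National Key R&D Program of China, Ministry of Science and Technology of China (2018YFA0902600); National Natural Science Foundation of China (21804019, 21804088); Shanghai Pujiang Talent Program (19PJ1407300)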

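Abstract: Molecular data storage, which offers high stability and high storage density, shows great potential. It is expected to help close the widening gap between today's rapidly growing volume of information and the available storage capacity. As a typical form of molecular data storage, DNA data storage could serve as an alternative, transformative storage medium that breaks through the physical limits of current storage methods and meets the ever-increasing demand for data storage. This review outlines the history, workflow, and current state of development of DNA data storage, and discusses its existing problems, challenges, and development trends.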


    1. [1]

      GANTZ J and REINSEL D. The digital universe in 2020: Big data, bigger digital shadows, and biggest growth in the far East[R]. IDC iView, 2012: 1–16.

    2. [2]

      EXTANCE A. How DNA could store all the world’s data[J]. Nature, 2016, 537(7618): 22–24. doi: 10.1038/537022a

    3. [3]

      ZHIRNOV V, ZADEGAN R M, SANDHU G S, et al. Nucleic acid memory[J]. Nature Materials, 2016, 15(4): 366–370. doi: 10.1038/nmat4594

    4. [4]

      COLQUHOUN H and LUTZ J F. Information-containing macromolecules[J]. Nature Chemistry, 2014, 6(6): 455–456. doi: 10.1038/nchem.1958

    5. [5]

      WANG Junke, YIN Jue, NIU Renjie, et al. DNA computing and DNA nanotechnology[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT190826.

    6. [6]

      XU Jin, QIANG Xiaoli, ZHANG Kai, et al. A DNA computing model for the graph vertex coloring problem based on a probe graph[J]. Engineering, 2018, 4(1): 61–77. doi: 10.1016/j.eng.2018.02.011

    7. [7]

      LAN Wenfei, XING Zhibao, HUANG Jun, et al. The DNA self-assembly computing model for solving perfect matching problem of bipartite graph[J]. Journal of Computer Research and Development, 2016, 53(11): 2583–2593. doi: 10.7544/issn1000-1239.2016.20150312

    8. [8]

      ZHU Weijun, ZHOU Qinglei, and ZHANG Qinxian. A LTL model checking approach based on DNA computing[J]. Chinese Journal of Computers, 2016, 39(12): 2578–2597. doi: 10.11897/SP.J.1016.2016.02578

    9. [9]

      XIA Hong and ZHANG Shijun. Constructing the logical model based on molecular computing[J]. Bulletin of Science and Technology, 2016, 32(5): 11–15. doi: 10.3969/j.issn.1001-7119.2016.05.003

    10. [10]

      ZHOU Xu, ZHOU Yantao, OUYANG Aijia, et al. An efficient tile assembly model for maximum clique problem[J]. Journal of Computer Research and Development, 2014, 51(6): 1253–1262. doi: 10.7544/issn1000-1239.2014.20120904

    11. [11]

      ZHOU Xu, ZHOU Yantao, LI Kenli, et al. Efficient maximum matching problem algorithms in the tile assembly model[J]. Acta Electronica Sinica, 2015, 43(2): 262–268. doi: 10.3969/j.issn.0372-2112.2015.02.009

    12. [12]

      ORGANICK L, ANG S D, CHEN Y J, et al. Random access in large-scale DNA data storage[J]. Nature Biotechnology, 2018, 36(3): 242–248. doi: 10.1038/nbt.4079

    13. [13]

      RUTTEN M G T A, VAANDRAGER F W, ELEMANS J A A W, et al. Encoding information into polymers[J]. Nature Reviews Chemistry, 2018, 2(11): 365–381. doi: 10.1038/s41570-018-0051-5

    14. [14]

      DNA to the rescue for data storage[J]. Chemical & Engineering News, 2015, 93(35): 40–41.

    15. [15]

      CHEN Weigang, HUANG Gang, LI Bingzhi, et al. DNA information storage for audio and video files[J]. Scientia Sinica Vitae, 2020, 50(1): 81–85. doi: 10.1360/SSV-2019-0211

    16. [16]

      GREENGARD S. Cracking the code on DNA storage[J]. Communications of the ACM, 2017, 60(7): 16–18. doi: 10.1145/3088493

    17. [17]

      GRASS R N, HECKEL R, PUDDU M, et al. Robust chemical preservation of digital information on DNA in silica with error-correcting codes[J]. Angewandte Chemie International Edition, 2015, 54(8): 2552–2555. doi: 10.1002/anie.201411378

    18. [18]

      LUNT B M. How long is long-term data storage?[C]. Proceedings of Archiving Conference, Society for Imaging Science and Technology, 2011: 29–33.

    19. [19]

      SHRIVASTAVA S and BADLANI R. Data storage in DNA[J]. International Journal of Electrical Energy, 2014, 2(2): 119–124.

    20. [20]

      GREENBERG A, HAMILTON J, MALTZ D A, et al. The cost of a cloud: Research problems in data center networks[J]. ACM SIGCOMM Computer Communication Review, 2008, 39(1): 68–73. doi: 10.1145/1496091.1496103

    21. [21]

      SHETH R U and WANG H H. DNA-based memory devices for recording cellular events[J]. Nature Reviews Genetics, 2018, 19(11): 718–732. doi: 10.1038/s41576-018-0052-8

    22. [22]

      WIENER N. Interview: Machines smarter than men[J]. US News World Report, 1964, 56: 84–86.

    23. [23]

      NEIMAN M S. On the molecular memory systems and the directed mutations[J]. Radiotekhnika, 1965, 6: 1–8.

    24. [24]

      DAVIS J. Microvenus[J]. Art Journal, 1996, 55(1): 70–74. doi: 10.1080/00043249.1996.10791743

    25. [25]

      CLELLAND C T, RISCA V, and BANCROFT C. Hiding messages in DNA microdots[J]. Nature, 1999, 399(6736): 533–534. doi: 10.1038/21092

    26. [26]

      BANCROFT C, BOWLER T, BLOOM B, et al. Long-term storage of information in DNA[J]. Science, 2001, 293(5536): 1763–1765.

    27. [27]

      AILENBERG M and ROTSTEIN O D. An improved huffman coding method for archiving text, images, and music characters in DNA[J]. BioTechniques, 2009, 47(3): 747–754. doi: 10.2144/000113218

    28. [28]

      WONG P C, WONG K K, and FOOTE H. Organic data memory using the DNA approach[J]. Communications of the ACM, 2003, 46(1): 95–98. doi: 10.1145/602421.602426

    29. [29]

      ARITA M and OHASHI Y. Secret signatures inside genomic DNA[J]. Biotechnology Progress, 2004, 20(5): 1605–1607. doi: 10.1021/bp049917i

    30. [30]

      YACHIE N, SEKIYAMA K, SUGAHARA J, et al. Alignment-based approach for durable data storage into living organisms[J]. Biotechnology Progress, 2007, 23(2): 501–505. doi: 10.1021/bp060261y

    31. [31]

      CHURCH G M, GAO Yuan, and KOSURI S. Next-generation digital information storage in DNA[J]. Science, 2012, 337(6102): 1628. doi: 10.1126/science.1226355

    32. [32]

      GOLDMAN N, BERTONE P, CHEN Siyuan, et al. Towards practical, high-capacity, low-maintenance information storage in synthesized DNA[J]. Nature, 2013, 494(7435): 77–80. doi: 10.1038/nature11875

    33. [33]

      GIBSON D G, GLASS J I, LARTIGUE C, et al. Creation of a bacterial cell controlled by a chemically synthesized genome[J]. Science, 2010, 329(5987): 52–56. doi: 10.1126/science.1190719

    34. [34]

      HECKEL R, SHOMORONY I, RAMCHANDRAN K, et al. Fundamental limits of DNA storage systems[C]. Proceedings of 2017 IEEE International Symposium on Information Theory, Aachen, Germany, 2017: 3130–3134.

    35. [35]

      KOSURI S and CHURCH G M. Large-scale de novo DNA synthesis: Technologies and applications[J]. Nature Methods, 2014, 11(5): 499–507. doi: 10.1038/nmeth.2918

    36. [36]

      BORNHOLT J, LOPEZ R, CARMEAN D M, et al. A DNA-based archival storage system[J]. ACM SIGPLAN Notices, 2016, 50(4): 637–649.

    37. [37]

      YAZDI S M H T, YUAN Yongbo, MA Jian, et al. A rewritable, random-access DNA-based storage system[J]. Scientific Reports, 2015, 5: 14138. doi: 10.1038/srep14138

    38. [38]

      ERLICH Y and ZIELINSKI D. DNA fountain enables a robust and efficient storage architecture[J]. Science, 2017, 355(6328): 950–954. doi: 10.1126/science.aaj2038

    39. [39]

      TAN Li, SUN Jifeng, and GUO Lihua. DNA sequence data compression method based on memetic algorithm[J]. Journal of Electronics & Information Technology, 2014, 36(1): 121–127.

    40. [40]

      SHANNON C E. A mathematical theory of communication[J]. The Bell System Technical Journal, 1948, 27(3): 379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x

    41. [41]

      HECKEL R, MIKUTIS G, and GRASS R N. A characterization of the DNA data storage channel[J]. Scientific Reports, 2019, 9(1): 9663. doi: 10.1038/s41598-019-45832-6

    42. [42]

      REED I S and SOLOMON G. Polynomial codes over certain finite fields[J]. Journal of the Society for Industrial and Applied Mathematics, 1960, 8(2): 300–304. doi: 10.1137/0108018

    43. [43]

      YAZDI S M H T, GABRYS R, and MILENKOVIC O. Portable and error-free DNA-based data storage[J]. Scientific Reports, 2017, 7: 5011. doi: 10.1038/s41598-017-05188-1

    44. [44]

      BLAWAT M, GAEDKE K, HÜTTER I, et al. Forward error correction for DNA data storage[J]. Procedia Computer Science, 2016, 80: 1011–1022. doi: 10.1016/j.procs.2016.05.398

    45. [45]

      ANAVY L, VAKNIN I, ATAR O, et al. Improved DNA based storage capacity and fidelity using composite DNA letters[J]. bioRxiv, 2018. doi: 10.1101/433524

    46. [46]

      CHOI Y, RYU T, LEE A C, et al. Addition of degenerate bases to DNA-based data storage for increased information capacity[J]. bioRxiv, 2018. doi: 10.1101/367052

    47. [47]

      LEE H H, KALHOR R, GOELA N, et al. Enzymatic DNA synthesis for digital information storage[J]. bioRxiv, 2018. doi: 10.1101/348987

    48. [48]

      BAUM E. Building an associative memory vastly larger than the brain[J]. Science, 1995, 268(5210): 583–585. doi: 10.1126/science.7725109

    49. [49]

      CARUTHERS M H. The chemical synthesis of DNA/RNA: Our gift to science[J]. Journal of Biological Chemistry, 2013, 288(2): 1420–1427. doi: 10.1074/jbc.X112.442855

    50. [50]

      GOODWIN S, MCPHERSON J D, and MCCOMBIE W R. Coming of age: Ten years of next-generation sequencing technologies[J]. Nature Reviews Genetics, 2016, 17(6): 333–351. doi: 10.1038/nrg.2016.49

    51. [51]

      SHENDURE J, BALASUBRAMANIAN S, CHURCH G M, et al. DNA sequencing at 40: Past, present and future[J]. Nature, 2017, 550(7676): 345–353. doi: 10.1038/nature24286

    52. [52]

      DEAMER D, AKESON M, and BRANTON D. Three decades of nanopore sequencing[J]. Nature Biotechnology, 2016, 34(5): 518–524. doi: 10.1038/nbt.3423

    53. [53]

      FONTANA JR R E and DECAD G M. Moore’s law realities for recording systems and memory storage components: HDD, tape, NAND, and optical[J]. AIP Advances, 2018, 8(5): 056506. doi: 10.1063/1.5007621

    54. [54]

      BONNET J, COLOTTE M, COUDY D, et al. Chain and conformation stability of solid-state DNA: Implications for room temperature storage[J]. Nucleic Acids Research, 2010, 38(5): 1531–1546. doi: 10.1093/nar/gkp1060

    55. [55]

      PRAKADAN S M, SHALEK A K, and WEITZ D A. Scaling by shrinking: Empowering single-cell 'omics' with microfluidic devices[J]. Nature Reviews Genetics, 2017, 18(6): 345–361. doi: 10.1038/nrg.2017.15

    56. [56]

      NEWMAN S, STEPHENSON A P, WILLSEY M, et al. High density DNA data storage library via dehydration with digital microfluidic retrieval[J]. Nature Communications, 2019, 10(1): 1706. doi: 10.1038/s41467-019-09517-y

  • Figure 1  Overall framework of DNA data storage

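    Table 1  Comparison of in vitro DNA data storage studies

    | Reference | Data size | Synthesis method | Sequencing method | Physical redundancy (coverage) | Reassembly | Strand length (nt) | Logical density (bit/nt) | Logical density (payload only) | Random access |
    |---|---|---|---|---|---|---|---|---|---|
    | Ref. [31] | 650 kB | Phosphoramidite (deposition) | Sequencing by synthesis | 3000× | Index sequences | 115 | 0.60 | 0.83 | |
    | Ref. [32] | 630 kB | Phosphoramidite (deposition) | Sequencing by synthesis | 51× | Overlapping sequences | 117 | 0.19 | 0.29 | |
    | Ref. [17] | 80 kB | Phosphoramidite (electrochemical) | Sequencing by synthesis | 372× | Index sequences | 158 | 0.86 | 1.16 | |
    | Refs. [37,43] | 3 kB | Phosphoramidite (deposition) | Nanopore sequencing | 200× | Index sequences | 880~1000 | 1.71 | 1.74 | |
    | Ref. [38] | 2 MB | Phosphoramidite (deposition) | Sequencing by synthesis | 10.5× | Seed sequences | 152 | 1.18 | 1.55 | |
    | Ref. [44] | 22 MB | Phosphoramidite (deposition) | Sequencing by synthesis | 160× | Index sequences | 230 | 0.89 | 1.08 | |
    | Ref. [36] | 150 kB | Phosphoramidite (electrochemical) | Sequencing by synthesis | 40× | Index sequences | 117 | 0.57 | 0.85 | |
    | Ref. [12] | 200 MB | Phosphoramidite (deposition) | Sequencing by synthesis | | Index sequences | 150~200 | 0.81 | 1.10 | |
    | Ref. [45] | 8.5 MB | Phosphoramidite (deposition) | Sequencing by synthesis | 164× | Index sequences | 194 | 1.94 | 2.64 | |
    | Ref. [46] | 854 kB | Phosphoramidite (column) | Sequencing by synthesis | 250× | Index sequences | 85 | 1.78 | 3.37 | |
    | Ref. [12] | 33 kB | Phosphoramidite (deposition) | Nanopore sequencing | 36× | Index sequences | 150 | 0.81 | 1.10 | |
    | Ref. [47] | 18 B | Enzymatic (column-based) | Nanopore sequencing | 175× | None (single strand) | 150~200 | 1.57 | 1.57 | |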
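
The two logical-density columns of Table 1 can be read as stored bits divided by nucleotides, counted either over the whole strand or over the payload alone. The following is a minimal Python sketch of that arithmetic. It assumes a naive fixed mapping of two bits per base (00→A, 01→C, 10→G, 11→T) and a hypothetical count of overhead bases for index or primer regions. Neither assumption is the scheme of any study listed in the table, and all function names are illustrative.

```python
# Illustrative only: a naive 2-bit-per-base mapping, not the encoding of any cited study.
BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def bytes_to_bases(data: bytes) -> str:
    """Map each byte to four bases, two bits per base, most significant bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def bases_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_bases; the sequence length must be a multiple of 4."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def logical_density(payload_bits: int, payload_nt: int, overhead_nt: int = 0) -> float:
    """Bits per nucleotide; overhead_nt is a hypothetical count of index/primer bases."""
    return payload_bits / (payload_nt + overhead_nt)


message = b"DNA"
strand = bytes_to_bases(message)
payload_only = logical_density(8 * len(message), len(strand))
whole_strand = logical_density(8 * len(message), len(strand), overhead_nt=4)
```

Under this mapping the payload-only density is 2 bit/nt, and any index or primer bases lower the whole-strand figure. This is the same direction of difference seen between the two density columns in the table.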
Article information
  • Corresponding author:  Xiaolei Zuo
  • Received:  2019-11-01
  • Accepted:  2020-05-18
  • Published online:  2020-05-21