DNA Data Storage

Xiuhai MAO, Fan LI, Xiaolei ZUO

Citation: Xiuhai MAO, Fan LI, Xiaolei ZUO. DNA Data Storage[J]. Journal of Electronics and Information Technology, 2020, 42(6): 1303-1312. doi: 10.11999/JEIT190852

doi: 10.11999/JEIT190852
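Funding: National Key R&D Program of the Ministry of Science and Technology of China (2018YFA0902600), National Natural Science Foundation of China (21804019, 21804088), Shanghai Pujiang Program (19PJ1407300)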
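    About the authors:

    Xiuhai MAO: male, born 1986, associate researcher; research interest: DNA nanotechnology

    Fan LI: male, born 1983, associate researcher; research interests: molecular medicine and DNA nanotechnology

    Xiaolei ZUO: male, born 1980, researcher; research interests: DNA electrochemical sensors, 3D DNA probes, and early cancer diagnosis

    Corresponding author:

    Xiaolei ZUO zuoxiaolei@sjtu.edu.cn

  • CLC number: TP391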

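  • Abstract: Molecular data storage, with its high stability and storage density, shows great potential. It is expected to address the widening gap between the rapidly growing volume of information and the available storage capacity. As a representative form of molecular data storage, DNA data storage could serve as an alternative, transformative storage medium that breaks through the physical limits of current storage technologies and meets the ever-increasing demand for data storage. This review outlines the history, workflow, and current state of development of DNA data storage, and discusses its existing problems, challenges, and development trends.
  • Figure 1  Overall framework of DNA data storage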

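    Table 1  Comparison of in vitro DNA data storage studies

    | Reference | Data size | Synthesis method | Sequencing method | Physical redundancy (coverage) | Reassembly | Strand length (bases) | Logical density (bit/base) | Logical density (payload only) | Random access |
    |---|---|---|---|---|---|---|---|---|---|
    | [31] | 650 kB | Phosphoramidite (deposition) | Sequencing by synthesis | 3000× | Index sequences | 115 | 0.60 | 0.83 | |
    | [32] | 630 kB | Phosphoramidite (deposition) | Sequencing by synthesis | 51× | Overlapping sequences | 117 | 0.19 | 0.29 | |
    | [17] | 80 kB | Phosphoramidite (electrochemical) | Sequencing by synthesis | 372× | Index sequences | 158 | 0.86 | 1.16 | |
    | [37,45] | 3 kB | Phosphoramidite (deposition) | Nanopore sequencing | 200× | Index sequences | 880~1000 | 1.71 | 1.74 | |
    | [38] | 2 MB | Phosphoramidite (deposition) | Sequencing by synthesis | 10.5× | Seed sequences | 152 | 1.18 | 1.55 | |
    | [46] | 22 MB | Phosphoramidite (deposition) | Sequencing by synthesis | 160× | Index sequences | 230 | 0.89 | 1.08 | |
    | [36] | 150 kB | Phosphoramidite (electrochemical) | Sequencing by synthesis | 40× | Index sequences | 117 | 0.57 | 0.85 | |
    | [12] | 200 MB | Phosphoramidite (deposition) | Sequencing by synthesis | | Index sequences | 150~200 | 0.81 | 1.10 | |
    | [43] | 8.5 MB | Phosphoramidite (deposition) | Sequencing by synthesis | 164× | Index sequences | 194 | 1.94 | 2.64 | |
    | [44] | 854 kB | Phosphoramidite (column) | Sequencing by synthesis | 250× | Index sequences | 85 | 1.78 | 3.37 | |
    | [12] | 33 kB | Phosphoramidite (deposition) | Nanopore sequencing | 36× | Index sequences | 150 | 0.81 | 1.10 | |
    | [47] | 18 B | Enzymatic (column-based) | Nanopore sequencing | 175× | None (single strand) | 150~200 | 1.57 | 1.57 | |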
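    Table 1 reports logical density in bit/base. As a rough illustration of what that unit measures, the sketch below maps two bits onto each of the four bases and divides the number of stored data bits by the number of bases used. The mapping, the function names, and the 4-base index in the usage lines are assumptions made for this example; the studies in Table 1 use their own coding schemes, and the page does not state how the two density columns were computed.

```python
# Hypothetical 2-bit-per-base mapping (00->A, 01->C, 10->G, 11->T).
# This is an illustrative assumption, not the scheme of any cited study.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Map each byte to four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Invert encode(); the strand length must be a multiple of four."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def logical_density(data_bits: int, total_bases: int) -> float:
    """Stored data bits divided by the number of bases that carry them."""
    return data_bits / total_bases


payload = encode(b"DNA")                       # 3 bytes -> 12 bases
payload_only = logical_density(24, len(payload))
# Assume a 4-base index is prepended to the strand (hypothetical overhead).
with_index = logical_density(24, len(payload) + 4)
```

    With a plain four-letter mapping like this one, the density cannot exceed 2 bit/base, and any index or other overhead bases lower it further. The rows of Table 1 that report payload densities above 2 cite work on composite or degenerate bases [43, 44], which this sketch does not model.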
  • [1] GANTZ J and REINSEL D. The digital universe in 2020: Big data, bigger digital shadows, and biggest growth in the far East[R]. IDC iView, 2012: 1–16.
    [2] EXTANCE A. How DNA could store all the world’s data[J]. Nature, 2016, 537(7618): 22–24. doi:  10.1038/537022a
    [3] ZHIRNOV V, ZADEGAN R M, SANDHU G S, et al. Nucleic acid memory[J]. Nature Materials, 2016, 15(4): 366–370. doi:  10.1038/nmat4594
    [4] COLQUHOUN H and LUTZ J F. Information-containing macromolecules[J]. Nature Chemistry, 2014, 6(6): 455–456. doi:  10.1038/nchem.1958
    [5] WANG Junke, YIN Jue, NIU Renjie, et al. DNA computing and DNA nanotechnology[J]. Journal of Electronics & Information Technology, 2020, 42(6): 1313–1325. doi: 10.11999/JEIT190826.
    [6] XU Jin, QIANG Xiaoli, ZHANG Kai, et al. A DNA computing model for the graph vertex coloring problem based on a probe graph[J]. Engineering, 2018, 4(1): 61–77. doi: 10.1016/j.eng.2018.02.011
    [7] LAN Wenfei, XING Zhibao, HUANG Jun, et al. The DNA self-assembly computing model for solving perfect matching problem of bipartite graph[J]. Journal of Computer Research and Development, 2016, 53(11): 2583–2593. doi: 10.7544/issn1000-1239.2016.20150312
    [8] ZHU Weijun, ZHOU Qinglei, and ZHANG Qinxian. A LTL model checking approach based on DNA computing[J]. Chinese Journal of Computers, 2016, 39(12): 2578–2597. doi: 10.11897/SP.J.1016.2016.02578
    [9] XIA Hong and ZHANG Shijun. Constructing the logical model based on molecular computing[J]. Bulletin of Science and Technology, 2016, 32(5): 11–15. doi: 10.3969/j.issn.1001-7119.2016.05.003
    [10] ZHOU Xu, ZHOU Yantao, OUYANG Aijia, et al. An efficient tile assembly model for maximum clique problem[J]. Journal of Computer Research and Development, 2014, 51(6): 1253–1262. doi: 10.7544/issn1000-1239.2014.20120904
    [11] ZHOU Xu, ZHOU Yantao, LI Kenli, et al. Efficient maximum matching problem algorithms in the tile assembly model[J]. Acta Electronica Sinica, 2015, 43(2): 262–268. doi: 10.3969/j.issn.0372-2112.2015.02.009
    [12] ORGANICK L, ANG S D, CHEN Y J, et al. Random access in large-scale DNA data storage[J]. Nature Biotechnology, 2018, 36(3): 242–248. doi:  10.1038/nbt.4079
    [13] RUTTEN M G T A, VAANDRAGER F W, ELEMANS J A A W, et al. Encoding information into polymers[J]. Nature Reviews Chemistry, 2018, 2(11): 365–381. doi:  10.1038/s41570-018-0051-5
    [14] DNA to the rescue for data storage[J]. Chemical & Engineering News, 2015, 93(35): 40–41.
    [15] CHEN Weigang, HUANG Gang, LI Bingzhi, et al. DNA information storage for audio and video files[J]. Scientia Sinica Vitae, 2020, 50(1): 81–85. doi: 10.1360/SSV-2019-0211
    [16] GREENGARD S. Cracking the code on DNA storage[J]. Communications of the ACM, 2017, 60(7): 16–18. doi:  10.1145/3088493
    [17] GRASS R N, HECKEL R, PUDDU M, et al. Robust chemical preservation of digital information on DNA in silica with error-correcting codes[J]. Angewandte Chemie International Edition, 2015, 54(8): 2552–2555. doi:  10.1002/anie.201411378
    [18] LUNT B M. How long is long-term data storage?[C]. Archiving Conference, Society for Imaging Science and Technology, 2011: 29–33.
    [19] SHRIVASTAVA S and BADLANI R. Data storage in DNA[J]. International Journal of Electrical Energy, 2014, 2(2): 119–124.
    [20] GREENBERG A, HAMILTON J, MALTZ D A, et al. The cost of a cloud: Research problems in data center networks[J]. ACM SIGCOMM Computer Communication Review, 2008, 39(1): 68–73. doi:  10.1145/1496091.1496103
    [21] SHETH R U and WANG H H. DNA-based memory devices for recording cellular events[J]. Nature Reviews Genetics, 2018, 19(11): 718–732. doi:  10.1038/s41576-018-0052-8
    [22] WIENER N. Interview: Machines smarter than men[J]. US News World Report, 1964, 56: 84–86.
    [23] NEIMAN M S. On the molecular memory systems and the directed mutations[J]. Radiotekhnika, 1965, 6: 1–8.
    [24] DAVIS J. Microvenus[J]. Art Journal, 1996, 55(1): 70–74. doi:  10.1080/00043249.1996.10791743
    [25] CLELLAND C T, RISCA V, and BANCROFT C. Hiding messages in DNA microdots[J]. Nature, 1999, 399(6736): 533–534. doi:  10.1038/21092
    [26] BANCROFT C, BOWLER T, BLOOM B, et al. Long-term storage of information in DNA[J]. Science, 2001, 293(5536): 1763–1765.
    [27] AILENBERG M and ROTSTEIN O D. An improved huffman coding method for archiving text, images, and music characters in DNA[J]. BioTechniques, 2009, 47(3): 747–754. doi:  10.2144/000113218
    [28] WONG P C, WONG K K, and FOOTE H. Organic data memory using the DNA approach[J]. Communications of the ACM, 2003, 46(1): 95–98. doi:  10.1145/602421.602426
    [29] ARITA M and OHASHI Y. Secret signatures inside genomic DNA[J]. Biotechnology Progress, 2004, 20(5): 1605–1607. doi:  10.1021/bp049917i
    [30] YACHIE N, SEKIYAMA K, SUGAHARA J, et al. Alignment-based approach for durable data storage into living organisms[J]. Biotechnology Progress, 2007, 23(2): 501–505. doi:  10.1021/bp060261y
    [31] CHURCH G M, GAO Yuan, and KOSURI S. Next-generation digital information storage in DNA[J]. Science, 2012, 337(6102): 1628. doi:  10.1126/science.1226355
    [32] GOLDMAN N, BERTONE P, CHEN Siyuan, et al. Towards practical, high-capacity, low-maintenance information storage in synthesized DNA[J]. Nature, 2013, 494(7435): 77–80. doi:  10.1038/nature11875
    [33] GIBSON D G, GLASS J I, LARTIGUE C, et al. Creation of a bacterial cell controlled by a chemically synthesized genome[J]. Science, 2010, 329(5987): 52–56. doi:  10.1126/science.1190719
    [34] HECKEL R, SHOMORONY I, RAMCHANDRAN K, et al. Fundamental limits of DNA storage systems[C]. 2017 IEEE International Symposium on Information Theory, Aachen, Germany, 2017: 3130–3134.
    [35] KOSURI S and CHURCH G M. Large-scale de novo DNA synthesis: Technologies and applications[J]. Nature Methods, 2014, 11(5): 499–507. doi:  10.1038/nmeth.2918
    [36] BORNHOLT J, LOPEZ R, CARMEAN D M, et al. A DNA-based archival storage system[J]. ACM SIGPLAN Notices, 2016, 50(4): 637–649.
    [37] YAZDI S M H T, YUAN Yongbo, MA Jian, et al. A rewritable, random-access DNA-based storage system[J]. Scientific Reports, 2015, 5: 14138. doi:  10.1038/srep14138
    [38] ERLICH Y and ZIELINSKI D. DNA fountain enables a robust and efficient storage architecture[J]. Science, 2017, 355(6328): 950–954. doi:  10.1126/science.aaj2038
    [39] TAN Li, SUN Jifeng, and GUO Lihua. DNA sequence data compression method based on memetic algorithm[J]. Journal of Electronics & Information Technology, 2014, 36(1): 121–127.
    [40] SHANNON C E. A mathematical theory of communication[J]. The Bell System Technical Journal, 1948, 27(3): 379–423. doi:  10.1002/j.1538-7305.1948.tb01338.x
    [41] HECKEL R, MIKUTIS G, and GRASS R N. A characterization of the DNA data storage channel[J]. Scientific Reports, 2019, 9(1): 9663. doi:  10.1038/s41598-019-45832-6
    [42] REED I S and SOLOMON G. Polynomial codes over certain finite fields[J]. Journal of the Society for Industrial and Applied Mathematics, 1960, 8(2): 300–304. doi:  10.1137/0108018
    [43] ANAVY L, VAKNIN I, ATAR O, et al. Improved DNA based storage capacity and fidelity using composite DNA letters[J]. bioRxiv, 2018. doi:  10.1101/433524
    [44] CHOI Y, RYU T, LEE A C, et al. Addition of degenerate bases to DNA-based data storage for increased information capacity[J]. bioRxiv, 2018. doi:  10.1101/367052
    [45] YAZDI S M H T, GABRYS R, and MILENKOVIC O. Portable and error-free DNA-based data storage[J]. Scientific Reports, 2017, 7: 5011. doi:  10.1038/s41598-017-05188-1
    [46] BLAWAT M, GAEDKE K, HÜTTER I, et al. Forward error correction for DNA data storage[J]. Procedia Computer Science, 2016, 80: 1011–1022. doi:  10.1016/j.procs.2016.05.398
    [47] LEE H H, KALHOR R, GOELA N, et al. Enzymatic DNA synthesis for digital information storage[J]. bioRxiv, 2018. doi:  10.1101/348987
    [48] BAUM E. Building an associative memory vastly larger than the brain[J]. Science, 1995, 268(5210): 583–585. doi:  10.1126/science.7725109
    [49] CARUTHERS M H. The chemical synthesis of DNA/RNA: Our gift to science[J]. Journal of Biological Chemistry, 2013, 288(2): 1420–1427. doi:  10.1074/jbc.X112.442855
    [50] GOODWIN S, MCPHERSON J D, and MCCOMBIE W R. Coming of age: Ten years of next-generation sequencing technologies[J]. Nature Reviews Genetics, 2016, 17(6): 333–351. doi:  10.1038/nrg.2016.49
    [51] SHENDURE J, BALASUBRAMANIAN S, CHURCH G M, et al. DNA sequencing at 40: Past, present and future[J]. Nature, 2017, 550(7676): 345–353. doi:  10.1038/nature24286
    [52] DEAMER D, AKESON M, and BRANTON D. Three decades of nanopore sequencing[J]. Nature Biotechnology, 2016, 34(5): 518–524. doi:  10.1038/nbt.3423
    [53] FONTANA JR R E and DECAD G M. Moore’s law realities for recording systems and memory storage components: HDD, tape, NAND, and optical[J]. AIP Advances, 2018, 8(5): 056506. doi:  10.1063/1.5007621
    [54] BONNET J, COLOTTE M, COUDY D, et al. Chain and conformation stability of solid-state DNA: Implications for room temperature storage[J]. Nucleic Acids Research, 2010, 38(5): 1531–1546. doi:  10.1093/nar/gkp1060
    [55] PRAKADAN S M, SHALEK A K, and WEITZ D A. Scaling by shrinking: Empowering single-cell 'omics' with microfluidic devices[J]. Nature Reviews Genetics, 2017, 18(6): 345–361. doi:  10.1038/nrg.2017.15
    [56] NEWMAN S, STEPHENSON A P, WILLSEY M, et al. High density DNA data storage library via dehydration with digital microfluidic retrieval[J]. Nature Communications, 2019, 10(1): 1706. doi:  10.1038/s41467-019-09517-y
    Publication history
    • Received: 2019-11-01
    • Revised: 2020-05-18
    • Published online: 2020-05-21
    • Issue date: 2020-06-22
