DNA Data Storage

Xiuhai MAO, Fan LI, Xiaolei ZUO

Citation: Xiuhai MAO, Fan LI, Xiaolei ZUO. DNA Data Storage[J]. Journal of Electronics & Information Technology, 2020, 42(6): 1303-1312. doi: 10.11999/JEIT190852

doi: 10.11999/JEIT190852
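Funds: National Key R&D Program of the Ministry of Science and Technology of China (2018YFA0902600), National Natural Science Foundation of China (21804019, 21804088), Shanghai Pujiang Program (19PJ1407300)
Details
    About the authors:

    Xiuhai MAO: male, born in 1986, associate research fellow; research interest: DNA nanotechnology

    Fan LI: male, born in 1983, associate research fellow; research interests: molecular medicine and DNA nanotechnology

    Xiaolei ZUO: male, born in 1980, research fellow; research interests: DNA electrochemical sensors, 3D DNA probes, and early cancer diagnosis

    Corresponding author:

    Xiaolei ZUO zuoxiaolei@sjtu.edu.cn

  • CLC number: TP391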

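  • Abstract: Molecular data storage, with its high stability and high storage density, shows great potential. It is expected to close the widening gap between today's rapidly growing volume of information and the available storage capacity. As a typical form of molecular data storage, DNA data storage could serve as an alternative, transformative storage medium that breaks through the physical limits of current storage technologies and meets ever-increasing data storage demands. This review outlines the history, workflow, and current state of development of DNA data storage, and discusses its present problems, challenges, and development trends.
  • Figure 1  Overall framework of DNA data storage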

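    Table 1  Comparison of in vitro DNA data storage studies

    Reference | Data size | Synthesis method | Sequencing method | Physical redundancy (coverage) | Reassembly | Strand length (bases) | Logical density (bit/base) | Logical density (payload) | Random access
    [31] | 650 kB | Phosphoramidite (deposition) | Sequencing by synthesis | 3000× | Index sequences | 115 | 0.60 | 0.83 |
    [32] | 630 kB | Phosphoramidite (deposition) | Sequencing by synthesis | 51× | Overlapping sequences | 117 | 0.19 | 0.29 |
    [17] | 80 kB | Phosphoramidite (electrochemical) | Sequencing by synthesis | 372× | Index sequences | 158 | 0.86 | 1.16 |
    [37,45] | 3 kB | Phosphoramidite (deposition) | Nanopore sequencing | 200× | Index sequences | 880~1000 | 1.71 | 1.74 |
    [38] | 2 MB | Phosphoramidite (deposition) | Sequencing by synthesis | 10.5× | Seed sequences | 152 | 1.18 | 1.55 |
    [46] | 22 MB | Phosphoramidite (deposition) | Sequencing by synthesis | 160× | Index sequences | 230 | 0.89 | 1.08 |
    [36] | 150 kB | Phosphoramidite (electrochemical) | Sequencing by synthesis | 40× | Index sequences | 117 | 0.57 | 0.85 |
    [12] | 200 MB | Phosphoramidite (deposition) | Sequencing by synthesis | | Index sequences | 150~200 | 0.81 | 1.10 |
    [43] | 8.5 MB | Phosphoramidite (deposition) | Sequencing by synthesis | 164× | Index sequences | 194 | 1.94 | 2.64 |
    [44] | 854 kB | Phosphoramidite (column) | Sequencing by synthesis | 250× | Index sequences | 85 | 1.78 | 3.37 |
    [12] | 33 kB | Phosphoramidite (deposition) | Nanopore sequencing | 36× | Index sequences | 150 | 0.81 | 1.10 |
    [47] | 18 B | Enzymatic (column-based) | Nanopore sequencing | 175× | None (single strand) | 150~200 | 1.57 | 1.57 |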
  • References:
    [1] GANTZ J and REINSEL D. The digital universe in 2020: Big data, bigger digital shadows, and biggest growth in the far East[R]. IDC iView, 2012: 1–16.
    [2] EXTANCE A. How DNA could store all the world’s data[J]. Nature, 2016, 537(7618): 22–24. doi: 10.1038/537022a
    [3] ZHIRNOV V, ZADEGAN R M, SANDHU G S, et al. Nucleic acid memory[J]. Nature Materials, 2016, 15(4): 366–370. doi: 10.1038/nmat4594
    [4] COLQUHOUN H and LUTZ J F. Information-containing macromolecules[J]. Nature Chemistry, 2014, 6(6): 455–456. doi: 10.1038/nchem.1958
    [5] WANG Junke, YIN Jue, NIU Renjie, et al. DNA computing and DNA nanotechnology[J]. Journal of Electronics & Information Technology, 2020, 42(6): 1313–1325. doi: 10.11999/JEIT190826
    [6] XU Jin, QIANG Xiaoli, ZHANG Kai, et al. A DNA computing model for the graph vertex coloring problem based on a probe graph[J]. Engineering, 2018, 4(1): 61–77. doi: 10.1016/j.eng.2018.02.011
    [7] LAN Wenfei, XING Zhibao, HUANG Jun, et al. The DNA self-assembly computing model for solving perfect matching problem of bipartite graph[J]. Journal of Computer Research and Development, 2016, 53(11): 2583–2593. doi: 10.7544/issn1000-1239.2016.20150312
    [8] ZHU Weijun, ZHOU Qinglei, and ZHANG Qinxian. A LTL model checking approach based on DNA computing[J]. Chinese Journal of Computers, 2016, 39(12): 2578–2597. doi: 10.11897/SP.J.1016.2016.02578
    [9] XIA Hong and ZHANG Shijun. Constructing the logical model based on molecular computing[J]. Bulletin of Science and Technology, 2016, 32(5): 11–15. doi: 10.3969/j.issn.1001-7119.2016.05.003
    [10] ZHOU Xu, ZHOU Yantao, OUYANG Aijia, et al. An efficient tile assembly model for maximum clique problem[J]. Journal of Computer Research and Development, 2014, 51(6): 1253–1262. doi: 10.7544/issn1000-1239.2014.20120904
    [11] ZHOU Xu, ZHOU Yantao, LI Kenli, et al. Efficient maximum matching problem algorithms in the tile assembly model[J]. Acta Electronica Sinica, 2015, 43(2): 262–268. doi: 10.3969/j.issn.0372-2112.2015.02.009
    [12] ORGANICK L, ANG S D, CHEN Y J, et al. Random access in large-scale DNA data storage[J]. Nature Biotechnology, 2018, 36(3): 242–248. doi: 10.1038/nbt.4079
    [13] RUTTEN M G T A, VAANDRAGER F W, ELEMANS J A A W, et al. Encoding information into polymers[J]. Nature Reviews Chemistry, 2018, 2(11): 365–381. doi: 10.1038/s41570-018-0051-5
    [14] DNA to the rescue for data storage[J]. Chemical & Engineering News, 2015, 93(35): 40–41.
    [15] CHEN Weigang, HUANG Gang, LI Bingzhi, et al. DNA information storage for audio and video files[J]. Scientia Sinica Vitae, 2020, 50(1): 81–85. doi: 10.1360/SSV-2019-0211
    [16] GREENGARD S. Cracking the code on DNA storage[J]. Communications of the ACM, 2017, 60(7): 16–18. doi: 10.1145/3088493
    [17] GRASS R N, HECKEL R, PUDDU M, et al. Robust chemical preservation of digital information on DNA in silica with error-correcting codes[J]. Angewandte Chemie International Edition, 2015, 54(8): 2552–2555. doi: 10.1002/anie.201411378
    [18] LUNT B M. How long is long-term data storage?[C]. Archiving Conference, Society for Imaging Science and Technology, 2011: 29–33.
    [19] SHRIVASTAVA S and BADLANI R. Data storage in DNA[J]. International Journal of Electrical Energy, 2014, 2(2): 119–124.
    [20] GREENBERG A, HAMILTON J, MALTZ D A, et al. The cost of a cloud: Research problems in data center networks[J]. ACM SIGCOMM Computer Communication Review, 2008, 39(1): 68–73. doi: 10.1145/1496091.1496103
    [21] SHETH R U and WANG H H. DNA-based memory devices for recording cellular events[J]. Nature Reviews Genetics, 2018, 19(11): 718–732. doi: 10.1038/s41576-018-0052-8
    [22] WIENER N. Interview: Machines smarter than men[J]. US News World Report, 1964, 56: 84–86.
    [23] NEIMAN M S. On the molecular memory systems and the directed mutations[J]. Radiotekhnika, 1965, 6: 1–8.
    [24] DAVIS J. Microvenus[J]. Art Journal, 1996, 55(1): 70–74. doi: 10.1080/00043249.1996.10791743
    [25] CLELLAND C T, RISCA V, and BANCROFT C. Hiding messages in DNA microdots[J]. Nature, 1999, 399(6736): 533–534. doi: 10.1038/21092
    [26] BANCROFT C, BOWLER T, BLOOM B, et al. Long-term storage of information in DNA[J]. Science, 2001, 293(5536): 1763–1765.
    [27] AILENBERG M and ROTSTEIN O D. An improved Huffman coding method for archiving text, images, and music characters in DNA[J]. BioTechniques, 2009, 47(3): 747–754. doi: 10.2144/000113218
    [28] WONG P C, WONG K K, and FOOTE H. Organic data memory using the DNA approach[J]. Communications of the ACM, 2003, 46(1): 95–98. doi: 10.1145/602421.602426
    [29] ARITA M and OHASHI Y. Secret signatures inside genomic DNA[J]. Biotechnology Progress, 2004, 20(5): 1605–1607. doi: 10.1021/bp049917i
    [30] YACHIE N, SEKIYAMA K, SUGAHARA J, et al. Alignment-based approach for durable data storage into living organisms[J]. Biotechnology Progress, 2007, 23(2): 501–505. doi: 10.1021/bp060261y
    [31] CHURCH G M, GAO Yuan, and KOSURI S. Next-generation digital information storage in DNA[J]. Science, 2012, 337(6102): 1628. doi: 10.1126/science.1226355
    [32] GOLDMAN N, BERTONE P, CHEN Siyuan, et al. Towards practical, high-capacity, low-maintenance information storage in synthesized DNA[J]. Nature, 2013, 494(7435): 77–80. doi: 10.1038/nature11875
    [33] GIBSON D G, GLASS J I, LARTIGUE C, et al. Creation of a bacterial cell controlled by a chemically synthesized genome[J]. Science, 2010, 329(5987): 52–56. doi: 10.1126/science.1190719
    [34] HECKEL R, SHOMORONY I, RAMCHANDRAN K, et al. Fundamental limits of DNA storage systems[C]. 2017 IEEE International Symposium on Information Theory, Aachen, Germany, 2017: 3130–3134.
    [35] KOSURI S and CHURCH G M. Large-scale de novo DNA synthesis: Technologies and applications[J]. Nature Methods, 2014, 11(5): 499–507. doi: 10.1038/nmeth.2918
    [36] BORNHOLT J, LOPEZ R, CARMEAN D M, et al. A DNA-based archival storage system[J]. ACM SIGPLAN Notices, 2016, 50(4): 637–649.
    [37] YAZDI S M H T, YUAN Yongbo, MA Jian, et al. A rewritable, random-access DNA-based storage system[J]. Scientific Reports, 2015, 5: 14138. doi: 10.1038/srep14138
    [38] ERLICH Y and ZIELINSKI D. DNA fountain enables a robust and efficient storage architecture[J]. Science, 2017, 355(6328): 950–954. doi: 10.1126/science.aaj2038
    [39] TAN Li, SUN Jifeng, and GUO Lihua. DNA sequence data compression method based on memetic algorithm[J]. Journal of Electronics & Information Technology, 2014, 36(1): 121–127.
    [40] SHANNON C E. A mathematical theory of communication[J]. The Bell System Technical Journal, 1948, 27(3): 379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x
    [41] HECKEL R, MIKUTIS G, and GRASS R N. A characterization of the DNA data storage channel[J]. Scientific Reports, 2019, 9(1): 9663. doi: 10.1038/s41598-019-45832-6
    [42] REED I S and SOLOMON G. Polynomial codes over certain finite fields[J]. Journal of the Society for Industrial and Applied Mathematics, 1960, 8(2): 300–304. doi: 10.1137/0108018
    [43] ANAVY L, VAKNIN I, ATAR O, et al. Improved DNA based storage capacity and fidelity using composite DNA letters[J]. bioRxiv, 2018. doi: 10.1101/433524
    [44] CHOI Y, RYU T, LEE A C, et al. Addition of degenerate bases to DNA-based data storage for increased information capacity[J]. bioRxiv, 2018. doi: 10.1101/367052
    [45] YAZDI S M H T, GABRYS R, and MILENKOVIC O. Portable and error-free DNA-based data storage[J]. Scientific Reports, 2017, 7: 5011. doi: 10.1038/s41598-017-05188-1
    [46] BLAWAT M, GAEDKE K, HÜTTER I, et al. Forward error correction for DNA data storage[J]. Procedia Computer Science, 2016, 80: 1011–1022. doi: 10.1016/j.procs.2016.05.398
    [47] LEE H H, KALHOR R, GOELA N, et al. Enzymatic DNA synthesis for digital information storage[J]. bioRxiv, 2018. doi: 10.1101/348987
    [48] BAUM E. Building an associative memory vastly larger than the brain[J]. Science, 1995, 268(5210): 583–585. doi: 10.1126/science.7725109
    [49] CARUTHERS M H. The chemical synthesis of DNA/RNA: Our gift to science[J]. Journal of Biological Chemistry, 2013, 288(2): 1420–1427. doi: 10.1074/jbc.X112.442855
    [50] GOODWIN S, MCPHERSON J D, and MCCOMBIE W R. Coming of age: Ten years of next-generation sequencing technologies[J]. Nature Reviews Genetics, 2016, 17(6): 333–351. doi: 10.1038/nrg.2016.49
    [51] SHENDURE J, BALASUBRAMANIAN S, CHURCH G M, et al. DNA sequencing at 40: Past, present and future[J]. Nature, 2017, 550(7676): 345–353. doi: 10.1038/nature24286
    [52] DEAMER D, AKESON M, and BRANTON D. Three decades of nanopore sequencing[J]. Nature Biotechnology, 2016, 34(5): 518–524. doi: 10.1038/nbt.3423
    [53] FONTANA JR R E and DECAD G M. Moore’s law realities for recording systems and memory storage components: HDD, tape, NAND, and optical[J]. AIP Advances, 2018, 8(5): 056506. doi: 10.1063/1.5007621
    [54] BONNET J, COLOTTE M, COUDY D, et al. Chain and conformation stability of solid-state DNA: Implications for room temperature storage[J]. Nucleic Acids Research, 2010, 38(5): 1531–1546. doi: 10.1093/nar/gkp1060
    [55] PRAKADAN S M, SHALEK A K, and WEITZ D A. Scaling by shrinking: Empowering single-cell 'omics' with microfluidic devices[J]. Nature Reviews Genetics, 2017, 18(6): 345–361. doi: 10.1038/nrg.2017.15
    [56] NEWMAN S, STEPHENSON A P, WILLSEY M, et al. High density DNA data storage library via dehydration with digital microfluidic retrieval[J]. Nature Communications, 2019, 10(1): 1706. doi: 10.1038/s41467-019-09517-y
Publication history
  • Received:  2019-11-01
  • Revised:  2020-05-18
  • Published online:  2020-05-21
  • Issue date:  2020-06-22
