高级搜索

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

基于多目标进化策略算法的DNA核酸编码设计

张凯 陈彬 许志伟

张凯, 陈彬, 许志伟. 基于多目标进化策略算法的DNA核酸编码设计[J]. 电子与信息学报, 2020, 42(6): 1365-1373. doi: 10.11999/JEIT190869
引用本文: 张凯, 陈彬, 许志伟. 基于多目标进化策略算法的DNA核酸编码设计[J]. 电子与信息学报, 2020, 42(6): 1365-1373. doi: 10.11999/JEIT190869
Kai ZHANG, Bin Chen, Zhiwei Xu. A Multiobjective Evolution Strategy Algorithm for DNA Sequence Design[J]. Journal of Electronics & Information Technology, 2020, 42(6): 1365-1373. doi: 10.11999/JEIT190869
Citation: Kai ZHANG, Bin Chen, Zhiwei Xu. A Multiobjective Evolution Strategy Algorithm for DNA Sequence Design[J]. Journal of Electronics & Information Technology, 2020, 42(6): 1365-1373. doi: 10.11999/JEIT190869

基于多目标进化策略算法的DNA核酸编码设计

doi: 10.11999/JEIT190869
基金项目: 国家自然科学基金(61472293, 61702383, 61602328)
详细信息
    作者简介:

    张凯:男,1979年生,教授,研究方向为DNA计算、多目标进化算法

    陈彬:男,1982年生,硕士生,研究方向为智能优化算法

    许志伟:男,1995年生,博士生,研究方向为DNA编码、演化计算

    通讯作者:

    许志伟 xuzhiwei@wust.edu.cn

  • 中图分类号: TP301

A Multiobjective Evolution Strategy Algorithm for DNA Sequence Design

Funds: The National Natural Science Foundation of China (61472293, 61702383, 61602328)
  • 摘要: 设计高质量的核酸分子集合能有效提高DNA计算的可靠性、有效性和可求解问题的规模。DNA分子需要满足热力学约束、相似度约束、GC含量约束等多个相互冲突的目标函数,是典型的多目标优化问题。该文提出一种多目标进化策略(MOES)算法求解DNA分子序列设计问题,算法设计了随机碱基变异算子实现高效的局部搜索和全局搜索。改进的评价函数综合考虑了候选解的支配关系和冲突目标的平衡程度,选取符合DNA编码约束的核酸序列。实验结果证明,该文提出的算法具有高效的搜索效率和快速收敛能力,可以产生高质量的DNA序列集合,优于其他对比算法产生的DNA分子序列集合。
  • 图  1  非支配解集中的边界点和理想解

    图  2  目标函数收敛过程

    表  1  个体变异算子伪代码

     输入: X=$5' $–x1x2$ \cdots $xn–$3' $
     输出: Y=$5' $–y1y2$ \cdots $yn–$3' $
     1: for j=1 to n
     2: List.add(j)
     3: end for
     4: k=random(n)
     5: for j=1 to k
     6: i = random(List.count)
     7: yi=(xi+random(3)+1) mod 4
     8: List.delete(i)
     9: end for
    下载: 导出CSV

    表  2  算法总体流程伪代码

     1: Initialization
     2: while (t < max iteration)
     3: for i=1 to P
     4:  p = Pt(i)
     5:  q = Individual Mutation(p)
     6:  if qp then
     7:   Pt(i)=q
     8:  else if (qp) and (pq) then
     9:   if Fitness(q) > Fitness(p) then
     10:    Pt(i)=q
     11:   end if
     12:  end if
     13: end for
     14: t=t+1
     15: end while
    下载: 导出CSV

    表  3  7条长度为20的DNA序列的结果比较

    方法(序列)连续性发卡结构H-measure相似度TmGC(%)
    MGA[7]
    TAGACCACTGTTGCACATGG00585250.279450
    ATTCGGTCAGACTTGCTGTG00645248.665050
    ATAGTGCGGACAGTAGTTCC00665950.163450
    AATACGCGGAACGTAACCTC00618550.415850
    AATACGCGGAACGTAACCTC00618550.415850
    ACAGCCTTAAGCCTAACTCC03655449.064150
    ATGCTTCCGACATGGAATGG03635749.816050
    f (x)064384441.75080
    IWO[8]
    ACACCAGCACACCAGAAACA90555548.467050
    GTTCAATCGCCTCTCGGTAT00575749.393550
    GCTACCTCTTCCACCATTCT00555549.245350
    GAATCAATGGCGGTCAGAAG00666649.944050
    TTGGTCCGGTTATTCCTTCG00656550.641850
    CCATCTTCCGTACTTCACTG00565651.099350
    TTCGACTCGGTTCCTTGCTA00585847.604950
    f (x)904124123.49440
    pMO-ABC[10]
    TGTGGTTGGTTAGTCGGTTG00464951.042150
    GGTGGTATTGGTGGTATTGG00474753.802750
    CTTCTCTTCTCTTGCCGCTT00395646.411250
    AACAACCTCCACACCGAACA00623249.173750
    CTCTCTCTCTCACTCTCTCA00414846.522050
    CTCTCATTCCTTCTTACCCC160435150.828350
    TGGTGTTGCTGGTGTAGGTT00485149.398550
    f (x)1603263347.39150
    MOES
    GGAGAGGAGAAGAAGAAGAG00522548.172750
    CCTCCACATCACCATTAACC00563152.380750
    CTCTCTCTCTCTCTCTCTCT00343745.665850
    TTCCTTCCTTCCTTCCTTCC00363948.750050
    TTGGTTGGTTGGTTGGTTGG00304651.305450
    TTGTTGTTGGTGGTGGTGGT00304850.223650
    TGTGTGTGGTGTGTGTTGTG00304651.002550
    f (x)002682726.71490
    下载: 导出CSV

    表  4  14条长度为20的DNA序列的结果比较

    方法(序列)连续性发卡结构H-measure相似度TmGC(%)
    MGA[7]
    CTCATCTAATCAGCCTCGCA0013511448.155450
    CTAATAGTGACAGCTGCGTG0313111950.242150
    GCATCGTTAGAGACACCTAC0313412450.793250
    GCATCAATATGCGCGACTAC0013112550.281550
    CATTAAGTAGACGCTGTCGG0313211450.950750
    TATGGATGAGGAGGACCTAG0313311750.638750
    CAGAGATGTTCTGTACCACC0312811751.223250
    CGTCGAGAATTCGTAGCTCA0013711948.322450
    TCTGTTACCGTATCGGATCG0312911550.879150
    AGAAGAGTTCGACTTGCTGG0313412147.550750
    GCAAGGAATTCACCGTCTGT0313312948.988150
    CGTGTGAAGAGAGTGGTTCA0012712348.935550
    CGACTGAATCATGGACCTGT0313412649.762450
    TACCGAGAAGTAGGACTGCA0313412448.384750
    f (x)030185216873.67250
    INSGA-II[9]
    CGAGACATCGTGCATATCGT0414312449.639350
    TATAGCACGAGTGCGCGTAT0313713048.565950
    GATCTACGATCATGAGAGCG0413512649.667350
    TCTGTACTGCTGACTCGAGT0316312447.131250
    CGAGTAGTCACACGATGAGA0015213249.283650
    AGATGATCAGCAGCGACACT0313313346.554650
    TGTGCTCGTCTCTGCATACT0415913047.150750
    AGACGAGTCGTACAGTACAG0015213449.909150
    ATGTACGTGAGATGCAGCAG0013912148.927050
    ATCACTACTCGCTCGTCACT0314113247.519050
    TCAGAGATACTCACGTCACG0314212349.283650
    GACAGAGCTATCAGCTACTG0312912449.292750
    GCTGACATAGAGTGCGATAC0013013350.172550
    ACATCGACACTACTACGCAC0313314450.155450
    f (x)033198818103.61790
    pMO-ABC[10]
    GTTATTGGTGGTGTGCGTTG001438251.930550
    ACGGAAGTAGGAAGGAGAGA0013710647.808950
    GGAAGACGCAGAAGAGAAAG9012111048.260950
    CCTCCTTATTGCCTTCCTTC0011410250.308150
    AACTAACCACCGACCAACCA009511850.110250
    ACACACAACACACACACTCC008811950.457750
    ACACCACCACATTACCACAC009711951.916150
    CTTCCGTCTCTTCTCTCTCT0013410546.956150
    AAGGAGCGAGGAAGCGAAAA1601079545.830650
    AACACCAGAACATCCACACC009013150.547450
    CCAACACCATACAACAGACC009513052.372050
    AAGGCGGAAGGATAGAAGAG0012811548.537050
    TCTGCCGCTTCTTCTTCTTC001189546.400050
    TCCTCTCGTCTATTCTCCTC001119848.442750
    f (x)250157815256.54140
    MOES
    CATACACACTCACACTCACC001128951.679250
    TTGTTGTGGGTTGTCCGGTT901059049.794950
    ACACACACACACACACACAC00937850.924450
    TTGTGGTCCTGGTGTTCTCT001129048.495750
    GAGAGAGAGAGAGAGAGAGA007210045.656850
    TGGTGTGGTGTGGTTAGGTT00969350.532550
    TTGGTGGTGGTGGTTGTAGT00969550.532550
    CCAACCAACCAACCAACCAA00957851.305450
    AACAAGCCAGAAGCCAGAAG009410247.506650
    GTTGGTGCTGTTGTTGAGGT001019949.455050
    GAAGAAGGGAGGAGAGAAGA907710847.496150
    AATGGAAGCGAAGCGAAGTG009310447.676650
    AACCATCAACCGCCGAAGAA031049548.169450
    AAGGTGGAGAGGAAGGAGAA008211147.409850
    f (x)183133213326.02240
    下载: 导出CSV
  • ADLEMAN L M. Molecular computation of solutions to combinatorial problems[J]. Science, 1994, 266(5187): 1021–1024. doi: 10.1126/science.7973651
    DE SILVA P Y and GANEGODA G U. New trends of digital data storage in DNA[J]. BioMed Research International, 2016: 8072463.
    BRAICH R S, CHELYAPOV N, JOHNSON C, et al. Solution of a 20-variable 3-SAT problem on a DNA computer[J]. Science, 2002, 296(5567): 499–502. doi: 10.1126/science.1069528
    ZIMMERMANN K H. Efficient DNA sticker algorithms for NP-complete graph problems[J]. Computer Physics Communications, 2002, 144(3): 297–309. doi: 10.1016/S0010-4655(02)00270-9
    XU Jin, QIANG Xiaoli, ZHANG Kai, et al. A DNA computing model for the graph vertex coloring problem based on a probe graph[J]. Engineering, 2018, 4(1): 61–77. doi: 10.1016/j.eng.2018.02.011
    CHAVES-GONZÁLEZ J M and VEGA-RODRÍGUEZ M A. A multiobjective approach based on the behavior of fireflies to generate reliable DNA sequences for molecular computing[J]. Applied Mathematics and Computation, 2014, 227: 291–308. doi: 10.1016/j.amc.2013.11.032
    PENG Ximei, ZHENG Xuedong, WANG Bin, et al. A micro-genetic algorithm for DNA encoding sequences design[C]. The 2nd International Conference on Control Science and Systems Engineering, Singapore, 2016: 10–14.
    YANG Gaijing, WANG Bin, ZHENG Xuedong, et al. IWO algorithm based on niche crowding for DNA sequence design[J]. Interdisciplinary Sciences: Computational Life Sciences, 2017, 9(3): 341–349. doi: 10.1007/s12539-016-0160-0
    WANG Yanfeng, SHEN Yongpeng, ZHANG Xuncai, et al. An improved non-dominated sorting genetic algorithm-Ⅱ (INSGA-Ⅱ) applied to the design of DNA codewords[J]. Mathematics and Computers in Simulation, 2018, 151: 131–139. doi: 10.1016/j.matcom.2018.03.011
    CHAVES-GONZÁLEZ J M and MARTÍNEZ-GIL J. An efficient design for a multi-objective evolutionary algorithm to generate DNA libraries suitable for computation[J]. Interdisciplinary Sciences: Computational Life Sciences, 2018, 11(3): 542–558.
    YANG Shuming, SHAO Dongguo, and LUO Yangjie. A novel evolution strategy for multiobjective optimization problem[J]. Applied Mathematics and Computation, 2005, 170(2): 850–873. doi: 10.1016/j.amc.2004.12.025
    ARNOLD D V and BEYER H G. Investigation of the (μ, λ)-ES in the presence of noise[C]. The 2001 Congress on Evolutionary Computation, Seoul, South Korea, 2001, 1: 332–339.
    EBENAU C, ROTTSCHÄFER J, and THIERAUF G. An advanced evolutionary strategy with an adaptive penalty function for mixed-discrete structural optimisation[J]. Advances in Engineering Software, 2005, 36(1): 29–38. doi: 10.1016/j.advengsoft.2003.10.008
    MEZURA-MONTES E and COELLO C A C. A simple multimembered evolution strategy to solve constrained optimization problems[J]. IEEE Transactions on Evolutionary Computation, 2005, 9(1): 1–17. doi: 10.1109/TEVC.2004.836819
    XIAO J, XU Jin, CHEN Zhihua, et al. A hybrid quantum chaotic swarm evolutionary algorithm for DNA encoding[J]. Computers & Mathematics with Applications, 2009, 57(11/12): 1949–1958.
    CHAVES-GONZÁLEZ J M, VEGA-RODRÍGUEZ M A, and GRANADO-CRIADO J M. A multiobjective swarm intelligence approach based on artificial bee colony for reliable DNA sequence design[J]. Engineering Applications of Artificial Intelligence, 2013, 26(9): 2045–2057. doi: 10.1016/j.engappai.2013.04.011
    MUHAMMAD M S, SELVAN K V, MASRA S M W, et al. An improved binary particle swarm optimization algorithm for DNA encoding enhancement[C]. 2011 IEEE Symposium on Swarm Intelligence, Paris, France, 2011: 1–8.
    CHAVES-GONZÁLEZ J M and VEGA-RODRÍGUEZ M A. DNA strand generation for DNA computing by using a multi-objective differential evolution algorithm[J]. Biosystems, 2014, 116: 49–64. doi: 10.1016/j.biosystems.2013.12.005
    BUI L T and ALAM S. Multi-Objective Optimization in Computational Intelligence: Theory and Practice[M]. Hershey: IGI Global, 2008.
    SHIN S Y, LEE I H, KIM D, et al. Multiobjective evolutionary optimization of DNA sequences for reliable DNA computing[J]. IEEE Transactions on Evolutionary Computation, 2005, 9(2): 143–158. doi: 10.1109/TEVC.2005.844166
  • 加载中
图(2) / 表(4)
计量
  • 文章访问数:  3805
  • HTML全文浏览量:  1391
  • PDF下载量:  107
  • 被引次数: 0
出版历程
  • 收稿日期:  2019-11-01
  • 修回日期:  2020-03-01
  • 网络出版日期:  2020-04-09
  • 刊出日期:  2020-06-22

目录

    /

    返回文章
    返回