高级搜索

基于多目标进化策略算法的DNA核酸编码设计

张凯 陈彬 许志伟

引用本文: 张凯, 陈彬, 许志伟. 基于多目标进化策略算法的DNA核酸编码设计[J]. 电子与信息学报, 2020, 42(6): 1365-1373. doi: 10.11999/JEIT190869 shu
Citation:  Kai ZHANG, Bin Chen, Zhiwei Xu. A Multiobjective Evolution Strategy Algorithm for DNA Sequence Design[J]. Journal of Electronics and Information Technology, 2020, 42(6): 1365-1373. doi: 10.11999/JEIT190869 shu

基于多目标进化策略算法的DNA核酸编码设计

    作者简介: 张凯: 男,1979年生,教授,研究方向为DNA计算、多目标进化算法;
    陈彬: 男,1982年生,硕士生,研究方向为智能优化算法;
    许志伟: 男,1995年生,博士生,研究方向为DNA编码、演化计算
    通讯作者: 许志伟,xuzhiwei@wust.edu.cn
  • 基金项目: 国家自然科学基金(61472293, 61702383, 61602328)

摘要: 设计高质量的核酸分子集合能有效提高DNA计算的可靠性、有效性和可求解问题的规模。DNA分子需要满足热力学约束、相似度约束、GC含量约束等多个相互冲突的目标函数,是典型的多目标优化问题。该文提出一种多目标进化策略(MOES)算法求解DNA分子序列设计问题,算法设计了随机碱基变异算子实现高效的局部搜索和全局搜索。改进的评价函数综合考虑了候选解的支配关系和冲突目标的平衡程度,选取符合DNA编码约束的核酸序列。实验结果证明,该文提出的算法具有高效的搜索效率和快速收敛能力,可以产生高质量的DNA序列集合,优于其他对比算法产生的DNA分子序列集合。

English

    1. [1]

      ADLEMAN L M. Molecular computation of solutions to combinatorial problems[J]. Science, 1994, 266(5187): 1021–1024. doi: 10.1126/science.7973651

    2. [2]

      DE SILVA P Y and GANEGODA G U. New trends of digital data storage in DNA[J]. BioMed Research International, 2016: 8072463.

    3. [3]

      BRAICH R S, CHELYAPOV N, JOHNSON C, et al. Solution of a 20-variable 3-SAT problem on a DNA computer[J]. Science, 2002, 296(5567): 499–502. doi: 10.1126/science.1069528

    4. [4]

      ZIMMERMANN K H. Efficient DNA sticker algorithms for NP-complete graph problems[J]. Computer Physics Communications, 2002, 144(3): 297–309. doi: 10.1016/S0010-4655(02)00270-9

    5. [5]

      XU Jin, QIANG Xiaoli, ZHANG Kai, et al. A DNA computing model for the graph vertex coloring problem based on a probe graph[J]. Engineering, 2018, 4(1): 61–77. doi: 10.1016/j.eng.2018.02.011

    6. [6]

      CHAVES-GONZÁLEZ J M and VEGA-RODRÍGUEZ M A. A multiobjective approach based on the behavior of fireflies to generate reliable DNA sequences for molecular computing[J]. Applied Mathematics and Computation, 2014, 227: 291–308. doi: 10.1016/j.amc.2013.11.032

    7. [7]

      PENG Ximei, ZHENG Xuedong, WANG Bin, et al. A micro-genetic algorithm for DNA encoding sequences design[C]. The 2nd International Conference on Control Science and Systems Engineering, Singapore, 2016: 10–14.

    8. [8]

      YANG Gaijing, WANG Bin, ZHENG Xuedong, et al. IWO algorithm based on niche crowding for DNA sequence design[J]. Interdisciplinary Sciences: Computational Life Sciences, 2017, 9(3): 341–349. doi: 10.1007/s12539-016-0160-0

    9. [9]

      WANG Yanfeng, SHEN Yongpeng, ZHANG Xuncai, et al. An improved non-dominated sorting genetic algorithm-Ⅱ (INSGA-Ⅱ) applied to the design of DNA codewords[J]. Mathematics and Computers in Simulation, 2018, 151: 131–139. doi: 10.1016/j.matcom.2018.03.011

    10. [10]

      CHAVES-GONZÁLEZ J M and MARTÍNEZ-GIL J. An efficient design for a multi-objective evolutionary algorithm to generate DNA libraries suitable for computation[J]. Interdisciplinary Sciences: Computational Life Sciences, 2018, 11(3): 542–558.

    11. [11]

      YANG Shuming, SHAO Dongguo, and LUO Yangjie. A novel evolution strategy for multiobjective optimization problem[J]. Applied Mathematics and Computation, 2005, 170(2): 850–873. doi: 10.1016/j.amc.2004.12.025

    12. [12]

      ARNOLD D V and BEYER H G. Investigation of the (μ, λ)-ES in the presence of noise[C]. The 2001 Congress on Evolutionary Computation, Seoul, South Korea, 2001, 1: 332–339.

    13. [13]

      EBENAU C, ROTTSCHÄFER J, and THIERAUF G. An advanced evolutionary strategy with an adaptive penalty function for mixed-discrete structural optimisation[J]. Advances in Engineering Software, 2005, 36(1): 29–38. doi: 10.1016/j.advengsoft.2003.10.008

    14. [14]

      MEZURA-MONTES E and COELLO C A C. A simple multimembered evolution strategy to solve constrained optimization problems[J]. IEEE Transactions on Evolutionary Computation, 2005, 9(1): 1–17. doi: 10.1109/TEVC.2004.836819

    15. [15]

      XIAO J, XU Jin, CHEN Zhihua, et al. A hybrid quantum chaotic swarm evolutionary algorithm for DNA encoding[J]. Computers & Mathematics with Applications, 2009, 57(11/12): 1949–1958.

    16. [16]

      CHAVES-GONZÁLEZ J M, VEGA-RODRÍGUEZ M A, and GRANADO-CRIADO J M. A multiobjective swarm intelligence approach based on artificial bee colony for reliable DNA sequence design[J]. Engineering Applications of Artificial Intelligence, 2013, 26(9): 2045–2057. doi: 10.1016/j.engappai.2013.04.011

    17. [17]

      MUHAMMAD M S, SELVAN K V, MASRA S M W, et al. An improved binary particle swarm optimization algorithm for DNA encoding enhancement[C]. 2011 IEEE Symposium on Swarm Intelligence, Paris, France, 2011: 1–8.

    18. [18]

      CHAVES-GONZÁLEZ J M and VEGA-RODRÍGUEZ M A. DNA strand generation for DNA computing by using a multi-objective differential evolution algorithm[J]. Biosystems, 2014, 116: 49–64. doi: 10.1016/j.biosystems.2013.12.005

    19. [19]

      BUI L T and ALAM S. Multi-Objective Optimization in Computational Intelligence: Theory and Practice[M]. Hershey: IGI Global, 2008.

    20. [20]

      SHIN S Y, LEE I H, KIM D, et al. Multiobjective evolutionary optimization of DNA sequences for reliable DNA computing[J]. IEEE Transactions on Evolutionary Computation, 2005, 9(2): 143–158. doi: 10.1109/TEVC.2005.844166

    1. [1]

      王君珂, 印珏, 牛人杰, 任少康, 晁洁. DNA计算与DNA纳米技术. 电子与信息学报, 2020, 42(6): 1313-1325.

    2. [2]

      殷志祥, 唐震, 张强, 崔建中, 杨静, 王日晟, 赵寿为, 张居丽. 基于DNA折纸基底的与非门计算模型. 电子与信息学报, 2020, 42(6): 1355-1364.

    3. [3]

      孙军伟, 李智, 王延峰. 基于DNA链置换的三级联组合分子逻辑电路设计. 电子与信息学报, 2020, 42(6): 1401-1409.

    4. [4]

      姚敏立, 王旭健, 张峰干, 戴定成. 基于动态参数差分进化算法的多约束稀布矩形面阵优化. 电子与信息学报, 2020, 42(5): 1281-1287.

    5. [5]

      柳娟, 谢文彬, 汪改英, 汤敏丽. 基于DNA和限制性核酸内切酶的基本逻辑门设计. 电子与信息学报, 2020, 42(6): 1332-1339.

    6. [6]

      毛秀海, 李凡, 左小磊. DNA数据存储. 电子与信息学报, 2020, 42(6): 1303-1312.

    7. [7]

      王延峰, 张桢桢, 王盼如, 孙军伟. 基于DNA链置换的两位格雷码减法器分子电路设计. 电子与信息学报, 2020, 42(0): 1-9.

    8. [8]

      牛莹, 张勋才. 基于变步长约瑟夫遍历和DNA动态编码的图像加密算法. 电子与信息学报, 2020, 42(6): 1383-1391.

    9. [9]

      许鹏, 方刚, 石晓龙, 刘文斌. DNA存储及其研究进展. 电子与信息学报, 2020, 42(6): 1326-1331.

    10. [10]

      姜文, 牛杰, 吴一戎, 梁兴东. 机载多通道SAR运动目标方位向速度和法向速度联合估计算法. 电子与信息学报, 2020, 42(6): 1542-1548.

    11. [11]

      张海波, 程妍, 刘开健, 贺晓帆. 车联网中整合移动边缘计算与内容分发网络的移动性管理策略. 电子与信息学报, 2020, 42(6): 1444-1451.

    12. [12]

      陈容, 陈岚, WAHLAArfan Haider. 基于公式递推法的可变计算位宽的循环冗余校验设计与实现. 电子与信息学报, 2020, 42(5): 1261-1267.

    13. [13]

      全英汇, 高霞, 沙明辉, 陈侠达, 李亚超, 邢孟道, 岳超良. 基于期望最大化算法的捷变频联合正交频分复用雷达高速多目标参数估计. 电子与信息学报, 2020, 42(7): 1611-1618.

    14. [14]

      董亚非, 胡文晓, 钱梦瑶, 王越. 基于DNA适配体的荧光生物传感器. 电子与信息学报, 2020, 42(6): 1374-1382.

    15. [15]

      唐伦, 肖娇, 魏延南, 赵国繁, 陈前斌. 基于云雾混合计算的车联网联合资源分配算法. 电子与信息学报, 2020, 42(0): 1-8.

    16. [16]

      夏士超, 姚枝秀, 鲜永菊, 李云. 移动边缘计算中分布式异构任务卸载算法. 电子与信息学报, 2020, 41(0): 1-8.

    17. [17]

      周杨, 张天骐. 多径环境下异步长码DS-CDMA信号伪码序列及信息序列盲估计. 电子与信息学报, 2020, 41(0): 1-8.

    18. [18]

      吕敬祥, 罗文浪. 无线传感网络量化及能量优化策略. 电子与信息学报, 2020, 42(5): 1118-1124.

    19. [19]

      向敏, 饶华阳, 张进进, 陈梦鑫. 基于GCN的软件定义电力通信网络路由控制策略. 电子与信息学报, 2020, 42(0): 1-8.

    20. [20]

      曾菊玲, 张春雷, 蒋砺思, 夏凌. 基于信道定价的无线虚拟网络资源分配策略:匹配/Stackelberg分层博弈. 电子与信息学报, 2020, 41(0): 0-7.

  • 图 1  非支配解集中的边界点和理想解

    图 2  目标函数收敛过程

    表 1  个体变异算子伪代码

     输入: X=$5' $–x1x2$ \cdots $xn–$3' $
     输出: Y=$5' $–y1y2$ \cdots $yn–$3' $
     1: for j=1 to n
     2: List.add(j)
     3: end for
     4: k=random(n)
     5: for j=1 to k
     6: i = random(List.count)
     7: yi=(xi+random(3)+1) mod 4
     8: List.delete(i)
     9: end for
    下载: 导出CSV

    表 2  算法总体流程伪代码

     1: Initialization
     2: while (t < max iteration)
     3: for i=1 to P
     4:  p = Pt(i)
     5:  q = Individual Mutation(p)
     6:  if qp then
     7:   Pt(i)=q
     8:  else if (qp) and (pq) then
     9:   if Fitness(q) > Fitness(p) then
     10:    Pt(i)=q
     11:   end if
     12:  end if
     13: end for
     14: t=t+1
     15: end while
    下载: 导出CSV

    表 3  7条长度为20的DNA序列的结果比较

    方法(序列)连续性发卡结构H-measure相似度TmGC(%)
    MGA[7]
    TAGACCACTGTTGCACATGG00585250.279450
    ATTCGGTCAGACTTGCTGTG00645248.665050
    ATAGTGCGGACAGTAGTTCC00665950.163450
    AATACGCGGAACGTAACCTC00618550.415850
    AATACGCGGAACGTAACCTC00618550.415850
    ACAGCCTTAAGCCTAACTCC03655449.064150
    ATGCTTCCGACATGGAATGG03635749.816050
    f (x)064384441.75080
    IWO[8]
    ACACCAGCACACCAGAAACA90555548.467050
    GTTCAATCGCCTCTCGGTAT00575749.393550
    GCTACCTCTTCCACCATTCT00555549.245350
    GAATCAATGGCGGTCAGAAG00666649.944050
    TTGGTCCGGTTATTCCTTCG00656550.641850
    CCATCTTCCGTACTTCACTG00565651.099350
    TTCGACTCGGTTCCTTGCTA00585847.604950
    f (x)904124123.49440
    pMO-ABC[10]
    TGTGGTTGGTTAGTCGGTTG00464951.042150
    GGTGGTATTGGTGGTATTGG00474753.802750
    CTTCTCTTCTCTTGCCGCTT00395646.411250
    AACAACCTCCACACCGAACA00623249.173750
    CTCTCTCTCTCACTCTCTCA00414846.522050
    CTCTCATTCCTTCTTACCCC160435150.828350
    TGGTGTTGCTGGTGTAGGTT00485149.398550
    f (x)1603263347.39150
    MOES
    GGAGAGGAGAAGAAGAAGAG00522548.172750
    CCTCCACATCACCATTAACC00563152.380750
    CTCTCTCTCTCTCTCTCTCT00343745.665850
    TTCCTTCCTTCCTTCCTTCC00363948.750050
    TTGGTTGGTTGGTTGGTTGG00304651.305450
    TTGTTGTTGGTGGTGGTGGT00304850.223650
    TGTGTGTGGTGTGTGTTGTG00304651.002550
    f (x)002682726.71490
    下载: 导出CSV

    表 4  14条长度为20的DNA序列的结果比较

    方法(序列)连续性发卡结构H-measure相似度TmGC(%)
    MGA[7]
    CTCATCTAATCAGCCTCGCA0013511448.155450
    CTAATAGTGACAGCTGCGTG0313111950.242150
    GCATCGTTAGAGACACCTAC0313412450.793250
    GCATCAATATGCGCGACTAC0013112550.281550
    CATTAAGTAGACGCTGTCGG0313211450.950750
    TATGGATGAGGAGGACCTAG0313311750.638750
    CAGAGATGTTCTGTACCACC0312811751.223250
    CGTCGAGAATTCGTAGCTCA0013711948.322450
    TCTGTTACCGTATCGGATCG0312911550.879150
    AGAAGAGTTCGACTTGCTGG0313412147.550750
    GCAAGGAATTCACCGTCTGT0313312948.988150
    CGTGTGAAGAGAGTGGTTCA0012712348.935550
    CGACTGAATCATGGACCTGT0313412649.762450
    TACCGAGAAGTAGGACTGCA0313412448.384750
    f (x)030185216873.67250
    INSGA-II[9]
    CGAGACATCGTGCATATCGT0414312449.639350
    TATAGCACGAGTGCGCGTAT0313713048.565950
    GATCTACGATCATGAGAGCG0413512649.667350
    TCTGTACTGCTGACTCGAGT0316312447.131250
    CGAGTAGTCACACGATGAGA0015213249.283650
    AGATGATCAGCAGCGACACT0313313346.554650
    TGTGCTCGTCTCTGCATACT0415913047.150750
    AGACGAGTCGTACAGTACAG0015213449.909150
    ATGTACGTGAGATGCAGCAG0013912148.927050
    ATCACTACTCGCTCGTCACT0314113247.519050
    TCAGAGATACTCACGTCACG0314212349.283650
    GACAGAGCTATCAGCTACTG0312912449.292750
    GCTGACATAGAGTGCGATAC0013013350.172550
    ACATCGACACTACTACGCAC0313314450.155450
    f (x)033198818103.61790
    pMO-ABC[10]
    GTTATTGGTGGTGTGCGTTG001438251.930550
    ACGGAAGTAGGAAGGAGAGA0013710647.808950
    GGAAGACGCAGAAGAGAAAG9012111048.260950
    CCTCCTTATTGCCTTCCTTC0011410250.308150
    AACTAACCACCGACCAACCA009511850.110250
    ACACACAACACACACACTCC008811950.457750
    ACACCACCACATTACCACAC009711951.916150
    CTTCCGTCTCTTCTCTCTCT0013410546.956150
    AAGGAGCGAGGAAGCGAAAA1601079545.830650
    AACACCAGAACATCCACACC009013150.547450
    CCAACACCATACAACAGACC009513052.372050
    AAGGCGGAAGGATAGAAGAG0012811548.537050
    TCTGCCGCTTCTTCTTCTTC001189546.400050
    TCCTCTCGTCTATTCTCCTC001119848.442750
    f (x)250157815256.54140
    MOES
    CATACACACTCACACTCACC001128951.679250
    TTGTTGTGGGTTGTCCGGTT901059049.794950
    ACACACACACACACACACAC00937850.924450
    TTGTGGTCCTGGTGTTCTCT001129048.495750
    GAGAGAGAGAGAGAGAGAGA007210045.656850
    TGGTGTGGTGTGGTTAGGTT00969350.532550
    TTGGTGGTGGTGGTTGTAGT00969550.532550
    CCAACCAACCAACCAACCAA00957851.305450
    AACAAGCCAGAAGCCAGAAG009410247.506650
    GTTGGTGCTGTTGTTGAGGT001019949.455050
    GAAGAAGGGAGGAGAGAAGA907710847.496150
    AATGGAAGCGAAGCGAAGTG009310447.676650
    AACCATCAACCGCCGAAGAA031049548.169450
    AAGGTGGAGAGGAAGGAGAA008211147.409850
    f (x)183133213326.02240
    下载: 导出CSV
  • 加载中
图(2)表(4)
计量
  • PDF下载量:  28
  • 文章访问数:  1474
  • HTML全文浏览量:  546
文章相关
  • 通讯作者:  许志伟, xuzhiwei@wust.edu.cn
  • 收稿日期:  2019-11-01
  • 录用日期:  2020-03-01
  • 网络出版日期:  2020-04-09
  • 刊出日期:  2020-06-01
通讯作者: 陈斌, bchen63@163.com
  • 1. 

    沈阳化工大学材料科学与工程学院 沈阳 110142

  1. 本站搜索
  2. 百度学术搜索
  3. 万方数据库搜索
  4. CNKI搜索

/

返回文章