TCM Tongue Color Classification Method under Noisy Labeling

ZHUO Li, SUN Liangliang, ZHANG Hui, LI Xiaoguang, ZHANG Jing

Citation: ZHUO Li, SUN Liangliang, ZHANG Hui, LI Xiaoguang, ZHANG Jing. TCM Tongue Color Classification Method under Noisy Labeling[J]. Journal of Electronics & Information Technology, 2022, 44(1): 89-98. doi: 10.11999/JEIT210935

doi: 10.11999/JEIT210935
Funds: The National Natural Science Foundation of China (61871006)
Details
    Author biographies:

    ZHUO Li: female, born in 1971, professor and Ph.D. supervisor; research interests: intelligent information processing, image/video coding and transmission

    SUN Liangliang: male, born in 1996, M.S. candidate; research interest: design and integration of artificial intelligence systems

    ZHANG Hui: male, born in 1982, associate professor and M.S. supervisor; research interests: visual object detection and tracking

    LI Xiaoguang: male, born in 1980, associate professor and M.S. supervisor; research interests: image/video signal processing, super-resolution image restoration

    ZHANG Jing: female, born in 1975, professor and Ph.D. supervisor; research interest: multimedia information retrieval

    Corresponding author: ZHUO Li, zhuoli@bjut.edu.cn

  • CLC number: TN911.73; TP391

  • Abstract: Tongue color is one of the diagnostic features that Traditional Chinese Medicine (TCM) inspection pays the most attention to, and automatic, accurate tongue color classification is an important part of research on objective tongue diagnosis. Because the visual boundaries between tongue color categories are fuzzy and physician annotators are subjective, annotated tongue image data often contain noise, which hampers the training of tongue color classification models. This paper therefore proposes a TCM tongue color classification method for data with noisy labels. First, a two-stage data cleaning method is proposed to identify and clean noisily labeled samples. Second, a lightweight convolutional neural network (CNN) based on a channel attention mechanism is designed, which enhances the representational power of features to classify tongue color accurately. Finally, a knowledge distillation strategy with a noisy-sample filtering mechanism is proposed: a filtering mechanism driven by the teacher network further removes noisy samples, while the teacher network guides the training of the lightweight CNN, improving classification performance. Experimental results on a self-built TCM tongue color dataset show that the proposed method significantly improves classification accuracy, reaching 93.88%, at low computational complexity.
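
The paper's channel attention design is given in Figure 3 and is not reproduced here. As a rough orientation, below is a minimal PyTorch sketch of an SE-style channel attention block, one common realization of the mechanism the abstract refers to; the reduction ratio and layer layout are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of an SE-style channel attention block; the reduction
# ratio (16) is an illustrative choice, not taken from the paper.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze: global average pooling
        self.fc = nn.Sequential(                     # excitation: bottleneck MLP
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                            # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # reweight the feature channels
```

Inserted after a convolutional stage, such a block lets the network emphasize color-discriminative channels at negligible parameter cost, consistent with the lightweight design goal stated in the abstract.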
  • Figure 1  Overall framework of the proposed TCM tongue color classification method under noisy labeling

    Figure 2  Structure of the lightweight CNN

    Figure 3  Structure of the channel attention network

    Figure 4  Structure of knowledge distillation

    Figure 5  Examples of tongue color samples

    Figure 6  Examples of noisy samples identified by data cleaning

    Figure 7  Accuracy comparison of different classification models

    Table 1  Data cleaning procedure for tongue images with noisy annotations

     Input: original dataset $D_{\rm o} = \{ (x_1,y_1),(x_2,y_2),\cdots,(x_n,y_n)\}$;
        accuracy threshold ${\rm Acc}_{\rm th} = 0.9$;
        variable $a = 0$;
     Procedure:
     Stage 1:
       (1) Randomly split $D_{\rm o}$ into a training set and a test set at a 4:1 ratio:
         $D_{\rm train} = \{ (x_{r+1},y_{r+1}),(x_{r+2},y_{r+2}),\cdots,(x_{r+{\rm len}(D_{\rm train})},y_{r+{\rm len}(D_{\rm train})})\}$
         $D_{\rm test} = \{ (x_{e+1},y_{e+1}),(x_{e+2},y_{e+2}),\cdots,(x_{e+{\rm len}(D_{\rm test})},y_{e+{\rm len}(D_{\rm test})})\}$
       (2) Train a ResNet18 network on $D_{\rm train}$;
       (3) Predict the samples in $D_{\rm test}$ with the trained ResNet18 model, obtaining
         $\hat{y} = (\hat{y}_{e+1},\cdots,\hat{y}_{e+{\rm len}(D_{\rm test})})$;
       (4) for i in range(${\rm len}(D_{\rm test})$):
             if $\hat{y}_{e+i} \ne y_{e+i}$:
               $D_{\rm test}$.remove($x_{e+i}$);
               $D_{\rm delete}$.add($x_{e+i}$); // collect the removed samples into $D_{\rm delete}$
           ${\rm Acc} = {\rm len}(D_{\rm test})/({\rm len}(D_{\rm test}) + {\rm len}(D_{\rm delete}) - a)$;
           $a = a + {\rm len}(D_{\rm delete})$;
       (5) while ${\rm Acc} \le {\rm Acc}_{\rm th}$ do
             $D_{\rm onew} \leftarrow D_{\rm train} + D_{\rm test}$; // merge the cleaned $D_{\rm train}$ and $D_{\rm test}$ into $D_{\rm onew}$
             substitute $D_{\rm onew}$ for $D_{\rm o}$ in step (1) and repeat steps (1)–(4);
           end while
     Stage 2:
       (6) Train ResNet18 and VGG16 on the final $D_{\rm onew}$;
       (7) Predict the samples in $D_{\rm delete}$ with the best trained model of each network, obtaining:
           $\hat{y}_r = (\hat{y}_{r+1},\hat{y}_{r+2},\cdots,\hat{y}_{r+{\rm len}(D_{\rm delete})})$ // ResNet18 predictions
           $\hat{y}_v = (\hat{y}_{v+1},\hat{y}_{v+2},\cdots,\hat{y}_{v+{\rm len}(D_{\rm delete})})$ // VGG16 predictions
           for i in range(${\rm len}(D_{\rm delete})$):
             if $\hat{y}_{r+i} == \hat{y}_{v+i}$: // when the two model predictions agree, take them as the sample's new label
               $y_i = \hat{y}_{v+i}$
               $D_{\rm onew}$.add($x_{r+i}$)
     Output: cleaned tongue image dataset $D_{\rm onew}$
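To make the control flow of Table 1 concrete, here is a minimal Python sketch of both stages, assuming hypothetical `train_fn` (trains a classifier, e.g. ResNet18 or VGG16) and `predict_fn` (returns a predicted label) helpers; for readability, the accuracy bookkeeping is simplified to per-round test accuracy rather than the cumulative form with the variable $a$ used in the table.

```python
# Minimal sketch of the two-stage cleaning in Table 1; train_fn and predict_fn
# are hypothetical helpers, and the accuracy bookkeeping is simplified.
import random

ACC_TH = 0.9  # accuracy threshold Acc_th from Table 1

def clean_stage1(data, train_fn, predict_fn):
    """Stage 1: iteratively remove test samples the trained model disagrees with."""
    deleted, acc = [], 0.0
    while acc <= ACC_TH:
        random.shuffle(data)
        split = int(len(data) * 0.8)              # 4:1 train/test split
        train, test = data[:split], data[split:]
        model = train_fn(train)                   # e.g. ResNet18
        kept = []
        for x, y in test:
            if predict_fn(model, x) == y:
                kept.append((x, y))
            else:
                deleted.append((x, y))            # candidate noisy sample
        acc = len(kept) / len(test)               # this round's test accuracy
        data = train + kept                       # D_onew for the next round
    return data, deleted

def clean_stage2(cleaned, deleted, train_fns, predict_fn):
    """Stage 2: relabel deleted samples on which independently trained models agree."""
    models = [fn(cleaned) for fn in train_fns]    # e.g. [ResNet18, VGG16]
    for x, _ in deleted:
        preds = [predict_fn(m, x) for m in models]
        if len(set(preds)) == 1:                  # consistent prediction -> new label
            cleaned.append((x, preds[0]))
    return cleaned
```

Like the procedure in the table, the stage-1 loop assumes the retrained model eventually exceeds the accuracy threshold on the shrinking dataset.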

    Table 2  Noisy sample filtering procedure

     Input: training set $D_t = \{ (x_1,y_1),(x_2,y_2),\cdots,(x_n,y_n)\}$;
        teacher network ${\rm Model}_t$ with trained parameters $\theta$ and loss function Loss;
        remember_rate = 0.95;
        batch_size: number of samples fed to the network at a time;
     Procedure:
     (1) Feed the training samples into the teacher network in batches of batch_size;
     (2) Obtain predictions for the batch_size samples using the teacher network ${\rm Model}_t$ and the trained parameters $\theta$;
     (3) Compute each sample's loss value with Loss and sort the batch_size loss values in ascending order;
     (4) According to remember_rate, keep the first batch_size × remember_rate samples in sorted order and discard the rest;
     (5) Pass the samples retained in step (4) on to the student network.
     Output: clean data samples, which are fed to the student network
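The per-batch filter in Table 2 maps directly onto a few lines of PyTorch. Below is a minimal sketch assuming a trained `teacher` module and a cross-entropy loss; variable names are illustrative, not from the paper.

```python
# Minimal PyTorch sketch of the small-loss filtering in Table 2, assuming a
# trained teacher network; variable names are illustrative, not from the paper.
import torch
import torch.nn.functional as F

def filter_batch(teacher, images, labels, remember_rate=0.95):
    """Keep the remember_rate fraction of a batch with the smallest teacher loss."""
    teacher.eval()
    with torch.no_grad():
        logits = teacher(images)                                    # teacher predictions
        losses = F.cross_entropy(logits, labels, reduction="none")  # per-sample loss
    num_keep = int(labels.size(0) * remember_rate)
    keep = torch.argsort(losses)[:num_keep]       # ascending loss: small-loss samples first
    return images[keep], labels[keep]             # "clean" subset passed to the student
```

The retained samples are then used in the distillation step, where the teacher's outputs guide the training of the lightweight student CNN.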

    Table 3  Classification accuracy (%) before and after data cleaning

                     Light red tongue  Red tongue  Dark red tongue  Purple tongue  Overall
    Before cleaning  75.00             76.92       91.30            88.89          82.34
    After cleaning   88.00             88.46       95.65            100.00         91.81

    Table 4  Parameter counts (MB) of various CNN architectures

                     AlexNet  VGG16  ResNet18  MobileNetV2  Lightweight CNN
    Parameters (MB)  61.1     138.4  11.2      4.2          5.2

    Table 5  Comparison before and after knowledge distillation

    Teacher network                           Student network (lightweight CNN, 5.2 MB params)
    Name                Accuracy  Params (MB)   Accuracy
    None (no distillation)  –     –             0.9287
    ResNet18+CBAM       0.9418    11.8          0.9345
    ResNet50            0.9415    25.6          0.9356
    ResNext50           0.9423    25.1          0.9345
    ResNet101           0.9435    44.6          0.9352
    ResNet50+SeNet      0.9447    28.1          0.9388

    Table 6  Comparison of different classification networks (%)

                     Light red tongue  Red tongue  Dark red tongue  Purple tongue  Overall
    DenseNet121      82.60             68.00       84.61            88.89          82.92
    ResNet18         75.00             76.92       91.30            88.89          82.34
    ResNext50        78.26             84.00       80.76            77.78          80.72
    ShuffleNetV2     86.95             76.00       76.92            88.89          81.14
    MobileNetV2      86.95             80.00       76.92            77.78          81.32
    EfficientNet-b4  86.95             64.00       84.61            77.78          81.45
    Proposed method  91.30             92.00       96.15            100.00         93.88

    Table 7  Ablation results for the two-stage data cleaning method

              Original dataset  Stage 1  Stage 2  Accuracy (%)
    Scheme 1  √                 ×        ×        82.34
    Scheme 2  √                 √        ×        90.02
    Scheme 3  √                 √        √        91.81

    Table 8  Ablation results for the lightweight CNN

    Knowledge distillation  Baseline model  Channel attention  Accuracy (%)
    ×                       √               √                  92.78
    √                       √               √                  93.88
Publication history
  • Received: 2021-09-03
  • Revised: 2021-12-22
  • Accepted: 2021-12-24
  • Available online: 2021-12-27
  • Issue published: 2022-01-10
