
Intelligent Data Annotation and Fault Diagnosis Under Noisy Labels (IDAFD-UNL)

Motivation: Intelligent data annotation and fault diagnosis (IDAFD) is of great interest in both academia and industry. Over the past decades, intelligent fault diagnosis (IFD) has evolved from expertise-based to data-driven paradigms (see Fig. 1) and achieved great success. However, this success is predicated on correctly annotated datasets. Labels in large industrial datasets can be noisy and thus degrade the performance of fault diagnosis models. In recent years, deep learning-based label denoising (DLLD) has gained attention in the field of fault diagnosis. Nevertheless, related research is still limited in the prognostics and health management (PHM) community. To promote the development of IDAFD-UNL, we created this repository.

Fig 1. The development of intelligent fault diagnosis.

Background: In practice, data-driven fault diagnosis requires diverse data with reliable labels for training and evaluation. However, acquiring such data is difficult due to limited working conditions and collection costs. Moreover, even when the essential data can be collected, perfect annotation remains a challenge because of insufficient labeling expertise and the heavy labeling workload. Data annotation has evolved from expert-based to crowdsourcing-based and model-automated labeling. However, the available labeling strategies can hardly guarantee perfect annotation without corrupted labels (see Fig. 2). Hence, the label noise problem, i.e., data with corrupted labels, arises and brings new challenges to data-driven fault diagnosis.

Fig 2. Label noise from annotators in fault diagnosis.

Resources: We have collected relevant resources in this repository, such as datasets, papers, and available code. Issues and pull requests are welcome.

⭐: If this repository facilitates your current or future research and makes a positive contribution to your study, please cite the following reference.

@article{liu2022active,
  title={An Active Label-denoising Algorithm Based on Broad Learning for Annotation of Machine Health Status},
  author={Liu, Guokai and Shen, Weiming and Gao, Liang and Kusiak, Andrew},
  journal={Science China Technological Sciences},
  doi={10.1007/s11431-022-2091-9},
  year={2022}
}

Label noise simulation

Fig 3. Simulated symmetric and asymmetric label noise.
  • Function
def flip_label(y, pattern, ratio, dt='CWRU', one_hot=False, random_seed=42):
    """Inject simulated label noise into a label vector.

    y: true labels (integer class indices, or one-hot if one_hot=True)
    pattern: 'Symm' (symmetric/uniform noise) or 'Asym' (asymmetric/pair-wise noise)
    ratio: float, noise ratio
    Adapted from https://github.com/chenpf1025/noisy_label_understanding_utilizing
    """
    import numpy as np

    y = y.copy()
    if dt == 'CWRU':
        # pair-wise flipping dictionary for the 10 CWRU classes
        # Source: https://github.com/udibr/noisy_labels
        flip = {0:7, 1:9, 2:0, 3:4, 4:2, 5:1, 6:3, 7:5, 8:6, 9:8}
    elif pattern == 'Asym':
        raise ValueError('Please assign a flipping dictionary for dataset %s' % dt)

    # convert one-hot labels to integer class indices
    if one_hot:
        y = np.argmax(y, axis=1)
    n_class = int(max(y)) + 1

    # flip labels
    np.random.seed(random_seed)
    for i in range(len(y)):
        if pattern == 'Symm':
            # keep the true class with probability 1-ratio; spread ratio uniformly over the other classes
            p1 = ratio / (n_class - 1) * np.ones(n_class)
            p1[y[i]] = 1 - ratio
            y[i] = np.random.choice(n_class, p=p1)
        elif pattern == 'Asym':
            # flip to the paired class with probability ratio
            y[i] = np.random.choice([y[i], flip[y[i]]], p=[1 - ratio, ratio])

    # convert back to one-hot
    if one_hot:
        y = np.eye(n_class)[y]
    return y
  • Demo
# Simulate label noise
Yn1 = flip_label(Ys, 'Symm', 0.35, dt=args.dataset)
Yn2 = flip_label(Ys, 'Asym', 0.35, dt=args.dataset)
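
The snippet above assumes the repository's data-loading script (Ys and args come from it). A self-contained toy run of flip_label, using synthetic 10-class labels in place of real CWRU labels, could look like this:

import numpy as np

# Synthetic clean labels for a 10-class problem (placeholder for real CWRU labels)
Ys = np.random.randint(0, 10, size=1000)

Yn1 = flip_label(Ys, 'Symm', 0.35, dt='CWRU')  # symmetric (uniform) noise
Yn2 = flip_label(Ys, 'Asym', 0.35, dt='CWRU')  # asymmetric (pair-wise) noise

# Roughly 35% of the labels should now disagree with the clean ones
print('Symm corruption rate:', np.mean(Yn1 != Ys))
print('Asym corruption rate:', np.mean(Yn2 != Ys))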

PHM Datasets

Fig 4. Open-sourced datasets for fault diagnosis and prognosis.

@article{liu2022knowledge,
  title={Knowledge transfer in fault diagnosis of rotary machines},
  author={Liu, Guokai and Shen, Weiming and Gao, Liang and Kusiak, Andrew},
  journal={IET Collaborative Intelligent Manufacturing},
  volume={4},
  number={1},
  pages={17--34},
  year={2022}
}

IFDUNL Papers

  1. Nie X, Xie G (2020) A Novel Normalized Recurrent Neural Network for Fault Diagnosis with Noisy Labels. J Intell Manuf. https://doi.org/10.1007/s10845-020-01608-8
  2. Nie X, Xie G (2021) A Fault Diagnosis Framework Insensitive to Noisy Labels Based on Recurrent Neural Network. IEEE Sensors Journal 21:2676–2686. https://doi.org/10.1109/JSEN.2020.3023748
  3. Zhang K, Tang B, Deng L, et al (2021) A Fault Diagnosis Method for Wind Turbines Gearbox Based on Adaptive Loss Weighted Meta-ResNet under Noisy Labels. Mechanical Systems and Signal Processing 161:107963. https://doi.org/10.1016/j.ymssp.2021.107963
  4. Ainapure A, Li X, Singh J, et al (2020) Enhancing Intelligent Cross-Domain Fault Diagnosis Performance on Rotating Machines with Noisy Health Labels. Procedia Manufacturing 48:940–946. https://doi.org/10.1016/j.promfg.2020.05.133
  5. Ainapure A, Siahpour S, Li X, et al (2022) Intelligent Robust Cross-Domain Fault Diagnostic Method for Rotating Machines Using Noisy Condition Labels. Mathematics 10:455. https://doi.org/10.3390/math10030455

ML/DL Surveys

  1. Frenay B, Verleysen M (2014) Classification in the Presence of Label Noise: A Survey. IEEE Transactions on Neural Networks and Learning Systems 25:845–869. https://doi.org/10.1109/TNNLS.2013.2292894

  2. Han B, Yao Q, Liu T, et al (2020) A Survey of Label-noise Representation Learning: Past, Present and Future. https://doi.org/10.48550/arXiv.2011.04406

  3. Cordeiro FR, Carneiro G (2020) A Survey on Deep Learning with Noisy Labels: How to train your model when you cannot trust on the annotations? In: 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI). pp 9–16. https://doi.org/10.1109/SIBGRAPI51738.2020.00010

  4. Algan G, Ulusoy I (2021) Image Classification with Deep Learning in the Presence of Noisy Labels: A Survey. Knowledge-Based Systems 215:106771. https://doi.org/10.1016/j.knosys.2021.106771

  5. Song H, Kim M, Park D, et al (2022) Learning From Noisy Labels With Deep Neural Networks: A Survey. IEEE Transactions on Neural Networks and Learning Systems 1–19. https://doi.org/10.1109/TNNLS.2022.3152527


A Taxonomy

Fig 5. High-level research overview of robust deep learning for noisy labels.

⭐: This section was copied from https://github.com/songhwanjun/Awesome-Noisy-Labels. Please refer to it for more details.

@article{song2022learning,
  title={Learning from noisy labels with deep neural networks: A survey},
  author={Song, Hwanjun and Kim, Minseok and Park, Dongmin and Shin, Yooju and Lee, Jae-Gil},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2022},
  publisher={IEEE}
}
Robust Learning for Noisy Labels
|--- A. Robust Architecture
     |--- A.1. Noise Adaptation Layer: adding a noise adaptation layer on top of an underlying DNN to learn the label transition process.
     |--- A.2. Dedicated Architecture: developing a dedicated architecture to reliably support more diverse types of label noise.
|--- B. Robust Regularization
     |--- B.1. Explicit Regularization: an explicit form that modifies the expected training loss, e.g., weight decay and dropout.
     |--- B.2. Implicit Regularization: an implicit form that gives the effect of stochasticity, e.g., data augmentation and mini-batch SGD.
|--- C. Robust Loss Function: designing a new loss function robust to label noise (a minimal sketch follows this outline).
|--- D. Loss Adjustment
     |--- D.1. Loss Correction: multiplying the prediction by the estimated transition matrix over all observable labels (see the second sketch after this outline).
     |--- D.2. Loss Reweighting: multiplying the example loss by an estimated example confidence (weight).
     |--- D.3. Label Refurbishment: replacing the original label with a more reliable one.
     |--- D.4. Meta Learning: finding an optimal adjustment rule for loss reweighting or label refurbishment.
|--- E. Sample Selection
     |--- E.1. Multi-network Learning: collaborative learning or co-training to identify clean examples from noisy data.
     |--- E.2. Multi-round Learning: refining the selected clean set over multiple training rounds.
     |--- E.3. Hybrid Learning: combining a specific sample selection strategy with a semi-supervised learning model or other orthogonal directions.
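
To make category C more concrete, here is a minimal PyTorch-style sketch in the spirit of the generalized cross entropy loss listed in table C below; the class name GeneralizedCrossEntropy and the default q=0.7 are our illustrative choices, not the paper's official implementation.

import torch
import torch.nn.functional as F

class GeneralizedCrossEntropy(torch.nn.Module):
    # L_q loss: (1 - p_y^q) / q; q -> 0 recovers cross entropy, q = 1 gives MAE
    def __init__(self, q=0.7):
        super().__init__()
        self.q = q

    def forward(self, logits, targets):
        probs = F.softmax(logits, dim=1)
        # probability the model assigns to the (possibly noisy) observed label
        p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1).clamp(min=1e-7)
        return ((1.0 - p_y.pow(self.q)) / self.q).mean()

Because the loss saturates as p_y approaches zero, confidently mislabeled examples contribute a bounded loss, which is what makes this family of losses more tolerant to label noise than plain cross entropy.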
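
Likewise, the loss-correction idea in D.1 can be sketched as follows: the model's clean-class probabilities are pushed forward through a (known or estimated) noise transition matrix T before computing cross entropy against the noisy labels. The function name forward_correction_loss and the example symmetric-noise T are ours, for illustration only.

import torch
import torch.nn.functional as F

def forward_correction_loss(logits, noisy_targets, T):
    # T[i, j] = P(observed label j | true label i)
    clean_probs = torch.softmax(logits, dim=1)   # predicted clean-class probabilities, shape (N, C)
    noisy_probs = clean_probs @ T                # implied noisy-label probabilities, shape (N, C)
    return F.nll_loss(torch.log(noisy_probs.clamp(min=1e-7)), noisy_targets)

# Example: transition matrix for 35% symmetric noise over 10 classes,
# matching the flip_label simulation above
C = 10
T = torch.full((C, C), 0.35 / (C - 1))
T.fill_diagonal_(1 - 0.35)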

A. [Robust Architecture]

A.1. Noise Adaptation Layer
Year Venue Title Implementation
2015 ICCV Webly supervised learning of convolutional networks Official (Caffe)
2015 ICLRW Training convolutional networks with noisy labels Unofficial (Keras)
2016 ICDM Learning deep networks from noisy labels with dropout regularization Official (MATLAB)
2016 ICASSP Training deep neural-networks based on unreliable labels Unofficial (Chainer)
2017 ICLR Training deep neural-networks using a noise adaptation layer Official (Keras)
A.2. Dedicated Architecture
Year Venue Title Implementation
2015 CVPR Learning from massive noisy labeled data for image classification Official (Caffe)
2018 NeurIPS Masking: A new perspective of noisy supervision Official (TensorFlow)
2018 TIP Deep learning from noisy image labels with quality embedding N/A
2019 ICML Robust inference via generative classifiers for handling noisy labels Official (PyTorch)

B. [Robust Regularization]

B.1. Explicit Regularization
Year Venue Title Implementation
2018 ECCV Deep bilevel learning Official (TensorFlow)
2019 CVPR Learning from noisy labels by regularized estimation of annotator confusion Official (TensorFlow)
2019 ICML Using pre-training can improve model robustness and uncertainty Official (PyTorch)
2020 ICLR Can gradient clipping mitigate label noise? Unofficial (PyTorch)
2020 ICLR Wasserstein adversarial regularization (WAR) on label noise N/A
2021 ICLR Robust early-learning: Hindering the memorization of noisy labels Official (PyTorch)
2021 ICLR When Optimizing f-Divergence is Robust with Label Noise Official (PyTorch)
2021 ICCV Learning with Noisy Labels via Sparse Regularization Official (PyTorch)
2021 NeurIPS Open-set Label Noise Can Improve Robustness Against Inherent Label Noise Official (PyTorch)
B.2. Implicit Regularization
Year Venue Title Implementation
2015 ICLR Explaining and harnessing adversarial examples Unofficial (PyTorch)
2017 ICLRW Regularizing neural networks by penalizing confident output distributions Unofficial (PyTorch)
2018 ICLR Mixup: Beyond empirical risk minimization Official (PyTorch)
2021 CVPR Augmentation Strategies for Learning with Noisy Labels Official (PyTorch)
2021 CVPR AutoDO: Robust AutoAugment for Biased Data With Label Noise via Scalable Probabilistic Implicit Differentiation Official (PyTorch)

C. [Robust Loss Function]

Year Venue Title Implementation
2017 AAAI Robust loss functions under label noise for deep neural networks N/A
2019 ICCV Symmetric cross entropy for robust learning with noisy labels Official (Keras)
2018 NeurIPS Generalized cross entropy loss for training deep neural networks with noisy labels Unofficial (PyTorch)
2020 ICLR Curriculum loss: Robust learning and generalization against label corruption N/A
2020 ICML Normalized loss functions for deep learning with noisy labels Official (PyTorch)
2020 ICML Peer loss functions: Learning from noisy labels without knowing noise rates Official (PyTorch)
2021 CVPR Learning Cross-Modal Retrieval with Noisy Labels Official (PyTorch)
2021 CVPR A Second-Order Approach to Learning With Instance-Dependent Label Noise Official (PyTorch)
2022 ICLR An Information Fusion Approach to Learning with Instance-Dependent Label Noise N/A

D. [Loss Adjustment]

D.1. Loss Correction
Year Venue Title Implementation
2017 CVPR Making deep neural networks robust to label noise: A loss correction approach Official (Keras)
2018 NeurIPS Using trusted data to train deep networks on labels corrupted by severe noise Official (PyTorch)
2019 NeurIPS Are anchor points really indispensable in label-noise learning? Official (PyTorch)
2020 NeurIPS Dual T: Reducing estimation error for transition matrix in label-noise learning N/A
2021 AAAI Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model Official (PyTorch)
D.2. Loss Reweighting
Year Venue Title Implementation
2017 TNNLS Multiclass learning with partially corrupted labels Unofficial (PyTorch)
2017 NeurIPS Active Bias: Training more accurate neural networks by emphasizing high variance samples Unofficial (TensorFlow)
D.3. Label Refurbishment
Year Venue Title Implementation
2015 ICLR Training deep neural networks on noisy labels with bootstrapping Unofficial (Keras)
2018 ICML Dimensionality-driven learning with noisy labels Official (Keras)
2019 ICML Unsupervised label noise modeling and loss correction Official (PyTorch)
2020 NeurIPS Self-adaptive training: beyond empirical risk minimization Official (PyTorch)
2020 ICML Error-bounded correction of noisy labels Official (PyTorch)
2021 AAAI Beyond class-conditional assumption: A primary attempt to combat instance-dependent label noise Official (PyTorch)
D.4. Meta Learning
Year Venue Title Implementation
2017 NeurIPSW Learning to learn from weak supervision by full supervision Unofficial (TensorFlow)
2017 ICCV Learning from noisy labels with distillation N/A
2018 ICML Learning to reweight examples for robust deep learning Official (TensorFlow)
2019 NeurIPS Meta-Weight-Net: Learning an explicit mapping for sample weighting Official (PyTorch)
2020 CVPR Distilling effective supervision from severe label noise Official (TensorFlow)
2021 AAAI Meta label correction for noisy label learning Official (PyTorch)
2021 ICCV Adaptive Label Noise Cleaning with Meta-Supervision for Deep Face Recognition N/A

E. [Sample Selection]

E.1. Multi-network Learning
Year Venue Title Implementation
2017 NeurIPS Decoupling when to update from how to update Official (TensorFlow)
2018 ICML MentorNet: Learning data-driven curriculum for very deep neural networks on corrupted labels Official (TensorFlow)
2018 NeurIPS Co-teaching: Robust training of deep neural networks with extremely noisy labels Official (PyTorch)
2019 ICML How does disagreement help generalization against label corruption? Official (PyTorch)
2021 CVPR Jo-SRC: A Contrastive Approach for Combating Noisy Labels Official (PyTorch)
E.2. Single- or Multi-round Learning
Year Venue Title Implementation
2018 CVPR Iterative learning with open-set noisy labels Official (Keras)
2019 ICML Learning with bad training data via iterative trimmed loss minimization Official (GluonCV)
2019 ICML Understanding and utilizing deep neural networks trained with noisy labels Official (Keras)
2019 ICCV O2U-Net: A simple noisy label detection approach for deep neural networks Unofficial (PyTorch)
2020 ICMLW How does early stopping help generalization against label noise? Official (TensorFlow)
2020 NeurIPS A topological filter for learning with label noise Official (PyTorch)
2021 ICLR Learning with Instance-Dependent Label Noise: A Sample Sieve Approach Official (PyTorch)
2021 NeurIPS FINE Samples for Learning with Noisy Labels Official (PyTorch)
2022 ICLR Sample Selection with Uncertainty of Losses for Learning with Noisy Labels N/A
E.3. Hybrid Learning
Year Venue Title Implementation
2019 ICML SELFIE: Refurbishing unclean samples for robust deep learning Official (TensorFlow)
2020 ICLR SELF: Learning to filter noisy labels with self-ensembling N/A
2020 ICLR DivideMix: Learning with noisy labels as semi-supervised learning Official (PyTorch)
2021 ICLR Robust curriculum learning: from clean label detection to noisy label self-correction N/A
2021 NeurIPS Understanding and Improving Early Stopping for Learning with Noisy Labels Official (PyTorch)

Other Awesome Links

Acknowledgement

We appreciate the editors and reviewers who have provided insightful and constructive comments on this study. Thank you very much : ) We hope this repository can facilitate future studies on the IDAFD-UNL problem.
