West China Medical Publishers
Author search for "ZHAO Dechun": 4 results
  • Research on Parkinson’s disease recognition algorithm based on sample enhancement

    Patients with Parkinson’s disease suffer vocal cord damage at an early stage, and their voiceprint characteristics differ significantly from those of healthy individuals, so voiceprints can be used to identify the disease. However, available voiceprint datasets for Parkinson’s disease contain too few samples. This paper therefore proposes a double self-attention deep convolutional generative adversarial network for sample enhancement, which generates high-resolution spectrograms on which deep learning is then used to recognize Parkinson’s disease. The model improves the texture clarity of the generated samples by increasing network depth and by combining gradient penalty with spectral normalization, and a classification network based on the ConvNeXt family of pure convolutional neural networks with transfer learning is constructed to extract voiceprint features and classify them, improving recognition accuracy. Validation experiments were carried out on a Parkinson’s disease speech dataset. Compared with the results before sample enhancement, both the clarity of the generated samples and the Fréchet inception distance (FID) improved, and the proposed network achieved an accuracy of 98.8%. The results show that the recognition algorithm based on double self-attention deep convolutional generative adversarial network sample enhancement can accurately distinguish healthy individuals from Parkinson’s disease patients, helping to address the shortage of voiceprint samples for early recognition. In summary, the method effectively improves classification accuracy on a small-sample Parkinson’s disease speech dataset and provides a practical approach for early speech-based diagnosis of Parkinson’s disease.

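    The components named in this abstract can be illustrated with a short PyTorch sketch. This is not the authors' code: the self-attention layer, the WGAN-style gradient penalty, the use of spectral normalization, and the ConvNeXt-Tiny transfer-learning head are minimal stand-ins, and all layer sizes are illustrative assumptions.

```python
# Minimal sketch (assumed, not the published model) of the abstract's building blocks.
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm
from torchvision.models import convnext_tiny, ConvNeXt_Tiny_Weights

class SelfAttention2d(nn.Module):
    """Self-attention over the spatial positions of a feature map (SAGAN style)."""
    def __init__(self, channels: int):
        super().__init__()
        self.q = spectral_norm(nn.Conv2d(channels, channels // 8, 1))
        self.k = spectral_norm(nn.Conv2d(channels, channels // 8, 1))
        self.v = spectral_norm(nn.Conv2d(channels, channels, 1))
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)      # (b, hw, c//8)
        k = self.k(x).flatten(2)                      # (b, c//8, hw)
        v = self.v(x).flatten(2)                      # (b, c, hw)
        attn = torch.softmax(q @ k, dim=-1)           # attention over spatial positions
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x

def gradient_penalty(critic, real, fake):
    """WGAN-GP penalty on interpolates between real and generated spectrograms."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grads = torch.autograd.grad(critic(mixed).sum(), mixed, create_graph=True)[0]
    return ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()

# Transfer-learning classifier: ConvNeXt-Tiny pretrained on ImageNet, with the
# final linear layer replaced for binary (healthy vs. Parkinson) classification.
classifier = convnext_tiny(weights=ConvNeXt_Tiny_Weights.DEFAULT)
classifier.classifier[2] = nn.Linear(classifier.classifier[2].in_features, 2)
```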
  • Audiovisual emotion recognition based on a multi-head cross attention mechanism

    In audiovisual emotion recognition, representation learning is an area receiving considerable attention, and the key lies in constructing affective representations that capture both cross-modal consistency and modality-specific variability. Accurately realizing such representations, however, remains challenging. For this reason, this paper proposed a cross-modal audiovisual recognition model based on a multi-head cross-attention mechanism. The model achieved feature fusion and modality alignment through a multi-head cross-attention architecture and adopted a segmented training strategy to cope with missing modalities. In addition, a unimodal auxiliary loss task with shared parameters was designed to preserve the independent information of each modality. The model achieved macro and micro F1 scores of 84.5% and 88.2%, respectively, on the Crowd-Sourced Emotional Multimodal Actors Dataset (CREMA-D). The proposed model effectively captures intra- and inter-modal feature representations of the audio and video modalities and unifies unimodal and multimodal emotion recognition within a single framework, providing a new solution for audiovisual emotion recognition.

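    A minimal sketch of the kind of multi-head cross-attention fusion and unimodal auxiliary loss this abstract describes is given below. It is an assumed architecture, not the published model; the feature dimension, number of classes, and auxiliary loss weight are placeholders.

```python
# Sketch of cross-modal fusion with torch.nn.MultiheadAttention: audio tokens
# attend to video tokens and vice versa; a shared unimodal head adds an
# auxiliary loss that preserves modality-specific information.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim=256, heads=4, num_classes=6):
        super().__init__()
        self.a2v = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.v2a = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fused_head = nn.Linear(2 * dim, num_classes)
        self.uni_head = nn.Linear(dim, num_classes)  # shared unimodal auxiliary head

    def forward(self, audio, video):
        # audio: (batch, Ta, dim), video: (batch, Tv, dim)
        a_att, _ = self.a2v(audio, video, video)     # audio queries, video keys/values
        v_att, _ = self.v2a(video, audio, audio)     # video queries, audio keys/values
        fused = torch.cat([a_att.mean(1), v_att.mean(1)], dim=-1)
        return (self.fused_head(fused),
                self.uni_head(audio.mean(1)),
                self.uni_head(video.mean(1)))

model = CrossModalFusion()
audio, video = torch.randn(8, 50, 256), torch.randn(8, 30, 256)
labels = torch.randint(0, 6, (8,))
fused_logits, a_logits, v_logits = model(audio, video)
ce = nn.CrossEntropyLoss()
# Fused loss plus unimodal auxiliary losses (the 0.5 weight is illustrative).
loss = ce(fused_logits, labels) + 0.5 * (ce(a_logits, labels) + ce(v_logits, labels))
```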
  • Research on fault diagnosis of patient monitor based on text mining

    Conventional fault diagnosis of patient monitors relies heavily on manual experience, resulting in low diagnostic efficiency and poor utilization of fault-maintenance text data. To address these issues, this paper proposes an intelligent fault diagnosis method for patient monitors based on multi-feature text representation, an improved bidirectional gated recurrent unit (BiGRU), and an attention mechanism. Firstly, the fault text data were preprocessed, and word vectors containing multiple linguistic features were generated by a linguistically-motivated bidirectional encoder representation from Transformer. Then, bidirectional fault features were extracted and weighted by the improved BiGRU and the attention mechanism, respectively. Finally, a weighted loss function was used to reduce the impact of class imbalance on the model. The method was validated on a patient monitor fault dataset, on which it achieved a macro F1 value of 91.11%. The results show that the model can automatically classify fault text and may provide decision support for intelligent fault diagnosis of patient monitors in the future.

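    The BiGRU-plus-attention classifier with a class-weighted loss can be sketched as follows. This is a generic stand-in for the improved model described above, not the authors' implementation; the vocabulary size, dimensions, class count, and class weights are placeholder assumptions.

```python
# Sketch: BiGRU over token embeddings, additive attention over hidden states,
# and class-weighted cross-entropy to offset class imbalance.
import torch
import torch.nn as nn

class BiGRUAttnClassifier(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=128, hidden=64, num_classes=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.bigru = nn.GRU(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)        # additive attention score per token
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, token_ids):
        h, _ = self.bigru(self.embed(token_ids))    # (batch, seq, 2*hidden)
        weights = torch.softmax(self.attn(torch.tanh(h)), dim=1)
        context = (weights * h).sum(dim=1)          # attention-weighted text vector
        return self.fc(context)

# Weighted loss: rarer fault classes get larger weights (values are illustrative).
class_weights = torch.tensor([0.5, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0])
criterion = nn.CrossEntropyLoss(weight=class_weights)

model = BiGRUAttnClassifier()
logits = model(torch.randint(0, 5000, (4, 32)))     # batch of 4 fault descriptions
loss = criterion(logits, torch.randint(0, 8, (4,)))
```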
  • Medical text classification model integrating medical entity label semantics

    Automatic classification of medical questions, an intent recognition task, is of great significance for improving the quality and efficiency of online medical services. Joint entity and intent recognition performs better than single-task models, but most publicly available medical text intent recognition datasets lack entity annotation, and annotating these entities manually requires considerable time and labor. To solve this problem, this paper proposes a medical text classification model that integrates medical entity label semantics: bidirectional encoder representation based on transformer-recurrent convolutional neural network-entity-label-semantics (BRELS). The model first uses an adaptive fusion mechanism to absorb prior knowledge from medical entity labels, enhancing local features. Then, for global feature extraction, a lightweight recurrent convolutional neural network (LRCNN) suppresses parameter growth while preserving the original semantics of the text. Ablation and comparison experiments on three public medical text intent recognition datasets show that the model reaches F1 scores of 87.34%, 81.71%, and 77.74%, respectively. These results indicate that BRELS can effectively identify and understand medical terminology and thereby recognize users’ intentions, which can improve the quality and efficiency of online medical services.

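    The two mechanisms named in this abstract, adaptive fusion of entity-label semantics and a lightweight recurrent convolutional network, can be sketched generically as below. The sketch is not the BRELS implementation: it assumes pre-computed token and entity-label embeddings (the BERT-style encoder is omitted), and all shapes are placeholders.

```python
# Sketch: gated ("adaptive") fusion of token features with entity-label
# features, followed by an RCNN-style head (BiGRU context concatenated with
# the token features, projected, then max-pooled over the sequence).
import torch
import torch.nn as nn

class AdaptiveLabelFusion(nn.Module):
    """Gate between token features and entity-label semantic features."""
    def __init__(self, dim=256):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, token_feats, label_feats):
        # token_feats, label_feats: (batch, seq, dim)
        g = torch.sigmoid(self.gate(torch.cat([token_feats, label_feats], dim=-1)))
        return g * token_feats + (1 - g) * label_feats

class LightweightRCNN(nn.Module):
    """BiGRU context plus original features, projected and max-pooled."""
    def __init__(self, dim=256, hidden=128, num_classes=10):
        super().__init__()
        self.bigru = nn.GRU(dim, hidden, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(dim + 2 * hidden, hidden)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, x):
        context, _ = self.bigru(x)                       # (batch, seq, 2*hidden)
        mixed = torch.tanh(self.proj(torch.cat([x, context], dim=-1)))
        pooled = mixed.max(dim=1).values                 # max-pool over tokens
        return self.fc(pooled)

tokens = torch.randn(4, 40, 256)   # placeholder encoder output
labels = torch.randn(4, 40, 256)   # placeholder entity-label embeddings
logits = LightweightRCNN()(AdaptiveLabelFusion()(tokens, labels))
```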
