Signal classification is a key component of brain-computer interfaces (BCIs). In this paper, we present a new method for classifying electroencephalogram (EEG) signals with heterogeneous features, called wrapped elastic net feature selection and classification. First, we jointly applied time-domain statistics, power spectral density (PSD), common spatial pattern (CSP) and an autoregressive (AR) model to extract high-dimensional fused features from the preprocessed EEG signals. We then used a wrapper method for feature selection: we fitted a logistic regression model penalized with the elastic net on the training data, obtained the parameter estimates by coordinate descent, and selected the best feature subset using 10-fold cross-validation. Finally, we classified the test samples using the trained model. The experimental data were EEG data from the international BCI Competition Ⅳ. The results showed that the proposed method is suitable for selecting features from high-dimensional fused feature sets. For identifying EEG signals, it is more effective and faster, and it can single out a more relevant subset to obtain a relatively simple model. The average test accuracy reached 81.78%.
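The elastic net step in the abstract above can be illustrated with a minimal sketch. To keep it short, it swaps the logistic loss for a squared loss, so it is not the authors' model; it only shows how coordinate descent with soft-thresholding drives the coefficients of weak features to exactly zero. The function names, toy data, and the values of `lam` and `alpha` are all assumptions, and the 10-fold cross-validation used to choose the penalty is omitted.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: shrink z toward zero by gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_cd(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for elastic-net penalized least squares.

    Minimizes (1/2n) * sum((y - X b)^2) + lam * (alpha * |b|_1 + (1 - alpha)/2 * |b|_2^2).
    Assumes centered data, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the partial residual (feature j's own effect added back).
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Hypothetical toy data: the target depends only on the first feature.
X = [[1.0, 0.1], [-1.0, 0.2], [2.0, -0.1], [-2.0, -0.2], [1.5, 0.05], [-1.5, -0.05]]
y = [2.0 * row[0] for row in X]
beta = elastic_net_cd(X, y, lam=0.5, alpha=0.9)
print(beta)  # the second coefficient is driven to exactly zero
```

In a wrapper setting, the features whose coefficients remain nonzero form the selected subset, and the penalty strength would be chosen by cross-validation rather than fixed by hand as it is here.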
Existing emotion recognition research is typically limited to static laboratory settings and has not fully handled the changes in emotional states that occur in dynamic scenarios. To address this problem, this paper proposes a method for dynamic continuous emotion recognition based on electroencephalography (EEG) and eye movement signals. First, an experimental paradigm was designed to cover six dynamic emotion transition scenarios: happy to calm, calm to happy, sad to calm, calm to sad, nervous to calm, and calm to nervous. EEG and eye movement data were collected simultaneously from 20 subjects to fill the gap in current multimodal dynamic continuous emotion datasets. In the valence-arousal two-dimensional space, emotion ratings for the stimulus videos were given every five seconds on a scale of 1 to 9, and the dynamic continuous emotion labels were normalized. Subsequently, frequency band features were extracted from the preprocessed EEG and eye movement data. A cascade feature fusion approach was used to combine the EEG and eye movement features into an information-rich multimodal feature vector. This feature vector was input into four regression models (support vector regression with a radial basis function kernel, decision tree, random forest, and K-nearest neighbors) to develop the dynamic continuous emotion recognition model. The results showed that the proposed method achieved the lowest mean square error for valence and arousal across the six dynamic continuous emotions. This approach can accurately recognize various emotion transitions in dynamic situations and offers higher accuracy and robustness than using either EEG or eye movement signals alone, making it well suited for practical applications.
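The label normalization, cascade fusion, and mean square error described in the abstract above can be sketched as follows. The abstract does not say how the 1 to 9 ratings were normalized, so min-max scaling to [0, 1] is an assumption, as are the function names and the toy values; cascade fusion is taken to mean simple concatenation of the two feature vectors.

```python
def normalize_rating(rating, low=1.0, high=9.0):
    """Min-max scale a rating on the 1-9 scale to [0, 1] (assumed normalization)."""
    return (rating - low) / (high - low)


def cascade_fuse(eeg_features, eye_features):
    """Cascade fusion, taken here as concatenation of the two feature vectors."""
    return list(eeg_features) + list(eye_features)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between labels and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical ratings given at three consecutive five-second steps.
ratings = [1, 5, 9]
labels = [normalize_rating(r) for r in ratings]

# Hypothetical per-window features from each modality.
fused = cascade_fuse([0.2, 0.4], [0.1])

# Hypothetical regression outputs for the same three windows.
predictions = [0.0, 0.5, 0.5]
mse = mean_squared_error(labels, predictions)
print(labels, fused, mse)
```

Any of the four regressors named in the abstract would take vectors like `fused` as input and be scored with an error measure like `mse`.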
Automatic generation of medical image reports faces various challenges, such as the diversity of disease types and a lack of professionalism and fluency in the generated descriptions. To address these issues, this paper proposes a memory-driven method for multimodal medical imaging report generation (mMIRmd). First, a hierarchical vision transformer using shifted windows (Swin-Transformer) is used to extract multi-perspective visual features from patient medical images, and semantic features of the textual medical history are extracted using bidirectional encoder representations from transformers (BERT). Subsequently, the visual and semantic features are integrated to enhance the model's ability to recognize different disease types. Furthermore, a word vector dictionary pre-trained on medical text is employed to encode the labels of the visual features, thereby enhancing the professionalism of the generated reports. Finally, a memory-driven module is introduced in the decoder to address long-distance dependencies in medical image data. The method is validated on the chest X-ray dataset collected at Indiana University (IU X-Ray) and the medical information mart for intensive care chest X-ray dataset (MIMIC-CXR) released by the Massachusetts Institute of Technology and Massachusetts General Hospital. Experimental results indicate that the proposed method can better focus on the affected areas, improve the accuracy and fluency of report generation, and assist radiologists in quickly completing medical image reports.
The effective classification of multi-task motor imagery electroencephalogram (EEG) signals helps achieve accurate multi-dimensional human-computer interaction, and exploiting the high frequency-domain specificity between subjects can improve classification accuracy and robustness. Therefore, this paper proposes a multi-task EEG signal classification method based on adaptive time-frequency common spatial pattern (CSP) combined with a convolutional neural network (CNN). The characteristics of each subject's personalized rhythm were extracted by adaptive spectrum awareness, the spatial characteristics were calculated using one-versus-rest CSP, and the composite time-domain characteristics were then characterized to construct spatial-temporal-frequency multi-level fusion features. Finally, the CNN was used to perform high-precision, highly robust four-task classification. The algorithm was verified on a self-test dataset containing 10 subjects (33 ± 3 years old, inexperienced) and on the dataset of the 4th 2018 Brain-Computer Interface Competition (BCI competition Ⅳ-2a). The average accuracy of the proposed algorithm for four-task classification reached 93.96% and 84.04%, respectively. Compared with other advanced algorithms, the average classification accuracy of the proposed algorithm was significantly improved, and the range of accuracy across subjects was significantly reduced on the public dataset. The results show that the proposed algorithm performs well in multi-task classification and can effectively improve classification accuracy and robustness.
In lower limb rehabilitation training, fatigue estimation is of great significance for improving the accuracy of intention recognition and avoiding secondary injury. However, most existing methods consider only surface electromyography (sEMG) features and ignore electrocardiogram (ECG) features when estimating fatigue, which leads to low and unstable recognition performance. To address this problem, a method that uses fused features of ECG and sEMG signals to estimate fatigue during lower limb rehabilitation was proposed, together with an improved particle swarm optimization-support vector machine classifier (improved PSO-SVM) that was used to classify the fused feature vector. The three states of relaxation, transition and fatigue were recognized accurately, with recognition rates of 98.5%, 93.5%, and 95.5%, respectively. Comparative experiments showed that the average recognition rate of this method was 4.50% higher than that of sEMG features alone, and 13.66% higher than that of the combined ECG and sEMG features without feature fusion. These results show that fusing the features of ECG and sEMG signals during lower limb rehabilitation training allows fatigue to be recognized more accurately.
Speech feature learning is at the core of speech-based recognition of mental illness. Deep feature learning can extract speech features automatically, but it is limited by the small-sample problem. Traditional feature extraction (original features) avoids the impact of small samples, but it relies heavily on experience and adapts poorly. To solve this problem, this paper proposes a deep embedded hybrid feature sparse stack autoencoder manifold ensemble algorithm. First, psychotic speech features are extracted based on prior knowledge to construct the original features. Second, the original features are embedded in the sparse stack autoencoder (deep network), and the output of the hidden layer is filtered to enhance the complementarity between the deep features and the original features. Third, an L1 regularization feature selection mechanism is designed to compress the dimensionality of the mixed feature set composed of deep features and original features. Finally, a weighted locality preserving projection algorithm and an ensemble learning mechanism are designed, and a manifold projection classifier ensemble model is constructed, which further improves the classification stability of feature fusion under small samples. In addition, this paper designs, for the first time, a medium-to-large-scale psychotic speech collection program, and collects and constructs a large-scale Chinese psychotic speech database for verifying psychotic speech recognition algorithms. The experimental results show that the main innovations of the algorithm are effective and that its classification accuracy is better than that of other representative algorithms, with a maximum improvement of 3.3%. In conclusion, this paper proposes a new method of psychotic speech recognition based on an embedded mixed sparse stack autoencoder and manifold ensemble, which effectively improves the recognition rate of psychotic speech.
Objective To propose a heart sound segmentation method based on a multi-feature fusion network. Methods Data were obtained from the CinC/PhysioNet 2016 Challenge dataset (a total of 3,153 recordings from 764 patients, about 91.93% of whom were male, with an average age of 30.36 years). First, features were extracted in the time domain and the time-frequency domain respectively, and redundant features were removed by feature dimensionality reduction. Then, feature selection was used to pick the best-performing features separately from each of the two feature spaces. Next, multi-feature fusion was completed through multi-scale dilated convolution, cooperative fusion, and a channel attention mechanism. Finally, the fused features were fed into a bidirectional gated recurrent unit (BiGRU) network to obtain the heart sound segmentation results. Results The proposed method achieved precision, recall and F1 score of 96.70%, 96.99%, and 96.84%, respectively. Conclusion The multi-feature fusion network proposed in this study achieves better heart sound segmentation performance and can provide high-accuracy segmentation support for the design of automatic heart disease analysis based on heart sounds.
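As a quick consistency check on the results in the abstract above, the F1 score is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision and recall; the function name is an assumption, and the abstract does not say whether the authors averaged per-recording scores or pooled all segments.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


reported_precision = 96.70  # percent, from the abstract
reported_recall = 96.99     # percent, from the abstract
f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 96.84%"
```

The recomputed value agrees with the reported F1 score of 96.84%.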
As the most common active brain-computer interaction paradigm, the motor imagery brain-computer interface (MI-BCI) suffers from the bottlenecks of a small instruction set and low accuracy, which severely limit its information transfer rate (ITR) and practical application. In this study, we designed six classes of imagined actions, collected electroencephalogram (EEG) signals from 19 subjects, and studied the effect of the collaborative brain-computer interface (cBCI) collaboration strategy on MI-BCI classification performance. We also compared the effects of different group sizes and fusion strategies on group multi-class classification performance. The results showed that the most suitable group size was 4 people and that the best fusion strategy was decision fusion. Under this condition, the group classification accuracy reached 77.31%, which was higher than that of the feature fusion strategy at the same group size (77.31% vs. 56.34%) and significantly higher than the average single-user accuracy (77.31% vs. 44.90%). This study shows that the cBCI collaboration strategy can effectively improve MI-BCI classification performance, which lays the foundation for MI-cBCI research and its future application.
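The decision fusion strategy mentioned in the abstract above can be sketched as follows. The abstract does not state the fusion rule, so averaging the class-probability vectors of the group members is an assumption, as are the function name and the toy probabilities. Feature fusion, by contrast, would combine the members' features before a single classifier is applied.

```python
def decision_fusion(member_probs):
    """Average per-member class probabilities and return (winning class, fused vector)."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    fused = [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]
    winner = max(range(n_classes), key=lambda c: fused[c])
    return winner, fused


# Hypothetical outputs of four single-user classifiers for one six-class trial.
# The second member alone would pick class 1; the others favor class 0.
group = [
    [0.40, 0.20, 0.10, 0.10, 0.10, 0.10],
    [0.15, 0.45, 0.10, 0.10, 0.10, 0.10],
    [0.35, 0.25, 0.10, 0.10, 0.10, 0.10],
    [0.30, 0.10, 0.25, 0.15, 0.10, 0.10],
]
winner, fused = decision_fusion(group)
print(winner, fused)
```

In this toy trial the group decision is class 0 even though one member disagrees, which illustrates how pooling decisions can outperform the average single user.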
Colorectal polyps are important early markers of colorectal cancer, and their early detection is crucial for cancer prevention. Although existing polyp segmentation models have achieved certain results, they still face challenges such as diverse polyp morphology, blurred boundaries, and insufficient feature extraction. To address these issues, this study proposes a parallel coordinate fusion network (PCFNet), aiming to improve the accuracy and robustness of polyp segmentation. PCFNet integrates parallel convolutional modules and a coordinate attention mechanism, enabling the preservation of global feature information while precisely capturing detailed features, thereby effectively segmenting polyps with complex boundaries. Experimental results on Kvasir-SEG and CVC-ClinicDB demonstrate the outstanding performance of PCFNet across multiple metrics. Specifically, on the Kvasir-SEG dataset, PCFNet achieved an F1-score of 0.8974 and a mean intersection over union (mIoU) of 0.8358; on the CVC-ClinicDB dataset, it attained an F1-score of 0.9398 and an mIoU of 0.8923. Compared with other methods, PCFNet shows significant improvements across all performance metrics, particularly in multi-scale feature fusion and spatial information capture, demonstrating its innovativeness. The proposed method provides a more reliable AI-assisted diagnostic tool for early colorectal cancer screening.
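The two segmentation metrics reported in the abstract above can be computed from pixel counts, as in the minimal sketch below. For a binary mask, the pixel-wise F1-score is the same quantity as the Dice coefficient. The function name and the toy masks are assumptions, and the abstract does not say whether mIoU was averaged over images or over classes.

```python
def dice_and_iou(pred, truth):
    """Pixel-wise F1 (Dice) and IoU for two binary masks given as lists of rows."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # predicted polyp, truly polyp
            elif p and not t:
                fp += 1      # predicted polyp, truly background
            elif t and not p:
                fn += 1      # missed polyp pixel
    dice_denominator = 2 * tp + fp + fn
    iou_denominator = tp + fp + fn
    # Two empty masks are treated as a perfect match.
    dice = 2 * tp / dice_denominator if dice_denominator else 1.0
    iou = tp / iou_denominator if iou_denominator else 1.0
    return dice, iou


# Hypothetical 3x4 ground-truth and predicted masks.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
dice, iou = dice_and_iou(pred, truth)
print(dice, iou)
```

For a single mask the two metrics are tied by Dice = 2 · IoU / (1 + IoU), but this identity need not hold for values averaged over a dataset, so the reported F1-score and mIoU cannot be derived from each other.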
Assessing the emotional state induced by music may provide theoretical support for assisted music therapy. The key to assessing the emotional state is feature extraction from the emotional electroencephalogram (EEG). In this paper, we study how to optimize the performance of the feature extraction algorithm. The public multimodal database for emotion analysis using physiological signals (DEAP) proposed by Koelstra et al. was used. Data for eight kinds of positive and negative emotions were extracted from the dataset, using fourteen channels from different brain regions. The δ, θ, α and β rhythms were extracted by wavelet transform. This paper analyzed and compared the performance of three kinds of EEG features for emotion classification, namely wavelet features (wavelet coefficient energy and wavelet entropy), approximate entropy and the Hurst exponent. On this basis, an EEG feature fusion algorithm based on principal component analysis (PCA) was proposed. The principal components with a cumulative contribution rate of more than 85% were retained, and the parameters that varied greatly in characteristic root (eigenvalue) were selected. A support vector machine was used to assess the emotional state. The results showed that the average accuracy rates of emotion classification with wavelet features, approximate entropy and the Hurst exponent were 73.15%, 50.00% and 45.54%, respectively. When these three kinds of features were combined, the features fused with PCA achieved an accuracy of about 85%. The classification accuracy obtained with the proposed PCA-based fusion algorithm was at least 12% higher than that obtained with any single feature, which provides assistance for emotional EEG feature extraction and music therapy.
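The retention rule in the abstract above, keeping principal components until their cumulative contribution rate exceeds 85%, can be sketched as follows. The eigenvalues and the function name are hypothetical. In practice the eigenvalues would come from the covariance matrix of the combined wavelet, approximate entropy and Hurst exponent features.

```python
def components_for_contribution(eigenvalues, threshold=0.85):
    """Return (k, rate): the smallest number of leading components whose
    cumulative contribution rate exceeds the threshold, and that rate."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total > threshold:
            return k, cumulative / total
    return len(ordered), 1.0


# Hypothetical eigenvalues of the feature covariance matrix.
eigenvalues = [4.0, 2.5, 1.5, 1.0, 0.6, 0.4]
k, rate = components_for_contribution(eigenvalues)
print(k, rate)
```

With these toy eigenvalues the first three components explain 80% of the variance, so a fourth is needed to pass the 85% threshold.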