Exploring Sequential Feature Selection in Deep Bi-LSTM Models for Speech Emotion Recognition
1
Authors: Fatma Harby, Mansor Alohali, Adel Thaljaoui, Amira Samy Talaat. Computers, Materials & Continua (SCIE, EI), 2024, No. 2, pp. 2689-2719 (31 pages)
Machine Learning (ML) algorithms play a pivotal role in Speech Emotion Recognition (SER), although they encounter a formidable obstacle in accurately discerning a speaker's emotional state. The examination of the emotional states of speakers holds significant importance in a range of real-time applications, including but not limited to virtual reality, human-robot interaction, emergency centers, and human behavior assessment. Accurately identifying emotions in the SER process relies on extracting relevant information from audio inputs. Previous studies on SER have predominantly utilized short-time characteristics such as Mel Frequency Cepstral Coefficients (MFCCs) due to their ability to capture the periodic nature of audio signals effectively. Although these traits may improve the ability to perceive and interpret emotional depictions appropriately, MFCCs have some limitations. This study therefore aims to tackle this issue by systematically picking multiple audio cues, enhancing the classifier model's efficacy in accurately discerning human emotions. The dataset is taken from the EMO-DB database. Input speech is preprocessed using a 2D Convolutional Neural Network (CNN), which applies convolutional operations to spectrograms, as they afford a visual representation of how the frequency content of the audio signal changes over time. The next step is spectrogram data normalization, which is crucial for Neural Network (NN) training as it aids faster convergence. Then the five auditory features MFCCs, Chroma, Mel-Spectrogram, Contrast, and Tonnetz are extracted from the spectrogram sequentially. The aim of feature selection is to retain only dominant features by excluding irrelevant ones. In this paper, the Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS) techniques were employed for feature selection across the multiple audio cues. Finally, the feature sets composed by the hybrid feature extraction methods are fed into the deep Bidirectional Long Short-Term Memory (Bi-LSTM) network to discern emotions. Since the deep Bi-LSTM can hierarchically learn complex features and increases model capacity by achieving more robust temporal modeling, it is more effective than a shallow Bi-LSTM in capturing the intricate tones of emotional content in speech signals. The effectiveness and resilience of the proposed SER model were evaluated experimentally by comparing it to state-of-the-art SER techniques. The results indicated that the model achieved accuracy rates of 90.92%, 93%, and 92% on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Berlin Database of Emotional Speech (EMO-DB), and Interactive Emotional Dyadic Motion Capture (IEMOCAP) datasets, respectively. These findings signify a prominent enhancement in the ability to identify emotional depictions in speech, showcasing the potential of the proposed model in advancing the SER field.
Keywords: Artificial intelligence application; multi features sequential selection; speech emotion recognition; deep Bi-LSTM
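The Sequential Forward Selection step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the feature labels, and the toy scoring function (with its utility values and redundancy penalty) are assumptions made purely for illustration. In practice the score would be something like validation accuracy of a classifier trained on the candidate subset.

```python
from typing import Callable, FrozenSet, List, Sequence


def sequential_forward_selection(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    k: int,
) -> List[str]:
    """Greedily add the feature that yields the best score until k are chosen."""
    selected: List[str] = []
    remaining = list(features)
    while remaining and len(selected) < k:
        # Evaluate every candidate extension of the current subset.
        best = max(remaining, key=lambda f: score(frozenset(selected + [f])))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical per-feature utilities; not taken from the paper.
UTILITY = {"mfcc": 0.50, "mel": 0.45, "chroma": 0.20, "contrast": 0.15, "tonnetz": 0.05}


def toy_score(subset: FrozenSet[str]) -> float:
    """Toy stand-in for validation accuracy of a model trained on `subset`."""
    total = sum(UTILITY[f] for f in subset)
    if {"mfcc", "mel"} <= subset:
        total -= 0.40  # assumed redundancy penalty between two similar features
    return total


chosen = sequential_forward_selection(list(UTILITY), toy_score, k=3)
print(chosen)
```

Sequential Backward Selection follows the same pattern in reverse: start from the full set and repeatedly drop the feature whose removal hurts the score least.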
Pedestrian Attributes Recognition in Surveillance Scenarios with Hierarchical Multi-Task CNN Models (cited by: 2)
2
Authors: Wenhua Fang, Jun Chen, Ruimin Hu. China Communications (SCIE, CSCD), 2018, No. 12, pp. 208-219 (12 pages)
Pedestrian attributes recognition is a very important problem in video surveillance and video forensics. Traditional methods assume the pedestrian attributes are independent and design handcrafted features for each one. In this paper, we propose a joint hierarchical multi-task learning algorithm to learn the relationships among attributes for better recognizing the pedestrian attributes in still images using convolutional neural networks (CNN). We divide the attributes into local and global ones according to spatial and semantic relations, and then consider learning semantic attributes through a hierarchical multi-task CNN model where each CNN in the first layer predicts one group of local attributes and the CNN in the second layer predicts the global attributes. Our multi-task learning framework allows each CNN model to simultaneously share visual knowledge among different groups of attribute categories. Extensive experiments are conducted on two popular and challenging benchmarks in surveillance scenarios, namely the PETA and RAP pedestrian attributes datasets. On both benchmarks, our framework achieves superior results over the state-of-the-art methods, with 88.2% on PETA and 83.25% on RAP, respectively.
Keywords: attributes recognition; CNN; multi-task learning
STUDY ON THE COAL-ROCK INTERFACE RECOGNITION METHOD BASED ON MULTI-SENSOR DATA FUSION TECHNIQUE (cited by: 7)
3
Authors: Ren Fang, Yang Zhaojian, Xiong Shibo (Research Institute of Mechano-Electronic Engineering, Taiyuan University of Technology, Taiyuan 030024, China). Chinese Journal of Mechanical Engineering (SCIE, EI, CAS, CSCD), 2003, No. 3, pp. 321-324 (4 pages)
A coal-rock interface recognition method based on the multi-sensor data fusion technique is put forward because of the limitations of recognition methods that use a single type of sensor. The measuring theory based on the multi-sensor data fusion technique is analyzed, and a test platform for the recognition system is built accordingly. The advantage of data fusion with the fuzzy neural network (FNN) technique is examined. A two-level FNN is constructed and data fusion is carried out. The experiments show that in various conditions the method can always acquire a much higher recognition rate than normal ones.
Keywords: Coal-rock interface recognition (CIR); data fusion (DF); multi-sensor
Multi-modal Gesture Recognition using Integrated Model of Motion, Audio and Video (cited by: 3)
4
Authors: GOUTSU Yusuke, KOBAYASHI Takaki, OBARA Junya, KUSAJIMA Ikuo, TAKEICHI Kazunari, TAKANO Wataru, NAKAMURA Yoshihiko. Chinese Journal of Mechanical Engineering (SCIE, EI, CAS, CSCD), 2015, No. 4, pp. 657-665 (9 pages)
Gesture recognition is used in many practical applications such as human-robot interaction, medical rehabilitation and sign language. With increasing motion sensor development, multiple data sources have become available, which leads to the rise of multi-modal gesture recognition. Since our previous approach to gesture recognition depends on a unimodal system, it is difficult to classify similar motion patterns. In order to solve this problem, a novel approach which integrates motion, audio and video models is proposed, using a dataset captured by Kinect. The proposed system can recognize observed gestures by using three models. The recognition results of the three models are integrated by the proposed framework and the output becomes the final result. The motion and audio models are learned using Hidden Markov Models. Random Forest, which is the video classifier, is used to learn the video model. In the experiments to test the performance of the proposed system, the motion and audio models most suitable for gesture recognition are chosen by varying feature vectors and learning methods. Additionally, the unimodal and multi-modal models are compared with respect to recognition accuracy. All the experiments are conducted on the dataset provided by the competition organizer of MMGRC, which is a workshop for the Multi-Modal Gesture Recognition Challenge. The comparison results show that the multi-modal model composed of three models scores the highest recognition rate. This improvement in recognition accuracy means that the complementary relationship among the three models improves the accuracy of gesture recognition. The proposed system provides the application technology to understand human actions of daily life more precisely.
Keywords: gesture recognition; multi-modal integration; hidden Markov model; random forests
Static Digits Recognition Using Rotational Signatures and Hu Moments with a Multilayer Perceptron (cited by: 1)
5
Authors: Francisco Solís, Margarita Hernández, Amelia Pérez, Carina Toxqui. Engineering (Scientific Research), 2014, No. 11, pp. 692-698 (7 pages)
This paper presents two systems for recognizing static signs (digits) from American Sign Language (ASL). These systems avoid the use of color marks or gloves, using instead low-pass and high-pass filters in the space and frequency domains, and color space transformations. The first system used rotational signatures based on a correlation operator; minimum distance was used for the classification task. The second system computed the seven Hu invariants from binary images; these descriptors were fed to a Multi-Layer Perceptron (MLP) in order to recognize the 9 different classes. The first system achieves a 100% recognition rate with leave-one-out validation, and the second achieves a 96.7% recognition rate with Hu moments and 100% using 36 normalized moments and k-fold cross validation.
Keywords: Sign language recognition; rotational signatures; Hu moments; multi-layer perceptron
Mood States Recognition of Rowing Athletes Based on Multi-Physiological Signals Using PSO-SVM
6
Authors: Jing Wang, Pei Lei, Kun Wang, Lijuan Mao, Xinyu Chai. E-Health Telecommunication Systems and Networks, 2014, No. 2, pp. 9-17 (9 pages)
Athletes have various emotions before competition, and mood states have an impact on the competition results. Recognition of athletes' mood states could help athletes to adjust better before competition, which is significant for competition achievements. In this paper, physiological signals of female rowing athletes pre- and post-competition were collected. Based on the multi-physiological signals related to pre- and post-competition, such as heart rate and respiration rate, features were extracted after subtracting the emotion baseline. Then particle swarm optimization (PSO) was adopted to optimize the feature selection from the feature set, and combined with the least squares support vector machine (LS-SVM) classifier. Positive mood states and negative mood states were classified by the LS-SVM with PSO feature optimization. The results showed that the classification accuracy of the LS-SVM algorithm combined with PSO and baseline subtraction was better than that without baseline subtraction. The combination can contribute to good classification of mood states of rowing athletes, and would be informative for psychological adjustment of athletes.
Keywords: Affective computing; mood states recognition; multi-physiological signals; PSO; SVM
OMP-BASED MULTI-BAND SIGNAL RECONSTRUCTION FOR ECOLOGICAL SOUNDS RECOGNITION
7
Authors: Ouyang Zhen, Li Ying. Journal of Electronics (China), 2014, No. 1, pp. 50-60 (11 pages)
The paper proposes a new method of multi-band signal reconstruction based on Orthogonal Matching Pursuit (OMP), which aims to develop a robust Ecological Sounds Recognition (ESR) system. Firstly, OMP is employed to sparsely decompose the original signal, so that the highly correlated components are retained for reconstruction in the first stage. Then, according to the frequency distribution of both foreground sound and background noise, the signal can be compensated by the residual components in the second stage. Via the two-stage reconstruction, highly non-stationary noises are effectively reduced, and the reconstruction precision of the foreground sound is improved. At the recognition stage, we employ deep belief networks to model the composite feature sets extracted from the reconstructed signal. The experimental results show that the proposed approach achieved superior recognition performance on 60 classes of ecological sounds in different environments under different Signal-to-Noise Ratios (SNR), compared with the existing method.
Keywords: Ecological Sounds Recognition (ESR); multi-band reconstruction; Orthogonal Matching Pursuit (OMP); sparse decomposition; deep belief networks
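The sparse decomposition step named in the abstract above relies on Orthogonal Matching Pursuit. The sketch below shows only the generic greedy OMP loop, not the paper's two-stage multi-band scheme. It assumes unit-norm dictionary atoms and keeps the residual orthogonal to the selected atoms via Gram-Schmidt, which is equivalent to re-solving the least-squares fit at each step. The function names and the toy dictionary are illustrative assumptions.

```python
import math
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def omp_support(
    dictionary: List[Vector], signal: Vector, n_atoms: int, tol: float = 1e-9
) -> Tuple[List[int], Vector]:
    """Return indices of selected atoms and the final residual.

    Atoms are assumed to have unit norm (an assumption of this sketch).
    """
    residual = list(signal)
    support: List[int] = []
    basis: List[Vector] = []  # orthonormal basis of the span of selected atoms
    for _ in range(n_atoms):
        if math.sqrt(dot(residual, residual)) <= tol:
            break
        candidates = [j for j in range(len(dictionary)) if j not in support]
        if not candidates:
            break
        # Pick the atom most correlated with the current residual.
        best = max(candidates, key=lambda j: abs(dot(dictionary[j], residual)))
        # Orthogonalise the new atom against the atoms chosen so far.
        q = list(dictionary[best])
        for b in basis:
            c = dot(q, b)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        norm = math.sqrt(dot(q, q))
        if norm <= tol:
            break  # atom is linearly dependent on the current support
        q = [qi / norm for qi in q]
        basis.append(q)
        support.append(best)
        # Remove the new direction from the residual.
        c = dot(residual, q)
        residual = [ri - c * qi for ri, qi in zip(residual, q)]
    return support, residual


s = 1.0 / math.sqrt(2.0)
toy_dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0]]
toy_signal = [3.0, 0.0, 1.0]  # 3 * atom 0 + 1 * atom 2
support, residual = omp_support(toy_dictionary, toy_signal, n_atoms=2)
print(support, residual)
```

In the abstract's terms, the selected atoms play the role of the retained highly correlated components, and the residual is what the second stage would draw on for compensation.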
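Multi-Agent Combat Simulation Based on the Recognition-Primed Decision Model (in English) (cited by: 3)
8
Authors: 孟庆操, 赵晓哲, 姜伟. Journal of System Simulation (《系统仿真学报》; CAS, CSCD, PKU Core), 2011, No. 2, pp. 294-299 (6 pages)
Applying naturalistic decision-making methods to multi-agent cooperative decision-making is a hot topic in artificial intelligence research, and cooperative decision-making in combat simulation is an important application area for it. To build a cooperative decision-making model for combat simulation, fuzzy sets and fuzzy rules are introduced into the RPD model proposed by Klein to handle uncertain information in the battlefield environment, and a multi-agent architecture for combat simulation based on the RPD model is established. Simulation results show that force agents with the RPD model at their core can react autonomously to the battlefield environment and can make cooperative decisions to coordinate and unify team behavior.
Keywords: multi-agent system; combat simulation; naturalistic decision-making method; RPD model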
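A Collaborative Recognition-Primed Decision Model Applied in Agent-Based Combat Simulation (in English)
9
Authors: 赵晓哲, 姜伟, 史红权, 王超. Journal of System Simulation (《系统仿真学报》; CAS, CSCD, PKU Core), 2009, No. 6, pp. 1615-1619, 1627 (6 pages)
At present, research on team cognition, naturalistic decision-making methods, and collaboration theory is a hot topic in artificial intelligence, yet much work remains in applying naturalistic decision-making methods to multi-agent cooperative decision-making. The aim of this study is to build a cooperative decision-making model for combat simulation. Building on a modification of Klein's RPD model, a collaborative SRPD model is proposed; it supports unified situation awareness in a multi-agent system and simplifies and distills that awareness to serve multi-agent cooperative decision-making. The model is introduced into a multi-agent combat simulation system, and a multi-agent architecture based on the collaborative SRPD model is established. Experiments show that force agents with the collaborative SRPD model at their core can react autonomously to the battlefield environment and can make cooperative decisions to coordinate and unify team behavior.
Keywords: complex adaptive system; multi-agent system; naturalistic decision-making method; recognition-primed decision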
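Vehicle Lane-Changing Behavior Recognition Model Based on Multi-class SVM (cited by: 14)
10
Authors: 陈亮, 冯延超, 李巧茹. Journal of Safety and Environment (《安全与环境学报》; CAS, CSCD, PKU Core), 2020, No. 1, pp. 193-199 (7 pages)
Automatic safe lane changing is key to driverless vehicles. To accurately identify the lane-changing state of moving vehicles and ensure driving safety, a vehicle lane-changing recognition model based on a multi-class support vector machine (Multi-class SVM) is designed. Vehicle trajectory data for US Highway 101 are selected from the NGSIM dataset and classified, and the lane-changing process is divided into a car-following stage, a lane-change preparation stage, and a lane-change execution stage. Grid search combined with particle swarm optimization (Grid Search-PSO) is used to tune the penalty parameter C and the kernel parameter g of the SVM model. The multi-class SVM lane-changing recognition model is trained and tested on the sample data, and its test accuracy reaches 97.68%. The study shows that the model identifies the behavioral state of vehicles during lane changing well, supporting research on the lane-changing stages.
Keywords: safety engineering; multi-class support vector machine; NGSIM data; vehicle lane-changing recognition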
Exploring Latent Semantic Information for Textual Emotion Recognition in Blog Articles (cited by: 3)
11
Authors: Xin Kang, Fuji Ren, Yunong Wu. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2018, No. 1, pp. 204-216 (13 pages)
Understanding people's emotions through natural language is a challenging task for intelligent systems based on the Internet of Things (IoT). The major difficulty is caused by the lack of basic knowledge in emotion expressions with respect to a variety of real world contexts. In this paper, we propose a Bayesian inference method to explore the latent semantic dimensions as contextual information in natural language and to learn the knowledge of emotion expressions based on these semantic dimensions. Our method synchronously infers the latent semantic dimensions as topics in words and predicts the emotion labels in both word-level and document-level texts. The Bayesian inference results enable us to visualize the connection between words and emotions with respect to different semantic dimensions. By further incorporating a corpus-level hierarchy in the document emotion distribution assumption, we can balance the document emotion recognition results and achieve even better word and document emotion predictions. Our experiments on word-level and document-level emotion prediction, based on a well-developed Chinese emotion corpus, Ren-CECps, yield both higher accuracy and better robustness than state-of-the-art emotion prediction algorithms.
Keywords: Bayesian inference; emotion-topic model; emotion recognition; multi-label classification; natural language understanding
Face Recognition on Partial and Holistic LBP Features (cited by: 2)
12
Authors: Xiao-Rong Pu, Yi Zhou, and Rui-Yi Zhou (School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China). Journal of Electronic Science and Technology (CAS), 2012, No. 1, pp. 56-60 (5 pages)
An algorithm for face description and recognition based on multi-resolution with multi-scale local binary pattern (multi-LBP) features is proposed. The facial image pyramid is constructed and each facial image is divided into various regions from which partial and holistic local binary pattern (LBP) histograms are extracted. All LBP features of each image are concatenated into a single LBP eigenvector with different resolutions. The dimensionality of the LBP features is then reduced by a local margin alignment (LMA) algorithm based on manifold, which can preserve the between-class variance. A support vector machine (SVM) is applied to classify facial images. Extensive experiments on the ORL and CMU face databases clearly show the superiority of the proposed scheme over some existing algorithms, especially in the robustness of the method against different facial expressions and postures of the subjects.
Keywords: Face recognition; local binary pattern operator; multi-resolution with multi-scale local binary pattern; local margin alignment; dimensionality reduction
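The regional LBP histograms described in the abstract above can be sketched with the basic 8-neighbour LBP operator. This is an illustrative sketch only: the bit ordering, the `>=` thresholding convention, and the function names are assumptions, and the paper's multi-resolution pyramid, multi-scale operator, and LMA reduction are not reproduced.

```python
from typing import List

Image = List[List[int]]

# Neighbour offsets, clockwise from the top-left pixel. The bit order is a
# convention chosen for this sketch; other orderings are equally valid.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image: Image, row: int, col: int) -> int:
    """8-bit LBP code of an interior pixel: bit is set if neighbour >= centre."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image: Image) -> List[int]:
    """256-bin histogram of LBP codes over all interior pixels of a region."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


def concatenated_histogram(regions: List[Image]) -> List[int]:
    """Concatenate per-region histograms into one feature vector."""
    feature: List[int] = []
    for region in regions:
        feature.extend(lbp_histogram(region))
    return feature


patch = [[6, 5, 2], [7, 6, 1], [9, 8, 7]]
code = lbp_code(patch, 1, 1)
feature = concatenated_histogram([patch, patch])
print(code, len(feature))
```

Concatenating histograms per region keeps coarse spatial layout in the descriptor, which a single whole-image histogram would discard.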
LLE-BASED CLASSIFICATION ALGORITHM FOR MMW RADAR TARGET RECOGNITION (cited by: 1)
13
Authors: Luo Lei, Li Yuehua, Luan Yinghong. Journal of Electronics (China), 2010, No. 1, pp. 139-144 (6 pages)
In this paper, a new multiclass classification algorithm is proposed based on the idea of Locally Linear Embedding (LLE), to avoid the defect of traditional manifold learning algorithms, which cannot deal with new sample points. The algorithm defines an error as a criterion by computing a sample's reconstruction weight using LLE. Furthermore, the existence and characteristics of a low-dimensional manifold in range-profile time-frequency information are explored using a manifold learning algorithm, aiming at the problem of target recognition for high range resolution MilliMeter-Wave (MMW) radar. The new algorithm is applied to radar target recognition. The experimental results show the algorithm is efficient. Compared with other classification algorithms, our method improves the recognition precision, and the result is not sensitive to input parameters.
Keywords: Manifold learning; Locally Linear Embedding (LLE); multi-class classification; MilliMeter-Wave (MMW); target recognition
A Recognition-Based Approach to Segmenting Arabic Handwritten Text
14
Authors: Ashraf Elnagar, Rahima Bentrcia. Journal of Intelligent Learning Systems and Applications, 2015, No. 4, pp. 93-103 (11 pages)
Segmenting Arabic handwriting has been one of the subjects of research in the field of Arabic character recognition for more than 25 years. The majority of reported segmentation techniques share a critical shortcoming, which is over-segmentation. The aim of segmentation is to produce the letters (segments) of a handwritten word. When a resulting letter (segment) is made of more than one piece (stroke) instead of one, this is called over-segmentation. Our objective is to overcome this problem by using an Artificial Neural Network (ANN) to verify the resulting segment. We propose a set of heuristic-based rules to assemble strokes in order to report the precise segmented letters. Preprocessing phases that include normalization and feature extraction are required as a prerequisite step for the ANN system for recognition and verification. In our previous work [1], we achieved a segmentation success rate of 86% but without recognition. In this work, our experimental results confirmed a segmentation success rate of no less than 95%.
Keywords: Character segmentation; handwritten recognition systems; Arabic handwriting; neural networks; multi-agents
DM-L Based Feature Extraction and Classifier Ensemble for Object Recognition
15
Authors: Hamayun A. Khan. Journal of Signal and Information Processing, 2018, No. 2, pp. 92-110 (19 pages)
Deep Learning is a powerful technique that is widely applied to Image Recognition and Natural Language Processing tasks, amongst many others. In this work, we propose an efficient technique to utilize pre-trained Convolutional Neural Network (CNN) architectures to extract powerful features from images for object recognition purposes. We have built on the existing concept of extending the learning from pre-trained CNNs to new databases through activations by proposing to consider multiple deep layers. We have exploited the progressive learning that happens at the various intermediate layers of the CNNs to construct Deep Multi-Layer (DM-L) based Feature Extraction vectors to achieve excellent object recognition performance. Two popular pre-trained CNN architecture models, VGG_16 and VGG_19, have been used in this work to extract the feature sets from 3 deep fully connected layers, namely "fc6", "fc7" and "fc8", inside the models for object recognition purposes. Using the Principal Component Analysis (PCA) technique, the dimensionality of the DM-L feature vectors has been reduced to form powerful feature vectors that have been fed to an external Classifier Ensemble for classification instead of the Softmax-based classification layers of the two original pre-trained CNN models. The proposed DM-L technique has been applied to the benchmark Caltech-101 object recognition database. Conventional wisdom may suggest that feature extraction based on the deepest layer, i.e. "fc8" rather than "fc6", will result in the best recognition performance, but our results show otherwise for the two considered models. Our experiments have revealed that for the two models under consideration, the "fc6" based feature vectors have achieved the best recognition performance. State-of-the-art recognition performances of 91.17% and 91.35% have been achieved by utilizing the "fc6" based feature vectors for the VGG_16 and VGG_19 models respectively. The recognition performance has been achieved by considering 30 sample images per class, whereas the proposed system is capable of achieving improved performance by considering all sample images per class. Our research shows that for feature extraction based on CNNs, multiple layers should be considered and then the layer that maximizes the recognition performance can be selected.
Keywords: Deep learning; object recognition; CNN; deep multi-layer feature extraction; principal component analysis; classifier ensemble; Caltech-101 benchmark database
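Design of a University Research Management Question-Answering System Integrating Knowledge Graphs and Large Models (cited by: 1)
16
Authors: 王永, 秦嘉俊, 黄有锐, 邓江洲. Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》; PKU Core), 2025, No. 1, pp. 107-117 (11 pages)
Research management is an important part of university administration, but existing research management systems struggle to meet users' personalized needs. Driven by the need to make university research management intelligent, knowledge graphs, traditional models, and large language models are combined to build a new-generation question-answering system for university research management. Research knowledge is collected to construct a research knowledge graph. A multi-task model that performs intent classification and entity extraction simultaneously is used for semantic parsing. The parsing results are used to generate query statements, and information is retrieved from the knowledge graph to answer routine questions. The large language model is combined with the knowledge graph to help handle open-ended questions. Experiments on a dataset in which intents and entities are correlated show that the multi-task model achieves F1 scores of 0.958 and 0.937 on intent classification and entity recognition, respectively, outperforming the other compared models and single-task models. Cypher generation tests demonstrate the effectiveness of custom prompts in eliciting the emergent abilities of large language models: text-to-Cypher generation with a large language model reaches an accuracy of 85.8%, effectively handling open-ended questions based on the knowledge graph. The question-answering system built with knowledge graphs, traditional models, and large language models achieves an accuracy of 0.935, meeting the needs of intelligent question answering well.
Keywords: knowledge graph; multi-task model; intent classification; named entity recognition; large language model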
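FGITA: A Multi-Modal Named Entity Recognition Framework Based on Fine-Grained Alignment
17
Authors: 吕学强, 王涛, 游新冬, 赵海兴, 才藏太, 陈玉忠. Journal of Chinese Computer Systems (《小型微型计算机系统》; PKU Core), 2025, No. 4, pp. 769-775 (7 pages)
The named entity recognition task aims to identify the entities contained in unstructured text and assign them to predefined entity categories. With the development of the Internet and social media, text is often accompanied by visual information such as images, and traditional named entity recognition methods perform poorly on multi-modal information. In recent years, multi-modal named entity recognition has received wide attention. However, existing methods suffer from insufficient fine-grained alignment of cross-modal knowledge: text representations fuse semantically irrelevant image information, which introduces noise. To address these problems, a multi-modal named entity recognition method based on fine-grained image-text alignment is proposed (FGITA: A Multi-Modal NER Frame based on Fine-Grained Image-Text Alignment). First, through object detection and semantic similarity judgment, the method determines the semantic relevance between text entities and image sub-objects at a finer granularity. Second, a bilinear attention mechanism computes relevance weights between image sub-objects and entities, and sub-object information is fused into the entity representations according to these weights. Finally, a cross-modal contrastive learning method is proposed that optimizes the distance between entities and images in the embedding space according to how well they match, helping entity representations learn relevant image information. Experiments on two public datasets show that FGITA outperforms five mainstream multi-modal named entity recognition methods, verifying the effectiveness of the method and the importance and superiority of fine-grained cross-modal alignment for multi-modal named entity recognition.
Keywords: multi-modal; named entity recognition; information extraction; knowledge graph; contrastive learning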
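Application of the MC-Res2UNet Network to Salt Body Recognition
18
Authors: 王新, 张傲, 张薇, 陈同俊. Oil Geophysical Prospecting (《石油地球物理勘探》; PKU Core), 2025, No. 1, pp. 21-29 (9 pages)
Accurately identifying salt bodies buried beneath the surface is of great significance for oil and gas exploration. Traditional semantic segmentation algorithms still suffer from low salt body recognition accuracy, poor edge recognition, and low recognition efficiency. This paper proposes a salt body recognition method based on the MC-Res2UNet network, whose overall architecture is an improvement on U-Net. First, a Res2Net network is used as the encoder to extract salt body feature information. Then, a CBAM attention module is introduced after the convolutions in the decoder layers to reallocate the spatial and channel information of the salt body and suppress unimportant information. Finally, a multi-scale feature fusion module fuses spatial and semantic information to improve salt body recognition accuracy. Validated on the TGS salt body dataset, the proposed MC-Res2UNet model reaches a pixel accuracy of 96.6% and an intersection over union of 86.8%, outperforming traditional semantic segmentation methods such as DeepLabV3+ and DANet and identifying subsurface salt bodies better.
Keywords: salt body recognition; U-Net; multi-scale feature fusion; attention mechanism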
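The two metrics quoted in the abstract above, pixel accuracy and intersection over union, can be computed for binary masks as in the sketch below. This is a generic illustration on a made-up 2x3 mask pair; the function names are assumptions, and the paper may average the metrics differently (for example per image or per class).

```python
from typing import List

Mask = List[List[int]]


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels whose predicted label equals the ground truth."""
    total = 0
    correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += int(p == t)
    return correct / total


def intersection_over_union(pred: Mask, truth: Mask, positive: int = 1) -> float:
    """IoU of the positive (e.g. salt) class between prediction and truth."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            p_pos = p == positive
            t_pos = t == positive
            intersection += int(p_pos and t_pos)
            union += int(p_pos or t_pos)
    # If neither mask contains the class, treat the match as perfect.
    return intersection / union if union else 1.0


# Made-up masks for illustration only (1 = salt, 0 = background).
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
acc = pixel_accuracy(pred_mask, true_mask)
iou = intersection_over_union(pred_mask, true_mask)
print(acc, iou)
```

IoU is usually the stricter of the two, since large correctly labelled background areas raise pixel accuracy but do not enter the positive-class IoU.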
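Progressive Multi-View Action Debiasing Driven by Spatio-Temporal Semantics
19
Authors: 钟忺, 陈亮, 刘文璇, 叶舒, 江奎, 王正, 林嘉文. Computer Engineering (《计算机工程》; PKU Core), 2025, No. 1, pp. 1-10 (10 pages)
In practical applications, data collected by a single-view camera lose visibility of some regions because objects are occluded, so combining data from multiple views for action analysis is essential for maintaining social stability and public safety. To address the bias problem in multi-view action recognition, namely differences in action representation across views caused by inconsistent spatial semantics, and differences in action representation caused by inconsistent temporal semantics during the execution of the same action, a progressive debiasing multi-view method is proposed. First, guided by evidence theory, inter-view action debiasing is performed on samples of the same action under multiple views by exploiting the action isomorphism across views, optimizing the weights of the action features attended to in each view to obtain a more comprehensive unbiased action representation. Second, a multi-granularity decoupling strategy is used to analyze how different granularities affect the unbiased expression of action features and to accurately separate action-relevant from action-irrelevant features, avoiding the significant differences that arise when action-irrelevant information within a view disturbs the action representation. Finally, different action feature weights are constructed along the temporal dimension to strengthen the consistency of action features within a view and reduce representation differences for the same action. Experimental results on multiple datasets verify the effectiveness of the method: cross-view accuracy reaches 97.4% on N-UCLA and 96.4% on NTU-RGB+D. While meeting the application need for accurate action recognition under multiple views, the method also offers an effective solution to multi-view action recognition through a new debiasing approach.
Keywords: multi-view action recognition; progressive debiasing; evidence theory; decoupling; multi-granularity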
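Early Identification of Disruptive Technologies from a Technology Life Cycle Perspective
20
Authors: 侯艳辉, 陈荣, 王家坤. Journal of the China Society for Scientific and Technical Information (《情报学报》; PKU Core), 2025, No. 2, pp. 157-170 (14 pages)
To address the fact that current disruptive technology identification overlooks the characteristics of technology evolution, this paper proposes an early identification method for disruptive technologies that accounts for the stages of the technology life cycle and the heterogeneity of features. First, Sentence-BERT (sentence bidirectional encoder representation from transformers) is used to vectorize patent abstracts. Second, a filtering identification system is built. The first layer uses the LOCI (local outlier factor with constraint integration) anomaly detection algorithm to identify and classify outlier patents; the second layer uses S-curve life cycle identification to filter out patent categories in the maturity stage; the third layer measures the novelty of patents in the emerging stage; and the fourth layer measures disruptiveness for growth-stage patents using patent text and technology news data, completing the filtering. Finally, the quantum information technology field is used as an example to illustrate how the method is applied. The results show that three emerging-stage and three growth-stage disruptive topics are found in the quantum information field, consistent with officially published reports, which verifies the feasibility and effectiveness of the method.
Keywords: technology life cycle; early identification; filtering identification; multi-source data; quantum information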