期刊文献+
共找到3,843篇文章
< 1 2 193 >
每页显示 20 50 100
Density peaks clustering based integrate framework for multi-document summarization 被引量:2
1
作者 BaoyanWang Jian Zhang +1 位作者 Yi Liu Yuexian Zou 《CAAI Transactions on Intelligence Technology》 2017年第1期26-30,共5页
We present a novel unsupervised integrated score framework to generate generic extractive multi- document summaries by ranking sentences based on dynamic programming (DP) strategy. Considering that cluster-based met... We present a novel unsupervised integrated score framework to generate generic extractive multi- document summaries by ranking sentences based on dynamic programming (DP) strategy. Considering that cluster-based methods proposed by other researchers tend to ignore informativeness of words when they generate summaries, our proposed framework takes relevance, diversity, informativeness and length constraint of sentences into consideration comprehensively. We apply Density Peaks Clustering (DPC) to get relevance scores and diversity scores of sentences simultaneously. Our framework produces the best performance on DUC2004, 0.396 of ROUGE-1 score, 0.094 of ROUGE-2 score and 0.143 of ROUGE-SU4 which outperforms a series of popular baselines, such as DUC Best, FGB [7], and BSTM [10]. 展开更多
关键词 multi-document summarization Integrated score framework Density peaks clustering Sentences rank
在线阅读 下载PDF
Constructing a taxonomy to support multi-document summarization of dissertation abstracts
2
作者 KHOO Christopher S.G. GOH Dion H. 《Journal of Zhejiang University-Science A(Applied Physics & Engineering)》 SCIE EI CAS CSCD 2005年第11期1258-1267,共10页
This paper reports part of a study to develop a method for automatic multi-document summarization. The current focus is on dissertation abstracts in the field of sociology. The summarization method uses macro-level an... This paper reports part of a study to develop a method for automatic multi-document summarization. The current focus is on dissertation abstracts in the field of sociology. The summarization method uses macro-level and micro-level discourse structure to identify important information that can be extracted from dissertation abstracts, and then uses a variable-based framework to integrate and organize extracted information across dissertation abstracts. This framework focuses more on research concepts and their research relationships found in sociology dissertation abstracts and has a hierarchical structure. A taxonomy is constructed to support the summarization process in two ways: (1) helping to identify important concepts and relations expressed in the text, and (2) providing a structure for linking similar concepts in different abstracts. This paper describes the variable-based framework and the summarization process, and then reports the construction of the taxonomy for supporting the summarization process. An example is provided to show how to use the constructed taxonomy to identify important concepts and integrate the concepts extracted from different abstracts. 展开更多
关键词 Text summarization Automatic multi-document summarization Variable-based framework Digital library
在线阅读 下载PDF
Unsupervised Graph-Based Tibetan Multi-Document Summarization
3
作者 Xiaodong Yan Yiqin Wang +3 位作者 Wei Song Xiaobing Zhao A.Run Yang Yanxing 《Computers, Materials & Continua》 SCIE EI 2022年第10期1769-1781,共13页
Text summarization creates subset that represents the most important or relevant information in the original content,which effectively reduce information redundancy.Recently neural network method has achieved good res... Text summarization creates subset that represents the most important or relevant information in the original content,which effectively reduce information redundancy.Recently neural network method has achieved good results in the task of text summarization both in Chinese and English,but the research of text summarization in low-resource languages is still in the exploratory stage,especially in Tibetan.What’s more,there is no large-scale annotated corpus for text summarization.The lack of dataset severely limits the development of low-resource text summarization.In this case,unsupervised learning approaches are more appealing in low-resource languages as they do not require labeled data.In this paper,we propose an unsupervised graph-based Tibetan multi-document summarization method,which divides a large number of Tibetan news documents into topics and extracts the summarization of each topic.Summarization obtained by using traditional graph-based methods have high redundancy and the division of documents topics are not detailed enough.In terms of topic division,we adopt two level clustering methods converting original document into document-level and sentence-level graph,next we take both linguistic and deep representation into account and integrate external corpus into graph to obtain the sentence semantic clustering.Improve the shortcomings of the traditional K-Means clustering method and perform more detailed clustering of documents.Then model sentence clusters into graphs,finally remeasure sentence nodes based on the topic semantic information and the impact of topic features on sentences,higher topic relevance summary is extracted.In order to promote the development of Tibetan text summarization,and to meet the needs of relevant researchers for high-quality Tibetan text summarization datasets,this paper manually constructs a Tibetan summarization dataset and carries out relevant experiments.The experiment results show that our method can effectively improve the quality of summarization and our method is competitive to previous unsupervised methods. 展开更多
关键词 multi-document summarization text clustering topic feature fusion graphic model
在线阅读 下载PDF
Research on multi-document summarization based on latent semantic indexing
4
作者 秦兵 刘挺 +1 位作者 张宇 李生 《Journal of Harbin Institute of Technology(New Series)》 EI CAS 2005年第1期91-94,共4页
A multi-document summarization method based on Latent Semantic Indexing (LSI) is proposed. The method combines several reports on the same issue into a matrix of terms and sentences, and uses a Singular Value Decompos... A multi-document summarization method based on Latent Semantic Indexing (LSI) is proposed. The method combines several reports on the same issue into a matrix of terms and sentences, and uses a Singular Value Decomposition (SVD) to reduce the dimension of the matrix and extract features, and then the sentence similarity is computed. The sentences are clustered according to similarity of sentences. The centroid sentences are selected from each class. Finally, the selected sentences are ordered to generate the summarization. The evaluation and results are presented, which prove that the proposed methods are efficient. 展开更多
关键词 multi-document summarization LSI (latent semantic indexing) CLUSTERING
在线阅读 下载PDF
TWO-STAGE SENTENCE SELECTION APPROACH FOR MULTI-DOCUMENT SUMMARIZATION
5
作者 Zhang Shu Zhao Tiejun Zheng Dequan Zhao Hua 《Journal of Electronics(China)》 2008年第4期562-567,共6页
Compared with the traditional method of adding sentences to get summary in multi-document summarization,a two-stage sentence selection approach based on deleting sentences in acandidate sentence set to generate summar... Compared with the traditional method of adding sentences to get summary in multi-document summarization,a two-stage sentence selection approach based on deleting sentences in acandidate sentence set to generate summary is proposed,which has two stages,the acquisition of acandidate sentence set and the optimum selection of sentence.At the first stage,the candidate sentenceset is obtained by redundancy-based sentence selection approach.At the second stage,optimum se-lection of sentences is proposed to delete sentences in the candidate sentence set according to itscontribution to the whole set until getting the appointed summary length.With a test corpus,theROUGE value of summaries gotten by the proposed approach proves its validity,compared with thetraditional method of sentence selection.The influence of the token chosen in the two-stage sentenceselection approach on the quality of the generated summaries is analyzed. 展开更多
关键词 TWO-STAGE Sentence selection approach multi-document summarization
在线阅读 下载PDF
Using AdaBoost Meta-Learning Algorithm for Medical News Multi-Document Summarization 被引量:1
6
作者 Mahdi Gholami Mehr 《Intelligent Information Management》 2013年第6期182-190,共9页
Automatic text summarization involves reducing a text document or a larger corpus of multiple documents to a short set of sentences or paragraphs that convey the main meaning of the text. In this paper, we discuss abo... Automatic text summarization involves reducing a text document or a larger corpus of multiple documents to a short set of sentences or paragraphs that convey the main meaning of the text. In this paper, we discuss about multi-document summarization that differs from the single one in which the issues of compression, speed, redundancy and passage selection are critical in the formation of useful summaries. Since the number and variety of online medical news make them difficult for experts in the medical field to read all of the medical news, an automatic multi-document summarization can be useful for easy study of information on the web. Hence we propose a new approach based on machine learning meta-learner algorithm called AdaBoost that is used for summarization. We treat a document as a set of sentences, and the learning algorithm must learn to classify as positive or negative examples of sentences based on the score of the sentences. For this learning task, we apply AdaBoost meta-learning algorithm where a C4.5 decision tree has been chosen as the base learner. In our experiment, we use 450 pieces of news that are downloaded from different medical websites. Then we compare our results with some existing approaches. 展开更多
关键词 multi-document summarization Machine Learning Decision Trees ADABOOST C4.5 MEDICAL Document summarization
在线阅读 下载PDF
Multi-Document Summarization Model Based on Integer Linear Programming
7
作者 Rasim Alguliev Ramiz Aliguliyev Makrufa Hajirahimova 《Intelligent Control and Automation》 2010年第2期105-111,共7页
This paper proposes an extractive generic text summarization model that generates summaries by selecting sentences according to their scores. Sentence scores are calculated using their extensive coverage of the main c... This paper proposes an extractive generic text summarization model that generates summaries by selecting sentences according to their scores. Sentence scores are calculated using their extensive coverage of the main content of the text, and summaries are created by extracting the highest scored sentences from the original document. The model formalized as a multiobjective integer programming problem. An advantage of this model is that it can cover the main content of source (s) and provide less redundancy in the generated sum- maries. To extract sentences which form a summary with an extensive coverage of the main content of the text and less redundancy, have been used the similarity of sentences to the original document and the similarity between sentences. Performance evaluation is conducted by comparing summarization outputs with manual summaries of DUC2004 dataset. Experiments showed that the proposed approach outperforms the related methods. 展开更多
关键词 multi-document summarization Content COVERAGE LESS REDUNDANCY INTEGER Linear Programming
在线阅读 下载PDF
Multilingual Text Summarization in Healthcare Using Pre-Trained Transformer-Based Language Models
8
作者 Josua Käser Thomas Nagy +1 位作者 Patrick Stirnemann Thomas Hanne 《Computers, Materials & Continua》 2025年第4期201-217,共17页
We analyze the suitability of existing pre-trained transformer-based language models(PLMs)for abstractive text summarization on German technical healthcare texts.The study focuses on the multilingual capabilities of t... We analyze the suitability of existing pre-trained transformer-based language models(PLMs)for abstractive text summarization on German technical healthcare texts.The study focuses on the multilingual capabilities of these models and their ability to perform the task of abstractive text summarization in the healthcare field.The research hypothesis was that large language models could perform high-quality abstractive text summarization on German technical healthcare texts,even if the model is not specifically trained in that language.Through experiments,the research questions explore the performance of transformer language models in dealing with complex syntax constructs,the difference in performance between models trained in English and German,and the impact of translating the source text to English before conducting the summarization.We conducted an evaluation of four PLMs(GPT-3,a translation-based approach also utilizing GPT-3,a German language Model,and a domain-specific bio-medical model approach).The evaluation considered the informativeness using 3 types of metrics based on Recall-Oriented Understudy for Gisting Evaluation(ROUGE)and the quality of results which is manually evaluated considering 5 aspects.The results show that text summarization models could be used in the German healthcare domain and that domain-independent language models achieved the best results.The study proves that text summarization models can simplify the search for pre-existing German knowledge in various domains. 展开更多
关键词 Text summarization pre-trained transformer-based language models large language models technical healthcare texts natural language processing
在线阅读 下载PDF
Automated Multi-Document Biomedical Text Summarization Using Deep Learning Model 被引量:2
9
作者 Ahmed S.Almasoud Siwar Ben Haj Hassine +5 位作者 Fahd N.Al-Wesabi Mohamed K.Nour Anwer Mustafa Hilal Mesfer Al Duhayyim Manar Ahmed Hamza Abdelwahed Motwakel 《Computers, Materials & Continua》 SCIE EI 2022年第6期5799-5815,共17页
Due to the advanced developments of the Internet and information technologies,a massive quantity of electronic data in the biomedical sector has been exponentially increased.To handle the huge amount of biomedical dat... Due to the advanced developments of the Internet and information technologies,a massive quantity of electronic data in the biomedical sector has been exponentially increased.To handle the huge amount of biomedical data,automated multi-document biomedical text summarization becomes an effective and robust approach of accessing the increased amount of technical and medical literature in the biomedical sector through the summarization of multiple source documents by retaining the significantly informative data.So,multi-document biomedical text summarization acts as a vital role to alleviate the issue of accessing precise and updated information.This paper presents a Deep Learning based Attention Long Short Term Memory(DLALSTM)Model for Multi-document Biomedical Text Summarization.The proposed DL-ALSTM model initially performs data preprocessing to convert the available medical data into a compatible format for further processing.Then,the DL-ALSTM model gets executed to summarize the contents from the multiple biomedical documents.In order to tune the summarization performance of the DL-ALSTM model,chaotic glowworm swarm optimization(CGSO)algorithm is employed.Extensive experimentation analysis is performed to ensure the betterment of the DL-ALSTM model and the results are investigated using the PubMed dataset.Comprehensive comparative result analysis is carried out to showcase the efficiency of the proposed DL-ALSTM model with the recently presented models. 展开更多
关键词 BIOMEDICAL text summarization healthcare deep learning lstm parameter tuning
在线阅读 下载PDF
Automatic Multi-Document Summarization Based on Keyword Density and Sentence-Word Graphs
10
作者 YE Feiyue XU Xinchen 《Journal of Shanghai Jiaotong university(Science)》 EI 2018年第4期584-592,共9页
As a fundamental and effective tool for document understanding and organization, multi-document summarization enables better information services by creating concise and informative reports for large collections of do... As a fundamental and effective tool for document understanding and organization, multi-document summarization enables better information services by creating concise and informative reports for large collections of documents. In this paper, we propose a sentence-word two layer graph algorithm combining with keyword density to generate the multi-document summarization, known as Graph & Keywordp. The traditional graph methods of multi-document summarization only consider the influence of sentence and word in all documents rather than individual documents. Therefore, we construct multiple word graph and extract right keywords in each document to modify the sentence graph and to improve the significance and richness of the summary. Meanwhile, because of the differences in the words importance in documents, we propose to use keyword density for the summaries to provide rich content while using a small number of words. The experiment results show that the Graph & Keywordp method outperforms the state of the art systems when tested on the Duc2004 data set. Key words: multi-document, graph algorithm, keyword density, Graph & Keywordp, Due2004 展开更多
关键词 multi-document graph algorithm keyword density Graph & Keywordρ Duc2004
原文传递
BHLM:Bayesian theory-based hybrid learning model for multi-document summarization
11
作者 S.Suneetha A.Venugopal Reddy 《International Journal of Modeling, Simulation, and Scientific Computing》 EI 2018年第2期229-250,共22页
In order to understand and organize the document in an efficient way,the multidocument summarization becomes the prominent technique in the Internet world.As the information available is in a large amount,it is necess... In order to understand and organize the document in an efficient way,the multidocument summarization becomes the prominent technique in the Internet world.As the information available is in a large amount,it is necessary to summarize the document for obtaining the condensed information.To perform the multi-document summarization,a new Bayesian theory-based Hybrid Learning Model(BHLM)is proposed in this paper.Initially,the input documents are preprocessed,where the stop words are removed from the document.Then,the feature of the sentence is extracted to determine the sentence score for summarizing the document.The extracted feature is then fed into the hybrid learning model for learning.Subsequently,learning feature,training error and correlation coefficient are integrated with the Bayesian model to develop BHLM.Also,the proposed method is used to assign the class label assisted by the mean,variance and probability measures.Finally,based on the class label,the sentences are sorted out to generate the final summary of the multi-document.The experimental results are validated in MATLAB,and the performance is analyzed using the metrics,precision,recall,F-measure and rouge-1.The proposed model attains 99.6%precision and 75%rouge-1 measure,which shows that the model can provide the final summary efficiently. 展开更多
关键词 multi-document text feature sentence score hybrid learning model Bayesian theory
原文传递
EDSUCh:A robust ensemble data summarization method for effective medical diagnosis 被引量:1
12
作者 Mohiuddin Ahmed A.N.M.Bazlur Rashid 《Digital Communications and Networks》 SCIE CSCD 2024年第1期182-189,共8页
Identifying rare patterns for medical diagnosis is a challenging task due to heterogeneity and the volume of data.Data summarization can create a concise version of the original data that can be used for effective dia... Identifying rare patterns for medical diagnosis is a challenging task due to heterogeneity and the volume of data.Data summarization can create a concise version of the original data that can be used for effective diagnosis.In this paper,we propose an ensemble summarization method that combines clustering and sampling to create a summary of the original data to ensure the inclusion of rare patterns.To the best of our knowledge,there has been no such technique available to augment the performance of anomaly detection techniques and simultaneously increase the efficiency of medical diagnosis.The performance of popular anomaly detection algorithms increases significantly in terms of accuracy and computational complexity when the summaries are used.Therefore,the medical diagnosis becomes more effective,and our experimental results reflect that the combination of the proposed summarization scheme and all underlying algorithms used in this paper outperforms the most popular anomaly detection techniques. 展开更多
关键词 Data summarization ENSEMBLE Medical diagnosis Sampling
在线阅读 下载PDF
Video Summarization Approach Based on Binary Robust Invariant Scalable Keypoints and Bisecting K-Means
13
作者 Sameh Zarif Eman Morad +3 位作者 Khalid Amin Abdullah Alharbi Wail S.Elkilani Shouze Tang 《Computers, Materials & Continua》 SCIE EI 2024年第3期3565-3583,共19页
Due to the exponential growth of video data,aided by rapid advancements in multimedia technologies.It became difficult for the user to obtain information from a large video series.The process of providing an abstract ... Due to the exponential growth of video data,aided by rapid advancements in multimedia technologies.It became difficult for the user to obtain information from a large video series.The process of providing an abstract of the entire video that includes the most representative frames is known as static video summarization.This method resulted in rapid exploration,indexing,and retrieval of massive video libraries.We propose a framework for static video summary based on a Binary Robust Invariant Scalable Keypoint(BRISK)and bisecting K-means clustering algorithm.The current method effectively recognizes relevant frames using BRISK by extracting keypoints and the descriptors from video sequences.The video frames’BRISK features are clustered using a bisecting K-means,and the keyframe is determined by selecting the frame that is most near the cluster center.Without applying any clustering parameters,the appropriate clusters number is determined using the silhouette coefficient.Experiments were carried out on a publicly available open video project(OVP)dataset that contained videos of different genres.The proposed method’s effectiveness is compared to existing methods using a variety of evaluation metrics,and the proposed method achieves a trade-off between computational cost and quality. 展开更多
关键词 BRISK bisecting K-mean video summarization keyframe extraction shot detection
在线阅读 下载PDF
Adaptive Graph Convolutional Adjacency Matrix Network for Video Summarization
14
作者 Jing Zhang Guangli Wu Shanshan Song 《Computers, Materials & Continua》 SCIE EI 2024年第8期1947-1965,共19页
Video summarization aims to select key frames or key shots to create summaries for fast retrieval,compression,and efficient browsing of videos.Graph neural networks efficiently capture information about graph nodes an... Video summarization aims to select key frames or key shots to create summaries for fast retrieval,compression,and efficient browsing of videos.Graph neural networks efficiently capture information about graph nodes and their neighbors,but ignore the dynamic dependencies between nodes.To address this challenge,we propose an innovative Adaptive Graph Convolutional Adjacency Matrix Network(TAMGCN),leveraging the attention mechanism to dynamically adjust dependencies between graph nodes.Specifically,we first segment shots and extract features of each frame,then compute the representative features of each shot.Subsequently,we utilize the attention mechanism to dynamically adjust the adjacency matrix of the graph convolutional network to better capture the dynamic dependencies between graph nodes.Finally,we fuse temporal features extracted by Bi-directional Long Short-Term Memory network with structural features extracted by the graph convolutional network to generate high-quality summaries.Extensive experiments are conducted on two benchmark datasets,TVSum and SumMe,yielding F1-scores of 60.8%and 53.2%,respectively.Experimental results demonstrate that our method outperforms most state-of-the-art video summarization techniques. 展开更多
关键词 Attention mechanism deep learning graph neural network key-shot video summarization
在线阅读 下载PDF
Enhanced Topic-Aware Summarization Using Statistical Graph Neural Networks
15
作者 Ayesha Khaliq Salman Afsar Awan +2 位作者 Fahad Ahmad Muhammad Azam Zia Muhammad Zafar Iqbal 《Computers, Materials & Continua》 SCIE EI 2024年第8期3221-3242,共22页
The rapid expansion of online content and big data has precipitated an urgent need for efficient summarization techniques to swiftly comprehend vast textual documents without compromising their original integrity.Curr... The rapid expansion of online content and big data has precipitated an urgent need for efficient summarization techniques to swiftly comprehend vast textual documents without compromising their original integrity.Current approaches in Extractive Text Summarization(ETS)leverage the modeling of inter-sentence relationships,a task of paramount importance in producing coherent summaries.This study introduces an innovative model that integrates Graph Attention Networks(GATs)with Transformer-based Bidirectional Encoder Representa-tions from Transformers(BERT)and Latent Dirichlet Allocation(LDA),further enhanced by Term Frequency-Inverse Document Frequency(TF-IDF)values,to improve sentence selection by capturing comprehensive topical information.Our approach constructs a graph with nodes representing sentences,words,and topics,thereby elevating the interconnectivity and enabling a more refined understanding of text structures.This model is stretched to Multi-Document Summarization(MDS)from Single-Document Summarization,offering significant improvements over existing models such as THGS-GMM and Topic-GraphSum,as demonstrated by empirical evaluations on benchmark news datasets like Cable News Network(CNN)/Daily Mail(DM)and Multi-News.The results consistently demonstrate superior performance,showcasing the model’s robustness in handling complex summarization tasks across single and multi-document contexts.This research not only advances the integration of BERT and LDA within a GATs but also emphasizes our model’s capacity to effectively manage global information and adapt to diverse summarization challenges. 展开更多
关键词 summarization graph attention network bidirectional encoder representations from transformers Latent Dirichlet Allocation term frequency-inverse document frequency
在线阅读 下载PDF
Drug and Vaccine Extractive Text Summarization Insights Using Fine-Tuned Transformers
16
作者 Rajesh Bandaru Y.Radhika 《Journal of Artificial Intelligence and Technology》 2024年第4期351-362,共12页
Text representation is a key aspect in determining the success of various text summarizing techniques.Summarization using pretrained transformer models has produced encouraging results.Yet the scope of applying these ... Text representation is a key aspect in determining the success of various text summarizing techniques.Summarization using pretrained transformer models has produced encouraging results.Yet the scope of applying these models in medical and drug discovery is not examined to a proper extent.To address this issue,this article aims to perform extractive summarization based on fine-tuned transformers pertaining to drug and medical domain.This research also aims to enhance sentence representation.Exploring the extractive text summarization aspects of medical and drug discovery is a challenging task as the datasets are limited.Hence,this research concentrates on the collection of abstracts collected from PubMed for various domains of medical and drug discovery such as drug and COVID,with a total capacity of 1,370 abstracts.A detailed experimentation using BART(Bidirectional Autoregressive Transformer),T5(Text-to-Text Transfer Transformer),LexRank,and TexRank for the analysis of the dataset is carried out in this research to perform extractive text summarization. 展开更多
关键词 BART BERT extractive text summarization LexRank TexRank
在线阅读 下载PDF
Generating Abstractive Summaries from Social Media Discussions Using Transformers
17
作者 Afrodite Papagiannopoulou Chrissanthi Angeli Mazida Ahmad 《Open Journal of Applied Sciences》 2025年第1期239-258,共20页
The rise of social media platforms has revolutionized communication, enabling the exchange of vast amounts of data through text, audio, images, and videos. These platforms have become critical for sharing opinions and... The rise of social media platforms has revolutionized communication, enabling the exchange of vast amounts of data through text, audio, images, and videos. These platforms have become critical for sharing opinions and insights, influencing daily habits, and driving business, political, and economic decisions. Text posts are particularly significant, and natural language processing (NLP) has emerged as a powerful tool for analyzing such data. While traditional NLP methods have been effective for structured media, social media content poses unique challenges due to its informal and diverse nature. This has spurred the development of new techniques tailored for processing and extracting insights from unstructured user-generated text. One key application of NLP is the summarization of user comments to manage overwhelming content volumes. Abstractive summarization has proven highly effective in generating concise, human-like summaries, offering clear overviews of key themes and sentiments. This enhances understanding and engagement while reducing cognitive effort for users. For businesses, summarization provides actionable insights into customer preferences and feedback, enabling faster trend analysis, improved responsiveness, and strategic adaptability. By distilling complex data into manageable insights, summarization plays a vital role in improving user experiences and empowering informed decision-making in a data-driven landscape. This paper proposes a new implementation framework by fine-tuning and parameterizing Transformer Large Language Models to manage and maintain linguistic and semantic components in abstractive summary generation. The system excels in transforming large volumes of data into meaningful summaries, as evidenced by its strong performance across metrics like fluency, consistency, readability, and semantic coherence. 展开更多
关键词 Abstractive summarization TRANSFORMERS Social Media summarization Transformer Language Models
在线阅读 下载PDF
Chinese multi-document personal name disambiguation 被引量:8
18
作者 Wang Houfeng(王厚峰) Mei Zheng 《High Technology Letters》 EI CAS 2005年第3期280-283,共4页
This paper presents a new approach to determining whether an interested personal name across doeuments refers to the same entity. Firstly,three vectors for each text are formed: the personal name Boolean vectors deno... This paper presents a new approach to determining whether an interested personal name across doeuments refers to the same entity. Firstly,three vectors for each text are formed: the personal name Boolean vectors denoting whether a personal name occurs the text the biographical word Boolean vector representing title, occupation and so forth, and the feature vector with real values. Then, by combining a heuristic strategy based on Boolean vectors with an agglomeratie clustering algorithm based on feature vectors, it seeks to resolve multi-document personal name coreference. Experimental results show that this approach achieves a good performance by testing on "Wang Gang" corpus. 展开更多
关键词 personal name disambiguation Chinese multi-document heuristic strategy. agglomerative clustering
在线阅读 下载PDF
Interactive System for Video Summarization Based on Multimodal Fusion 被引量:1
19
作者 Zheng Li Xiaobing Du +2 位作者 Cuixia Ma Yanfeng Li Hongan Wang 《Journal of Beijing Institute of Technology》 EI CAS 2019年第1期27-34,共8页
Biography videos based on life performances of prominent figures in history aim to describe great mens' life.In this paper,a novel interactive video summarization for biography video based on multimodal fusion is ... Biography videos based on life performances of prominent figures in history aim to describe great mens' life.In this paper,a novel interactive video summarization for biography video based on multimodal fusion is proposed,which is a novel approach of visualizing the specific features for biography video and interacting with video content by taking advantage of the ability of multimodality.In general,a story of movie progresses by dialogues of characters and the subtitles are produced with the basis on the dialogues which contains all the information related to the movie.In this paper,JGibbsLDA is applied to extract key words from subtitles because the biography video consists of different aspects to depict the characters' whole life.In terms of fusing keywords and key-frames,affinity propagation is adopted to calculate the similarity between each key-frame cluster and keywords.Through the method mentioned above,a video summarization is presented based on multimodal fusion which describes video content more completely.In order to reduce the time spent on searching the interest video content and get the relationship between main characters,a kind of map is adopted to visualize video content and interact with video summarization.An experiment is conducted to evaluate video summarization and the results demonstrate that this system can formally facilitate the exploration of video content while improving interaction and finding events of interest efficiently. 展开更多
关键词 VIDEO VISUALIZATION INTERACTION MULTIMODAL FUSION VIDEO summarization
在线阅读 下载PDF
AUTOMATIC TEXT SUMMARIZATION BASED ON TEXTUAL COHESION 被引量:6
20
作者 Chen Yanmin Liu Bingquan Wang Xiaolong 《Journal of Electronics(China)》 2007年第3期338-346,共9页
This paper presents two different algorithms that derive the cohesion structure in the form of lexical chains from two kinds of language resources HowNet and TongYiCiCiLin. The re-search that connects the cohesion str... This paper presents two different algorithms that derive the cohesion structure in the form of lexical chains from two kinds of language resources HowNet and TongYiCiCiLin. The re-search that connects the cohesion structure of a text to the derivation of its summary is displayed. A novel model of automatic text summarization is devised,based on the data provided by lexical chains from original texts. Moreover,the construction rules of lexical chains are modified accord-ing to characteristics of the knowledge database in order to be more suitable for Chinese summa-rization. Evaluation results show that high quality indicative summaries are produced from Chi-nese texts. 展开更多
关键词 Text summarization Textual cohesion Lexical chain HOWNET TongYiCiCiLin
在线阅读 下载PDF
上一页 1 2 193 下一页 到第
使用帮助 返回顶部