Abstract
[Objective] This paper proposes a method for identifying emerging technologies from multi-source data by representing technology topics in a shared semantic space, and examines the method's effectiveness. [Methods] We used the LDA topic model to identify topics in each data source. We then used the Word2Vec model to represent each topic as a vector based on its representative words and their weights, and merged topics on that basis. Finally, we used topic strength and topic novelty indicators to identify emerging topics. [Results] In an empirical study of the autonomous vehicle field, we identified seven emerging technologies: driver switching, selection and control of travel path, lane change safety, motion estimation and risk aversion, vehicle structure design, environment perception, and communication technology and communication security. [Limitations] Future work will explore how to determine the thresholds more objectively and how to refine topic granularity. [Conclusions] Combining the LDA topic model with a shared semantic space makes it possible to identify emerging technology topics from multi-source data, improving on existing identification methods.
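The topic representation and merging step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything specific here is an assumption: the topic vector is taken to be a weight-normalized average of word vectors, topics are merged when their cosine similarity reaches a hypothetical threshold of 0.9, and the tiny `EMBEDDINGS` and `TOPICS` dictionaries are made-up stand-ins for a trained Word2Vec model and LDA output.

```python
import math

# Toy 3-dimensional word vectors (hypothetical; a real run would use a
# Word2Vec model trained on the combined multi-source corpus).
EMBEDDINGS = {
    "lane": [1.0, 0.0, 0.0],
    "change": [0.9, 0.1, 0.0],
    "overtake": [0.8, 0.2, 0.0],
    "radar": [0.0, 1.0, 0.1],
    "lidar": [0.0, 0.9, 0.2],
    "sensor": [0.1, 0.8, 0.1],
}

# Toy LDA output: topic name -> list of (representative word, weight).
TOPICS = {
    "paper_topic_1": [("lane", 0.6), ("change", 0.4)],
    "patent_topic_1": [("lane", 0.5), ("overtake", 0.5)],
    "patent_topic_2": [("radar", 0.5), ("lidar", 0.3), ("sensor", 0.2)],
}


def topic_vector(word_weights, embeddings):
    """Represent a topic as the weight-normalized average of its word vectors."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word, weight in word_weights:
        vec = embeddings.get(word)
        if vec is None:  # skip words missing from the embedding vocabulary
            continue
        for i, x in enumerate(vec):
            total[i] += weight * x
        weight_sum += weight
    if weight_sum == 0:
        raise ValueError("no topic word found in the embedding vocabulary")
    return [x / weight_sum for x in total]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_topics(topics, embeddings, threshold=0.9):
    """Group topics whose vectors are at least `threshold` similar (single-link)."""
    names = list(topics)
    vectors = {n: topic_vector(topics[n], embeddings) for n in names}
    parent = {n: n for n in names}  # union-find forest over topic names

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())


if __name__ == "__main__":
    print(merge_topics(TOPICS, EMBEDDINGS))
```

With this toy data, the two lane-related topics from different sources fall into one group, while the sensing topic stays separate. The topic strength and novelty indicators are not sketched, since the abstract does not define them.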
Authors
周云泽
闵超
Zhou Yunze; Min Chao (School of Information Management, Nanjing University, Nanjing 210023, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
PKU Core (Peking University Core Journals)
2022, Issue 2, pp. 55-66 (12 pages)
Data Analysis and Knowledge Discovery
Funding
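This work is one of the research outputs of the Jiangsu Provincial Social Science Foundation project (Grant No. 18TQC005)
and the Fundamental Research Funds for the Central Universities (Grant No. 14380005).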