Artificial general intelligence (AGI) is developing rapidly, and large language models (LLMs) are emerging one after another. Their widespread use has greatly improved the quality and efficiency of people's work, but LLMs are not perfect and come with a number of shortcomings, such as the security of sensitive data, hallucination, and limited timeliness. In addition, general-purpose LLMs do not answer questions in some specialized domains very accurately, which calls for the support of retrieval-augmented generation (RAG). In smart healthcare in particular, the lack of relevant data keeps LLMs from bringing their strong conversational and problem-solving abilities into play. The proposed algorithm segments text with Jieba, embeds it with a Word2Vec model, computes the vector similarity between sentences, and reranks the results, helping the LLM quickly select the most reliable and trustworthy medical knowledge from outside the model. Combined with suitably written prompts, this enables the LLM to give satisfactory answers to questions from doctors or patients.
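The retrieval step described in this abstract (embed sentences, score them against the query, rerank) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how sentence vectors are built, so averaging word vectors is an assumption, the inputs are assumed to be already tokenized (the paper uses Jieba for that), and `TOY_VECTORS` is a hypothetical stand-in for a trained Word2Vec model.

```python
import math

# Hypothetical toy vectors standing in for a trained Word2Vec model.
TOY_VECTORS = {
    "fever": [0.9, 0.1, 0.0],
    "cough": [0.8, 0.2, 0.1],
    "treatment": [0.5, 0.5, 0.2],
    "insurance": [0.0, 0.1, 0.9],
    "billing": [0.1, 0.0, 0.8],
}


def sentence_vector(tokens, vectors):
    """Average the vectors of known tokens (an assumed pooling choice)."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query_tokens, passages, vectors, top_k=2):
    """Return the top_k (score, passage_id) pairs, most similar first."""
    query_vec = sentence_vector(query_tokens, vectors)
    if query_vec is None:
        return []
    scored = []
    for passage_id, tokens in passages.items():
        passage_vec = sentence_vector(tokens, vectors)
        if passage_vec is not None:
            scored.append((cosine(query_vec, passage_vec), passage_id))
    scored.sort(reverse=True)
    return scored[:top_k]


# Pre-tokenized passages; in the paper's pipeline Jieba would produce the tokens.
passages = {
    "p1": ["fever", "treatment"],
    "p2": ["insurance", "billing"],
    "p3": ["cough", "fever"],
}
ranked = rerank(["fever", "cough"], passages, TOY_VECTORS)
print(ranked)
```

The passages returned by `rerank` would then be placed into the prompt as external context for the LLM.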
To study in depth the development trends and evolution of research on the integration of medical treatment and prevention for chronic diseases in China, this paper collected 373 relevant publications from 2006 to 2024. After data cleaning and preprocessing, an LDA model incorporating Word2vec was used for topic mining, the optimal number of topics was determined for each period, and a Sankey diagram of topic evolution was generated. The intensity of each topic in each period was calculated, and hot topics were presented in interactive bar charts.
The results show that in the first stage (2006-2020), most studies focused on how to integrate medical services and how to link chronic disease prevention and control with the combination of treatment and prevention. In the second stage (2021-2022), besides continuing the existing topics, part of the research focus shifted to how to better manage and integrate comprehensive medical services and how to combine public health services with the healthcare system more effectively. In the third stage (2023-2024), research concentrated on how to achieve deep integration of health services with treatment and prevention and on how to put the concept of treatment-prevention integration into practice within medical services, with greater emphasis on practical operation and concrete application. The topic evolution analysis reveals the links and evolution among topics across periods: topics such as comprehensive medical services and chronic disease prevention and control combined with treatment and prevention show strong continuity across stages, while the research focus gradually shifts over time from comprehensive medical services toward treatment-prevention integration and health service management. Some topics maintain high intensity across periods. The topic intensity chart shows that community-level primary healthcare institutions play an important role in treatment-prevention integration, and that from 2021 onward public health system construction and treatment-prevention integration became a shared research hotspot.
This study contributes to a more comprehensive understanding of research dynamics in the field of chronic disease treatment-prevention integration, provides a useful reference for future research directions and policy-making, and offers a practical demonstration of text analysis methods. Future research could further explore the potential of coordination mechanisms between primary care and treatment-prevention integration, as well as of health service management and chronic disease prevention and control, to better help community-level primary care providers cope with the high prevalence of chronic diseases and the diverse health needs of an aging population. It should also attend to relevant emerging technologies such as artificial intelligence and big data analytics, the associated data privacy and ethical challenges, and the risks in policy implementation.
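The per-period topic intensity mentioned in this abstract can be sketched as follows. The abstract does not give its formula, so this uses one common definition as an assumption: the mean document-topic probability over all documents in a period. The function name, the input layout, and the sample distributions are hypothetical; only the period labels come from the abstract.

```python
from collections import defaultdict


def topic_intensity(doc_topics, doc_periods):
    """Mean topic weight per period.

    doc_topics: one topic distribution (list of floats) per document,
                e.g. as produced by an LDA model.
    doc_periods: the period label of each document, in the same order.
    Returns {period: [mean weight of topic 0, topic 1, ...]}.
    """
    sums = {}
    counts = defaultdict(int)
    for dist, period in zip(doc_topics, doc_periods):
        if period not in sums:
            sums[period] = [0.0] * len(dist)
        for k, weight in enumerate(dist):
            sums[period][k] += weight
        counts[period] += 1
    return {p: [s / counts[p] for s in totals] for p, totals in sums.items()}


# Hypothetical document-topic distributions for two topics.
doc_topics = [[0.5, 0.5], [1.0, 0.0], [0.25, 0.75]]
doc_periods = ["2006-2020", "2006-2020", "2021-2022"]
intensity = topic_intensity(doc_topics, doc_periods)
print(intensity)
```

Plotting each period's list as a bar chart would give the kind of hot-topic view the abstract describes.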
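Part of speech (POS) is a basic element of natural language processing, and word order carries the semantic and syntactic information being conveyed; both are key information in natural language. How to combine the two effectively in a word embedding model is a current research focus. The Structured word2vec on POS model proposed in this paper combines word order and POS information: it makes the model aware of word positions and also uses POS association information to establish the inherent syntactic relations among words within the context window. Structured word2vec on POS embeds words directionally according to their positions and jointly optimizes the word vectors and the POS-related weighting matrix. Experiments on word analogy and word similarity tasks demonstrate the effectiveness of the proposed method.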
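To make concrete the two signals this abstract combines, the sketch below builds position-aware and POS-aware training tuples from a tagged sentence. It covers input preparation only and is an assumption about what such a model would consume; it does not reproduce the paper's joint optimization of word vectors and the POS weighting matrix. The function name and the tag set are hypothetical.

```python
def positional_pairs(tagged_sentence, window=2):
    """Build (center, context, offset, center_pos, context_pos) tuples.

    tagged_sentence: list of (word, pos_tag) pairs.
    offset is the signed relative position of the context word, so a
    position-aware model can treat left and right contexts differently.
    """
    pairs = []
    for i, (word, pos) in enumerate(tagged_sentence):
        for offset in range(-window, window + 1):
            j = i + offset
            if offset == 0 or j < 0 or j >= len(tagged_sentence):
                continue
            context_word, context_pos = tagged_sentence[j]
            pairs.append((word, context_word, offset, pos, context_pos))
    return pairs


tagged = [("dogs", "NOUN"), ("chase", "VERB"), ("cats", "NOUN")]
pairs = positional_pairs(tagged, window=1)
for pair in pairs:
    print(pair)
```

A position-aware model could use the signed offset to choose a position-specific output matrix, and the POS pair to look up a weight for that context word.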
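Safety is the core theme of the civil aviation industry. To address the heavy reliance on expert experience and the low efficiency of current analysis of unplanned civil aviation events, this article proposes an analysis method that combines Word2vec with a bidirectional long short-term memory (BiLSTM) neural network. First, a Word2vec model is trained on the event text corpus to obtain word vectors and reduce the dimensionality of the vector space. Then a BiLSTM model automatically extracts features, capturing the complete sequence information and contextual feature vectors of the event text. Finally, a softmax function classifies the unplanned events. Experimental results show that the proposed method achieves better classification performance, with higher accuracy and F1 score, and remains fairly stable on imbalanced data samples, demonstrating its applicability and effectiveness for analyzing unplanned civil aviation events.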
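The final classification step in this abstract applies a softmax to the network's output scores. The sketch below shows that step alone, assuming the BiLSTM has already produced one raw score (logit) per class. The category labels and logit values are hypothetical, not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1.

    Subtracting the maximum first avoids overflow in exp() without
    changing the result.
    """
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical event categories and logits for one event report.
labels = ["mechanical", "weather", "bird_strike"]
label, prob = classify([2.0, 0.5, -1.0], labels)
print(label, round(prob, 3))
```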
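[Purpose/Significance] Against the background of rapid development and profound change in artificial intelligence technology and its applications, new research topics and methods keep emerging in machine learning, and deep learning and reinforcement learning continue to advance. It is therefore necessary to explore how machine learning research topics evolve in different fields and to identify hot and emerging topics. [Method/Process] Taking machine learning papers in library and information science indexed in the Web of Science database from 2011 to 2022 as an example, this paper combines LDA and Word2vec for topic modeling and topic evolution analysis, and introduces indicators of topic intensity, topic influence, topic attention, and topic novelty to identify hot topics and emerging hot topics. [Result/Conclusion] The results show that (1) combining the semantic processing ability of Word2vec with the topic evolution ability of LDA identifies research topics more accurately and shows their stage-by-stage evolution intuitively; (2) machine learning research topics in library and information science fall mainly into three categories: natural language processing and text analysis, data mining and analysis, and information and knowledge services, with strong correlations among the topics and evidence of correlated evolution; (3) the designed indicators of topic intensity, topic influence, and topic attention, together with the composite indicator, identify the hot topics of the three periods 2011-2014, 2015-2018, and 2019-2022 well.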
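One way to combine several per-topic indicators into a composite score is sketched below. The abstract does not give the indicator formulas or the weighting scheme, so min-max normalization with equal weights is an assumption, and all function names and indicator values are hypothetical.

```python
def min_max(values):
    """Rescale values to [0, 1]; return all zeros if they are constant."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_scores(indicators, weights=None):
    """Weighted sum of min-max normalized indicators, one score per topic.

    indicators: {indicator_name: [value for topic 0, topic 1, ...]}
    weights: {indicator_name: weight}; equal weights if omitted (assumed).
    """
    names = sorted(indicators)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    normalized = {name: min_max(indicators[name]) for name in names}
    n_topics = len(indicators[names[0]])
    return [
        sum(weights[name] * normalized[name][t] for name in names)
        for t in range(n_topics)
    ]


# Hypothetical indicator values for three topics in one period.
indicators = {
    "intensity": [0.2, 0.5, 0.3],
    "influence": [10, 40, 25],
    "attention": [100, 300, 200],
}
scores = composite_scores(indicators)
hot_topic = max(range(len(scores)), key=scores.__getitem__)
print(scores, hot_topic)
```

Running this once per period and taking the top-scoring topics would yield a list of candidate hot topics for each stage.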