Abstract
In view of the development of cloud computing technology and the limited resource utilization of traditional artificial intelligence training platforms, an artificial intelligence training and development platform based on Docker and a Kubernetes cluster is proposed. Building on the Kubernetes cluster scheduling principle, the platform adopts the Max Resource Usage Priority scoring algorithm and selects host nodes in order of their scores, so as to improve resource utilization in the cluster. Then, to accommodate future dynamic adjustment of Kubernetes cluster resources, a resource usage prediction model based on the current usage of each node is proposed, and an improved RBF combination model is built on the traditional ARIMA model to improve prediction accuracy. The results show that these methods effectively improve resource utilization and the efficiency of resource prediction in the Kubernetes cluster.
Authors
黄巨涛
郑杰生
高尚
刘文彬
林嘉鑫
董召杰
王尧
HUANG Jutao; ZHENG Jiesheng; GAO Shang; LIU Wenbin; LIN Jiaxin; DONG Zhaojie; WANG Yao (Information Center of Guangdong Power Grid Co., Ltd., Guangzhou 510000, China; Guangdong Electric Power Information Technology Co., Ltd., Guangzhou 510000, China; Digital Grid Research Institute, China Southern Power Grid, Guangzhou 510700, China)
Source
《自动化与仪器仪表》
2020, No. 7, pp. 159-162, 166 (5 pages in total)
Automation & Instrumentation
Funding
Science and Technology Project of China Southern Power Grid Co., Ltd. (No. 090000KK52170124).