Abstract
Traditional collaborative filtering algorithms lose accuracy in Point of Interest (POI) recommendation because they cannot make full use of the preference, location, and social network information implied in users' check-in records, and traditional single-machine serial algorithms handle large datasets poorly. To address both problems, a Location-Friendship Based Collaborative Filtering (LFBCF) algorithm is proposed. It builds on users' historical preferences, incorporates their social relationship networks into collaborative filtering, and uses each user's range of activity as a constraint when recommending POIs. To support experiments on large volumes of data, the algorithm was parallelized on the Spark distributed computing platform. Two Location-based Social Network (LBSN) datasets, Gowalla and Brightkite, were used; the number of check-ins, the distances between check-in locations, and the social relationships in these datasets were analyzed as factors that may affect recommendation results, in order to support the proposed algorithm. In the experiments, comparisons of precision and F-measure against classic algorithms such as traditional collaborative filtering verified the superior recommendation quality of the algorithm, and comparisons of speed-up ratio between the parallel algorithm and the single-machine serial algorithm at different data scales verified the value of parallelization and its performance advantage.
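The abstract names three ingredients: preference-based collaborative filtering, social relationships, and an activity-range constraint. The following single-machine Python sketch illustrates one way these could fit together; it is not the paper's LFBCF formulation or its Spark implementation. The Jaccard similarity, the additive `friend_boost`, the mean-coordinate activity center, the `radius_km` cutoff, and all function and parameter names are assumptions made for illustration only.

```python
from math import radians, sin, cos, asin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def jaccard(s, t):
    """Overlap of two POI sets, used here as a stand-in preference similarity."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def recommend(user, checkins, friends, poi_coords,
              radius_km=50.0, friend_boost=0.5, top_n=3):
    """Rank unvisited POIs for `user` (hypothetical scoring, not the paper's)."""
    visited = checkins.get(user, set())
    if not visited:
        return []
    # Assumption: the activity center is the mean coordinate of visited POIs.
    center = (
        sum(poi_coords[p][0] for p in visited) / len(visited),
        sum(poi_coords[p][1] for p in visited) / len(visited),
    )
    scores = {}
    for other, pois in checkins.items():
        if other == user:
            continue
        # Preference similarity, plus an assumed bonus when the two are friends.
        weight = jaccard(visited, pois)
        if other in friends.get(user, set()):
            weight += friend_boost
        if weight == 0:
            continue
        for poi in pois - visited:
            # Activity-range constraint: drop POIs outside the user's radius.
            if haversine_km(center, poi_coords[poi]) <= radius_km:
                scores[poi] = scores.get(poi, 0.0) + weight
    # Highest score first; ties broken by POI name for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]
```

In this sketch a friend's check-ins contribute even when the friend shares no visited POI with the user, and a POI far outside the user's usual area is never recommended regardless of its score. Both behaviors follow from the assumed weighting and cutoff, not from the source.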
Source
Journal of Computer Applications (《计算机应用》)
CSCD
PKU Core Journal
2016, No. 2, pp. 316-323, 335 (9 pages)
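Funding
National High Technology Research and Development Program of China (863 Program) (2015AA050204)
Beijing Municipal Education Commission Joint Construction Project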
Keywords
Location-based Social Network (LBSN)
recommender system
collaborative filtering
Point of Interest (POI)
parallelization
Spark