The query processing in distributed database management systems(DBMS)faces more challenges,such as more operators,and more factors in cost models and meta-data,than that in a single-node DMBS,in which query optimizati...The query processing in distributed database management systems(DBMS)faces more challenges,such as more operators,and more factors in cost models and meta-data,than that in a single-node DMBS,in which query optimization is already an NP-hard problem.Learned query optimizers(mainly in the single-node DBMS)receive attention due to its capability to capture data distributions and flexible ways to avoid hard-craft rules in refinement and adaptation to new hardware.In this paper,we focus on extensions of learned query optimizers to distributed DBMSs.Specifically,we propose one possible but general architecture of the learned query optimizer in the distributed context and highlight differences from the learned optimizer in the single-node ones.In addition,we discuss the challenges and possible solutions.展开更多
Purpose-Resilient distributed processing technique(RDPT),in which mapper and reducer are simplified with the Spark contexts and support distributed parallel query processing.Design/methodology/approach-The proposed wo...Purpose-Resilient distributed processing technique(RDPT),in which mapper and reducer are simplified with the Spark contexts and support distributed parallel query processing.Design/methodology/approach-The proposed work is implemented with Pig Latin with Spark contexts to develop query processing in a distributed environment.Findings-Query processing in Hadoop influences the distributed processing with the MapReduce model.MapReduce caters to the works on different nodes with the implementation of complex mappers and reducers.Its results are valid for some extent size of the data.Originality/value-Pig supports the required parallel processing framework with the following constructs during the processing of queries:FOREACH;FLATTEN;COGROUP.展开更多
In this paper, we consider skyline queries in a mobile and distributed environment, where data objects are distributed in some sites (database servers) which are interconnected through a high-speed wired network, an...In this paper, we consider skyline queries in a mobile and distributed environment, where data objects are distributed in some sites (database servers) which are interconnected through a high-speed wired network, and queries are issued by mobile units (laptop, cell phone, etc.) which access the data objects of database servers by wireless channels. The inherent properties of mobile computing environment such as mobility, limited wireless bandwidth, frequent disconnection, make skyline queries more complicated. We show how to efficiently perform distributed skyline queries in a mobile environment and propose a skyline query processing approach, called efficient distributed skyline based on mobile computing (EDS-MC). In EDS-MC, a distributed skyline query is decomposed into five processing phases and each phase is elaborately designed in order to reduce the network communication, network delay and query response time. We conduct extensive experiments in a simulated mobile database system, and the experimental results demonstrate the superiority of EDS-MC over other skyline query processing techniques on mobile computing.展开更多
基金partially supported by NSFC under Grant Nos.61832001 and 62272008ZTE Industry-University-Institute Fund Project。
文摘The query processing in distributed database management systems(DBMS)faces more challenges,such as more operators,and more factors in cost models and meta-data,than that in a single-node DMBS,in which query optimization is already an NP-hard problem.Learned query optimizers(mainly in the single-node DBMS)receive attention due to its capability to capture data distributions and flexible ways to avoid hard-craft rules in refinement and adaptation to new hardware.In this paper,we focus on extensions of learned query optimizers to distributed DBMSs.Specifically,we propose one possible but general architecture of the learned query optimizer in the distributed context and highlight differences from the learned optimizer in the single-node ones.In addition,we discuss the challenges and possible solutions.
文摘Purpose-Resilient distributed processing technique(RDPT),in which mapper and reducer are simplified with the Spark contexts and support distributed parallel query processing.Design/methodology/approach-The proposed work is implemented with Pig Latin with Spark contexts to develop query processing in a distributed environment.Findings-Query processing in Hadoop influences the distributed processing with the MapReduce model.MapReduce caters to the works on different nodes with the implementation of complex mappers and reducers.Its results are valid for some extent size of the data.Originality/value-Pig supports the required parallel processing framework with the following constructs during the processing of queries:FOREACH;FLATTEN;COGROUP.
基金supported by the Natural Science Foundation of Tianjin under Grant No. 08JCYBJC12400the Innovative Foundation of Small and Medium Enterprises under Grant No. 08ZXCXGX15000+1 种基金the National High-Technology Research and Development 863 Program of China under Grant No. 2009AA01Z152the National Natural Science Foundation of China under Grant No. 60872064
文摘In this paper, we consider skyline queries in a mobile and distributed environment, where data objects are distributed in some sites (database servers) which are interconnected through a high-speed wired network, and queries are issued by mobile units (laptop, cell phone, etc.) which access the data objects of database servers by wireless channels. The inherent properties of mobile computing environment such as mobility, limited wireless bandwidth, frequent disconnection, make skyline queries more complicated. We show how to efficiently perform distributed skyline queries in a mobile environment and propose a skyline query processing approach, called efficient distributed skyline based on mobile computing (EDS-MC). In EDS-MC, a distributed skyline query is decomposed into five processing phases and each phase is elaborately designed in order to reduce the network communication, network delay and query response time. We conduct extensive experiments in a simulated mobile database system, and the experimental results demonstrate the superiority of EDS-MC over other skyline query processing techniques on mobile computing.
文摘动态路网k近邻(kNN)查询是许多基于位置的服务(LBS)中的一个重要问题。针对该问题,提出一种面向动态路网的移动对象分布式kNN查询算法DkNN(Distributed kNN)。首先,将整个路网划分为部署于集群中不同节点中的多个子图;其次,通过并行地搜索查询范围所涉及的子图得到精确的kNN结果;最后,优化查询的搜索过程,引入查询范围剪枝策略和查询终止策略。在4个道路网络数据集上与3种基线算法进行了充分对比和验证。实验结果显示,与TEN~*-Index(Tree dEcomposition based kNN~*Index)算法相比,DkNN算法的查询时间减少了56.8%,路网更新时间降低了3个数量级。DkNN算法可以快速响应动态路网中的kNN查询请求,且在处理路网更新时具有较低的更新成本。