Funding: Supported by the National Natural Science Foundation of China (Nos. 41571387, 41201375 and 41501440); the Tianjin Research Program of Application Foundation and Advanced Technology (No. 14JCQNJC07900); the Tianjin Science and Technology Planning Project (Nos. 15ZCZDSF00390 and 14TXGCCX00015); and the Opening Fund of Tianjin Engineering Research Center of Geospatial Information Technology, "Modeling and analysis of path graph in 3D indoor spatial environment".
Abstract: Due to limitations in geometric representation and semantic description, current pedestrian route analysis models are inadequate. To express the geometry of geographic entities in a micro-spatial environment accurately, the concept of a grid is presented, and grid-based methods for modeling geospatial objects are described. The semantic constitution of a building environment and the methods for modeling rooms, corridors, and staircases with grid objects are described. Based on the topological relationships between grid objects, a grid-based graph for a building environment is presented, and a corresponding route algorithm for pedestrians is proposed. The main advantages of the proposed graph model are as follows: 1) consideration of both semantic and geometric information, 2) a balance between accurate geometric representation of the micro-spatial environment and the efficiency of pedestrian route analysis, 3) applicability of the graph model to route analysis in both static and dynamic environments, and 4) ability of the multi-hierarchical route analysis to integrate multiple levels of pedestrian decision characteristics, from high to low, to determine the optimal path.
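To make the idea of routing over grid objects concrete, the following is a minimal sketch, not the algorithm of the paper above. It assumes a single floor stored as an occupancy grid in which `#` marks a wall, 4-connected movement, and unit step cost; the function name `grid_route` and the sample floor plan are hypothetical.

```python
from collections import deque


def grid_route(grid, start, goal):
    """Breadth-first shortest route on a 4-connected occupancy grid.

    grid  : list of equal-length strings; '#' is a blocked cell (assumption).
    start : (row, col) of the starting cell.
    goal  : (row, col) of the target cell.
    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}          # visited cells and how each was reached
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start to rebuild the route.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


# Hypothetical floor: a corridor loop around a two-cell wall.
floor = [
    "....",
    ".##.",
    "....",
]
route = grid_route(floor, (0, 0), (2, 3))
```

Breadth-first search is used here only because every step has the same cost; a weighted search such as Dijkstra's algorithm would be the natural replacement if grid cells carried semantic costs (for example, stairs versus corridors).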
Funding: This work was supported by the National Key R&D Program of China (2020YFB0905900).
Abstract: Integrating marketing and distribution businesses is crucial for improving the coordination of equipment and the efficient management of multi-energy systems. New energy sources are continuously being connected to distribution grids; this, however, increases the complexity of the information structure of marketing and distribution businesses. The existing unified data model and the coordinated application of marketing and distribution suffer from various drawbacks. As a solution, this paper presents a data model of "one graph of marketing and distribution" and a framework for graph computing, by analyzing the current trends of business and data in the marketing and distribution fields and using graph data theory. Specifically, this work aims to determine the correlation between distribution transformers and marketing users, which is crucial for elucidating the connection between marketing and distribution. In this manner, a novel identification algorithm is proposed based on the collected data for marketing and distribution. Lastly, a forecasting application is developed based on the proposed algorithm to realize the coordinated prediction and consumption of distributed photovoltaic power generation and distribution loads. Furthermore, an operation and maintenance (O&M) knowledge graph reasoning application is developed to improve the intelligent O&M ability of marketing and distribution equipment.
Funding: Supported by the National Natural Science Foundation of China (Nos. U19A2059 and 61802050) and the Ministry of Science and Technology of Sichuan Province Program (Nos. 2021YFG0018 and 20ZDYF0343).
Abstract: Graph data publication has been considered an important step for data analysis and mining. Graph data, which provide knowledge on interactions among entities, can be locally generated and held by distributed data owners. These data are usually sensitive and private, because they may be related to owners' personal activities and can be hijacked by adversaries to conduct inference attacks. Current solutions either treat private graph data as centralized content or disregard the overlapping of graphs in distributed settings. Therefore, this work proposes a novel framework for distributed graph publication. In this framework, differential privacy is applied to justify the safety of the published content. It includes four phases, i.e., graph combination, plan construction sharing, data perturbation, and graph reconstruction. The published graph selection is guided by one data coordinator, and each graph is perturbed carefully with the Laplace mechanism. The problem of graph selection is formulated and proven to be NP-complete. Then, a heuristic algorithm is proposed for selection. The correctness of the combined graph and the differential privacy on all edges are analyzed. This study also discusses a scenario without a data coordinator and offers some insights into graph publication.
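The perturbation phase mentioned above relies on the Laplace mechanism, which adds noise with scale sensitivity/ε to each released value. The sketch below illustrates that mechanism on edge weights only; it is not the framework of the paper. The representation of a graph as a dictionary of edge weights, the default sensitivity of 1, and the names `laplace_noise` and `perturb_edge_weights` are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with location 0 and that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_edge_weights(weights, epsilon, sensitivity=1.0, seed=None):
    """Return a noisy copy of `weights` ({edge: value}).

    epsilon     : privacy budget; smaller values mean more noise.
    sensitivity : assumed maximum change of one value caused by one record.
    seed        : optional seed so that runs can be reproduced in tests.
    """
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return {edge: w + laplace_noise(scale, rng) for edge, w in weights.items()}


# Hypothetical local graph held by one data owner.
local_graph = {("a", "b"): 3.0, ("b", "c"): 1.0, ("a", "c"): 2.0}
noisy = perturb_edge_weights(local_graph, epsilon=1.0, seed=7)
```

In practice the seed would be omitted so that the noise is unpredictable; it is exposed here only to make the example repeatable.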
Abstract: The wide application of intelligent terminals in microgrids has fueled a surge in data volume in recent years. In real-world scenarios, microgrids must store large amounts of data efficiently while also being able to withstand malicious cyberattacks. To meet the high hardware resource requirements and address the vulnerability to network attacks and poor reliability of traditional centralized data storage schemes, this paper proposes a secure storage management method for microgrid data that considers node trust and a directed acyclic graph (DAG) consensus mechanism. Firstly, the microgrid data storage model is designed based on edge computing technology. The blockchain, deployed on the edge computing server and combined with cloud storage, ensures reliable data storage in the microgrid. Secondly, a blockchain consensus algorithm based on a directed acyclic graph data structure is proposed to effectively improve the timeliness of data storage and avoid disadvantages of the traditional blockchain topology such as long chain construction time and low consensus efficiency. Finally, considering the differences among candidate chain-building nodes in their tolerance to network attacks, a hash value update mechanism for the blockchain header with node trust identification is proposed to ensure data storage security. Experimental results from the microgrid data storage platform show that the proposed method can achieve a private key update time of less than 5 milliseconds. When the number of blockchain nodes is less than 25, the blockchain construction takes no more than 80 min, and the data throughput is close to 300 kbps. Compared with traditional chain-topology-based consensus methods that do not consider node trust, the proposed method has higher efficiency in data storage and better resistance to network attacks.
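As a rough illustration of why a DAG layout can shorten construction time compared with a single chain, the sketch below lets several blocks attach to the same parent and later merges them. It is not the consensus algorithm of the paper above and includes no node trust or key update logic; the class `DagLedger`, the limit of two parents per block, and the use of SHA-256 over the payload and parent hashes are all assumptions.

```python
import hashlib


class DagLedger:
    """Toy DAG-structured ledger: each block references earlier blocks."""

    def __init__(self):
        self.blocks = {}    # block hash -> (payload, tuple of parent hashes)
        self.tips = set()   # blocks that no later block references yet
        self.add_block("genesis", ())

    def add_block(self, payload, parents=None):
        """Append a block; by default it approves up to two current tips."""
        if parents is None:
            parents = tuple(sorted(self.tips))[:2]
        for p in parents:
            if p not in self.blocks:
                raise KeyError("unknown parent block: " + p)
        # A block can only reference existing blocks, so no cycle can form.
        data = payload + "|" + "|".join(parents)
        digest = hashlib.sha256(data.encode("utf-8")).hexdigest()
        self.blocks[digest] = (payload, tuple(parents))
        self.tips.difference_update(parents)
        self.tips.add(digest)
        return digest


ledger = DagLedger()
genesis = next(iter(ledger.tips))
a = ledger.add_block("meter reading A", (genesis,))
b = ledger.add_block("meter reading B", (genesis,))  # attached alongside a
c = ledger.add_block("merge")                         # approves both tips
```

Blocks `a` and `b` are attached to the same parent without waiting for each other, which is the property a linear chain lacks.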
Abstract: Outlier detection has important practical value in data mining. Outlier detection algorithms based on distinct theories have different definitions and mining processes. Based on an analysis of existing outlier detection algorithms in terms of criteria and theory, a three-dimensional space graph for constructing applied algorithms and an improved GridOf algorithm are proposed.
Key words: outlier - detection - three-dimensional space graph - data mining
CLC number: TP 311.13 - TP 391
Foundation item: Supported by the National Natural Science Foundation of China (70371015)
Biography: ZHANG Jing (1975-), female, Ph.D., lecturer, research direction: data mining and knowledge discovery.
Funding: Supported in part by the Fundamental Research Funds for the Central Universities under Grant No. 2013RC0114 and the 111 Project of China under Grant No. B08004.
Abstract: With increasingly complex website structures and continuously advancing web technologies, accurate recognition of user clicks from massive HTTP data, which is critical for web usage mining, becomes more difficult. In this paper, we propose a dependency graph model to describe the relationships between web requests. Based on this model, we design and implement a heuristic parallel algorithm to distinguish user clicks with the assistance of cloud computing technology. We evaluate the proposed algorithm on real, large-scale data. The dataset, collected from a mobile core network, is 228.7 GB in size and covers more than three million users. The experimental results demonstrate that the proposed algorithm achieves higher accuracy than previous methods.
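A dependency graph over web requests can be pictured with a small sketch. The heuristic below is an assumption for illustration and not the algorithm of the paper above: a request is linked to an earlier request when its Referer matches that request's URL and it arrives within `max_gap` seconds, and any request without such a parent is counted as a user click. The field names, the 2-second threshold, and the function `identify_clicks` are hypothetical.

```python
def identify_clicks(requests, max_gap=2.0):
    """Split a time-ordered request log into clicks and dependent requests.

    requests : list of dicts with keys 'url', 'referer' (or None), 'time'
               (seconds), sorted by 'time'.
    Returns (clicks, depends_on), where clicks is a list of request indices
    and depends_on maps a child index to its parent index (graph edges).
    """
    latest = {}        # url -> (index, time) of its most recent request
    depends_on = {}
    clicks = []
    for i, req in enumerate(requests):
        parent = latest.get(req["referer"])
        if parent is not None and req["time"] - parent[1] <= max_gap:
            depends_on[i] = parent[0]   # embedded object of the parent page
        else:
            clicks.append(i)            # no recent parent: treat as a click
        latest[req["url"]] = (i, req["time"])
    return clicks, depends_on


# Hypothetical log for one user.
log = [
    {"url": "/a", "referer": None, "time": 0.0},
    {"url": "/a.css", "referer": "/a", "time": 0.3},
    {"url": "/a.png", "referer": "/a", "time": 0.5},
    {"url": "/b", "referer": "/a", "time": 12.0},
    {"url": "/b.js", "referer": "/b", "time": 12.2},
]
clicks, depends_on = identify_clicks(log)
```

Because each user's log can be processed independently, a function of this shape could be applied per user in parallel, which is one plausible reading of why such a task suits a cloud setting.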
Funding: Supported by the National Natural Science Foundation of China (61702251, 61363049, 11571011); the State Scholarship Fund of China Scholarship Council (CSC) (201708360040); the Natural Science Foundation of Jiangxi Province (20161BAB212033); the Natural Science Basic Research Plan in Shaanxi Province of China (2018JM6030); the Doctor Scientific Research Starting Foundation of Northwest University (338050050); and the Youth Academic Talent Support Program of Northwest University.
Abstract: This paper proposes a graph regularized L_p smooth non-negative matrix factorization (GSNMF) method that incorporates graph regularization and an L_p smoothing constraint; it considers the intrinsic geometric information of a data set and produces smooth and stable solutions. The main contributions are as follows: first, graph regularization is added to NMF to discover the hidden semantics while respecting the intrinsic geometric structure of a data set. Second, the L_p smoothing constraint is incorporated into NMF to combine the merits of isotropic (L_2-norm) and anisotropic (L_1-norm) diffusion smoothing, producing a smooth and more accurate solution to the optimization problem. Finally, the update rules and a proof of convergence of GSNMF are given. Experiments on several data sets show that the proposed method outperforms related state-of-the-art methods.
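For orientation, the graph regularization component can be written down in its standard form. The block below gives the widely used graph-regularized NMF objective and its multiplicative update rules, not the GSNMF objective itself: the L_p smoothing term is omitted because its exact form is not stated in the abstract. The symbols are assumptions: X is the non-negative data matrix with one sample per column, U and V are the non-negative factors, W is the affinity matrix of a nearest-neighbor graph over the samples, D is its diagonal degree matrix, and λ ≥ 0 is the regularization weight.

```latex
\begin{align}
  \min_{U \ge 0,\; V \ge 0} \quad
    & \lVert X - U V^{\top} \rVert_F^2
      + \lambda \, \operatorname{Tr}\!\left( V^{\top} L V \right),
    \qquad L = D - W, \\
  u_{ik} \leftarrow{}
    & u_{ik} \, \frac{(X V)_{ik}}{(U V^{\top} V)_{ik}}, \\
  v_{jk} \leftarrow{}
    & v_{jk} \, \frac{(X^{\top} U + \lambda W V)_{jk}}
                     {(V U^{\top} U + \lambda D V)_{jk}}.
\end{align}
```

The trace term penalizes samples that are neighbors in the graph but receive distant low-dimensional representations, which is how the geometric structure of the data set enters the factorization.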
Funding: Supported by the National Natural Science Foundation of China (601133010)
Abstract: In this paper, a new approach for visualizing multivariate categorical data is presented. The approach uses a graph to represent multivariate categorical data and draws the graph in such a way that patterns, trends, and relationships within the data can be identified. A mathematical model for the graph layout problem is derived, and a spectral graph drawing algorithm for visualizing multivariate categorical data is proposed. The experiments show that the drawings produced by the algorithm capture the structures of multivariate categorical data well and that the algorithm is fast.
Abstract: The join operation is a critical problem when dealing with sliding windows over data streams. Many optimization strategies for sliding window joins have been proposed in the literature, but the join sequence of multiple sliding windows is usually selected with a simple heuristic, which is ineffective. A graph-based approach is proposed to address this problem. First, the sliding window join model is introduced. In this model, a vertex represents a join operator and an edge indicates the join relationship among sliding windows. Vertex weights and edge weights represent the cost of a join and the reciprocity of join operators, respectively. A good query plan with minimal cost can then be found in the model. Thus, a complete join algorithm is presented that combines setting up the model, finding the optimal query plan, and executing the query plan. Experiments show that the graph-based approach is feasible and performs better in the above environment.
Abstract: Much data, such as geometric image data and drawings, have graph structures. Such data are called graph structured data. In order to manage graph structured data efficiently, we need to analyze and abstract their graph structures. The purpose of this paper is to find knowledge representations that indicate multiple abstractions of graph structured data. Firstly, we introduce a term graph as a graph pattern having structural variables, and a substitution over term graphs, which is a graph rewriting system. Next, for a graph G, we define a multiple layer (g, (θ_1, …, θ_k)) of G as a pair of a term graph g and a list of k substitutions θ_1, …, θ_k such that G can be obtained from g by applying the substitutions θ_1, …, θ_k to g. In the same way, for a set S of graphs, we also define a multiple layer for S as a pair (D, Θ) of a set D of term graphs and a list Θ of substitutions. Secondly, for a graph G and a set S of graphs, we present effective algorithms for extracting minimal multiple layers of G and S, which give us stratifying abstractions of G and S, respectively. Finally, we report experimental results obtained by applying our algorithms to both artificial data and drawings of power plants, which are real-world data.