Point pattern matching is an important problem in the fields of computer vision and pattern recognition. In this paper, new algorithms based on irreducible matrix and relative invariant are proposed for matching two sets of points with the same cardinality. Their fundamental idea is to transform the two-dimensional point sets with n points into vectors in n-dimensional space. Considering these vectors as one-dimensional point patterns, the new algorithms aim at reducing the point matching problem to that of sorting vectors in n-dimensional space, as long as the sensor noise does not alter the order of the elements in the vectors. Theoretical analysis and simulation results show that the new algorithms are effective.
Point pattern matching (PPM) is an important topic in computer vision and pattern recognition. It is widely used in areas such as image registration, object recognition, motion detection, target tracking, autonomous navigation, and pose estimation. This paper discusses the incomplete matching problem of two point sets under Euclidean transformation. Based on geometric reasoning, definitions of matching clique, support point pair, support index set, support index matrix, and related notions are given. Based on their properties and theorems, a novel reasoning algorithm is presented, which searches for the optimal solution from top to bottom and finds as many consistent corresponding point pairs as possible. Theoretical analysis and experimental results show that the new algorithm is very effective and could, under some conditions, be applied to the PPM problem under other kinds of transformations.
Pattern matching is one of the most performance-critical components of content-inspection-based network security applications, such as network intrusion detection and prevention. To keep up with increasing network speeds, this component needs to be accelerated by a well-designed custom coprocessor. This paper presents a parameterized multilevel pattern matching architecture (MPM) for use on FPGAs. To reduce chip area, the architecture is designed around the idea of selected character decoding (SCD) and a multilevel method, both of which are analyzed in detail. This paper also proposes an MPM generator that produces RTL-level code for an MPM from a pattern set and predefined parameters. With the generator, an efficient MPM architecture can be generated and embedded into a complete hardware solution. The third contribution is a mathematical model and formula that estimate the chip area of each MPM before it is generated, which is useful for choosing a suitable type of FPGA. One example MPM architecture is implemented from 1785 Snort patterns on a Xilinx Virtex 2 Pro FPGA. The results show that this MPM achieves 4.3 Gbps throughput with 5 pipeline stages and 0.22 slices per character, about one half of the chip area of the most area-efficient architecture in the literature. Other results show that MPM is also efficient for general random pattern sets. The performance of MPM scales nearly linearly, with the potential for more than 100 Gbps throughput.
LFC is a functional language based on recursive functions defined in context-free languages. In this paper, a new pattern matching algorithm for LFC is presented, which uses an encoding method to represent a sequence of patterns as an integer. The method is rather simple and produces efficient case-expressions for the pattern matching definitions of LFC. The algorithm can also be used for other functional languages, but for nested patterns it may become complicated, and further study is needed.
A pattern-matching-based tracking algorithm, named MdcPatRec, is used for the reconstruction of charged tracks in the drift chamber of the BESIII detector. This paper addresses the shortcomings of segment finding in the MdcPatRec algorithm. An extended segment construction scheme and the corresponding pattern dictionary are presented. Evaluation with Monte Carlo and experimental data shows that the new method achieves higher efficiency for low transverse momentum tracks.
Research on graph pattern matching (GPM) has attracted a lot of attention. However, most of it has focused on complex networks, and there are few studies of GPM in the medical field. This paper therefore uses GPM to support breast-cancer-oriented diagnosis before surgery. Technically, the paper first gives a new definition of GPM, aiming to explore GPM in the medical field, especially in Medical Knowledge Graphs (MKGs). Then, for the matching process itself, it introduces fuzzy calculation and proposes a multi-threaded bidirectional routing exploration (M-TBRE) algorithm based on depth-first search, together with a two-way routing matching algorithm based on multi-threading. In addition, fuzzy constraints are introduced into the M-TBRE algorithm, which leads to the Fuzzy-M-TBRE algorithm. Experimental results on two datasets show that, compared with existing algorithms, the proposed algorithm is more efficient and effective.
Abstract: The pattern matching method is one of the classic categories of existing online portfolio selection strategies. This article studies the key aspects of this method, namely the measurement of similarity and the selection of similarity sets, and proposes a Portfolio Selection Method based on Pattern Matching with Dual Information of Direction and Distance (PMDI). By studying different combinations of indicators such as Euclidean distance, Chebyshev distance, and correlation coefficient, important information such as direction and distance is extracted from historical stock prices, thereby filtering out the similarity set required by pattern-matching-based portfolio selection algorithms. Extensive experiments on two datasets from real stock markets show that PMDI outperforms other algorithms in balancing income and risk. It is therefore suitable for real-world financial environments.
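The idea of filtering a similarity set by combining a distance measure (distance information) with a correlation threshold (direction information) can be sketched as follows. This is only an illustration under stated assumptions, not the PMDI algorithm itself: the function names, the single-asset input, and the thresholds `max_dist` and `min_corr` are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length windows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chebyshev(a, b):
    """Chebyshev distance: the largest coordinate-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))

def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 for a constant window."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)

def similarity_set(series, window, max_dist, min_corr, dist=euclidean):
    """Indices i whose preceding window resembles the most recent window.

    A past window series[i-window:i] is kept only if it is close in
    distance AND positively correlated with the latest window
    (hypothetical combination rule, chosen for illustration).
    """
    latest = series[-window:]
    selected = []
    for i in range(window, len(series)):
        past = series[i - window:i]
        if dist(past, latest) <= max_dist and pearson(past, latest) >= min_corr:
            selected.append(i)
    return selected
```

For example, `similarity_set([1.0, 1.1, 0.9, 1.0, 1.1, 0.9, 1.0, 1.1], 2, 0.05, 0.9)` keeps only the periods that followed an earlier `[1.0, 1.1]` window; passing `dist=chebyshev` swaps the distance measure without changing the rest.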
Abstract: Given a set U of strings defined over an alphabet Σ, string cross pattern matching finds all matches between every two strings in U. It is used in text processing tasks such as removing duplicate strings. This paper presents a fast string cross pattern matching algorithm based on extracting high-frequency strings. Compared with existing algorithms, including single-pattern and multi-pattern matching algorithms, this algorithm features both low time complexity and low space complexity. Because the Chinese character set is large and the average length of Chinese words is short, the algorithm is well suited to processing Chinese text, especially when Σ is large and the number of strings far exceeds the maximum string length in U.
Abstract: Pattern matching is a very important technique used in many applications, such as search engines and DNA analysis. Its aim is to find a pattern in a text. This paper proposes a Pattern Matching Algorithm Using Changing Consecutive Characters (PMCCC) to make the search process faster. PMCCC enhances the shift process that determines how the pattern moves when a mismatch occurs between the pattern and the text. It enhances the Berry Ravindran (BR) shift function by using m consecutive characters, where m is the pattern length. The formal basis and the algorithms are presented. The experimental results show that PMCCC improves the search process by reducing the number of comparisons and the number of attempts. Comparison with other related algorithms shows significant improvements in the average number of comparisons and the average number of attempts.
Funding: This project was supported by the National "863" High Technology Research and Development Program of China (2003AA142160) and the National Natural Science Foundation of China (60402019).
Abstract: The traditional multiple pattern matching algorithm, the deterministic finite-state automaton, is implemented with a tree structure. A new algorithm is proposed that substitutes a sequential binary tree for the traditional tree. Experiments show that the algorithm has three features: its construction is fast, its memory cost is small, and its search is as fast as that of the traditional algorithm. The algorithm is suitable for applications that require the patterns to be preprocessed dynamically.
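For context, the traditional tree-structured automaton that this abstract takes as its baseline can be sketched as a standard Aho-Corasick matcher. The sketch below is the classic construction, not the sequential-binary-tree variant proposed in the paper; the function names are illustrative and non-empty patterns are assumed.

```python
from collections import deque

def build_automaton(patterns):
    """Build trie edges, failure links, and output lists (Aho-Corasick)."""
    goto = [{}]   # goto[state][char] -> child state (trie edges)
    fail = [0]    # failure link of each state
    out = [[]]    # patterns recognised on entering each state
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    # Breadth-first pass: depth-1 states keep fail = 0, deeper states
    # inherit from their parent's failure chain.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out

def search(text, patterns):
    """Return (start_index, pattern) for every occurrence in text."""
    goto, fail, out = build_automaton(patterns)
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits
```

The dictionaries in `goto` play the role of the traditional tree's child pointers; the memory cost of such per-node child tables is the kind of overhead a more compact tree layout would aim to reduce.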
Funding: Supported by China MOST project (No. 2012BAH46B04).
Abstract: Pattern matching is a fundamental approach to detecting malicious behavior and information on the Internet, and it has gradually been adopted in high-speed network traffic analysis. However, multi-pattern matching on online compressed network traffic (CNT) faces a performance bottleneck, because malicious and intrusion code is often embedded in compressed network traffic. In this paper, we propose an online fast multi-pattern matching algorithm on compressed network traffic (FMMCN). FMMCN employs two types of jumping, namely jumping during the sliding window and a string jump scanning strategy, to skip unnecessary compressed bytes. Moreover, FMMCN can efficiently process large volumes of network traffic, such as HTTP traffic, vehicle traffic, and other Internet-based services. The experimental results show that FMMCN can ignore more than 89.5% of bytes, and its maximum speed reaches 176.470 MB/s on a midrange switch device, which is almost 73.15 MB/s faster than the current fastest algorithm, ACCH.
Abstract: Most Point Pattern Matching (PPM) algorithms perform poorly in the presence of positional noise and outliers. This paper presents a novel and robust PPM algorithm that combines Point Pair Topological Characteristics (PPTC) and Spectral Matching (SM) to solve the aforementioned issues. First, PPTC, a new shape descriptor, is proposed. Next, a new comparability measurement based on PPTC is defined as the matching probability. Finally, the correct matching results are obtained by the spectral matching method. Experiments on synthetic data show its robustness in comparison with other state-of-the-art algorithms, and experiments on real-world data show its effectiveness.
Abstract: Modern applications require large databases to be searched for regions that are similar to a given pattern. DNA sequence analysis, speech and text recognition, artificial intelligence, the Internet of Things, and many other applications depend heavily on pattern matching or similarity searches. In this paper, we discuss some of the string matching solutions developed in the past. Then, we present a novel mathematical model to search for a given pattern and its near approximations in the text.
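As a point of reference for searching a text for a pattern and its near approximations, one of the classic earlier solutions is dynamic programming over edit distance with a free starting position (often attributed to Sellers). The sketch below shows that classic technique only; it is not the mathematical model proposed in the paper, and the function name and the threshold `k` are illustrative.

```python
def approx_find(pattern, text, k):
    """End positions j (exclusive) where some substring of text ending at j
    is within edit distance k of pattern."""
    m = len(pattern)
    # col[i] = best edit distance between pattern[:i] and a suffix of the
    # text read so far; for the empty text this is simply i.
    col = list(range(m + 1))
    ends = []
    for j, tc in enumerate(text, start=1):
        prev_diag = col[0]
        col[0] = 0  # a match may start at any text position
        for i in range(1, m + 1):
            cur = min(col[i] + 1,                         # extra text character
                      col[i - 1] + 1,                     # missing pattern character
                      prev_diag + (pattern[i - 1] != tc)) # match or substitution
            prev_diag = col[i]
            col[i] = cur
        if col[m] <= k:
            ends.append(j)
    return ends
```

With `k = 0` this degenerates to exact matching; larger `k` admits insertions, deletions, and substitutions at a cost of O(len(pattern) * len(text)) time and O(len(pattern)) space.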
Abstract: This paper discusses the potential application of fuzzy set theory, more specifically pattern matching, in assessing risk in chemical plants. Risk factors have been evaluated using linguistic representations of the quantity of the hazardous substance involved, its frequency of interaction with the environment, the severity of its impact, and the uncertainty involved in its detection in advance. For each linguistic value there is a corresponding membership function ranging over a universe of discourse. The risk scenario created by a hazard or hazardous situation having the highest degree of featural value is taken as the known pattern.
Abstract: Graph pattern matching (GPM) can be used to mine key information in graphs. Exact GPM, one of the most commonly used GPM-related methods, aims to find exactly all subgraphs of a data graph that match a given query graph. Exact GPM has been widely used in biological data analysis, social network analysis, and other fields. In this paper, the applications of exact GPM are first introduced and its research progress is summarized. Then, the related algorithms are introduced in detail, and experiments on state-of-the-art exact GPM algorithms are conducted to compare their performance. Based on the experimental results, the applicable scenarios of the algorithms are pointed out. Finally, new research opportunities in this area are proposed.
Abstract: Based on the study of single pattern matching, the MBF algorithm is proposed by imitating the way humans search for a string. The algorithm preprocesses the pattern using the idea of the Quick Search algorithm together with prefix and suffix information about the already-matched part of the pattern. In the searching phase, the algorithm makes use of character usage frequency and the continue-skip idea. Experiments show that the MBF algorithm is more efficient than other algorithms.
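The Quick Search idea that MBF builds on can be sketched briefly: after each attempt, the window is shifted according to the text character immediately to the right of the window. The sketch below is plain Quick Search only; the prefix and suffix information, character frequency, and continue-skip ideas of MBF are not reproduced, and the function name is illustrative. A non-empty pattern is assumed.

```python
def quick_search(pattern, text):
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    # Bad-character table: for a character whose last occurrence in the
    # pattern is at index i, shift by m - i so that occurrence lines up
    # with the text character just past the current window.
    shift = {ch: m - i for i, ch in enumerate(pattern)}
    hits, pos = [], 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        if pos + m >= n:
            break  # no character past the window to consult
        # Characters absent from the pattern allow the maximal shift m + 1.
        pos += shift.get(text[pos + m], m + 1)
    return hits
```

Because the shift depends on a character outside the window, the comparison order inside the window is free, which is what leaves room for frequency-based heuristics of the kind the abstract mentions.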
Funding: This work is supported by the following projects: National Natural Science Foundation of China grant 60772136, 111 Development Program of China No. B08038, and National Science & Technology Pillar Program of China No. 2008BAH22B03 and No. 2007BAH08B01.
Abstract: Network traffic analysis, which identifies traces of suspicious activity, is the basis of network security and network management. With the continued emergence of new applications and encrypted traffic, currently available approaches cannot perform well on all kinds of network data. In this paper, we propose a novel stream pattern matching technique that is easy to deploy and combines the advantages of different methods. The main idea is as follows: first, a formal description specification is defined, by which any series of data streams can be unambiguously described by a special stream pattern; then, a tree representation is constructed by parsing the stream pattern; finally, a stream pattern engine is constructed with nondeterministic finite automata (S-CG-NFA) and bit-parallel searching algorithms. Our stream pattern analysis system has been fully prototyped in the C programming language and on a Xilinx Virtex-2 FPGA. The experimental results show that the method provides a high level of recognition efficiency and accuracy.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 62472175, 62072176, 12271172, and 11871221, the Shanghai Trusted Industry Internet Software Collaborative Innovation Center, and the "Digital Silk Road" Shanghai International Joint Lab of Trustworthy Intelligent Software under Grant No. 22510750100.
Abstract: The realization of quantum algorithms relies on specific quantum compilations according to the underlying quantum processors. However, there are various ways to physically implement qubits and manipulate those qubits in different physical devices. These differences lead to different communication methods and connection topologies, with each vendor implementing its own set of primitive gates. Therefore, quantum circuits have to be rewritten or transformed in order to be transplanted from one platform to another. We propose a pattern-matching-based framework for rewriting quantum circuits, called QRewriting. It takes advantage of a new representation of quantum circuits using symbolic sequences. Unlike the traditional approach using directed acyclic graphs, the new representation allows us to easily identify patterns that appear non-consecutively but are reducible. Then, we convert the problem of pattern matching into that of finding distinct subsequences and propose a polynomial-time, dynamic-programming-based pattern matching and replacement algorithm. We develop a rule library for basic optimizations and rewrite the arithmetic and Toffoli circuits from a commonly used gate set to the gate set supported by the Surface-17 quantum processor. Compared with a state-of-the-art quantum circuit optimization framework PaF optimized on the BIGD benchmarks, QRewriting further reduces the depth and the gate count by an average of 26.5% and 17.4%, respectively.
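The reduction to distinct subsequences can be illustrated with the textbook dynamic program that counts how many ways a pattern occurs as a (not necessarily contiguous) subsequence of a symbol sequence. This is a minimal sketch of that general technique, not QRewriting's matching and replacement algorithm: the reducibility check for non-consecutive gates is omitted, and the function name and the gate labels in the usage note are illustrative.

```python
def count_subsequence_matches(sequence, pattern):
    """Count the distinct ways pattern occurs as a subsequence of sequence."""
    # ways[j] = number of ways to match the first j pattern symbols
    # using the sequence symbols seen so far.
    ways = [1] + [0] * len(pattern)
    for sym in sequence:
        # Iterate j downwards so each sequence symbol is used at most once
        # per match.
        for j in range(len(pattern), 0, -1):
            if pattern[j - 1] == sym:
                ways[j] += ways[j - 1]
    return ways[-1]
```

For instance, `count_subsequence_matches(["H", "X", "H", "X"], ["H", "X"])` counts three candidate placements of the two-symbol pattern. The running time is O(len(sequence) * len(pattern)), which is the kind of polynomial bound the abstract refers to.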
Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 61229301 and 60828005, the Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT) of the Ministry of Education, China, under Grant No. IRT13059, and the National Science Foundation (NSF) of USA under Grant No. 0514819.
Abstract: Pattern matching with wildcards (PMW) has great theoretical and practical significance in bioinformatics, information retrieval, and pattern mining. Due to the uncertainty of wildcards, not only is the number of all matches exponential with respect to the maximal gap flexibility and the pattern length, but the matching positions in PMW are also hard to choose. Counting the maximal number of matches one by one is therefore computationally infeasible. Rather than solving the generic PMW problem, many research efforts have defined new problems within PMW according to different application backgrounds. To break through the limitations of either fixing the number of wildcards or allowing an unbounded number of them, pattern matching with flexible wildcards (PMFW) allows users to control the ranges of wildcards. In this paper, we provide a survey of the state-of-the-art algorithms for PMFW, with detailed analyses and comparisons, and discuss challenges and opportunities in PMFW research and applications.
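As an illustration of user-controlled wildcard ranges, the sketch below counts all occurrences of a pattern whose consecutive characters may be separated by a bounded number of wildcards. It uses a simple dynamic program so that matches are counted without being enumerated one by one. It is not taken from any algorithm covered by the survey: the function name `count_gap_matches` and the `(lo, hi)` gap encoding are assumptions, and it ignores additional constraints, such as using each text position at most once, that some PMFW variants impose.

```python
def count_gap_matches(text: str, chars: list[str], gaps: list[tuple[int, int]]) -> int:
    """Count occurrences of chars in text where gaps[k] = (lo, hi) bounds the
    number of wildcard positions between chars[k] and chars[k + 1]."""
    if not chars or len(gaps) != len(chars) - 1:
        raise ValueError("need one gap range between each pair of pattern characters")
    n = len(text)
    # prev[i] = number of partial matches of the pattern so far that end at text[i].
    prev = [1 if text[i] == chars[0] else 0 for i in range(n)]
    for k in range(1, len(chars)):
        lo, hi = gaps[k - 1]
        cur = [0] * n
        for i in range(n):
            if text[i] != chars[k]:
                continue
            # The previous pattern character must sit lo..hi wildcards to the left.
            for g in range(lo, hi + 1):
                j = i - g - 1
                if j < 0:
                    break
                cur[i] += prev[j]
        prev = cur
    return sum(prev)


if __name__ == "__main__":
    # Pattern a[0,2]b: an 'a', then between 0 and 2 wildcards, then a 'b'.
    print(count_gap_matches("abab", ["a", "b"], [(0, 2)]))
```

The cost is proportional to the text length times the pattern length times the widest gap range, even though the number of matches being counted can grow exponentially with the pattern length.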
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60803030, 60925009, and 60921002, and the National Basic Research 973 Program of China under Grant No. 2011CB302502.
Abstract: Due to the huge size of the pattern sets to be searched, multiple pattern searching remains a challenge for several newly arising applications such as network intrusion detection. In this paper, we present an attempt to design efficient multiple pattern searching algorithms on multi-core architectures. We observe an important feature: the multiple pattern matching time depends mainly on the number and minimal length of the patterns. The multi-core algorithm proposed in this paper leverages this feature to decompose the pattern set so that the parallel execution time is minimized. We formulate the problem as an optimal decomposition and scheduling of a pattern set, and then propose a heuristic algorithm, which takes advantage of dynamic programming and greedy algorithmic techniques, to solve the optimization problem. Experimental results suggest that our decomposition approach can increase the searching speed by more than 200% on a 4-core AMD Barcelona system.
Abstract: Point pattern matching is an important problem in the fields of computer vision and pattern recognition. In this paper, new algorithms based on irreducible matrix and relative invariant for matching two sets of points with the same cardinality are proposed. Their fundamental idea is transforming the two-dimensional point sets with n points into vectors in n-dimensional space. Considering these vectors as one-dimensional point patterns, these new algorithms aim at reducing the point matching problem to that of sorting vectors in n-dimensional space, as long as the sensor noise does not alter the order of the elements in the vectors. Theoretical analysis and simulation results show that the new algorithms are effective.
Funding: This work was supported by the "985" Project of Tsinghua University.
Abstract: Point pattern matching (PPM) is an important topic in computer vision and pattern recognition. It can be widely used in many areas such as image registration, object recognition, motion detection, target tracking, autonomous navigation, and pose estimation. This paper discusses the incomplete matching problem of two point sets under Euclidean transformation. According to geometric reasoning, definitions of matching clique, support point pair, support index set, support index matrix, etc. are given. Based on their properties and theorems, a novel reasoning algorithm is presented, which searches for the optimal solution from top to bottom and can find as many consistent corresponding point pairs as possible. Theoretical analysis and experimental results show that the new algorithm is very effective and can, under some conditions, be applied to the PPM problem under other kinds of transformations.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60803002) and the Excellent Young Scholars Research Fund of Beijing Institute of Technology.
Abstract: Pattern matching is one of the most performance-critical components of content inspection based network security applications, such as network intrusion detection and prevention. To keep up with increasing network speeds, this component needs to be accelerated by a well-designed custom coprocessor. This paper presents a parameterized multilevel pattern matching architecture (MPM) for use on FPGAs. To reduce chip area, the architecture is designed around selected character decoding (SCD) and a multilevel method, both of which are analyzed in detail. This paper also proposes an MPM generator that can generate RTL-level code for an MPM from a pattern set and predefined parameters. With the generator, an efficient MPM architecture can be generated and embedded in a complete hardware solution. The third contribution is a mathematical model and formula to estimate the chip area of each MPM before it is generated, which is useful for choosing the proper type of FPGA. One example MPM architecture is implemented with 1785 Snort patterns on a Xilinx Virtex 2 Pro FPGA. The results show that this MPM can achieve 4.3 Gbps throughput with 5 pipeline stages and 0.22 slices per character, about one half of the chip area of the most area-efficient architecture in the literature. Other results are given to show that MPM is also efficient for general random pattern sets. The performance of MPM scales nearly linearly, with the potential for more than 100 Gbps throughput.
Funding: the National Natural Science Foundation (No. 69873042), the National '863' High-Tech Programme (No. 863-306-05-04-1), and th
Abstract: LFC is a functional language based on recursive functions defined in context-free languages. In this paper, a new pattern matching algorithm for LFC is presented, which uses an encoding method to represent a sequence of patterns as an integer. It is a rather simple method and produces efficient case-expressions for the pattern matching definitions of LFC. The algorithm can also be used for other functional languages, but for nested patterns it may become complicated, and further studies are needed.
Funding: Supported by the Ministry of Science and Technology of China (2009CB825200), Joint Funds of the National Natural Science Foundation of China (11079008, 11121092), the Natural Science Foundation of China (10905091), and SRF for ROCS of SEM.
Abstract: A pattern matching based tracking algorithm, named MdcPatRec, is used for the reconstruction of charged tracks in the drift chamber of the BESIII detector. This paper addresses the shortcomings of segment finding in the MdcPatRec algorithm. An extended segment construction scheme and the corresponding pattern dictionary are presented. Evaluation with Monte Carlo and experimental data shows that the new method can achieve higher efficiency for low transverse momentum tracks.
Funding: Supported by the National Natural Science Foundation of China under grants 62076087 and 61906059, and the Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT) of the Ministry of Education of China under grant IRT17R32.
Abstract: Research on graph pattern matching (GPM) has attracted a lot of attention. However, most of it has focused on complex networks, and there is little research on GPM in the medical field. Hence, this paper uses GPM to make a breast cancer-oriented diagnosis before surgery. Technically, this paper first gives a new definition of GPM, aiming to explore GPM in the medical field, especially in Medical Knowledge Graphs (MKGs). Then, in the specific matching process, this paper introduces fuzzy calculation and proposes a multi-threaded bidirectional routing exploration (M-TBRE) algorithm based on depth-first search and a two-way routing matching algorithm based on multi-threading. In addition, fuzzy constraints are introduced into the M-TBRE algorithm, which leads to the Fuzzy-M-TBRE algorithm. The experimental results on two datasets show that, compared with existing algorithms, the proposed algorithm is more efficient and effective.