In this paper, we focus on efficient construction of restricted subtree (RSubtree) results for XML keyword queries on a multicore system. We firstly show that the perfor- mance bottlenecks for existing methods lie in 1) computing the set of relevant keyword nodes (RKNs) for each subtree root node, 2) constructing the corresponding RSubtree, and 3) parallel execution. We then propose a two-step generic top-down subtree construction algorithm, which computes SLCA/ELCA nodes in the first step, and parallelly gets RKNs and generates RSubtree results in the second step, where generic means that 1) our method can be used to compute dif- ferent kinds of subtree results, 2) our method is independent of the query semantics; top-down means that our method con- structs each RSubtree by visiting nodes of the subtree con- structed based on an RKN set level-by-level from left to right, such that to avoid visiting as many useless nodes as possible. The experimental results show that our method is much more efficient than existing ones according to various metrics.
Answering reachability queries is one of the fundamental graph operations.Existing approaches either accelerate index construction by constructing an index that covers only partial reachability relationship,which may result in performing cost traversing operation when answering a query;or accelerate query answering by constructing an index covering the complete reachability relationship,which may be inefficient due to comparing the complete node labels.We propose a novel labeling scheme,which covers the complete reachability relationship,to accelerate reachability queries processing.The idea is to decompose the given directed acyclic graph(DAG)G into two subgraphs,G1 and G2.For G1,we propose to use topological labels consisting of two integers to answer all reachability queries.For G2,we construct 2-hop labels as existing methods do to answer queries that cannot be answered by topological labels.The benefits of our method lie in two aspects.On one hand,our method does not need to perform the cost traversing operation when answering queries.On the other hand,our method can quickly answer most queries in constant time without comparing the whole node labels.We confirm the efficiency of our approaches by extensive experimental studies using 20 real datasets.