Welcome to the project 
XPath Benchmark Home

A Benchmark for XPath Evaluation

The fast-growing use of XML increases the need for efficient, flexible query languages designed specifically for XML. Several query languages exist for XML data collections, including XML-QL, LOREL, XSL, XQL, and XQuery. Among these, XQuery is the language being standardized by the W3C and is likely to become the most widely used, much as SQL did among database languages. An important component of many XML query languages, especially those promulgated by the W3C, is XPath. XPath is a language for addressing locations in an XML document, and was developed in part by the XML Query and XSL working groups. In addition to being used in XQuery, XPath is a core component of XSL Transformations (XSLT) and XPointer. To date, XQuery is still a working draft, and most of the products that claim XQuery support are at an early stage of development. We therefore decided to build a benchmark for XPath, since a number of XPath query engines already exist and offer full support.

At this time, there is still no commonly agreed-upon standard set of XML application scenarios, so we report on a generic benchmark. The benchmark focuses on measuring the cost of query processing. XPath queries are evaluated against a tree-like data model, and a query typically traverses part of that tree, so the efficiency of tree traversal has a major impact on the cost of query processing. The tree can vary in depth, density, size, and the kind of information in each node. We designed an XML document generator that produces documents according to several control factors, which determine the shape and size of the tree. By varying only one control factor (e.g., tree depth) and keeping the others constant, the benchmark can isolate the impact of that factor on query performance. The benchmark also includes a suite of query templates that can be instantiated to produce a set of benchmark queries. Overall, the benchmark is designed to assess the impact of trees of different sizes and shapes on query performance. This will help query engine developers understand and evaluate implementation alternatives, and help users decide which query engine best fits their needs.
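
The idea of varying one control factor at a time can be illustrated with a minimal Python sketch. It is not the project's document generator or query suite: the function names (build_tree, instantiate, time_query), the parameters depth and fanout, the element names, and the template string are all hypothetical, chosen only for illustration. It uses the standard library's xml.etree.ElementTree, which supports only a limited subset of XPath; a real benchmark run would target a full XPath engine.

```python
import time
import xml.etree.ElementTree as ET


def build_tree(depth, fanout, tag="node"):
    """Build a complete tree with `depth` levels below the root.

    Every non-leaf element gets exactly `fanout` children. Elements at
    level k are named f"{tag}{k}" so queries can target a specific level.
    (Hypothetical control factors; the real generator may differ.)
    """
    root = ET.Element("root")
    frontier = [root]
    counter = 0
    for level in range(1, depth + 1):
        next_frontier = []
        for parent in frontier:
            for _ in range(fanout):
                child = ET.SubElement(parent, f"{tag}{level}", id=str(counter))
                counter += 1
                next_frontier.append(child)
        frontier = next_frontier
    return root


def instantiate(template, **params):
    """Fill a query template's placeholders to get a concrete query."""
    return template.format(**params)


def time_query(root, query, repeats=5):
    """Run `query` several times; return (match count, best wall-clock time)."""
    best = float("inf")
    count = 0
    for _ in range(repeats):
        start = time.perf_counter()
        count = len(root.findall(query))
        best = min(best, time.perf_counter() - start)
    return count, best


if __name__ == "__main__":
    # Assumed template: select every element at a given level of the tree.
    template = ".//node{level}"
    # Vary tree depth only; fanout stays constant to isolate its effect.
    for depth in (2, 4, 6, 8):
        root = build_tree(depth=depth, fanout=3)
        query = instantiate(template, level=depth)
        matches, seconds = time_query(root, query)
        print(f"depth={depth} query={query} matches={matches} best={seconds:.6f}s")
```

Holding fanout fixed while stepping depth mirrors the one-factor-at-a-time design described above; swapping the loop to vary fanout at a fixed depth would isolate density instead.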

E-mail questions or comments to Hao Jin or Curtis.Dyreson at usu.edu