3 Approach and 3.1 Differential Testing for XML Processors
3.2 XPath Expression Generation
4.3 Comparison to the State of the Art
4.4 Analysis of BaseX Historical Bug Reports
6 Conclusion, Acknowledgments, and References
Figure 3 shows an overview of the approach using the same example as in Figure 1. At a high level, our approach consists of three main steps. First, we randomly generate an XML document as the context for the subsequent queries (step 1). We then generate an XPath expression that we subsequently validate (steps 2 to 5). Finally, we execute the XPath expression on the XML document using all engines under test and compare the resulting outputs to detect potential bugs (step 6). In the following subsections, we explain these steps in reverse order to reflect their importance.
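The overall test loop can be sketched as follows. This is an illustrative Python stand-in, not the paper's implementation; `generate_document`, `generate_xpath`, and the engine adapters are hypothetical placeholders for the components described above.

```python
import random

def generate_document(rng):
    # Step 1: randomly generate an XML document (trivial stand-in here).
    tag = rng.choice(["a", "b", "c"])
    return f"<root><{tag} id='n1'/></root>"

def generate_xpath(document, rng):
    # Steps 2-5: build an XPath expression guided by the document's content
    # (fixed here for illustration).
    return "//*[@id='n1']"

def differential_test(engines, iterations=10, seed=0):
    """engines: mapping from engine name to a callable run(xml_doc, xpath) -> node ids."""
    rng = random.Random(seed)
    bugs = []
    for _ in range(iterations):
        doc = generate_document(rng)
        query = generate_xpath(doc, rng)
        # Step 6: run every engine on the same test case and compare result sets.
        results = {name: run(doc, query) for name, run in engines.items()}
        if len({tuple(r) for r in results.values()}) > 1:
            bugs.append((doc, query, results))
    return bugs
```

A test case is only reported when at least two engines disagree, so the loop itself never needs to know the expected result.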
We guide XPath expression generation towards queries that reference nodes and attributes present in the XML document and produce non-empty result sets, based on the intuition that such queries are more likely to stress the underlying logic of the tested systems. To generate XPath expressions with non-empty result sets, we construct the query section by section and ensure that a non-empty result set is produced before proceeding to the next section. Each section consists of a section prefix and a predicate; we first generate the prefix (step 2) and then the predicate. By restricting the section prefix, we guarantee that the result contains at least one node. From the nodes selected by the section prefix, we randomly select one node as the target (step 3). We then generate a predicate that aims to select the targeted node using a bottom-up tree-construction method (step 4), and rectify the predicate to ensure that the result set contains the targeted node (step 5). We repeat this process until the XPath query reaches the desired length.
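The section-by-section construction can be sketched as below. All helper names and the flat node representation are illustrative assumptions; the real generator builds predicates bottom-up as expression trees, whereas this sketch uses a single attribute comparison per predicate to keep the rectification step visible.

```python
import random

def generate_query(doc_nodes, num_sections, rng):
    """doc_nodes: list of dicts {'tag': str, 'attrs': {name: value}}, each with an 'id' attribute."""
    query = ""
    context = doc_nodes  # nodes the query can still reach; starts with every node
    for _ in range(num_sections):
        # Step 2: pick a section prefix guaranteed to match at least one node.
        tag = rng.choice(sorted({n["tag"] for n in context}))
        matched = [n for n in context if n["tag"] == tag]
        # Step 3: randomly choose one matched node as the target.
        target = rng.choice(matched)
        # Step 4: generate a candidate predicate (here: an attribute comparison
        # sampled from some matched node, so it may miss the target).
        pool = rng.choice(matched)
        attr, value = rng.choice(sorted(pool["attrs"].items()))
        predicate = f"@{attr}='{value}'"
        selected = [n for n in matched if n["attrs"].get(attr) == value]
        # Step 5: rectify -- if the predicate excluded the target, OR it back in.
        if target not in selected:
            predicate = f"({predicate}) or @id='{target['attrs']['id']}'"
            selected = [n for n in matched
                        if n["attrs"].get(attr) == value or n is target]
        query += f"//{tag}[{predicate}]"
        context = selected  # non-empty by construction
    return query
```

Because the target is always re-added during rectification, every section's result set is non-empty, which is the invariant the text relies on before the next section is generated.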
As detailed subsequently, differential testing enables us to find both logic bugs as well as internal errors when comparing the results of XML processors implementing the same XPath standard.
Query execution. When passing XML documents and XPath queries to different systems, we must account for their different input interfaces. For example, DBMSs use database connection interfaces to store and query data, while Saxon can be used as a library. To abstract over these differences, we treat every XML processor implementation as a function that expects two string values, an XML document XML and an XPath query XPATH, and returns a result set. Listing 1 shows an implementation of this interface for Oracle Database using SQL statements. It creates a table t, inserts the XML document (the XMLType constructor converts the string to an XML data type), and uses an XMLQuery function call in a SELECT statement to compute the result set. For BaseX and eXist-db, similarly to the commands shown for Oracle Database, we also start with an empty database and subsequently insert the XML document. Listing 2 shows an excerpt of the Java code for Saxon. First, the call to compile converts the textual XPath query into an executable object, which is then loaded. Unlike the DBMSs, which require inserting data into a database, Saxon simply associates the XML document with the query via the setContextItem call. The evaluate call computes the result, which is returned for comparison.
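The shape of this unified interface can be illustrated with a minimal Python stand-in. This is not one of the paper's adapters: xml.etree.ElementTree supports only a small XPath subset, so the function below merely sketches the "two strings in, result set out" contract that each real adapter (Oracle Database, BaseX, eXist-db, Saxon) implements.

```python
import xml.etree.ElementTree as ET

def elementtree_adapter(xml_doc: str, xpath: str) -> list:
    """Wrap a processor as a function (XML, XPATH) -> result set of node ids."""
    root = ET.fromstring(xml_doc)
    # Identify each matched element by a unique node id attribute, mirroring
    # the node-id-based result comparison described in the text.
    return [elem.get("id") for elem in root.findall(xpath)]
```

For example, `elementtree_adapter("<r><a id='n1'/><a id='n2'/></r>", ".//a")` returns `["n1", "n2"]`. Each real engine gets its own adapter with the same signature, so the comparison logic never has to know how a given engine is driven.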
Bug identification. We identify both logic bugs and internal errors by comparing the results that different processors return for the same XML document and query. We report a logic bug when the tested systems return different node-set outputs for the same test case. To parse and track the results easily across different output formats, we use unique node ids to identify element nodes. We detect internal errors as discrepancies with respect to errors: rather than checking for an exact error-message match, we validate whether all systems produce an error or all execute the XPath query successfully. If only a subset of the systems report an error for the same XPath query, we have found a potential bug.
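The two checks can be sketched together in one comparison step. The engine adapters here are hypothetical callables of the form `run(xml_doc, xpath) -> list of node ids`, and raising an exception stands in for an engine-reported error.

```python
def check_discrepancy(engines, xml_doc, xpath):
    """Return None, or a (kind, details) tuple describing a potential bug."""
    results, errors = {}, {}
    for name, run in engines.items():
        try:
            results[name] = sorted(run(xml_doc, xpath))
        except Exception as exc:  # treat any raised exception as an engine error
            errors[name] = type(exc).__name__
    if errors and results:
        # Only a subset of the systems reported an error: potential internal error.
        return ("internal-error", errors)
    if len({tuple(r) for r in results.values()}) > 1:
        # Same test case, different node-id sets: potential logic bug.
        return ("logic-bug", results)
    return None
```

Note that no exact error messages are compared: a discrepancy exists only when some engines error out while others succeed, matching the all-or-none criterion above.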
Different XPath standards. Our approach and tool are applicable to both XPath 1.0 and XPath 3.0. However, due to the differences between the standards, only processors implementing the same standard can be tested against each other. Functionality that is supported only in XPath 3.0 can be disabled when generating test cases for processors that implement XPath 1.0. For example, sequence functions, such as subsequence, are defined only in the XPath 3.0 standard; when generating test cases for XPath 1.0 processors, we do not generate subsequence function nodes for predicates. We did not encounter any functions or operators that were removed in the XPath 3.0 standard, so all expressions that we generate when testing XPath 1.0 processors can also be used when testing XPath 3.0 processors. Because we compare only processors implementing the same XPath standard against each other, the differences in results between XPath standards (see Section 2) have no influence on the testing process.
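Version gating can be realized by maintaining per-standard function pools for the generator. The pools below are small illustrative examples, not the tool's actual function lists; the only fact taken from the text is that subsequence must be excluded when targeting XPath 1.0 processors.

```python
# Functions available in XPath 1.0 (core function library; illustrative subset).
XPATH_10_FUNCTIONS = ["count", "contains", "substring"]
# Functions unavailable in XPath 1.0, such as the sequence function subsequence.
XPATH_30_ONLY_FUNCTIONS = ["subsequence"]

def function_pool(version: str) -> list:
    """Return the functions the generator may emit for the given standard."""
    pool = list(XPATH_10_FUNCTIONS)
    if version == "3.0":
        pool += XPATH_30_ONLY_FUNCTIONS
    return pool
```

Because the XPath 1.0 pool is a subset of the 3.0 pool, every query generated for 1.0 processors remains valid input when testing 3.0 processors, as the text observes.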
This paper is available on arxiv under CC BY 4.0 DEED license.
Authors:
(1) Shuxin Li, Southern University of Science and Technology, China; work done during an internship at the National University of Singapore ([email protected]);
(2) Manuel Rigger, National University of Singapore, Singapore ([email protected]).