3 Approach and 3.1 Differential Testing for XML Processors
3.2 XPath Expression Generation
4.3 Comparison to the State of the Art
4.4 Analysis of BaseX Historical Bug Reports
6 Conclusion, Acknowledgments, and References
Extensible Markup Language (XML) is a widely used file format for data storage and transmission. Many XML processors support XPath, a query language that enables the extraction of elements from XML documents. These systems can be affected by logic bugs, which are bugs that cause the processor to return incorrect results. In order to tackle such bugs, we propose a new approach, which we realized as a system called XPress. As a test oracle, XPress relies on differential testing, which compares the results of multiple systems on the same test input, and identifies bugs through discrepancies in their outputs. As test inputs, XPress generates both XML documents and XPath queries. Aiming to generate meaningful queries that compute non-empty results, XPress selects a so-called targeted node to guide the XPath expression generation process. Using the targeted node, XPress generates XPath expressions that reference existing context related to the targeted node, such as its tag name and attributes, while also guaranteeing that a predicate evaluates to true before further expanding the query. We tested our approach on six mature XML processors, BaseX, eXist-DB, Saxon, PostgreSQL, libXML2, and a commercial database system. In total, we have found 27 unique bugs in these systems, of which 25 have been verified by the developers, and 20 of which have been fixed. XPress is efficient, as it finds 12 unique bugs in BaseX in 24 hours, which is 2× as fast as naive random generation. We expect that the effectiveness and simplicity of our approach will help to improve the robustness of many XML processors.
CCS CONCEPTS
• Software and its engineering → Software testing and debugging
KEYWORDS
XML processors, XPath generation, differential testing
Extensible Markup Language (XML) is a widely used file format for data storage and transmission. XPath is an expression language, which provides the ability to navigate through XML documents to select wanted nodes. XPath is also at the core of other XML query language standards such as XSLT [7] and XQuery [6], making it a fundamental XML query language.
Various XML document processors have been developed for extracting information from XML documents efficiently. We loosely categorize them depending on whether they can store XML documents in addition to processing them—that is, whether they are Database Management Systems (DBMSs), or provide only processing functionality. In terms of DBMSs specialized for XML, popular examples include BaseX [8] and eXist-DB [10]. Many general-purpose DBMSs such as Oracle Database [14], MySQL [13], and PostgreSQL [15] have adopted support for processing XML documents. In fact, out of the 10 most popular DBMSs according to the DB-engines ranking [9], 6 support at least partial XML document parsing. A popular example of a processor without storage functionality is Saxon. Saxon [18] is an in-memory XML processor that can be either used in a standalone way or embedded as a library. Finally, libxml2 [12] is a popular XML processing library written in C. XPath is supported by all of these processors.
XML document processors can be affected by logic bugs. Logic bugs are bugs that cause the XML processor to produce incorrect results without crashing the system, meaning that they can often go unnoticed. In order to find such bugs, developers mainly rely on test suites such as the XPathMark test suite for XPath [25], the W3C qt3 test suite [19], or hand-written unit tests. Manually writing tests requires much effort, and it is challenging to comprehensively cover the XML processors’ functionality. To find logic bugs automatically, a so-called test oracle is required that can determine whether the system’s output is correct in order to find logic bugs. Todic and Uzelac have proposed an automated testing technique for SQLServer’s index support; their test oracle compared the results of a given query with and without index definition [41]. A limitation of this technique is that it is applicable only to finding index-related bugs in DBMSs. To the best of our knowledge, no other test oracles have been proposed in this context.
In order to detect XPath-related bugs in XML processors, we propose differential testing as an oracle. The core idea of differential testing is to use one input that is executed using multiple systems; any discrepancy in the results indicates a potential bug in the system. For testing XML processors, the input for the XML processors under test is an XML document and XPath expression, and the results are a sequence of XML nodes or values. Differential testing has been successfully applied in various related domains, such as relational DBMSs [38], compilers [43, 45], JVM implementations [23], ORM systems [39], and graph DBMSs [44, 46]. Its effectiveness hinges on two main requirements. First, multiple systems to be compared must be available. As discussed above, various XML processors with XPath support exist. Second, for any valid input, the systems should produce the same result, since otherwise, a differential-testing approach raises many false alarms. This requirement is not always met, for example, when applying differential testing to relational DBMSs, where the “common SQL subset is relatively small and changes with each release" and NULL handling differs between DBMSs [38]. As we found, XPath is a well-defined language by the W3C standard, and XPath implementations of the same standard follow the same language rules, making differential testing highly applicable.
To generate test cases, we propose an approach that selects a so-called targeted node from the XML document, based on which we generate a query that is guaranteed to fetch at least that node. As such, it tackles two challenges that might prevent testing from exercising interesting behaviors. First, by generating the query based on the targeted node, we can guarantee that we access a tag name, attributes, and relative paths that exist with respect to at least the targeted node. Second, by rectifying predicates so that they evaluate to true for the targeted node, we can ensure that the result set is non-empty even for complex queries. A similar highlevel idea has been proposed in the context of testing relational DBMSs, called Pivoted Query Synthesis (PQS) [36], where a pivot row was selected, based on which predicates were rectified to return true. Apart from applying that idea in a different context, we also propose a different rectification strategy that eschews mirroring the predicate’s execution logic in the testing tool, which was required for realizing PQS.
We implemented our approach as a tool named XPress,[1] which, to the best of our knowledge, is the first general automated testing tool for XML processors, and tested our method on six mature and widely-used XML processors BaseX, eXist-DB, Saxon, PostgreSQL, libXML2 and a commercial DBMS. The experimental results show that our approach is effective in detecting XPath-related logic bugs in XML processors. We found 27 previously unknown unique bugs, not covered by existing test suites, of which 19 were logic bugs. 25 of them have been confirmed, and 20 of them have been fixed. Furthermore, these test cases have been integrated into the aforementioned qt3 test suite, so that they can detect potential bugs in XML processors that we have not tested. Our experiments demonstrate that our proposed guided query generation process improves testing efficiency by finding 2× more unique bugs within 24 hours in BaseX as compared to random generation. Given the high effectiveness and efficiency of the approach, we believe it will likely be adopted by developers of XML processors to improve their systems.
To summarize, we make the following contributions:
• We propose the first general approach for automatically testing XML processors in order to find logic bugs.
• We implemented and evaluated our approach on six widelyused XML process systems, which successfully found 27 previously-unknown unique bugs.
This paper is available on arxiv under CC BY 4.0 DEED license.
[1] Our artifact is publicly available at https://zenodo.org/records/10473926
Authors:
(1) Shuxin Li, Southern University of Science and Technology China and Work done during an internship at the National University of Singapore ([email protected]);
(2) Manuel Rigger, National University of Singapore Singapore ([email protected]).