Mapping the data to MySQL 2: Using XSLT
Another approach to mapping the dataset would be to use XSLT to write out the necessary SQL Schema and Insert statements. We could then feed this "rendering" to MySQL using a PHP built pipe.
XSLT (as a xml dom to other format converter) was created to get this kind of job done. Hence XSLT is a great tool for generating code.
"I'd rather write programs that write programs than write programs" said ap.
We will stick to using XSLT 1.0 & the oXygen editor. If you don't have access to OXygen then use a tool like Kernow.
(On the second semester sister module ATWD2 we will consider XSLT 2.0/3.0, XPATH 3.0 & XQuery in much more detail. For instance, using XSLT to generate PHP code.)
Re-exmine the source file: transport_v1.xml
It's always good to examine the source file/s (once again) before embarking on a XSL stylesheet.
We won't make any structural changes to our input file but notice the following:
- the by 'car' percent value is derived. A sum of 'as driver' + 'as passenger';
- hence the last seven percent values for any ward should add up to 100.
(Points worth recalling when we come to develop a Schema.)
The eXtensible Stylesheet Language
XSL is a family of (W3C) recommendations for defining XML document transformation and presentation (formatting).
XSL is really a pattern oriented transformation language. XPATH is used to find the patterns in one or more input documents. When the (user defined) patterns are matched, XSL applies the transformations associated with those patterns and writes out the result.
This makes XSLT (the implementation of XSL) like SQL & Prolog, a declartive programming language. Expressing the logic of the computation rather than the control flow. This is in contrast to imperative programming languages (like PHP) where one or more algorithms are implemented in terms of explicit steps.
To write XSL productively you need some XML aware tools. oXygen and Kernow are great because you can load a source document and try out various XPath expressions interactively. The other reason is that Namespaces play a crucial role when developing with the XML family of technologies. Tools that are namespace aware can help greatly.
Developing the Stylesheet
In the script that follows we've used XSLT 1.0 & XPATH 1.0.
Note how the script takes advantage of the well-ordered data in the input file (i.e. the mode elements are in the same order within each ward element). This order (position) is used to generate keys.
That is, the position of the (parent) ward element in relation to its siblings is worked out using the xpath expression: count(../preceding-sibling::*) + 1 (line 25).
(Question: why is this order not significant for the PHP script (xmltodb.php) and how might we go about not relying on the order in the XSLT code.)
View the output file: transport_inserts.sql
Deconstruct the above script for questions, clarifications and criticisms. You can comment in the workshop sessions or ask a question below.