The original input file is a simple CSV file with 7 fields and 316 records. The first record is a header row.
We can process this CSV file and generate a simple XML file very easily.
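As a rough illustration of that step, here is a minimal sketch that reads CSV rows with `fgetcsv()` and writes them out with `XMLWriter`. The `<records>` and `<record>` element names, the `csvToXml()` helper and the sample column names and values are assumptions made for the example, not the layout of the real file.

```php
<?php
// Sketch only: turn CSV rows into a flat XML document.
// <records>/<record> and the sample columns below are assumptions.
function csvToXml($handle): string
{
    // The first record is the header row; its values become element names.
    $header = fgetcsv($handle, 0, ',', '"', '\\') ?: [];
    $names = [];
    foreach ($header as $name) {
        // Replace characters that are not valid in this simple name scheme.
        $names[] = preg_replace('/[^A-Za-z0-9_]/', '_', trim((string) $name));
    }

    $xml = new XMLWriter();
    $xml->openMemory();
    $xml->setIndent(true);
    $xml->startDocument('1.0', 'UTF-8');
    $xml->startElement('records');

    while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($fields === [null]) {
            continue; // fgetcsv() reports a blank line as [null]
        }
        $xml->startElement('record');
        foreach ($names as $i => $name) {
            $xml->writeElement($name, $fields[$i] ?? '');
        }
        $xml->endElement(); // record
    }

    $xml->endElement(); // records
    $xml->endDocument();
    return $xml->outputMemory();
}

// Usage with a small in-memory sample (column names and values are made up).
$handle = fopen('php://memory', 'r+');
fwrite($handle, "Ward,Year,Location\nCentral,2010,\"51.45, -2.59\"\nEaston,2011,\"51.46, -2.57\"\n");
rewind($handle);
echo csvToXml($handle);
fclose($handle);
```

The sketch assumes every header value is usable as an XML element name once odd characters are replaced; a header that starts with a digit would still need extra handling.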
The output file isn't optimal for processing and contains some repeated (unnecessary) data: the location node repeats data already in the latitude and longitude nodes, and the year value is repeated too.
However, we've taken a big leap: the data is now in a DOM-compatible format!
For readability and easier processing, we would prefer a structure like the following:
For this transformation we can use SimpleXML, PHP's built-in DOM facade library, or XSLT (which was built for such XML-to-XML transformations).
Unfortunately, current PHP builds don't provide native support for XPath 2.0/3.0, so we have to make do with XPath 1.0.
XPath 2.0/3.0 provide very powerful built-in functions such as distinct-values(), so an expression like distinct-values(//record/Ward) would return the sequence of unique ward values.
(We could get around this by installing and using an external library like Saxon, but let's use what we have: the current UWE/CEMS setup and XPath 1.0.)
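XPath 1.0 can still produce a list of unique values with a well-known idiom: keep a node only if no earlier node has the same value. The sketch below applies it through SimpleXML's `xpath()` method. The sample document and its ward names are made up for the example, and the expression assumes all `record` elements are siblings.

```php
<?php
// Sample data; the <records>/<record> layout and ward names are assumptions.
$doc = simplexml_load_string(
    '<records>
        <record><Ward>Central</Ward></record>
        <record><Ward>Easton</Ward></record>
        <record><Ward>Central</Ward></record>
    </records>'
);

// Keep a Ward node only if no preceding sibling record has the same Ward value.
$nodes = $doc->xpath('//record/Ward[not(. = ../preceding-sibling::record/Ward)]');

// xpath() returns SimpleXMLElement objects; cast them to plain strings.
$wards = array_map(function ($node) {
    return (string) $node;
}, $nodes);

print_r($wards);
```

For this sample the array holds Central and Easton, in document order. Each node is compared against all earlier records, so the cost grows quadratically, which is not a concern for a few hundred records.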
Let's first consider a PHP-only solution using the SimpleXML and XMLWriter modules and XPath 1.0.
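A minimal sketch of such a PHP-only approach might look like the following. It regroups flat records under one element per ward. The input layout, the output element names (`<wards>`, `<ward name="...">`) and the `groupByWard()` helper are all assumptions for illustration, not the schema used to build the generated file.

```php
<?php
// Sketch only: regroup flat <record> elements under one <ward> per ward.
// Input and output element names are assumptions, not the post's schema.
function groupByWard(string $flatXml): string
{
    $doc = simplexml_load_string($flatXml);

    $out = new XMLWriter();
    $out->openMemory();
    $out->setIndent(true);
    $out->startDocument('1.0', 'UTF-8');
    $out->startElement('wards');

    // XPath 1.0: select the first occurrence of each Ward value.
    $unique = $doc->xpath('//record/Ward[not(. = ../preceding-sibling::record/Ward)]');
    foreach ($unique as $wardNode) {
        $ward = (string) $wardNode;
        $out->startElement('ward');
        $out->writeAttribute('name', $ward);

        // Compare in PHP rather than XPath to avoid quoting issues in literals.
        foreach ($doc->record as $record) {
            if ((string) $record->Ward !== $ward) {
                continue;
            }
            $out->startElement('record');
            foreach ($record->children() as $field) {
                // The ward now lives on the parent, so it is not repeated here.
                if ($field->getName() !== 'Ward') {
                    $out->writeElement($field->getName(), (string) $field);
                }
            }
            $out->endElement(); // record
        }
        $out->endElement(); // ward
    }

    $out->endElement(); // wards
    $out->endDocument();
    return $out->outputMemory();
}

// Usage with made-up sample data.
$flat = '<records>'
      . '<record><Ward>Central</Ward><Year>2010</Year></record>'
      . '<record><Ward>Easton</Ward><Year>2010</Year></record>'
      . '<record><Ward>Central</Ward><Year>2011</Year></record>'
      . '</records>';
echo groupByWard($flat);
```

Moving the ward name onto a parent element is one way to remove a value that would otherwise repeat in every record; the same pattern could be applied to other repeated fields.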
You can view the generated file, transport_v1.xml.
We'll next use XSLT 1.0 to do the same task and take a look at XSLT 2.0/3.0.