Debugging XProc is mostly a process of elimination, sometimes helped along by hints from the error messages. What follows is an example of debugging XProc with oXygen; the process is similar in other editors. It is more of a story showing how debugging XProc can be done than a document for looking up solutions to your particular problem.
This is the XProc script we are working on:
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step version="1.0"
xmlns:p="http://www.w3.org/ns/xproc"
xmlns:pef="http://www.daisy.org/ns/2008/pef"
xmlns:px="http://www.daisy.org/ns/pipeline/xproc"
xmlns:l="http://xproc.org/library"
exclude-inline-prefixes="#all"
type="pef:validate"
name="main">
<p:input port="source" primary="true"/>
<p:output port="result" primary="true"/>
<p:option name="assert-valid" required="false" select="'false'"/>
<p:import href="http://www.daisy.org/pipeline/modules/validation-utils/library.xpl"/>
<p:variable name="document-type" select="'PEF'"/>
<p:variable name="base-uri" select="base-uri()"/>
<p:variable name="document-name" select="tokenize($base-uri, '/')[last()]"/>
<l:relax-ng-report name="validate-against-relaxng">
<p:input port="schema">
<p:document href="schema/pef-2008-1.rng"/>
</p:input>
<p:input port="source">
<p:pipe step="main" port="source"/>
</p:input>
</l:relax-ng-report>
<px:combine-validation-reports name="combined-error-report">
<p:with-option name="document-type" select="$document-type"/>
<p:with-option name="document-name" select="$document-name"/>
<p:input port="source">
<p:pipe port="report" step="validate-against-relaxng"/>
</p:input>
</px:combine-validation-reports>
<px:validation-report-to-html name="html-report">
<p:input port="source">
<p:pipe port="result" step="combined-error-report"/>
</p:input>
</px:validation-report-to-html>
</p:declare-step>
The script has one required input port and no required options. We create a scenario in oXygen where we reference the following input document on the input port:
<?xml version="1.0" encoding="UTF-8"?>
<pef xmlns="http://www.daisy.org/ns/2008/pef" version="2008-1">
<head xmlns:dc="http://purl.org/dc/elements/1.1/">
<meta>
<dc:date>2015-12-20</dc:date>
<dc:format>application/x-pef+xml</dc:format>
<dc:identifier>X</dc:identifier>
</meta>
</head>
<body>
<volume rows="10" cols="10" rowgap="0" duplex="true">
<section>
<page>
<row>⠁⠁⠁⠁</row>
<row/>
<row/>
<row/>
<row/>
<row/>
<row/>
<row/>
<row/>
<row/>
</page>
</section>
</volume>
</body>
</pef>
Our XProc script has one p:import, and it references a file that is not on our file system. We find (or make) a local copy of the project containing that file on our computer, and add that project's catalog.xml file to oXygen through Preferences -> XML -> XML Catalog.
Note that if the referenced project is part of a repository with other projects, then there will often be a "main catalog" in that repository importing all the sub-projects' catalogs, so that you don't have to add all sub-project catalogs to oXygen one by one.
<p:import href="http://www.daisy.org/pipeline/modules/validation-utils/library.xpl"/>
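For readers who have not seen an XML catalog before, here is a minimal sketch of what such a file could contain. The element names come from the OASIS XML Catalogs format; the relative paths and the sibling project are assumptions made up for illustration and are not taken from the actual repository.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <!-- Maps the public URL used in p:import to a local file.
         The local path is hypothetical. -->
    <uri name="http://www.daisy.org/pipeline/modules/validation-utils/library.xpl"
         uri="../xml/xproc/validation-utils-library.xpl"/>
    <!-- A "main catalog" would mostly consist of entries like this one,
         each pointing at a sub-project's own catalog (hypothetical path). -->
    <nextCatalog catalog="../../some-other-module/META-INF/catalog.xml"/>
</catalog>
```

With a catalog like this registered, the editor resolves the URL in the p:import to the local file instead of trying to fetch it.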
We get this error in oXygen:
Scenario: validate.xpl
XProc file: /home/jostein/daisy-pipeline/pipeline-modules/pipeline-mod-braille/pipeline-braille-utils/pef-utils/pef-utils/src/main/resources/xml/validate.xpl
Engine name: Calabash XProc
Severity: error
Description: err:XC0053 : XC0053 It is a dynamic error if the assert-valid option is true and the input document is not valid.
So the error says that we're getting an XC0053 error. We look this up in the XProc specification and find that this is an error thrown by the steps p:validate-with-relax-ng and p:validate-with-xml-schema.
Reading up on the descriptions of those steps in the specification doesn't really bring us any further.
Since there's no line reference in the error message, we first need to locate where the error is thrown. We start commenting out the XProc script bit by bit, starting from the end and re-running the script each time to see if the error disappears. Luckily this is a strictly linear script where the default output port of one step connects to the default input port of the next step, and so on, so we don't need to modify any port connections when commenting stuff out.
First we comment out px:validation-report-to-html:
<!--<px:validation-report-to-html name="html-report">
<p:input port="source">
<p:pipe port="result" step="combined-error-report"/>
</p:input>
</px:validation-report-to-html>-->
We re-run the script, but we still get the same error. So we know that the error is not with px:validation-report-to-html. Next, we comment out px:combine-validation-reports:
<!--<px:combine-validation-reports name="combined-error-report">
<p:with-option name="document-type" select="$document-type"/>
<p:with-option name="document-name" select="$document-name"/>
<p:input port="source">
<p:pipe port="report" step="validate-against-relaxng"/>
</p:input>
</px:combine-validation-reports>-->
We re-run the script, and voilà; no error! So there's something wrong with px:combine-validation-reports.
But what? We need to dig deeper...
We look through our p:imports, and in this case there's only one, so there aren't many places to look:
<p:import href="http://www.daisy.org/pipeline/modules/validation-utils/library.xpl"/>
In oXygen, you can place your cursor on the URL and press CTRL + enter to open that file. The URL will be resolved to a file on our local file system because we have our catalogs set up properly.
The file we find turns out to be named validation-utils-library.xpl and is an XProc library file importing other XProc scripts. There's a reference to a file called combine-validation-reports.xpl, so we open that one:
<p:library version="1.0" (...)>
(...)
<p:import href="combine-validation-reports.xpl">
<p:documentation>Utility step that combines many validation reports into one XML document.</p:documentation>
</p:import>
</p:library>
In combine-validation-reports.xpl we find the declaration of px:combine-validation-reports, which is the step we're having trouble with:
<p:declare-step version="1.0" name="combine-validation-reports" type="px:combine-validation-reports"
xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step"
xmlns:px="http://www.daisy.org/ns/pipeline/xproc"
xmlns:pxi="http://www.daisy.org/ns/pipeline/xproc/internal"
xmlns:tmp="http://www.daisy.org/ns/pipeline/tmp" xmlns:d="http://www.daisy.org/ns/pipeline/data"
xmlns:l="http://xproc.org/library" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:dtb="http://www.daisy.org/z3986/2005/dtbook/"
xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:svrl="http://purl.oclc.org/dsdl/svrl"
exclude-inline-prefixes="#all">
<p:documentation xmlns="http://www.w3.org/1999/xhtml">
<h1 px:role="name">Combine validation reports</h1>
<p px:role="desc">Wrap one or more validation reports and optional document data. This
prepares it for the validation-report-to-html step.</p>
</p:documentation>
<p:input port="source" primary="true" sequence="true">
<p:documentation xmlns="http://www.w3.org/1999/xhtml">
<h1 px:role="name">source</h1>
<p px:role="desc">A validation report</p>
</p:documentation>
</p:input>
<p:option name="document-name" required="false">
<p:documentation xmlns="http://www.w3.org/1999/xhtml">
<h1 px:role="name">document-name</h1>
<p px:role="desc">The name of the document that was validated. Used for display
purposes.</p>
</p:documentation>
</p:option>
<p:option name="document-type" required="false">
<p:documentation xmlns="http://www.w3.org/1999/xhtml">
<h1 px:role="name">document-type</h1>
<p px:role="desc">The type of the document. Used for display purposes.</p>
</p:documentation>
</p:option>
<p:option name="document-path" required="false" select="''">
<p:documentation xmlns="http://www.w3.org/1999/xhtml">
<h1 px:role="name">document-path</h1>
<p px:role="desc">The full path to the document, if available.</p>
</p:documentation>
</p:option>
<p:option name="report-path" required="false" select="''">
<p:documentation xmlns="http://www.w3.org/1999/xhtml">
<h1 px:role="name">report-path</h1>
<p px:role="desc">The path to the validation report XML, if available.</p>
</p:documentation>
</p:option>
<p:option name="internal-info" required="false" select="''">
<p:documentation xmlns="http://www.w3.org/1999/xhtml">
<h1 px:role="name">internal-info</h1>
<p px:role="desc">A string to stash in the document-info/@internal attribute.</p>
</p:documentation>
</p:option>
<p:output port="result" primary="true"/>
<p:import href="http://xmlcalabash.com/extension/steps/library-1.0.xpl"/>
<!-- iterate through the documents on the source port -->
<p:for-each>
<p:variable name="root-element-name" select="*/name()"/>
<p:wrap match="/" wrapper="report" wrapper-prefix="d"
wrapper-namespace="http://www.daisy.org/ns/pipeline/data"/>
<p:choose>
<p:when test="$root-element-name = 'c:errors'">
<p:add-attribute match="d:report">
<p:with-option name="attribute-name" select="'type'"/>
<p:with-option name="attribute-value" select="'relaxng'"/>
</p:add-attribute>
</p:when>
<p:when test="$root-element-name = 'd:errors'">
<p:add-attribute match="d:report">
<p:with-option name="attribute-name" select="'type'"/>
<p:with-option name="attribute-value" select="'filecheck'"/>
</p:add-attribute>
</p:when>
<p:when test="$root-element-name = 'svrl:schematron-output'">
<p:add-attribute match="d:report">
<p:with-option name="attribute-name" select="'type'"/>
<p:with-option name="attribute-value" select="'schematron'"/>
</p:add-attribute>
</p:when>
<p:otherwise>
<p:add-attribute match="d:report">
<p:with-option name="attribute-name" select="'type'"/>
<p:with-option name="attribute-value" select="'unknown'"/>
</p:add-attribute>
</p:otherwise>
</p:choose>
</p:for-each>
<p:wrap-sequence name="combine-reports" wrapper="reports"
wrapper-namespace="http://www.daisy.org/ns/pipeline/data" wrapper-prefix="d"/>
<p:insert position="last-child">
<p:input port="insertion">
<p:pipe port="result" step="combine-reports"/>
</p:input>
<p:input port="source">
<p:inline>
<d:document-validation-report>
<d:document-info/>
</d:document-validation-report>
</p:inline>
</p:input>
</p:insert>
<p:group name="add-document-metadata">
<p:output port="result"/>
<p:choose>
<p:when test="string-length($document-path) > 0">
<p:insert match="d:document-validation-report/d:document-info"
position="first-child">
<p:input port="insertion">
<p:inline>
<d:document-path>@@</d:document-path>
</p:inline>
</p:input>
</p:insert>
<p:string-replace match="//d:document-path/text()">
<p:with-option name="replace"
select="concat('&quot;', $document-path, '&quot;')"/>
</p:string-replace>
</p:when>
<p:otherwise>
<p:identity/>
</p:otherwise>
</p:choose>
<p:choose>
<p:when test="string-length($document-type) > 0">
<p:insert match="d:document-validation-report/d:document-info"
position="first-child">
<p:input port="insertion">
<p:inline>
<d:document-type>@@</d:document-type>
</p:inline>
</p:input>
</p:insert>
<p:string-replace match="//d:document-type/text()">
<p:with-option name="replace"
select="concat('&quot;', $document-type, '&quot;')"/>
</p:string-replace>
</p:when>
<p:otherwise>
<p:identity/>
</p:otherwise>
</p:choose>
<p:choose>
<p:when test="string-length($document-name) > 0">
<p:insert match="d:document-validation-report/d:document-info"
position="first-child">
<p:input port="insertion">
<p:inline>
<d:document-name>@@</d:document-name>
</p:inline>
</p:input>
</p:insert>
<p:string-replace match="//d:document-name/text()">
<p:with-option name="replace"
select="concat('&quot;', $document-name, '&quot;')"/>
</p:string-replace>
</p:when>
<p:otherwise>
<p:identity/>
</p:otherwise>
</p:choose>
<p:choose>
<p:when test="string-length($report-path) > 0">
<p:insert match="d:document-validation-report/d:document-info" position="last-child">
<p:input port="insertion">
<p:inline>
<d:report-path>@@</d:report-path>
</p:inline>
</p:input>
</p:insert>
<p:string-replace match="//d:report-path/text()">
<p:with-option name="replace" select="concat('&quot;', $report-path, '&quot;')"
/>
</p:string-replace>
</p:when>
<p:otherwise>
<p:identity/>
</p:otherwise>
</p:choose>
<p:choose>
<p:when test="string-length($internal-info) > 0">
<p:add-attribute match="d:document-validation-report/d:document-info">
<p:with-option name="attribute-name" select="'internal'"/>
<p:with-option name="attribute-value" select="$internal-info"/>
</p:add-attribute>
</p:when>
<p:otherwise>
<p:identity/>
</p:otherwise>
</p:choose>
</p:group>
<p:choose>
<p:when test="//c:errors">
<!-- replace RelaxNG's c:error elements with our own d:error elements. This reduces the number of types of error descriptions we have to deal with. -->
<p:group name="replace-cerror-with-derror">
<!-- convert c:errors to d:errors -->
<p:xslt name="cerror-to-derror-xsl">
<p:input port="stylesheet">
<p:document href="../xslt/cerrors-to-derrors.xsl"/>
</p:input>
<p:input port="parameters">
<p:empty/>
</p:input>
<p:input port="source" select="//c:errors"/>
</p:xslt>
<!-- replace c:errors with the results of the conversion -->
<p:replace match="//c:errors">
<p:input port="replacement">
<p:pipe port="result" step="cerror-to-derror-xsl"/>
</p:input>
<p:input port="source">
<p:pipe port="result" step="add-document-metadata"/>
</p:input>
</p:replace>
</p:group>
</p:when>
<p:otherwise>
<p:identity/>
</p:otherwise>
</p:choose>
<p:group name="add-error-count">
<p:variable name="error-count"
select="count(//d:error) + count(//svrl:failed-assert) + count(//svrl:successful-report)"/>
<p:insert match="d:document-validation-report/d:document-info" position="last-child">
<p:input port="insertion">
<p:inline>
<d:error-count>@@</d:error-count>
</p:inline>
</p:input>
</p:insert>
<p:string-replace match="//d:error-count/text()">
<p:with-option name="replace" select="concat('&quot;', $error-count, '&quot;')"/>
</p:string-replace>
</p:group>
<p:validate-with-relax-ng assert-valid="true">
<p:input port="schema">
<p:document href="../schema/document-validation-report.rng"/>
</p:input>
</p:validate-with-relax-ng>
</p:declare-step>
So, where to start? We could start commenting out parts to locate where the error occurs. However, there's one step that already looks suspicious: the p:validate-with-relax-ng at the end. We know that the error comes from an invocation of either p:validate-with-relax-ng or p:validate-with-xml-schema, and this step is easy to comment out without breaking the XProc script, so we try that:
<p:declare-step version="1.0" name="combine-validation-reports" type="px:combine-validation-reports" (...)>
(...)
<!--<p:validate-with-relax-ng assert-valid="true">
<p:input port="schema">
<p:document href="../schema/document-validation-report.rng"/>
</p:input>
</p:validate-with-relax-ng>-->
</p:declare-step>
Now we re-run our script, making sure that px:combine-validation-reports is not commented out in validate.xpl, and see what happens. And it succeeds!
Right, so why does commenting out this last validation step help? Is that step needed? Is it a bug in px:combine-validation-reports, or are we just using it wrong?
Well, to shed some light on this, let's see what the input to the p:validate-with-relax-ng that we commented out looks like. We re-enable the p:validate-with-relax-ng and add a p:identity step with a p:log right before it:
<p:declare-step version="1.0" name="combine-validation-reports" type="px:combine-validation-reports" (...)>
(...)
<p:identity>
<p:log port="result" href="file:/tmp/out.xml"/>
</p:identity>
<p:validate-with-relax-ng assert-valid="true">
<p:input port="schema">
<p:document href="../schema/document-validation-report.rng"/>
</p:input>
</p:validate-with-relax-ng>
</p:declare-step>
Then we re-run the script, and open /tmp/out.xml, which looks like this:
<px:document-sequence xmlns:px='http://xmlcalabash.com/ns/document-sequence'
port='result'
xpl-file='file:/home/jostein/daisy-pipeline/pipeline-modules/pipeline-modules-common/validation-utils/src/main/resources/xml/xproc/combine-validation-reports.xpl'
xpl-line='251'
dateTime='2016-01-21T13:11:55+01:00'>
<px:document>
<d:document-validation-report xmlns:d="http://www.daisy.org/ns/pipeline/data">
<d:document-info>
<d:document-name>pef_valid.pef</d:document-name>
<d:document-type>PEF</d:document-type>
<d:error-count>0</d:error-count>
</d:document-info>
<d:reports/></d:document-validation-report>
</px:document>
</px:document-sequence>
There are some wrapper elements from Calabash's logging here, so we unwrap the px:document-sequence and px:document and end up with this document:
<d:document-validation-report xmlns:d="http://www.daisy.org/ns/pipeline/data">
<d:document-info>
<d:document-name>pef_valid.pef</d:document-name>
<d:document-type>PEF</d:document-type>
<d:error-count>0</d:error-count>
</d:document-info>
<d:reports/></d:document-validation-report>
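Unwrapping by hand is perfectly fine for a single small document. If you find yourself doing it often, a small stylesheet can do it for you. The following is a minimal sketch, assuming the log always has the px:document-sequence / px:document shape shown above; the prefix "log" is our own choice, bound to the namespace URI seen in the log file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:log="http://xmlcalabash.com/ns/document-sequence">
    <!-- Copy the content of the first logged document and drop the wrappers.
         The wrapper holds a sequence, so there could be more than one
         document in the log; only the first one is kept here. -->
    <xsl:template match="/">
        <xsl:copy-of select="log:document-sequence/log:document[1]/*"/>
    </xsl:template>
</xsl:stylesheet>
```

The copied element may carry a leftover xmlns:px namespace declaration inherited from the wrapper. That declaration is unused and should not affect validation.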
So let's try to validate this document using the RNG referenced by p:validate-with-relax-ng. We need the full path to document-validation-report.rng, and one way to get it (there are many) is to select the href to it, press CTRL + enter to open it, then run the XPath base-uri() in oXygen, copy the output from that and paste it somewhere. However you get it, you'll end up with something like "file:/home/jostein/daisy-pipeline/pipeline-modules/pipeline-modules-common/validation-utils/src/main/resources/xml/schema/document-validation-report.rng" (remember to put the file: scheme at the beginning).
Now add a reference to this file as an xml-model processing instruction to the output we received earlier:
<?xml-model href="file:/home/jostein/daisy-pipeline/pipeline-modules/pipeline-modules-common/validation-utils/src/main/resources/xml/schema/document-validation-report.rng"?>
<d:document-validation-report xmlns:d="http://www.daisy.org/ns/pipeline/data">
<d:document-info>
<d:document-name>pef_valid.pef</d:document-name>
<d:document-type>PEF</d:document-type>
<d:error-count>0</d:error-count>
</d:document-info>
<d:reports/></d:document-validation-report>
You can alternatively create a validation scenario with the RNG file, but I personally find it easiest to just add the xml-model instruction.
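The processing instruction shown above relies on the editor recognizing the schema language from the .rng file extension. If that does not work in your editor, the schema language can be stated explicitly. This is a sketch using the schematypens and type pseudo-attributes from the xml-model specification; whether your editor needs them is an assumption to check for yourself.

```xml
<!-- Same reference as before, with the schema language spelled out as RELAX NG.
     The document element follows unchanged after this instruction. -->
<?xml-model href="file:/home/jostein/daisy-pipeline/pipeline-modules/pipeline-modules-common/validation-utils/src/main/resources/xml/schema/document-validation-report.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
```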
Running validation in oXygen on this document gives this error:
System ID: /tmp/out.xml
Main validation file: /tmp/out.xml
Schema: /home/jostein/daisy-pipeline/pipeline-modules/pipeline-modules-common/validation-utils/src/main/resources/xml/schema/document-validation-report.rng
Engine name: Jing
Severity: error
Description: element "d:document-info" incomplete; missing required element "d:document-path"
Start location: 3:6
End location: 3:21
Ok! So there is actually a problem with this document. A d:document-path element seems to be missing inside the d:document-info element.
Where should that element have been inserted? We look back at combine-validation-reports.xpl and do a search for "document-path".
We find an option at the top:
<p:option name="document-path" required="false" select="''">
<p:documentation xmlns="http://www.w3.org/1999/xhtml">
<h1 px:role="name">document-path</h1>
<p px:role="desc">The full path to the document, if available.</p>
</p:documentation>
</p:option>
...and we find some logic in the middle of the document:
<p:choose>
<p:when test="string-length($document-path) > 0">
<p:insert match="d:document-validation-report/d:document-info"
position="first-child">
<p:input port="insertion">
<p:inline>
<d:document-path>@@</d:document-path>
</p:inline>
</p:input>
</p:insert>
<p:string-replace match="//d:document-path/text()">
<p:with-option name="replace"
select="concat('&quot;', $document-path, '&quot;')"/>
</p:string-replace>
</p:when>
<p:otherwise>
<p:identity/>
</p:otherwise>
</p:choose>
So, reading this logic, it seems that the d:document-path element is inserted only if the value of $document-path, which references the option, is not empty.
The default value of $document-path in combine-validation-reports.xpl is the empty string, and looking back at our original script, we do not set the document-path option to anything else. So the element is never inserted, even though the schema requires it.
Well, that seems to be the problem then. Let's try setting the document-path option in our validate.xpl script:
<px:combine-validation-reports name="combined-error-report">
<p:with-option name="document-type" select="$document-type"/>
<p:with-option name="document-name" select="$document-name"/>
<p:with-option name="document-path" select="'test'"/>
<p:input port="source">
<p:pipe port="report" step="validate-against-relaxng"/>
</p:input>
</px:combine-validation-reports>
We re-run the script, and... yes! It works! We solved our problem :).
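The value 'test' is only a placeholder that confirms the diagnosis. In a real script you would probably want to pass something meaningful. One option, shown here as an untested sketch, is to reuse the $base-uri variable that validate.xpl already declares; whether a URI is what consumers of the report expect in d:document-path is an assumption.

```xml
<px:combine-validation-reports name="combined-error-report">
    <p:with-option name="document-type" select="$document-type"/>
    <p:with-option name="document-name" select="$document-name"/>
    <!-- Assumption: the base URI of the input document is an acceptable
         value for document-path. $base-uri is declared in validate.xpl. -->
    <p:with-option name="document-path" select="$base-uri"/>
    <p:input port="source">
        <p:pipe port="report" step="validate-against-relaxng"/>
    </p:input>
</px:combine-validation-reports>
```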
I hope this was at the very least slightly interesting, and I wish you good luck on your future XProc debugging journeys!
Jostein Austvik Jacobsen - 2016-01-21