Simple XML to HTML Conversion and Rendition Example
By Didier PH Martin, April 4, 1999

- Introduction
- The HTMLFo library
- Convert the SGML/XML document into HTML formatting object
- The root element
- A child container element
- A container element with data
- More complex aggregates
- Inline insertion
- Another way to create inline insertion with style properties
- The complete DSSSL script
- The XML document rendered with the SGML/XML kit viewer

Introduction
Contrary to what you may have heard, DSSSL is not necessarily a hard language to learn. Mastering its most advanced features takes time, but the language can be approached one step at a time, from the simple to the complex.
In fact, entry-level DSSSL is quite simple, as we will demonstrate with a short script that transforms an XML document into an HTML document, which is then rendered in a browser.
Publishing an XML document so that a browser can render it through a DSSSL script requires that the browser include a DSSSL script engine.
To be processed, the XML document has to be associated with a DSSSL script through a style sheet processing instruction (see: How to link an SGML/XML document to a HTMLFO DSSSL script). The script itself converts XML/SGML elements into HTML formatting objects (see: Convert the SGML/XML document into HTML formatting object). Finally, the HTML document is displayed in the browser.
The HTMLFo library
The DSSSL specification defines several formatting objects; paragraph, display-group and sequence, for example, are basic formatting objects of the DSSSL style language. However, we are not restricted to these objects: new objects can be created and packaged in libraries. Tony Graham from Mulberry Technologies created such a library for HTML. With it, HTML formatting objects can be used in DSSSL rules in place of the standard DSSSL formatting objects.
DSSSL formatting objects:

    (element par
      (make paragraph
        (process-children)
      )
    )

HTMLFO objects:

    (element par
      (make div
        (process-children)
      )
    )

The first step in the DSSSL script is to declare the DOCTYPE and the libraries used. This is done with a script document prolog like:

    <!DOCTYPE style-sheet PUBLIC "-//Netfolder//DTD DSSSL library//EN" >
    <style-specification use="htmlfo">

    ... DSSSL script body ...

    </style-specification>
    <external-specification id="htmlfo" document="htmlfo">

In the above example, we declared the use of an external library named htmlfo. The library provides a collection of HTML formatting objects and is included with the SGML/XML kit package.
How to link an SGML/XML document to a HTMLFO DSSSL script
For XML documents, a processing instruction can be included at the beginning of the document, as in the following example.

    <?xml-stylesheet href="myscript.dsl" type="text/dsssl"?>

or

    <?xml-stylesheet href="myscript.dsl" type="text/dsssl" media="screen"?>

If the media property is not present in the processing instruction, it defaults to media="screen".
An SGML document that includes a style sheet processing instruction is governed by the same rules as an XML document, except that the style sheet keyword is different (you can also use the same style sheet keyword for both).

    <?stylesheet href="myscript.dsl" type="text/dsssl">

or

    <?stylesheet href="myscript.dsl" type="text/dsssl" media="screen">

Style sheet PI property | Description
href | A URI pointing to the DSSSL script. The Talva DSSSL script engine only supports URLs and currently doesn't support URNs. URL protocol support is restricted to the file and HTTP protocols.
type | Indicates to the script router which script engine to load. The current script router supports the following MIME types: text/dsssl = DSSSL engine; text/xsl = XSL engine; text/css = CSS engine; text/omnimark = Omnimark engine.
media | If this property is not included in the processing instruction, it defaults to "screen", which implies that the output is to be displayed in the browser. The DSSSL engine also recognizes an output format when one is specified. For example, transforming an SGML/XML document into an RTF document and displaying it in the browser requires media="screen, rtf". If the output format is HTML, this property does not have to be included in the style sheet processing instruction: HTMLFO is the default output format of the DSSSL engine, and the default media, screen, corresponds to the browser's window.
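The defaulting rules in the table can be made concrete with a small sketch. It is written in Python rather than DSSSL purely for illustration and says nothing about how the Talva script router is actually implemented; the function name `stylesheet_link`, the `ENGINES` table and the `"htmlfo"` label for the default output format are hypothetical names chosen here.

```python
import re
from xml.dom import minidom

# MIME types copied from the table above; the dict itself is illustrative.
ENGINES = {
    "text/dsssl": "DSSSL engine",
    "text/xsl": "XSL engine",
    "text/css": "CSS engine",
    "text/omnimark": "Omnimark engine",
}

def stylesheet_link(xml_text):
    """Return the properties of the first xml-stylesheet PI, or None."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # The PI data is a flat string of name="value" pairs.
            props = dict(re.findall(r'([\w-]+)\s*=\s*"([^"]*)"', node.data))
            # A missing media property defaults to "screen".
            media = [m.strip() for m in props.get("media", "screen").split(",")]
            props["medium"] = media[0]
            # Assumption: a second media token names the output format,
            # as in media="screen, rtf"; otherwise HTMLFO output is used.
            props["output"] = media[1] if len(media) > 1 else "htmlfo"
            props["engine"] = ENGINES.get(props.get("type"))
            return props
    return None

link = stylesheet_link(
    '<?xml-stylesheet href="myscript.dsl" type="text/dsssl"?><breakfast-menu/>'
)
print(link["engine"], link["medium"], link["output"])
```

With no media property in the processing instruction, the sketch falls back to the screen medium and the default output format, which is the behavior the table describes.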
Convert the XML document into
HTML formatting object
The conversion process takes an SGML/XML document as input and outputs an HTML document. Thus, the DSSSL script has to associate one or several HTML elements with each SGML/XML element. A DSSSL style script is a collection of rules defining the association of HTML elements with SGML/XML elements.
The root element:
The root element represents the document. It is the first element in the grove; a grove is the document parsed and translated into an internal model that the computer can process. It is what the DSSSL root rule refers to. In the rule below, we associate an HTML element with the SGML/XML document root element.
DSSSL rule:

    (root
      (make html
        (process-children)
      )
    )

Resultant HTML elements:

    <HTML>
      .... HTML elements produced by children's rules ....
    </HTML>

The above rule, named root, creates HTML elements. Within this rule, the procedure starts with the make keyword, which tells the DSSSL engine to make a formatting object, here an element. The first HTML element we create is the top element of an HTML document's element hierarchy. The process-children keyword means that the output produced for all the children elements is inserted inside the HTML element.
A child container element:
In the original SGML/XML document, the top-level element contains other elements. The top-level element can be reached both through the root rule and by its name, as in the example below. Rules are executed in the same order as defined in the script document. Because we placed the root rule at the beginning of the script document, this rule is fired first. In the rule below, the body element is associated with the original top-level element named <breakfast-menu>.
Original SGML/XML document:

    <breakfast-menu>
      <food>
        <name>Belgian Waffles</name>
        <price>$5.95</price>
        <description>two of our famous Belgian waffles with plenty of real maple syrup</description>
        <calories>650</calories>
      </food>
      <food>
        <name>Strawberry Belgian Waffles</name>
        <price>$7.95</price>
        <description>light Belgian waffles covered with strawberries and whipped cream</description>
        <calories>900</calories>
      </food>
      <food>
        <name>Berry-Berry Belgian Waffles</name>
        <price>$8.95</price>
        <description>light Belgian waffles covered with an assortment of fresh berries and whipped cream</description>
        <calories>900</calories>
      </food>
      <food>
        <name>French Toast</name>
        <price>$4.50</price>
        <description>thick slices made from our homemade sourdough bread</description>
        <calories>600</calories>
      </food>
      <food>
        <name>Homestyle Breakfast</name>
        <price>$6.95</price>
        <description>two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
        <calories>950</calories>
      </food>
    </breakfast-menu>

DSSSL rule:

    (element breakfast-menu
      (make body
        css-style: "font-family:helvetica,sans-serif;font-size:12pt;background-color:#EEEEEE"
        (process-children)
      )
    )

Resultant HTML elements:

    <HTML>
      <BODY style="font-family:helvetica,sans-serif;font-size:12pt;background-color:#EEEEEE">
        .... HTML elements produced by children's rules ....
      </BODY>
    </HTML>

In this rule, we make a body HTML element with a style property. The style property is defined in the rule as css-style. At first, the parentheses may seem confusing. To see what is part of what, let's decompose the rule.
The rule is identified by the element keyword followed by the SGML/XML document markup name. The element construct is enclosed in parentheses, as below.

    (element breakfast-menu
    )

The above rule does nothing. Now we have to tell the DSSSL engine to make an HTML body element. To do so, we insert the make procedure, as in the example below.

    (element breakfast-menu
      (make body)
    )

The HTML element we created has no properties. The properties of a make procedure are defined as:

    (make formatting-object-name
      property-name: property-value
    )

We add indentation to the rule definition to improve readability. Thus, the make procedure that includes the body element and its properties looks like the example below.

    (element breakfast-menu
      (make body
        css-style: "font-family:helvetica,sans-serif;font-size:12pt;background-color:#EEEEEE"
      )
    )

The rule above is not complete until we tell the DSSSL engine to process all the element's children. Where you place the process-children keyword is important. In the example below, we place it at the end of the make body procedure. This means that all the element's children are processed within the body element.

    (element breakfast-menu
      (make body
        css-style: "font-family:helvetica,sans-serif;font-size:12pt;background-color:#EEEEEE"
        (process-children)
      )
    )

Note that process-children is also a procedure and has to be enclosed in parentheses. If you follow the parenthesis convention used above, you'll reduce the risk of errors because it is easier to see when a parenthesis is missing.
A container element with data:
In the previous example, the original document element contained only other elements and no character data. The following example shows how to process an element that has character data.
Original SGML/XML document: the same <breakfast-menu> document shown in the previous section.
DSSSL rule:

    (element name
      (make span
        css-style: "font-weight:bold;color:white"
        (process-children)
      )
    )

Resultant HTML elements:

    <span style="font-weight:bold;color:white">Belgian Waffles</span>

Each time the <name> tag appears, the associated rule is fired. As shown in the example above, the rule associated with the <name> markup makes a span HTML element. This element has a style property; note that it is defined with the css-style keyword. As you have probably noticed, the <name> element doesn't have any child elements. So why use the process-children procedure?
In DSSSL, everything is transformed into nodes of a tree. Character data is also considered a kind of node. Thus, the process-children procedure means that all nodes included in the current node are to be processed: the DSSSL engine processes both the character data and the child markup elements, in the order in which they appear in the element. Thus, the make procedure creates a span element having a style property and includes the original <name> character data in the new span element.
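To make the role of process-children on character data more tangible, here is a minimal sketch of the idea in Python. It is only a model of the behavior described above, not the DSSSL engine's implementation; the helper names `make`, `process_children` and `process_node`, and the fallback that simply processes the children of an element without a rule, are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET
from html import escape

def make(tag, content, css_style=None):
    """Build one HTML element, like (make tag css-style: "..." content)."""
    style = f' style="{css_style}"' if css_style else ""
    return f"<{tag}{style}>{content}</{tag}>"

def process_children(node, rules):
    """Process every child node in document order: text and elements."""
    out = escape(node.text or "")            # character data before the first child
    for child in node:
        out += process_node(child, rules)    # fire the child's rule
        out += escape(child.tail or "")      # character data after that child
    return out

def process_node(node, rules):
    # Elements without a rule of their own just have their children processed.
    rule = rules.get(node.tag, process_children)
    return rule(node, rules)

# Counterpart of:
# (element name (make span css-style: "..." (process-children)))
rules = {
    "name": lambda node, rules: make(
        "span", process_children(node, rules), "font-weight:bold;color:white"
    ),
}

name = ET.fromstring("<name>Belgian Waffles</name>")
print(process_node(name, rules))
```

Running the sketch on the <name> element yields the same span as in the resultant HTML above: the text "Belgian Waffles" is a child node, so it only reaches the output because the rule processes its children.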
More complex aggregates:
Ready for more complex stuff? Don't worry, it is not that complex.
Sometimes we need to create more than one element for a single original document element. In the following example we introduce the display-group keyword, which does this task quite well.
Original SGML/XML document: the same <breakfast-menu> document shown in the previous sections.
DSSSL rule:

    (element food
      (make display-group
        (make div
          css-style: "background-color:teal;color:white;padding:4px"
          (process-matching-children "name")
          (literal " - ")
          (process-matching-children "price")
        )
        (make div
          css-style: "margin-left:20px;margin-bottom:1em;font-size:10pt;"
          (process-matching-children "description")
          (process-matching-children "calories")
        )
      )
    )

Resultant HTML elements:

    <div style="background-color:teal;color:white;padding:4px">
      <span style="font-weight:bold;color:white">Belgian Waffles</span>
      - $5.95
    </div>
    <div style="margin-left:20px;margin-bottom:1em;font-size:10pt;">
      two of our famous Belgian waffles with plenty of real maple syrup
      <span style="font-style:italic"> 650 calories per serving</span>
    </div>

As you noticed in the example above, the rule creates two aggregated DIV HTML elements. First, the rule makes a display-group object. This object has no equivalent in the HTML world; its role is to aggregate other objects, in this case DIV elements.
The display-group element may aggregate any number of elements.

    (make display-group
      (make something)
      (make something)
      (make something)
    )

Inside the display-group, we make two DIV elements with a style property.
We introduced a new construct to the rule: process-matching-children. What is the difference between the (process-children) procedure and the (process-matching-children) procedure? The former processes all the children of the original document element, including its character data. The latter processes only the children whose name matches. For instance, in the first div make procedure, only the "name" and "price" elements are matched and processed. In the second div make procedure, only the "description" and "calories" elements are processed.
Original SGML/XML document:

    <food>
      <name>Belgian Waffles</name>
      <price>$5.95</price>
      <description>two of our famous Belgian waffles with plenty of real maple syrup</description>
      <calories>650</calories>
    </food>

DSSSL rule:

    (element food
      (make display-group
        (make div
          css-style: "background-color:teal;color:white;padding:4px"
          (process-matching-children "name")
          (literal " - ")
          (process-matching-children "price")
        )
        (make div
          css-style: "margin-left:20px;margin-bottom:1em;font-size:10pt;"
          (process-matching-children "description")
          (process-matching-children "calories")
        )
      )
    )

The first make procedure also contains a literal construct. A literal construct inserts a string. In the example above, we include a " - " between the processed "name" element and the processed "price" element.

    <div style="background-color:teal;color:white;padding:4px">
      <span style="font-weight:bold;color:white">Belgian Waffles</span>
      - $5.95
    </div>

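The selection done by process-matching-children, and the way a literal is placed between two processed children, can be modeled in a few lines of Python. This is a sketch under simplifying assumptions, not how a DSSSL engine works internally: `process_node`, `process_matching_children` and `food_rule` are hypothetical helpers, only the first DIV of the food rule is reproduced, and HTML escaping is left out.

```python
import xml.etree.ElementTree as ET

def process_node(node, rules):
    # Without a rule of its own, an element contributes only its text here.
    rule = rules.get(node.tag, lambda n, r: n.text or "")
    return rule(node, rules)

def process_matching_children(node, name, rules):
    """Process only the child elements called `name`; skip everything else."""
    return "".join(process_node(c, rules) for c in node if c.tag == name)

def food_rule(node, rules):
    # First DIV of the food rule: name, a literal " - ", then price.
    return (
        '<div style="background-color:teal;color:white;padding:4px">'
        + process_matching_children(node, "name", rules)
        + " - "                                   # (literal " - ")
        + process_matching_children(node, "price", rules)
        + "</div>"
    )

rules = {
    "food": food_rule,
    "name": lambda n, r: f'<span style="font-weight:bold;color:white">{n.text}</span>',
}

food = ET.fromstring(
    "<food><name>Belgian Waffles</name><price>$5.95</price>"
    "<description>two of our famous Belgian waffles</description>"
    "<calories>650</calories></food>"
)
print(process_node(food, rules))
```

The description and calories children never appear in this DIV because nothing in `food_rule` asks for them, which mirrors how the first make div of the DSSSL rule only matches "name" and "price".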
The first DIV procedure processes the name element. Because there is a name rule in the DSSSL script, this rule is fired and creates HTML elements, as in the example below.
Original SGML/XML document:

    <food>
      <name>Belgian Waffles</name>
      <price>$5.95</price>
      <description>two of our famous Belgian waffles with plenty of real maple syrup</description>
      <calories>650</calories>
    </food>

DSSSL rules:

    (element food
      (make display-group
        (make div
          css-style: "background-color:teal;color:white;padding:4px"
          (process-matching-children "name")
          (literal " - ")
          (process-matching-children "price")
        )
        (make div
          css-style: "margin-left:20px;margin-bottom:1em;font-size:10pt;"
          (process-matching-children "description")
          (process-matching-children "calories")
        )
      )
    )

    (element name
      (make span
        css-style: "font-weight:bold;color:white"
        (process-children)
      )
    )

Resultant HTML elements:

    <div style="background-color:teal;color:white;padding:4px">
      <span style="font-weight:bold;color:white">Belgian Waffles</span>
      - $5.95
    </div>

Inline insertion:
Sometimes we just want to insert the content without any special HTML element. In our example, the "price" markup is one of these cases. We saw that the food rule makes two DIV elements. The first DIV contains the result of processing the "name" and "price" markup, separated by a " - " literal. The rule below just inserts the markup content into the resulting output.
Original SGML/XML document:

    <food>
      <name>Belgian Waffles</name>
      <price>$5.95</price>
      <description>two of our famous Belgian waffles with plenty of real maple syrup</description>
      <calories>650</calories>
    </food>

DSSSL rule:

    (element price
      (process-children)
    )

Resultant HTML elements:

    <div style="background-color:teal;color:white;padding:4px">
      <span style="font-weight:bold;color:white">Belgian Waffles</span>
      - $5.95
    </div>

Another way to create inline insertion with style properties:
If the inserted inline content should have different rendition properties, HTML provides the SPAN element for this purpose. CSS style properties are then particularly useful to define a specific style for a content fragment.
The original XML document contains only the calories quantity. We want the result to be displayed in italic, with the text "calories per serving" appended. The example below shows how.
Original SGML/XML document:

    <food>
      <name>Belgian Waffles</name>
      <price>$5.95</price>
      <description>two of our famous Belgian waffles with plenty of real maple syrup</description>
      <calories>650</calories>
    </food>

DSSSL rule:

    (element calories
      (make span
        css-style: "font-style:italic"
        (literal " ")
        (process-children)
        (literal " calories per serving")
      )
    )

Resultant HTML elements:

    <span style="font-style:italic"> 650 calories per serving</span>

The rule makes a SPAN element with a CSS style property. The resulting HTML element contains a " " literal, the calories data content and finally the " calories per serving" literal. Remember that a literal is simply a string. The resulting output is italicized text with a string appended to the calories quantity.
The complete
DSSSL script
A DSSSL script is an SGML document. To use the HTMLFO DSSSL library, some SGML elements have to be included in your script.
- With the SGML/XML Kit, the DOCTYPE should be set to style-sheet PUBLIC "-//Netfolder//DTD DSSSL library//EN".
- The style-specification element should specify that the script uses the htmlfo library.
- The document should end with an external-specification declaration set to the htmlfo library.
The example below includes the mandatory script document declarations: the DOCTYPE line, the style-specification tags and the external-specification line.

    <!DOCTYPE style-sheet PUBLIC "-//Netfolder//DTD DSSSL library//EN" >
    <style-specification use="htmlfo">

    (root
      (make html
        (process-children)
      )
    )

    (element breakfast-menu
      (make body
        css-style: "font-family:helvetica,sans-serif;font-size:12pt;background-color:#EEEEEE"
        (process-children)
      )
    )

    (element food
      (make display-group
        (make div
          css-style: "background-color:teal;color:white;padding:4px"
          (process-matching-children "name")
          (literal " - ")
          (process-matching-children "price")
        )
        (make div
          css-style: "margin-left:20px;margin-bottom:1em;font-size:10pt;"
          (process-matching-children "description")
          (process-matching-children "calories")
        )
      )
    )

    (element name
      (make span
        css-style: "font-weight:bold;color:white"
        (process-children)
      )
    )

    (element price
      (process-children)
    )

    (element description
      (process-children)
    )

    (element calories
      (make span
        css-style: "font-style:italic"
        (literal " ")
        (process-children)
        (literal " calories per serving")
      )
    )

    </style-specification>
    <external-specification id="htmlfo" document="htmlfo">

The SGML/XML Kit already has the library entity declaration included in the DSSSL DTD. However, if you do not have the SGML/XML kit installed and use Jade as a standalone tool instead, the HTMLFo library has to be declared first, as in the example below.

    <!DOCTYPE style-sheet PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN" [
    <!ENTITY htmlfo SYSTEM "../scripts/htmlfo.dsl" CDATA DSSSL>
    ]>
    <style-specification use="htmlfo">
    ......... DSSSL script
    </style-specification>
    <external-specification id="htmlfo" document="htmlfo">

The result
displayed in the IE 5 browser
All trademarks herein are the property of their respective owners. Copyright © 1999-2003 Didier PH Martin. All rights reserved. Created by Didier PH Martin, modified: April 7, 2003