DSSSL - Document Style Semantics And Specification Language             
ISO/IEC 10179:1996

 

The Paragraph Flow Object

By Didier PH Martin, June 26, 1999

Contents

Introduction
Visual Model
Logical Model
Models Synthesis
Paragraph Characteristics
OpenJade Paragraph Object Translation

Introduction

The paragraph object is a fundamental DSSSL object. It is as fundamental as paragraphs are to written documents. It is a container object which can contain other flow objects.

Throughout this text, containment means both visual containment and logical containment in the form of a collection. The DSSSL specification uses the term "flow object"; we also use the term "formatting object" to relate the concept to other formatting languages that use this terminology. In this document, the two terms refer to the same concept.

DSSSL allows paragraph objects to be mapped to SGML or XML elements. Or, from another perspective, each SGML or XML element is mapped to a DSSSL formatting object.

A paragraph object has a property set whose values are set by a DSSSL script. In the DSSSL specification, a formatting object property is called a characteristic. A formatting object's property set can also be related to a grove's property set: in both cases, a property set is an abstract data model associated with an object.
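
To make the idea of a characteristic concrete, here is a minimal sketch of a construction rule that sets a few paragraph characteristics. The element name par and the chosen values are assumptions made for illustration only; the characteristic names are taken from the table in the Paragraph characteristics section.

```dsssl
; Illustrative sketch: the element name "par" and the values are assumptions.
(element par
  (make paragraph
    font-size: 11pt        ; body size to which the font is scaled
    line-spacing: 14pt     ; spacing between the placement paths of lines
    quadding: 'justify     ; alignment of all lines except the last one
    (process-children)))   ; the element content becomes the paragraph content
```

Each keyword ending with a colon names a characteristic; the expression that follows it supplies the value.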

The paragraph object is equivalent to the CSS block object or to the XSL fo:block object.

A paragraph flow object is not restricted to visual rendition. Packages like BraiFo transform any XML/SGML document into Braille. Even though not all visual characteristics are supported in Braille, the useful ones in this context are still in use (indentation, quadding, etc.).

The Visual Model

In DSSSL, all visual objects are areas, as defined in the specification:

"An area is a rectangular box with a fixed width and height. An area is also a specification of a set of marks that can be imaged on a presentation medium. An area may contain other areas."

DSSSL Area Layout Model

More particularly, in the DSSSL specification, the paragraph object is a display area:

"Display areas are areas that are not directly parts of lines. A display area has an inherent absolute orientation.

NOTE 43 Informally, the box has an arrow on it saying ‘this way up’.

The positioning of display areas is specified by area containers. An area container has its own coordinate system with its origin at the lower left corner, the positive x-axis extending horizontally to the right and the positive y-axis extending vertically upward."

For instance, an area container could be a page flow object. Thus, a page can contain paragraphs, because a page object is an area container.
The area container imposes a direction on the display areas it contains. For instance, a page object (an area container) imposes a direction on a set of paragraphs (display areas). For most Western languages, the direction is top down.
In the same vein, a paragraph, being both a display area and an area container, can contain other flow objects.

 

The Logical Model

 
"A paragraph flow object represents a paragraph. It has a single principal port. The contents of this port may be either inlined or displayed. Inline flow objects are formatted to produce line areas. Displayed flow objects implicitly specify a break, and their areas shall be added to the resulting sequence of areas. A paragraph flow object may only be displayed."

Several DSSSL flow objects are collection containers. Thus, a particular flow object may be perceived as a flow object collection and, simultaneously, as a flow object layout configuration. In the case of the paragraph flow object, this means that it contains a single flow object collection (also called a stream), and that this collection is laid out within a bounded area. So, like all other DSSSL flow objects, a paragraph flow object has two facets:

  • An abstract one - a single collection of flow objects.
  • A visual one - a particular flow objects layout within a bounded area.

So, the paragraph object is contained in its parent's collection (i.e. a stream attached to the parent's port). It is itself a flow object container having a single port (i.e. collection).

For example, a simple-page-sequence flow object contains several paragraph flow objects and one of these flow objects contains:

  1. a line-break flow object
  2. an embedded-text flow object
  3. a line-break flow object

A flow object collection (i.e. a stream) is ordered. Thus, in the example above, the paragraph object contains three (3) objects. The first one in the collection is displayed first, and the last one is displayed last. This is why we call these objects flow objects: a flow has an implicit order. Thus, objects contained in a stream (i.e. collection) are placed within the container area one after the other, in the same order they have within the collection.
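
The ordering can be sketched with a small rule in which the content of a paragraph is written out explicitly. The element name note and the literal strings are hypothetical; the point is only that the content expressions are added to the paragraph's stream in the order in which they are written.

```dsssl
; Illustrative sketch: "note" and the literal strings are assumptions.
(element note
  (make paragraph
    (literal "first")         ; added to the stream first
    (make paragraph-break)    ; then a break inside the paragraph
    (literal "second")))      ; added to the stream last
```

Swapping the two literal expressions would swap the order of the text in the formatted paragraph, because the formatter follows the order of the stream.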

Models Synthesis

Is something missing?

Yes, the processed SGML or XML content. This is the data content associated with a particular SGML or XML element. The element content is usually included in a paragraph object's collection with the DSSSL process-children construct.

For example, a DSSSL script creates the following flow objects collection:

  1. Data content from an XML or SGML element
  2. a line-break flow object
  3. an embedded-text flow object
  4. a line-break flow object
  5. Data content from an XML or SGML element

This flow object's logical and visual models are produced from an XML or SGML fragment and a DSSSL script.

<par>this paragraph contains embedded text. In fact, it is a Japanese text enclosed by two line-field objects <price>定価 2800</price></par>

SGML or XML fragment


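; A par element makes a paragraph that contains the element content.
(element par
  (make paragraph
    (process-children)
  )
)
; A price element is wrapped between two line-field objects.
(element price
  (make display-group
    (make line-field)
    (process-children)
    (make line-field)
  )
)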

DSSSL script fragment

In the example above, we used the display-group flow object to aggregate several flow objects into a single entity.
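
As a point of comparison, here is a hedged sketch of the same kind of aggregation done with the sequence flow object. A display-group is always displayed, whereas a sequence can be inlined, so a sequence keeps the aggregated content within the lines of the enclosing paragraph. The font-weight value is an arbitrary choice for illustration.

```dsssl
; Illustrative alternative, not the rule used in the example above.
(element price
  (make sequence
    font-weight: 'bold       ; inherited by the characters of the content
    (process-children)))     ; content stays inline within the paragraph
```

Which of the two is appropriate depends on whether the aggregated content should start on its own line or remain part of the running text.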

Paragraph characteristics

The paragraph flow object property set provides more information to the rendering engine on how to display the paragraph. How paragraph object properties are translated into a visual model is shown in the figure below.

The paragraph property set is composed of 59 properties, as shown in the table below. Some of these properties are inherited from container objects; some are not.
  
Property name: Description
lines: is a symbol specifying how the content of the paragraph shall be broken into lines in the formatted output.
asis-truncate-char: is either #f or a char object that determines the glyph to be inserted when the lines: characteristic has the value asis-truncate and a line is truncated. The initial value is #f.
asis-wrap-char: is either #f or a char object that determines the glyph to be inserted at the end of a line when the lines: characteristic has the value asis-wrap and the line is broken other than after a character flow object for which the record-end?: characteristic is true. The initial value is #f.
asis-wrap-indent: is a length-spec giving an indent to be added to the start-indent when the lines: characteristic has the value asis-wrap for a line following a break other than after a character flow object for which the record-end?: characteristic is true. The initial value is #f.
first-line-align: is either #f, #t, or a char object. If it is not #f, then the quadding: and last-line-quadding: characteristics are ignored for the first line of the paragraph, and the first line shall be aligned using an alignment point in the line. If the value is a char object, then the alignment point shall be the position point of the first area produced by the first occurrence on the line of a character flow object with a char: characteristic equal to that char object; otherwise, the alignment point shall be the position of the first alignment-point flow object in the line. If alignment-point-offset: is not #f, then the first line of the paragraph shall be aligned so that the percentage of the line length (that is, the display-size less the applicable start and end indents) before the alignment point is equal to the value of alignment-point-offset:. If alignment-point-offset: is #f, then the paragraph is an externally aligned paragraph and shall have an ancestor of class table-cell or aligned-column. Furthermore, the area container in which the areas from this paragraph are placed shall be the same as the area container in which the areas from that ancestor are placed; in this case, the paragraph shall be aligned so that its alignment point is aligned with other such paragraphs in the table-column or aligned-column. If an externally aligned paragraph occurs in a table-cell, then the table-auto-width feature shall be enabled. The initial value is #f.
alignment-point-offset: is either #f or a number between 0 and 100 specifying the percentage of the line length (that is, the display-size less the start and end indents) before the alignment point. The initial value is 50.
ignore-record-end?: is a boolean specifying whether a record-end shall be ignored. If this characteristic is true, then a character with the record-end? property true shall be ignored. The initial value is #f.
expand-tabs?: is either #f or a strictly positive integer specifying the tab interval. When a tab interval is specified, each character flow object that has the input-tab?: characteristic true shall be treated as equivalent to the smallest strictly positive number of spaces that when added to the number of character flow objects following the last preceding record-end character flow object shall be a multiple of the tab interval. The initial value is 8.
line-spacing: is a length-spec giving the normal spacing between the placement paths of lines in the paragraph as described in 12.6.6.1. The initial value is 12pt.
line-spacing-priority: is either an integer or the symbol force specifying the priority of any conditional space before the line. This shall be interpreted in the same manner as the priority: argument for the display-space procedure. The initial value is 0.
min-pre-line-spacing: is a length-spec specifying the minimum size of the line in the placement direction before the placement path as described in 12.6.6.1. A value of #f shall also be allowed, specifying that the value is determined from the paragraph's font. The initial value is #f.
min-post-line-spacing: is a length-spec specifying the minimum size of the line in the placement direction after the placement path as described in 12.6.6.1. A value of #f shall also be allowed, specifying that the value is determined from the paragraph's font. The initial value is #f.
min-leading: is either #f or a length-spec specifying the minimum space between the line areas in the placement direction as described in 12.6.6.1. A value of #f means that the line spacing shall not be automatically adjusted to take into account the size of the content of the lines. The initial value is #f.
first-line-start-indent: is a length-spec giving an indent to be added to the start-indent for the first line. The length may be negative. The initial value is 0pt.
last-line-end-indent: is a length-spec giving an indent to be added to the end-indent for the last line. The length may be negative. The initial value is 0pt.
hyphenation-char: is a char that is used to determine the glyph that is inserted when hyphenation is performed. The characteristics of the character flow object preceding the hyphenation point shall determine the mapping of the character to a glyph, as well as the font resource and font-size of the glyph. The initial value is #\- (the hyphen character).
hyphenation-ladder-count: is a strictly positive integer specifying the maximum number of consecutive lines ending with the same glyph as the glyph determined by the value of the hyphenation-char: characteristic, or #f indicating that there is no limit. The initial value is #f.
hyphenation-remain-char-count: is a positive integer specifying the minimum number of characters in a hyphenated word before the hyphenation character. This is the minimum number of characters in the word left on the line ending with the hyphenation character. The initial value is 2.
hyphenation-push-char-count: is a positive integer specifying the minimum number of characters in a hyphenated word after the hyphenation character. This is the minimum number of characters in the word pushed to the next line after the line ending with the hyphenation character. The initial value is 2.
hyphenation-exceptions: is a list of strings. Each string is a word which may contain hyphen characters, #\-, indicating where hyphenation may occur. If a word to be hyphenated occurs in the list, it may only be hyphenated in the specified places. The initial value is the empty list.
line-breaking-method: is #f or a string specifying a public identifier for the line-breaking-method to be used for this paragraph. The initial value is #f.
line-composition-method: is #f or a string specifying a public identifier for the line-composition-method to be used for this paragraph. The initial value is #f.
implicit-bidi-method: is #f or a string specifying a public identifier for the method to be used for implicitly determining the directionality of the content of the paragraph. This includes both the writing-mode of characters, which, when this characteristic is #f, is specified with the writing-mode characteristic, and how portions of content with a common writing-mode are nested within each other, which, when this characteristic is #f, is specified with embedded-text flow objects. It is part of the semantics of the method which characteristics of character flow objects, if any, it uses. A method may be specific to a particular character repertoire, in which case, it may not make use of any characteristics. It may be part of the semantics of a method for certain glyph substitutions to be applied depending on the writing-mode that is determined for a character, and possibly also on characteristics of the character. The initial value is #f.
glyph-alignment-mode: is one of the symbols base, center, top, bottom, or font specifying the alignment mode to be used for glyphs. font means that the nominal alignment mode of the font in the flow object's writing-mode should be used. The initial value is font.
font-family-name: is either #f, indicating that any font family is acceptable, or a string giving the font family name property of the desired font resource. The initial value is iso-serif.
font-weight: is either #f, indicating that any font weight is acceptable, or one of the symbols not-applicable, ultra-light, extra-light, light, semi-light, medium, semi-bold, bold, extra-bold, or ultra-bold, giving the weight property of the desired font resource. The initial value is medium. This characteristic is applicable when the glyph-alignment-mode: is font or when min-pre-line-spacing: or min-post-line-spacing: is #f.
font-posture: is either #f, indicating that any posture is acceptable, or one of the symbols not-applicable, upright, oblique, back-slanted-oblique, italic, or back-slanted-italic, giving the posture property of the desired font resource. The initial value is upright. This characteristic is applicable when the glyph-alignment-mode: is font or when min-pre-line-spacing: or min-post-line-spacing: is #f.
font-structure: is either #f, indicating that any structure is applicable, or one of the symbols not-applicable, solid, or outline. The initial value is solid. This characteristic is applicable when the glyph-alignment-mode: is font or when min-pre-line-spacing: or min-post-line-spacing: is #f.
font-proportionate-width: is either #f, indicating that any proportionate width is acceptable, or one of the symbols not-applicable, ultra-condensed, extra-condensed, condensed, semi-condensed, medium, semi-expanded, expanded, extra-expanded, or ultra-expanded. The initial value is medium. This characteristic is applicable when the glyph-alignment-mode: is font or when min-pre-line-spacing: or min-post-line-spacing: is #f.
font-name: is either #f, indicating that any font name is acceptable, or a string which is the public identifier for the font name property of the desired font resource. When the value is a string, the values of the font-family-name:, font-weight:, font-posture:, font-structure:, and font-proportionate-width: characteristics are not used in font selection. The initial value is #f. This characteristic is applicable when the glyph-alignment-mode: is font or when min-pre-line-spacing: or min-post-line-spacing: is #f.
font-size: is a length specifying the body size to which the font resource should be scaled. The initial value is 10pt. This characteristic is applicable when min-pre-line-spacing: or min-post-line-spacing: is #f.
numbered-lines?: is #t if the lines produced by this paragraph shall be considered for the purposes of line numbering, and #f otherwise. The initial value is #t.
line-number: is either #f or an unlabeled sosofo containing only inline flow objects. If it is a sosofo, then for each line in the paragraph, the sosofo is formatted to produce a single inline area that is positioned as an attachment area for the line. The initial value is #f.
line-number-side: is one of the symbols start, end, spread-inside, spread-outside, page-inside, or page-outside specifying the side of the line for the attachment specified with the line-number: characteristic. A value of spread-inside or spread-outside shall be allowed only if the flow object has an ancestor of class page-sequence. A value of page-inside or page-outside shall be allowed only if the flow object has an ancestor of column-set-sequence.
line-number-sep: is a length-spec specifying the separation for the attachment specified with the line-number: characteristic.
quadding: is one of the symbols start, end, spread-inside, spread-outside, page-inside, page-outside, center, or justify specifying the alignment of lines other than the last line in the paragraph in the direction determined by the writing-mode. A value of spread-inside or spread-outside shall be allowed only if the flow object has an ancestor of class page-sequence. A value of page-inside or page-outside shall be allowed only if the flow object has an ancestor of column-set-sequence. The initial value is start.
last-line-quadding: is one of the symbols relative, start, end, spread-inside, spread-outside, page-inside, page-outside, center, or justify specifying the alignment of the last line of the paragraph in the direction determined by the writing-mode. This shall apply also to any line in the paragraph that immediately precedes a break. A value of relative means that the value of the quadding: characteristic shall be used, except when that value is justify, in which case, a value of start shall be used. A value of spread-inside or spread-outside shall be allowed only if the flow object has an ancestor of class page-sequence. A value of page-inside or page-outside shall be allowed only if the flow object has an ancestor of column-set-sequence. The initial value is relative.
last-line-justify-limit: is a length-spec specifying the maximum amount of free space in the last line that shall cause the last line to be justified rather than aligned as specified by the last-line-quadding: characteristic. The initial value is 0.
justify-glyph-space-max-add: is a length-spec specifying the maximum space that may be added between glyphs in order to justify a line. The initial value is 0pt.
justify-glyph-space-max-remove: is a length-spec specifying the maximum space that may be removed between glyphs in order to justify a line. The initial value is 0pt.
hanging-punct?: is a boolean specifying whether the paragraph shall be formatted with the punctuation characters hanging into the margin or gutter of a column. The initial value is #f.
widow-count: is a positive integer specifying the minimum number of lines of the paragraph that shall be kept together at the beginning of an area. If the widow-count: is n, then no break shall be allowed between the last n lines of the paragraph. The initial value is 2.
orphan-count: is a positive integer specifying the minimum number of lines of the paragraph that shall be kept together at the end of an area. If the orphan-count: is n, then no break shall be allowed between the first n lines of the paragraph. The initial value is 2.
language: is #f or a symbol specifying the ISO 639 language code in upper-case. This affects line composition in a system-dependent way. The initial value is #f.
country: is #f or a symbol specifying the ISO 3166 country code in upper-case. This affects line composition in a system-dependent way. The initial value is #f.
position-preference: is either #f or one of the symbols top or bottom. This applies if the flow object is directed into a port on a column-set-sequence flow object that is flowed into both the top-float and bottom-float zones of a column-subset and indicates whether the areas from this flow object may be flowed into only one of the zones. This characteristic is not inherited. The default value is #f.
writing-mode: is one of the symbols left-to-right, right-to-left, or top-to-bottom. The direction determined by the writing-mode shall be perpendicular to the placement direction. The initial value is left-to-right. This controls the orientation of the placement path of the lines.
start-indent: is a length-spec specifying the indent for the edge of the area at the start in the direction of the writing-mode. The initial value is 0pt. This applies only to lines from the paragraph itself.
end-indent: is a length-spec specifying the indent for the edge of the area at the end in the direction of the writing-mode. The initial value is 0pt. This applies only to lines from the paragraph itself.
span: is a strictly positive integer specifying the number of columns that the areas resulting from this flow object shall span. This characteristic shall apply if the flow object is directed into a port on a column-set-sequence flow object that is flowed into the top-float, bottom-float, or body-text zone of a spannable column-subset. The initial value is 1.
span-weak?: is a boolean specifying whether the areas resulting from this flow object span weakly rather than strongly. See 12.6.5.1. This characteristic applies if the flow object is directed into a port on a column-set-sequence flow object that is flowed into the top-float, bottom-float, or body-text zone of a spannable column-subset and has a span: characteristic with a value greater than 1. The initial value is #f.
space-before: is an object of type display-space specifying space to be inserted before, in the placement direction, the areas produced by the flow object. This characteristic is not inherited. The default is for no space before to be inserted.
space-after: is an object of type display-space specifying space to be inserted after, in the placement direction, the areas produced by the flow object. This characteristic is not inherited. The default is for no space after to be inserted.
keep-with-previous?: is a boolean specifying whether the flow object shall be kept in the same area as the previous flow object. This characteristic is not inherited. The default value is #f.
keep-with-next?: is a boolean specifying whether the flow object shall be kept in the same area as the next flow object. This characteristic is not inherited. The default value is #f.
break-before: is #f or one of the symbols page, page-region, column, or column-set specifying that the flow object shall start an area of that type. This characteristic is not inherited. The default is #f.
break-after: is #f or one of the symbols page, page-region, column, or column-set specifying that the flow object shall end an area of that type. This characteristic is not inherited. The default is #f.
may-violate-keep-before?: is a boolean which, if true, specifies that constraints imposed by the keep: characteristics of ancestor flow objects on the relative positioning of this flow object and its previous flow object may not be respected. This characteristic is not inherited. The default value is #f.
may-violate-keep-after?: is a boolean which, if true, specifies that constraints imposed by keep: characteristics of ancestor flow objects on the relative positioning of this flow object and its next flow object may not be respected. This characteristic is not inherited. The default value is #f.
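
To show how several of these characteristics combine in practice, here is a minimal sketch of a paragraph rule. The element name par and all values are assumptions chosen for illustration, and a given backend may ignore some of these characteristics.

```dsssl
; Illustrative sketch: the element name and the values are assumptions.
(element par
  (make paragraph
    font-family-name: "iso-serif"       ; font family of the desired font
    font-size: 10pt                     ; body size of the font
    line-spacing: 12pt                  ; spacing between placement paths
    start-indent: 24pt                  ; indent at the start edge
    first-line-start-indent: 12pt       ; added to start-indent, first line only
    quadding: 'justify                  ; every line except the last
    last-line-quadding: 'start          ; the last line of the paragraph
    widow-count: 2                      ; lines kept together at the top of an area
    orphan-count: 2                     ; lines kept together at the bottom of an area
    space-before: (display-space 6pt)   ; not inherited, set per flow object
    (process-children)))
```

Inherited characteristics such as font-size: could equally be set on an ancestor flow object, whereas non-inherited ones such as space-before: have to be specified on the paragraph itself.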
 

OpenJade Paragraph Object Translation

The paragraph object is supported by most OpenJade backend processors and can thus be translated into MIF, TeX, RTF and HTML elements. We will illustrate this with an XML document transformed into these different formats with a DSSSL script, as shown below.
  
<?xml version="1.0"?>
<?xml-stylesheet type="text/dsssl" href="Par.dsl" media="screen,mif"?>
<test>
<par>This is a paragraph object</par>
</test>
XML document
   
<!DOCTYPE style-sheet PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN">
 
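; The root rule applies to the document root, whatever the document
; element is named (here, test).
(root
   (make simple-page-sequence
     (process-children)
   )
)
 
; Each par element makes a paragraph that contains the element content.
(element par
    (make paragraph
         (process-children)
   )
)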

In the case of the HTML output, and because of the on-line nature of HTML, the following DSSSL rule is used instead:

; For on-line output, the root rule makes a scroll instead of a page sequence.
(root
   (make scroll
     (process-children)
   )
)

Thus, for all formats based on a page model, the root element (i.e. the XML document) "makes" a simple-page-sequence formatting object; for on-line formats rendered in a browser, it "makes" a scroll object.
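
The two variants can also be kept in a single style sheet. The following is only a sketch under stated assumptions: the variable online? is a hypothetical name introduced here, and the rule is written with the DSSSL root rule, (root ...), which applies to the document root regardless of the name of the document element.

```dsssl
; Hypothetical switch: #t selects on-line output, #f selects paged output.
(define online? #f)

(root
  (if online?
      (make scroll                   ; on-line formats such as HTML
        (process-children))
      (make simple-page-sequence     ; paged formats such as MIF, TeX, RTF
        (process-children))))
```

With such a switch, only the definition of online? has to change between a paged run and an on-line run; the paragraph rules stay the same.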
 
DSSSL scripts
  

The following table shows the sample XML document translated into different target formats.
  
Formats
Created outputs
MIF
Generated MIF document in the Talva SGML/XML Kit using the following stylesheet processing instruction:
<?xml-stylesheet type="text/dsssl" href="Par.dsl" media="screen,mif"?>
With the OpenJade command line, use the -t mif option.
TeX
Generated TeX document in the Talva SGML/XML Kit using the following stylesheet processing instruction:
<?xml-stylesheet type="text/dsssl" href="Par.dsl" media="screen,tex"?>
With the OpenJade command line, use the -t tex option.
RTF
Generated RTF document in the Talva SGML/XML Kit using the following stylesheet processing instruction:
<?xml-stylesheet type="text/dsssl" href="Par.dsl" media="screen,rtf"?>
With the OpenJade command line, use the -t rtf option.
HTML
Generated HTML document in the Talva SGML/XML Kit using the following stylesheet processing instruction:
<?xml-stylesheet type="text/dsssl" href="Par.dsl" media="screen"?>
With the OpenJade command line, use the -t html option.

The resulting HTML document is associated with a CSS stylesheet. When the HTML option is set, OpenJade creates both an HTML document and a CSS document.

For more information on this option, read "DSSSL formatting objects mapping to HTML+CSS: the html output format mode".

 


All trademarks herein are the property of their respective owners. 
Copyright ©  1999-2003 Didier PH Martin, All rights reserved. Created by Didier PH Martin, modified: April 7, 2003