This manual page describes the intermediate output format of the GNU
roff(7)
text processing system.
This output is produced by a run of the GNU
troff(1)
program before it is fed into a device postprocessor program.
As the GNU roff processor
groff(1)
is a wrapper program around troff that automatically calls a
postprocessor, this output does not show up normally.
This is why it is called
intermediate
within the
groff system.
The
groff
program provides the option
-Z
to inhibit postprocessing, such that the produced intermediate output
is sent to standard output just like calling
troff
manually.
In this document, the term
troff output
describes what is output by the GNU troff program, while
intermediate output
refers to the language that is accepted by the parser that prepares
this output for the postprocessors.
This parser is smarter about whitespace and implements obsolete elements
for compatibility; otherwise both formats are the same.
The pre-groff roff versions are denoted as
classical troff.
The main purpose of the intermediate output concept is to facilitate
the development of postprocessors by providing a common programming
interface for all devices.
It has a language of its own that is completely different from the
groff(7)
language.
While the
groff
language is a high-level programming language for text processing, the
intermediate output language is a kind of low-level assembly language
that specifies all positions on the page for writing and drawing.
The intermediate output produced by
groff
is fairly readable, while
classical troff
output was hard to understand because of strange habits that are
still supported, but not used any longer by
GNU troff.
LANGUAGE CONCEPTS
During the run of
troff,
the roff input is broken down into the information on what has to be
printed at what position on the intended device.
So the language of the intermediate output format can be quite small.
Its only elements are commands with or without arguments.
In this document, the term "command" always refers to the intermediate
output language, never to the roff language used for document
formatting.
There are commands for positioning and text writing, for drawing, and
for device controlling.
Separation
Classical troff output
had strange requirements on whitespace.
The
groff
output parser, however, is smart about whitespace by making it
maximally optional.
The whitespace characters, i.e. the
tab,
space,
and
newline
characters, always have a syntactical meaning.
They are never printable because spacing within the output is always
done by positioning commands.
Any sequence of
space
or
tab
characters is treated as a single
syntactical space.
It separates commands and arguments, but is only required when the
command code and the arguments would otherwise clash.
Most often, this happens when variable length command names,
arguments, argument lists, or command clusters meet.
Commands and arguments with a known, fixed length need not be
separated by syntactical space.
A line break is a syntactical element, too.
Every command argument can be followed by whitespace, a comment, or a
newline character.
Thus a
syntactical line break
is defined to consist of optional syntactical space that is optionally
followed by a comment, and a newline character.
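The definition above can be sketched as a Python regular expression.
This is only an illustration, not the groff parser: the names
SYNTACTICAL_LINE_BREAK and is_line_break are hypothetical, and the
sketch assumes that a comment starts with # and runs to the end of the
line.

```python
import re

# Optional run of spaces/tabs, then an optional comment (assumed here to
# start with '#' and extend to the end of the line), then the newline.
SYNTACTICAL_LINE_BREAK = re.compile(r"[ \t]*(?:#[^\n]*)?\n")


def is_line_break(text: str) -> bool:
    """Return True if all of `text` forms one syntactical line break."""
    return SYNTACTICAL_LINE_BREAK.fullmatch(text) is not None
```

With this sketch, a bare newline and a line holding only space and a
comment both count as a syntactical line break, while a line that
carries a command does not.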
The normal commands, those for positioning and text, consist of a
single letter taking a fixed number of arguments.
For historical reasons, the parser allows such commands to be stacked on
the same line, but fortunately, in groff intermediate output, every
command with at least one argument is followed by a line break, thus
providing excellent readability.
The other commands, those for drawing and device controlling,
have a more complicated structure; some recognize long command names,
and some take a variable number of arguments.
So all
D
and
x
commands were designed to request a
syntactical line break
after their last argument.
Only one command,
`x X',
has an argument that can stretch over several lines; all other
commands must have all of their arguments on the same line as the
command, i.e. the arguments may not be split by a line break.
Empty lines, i.e. lines containing only space and/or a comment, can
occur everywhere.
They are just ignored.
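A reader of intermediate output might drop such lines before further
processing. The following Python sketch shows one way to do so, under
two assumptions that are not taken from this document: a comment begins
with #, and the input contains no multi-line `x X' argument (whose
continuation lines a naive filter like this one could damage). The
function names are hypothetical.

```python
def is_empty_line(line: str) -> bool:
    """True for a line holding only spaces/tabs and/or a '#' comment."""
    stripped = line.lstrip(" \t")
    return stripped.rstrip("\n") == "" or stripped.startswith("#")


def drop_empty_lines(lines):
    """Return the lines that carry at least one command."""
    return [line for line in lines if not is_empty_line(line)]
```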
Argument Units
Some commands take integer arguments that are assumed to represent
values in a measurement unit, but the letter for the corresponding
scale indicator
is not written with the output command arguments; see
groff(7)
and the groff info file for more on this topic.
Most commands assume the scale indicator u,
the basic unit of the device; some use z,
the
scaled point unit
of the device, while others, such as the color commands, expect plain
integers.
Note that these scale indicators are relative to the chosen device.
They are defined by the parameters specified in the device's
DESC
file; see
groff_font(5).
Note that single characters can have the eighth bit set, as can the
names of fonts and special characters.
The names of characters and fonts can be of arbitrary length.
A character that is to be printed will always be in the current font.
A string argument is always terminated by the next whitespace
character (space, tab, or newline); an embedded
#
character is regarded as part of the argument, not as the beginning of
a comment command.
An integer argument is already terminated by the next non-digit
character, which then is regarded as the first character of the next
argument or command.
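The two termination rules can be illustrated with a small Python
sketch. It is a simplification, not the groff parser: read_integer and
read_string are hypothetical names, and read_integer accepts only
unsigned runs of ASCII digits.

```python
import string


def read_integer(text: str, pos: int = 0):
    """Read digits starting at `pos`; stop at the first non-digit.

    Returns (value, next_pos); the character at next_pos belongs to the
    next argument or command.
    """
    end = pos
    while end < len(text) and text[end] in string.digits:
        end += 1
    if end == pos:
        raise ValueError("no digits at position %d" % pos)
    return int(text[pos:end]), end


def read_string(text: str, pos: int = 0):
    """Read up to the next space, tab, or newline; '#' is ordinary here."""
    end = pos
    while end < len(text) and text[end] not in " \t\n":
        end += 1
    return text[pos:end], end
```

For example, reading an integer from "72V100" yields 72 and leaves the
V as the start of the next command, while reading a string from "a#b c"
yields "a#b", with the # kept as part of the argument.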
Document Parts
A correct intermediate output document consists of two parts, the
prologue and the body.
The task of the
prologue
is to set the general device parameters using three exactly specified
commands.
The
groff prologue
is guaranteed to consist of the following three lines (in that order):
x T device
x res n h v
x init
with the arguments set as outlined in the section
Device Control Commands.
But the parser for the intermediate output format is able to swallow
additional whitespace and comments as well.
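A prologue check along these lines might look as follows in Python.
This is a sketch under stated assumptions: parse_prologue is a
hypothetical name, tokens are assumed to be separated by spaces, a
comment is assumed to start with # and to occupy a line of its own, and
the three arguments of the resolution command are assumed to be
integers.

```python
def parse_prologue(lines):
    """Check the three prologue commands and return their arguments."""
    commands = []
    for line in lines:
        text = line.strip()
        # Skip lines that hold only whitespace or a comment.
        if text and not text.startswith("#"):
            commands.append(text.split())
        if len(commands) == 3:
            break
    if len(commands) < 3:
        raise ValueError("prologue needs three commands")
    dev, res, init = commands
    if dev[:2] != ["x", "T"] or len(dev) != 3:
        raise ValueError("expected 'x T device'")
    if res[:2] != ["x", "res"] or len(res) != 5:
        raise ValueError("expected 'x res n h v'")
    if init != ["x", "init"]:
        raise ValueError("expected 'x init'")
    n, h, v = (int(arg) for arg in res[2:])
    return {"device": dev[2], "res": n, "h": h, "v": v}
```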
The
body
is the main section for processing the document data.
Syntactically, it is a sequence of any commands different from the
ones used in the prologue.
Processing is terminated as soon as the first
x stop
command is encountered; the last line of any groff intermediate output
always contains such a command.
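The termination rule can be sketched in Python as below. The function
name lines_until_stop is hypothetical, and the sketch assumes that the
command sits at the start of its own line and is written out as
"x stop"; the real parser may accept other spellings.

```python
def lines_until_stop(lines):
    """Return the lines up to and including the first 'x stop' command."""
    kept = []
    for line in lines:
        kept.append(line)
        # Compare the first two whitespace-separated tokens of the line.
        if line.split()[:2] == ["x", "stop"]:
            break
    return kept
```

Anything after the first stop command is simply never looked at.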
Semantically, the body is page oriented.
A new page is started by a
p command.
Positioning, writing, and drawing commands are always done within the
current page, so they cannot occur before the first
p command.
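The page orientation of the body can be illustrated by grouping
commands per page. The Python sketch below is hypothetical
(split_pages is not part of groff) and assumes one command per line and
that the p command takes the page number as an integer argument. Lines
before the first p command, such as device control commands, are
dropped by this sketch.

```python
def split_pages(body_lines):
    """Group body commands by the page on which they occur."""
    pages = {}
    current = None
    for line in body_lines:
        text = line.strip()
        if text.startswith("p") and text[1:].strip().isdigit():
            # A p command starts a new page.
            current = int(text[1:])
            pages[current] = []
        elif current is not None and text:
            pages[current].append(text)
    return pages
```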
Absolute positioning (by the
H
and
V commands)
is done relative to the current page; all other positioning
is done relative to the current location within this page.