
onsgmls Command in Linux
The onsgmls command is an SGML (Standard Generalized Markup Language) and XML (Extensible Markup Language) parser and validator commonly used in Unix and Linux systems. It parses and validates the SGML document whose document entity is specified by the system identifiers.
- The onsgmls command prints a simple text representation of the document's Element Structure Information Set (ESIS) on the standard output. This ESIS is the information that a structure-controlled conforming SGML application should act upon.
- If more than one system identifier is specified, the corresponding entities will be concatenated to form the document entity. This allows the document entity to be spread among several files, such as having the SGML declaration, prolog, and document instance set in separate files.
- If no system identifiers are specified, onsgmls reads the document entity from the standard input. A command-line system identifier of - also refers to the standard input. (Within a system identifier itself, the standard input is normally written as <OSFD>0.)
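The concatenation and standard-input behavior described above can be sketched as follows. This is only an illustration: the onsgmls-demo directory, the file names prolog.sgml and instance.sgml, and the greeting document type are all assumptions made for the example, and the parse step is skipped if onsgmls is not installed.

```shell
#!/bin/sh
# Hypothetical scratch directory for the example files.
demo=onsgmls-demo
mkdir -p "$demo"

# Hypothetical prolog: a document type declaration with a single element.
cat > "$demo/prolog.sgml" <<'EOF'
<!DOCTYPE greeting [
<!ELEMENT greeting - - (#PCDATA)>
]>
EOF

# Hypothetical document instance, kept in its own file.
cat > "$demo/instance.sgml" <<'EOF'
<greeting>Hello, world</greeting>
EOF

if command -v onsgmls >/dev/null 2>&1; then
    # Two system identifiers: the entities are concatenated to form the document entity.
    onsgmls "$demo/prolog.sgml" "$demo/instance.sgml"

    # The same document read from standard input, using "-" as the system identifier.
    cat "$demo/prolog.sgml" "$demo/instance.sgml" | onsgmls -
else
    echo "onsgmls not found; skipping the parse step" >&2
fi
```

In the ESIS output, a line beginning with ( marks the start of an element, ) marks its end, - carries character data, and a final C line indicates that the document was found to be conforming.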
Here is a comprehensive guide to the options available with the onsgmls command −
Syntax of onsgmls Command
The general syntax for the onsgmls command is as follows −
onsgmls [options] [sysid...]
Where −
- options − Various options to customize the parsing and validation process.
- sysid − System identifiers specifying the document entity or entities.
onsgmls Command Options
The following table highlights the common options available for the onsgmls command −
Option | Description |
---|---|
-alinktype, --activate=linktype | Make the specified link type active. Note that not all ESIS information is output in this case. The active LPDs are not explicitly reported, and when multiple link rules apply to the current element, onsgmls always chooses the first. |
-A, --architecture=NAME | Parse with respect to the specified architecture. |
-bbctf, --bctf=bctf, -bencoding, --encoding=encoding | Determine the encoding used for output. If in fixed character set mode, this specifies the name of an encoding; otherwise, it specifies the name of a BCTF. |
-B, --batch-mode | Enable batch mode, parsing each specified document separately rather than concatenating them. This is useful mainly with -s. If -tfilename is also specified, the specified filename will be prefixed to the sysid to make the filename for the RAST result for each sysid. |
-c, --catalog=SYSID | Map public identifiers and entity names to system identifiers using the catalog entry file specified by the sysid. Multiple -c options are allowed. If a catalog entry file called "catalog" exists in the same place as the document entity, it will be searched immediately after those specified by -c. |
-C, --catalogs | Specify catalog files rather than the document entity. The document entity is specified by the first DOCUMENT entry in the catalog files. |
-D, --directory=DIRECTORY | Search the specified directory for files in system identifiers. Multiple -D options are allowed. |
-e, --open-entities | Include descriptions of open entities in error messages. Error messages always include the position of the most recently opened external entity. |
-E, --max-errors=NUMBER | Exit after the specified number of errors. If NUMBER is 0, there is no limit on the number of errors. The default is 200. |
-f, --error-file=FILE | Redirect errors to the specified file. This is useful mainly with shells that do not support redirection of stderr. |
-g, --open-elements | Show the generic identifiers of open elements in error messages. |
-h, --help | Show a help message and exit. |
-i, --include=NAME | Pretend that <!ENTITY % name "INCLUDE"> occurs at the start of the document type declaration subset. This will take precedence over any other definitions of this entity. |
-n, --error-numbers | Show message numbers in error messages. |
-o, --option=OPTION | Produce output according to OPTION. |
-p, --only-prolog | Parse only the prolog and exit after parsing the document type declaration. Implies -s. |
-R, --restricted | Restrict file reading to specified directories. Prevents reading of arbitrary files on a web server and limits filenames to specific characters. |
-s, --no-output | Suppress output, but error messages will still be printed. |
-t, --rast-file=FILE | Output the RAST result to the specified file. |
-v, --version | Print the version number and exit. |
-w, --warning=TYPE | Enable warning TYPE. |
Examples of onsgmls Command in Linux
The following examples illustrate the versatility and power of the onsgmls command for parsing and validating SGML documents.
Parsing and Validating an SGML Document
To parse and validate an SGML document, you can use the following command −
onsgmls -c my_catalog example.sgml
In this example, "-c" specifies the catalog file where public identifiers and entity names are mapped to system identifiers.
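As a rough sketch of what such a catalog might contain, the script below builds a one-entry catalog and a document that refers to its DTD only by public identifier. The public identifier "-//Example//DTD Greeting//EN", the greeting.dtd file, and the onsgmls-demo directory are hypothetical names chosen for illustration, and the sketch assumes that the relative system identifier in the catalog is resolved relative to the catalog file.

```shell
#!/bin/sh
# Hypothetical scratch directory for the example files.
demo=onsgmls-demo
mkdir -p "$demo"

# Hypothetical DTD declaring a single element.
cat > "$demo/greeting.dtd" <<'EOF'
<!ELEMENT greeting - - (#PCDATA)>
EOF

# Catalog entry file: maps the public identifier to the DTD file.
cat > "$demo/my_catalog" <<'EOF'
PUBLIC "-//Example//DTD Greeting//EN" "greeting.dtd"
EOF

# The document names its DTD by public identifier only.
cat > "$demo/example.sgml" <<'EOF'
<!DOCTYPE greeting PUBLIC "-//Example//DTD Greeting//EN">
<greeting>Hello, world</greeting>
EOF

if command -v onsgmls >/dev/null 2>&1; then
    # -c tells onsgmls which catalog to consult when resolving the public identifier.
    onsgmls -c "$demo/my_catalog" "$demo/example.sgml"
else
    echo "onsgmls not found; skipping the parse step" >&2
fi
```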

Reading from Standard Input
To read the SGML document entity from the standard input, simply run −
onsgmls
With no system identifiers on the command line, onsgmls reads the document entity from the standard input, which is useful when the document content is piped from another command.

Activating a Link Type
To activate a specific link type, use the following command −
onsgmls -a alink example.sgml
In this example, "-a" specifies the link type to be activated. This option is used to process documents with specific link types active, although some ESIS information may not be output.

Parsing with Respect to an Architecture
To parse with respect to a specific architecture, use the onsgmls command with the "-A" flag −
onsgmls -A architecture_name example.sgml
This command is useful when dealing with documents that need to be parsed according to a certain set of constraints or standards, such as specialized DTDs, processing models, or document formats.

Batch Mode Parsing
To parse each document specified on the command line separately rather than concatenating them, use the following command −
onsgmls -B file1.sgml file2.sgml
In this example, "-B" enables batch mode parsing, treating file1.sgml and file2.sgml as separate documents instead of concatenating them. This option is useful for processing multiple documents independently.
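Batch mode is often paired with -s so that only error messages are printed for each file. A minimal sketch, assuming two small stand-alone documents created purely for illustration in a hypothetical onsgmls-demo directory:

```shell
#!/bin/sh
# Hypothetical scratch directory for the example files.
demo=onsgmls-demo
mkdir -p "$demo"

# Create two complete, independent documents (prolog plus instance in each).
for name in file1 file2; do
    cat > "$demo/$name.sgml" <<EOF
<!DOCTYPE greeting [
<!ELEMENT greeting - - (#PCDATA)>
]>
<greeting>Hello from $name</greeting>
EOF
done

if command -v onsgmls >/dev/null 2>&1; then
    # -B parses each file as its own document; -s suppresses the ESIS output.
    onsgmls -B -s "$demo/file1.sgml" "$demo/file2.sgml"
else
    echo "onsgmls not found; skipping the parse step" >&2
fi
```

Without -B, the two files would be concatenated into a single document entity, and the second DOCTYPE declaration would then be reported as an error.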

Limiting the Number of Errors
To set a maximum number of errors before onsgmls exits, use the following command −
onsgmls -E10 file1.sgml
In this example, -E10 sets the maximum number of errors to 10; onsgmls exits once that many errors have been reported.

Redirecting the Error Messages
To redirect errors to a specific file, you can use the following command −
onsgmls -f file_errors.log file2.sgml
In this example, "-f" specifies the file to which errors will be redirected. This option is useful for capturing error messages in a log file.
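The error-handling options can be combined in a small validation script. The sketch below is an illustration under stated assumptions: invalid.sgml is a deliberately broken hypothetical document, the onsgmls-demo directory is made up for the example, and the script assumes that onsgmls returns a non-zero exit status when it reports errors.

```shell
#!/bin/sh
# Hypothetical scratch directory for the example files.
demo=onsgmls-demo
mkdir -p "$demo"

# Hypothetical invalid document: <note> is used but never declared.
cat > "$demo/invalid.sgml" <<'EOF'
<!DOCTYPE greeting [
<!ELEMENT greeting - - (#PCDATA)>
]>
<greeting>Hello, <note>world</note></greeting>
EOF

if command -v onsgmls >/dev/null 2>&1; then
    # -s suppresses ESIS output, -E10 stops after ten errors,
    # and -f collects the error messages in a log file.
    if onsgmls -s -E10 -f "$demo/file_errors.log" "$demo/invalid.sgml"; then
        echo "no errors reported"
    else
        # Assumption: a non-zero exit status means errors were reported.
        echo "errors reported; see $demo/file_errors.log"
        cat "$demo/file_errors.log"
    fi
else
    echo "onsgmls not found; skipping the validation step" >&2
fi
```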

Conclusion
The onsgmls command is a powerful tool for parsing and validating SGML and XML documents within Unix and Linux environments. By offering a range of customizable options and a flexible syntax, it allows you to efficiently handle complex document processing tasks, including validation, error handling, and working with multiple files.
Whether you're working with standard input, managing catalogs, or fine-tuning the parsing process for specific architectures or link types, onsgmls provides the necessary functionality to ensure your documents are properly structured and error-free.