onsgmls Command in Linux



The onsgmls command is an SGML (Standard Generalized Markup Language) and XML (Extensible Markup Language) parser and validator commonly used in Unix and Linux systems. It parses and validates the SGML document whose document entity is specified by the system identifiers.

  • The onsgmls command prints a simple text representation of the document's Element Structure Information Set (ESIS) on the standard output. This ESIS is the information that a structure-controlled conforming SGML application should act upon.
  • If more than one system identifier is specified, the corresponding entities will be concatenated to form the document entity. This allows the document entity to be spread among several files, such as having the SGML declaration, prolog, and document instance set in separate files.
  • If no system identifiers are specified, onsgmls reads the document entity from the standard input. A command-line system identifier of - can be used in place of <OSFD>0, the system identifier that refers to the standard input (file descriptor 0).
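Because the ESIS representation is plain, line-oriented text, it can be post-processed with ordinary text tools. The following is a minimal sketch, assuming the usual output convention in which a line beginning with ( marks the start of an element and a line beginning with ) marks its end. The count_elements function and the sample text are hypothetical and were written by hand for illustration; they are not captured onsgmls output.

```shell
# count_elements: read ESIS text on stdin and print how many times
# each element type starts (hypothetical helper for this sketch).
count_elements() {
    awk '/^\(/ { n[substr($0, 2)]++ }
         END   { for (gi in n) print gi, n[gi] }' | sort
}

# Hand-written ESIS-style sample (an assumption, not real tool output).
sample_esis='(DOC
(PARA
-First paragraph
)PARA
(PARA
-Second paragraph
)PARA
)DOC
C'

printf '%s\n' "$sample_esis" | count_elements
```

With a real document, the same function could be fed directly from the parser, for example: onsgmls example.sgml | count_elements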

Here is a comprehensive guide to the options available with the onsgmls command −

Syntax of onsgmls Command

The general syntax for the onsgmls command is as follows −

onsgmls [options] [sysid...]

Where −

  • options − Various options to customize the parsing and validation process.
  • sysid − System identifiers specifying the document entity or entities.
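As a minimal sketch of passing more than one sysid, the script below keeps the prolog and the document instance in separate files and lets onsgmls concatenate them. The file names prolog.sgml and instance.sgml and the greeting element are hypothetical, and the parse is skipped when onsgmls is not installed.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Prolog: only the document type declaration.
cat > prolog.sgml <<'EOF'
<!DOCTYPE greeting [
<!ELEMENT greeting - - (#PCDATA)>
]>
EOF

# Document instance: the marked-up content.
cat > instance.sgml <<'EOF'
<greeting>Hello</greeting>
EOF

if command -v onsgmls >/dev/null 2>&1; then
    # The two entities are concatenated, in order, to form the document entity.
    onsgmls prolog.sgml instance.sgml
else
    echo "onsgmls not found; skipping parse" >&2
fi
```

The order of the system identifiers matters here, since the prolog has to come before the document instance.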

onsgmls Command Options

The following list highlights the common options available for the onsgmls command −

  • -alinktype, --activate=linktype − Make the specified link type active. Note that not all ESIS information is output in this case. The active LPDs are not explicitly reported, and when multiple link rules apply to the current element, onsgmls always chooses the first.
  • -A, --architecture=NAME − Parse with respect to the specified architecture.
  • -bbctf, --bctf=bctf, -bencoding, --encoding=encoding − Determine the encoding used for output. In fixed character set mode, this specifies the name of an encoding; otherwise, it specifies the name of a BCTF.
  • -B, --batch-mode − Enable batch mode, parsing each specified document separately rather than concatenating them. If -tfilename is also specified, the specified filename will be prefixed to the sysid to make the filename for the RAST result for each sysid.
  • -c, --catalog=SYSID − Map public identifiers and entity names to system identifiers using the catalog entry file specified by the sysid. Multiple -c options are allowed. If a catalog entry file called "catalog" exists in the same place as the document entity, it will be searched immediately after those specified by -c.
  • -C, --catalogs − Treat the command-line arguments as catalog files rather than as the document entity. The document entity is specified by the first DOCUMENT entry in the catalog files.
  • -D, --directory=DIRECTORY − Search the specified directory for files in system identifiers. Multiple -D options are allowed.
  • -e, --open-entities − Include descriptions of open entities in error messages. Error messages always include the position of the most recently opened external entity.
  • -E, --max-errors=NUMBER − Exit after the specified number of errors. If NUMBER is 0, there is no limit on the number of errors. The default is 200.
  • -f, --error-file=FILE − Redirect errors to the specified file. This is useful mainly with shells that do not support redirection of stderr.
  • -g, --open-elements − Show the generic identifiers of open elements in error messages.
  • -h, --help − Show a help message and exit.
  • -i, --include=NAME − Pretend that <!ENTITY % NAME "INCLUDE"> occurs at the start of the document type declaration subset. This will take precedence over any other definitions of this entity.
  • -n, --error-numbers − Show message numbers in error messages.
  • -o, --option=OPTION − Produce output according to OPTION.
  • -p, --only-prolog − Parse only the prolog and exit after parsing the document type declaration. Implies -s.
  • -R, --restricted − Restrict file reading to specified directories. Prevents reading of arbitrary files on a web server and limits filenames to specific characters.
  • -s, --no-output − Suppress output, but error messages will still be printed.
  • -t, --rast-file=FILE − Output the RAST result to the specified file.
  • -v, --version − Print the version number and exit.
  • -w, --warning=TYPE − Enable warning TYPE.

Examples of onsgmls Command in Linux

The following examples illustrate the versatility and power of the onsgmls command for parsing and validating SGML documents.

Parsing and Validating an SGML Document

To parse and validate an SGML document, you can use the following command −

onsgmls -c my_catalog example.sgml

In this example, "-c" specifies the catalog file where public identifiers and entity names are mapped to system identifiers.
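To make the role of the catalog concrete, here is a minimal sketch in which a PUBLIC catalog entry maps a public identifier to a DTD file. The public identifier "-//Example//DTD Greeting//EN" and the file names greeting.dtd, my_catalog, and example.sgml are hypothetical, and the parse is skipped when onsgmls is not installed.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A tiny DTD stored in its own file.
cat > greeting.dtd <<'EOF'
<!ELEMENT greeting - - (#PCDATA)>
EOF

# Catalog entry: public identifier on the left, system identifier on the right.
cat > my_catalog <<'EOF'
PUBLIC "-//Example//DTD Greeting//EN" "greeting.dtd"
EOF

# The document refers to the DTD by public identifier only.
cat > example.sgml <<'EOF'
<!DOCTYPE greeting PUBLIC "-//Example//DTD Greeting//EN">
<greeting>Hello</greeting>
EOF

if command -v onsgmls >/dev/null 2>&1; then
    # -s suppresses the ESIS output so only error messages, if any, appear.
    onsgmls -s -c my_catalog example.sgml
else
    echo "onsgmls not found; skipping parse" >&2
fi
```

Without the -c option, the parser would have no way to resolve the public identifier in this sketch, because the DOCTYPE declaration gives no system identifier.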

Reading from Standard Input

To read the SGML document entity from the standard input, simply run −

onsgmls

Because no system identifiers are specified, onsgmls reads the document entity from the standard input. This is useful when the document content is piped from another command.
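In practice, the standard input is usually supplied by a pipe rather than typed at the terminal. The sketch below shows both forms, with and without the explicit - system identifier. The doc variable and its contents are hypothetical, and the parse is skipped when onsgmls is not installed.

```shell
# A small document held in a shell variable (illustrative content).
doc='<!DOCTYPE greeting [
<!ELEMENT greeting - - (#PCDATA)>
]>
<greeting>Hello</greeting>'

if command -v onsgmls >/dev/null 2>&1; then
    # No system identifier: the piped text is read from standard input.
    printf '%s\n' "$doc" | onsgmls -s
    # The same, naming standard input explicitly with "-".
    printf '%s\n' "$doc" | onsgmls -s -
else
    echo "onsgmls not found; skipping parse" >&2
fi
```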

Activating a Link Type

To activate a specific link type, use the following command −

onsgmls -a alink example.sgml

In this example, "-a" specifies the link type to be activated. This option is used to process documents with specific link types active, although some ESIS information may not be output.

Parsing with Respect to an Architecture

To parse with respect to a specific architecture, you can use the onsgmls command with the "-A" flag −

onsgmls -A architecture_name example.sgml

This option is useful when a document must be processed according to the constraints of a named architecture (in SGML, typically a set of architectural forms) rather than only those of its own DTD.

Batch Mode Parsing

To parse each document specified on the command line separately rather than concatenating them, use the following command −

onsgmls -B file1.sgml file2.sgml

In this example, "-B" enables batch mode parsing, treating file1.sgml and file2.sgml as separate documents instead of concatenating them. This option is useful for processing multiple documents independently.
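A common use of batch mode is checking several complete documents in one run. The sketch below creates two self-contained documents and validates them together, adding -s so that only error messages are printed. The file contents are hypothetical, and the parse is skipped when onsgmls is not installed.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Each file is a complete document with its own prolog.
for name in file1 file2; do
    cat > "$name.sgml" <<'EOF'
<!DOCTYPE greeting [
<!ELEMENT greeting - - (#PCDATA)>
]>
<greeting>Hello</greeting>
EOF
done

if command -v onsgmls >/dev/null 2>&1; then
    # -B parses each file as its own document; -s suppresses the ESIS output.
    onsgmls -s -B file1.sgml file2.sgml
else
    echo "onsgmls not found; skipping parse" >&2
fi
```

Without -B, the second file's DOCTYPE declaration would appear in the middle of the concatenated document entity, which would be expected to produce errors.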

Limiting the Number of Errors

To set a maximum number of errors before onsgmls exits, use the following command −

onsgmls -E10 file1.sgml

In this example, -E10 sets the maximum number of errors to 10. Once 10 errors have been reported, onsgmls exits instead of continuing to parse.

Redirecting the Error Messages

To redirect errors to a specific file, you can use the following command −

onsgmls -f file_errors.log file2.sgml

In this example, "-f" specifies the file to which errors will be redirected. This option is useful for capturing error messages in a log file.
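The error-related options can be combined in a small validation script. The following is a minimal sketch that parses a deliberately invalid document, caps the error count, and writes the messages to a log file. The file bad.sgml and its undeclared b element are hypothetical, and the assumption that onsgmls returns a non-zero exit status when it reports errors should be checked on your system. The parse is skipped when onsgmls is not installed.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The <b> element is not declared in the DTD, so this document is invalid.
cat > bad.sgml <<'EOF'
<!DOCTYPE greeting [
<!ELEMENT greeting - - (#PCDATA)>
]>
<greeting>Hello <b>world</b></greeting>
EOF

if command -v onsgmls >/dev/null 2>&1; then
    # -s: no ESIS output; -E10: stop after 10 errors; -f: write errors to a file.
    onsgmls -s -E10 -f file_errors.log bad.sgml
    status=$?
    # Assumption: a non-zero status indicates that errors were reported.
    echo "onsgmls exit status: $status"
    [ -f file_errors.log ] && cat file_errors.log
else
    echo "onsgmls not found; skipping parse" >&2
fi
```

In shells that support it, redirecting stderr (for example with 2> file_errors.log) should capture the same messages without the -f option.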

Conclusion

The onsgmls command is a powerful tool for parsing and validating SGML and XML documents within Unix and Linux environments. By offering a range of customizable options and a flexible syntax, it allows you to efficiently handle complex document processing tasks, including validation, error handling, and working with multiple files.

Whether you're working with standard input, managing catalogs, or fine-tuning the parsing process for specific architectures or link types, onsgmls provides the necessary functionality to ensure your documents are properly structured and error-free.
