What is Text Steganography in Information Security?

Text steganography is the practice of hiding a secret text message inside another text that serves as a cover, or of generating a cover message derived from the original secret message.

Text steganography can involve anything from changing the formatting of an existing text, to changing words within a text, to generating random character sequences or using context-free grammars to produce readable texts.

Text steganography is considered the most difficult kind of steganography because text lacks the redundant data that is present in image, audio, or video files. The structure of a text document is identical to what the reader sees, whereas in other types of documents, such as pictures, the structure of the document differs from what the viewer sees.

Thus, in such documents, information can be hidden by introducing changes in the structure of the document without causing a noticeable change in the output.

Imperceptible changes can be made to an image or an audio file, but in a text file even an additional letter or punctuation mark can be noticed by a casual reader. On the other hand, text files require less memory to store and are faster and simpler to communicate, which can make text steganography preferable to other types of steganographic methods.

Text steganography can be broadly classified into three types: format-based methods, random and statistical generation, and linguistic methods, which are as follows −

Format Based Methods − Format-based methods physically alter the format of the text to conceal data. These methods have specific flaws. If the stego file is opened with a word processor, misspellings and extra white spaces will be detected.

Changed font sizes can arouse the suspicion of a human reader. Moreover, if the original plaintext is available, comparing it with the suspected steganographic text makes the manipulated parts of the text quite visible.
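As a minimal sketch of a format-based method, the Python example below hides one bit per line in trailing whitespace. The convention used here (a trailing space for bit 1, a trailing tab for bit 0) and the function names `embed_bits` and `extract_bits` are assumptions made for illustration, not a standard scheme.

```python
def embed_bits(cover_lines, bits):
    """Hide one bit per line: trailing space = '1', trailing tab = '0'.

    This encoding convention is an illustrative assumption.
    """
    if len(bits) > len(cover_lines):
        raise ValueError("cover text has too few lines for the payload")
    stego = []
    for i, line in enumerate(cover_lines):
        line = line.rstrip(" \t")  # drop any whitespace already at the end
        if i < len(bits):
            line += " " if bits[i] == "1" else "\t"
        stego.append(line)
    return stego


def extract_bits(stego_lines):
    """Read the hidden bits back from the trailing whitespace of each line."""
    bits = []
    for line in stego_lines:
        if line.endswith(" "):
            bits.append("1")
        elif line.endswith("\t"):
            bits.append("0")
    return "".join(bits)


cover = [
    "Dear team,",
    "the meeting moved to Friday.",
    "Please bring the report.",
    "Regards, Sam",
]
stego = embed_bits(cover, "1011")
print(extract_bits(stego))  # prints 1011
```

The visible text is unchanged, but the flaws described above still apply: an editor that shows whitespace, or a comparison against the original cover text, reveals the modified lines immediately.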

Random and Statistical Generation − To avoid comparison with a known plaintext, steganographers often resort to generating their own cover texts. One method is to conceal data in a random-looking sequence of characters.

In another method, the statistical properties of word length and letter frequency are used to generate words that appear to have the same statistical properties as actual words in the given language.
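The statistical approach can be sketched roughly as follows. This Python example generates pseudo-words whose letters are sampled according to approximate English letter frequencies, and it carries the payload in the parity of each word's length. Both the rounded frequency table and the parity convention are assumptions chosen for illustration, and the names `generate_cover` and `recover_bits` are hypothetical.

```python
import random

# Approximate English letter frequencies in percent (rounded, illustrative).
LETTER_FREQ = {
    "e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7, "s": 6.3,
    "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8, "u": 2.8, "m": 2.4,
    "w": 2.4, "f": 2.2, "g": 2.0, "y": 2.0, "p": 1.9, "b": 1.5, "v": 1.0,
    "k": 0.8, "j": 0.15, "x": 0.15, "q": 0.1, "z": 0.07,
}
LETTERS = list(LETTER_FREQ)
WEIGHTS = list(LETTER_FREQ.values())

# Assumed convention: even word length carries bit 0, odd length carries bit 1.
LENGTHS = {"0": [2, 4, 6, 8], "1": [3, 5, 7]}


def generate_cover(bits, rng):
    """Generate one pseudo-word per bit; the word length parity holds the bit."""
    words = []
    for bit in bits:
        length = rng.choice(LENGTHS[bit])
        letters = rng.choices(LETTERS, weights=WEIGHTS, k=length)
        words.append("".join(letters))
    return " ".join(words)


def recover_bits(cover):
    """Recover the payload from the parity of each word's length."""
    return "".join(str(len(word) % 2) for word in cover.split())


rng = random.Random(42)  # seeded only to make the demo repeatable
cover = generate_cover("0110", rng)
print(cover)
print(recover_bits(cover))  # prints 0110
```

The output imitates the letter statistics of English but consists of meaningless words, so it may pass an automated statistical check more easily than a human reader.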

Linguistic Steganography − Linguistic steganography specifically considers the linguistic properties of generated and modified text, and in some cases uses linguistic structure as the space in which messages are hidden.

A context-free grammar (CFG) can produce a tree structure that can be used to conceal bits, where a left branch represents ‘0’ and a right branch represents ‘1’.

A grammar in Greibach normal form (GNF) can also be used, where the first choice in a production represents bit 0 and the second choice represents bit 1. This method has some disadvantages. First, a small grammar will lead to a lot of text repetition.

Second, although the text is grammatically flawless, it lacks semantic structure. The result is a string of sentences that have no relation to one another.
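The grammar-based idea can be illustrated with a small Python sketch. The toy grammar below is hypothetical: every production starts with a terminal word (as in GNF) and each nonterminal has exactly two productions, so the first choice encodes bit 0 and the second encodes bit 1. Padding an unfinished sentence with 0 bits is also an assumption of this sketch, which means the receiver needs to know the payload length.

```python
# Hypothetical toy grammar: each production begins with a terminal word and
# each nonterminal offers exactly two choices (index 0 = bit 0, index 1 = bit 1).
GRAMMAR = {
    "S": [["the", "N", "V"], ["a", "N", "V"]],
    "N": [["dog"], ["cat"]],
    "V": [["sleeps"], ["sees", "O"]],
    "O": [["the", "N"], ["a", "N"]],
}


def encode(bits):
    """Generate sentences; each bit selects one of two productions."""
    queue = list(bits)
    sentences = []
    while queue:
        stack, words = ["S"], []
        while stack:
            symbol = stack.pop()
            if symbol in GRAMMAR:
                # Pad with 0 if the payload ends in the middle of a sentence.
                bit = int(queue.pop(0)) if queue else 0
                stack.extend(reversed(GRAMMAR[symbol][bit]))
            else:
                words.append(symbol)
        sentences.append(" ".join(words))
    return sentences


def decode(sentences):
    """Replay the derivation; the leading word of each production gives the bit."""
    bits = []
    for sentence in sentences:
        tokens, pos, stack = sentence.split(), 0, ["S"]
        while stack:
            symbol = stack.pop()
            if symbol in GRAMMAR:
                first_words = [prod[0] for prod in GRAMMAR[symbol]]
                bit = first_words.index(tokens[pos])
                bits.append(str(bit))
                stack.extend(reversed(GRAMMAR[symbol][bit]))
            else:
                pos += 1  # terminal word consumed
    return "".join(bits)


secret = "10110"
text = encode(secret)
print(text)                         # prints ['a dog sees a dog']
print(decode(text)[:len(secret)])   # prints 10110
```

Even this short run shows both drawbacks: with so few productions the same words recur constantly, and longer payloads yield a sequence of grammatical but unrelated sentences.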