AI with Python – Natural Language Processing


Natural Language Processing (NLP) refers to the AI method of communicating with intelligent systems using a natural language such as English.

Processing of natural language is required when you want an intelligent system such as a robot to act on your instructions, when you want to hear a decision from a dialogue-based clinical expert system, and so on.

The field of NLP involves making computers perform useful tasks with the natural languages humans use. The input and output of an NLP system can be −

  • Speech
  • Written Text

Components of NLP

In this section, we will learn about the two components of NLP, which are described below −

Natural Language Understanding (NLU)

It involves the following tasks −

  • Mapping the given input in natural language into useful representations.

  • Analyzing different aspects of the language.

Natural Language Generation (NLG)

It is the process of producing meaningful phrases and sentences in the form of natural language from some internal representation. It involves −

  • Text planning − This includes retrieving the relevant content from the knowledge base.

  • Sentence planning − This includes choosing the required words, forming meaningful phrases, and setting the tone of the sentence.

  • Text Realization − This is mapping the sentence plan into sentence structure.
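The three NLG stages above can be sketched as a small pipeline. This is only a toy illustration under stated assumptions: the knowledge base, the function names (`plan_text`, `plan_sentence`, `realize`, `generate`) and the weather facts are all hypothetical and invented for this example. Real NLG systems are far more elaborate.

```python
# Hypothetical knowledge base; the cities and facts are invented for illustration.
KNOWLEDGE_BASE = {
    "Pune": {"condition": "rain", "temperature_c": 24},
    "Delhi": {"condition": "sunshine", "temperature_c": 38},
}


def plan_text(city):
    """Text planning: retrieve the relevant content from the knowledge base."""
    return {"city": city, **KNOWLEDGE_BASE[city]}


def plan_sentence(content):
    """Sentence planning: choose the words and set the tone of the sentence."""
    temperature = content["temperature_c"]
    feel = "hot" if temperature >= 35 else "mild"  # assumed threshold
    return {
        "subject": content["city"],
        "verb": "expects",
        "object": content["condition"],
        "modifier": f"with a {feel} {temperature} degrees Celsius",
    }


def realize(plan):
    """Text realization: map the sentence plan onto a sentence structure."""
    return f"{plan['subject']} {plan['verb']} {plan['object']} {plan['modifier']}."


def generate(city):
    """Run the three stages in order."""
    return realize(plan_sentence(plan_text(city)))


print(generate("Pune"))  # Pune expects rain with a mild 24 degrees Celsius.
```

Keeping the stages as separate functions mirrors the list above: the content, the wording and the final sentence structure can each be changed without touching the other two.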

Difficulties in NLU

Natural language is very rich in form and structure; however, it is also ambiguous, which is what makes NLU difficult. There can be different levels of ambiguity −

Lexical ambiguity

It occurs at a very primitive level, such as the word level. For example, should the word “board” be treated as a noun or as a verb?

Syntax level ambiguity

A sentence can be parsed in different ways. For example, “He lifted the beetle with red cap.” − Did he use a cap to lift the beetle, or did he lift a beetle that had a red cap?
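One way to make this ambiguity concrete is to count how many parse trees a grammar assigns to the sentence. The sketch below uses a CYK-style chart over a tiny hand-written grammar. The grammar rules, the lexicon and the function name `count_parses` are assumptions made for this illustration, not part of any library.

```python
from collections import defaultdict

# Hypothetical toy grammar in binary (Chomsky normal) form: parent -> left right.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # "with red cap" attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "Adj", "N"),
    ("NP", "NP", "PP"),   # "with red cap" attaches to the noun phrase
    ("PP", "P", "NP"),
]

# Hypothetical lexicon: word -> possible categories.
LEXICON = {
    "he": ["NP"], "lifted": ["V"], "the": ["Det"], "beetle": ["N"],
    "with": ["P"], "red": ["Adj"], "cap": ["N"],
}


def count_parses(words, start="S"):
    """Count the distinct parse trees of category `start` that cover all words."""
    n = len(words)
    # chart[(i, j)][A] = number of trees of category A spanning words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICON.get(word, []):
            chart[(i, i + 1)][category] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between left and right child
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += chart[(i, k)][left] * chart[(k, j)][right]
    return chart[(0, n)][start]


print(count_parses("he lifted the beetle with red cap".split()))  # 2
print(count_parses("he lifted the beetle".split()))               # 1
```

The two trees correspond to the two readings: the prepositional phrase either modifies the verb phrase (the cap is the instrument) or the noun phrase (the beetle has the cap).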

Referential ambiguity

It arises when something is referred to using pronouns. For example, Rima went to Gauri. She said, “I am tired.” − Exactly who is tired?

NLP Terminology

Let us now see a few important terms in the NLP terminology.

  • Phonology − It is the study of organizing sound systematically.

  • Morphology − It is the study of the construction of words from primitive meaningful units.

  • Morpheme − It is a primitive unit of meaning in a language.

  • Syntax − It refers to arranging words to make a sentence. It also involves determining the structural role of words in the sentence and in phrases.

  • Semantics − It is concerned with the meaning of words and how to combine words into meaningful phrases and sentences.

  • Pragmatics − It deals with using and understanding sentences in different situations and how the interpretation of the sentence is affected.

  • Discourse − It deals with how the immediately preceding sentence can affect the interpretation of the next sentence.

  • World Knowledge − It includes the general knowledge about the world.
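To illustrate the terms morphology and morpheme from the list above, the following sketch splits a word into a prefix, a root and suffixes. The affix lists, the root list and the function names are hypothetical and deliberately tiny. Real morphological analysis also has to handle spelling changes, which this sketch ignores.

```python
# Hypothetical, deliberately small inventories of morphemes.
PREFIXES = ("un", "re")
ROOTS = ("break", "play")
SUFFIXES = ("s", "er", "able", "ing", "ed")


def split_suffixes(rest):
    """Split `rest` into a sequence of known suffixes, or return None."""
    if not rest:
        return []
    for suffix in SUFFIXES:
        if rest.startswith(suffix):
            tail = split_suffixes(rest[len(suffix):])
            if tail is not None:
                return [suffix] + tail
    return None


def segment(word):
    """Return the morphemes of `word` as a list, or None if no split is found."""
    for prefix in ("",) + PREFIXES:
        if not word.startswith(prefix):
            continue
        stem = word[len(prefix):]
        for root in ROOTS:
            if stem.startswith(root):
                suffixes = split_suffixes(stem[len(root):])
                if suffixes is not None:
                    return ([prefix] if prefix else []) + [root] + suffixes
    return None


print(segment("unbreakable"))  # ['un', 'break', 'able']
print(segment("players"))      # ['play', 'er', 's']
```

Each element of the returned list is a morpheme, and the way the elements combine into a word is what morphology studies.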

Steps in NLP

This section shows the different steps in NLP.

Lexical Analysis

It involves identifying and analyzing the structure of words. The lexicon of a language is the collection of words and phrases in that language. Lexical analysis divides the whole chunk of text into paragraphs, sentences, and words.
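A minimal sketch of this division, using only the standard `re` module, is shown below. The function name `lexical_analysis` and the splitting rules are assumptions for illustration. In particular, splitting sentences on `.`, `!` or `?` is naive and will break on abbreviations such as "Dr.".

```python
import re


def lexical_analysis(text):
    """Divide text into paragraphs, each a list of sentences, each a list of words."""
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    result = []
    for paragraph in paragraphs:
        # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
        # Words are runs of letters, digits and apostrophes.
        result.append([re.findall(r"[A-Za-z0-9']+", s) for s in sentences])
    return result


sample = "Rima went to Gauri. She said she was tired.\n\nThe boy goes to school."
for paragraph in lexical_analysis(sample):
    print(paragraph)
```

The result is a nested list, so `lexical_analysis(sample)[0][1]` is the word list of the second sentence in the first paragraph.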

Syntactic Analysis (Parsing)

It involves analyzing the words in a sentence for grammar and arranging them in a manner that shows the relationships among the words. A sentence such as “The school goes to boy” is rejected by an English syntactic analyzer.
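The rejection can be sketched with a toy grammar checker. The word lists and the single sentence pattern (noun phrase, verb, preposition, noun phrase) are assumptions made for this example. The sketch assumes that "boy" needs a determiner while "school" may appear without one, which is why the second sentence fails.

```python
# Hypothetical word lists for a toy fragment of English.
DETERMINERS = {"the", "a"}
COUNT_NOUNS = {"boy", "school"}  # may follow a determiner
BARE_NOUNS = {"school"}          # may also appear without a determiner
VERBS = {"goes"}
PREPOSITIONS = {"to"}


def parse_np(tokens, pos):
    """Return the position just after a noun phrase starting at pos, or None."""
    if pos < len(tokens) and tokens[pos] in DETERMINERS:
        if pos + 1 < len(tokens) and tokens[pos + 1] in COUNT_NOUNS:
            return pos + 2
        return None
    if pos < len(tokens) and tokens[pos] in BARE_NOUNS:
        return pos + 1
    return None


def is_grammatical(sentence):
    """Accept sentences of the form: NP Verb Preposition NP."""
    tokens = sentence.lower().rstrip(".").split()
    pos = parse_np(tokens, 0)
    if pos is None or pos >= len(tokens) or tokens[pos] not in VERBS:
        return False
    pos += 1
    if pos >= len(tokens) or tokens[pos] not in PREPOSITIONS:
        return False
    # The sentence is accepted only if the final noun phrase ends the input.
    return parse_np(tokens, pos + 1) == len(tokens)


print(is_grammatical("The boy goes to school"))  # True
print(is_grammatical("The school goes to boy"))  # False
```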

Semantic Analysis

It draws the exact meaning, or the dictionary meaning, from the text. The text is checked for meaningfulness. This is done by mapping between syntactic structures and objects in the task domain. The semantic analyzer rejects a phrase such as “hot ice-cream”.
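A toy version of such a meaningfulness check is sketched below. It assumes a hypothetical task domain in which some nouns carry an inherent property, and an adjective is rejected when it contradicts that property. The dictionaries and the function name `is_meaningful` are invented for illustration.

```python
# Hypothetical domain knowledge: properties that are inherent to some objects.
NOUN_PROPERTIES = {
    "ice-cream": {"temperature": "cold"},
    "fire": {"temperature": "hot"},
    "soup": {},  # no inherent temperature
}

# Hypothetical adjective meanings: the attribute and value each one asserts.
ADJECTIVE_MEANINGS = {
    "hot": ("temperature", "hot"),
    "cold": ("temperature", "cold"),
}


def is_meaningful(adjective, noun):
    """Reject an adjective that contradicts an inherent property of the noun."""
    attribute, value = ADJECTIVE_MEANINGS[adjective]
    inherent = NOUN_PROPERTIES.get(noun, {})
    # If the noun has no inherent value for the attribute, there is no clash.
    return inherent.get(attribute, value) == value


print(is_meaningful("hot", "ice-cream"))   # False
print(is_meaningful("cold", "ice-cream"))  # True
print(is_meaningful("hot", "soup"))        # True
```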

Discourse Integration

The meaning of any sentence depends upon the meaning of the sentence just before it. In addition, it also influences the meaning of the immediately succeeding sentence.
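Pronouns are a common case of this dependence: a pronoun can only be interpreted by looking at the preceding sentence. The sketch below lists the names in the previous sentence that a pronoun could refer to. The name table and the function name `antecedent_candidates` are hypothetical. More than one candidate means the reference stays ambiguous.

```python
# Hypothetical table mapping known names to the pronoun used for them.
NAME_PRONOUNS = {"Rima": "she", "Gauri": "she", "Arun": "he"}


def antecedent_candidates(previous_sentence, pronoun):
    """Return names in the previous sentence that the pronoun could refer to."""
    words = [w.strip(".,!?\"'") for w in previous_sentence.split()]
    return [w for w in words if NAME_PRONOUNS.get(w) == pronoun.lower()]


print(antecedent_candidates("Rima went to Gauri.", "She"))  # ['Rima', 'Gauri']
print(antecedent_candidates("Arun went to Gauri.", "She"))  # ['Gauri']
```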

Pragmatic Analysis

During this step, what was said is re-interpreted in terms of what was actually meant. It involves deriving those aspects of language which require real-world knowledge.
