Elasticsearch - Ingest Node




Sometimes we need to transform a document before we index it, for instance by removing a field or renaming a field. This kind of pre-processing is handled by the ingest node.

By default, every node in the cluster can act as an ingest node, but ingest processing can also be restricted to specific nodes.
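To make the idea of transforming a document before indexing more concrete, here is a minimal Python sketch of processors running in order over a document. It is a toy model only, not how Elasticsearch implements ingest; the helper names (rename, remove, run_pipeline) and the target field name "address" are assumptions made for illustration.

```python
from copy import deepcopy

def rename(field, target_field):
    """Return a processor that moves the value of `field` to `target_field`."""
    def processor(doc):
        doc[target_field] = doc.pop(field)
    return processor

def remove(field):
    """Return a processor that deletes `field` from the document."""
    def processor(doc):
        del doc[field]
    return processor

def run_pipeline(processors, doc):
    """Apply each processor in order to a copy of `doc` and return the result."""
    result = deepcopy(doc)
    for processor in processors:
        processor(result)
    return result

# Hypothetical document and pipeline, for illustration only.
incoming = {"seq": "21", "name": "Tutorialspoint", "Addrs": "Hyderabad"}
pipeline = [rename("Addrs", "address"), remove("seq")]
transformed = run_pipeline(pipeline, incoming)
print(transformed)  # {'name': 'Tutorialspoint', 'address': 'Hyderabad'}
```

The point of the sketch is the ordering: each processor receives the document as the previous processor left it, and only the final result is what gets indexed.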

Steps Involved

Using the ingest node involves two steps −

  • Creating a pipeline
  • Creating a doc

Create a Pipeline

First, we create a pipeline that contains the processors to run. The pipeline below, named int-converter, uses the convert processor to turn the seq field into an integer −

PUT _ingest/pipeline/int-converter
{
   "description": "converts the content of the seq field to an integer",
   "processors" : [
      {
         "convert" : {
            "field" : "seq",
            "type": "integer"
         }
      }
   ]
}

On running the above code, we get the following result −

{
   "acknowledged" : true
}
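The same pipeline can also be created from a script instead of the console. The following is a rough sketch using only the Python standard library; it assumes an unsecured cluster reachable at http://localhost:9200 (an assumption, not something stated in this chapter), and the function name build_put_pipeline_request is hypothetical. The request is built but not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request

# Assumption: a local, unsecured cluster. Adjust the URL and add auth as needed.
ES_URL = "http://localhost:9200"

def build_put_pipeline_request(pipeline_id, description, processors):
    """Build (but do not send) the PUT request that creates an ingest pipeline."""
    body = {"description": description, "processors": processors}
    return urllib.request.Request(
        url=f"{ES_URL}/_ingest/pipeline/{pipeline_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

request = build_put_pipeline_request(
    "int-converter",
    "converts the content of the seq field to an integer",
    [{"convert": {"field": "seq", "type": "integer"}}],
)
print(request.get_method(), request.full_url)

# To send it against a running cluster (not executed here):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and body before anything reaches the cluster.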

Create a Doc

Next, we create a document and send it through the int-converter pipeline by adding the pipeline query parameter to the request.

PUT /logs/_doc/1?pipeline=int-converter
{
   "seq":"21",
   "name":"Tutorialspoint",
   "Addrs":"Hyderabad"
}

On running the above code, we get the response as shown below −

{
   "_index" : "logs",
   "_type" : "_doc",
   "_id" : "1",
   "_version" : 1,
   "result" : "created",
   "_shards" : {
      "total" : 2,
      "successful" : 1,
      "failed" : 0
   },
   "_seq_no" : 0,
   "_primary_term" : 1
}

Next, we retrieve the doc created above by using the GET command as shown below −

GET /logs/_doc/1

On running the above code, we get the following result −

{
   "_index" : "logs",
   "_type" : "_doc",
   "_id" : "1",
   "_version" : 1,
   "_seq_no" : 0,
   "_primary_term" : 1,
   "found" : true,
   "_source" : {
      "Addrs" : "Hyderabad",
      "name" : "Tutorialspoint",
      "seq" : 21
   }
}

You can see above that the seq value, which was sent as the string "21", is stored as the integer 21.

Without Pipeline

Now we create a document without using the pipeline and then retrieve it.

PUT /logs/_doc/2
{
   "seq":"11",
   "name":"Tutorix",
   "Addrs":"Secunderabad"
}
GET /logs/_doc/2

On running the above code, we get the following result −

{
   "_index" : "logs",
   "_type" : "_doc",
   "_id" : "2",
   "_version" : 1,
   "_seq_no" : 1,
   "_primary_term" : 1,
   "found" : true,
   "_source" : {
      "seq" : "11",
      "name" : "Tutorix",
      "Addrs" : "Secunderabad"
   }
}

You can see above that, without the pipeline, the seq value remains the string "11".
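The only difference between the two indexing requests in this chapter is the pipeline query parameter. As a small illustration, the Python sketch below builds both request paths; the helper name index_doc_path is hypothetical, and the snippet only constructs strings, it does not contact a cluster.

```python
from urllib.parse import quote, urlencode

def index_doc_path(index, doc_id, pipeline=None):
    """Return the request path for indexing a document, optionally via a pipeline."""
    path = f"/{quote(index, safe='')}/_doc/{quote(str(doc_id), safe='')}"
    if pipeline is not None:
        # The pipeline is selected per request through the query string.
        path += "?" + urlencode({"pipeline": pipeline})
    return path

with_pipeline = index_doc_path("logs", 1, pipeline="int-converter")
without_pipeline = index_doc_path("logs", 2)
print(with_pipeline)     # /logs/_doc/1?pipeline=int-converter
print(without_pipeline)  # /logs/_doc/2
```

Because the pipeline is chosen on each request, documents sent without the parameter skip the processors entirely, which is why doc 2 keeps its string value.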
