Apache Flume - NetCat Source
This chapter uses an example to show how you can generate events and then log them to the console. For this, we are using the NetCat source and the logger sink.
To run the example provided in this chapter, you need to install Flume.
We have to configure the source, the channel, and the sink using the configuration file in the conf folder. The example given in this chapter uses a NetCat Source, Memory channel, and a logger sink.
When configuring the NetCat source, we have to specify a port. The source then listens on that port, receives each line entered on it as an individual event, and transfers it to the sink through the specified channel.
While configuring this source, you have to provide values for the following properties −
Source type − netcat
bind − Host name or IP address to bind.
port − Port number to which we want the source to listen.
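The line-per-event behaviour described above can be illustrated with a small Python sketch. This is not Flume's implementation; `split_into_events` is a hypothetical helper that only models how a line-oriented source might turn an incoming byte stream into one event per newline-terminated line, holding back an incomplete trailing line until more data arrives.

```python
def split_into_events(buffer: bytes, chunk: bytes):
    """Combine leftover bytes with a new chunk and cut out complete lines.

    Returns (events, remaining), where events holds one entry per complete
    line and remaining holds the bytes after the last newline.
    """
    data = buffer + chunk
    # Everything before the last newline is a complete line; the final
    # piece (possibly empty) is an unfinished line kept for the next call.
    *lines, remaining = data.split(b"\n")
    # Drop a trailing carriage return so CRLF and LF input look the same.
    events = [line.rstrip(b"\r") for line in lines]
    return events, remaining
```

Calling the helper repeatedly, passing the returned `remaining` value back in as `buffer`, would yield each typed line exactly once regardless of how the network splits the data.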
We are using the memory channel. To configure the memory channel, you must provide a value for the type of the channel. Given below is the list of properties that you can supply while configuring the memory channel −
type − It holds the type of the channel. In our example, the type is memory (MemChannel is only the name we give to the channel).
capacity − It is the maximum number of events stored in the channel. Its default value is 100. (optional)
transactionCapacity − It is the maximum number of events the channel accepts from a source or gives to a sink in a single transaction. Its default value is 100. (optional)
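To make the difference between the two limits concrete, here is a minimal Python model of a bounded in-memory channel. It is a simplified sketch, not Flume's memory channel: the class `BoundedChannel`, the exception `ChannelError`, and the method names are hypothetical, and real transactions (commit and rollback) are left out.

```python
from collections import deque


class ChannelError(Exception):
    """Raised when a batch violates one of the channel's limits."""


class BoundedChannel:
    def __init__(self, capacity=100, transaction_capacity=100):
        # Assumption of this sketch: a single batch may never be larger
        # than the total number of events the channel can hold.
        if transaction_capacity > capacity:
            raise ValueError("transaction_capacity cannot exceed capacity")
        self.capacity = capacity
        self.transaction_capacity = transaction_capacity
        self._queue = deque()

    def put_batch(self, events):
        """Store one batch of events, as a source would in one transaction."""
        events = list(events)
        if len(events) > self.transaction_capacity:
            raise ChannelError("batch exceeds transactionCapacity")
        if len(self._queue) + len(events) > self.capacity:
            raise ChannelError("channel is full")
        self._queue.extend(events)

    def take_batch(self, max_events):
        """Remove up to max_events events, as a sink would in one transaction."""
        count = min(max_events, self.transaction_capacity, len(self._queue))
        return [self._queue.popleft() for _ in range(count)]
```

In this model, `capacity` bounds how many events wait in the channel in total, while `transaction_capacity` bounds how many move in or out in one step.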
The logger sink logs all the events passed to it. Generally, it is used for testing or debugging purposes. To configure this sink, you must provide the following details.
type − logger
Example Configuration File
Given below is an example of the configuration file. Copy this content and save it as netcat.conf in the conf folder of Flume.
    # Naming the components on the current agent
    NetcatAgent.sources = Netcat
    NetcatAgent.channels = MemChannel
    NetcatAgent.sinks = LoggerSink

    # Describing/Configuring the source
    NetcatAgent.sources.Netcat.type = netcat
    NetcatAgent.sources.Netcat.bind = localhost
    NetcatAgent.sources.Netcat.port = 56565

    # Describing/Configuring the sink
    NetcatAgent.sinks.LoggerSink.type = logger

    # Describing/Configuring the channel
    NetcatAgent.channels.MemChannel.type = memory
    NetcatAgent.channels.MemChannel.capacity = 1000
    NetcatAgent.channels.MemChannel.transactionCapacity = 100

    # Bind the source and sink to the channel
    NetcatAgent.sources.Netcat.channels = MemChannel
    NetcatAgent.sinks.LoggerSink.channel = MemChannel
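A common mistake in a file like this is binding a source or sink to a channel name that was never declared. The following Python sketch checks that wiring for the configuration above. It is an illustration only and not a Flume tool: `parse_properties` and `check_wiring` are hypothetical helpers, and the parser handles only simple `key = value` lines and `#` comments.

```python
NETCAT_CONF = """
# Naming the components on the current agent
NetcatAgent.sources = Netcat
NetcatAgent.channels = MemChannel
NetcatAgent.sinks = LoggerSink
NetcatAgent.sources.Netcat.type = netcat
NetcatAgent.sources.Netcat.bind = localhost
NetcatAgent.sources.Netcat.port = 56565
NetcatAgent.sinks.LoggerSink.type = logger
NetcatAgent.channels.MemChannel.type = memory
NetcatAgent.channels.MemChannel.capacity = 1000
NetcatAgent.channels.MemChannel.transactionCapacity = 100
NetcatAgent.sources.Netcat.channels = MemChannel
NetcatAgent.sinks.LoggerSink.channel = MemChannel
"""


def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of problems with the source/sink to channel bindings."""
    declared = set(props.get(f"{agent}.channels", "").split())
    problems = []
    # A source may write to several channels (space-separated list).
    for source in props.get(f"{agent}.sources", "").split():
        used = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not used:
            problems.append(f"source {source} has no channels")
        for channel in used:
            if channel not in declared:
                problems.append(
                    f"source {source} uses undeclared channel {channel}")
    # A sink reads from exactly one channel.
    for sink in props.get(f"{agent}.sinks", "").split():
        channel = props.get(f"{agent}.sinks.{sink}.channel")
        if channel is None:
            problems.append(f"sink {sink} has no channel")
        elif channel not in declared:
            problems.append(f"sink {sink} uses undeclared channel {channel}")
    return problems
```

For the configuration in this chapter, `check_wiring(parse_properties(NETCAT_CONF), "NetcatAgent")` returns an empty list, since both the source and the sink refer to the declared channel MemChannel.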
Change to the Flume home directory and start the agent as shown below.
    $ cd $FLUME_HOME
    $ ./bin/flume-ng agent --conf $FLUME_CONF --conf-file $FLUME_CONF/netcat.conf --name NetcatAgent -Dflume.root.logger=INFO,console
If everything goes fine, the source starts listening on the given port, which in this case is 56565. Given below is the snapshot of the command prompt window of a NetCat source that has started and is listening on port 56565.
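To confirm from another terminal that the source is accepting connections, a short Python check such as the following could be used. It is a sketch under the assumption that a successful TCP connection means the source is up; `is_listening` is a hypothetical helper name, and the host and port come from the configuration file in this chapter.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (for example, connection
        # refused or timed out) when nothing accepts on that port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

With the agent from this chapter running, `is_listening("localhost", 56565)` would be expected to return True; before the agent is started it should return False.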
Passing Data to the Source
To pass data to the NetCat source, you have to connect to the port given in the configuration file. Open a separate terminal and connect to the source (port 56565) using the curl command. When the connection is successful, you will get the message “connected” as shown below.
    $ curl telnet://localhost:56565
    connected
Now you can enter your data line by line (press Enter after each line). The NetCat source receives each line as an individual event and replies with the message “OK” for each one.
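The same exchange can be scripted instead of typed. The Python sketch below sends lines over an already-connected socket and collects the reply to each one. It assumes, as described above, that the source answers every line with a single reply line such as “OK”; `send_lines` is a hypothetical helper, not part of Flume or curl.

```python
import socket


def send_lines(sock, lines):
    """Send each line over a connected socket and collect one reply per line.

    Assumes the peer answers every newline-terminated line with exactly
    one reply line, which is how this chapter describes the NetCat source.
    """
    replies = []
    # Wrap the socket in a text reader so replies can be read line by line.
    with sock.makefile("r", encoding="utf-8", newline="\n") as reader:
        for line in lines:
            sock.sendall((line + "\n").encode("utf-8"))
            replies.append(reader.readline().strip())
    return replies
```

With the agent from this chapter running, the helper could be called as `send_lines(socket.create_connection(("localhost", 56565)), ["hello", "world"])`, and each sent line should then appear as a separate event in the agent's console.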
When you are done passing data, you can exit the console by pressing Ctrl+C. Given below is the snapshot of the console where we have connected to the source using the curl command.
Each line that is entered in the above console will be received as an individual event by the source. Since we have used the Logger sink, these events will be logged on to the console (source console) through the specified channel (memory channel in this case).
The following snapshot shows the NetCat console where the events are logged.