Amazon Web Services - Kinesis
Amazon Kinesis is a managed, scalable, cloud-based service for processing large volumes of streaming data in real time. It is designed for real-time applications: developers can take in any amount of data from several sources, scale capacity up or down as needed, and run their processing applications on EC2 instances.
It is used to capture, store, and process data from large, distributed streams such as event logs and social media feeds. After processing the data, Kinesis distributes it to multiple consumers simultaneously.
When to Use Amazon Kinesis?
Amazon Kinesis is used where data moves rapidly and must be processed continuously, for example in the following situations −
Data log and data feed intake − We need not wait to batch up the data; we can push data to an Amazon Kinesis stream as soon as it is produced. This also protects against data loss if the data producer fails. For example, system and application logs can be continuously added to a stream and be available for processing within seconds.
Real-time graphs − We can compute metrics from an Amazon Kinesis stream and feed them into graphs and reports as the data arrives, without waiting for data batches.
Real-time data analytics − We can run real-time streaming data analytics by using Amazon Kinesis.
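The log intake case above can be sketched in a few lines of Python. This is a minimal illustration, not a production producer: it assumes a boto3-style Kinesis client is passed in, and the stream name, the `source` field, and the JSON payload layout are hypothetical choices made for this example.

```python
import json
import time


def build_log_record(source, message, timestamp=None):
    """Build the Data and PartitionKey arguments for one PutRecord call.

    `source` (a hypothetical producer name) doubles as the partition key,
    so lines from the same producer are routed to the same shard.
    """
    payload = {
        "source": source,
        "message": message,
        "ts": time.time() if timestamp is None else timestamp,
    }
    return {
        "Data": json.dumps(payload).encode("utf-8"),
        "PartitionKey": source,
    }


def send_log_line(client, stream_name, source, message, timestamp=None):
    """Push a single log line to the stream as soon as it is produced."""
    record = build_log_record(source, message, timestamp)
    # put_record is the Kinesis PutRecord operation as exposed by boto3.
    return client.put_record(StreamName=stream_name, **record)
```

With boto3 installed and credentials configured, `client` would typically be `boto3.client("kinesis")`, and a call might look like `send_log_line(client, "app-logs", "web-1", "GET /index 200")`, where "app-logs" is a placeholder stream name.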
Limits of Amazon Kinesis
Following are certain limits that should be kept in mind while using Amazon Kinesis Streams −
Records of a stream are accessible for up to 24 hours by default. Enabling extended data retention raises this to up to 7 days, and long-term retention allows up to 365 days.
The maximum size of a data blob (the data payload before Base64-encoding) in one record is 1 megabyte (MB).
One shard supports up to 1000 PUT records per second.
For more information related to limits, visit the following link − https://docs.aws.amazon.com/kinesis/latest/dev/service-sizes-and-limits.html
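The limits above translate into a rough sizing calculation. The sketch below is only an estimate under stated assumptions: the record size and record rate limits come from the list above, while the per-shard write bandwidth of about 1 MB per second is an additional assumption not stated in this section, and 1 MB is interpreted here as 1,048,576 bytes. The function names are made up for this example.

```python
import math

# From the limits listed above.
MAX_RECORD_BYTES = 1024 * 1024          # 1 MB data blob per record
MAX_RECORDS_PER_SHARD_PER_SEC = 1000    # PUT records per second per shard

# Assumption (not stated above): each shard also accepts roughly
# 1 MB of written data per second.
MAX_BYTES_PER_SHARD_PER_SEC = 1024 * 1024


def fits_in_one_record(payload: bytes) -> bool:
    """Return True if the payload is within the per-record size limit."""
    return len(payload) <= MAX_RECORD_BYTES


def shards_needed(records_per_sec: float, avg_record_bytes: float) -> int:
    """Estimate the shard count required for a given write load."""
    by_count = math.ceil(records_per_sec / MAX_RECORDS_PER_SHARD_PER_SEC)
    by_bytes = math.ceil(
        records_per_sec * avg_record_bytes / MAX_BYTES_PER_SHARD_PER_SEC
    )
    # A stream always has at least one shard.
    return max(1, by_count, by_bytes)
```

For instance, 2,500 small records per second are bounded by the record rate, while 500 records of 4 KB each are bounded by bandwidth; the estimate takes whichever constraint demands more shards.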
How to Use Amazon Kinesis?
Following are the steps to use Amazon Kinesis −
Step 1 − Set up Kinesis Stream using the following steps −
Sign in to your AWS account. Select Amazon Kinesis from the AWS Management Console.
Click Create stream and fill in the required fields, such as the stream name and number of shards. Click the Create button.
The stream will now be visible in the stream list.
Step 2 − Set up users for the Kinesis stream. Create new users and assign a policy to each user. (The procedure for creating users and assigning policies to them is discussed earlier.)
Step 3 − Connect your application to Amazon Kinesis. In this example we connect Zoomdata to Amazon Kinesis, using the following steps −
Log in to Zoomdata as Administrator and click Sources in the menu.
Select the Kinesis icon and fill in the required details. Click the Next button.
Select the desired stream on the Stream tab.
On the Fields tab, create unique label names as required, and click the Next button.
On the Charts tab, enable the charts for the data. Customize the settings as required, and then click the Finish button to save the settings.
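Steps 1 and 2 above use the console, but the same setup can also be scripted. The following is a rough sketch, assuming a boto3-style Kinesis client is passed in. The helper names, the choice of actions in the policy, and the example region, account ID, and stream name are illustrative assumptions rather than a recommended configuration.

```python
import json


def create_stream(client, stream_name, shard_count):
    """Programmatic counterpart of the console's Create stream form."""
    # create_stream is the Kinesis CreateStream operation as exposed by boto3.
    return client.create_stream(StreamName=stream_name, ShardCount=shard_count)


def stream_policy(region, account_id, stream_name, can_write=True, can_read=True):
    """Build an IAM policy document scoped to a single stream."""
    actions = ["kinesis:DescribeStreamSummary", "kinesis:ListShards"]
    if can_write:
        actions += ["kinesis:PutRecord", "kinesis:PutRecords"]
    if can_read:
        actions += ["kinesis:GetShardIterator", "kinesis:GetRecords"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": actions,
                "Resource": f"arn:aws:kinesis:{region}:{account_id}:stream/{stream_name}",
            }
        ],
    }
```

Calling `json.dumps(stream_policy("us-east-1", "123456789012", "app-logs"), indent=2)` yields JSON text that could be attached to a user as a policy; the region, account ID, and stream name here are placeholders.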
Features of Amazon Kinesis
Real-time processing − It lets us collect and analyze information, such as stock trade prices, in real time instead of waiting for a batch report.
Easy to use − Using Amazon Kinesis, we can create a new stream, set its requirements, and start streaming data quickly.
High throughput, elastic − Throughput is provided per shard, so a stream can be scaled up or down by adding or removing shards as the data rate changes.
Integration with other Amazon services − It can be integrated with Amazon Redshift, Amazon S3, and Amazon DynamoDB.
Build Kinesis applications − Amazon Kinesis provides developers with client libraries that enable the design and operation of real-time data processing applications. Add the Amazon Kinesis Client Library to a Java application, and it will notify the application when new data is available for processing.
Cost-efficient − Amazon Kinesis is cost-efficient for workloads of any scale. We pay as we go for the resources used, with an hourly charge for the throughput required.
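To give a feel for what a Kinesis application does when new data becomes available, here is a small polling sketch in Python. It is not the Kinesis Client Library, which is a Java library that also handles concerns such as coordinating workers across shards; this example only shows the underlying read calls through a boto3-style client passed in by the caller. The function names and the `handler` callback are hypothetical.

```python
def open_shard(client, stream_name, shard_id):
    """Get an iterator positioned at the oldest retained record of a shard."""
    response = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )
    return response["ShardIterator"]


def poll_once(client, iterator, handler, limit=100):
    """Fetch one batch of records and pass each payload to `handler`.

    Returns the iterator to use for the next poll, or None if the
    shard has been closed and fully read.
    """
    response = client.get_records(ShardIterator=iterator, Limit=limit)
    for record in response["Records"]:
        # record["Data"] holds the raw bytes written by the producer.
        handler(record["Data"])
    return response.get("NextShardIterator")
```

A consumer would call `open_shard` once and then call `poll_once` in a loop, pausing briefly between calls and feeding each returned iterator into the next call.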