Kinesis
Kinesis Data Streams
- Public, regional streaming service that scales within a region. It streams data in real time, i.e. < 200 ms delay
- Producers send data into Kinesis
- By default, streams store 24 hours' worth of data. Older data is discarded
- The retention period can be increased up to 365 days for additional cost
- Multiple consumers can receive data in different ways, at different granularity, etc.
- Streams consist of shards. Each shard provides 1MB/s of ingestion & 2MB/s of consumption capacity
- Kinesis data is stored in the form of Kinesis Data Records. Each data record can be up to 1 MB in size
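
The per-shard limits above lend themselves to a quick sizing calculation. The following is a minimal sketch, assuming only the 1 MB/s ingestion and 2 MB/s consumption figures from these notes; the function name `shards_needed` is hypothetical and the sketch ignores any per-record-count limits.

```python
import math

# Per-shard limits as stated in the notes above.
INGEST_MB_PER_SHARD = 1.0   # write capacity per shard, MB/s
CONSUME_MB_PER_SHARD = 2.0  # read capacity per shard, MB/s


def shards_needed(ingest_mb_per_s: float, consume_mb_per_s: float) -> int:
    """Return the smallest shard count that covers both write and read throughput."""
    for_ingest = math.ceil(ingest_mb_per_s / INGEST_MB_PER_SHARD)
    for_consume = math.ceil(consume_mb_per_s / CONSUME_MB_PER_SHARD)
    # A stream always has at least one shard.
    return max(1, for_ingest, for_consume)
```

For example, a stream ingesting 5 MB/s and serving 8 MB/s of reads is bounded by ingestion, since 5 shards cover the writes while 4 would cover the reads.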
Kinesis Firehose
- Fully managed service to persist data beyond the stream's rolling retention window
- It delivers the data in near real time, i.e. < 60s
- Supports transformation on the fly
- Billing is based on volume streamed through Firehose
- It can deliver to HTTP endpoints, Splunk, Redshift, Elasticsearch & S3
- Firehose does buffering. It waits for 1MB of data or 60s, whichever happens earlier, & then delivers the data
- Firehose does not deliver directly to Redshift. It first stores the data in an S3 bucket & then copies it into Redshift
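
The "1MB or 60s, whichever happens earlier" buffering rule can be illustrated with a toy model. This is only a sketch of the size-or-time idea, not how Firehose is implemented; the class `BufferSketch`, its parameters, and the injectable `clock` are all hypothetical names chosen for illustration.

```python
import time


class BufferSketch:
    """Toy model of size-or-time buffering; not the Firehose implementation."""

    def __init__(self, max_bytes=1_000_000, max_seconds=60.0, clock=time.monotonic):
        self.max_bytes = max_bytes      # size threshold (1 MB in the notes)
        self.max_seconds = max_seconds  # time threshold (60 s in the notes)
        self.clock = clock              # injectable so the sketch can be tested
        self.records = []
        self.size = 0
        self.started = None             # arrival time of the first buffered record

    def add(self, record: bytes):
        """Buffer a record; return the batch if a threshold was reached, else None."""
        now = self.clock()
        if self.started is None:
            self.started = now
        self.records.append(record)
        self.size += len(record)
        # Simplification: the time limit is only checked when a record arrives.
        # A real service would also flush on a timer with no new input.
        if self.size >= self.max_bytes or now - self.started >= self.max_seconds:
            return self.flush()
        return None

    def flush(self):
        """Hand back everything buffered so far and reset the buffer."""
        batch = self.records
        self.records, self.size, self.started = [], 0, None
        return batch
```

Whichever threshold is crossed first triggers delivery, which is why a low-volume stream still sees its data delivered roughly once per time window.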
Kinesis Data Analytics
- Real time data analytics using SQL
- Data is ingested either from Data Streams or Firehose. Static reference data can be pulled from S3
- Destinations are all Firehose destinations, Lambda & Kinesis Data Streams