Kinesis
Security Specialty
Solution Architect Associate
- Allows for streaming of real-time data
- Streaming data is data generated continuously by thousands of data sources, which typically send the data records simultaneously and in small sizes (order of KBs)
- Kinesis is a platform on AWS that you send your streaming data to
- Three core services; know the differences between them for the exam (especially between Streams and Firehose!!!)
Kinesis Data Streams
- Real-time streaming for ingesting data (know this for exam!!)
- You are responsible for creating the consumer and scaling the stream
- Stores data for 24 hours by default; retention can be increased (historically up to 7 days, and AWS now allows up to 365 days)
- Data is stored in shards
- Once data is stored in shards, you can have data consumers (typically in EC2 instances) that can take the data and work on it
- Data consumers then send the results of their processing on to longer-lived storage such as S3
- The data capacity of your stream is the sum of the capacities of your shards
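
The "sum of the shards" rule can be sketched as a quick sizing calculation. This is a rough illustration only: it assumes provisioned capacity mode and the commonly documented per-shard limits (1 MB/s or 1,000 records/s for writes, 2 MB/s for reads), and the function names `stream_capacity` and `shards_needed` are made up for this example.

```python
import math

# Assumed per-shard limits (provisioned mode); check current AWS docs before relying on them.
SHARD_WRITE_MB_PER_SEC = 1.0
SHARD_WRITE_RECORDS_PER_SEC = 1000
SHARD_READ_MB_PER_SEC = 2.0


def stream_capacity(shard_count):
    """Total stream capacity is the per-shard capacity multiplied by the shard count."""
    return {
        "write_mb_per_sec": shard_count * SHARD_WRITE_MB_PER_SEC,
        "write_records_per_sec": shard_count * SHARD_WRITE_RECORDS_PER_SEC,
        "read_mb_per_sec": shard_count * SHARD_READ_MB_PER_SEC,
    }


def shards_needed(write_mb_per_sec, records_per_sec, read_mb_per_sec):
    """Smallest shard count that satisfies all three limits at once."""
    return max(
        1,
        math.ceil(write_mb_per_sec / SHARD_WRITE_MB_PER_SEC),
        math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC),
        math.ceil(read_mb_per_sec / SHARD_READ_MB_PER_SEC),
    )


if __name__ == "__main__":
    print(stream_capacity(4))
    print(shards_needed(write_mb_per_sec=3.5, records_per_sec=2500, read_mb_per_sec=6.0))
```

Whichever limit is hit first (bytes in, records in, or bytes out) determines how many shards you need, which is why the sketch takes the maximum of the three.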
Kinesis Data Firehose
- Data transfer tool to get information to S3, Redshift, Amazon OpenSearch Service (formerly Elasticsearch Service) or Splunk
- Similar to Kinesis Data Streams, but Firehose doesn’t require you to manage shards yourself. Firehose manages all of that for you (know this for exam!!)
- Data transferred in near real-time (know this for exam!!)
- Data can be analyzed by Lambda in real time, and then you can send the results out to longer lived storage
- In Firehose, as soon as data is received it can be analyzed by Lambda. Data is not kept within Firehose like it is within Streams.
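
The Lambda step above can be sketched as a minimal transformation handler. It assumes the usual Firehose data-transformation contract: each incoming record carries a `recordId` and base64-encoded `data`, and the function returns one entry per record with a `result` of `Ok`, `Dropped` or `ProcessingFailed`. The JSON payload shape and the added `processed` field are hypothetical, chosen only for illustration.

```python
import base64
import json


def lambda_handler(event, context):
    """Decode each Firehose record, tag it, and hand it back re-encoded."""
    output = []
    for record in event["records"]:
        try:
            # Firehose delivers the payload base64-encoded.
            payload = json.loads(base64.b64decode(record["data"]))
            payload["processed"] = True  # hypothetical enrichment step
            # Newline-delimit records so they are easy to split at the destination.
            encoded = base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("utf-8")
            output.append(
                {"recordId": record["recordId"], "result": "Ok", "data": encoded}
            )
        except (ValueError, TypeError):
            # Return the record untouched and mark it as failed.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

Every `recordId` from the input has to appear in the response, which is why failed records are echoed back rather than skipped.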
Kinesis Data Analytics
- Allows you to run SQL queries against the data as it flows through Firehose or Streams, and then export the results of those queries to S3, Redshift or Amazon OpenSearch Service (formerly Elasticsearch Service)