ElastiCache

Solutions Architect Associate

Developer Associate

  • In-memory cache in the cloud, used to improve performance of web applications
  • Sits between your application and database
  • For the exam, you’ll be given scenarios where a database is under load and asked which service would help. ElastiCache is a good answer if the app is read-heavy.
  • Requires significant code changes; caching is not automatic. Your application has to read from and write to the cache itself. AWS doesn’t do this for you
  • Good if your application/database is particularly read-heavy
  • Also well suited to compute-heavy workloads: you can store the results of I/O-intensive or compute-intensive operations
  • Supports Memcached and Redis
    • Memcached: widely adopted in-memory object caching system. It is multithreaded, but has no replication and no Multi-AZ failover. If a node is lost to an outage, the data cached on that node is lost with it. Does not support backups.
    • Redis: open source in-memory key/value store. Supports more complex data structures such as sets and lists, and supports primary/replica replication as well as Multi-AZ for redundancy. A much better choice if keeping cached data matters. Supports backups.
  • 2 caching strategies available (know these two for exam!!)
    • Lazy loading: loads data into the cache only when necessary. When a request is made and a cache miss occurs, the app reads the data from the database and only then writes it into ElastiCache
      • Advantages: only requested data is cached, and a node failure is not fatal because the replacement node starts empty and is refilled by cache misses
      • Disadvantages: every cache miss costs extra round trips, so initial requests are slow. Data can also become stale, because the cache is only updated on a miss and a cache hit never refreshes it.
      • You need to use a TTL to avoid keeping stale data in lazy loading. Lazy loading treats expired data as a cache miss. (know this for exam!!)
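A minimal sketch of lazy loading with a TTL, to make the hit/miss/expiry flow concrete. Everything here is an assumption for illustration: `TTLCache` is a hypothetical in-memory stand-in for a Redis or Memcached client, `database` is a plain dict standing in for the real data store, and `get_user` is a made-up function name.

```python
import time


class TTLCache:
    """Hypothetical in-memory stand-in for a cache client (not a real ElastiCache API)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}      # key -> (value, expiry timestamp)
        self._clock = clock   # injectable so expiry can be demonstrated without sleeping

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None                   # never cached: cache miss
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]          # expired data is treated as a cache miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)


def get_user(user_id, cache, database, ttl_seconds=300):
    """Lazy loading: only touch the database on a cache miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached                     # cache hit: database is not queried
    record = database[user_id]            # cache miss: read from the data store
    cache.set(key, record, ttl_seconds)   # the app writes to the cache itself
    return record
```

Note that a cache hit returns whatever was stored, even if the database has changed since. The TTL is what bounds how long that stale value can survive.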
    • Write-Through Caching: Adds or updates data in the cache whenever data is written to the database.
      • Advantage is that data is never stale.
      • Disadvantage is that writes have more latency, because every write goes to both the cache and the underlying data store. Also, if a node fails and a new one is spun up, data is missing from the cache until it is next added or updated in the database.
      • You also end up with wasted cache resources when data is written but hardly ever read.
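A minimal sketch of write-through caching, under the same kind of assumptions: plain dicts stand in for both the cache and the database, and `save_user` / `read_user` are hypothetical names, not part of any AWS SDK.

```python
def save_user(user_id, record, cache, database):
    """Write-through: every database write also updates the cache."""
    database[user_id] = record            # write to the source of truth
    cache[f"user:{user_id}"] = record     # second write: this is the added write latency


def read_user(user_id, cache):
    """Reads go to the cache; returns None if the key is not cached."""
    return cache.get(f"user:{user_id}")


cache, database = {}, {}
save_user(1, {"name": "alice"}, cache, database)
fresh = read_user(1, cache)               # always matches the last write, so never stale

cache = {}                                # simulate a failed node replaced by an empty one
after_failure = read_user(1, cache)       # None until the record is written again
```

This is why the two strategies are often combined in practice: write-through keeps cached data fresh, and lazy loading refills keys that are missing after a node replacement.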
  • Difference between this and DAX for DynamoDB: DAX is optimized for DynamoDB and sits in front of it as a managed read-through/write-through cache, so you don’t write lazy-loading logic yourself. Generally, you’ll want to use DAX for DynamoDB and ElastiCache for RDS
  • Cache eviction can occur in three ways:
    • You delete an item explicitly
    • The cache is full, so the least recently used (LRU) items are evicted
    • An item’s TTL expires
  • If you have too many cache evictions, it’s usually an indicator that you need to scale your cache up
  • TTLs can range from a few seconds to hours or days, can be very useful
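A minimal sketch of explicit deletion and LRU eviction, plus an eviction counter of the sort you would watch when deciding whether to scale. `LRUCache` is a hypothetical teaching class built on the standard library, not how ElastiCache is implemented internally.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical fixed-size cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()   # ordered oldest-used first, newest-used last
        self.evictions = 0            # a high count suggests the cache is too small

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # reading a key marks it as recently used
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # cache full: drop the least recently used
            self.evictions += 1

    def delete(self, key):
        self._items.pop(key, None)    # explicit deletion by the application
```

With a capacity of 2, setting `a` and `b`, reading `a`, then setting `c` evicts `b`, since `a` was used more recently.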
  • Redshift comes with advanced compression. When you load data into an empty table, Redshift automatically samples your data and selects the best compression scheme
  • Redshift doesn’t require any indexes either, so this reduces the amount of space needed for storing the data
  • Massively Parallel Processing
    • Automatically distributes data and query load across all compute nodes. Makes it easy to add nodes and enables fast query performance as your data grows
  • Pricing
    • Compute node hours: billed at 1 unit per node per hour
    • You are not charged for the leader node
    • Also charged for backups and data transfer within a VPC
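A small worked example of the compute node hours arithmetic. The hourly rate below is a made-up placeholder, not a real AWS price, and `compute_node_cost` is a hypothetical helper. Backup and data transfer charges are left out.

```python
def compute_node_cost(compute_nodes, hours, rate_per_node_hour):
    """Compute charge only: 1 unit per compute node per hour; the leader node is free."""
    node_hours = compute_nodes * hours
    return node_hours * rate_per_node_hour


# Hypothetical cluster: 3 compute nodes (plus a leader node that is not billed)
# running for 24 hours at an assumed rate of 0.25 per node hour.
daily_cost = compute_node_cost(compute_nodes=3, hours=24, rate_per_node_hour=0.25)
# 3 nodes * 24 hours = 72 node hours; 72 * 0.25 = 18.0
```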
  • Security
    • Encrypted in transit via SSL
    • Encrypted at rest via AES-256 encryption
    • By default Redshift manages keys for you, but you can manage them yourself or use AWS Key Management Service (KMS)
  • Currently only available in 1 AZ
  • You can restore snapshots to new AZs in the event of an outage