Redshift

Solutions Architect Associate

  • Fast and powerful, fully managed data warehousing service in the cloud
  • Supports up to 16 Petabytes of data
  • Can start out with a single node, which can hold up to 160 GB of data
  • When you need to scale, you set up a multi-node configuration
    • Leader node manages client connections and receives queries
    • Compute nodes store data and perform queries and computations. You can have up to 128 compute nodes
  • Columnar data storage
    • Instead of storing data as a series of rows, Redshift organizes data by column.
    • Column-based queries require far fewer I/O operations, which greatly improves query performance
  • Comes with advanced compression as well. When you load data into an empty table, Redshift automatically samples your data and selects the best compression scheme
  • Redshift doesn’t require any indexes either, which reduces the amount of space needed to store the data
  • Massively Parallel Processing
    • Automatically distributes data and query load across all compute nodes. Makes it easy to add nodes and enables fast query performance as your data grows
  • Currently available in only one AZ
    • You can restore snapshots to new AZs in the event of an outage
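
The columnar storage and parallel processing points above can be illustrated with a small sketch. This is a toy model in plain Python, not how Redshift is implemented: the sample table, the `to_columns` and `distribute` helpers, and the use of a simple modulo on `order_id` as the distribution rule are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample table: (order_id, region, amount)
rows = [
    (1, "us-east", 120.0),
    (2, "eu-west", 80.0),
    (3, "us-east", 45.5),
    (4, "ap-south", 200.0),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column (a columnar layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def distribute(rows, key_index, node_count):
    """Assign each row to a node by taking the key modulo the node count.

    This stands in for a distribution key; the real placement logic is
    not described in these notes.
    """
    nodes = defaultdict(list)
    for row in rows:
        nodes[row[key_index] % node_count].append(row)
    return dict(nodes)


columns = to_columns(rows, ["order_id", "region", "amount"])

# An aggregate over one column reads only that column's values,
# instead of every field of every row.
total = sum(columns["amount"])

# Each compute node sums its own share of the rows, and the
# leader node combines the partial results.
shards = distribute(rows, key_index=0, node_count=2)
partials = {node: sum(r[2] for r in node_rows) for node, node_rows in shards.items()}
combined = sum(partials.values())
```

Both `total` and `combined` give the same answer; the second path simply splits the work so that each node touches only its own rows.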

Pricing

  • Starts at just $0.25 per hour, but costs can scale up significantly. Still cheaper than competing data warehouse offerings.
  • Billed by compute node hours: 1 unit per node per hour
  • You are not charged for the leader node
  • Also charged for backups and data transfer within a VPC
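
As a rough worked example of the compute node hour rule, the sketch below estimates a monthly compute bill. The `monthly_compute_cost` function is hypothetical, the 730-hour month is an assumed average, and the $0.25 rate is the entry price quoted above applied to every node for simplicity. Backup and data transfer charges are left out.

```python
def monthly_compute_cost(compute_nodes, hourly_rate, hours=730):
    """Estimate compute cost at one billing unit per compute node per hour.

    The leader node is not billed, so it is not counted in compute_nodes.
    hours=730 is an assumed average month, not an AWS figure.
    """
    return compute_nodes * hourly_rate * hours


# Single-node cluster at the entry rate from these notes.
single_node = monthly_compute_cost(compute_nodes=1, hourly_rate=0.25)

# Four compute nodes at the same assumed rate; the leader node adds nothing.
four_nodes = monthly_compute_cost(compute_nodes=4, hourly_rate=0.25)
```

Under these assumptions the single node comes to $182.50 per month and the four-node cluster to $730.00, which shows how cost grows linearly with the number of compute nodes.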

Security

  • Encrypted in transit via SSL
  • Encrypted at rest via AES-256 encryption
  • By default Redshift manages keys for you, but you can manage them yourself or use AWS Key Management Service (KMS)