I. Introduction

  1. Fully managed, petabyte-scale data warehousing solution (OLAP) for BI
  2. Based on PostgreSQL; AWS advertises up to 10x better performance than traditional data warehouses
  3. Customers can start at $0.25/hr with no commitment
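  4. Two configurations: single-node, or multi-node (a leader node that coordinates queries plus one or more compute nodes)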
  5. Uses column-based storage and compression
  6. Doesn't require indexes, so it uses less storage than traditional relational database systems
  7. Comes with Massively Parallel Processing (MPP), which distributes data and query load across all nodes for fast query performance
  8. Automated backups are enabled by default with a 1-day retention period (maximum 35 days)
  9. Redshift always attempts to keep at least three copies of the data (the original and a replica on the compute nodes, plus a backup in S3)
  10. Priced on compute node-hours (1 unit per node per hour), plus backup storage and data transfer (only within a VPC)
  11. Supports encryption at rest with AES-256 and in transit with SSL
  12. Runs in a single AZ by default; Multi-AZ deployments are supported only on RA3 node types
  13. Data can be loaded from S3, DynamoDB, DMS, and other services
  14. Supports all popular open data formats (Avro, Parquet, ORC, etc.)
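
The node-hour pricing in item 10 can be sketched as simple arithmetic. This is a rough illustration only: the function name `compute_cost` is hypothetical, the default rate is the $0.25/hr entry price from item 3 (actual rates depend on node type and region), and backup storage and data transfer charges are left out.

```python
def compute_cost(nodes: int, hours: float, rate_per_node_hour: float = 0.25) -> float:
    """Estimate the compute charge: one billing unit per node per hour.

    The default rate is the entry price quoted in the notes; it is an
    assumption here, not a rate for any specific node type.
    """
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    if hours < 0:
        raise ValueError("hours cannot be negative")
    return nodes * hours * rate_per_node_hour


# Example: a 4-node cluster running for a 30-day month (720 hours).
monthly = compute_cost(nodes=4, hours=30 * 24)
print(f"${monthly:.2f}")  # prints "$720.00"
```

Because billing is per node per hour, doubling the node count doubles the compute charge for the same running time.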

II. Redshift Spectrum

  1. Query data that is already in S3 without loading it.
  2. To use it, you must have a Redshift cluster from which to start the query.
  3. The query is then submitted to thousands of Redshift Spectrum nodes.
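
The flow above might look roughly like the following sketch, which only builds the SQL text that would be run from the cluster. The helper `spectrum_statements` and every name passed to it (schema, catalog database, IAM role ARN, table) are hypothetical placeholders, and the sketch assumes the external table is already defined in the data catalog.

```python
def spectrum_statements(schema: str, catalog_db: str, iam_role_arn: str, table: str) -> list[str]:
    """Build the SQL a cluster would run to query S3 data through Spectrum.

    All argument values are illustrative placeholders; the external table
    is assumed to exist already in the data catalog.
    """
    # Map a data catalog database to an external schema in the cluster.
    create_schema = (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )
    # Query the S3-backed table in place, without loading it first.
    query = f"SELECT COUNT(*) FROM {schema}.{table};"
    return [create_schema, query]


statements = spectrum_statements(
    schema="spectrum_schema",          # hypothetical schema name
    catalog_db="spectrum_db",          # hypothetical catalog database
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleSpectrumRole",  # placeholder ARN
    table="sales",                     # hypothetical external table
)
for sql in statements:
    print(sql)
    print()
```

The statements are submitted through the cluster like any other query; the cluster then hands the S3 scan to the Spectrum layer.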