Elastic MapReduce
Solutions Architect Associate
- Managed big data platform for processing vast amounts of data with open source tools such as Spark, HBase, and Hudi. Essentially ETL in the cloud
- ETL - Extract, Transform, Load
- EMR runs in clusters, generally on EC2, EKS, or Outposts
- Transformed data is typically stored in S3
- You can use reserved instances and spot instances to save money
- Instances always live in a VPC
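
To make the points above concrete, here is a minimal sketch of the request one might pass to boto3's `run_job_flow` to launch an EMR cluster on EC2. It only builds the parameter dictionary and does not call AWS. The helper name `build_emr_request`, the cluster name, the instance types and counts, the release label, and the use of the default EMR role names are all assumptions chosen for illustration, not values from these notes.

```python
def build_emr_request(subnet_id: str, bucket: str) -> dict:
    """Build keyword arguments shaped for boto3's EMR run_job_flow call.

    Hypothetical helper: every concrete value below is an illustrative
    assumption and should be replaced with values from your own account.
    """
    return {
        "Name": "etl-sketch",                    # assumed cluster name
        "ReleaseLabel": "emr-6.15.0",            # assumed EMR release
        "Applications": [{"Name": "Spark"}],     # open source tool to install
        "LogUri": f"s3://{bucket}/logs/",        # cluster logs go to S3
        "Instances": {
            # The cluster's EC2 instances launch into this VPC subnet.
            "Ec2SubnetId": subnet_id,
            # Terminate the cluster once all steps have finished.
            "KeepJobFlowAliveWhenNoSteps": False,
            "InstanceGroups": [
                {
                    "Name": "primary",
                    "InstanceRole": "MASTER",
                    "Market": "ON_DEMAND",
                    "InstanceType": "m5.xlarge",
                    "InstanceCount": 1,
                },
                {
                    "Name": "core",
                    "InstanceRole": "CORE",
                    # Reserved Instance discounts are not requested here;
                    # they apply at billing time to matching On-Demand usage.
                    "Market": "ON_DEMAND",
                    "InstanceType": "m5.xlarge",
                    "InstanceCount": 2,
                },
                {
                    "Name": "task",
                    "InstanceRole": "TASK",
                    # Spot capacity is cheaper but can be reclaimed.
                    "Market": "SPOT",
                    "InstanceType": "m5.xlarge",
                    "InstanceCount": 2,
                },
            ],
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",    # assumed instance profile
        "ServiceRole": "EMR_DefaultRole",        # assumed service role
    }


request = build_emr_request("subnet-0123456789abcdef0", "example-bucket")
print(request["Instances"]["Ec2SubnetId"])
```

With credentials configured, the dictionary could be passed as `boto3.client("emr").run_job_flow(**request)`. Putting Spot only on the task group is one common cost-saving choice, since task nodes hold no HDFS data and losing them is less disruptive.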