[AWS SAA-C02 Study Note] Database Service: RDS, Aurora, Redshift, DynamoDB, DMS

RDS

Relational Database Service

A managed relational database service. Supports multiple SQL engines and is easy to scale, back up and secure.

RDS is the AWS Solution for relational databases. There are 6 relational database options currently available on AWS.

/img/AWS/Database/Untitled.png

Encryption

/img/AWS/Database/Untitled%201.png

Backup

There are two backup solutions available for RDS

  • Automated Backups

/img/AWS/Database/Untitled%202.png

  • Manual Snapshots

/img/AWS/Database/Untitled%203.png

Restoring Backup

When recovering, AWS takes the most recent daily backup and applies the transaction log data relevant to that day. This allows point-in-time recovery down to a second within the retention period.
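The restore procedure described above can be sketched as a toy model in plain Python. This is not how RDS is implemented internally; the `restore_to_point_in_time` function, the dict-based "database" and the sample data are all hypothetical, chosen only to show the "latest backup plus log replay" idea.

```python
from datetime import datetime, timedelta

def restore_to_point_in_time(daily_backups, transaction_log, target):
    """Rebuild a key-value 'database' as it looked at `target`.

    daily_backups:   list of (taken_at, state_dict)
    transaction_log: list of (committed_at, key, value)
    """
    # 1. Pick the most recent backup taken at or before the target time.
    candidates = [b for b in daily_backups if b[0] <= target]
    if not candidates:
        raise ValueError("target is earlier than the oldest retained backup")
    taken_at, state = max(candidates, key=lambda b: b[0])
    restored = dict(state)  # copy so the backup itself stays untouched

    # 2. Replay log entries committed after that backup, up to the target.
    for committed_at, key, value in sorted(transaction_log, key=lambda e: e[0]):
        if taken_at < committed_at <= target:
            restored[key] = value
    return restored

# Hypothetical sample data: two daily backups and three logged writes.
day = datetime(2021, 1, 1)
backups = [(day, {"balance": 100}), (day + timedelta(days=1), {"balance": 130})]
log = [
    (day + timedelta(hours=9), "balance", 120),
    (day + timedelta(hours=15), "balance", 130),
    (day + timedelta(days=1, hours=2), "balance", 90),
]
state_at_noon = restore_to_point_in_time(backups, log, day + timedelta(hours=12))
print(state_at_noon)
```

Restoring to noon on the first day starts from that day's backup and replays only the 09:00 write, so later writes are not visible in the restored state.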

/img/AWS/Database/Untitled%204.png

Multi-AZ

Ensures the database remains available if an AZ becomes unavailable.

/img/AWS/Database/Untitled%205.png

Synchronize

Multi-AZ makes an exact copy of your database in another AZ and automatically synchronizes changes in the database over to the standby copy.

/img/AWS/Database/Untitled%206.png

Automatic Failover Protection

If the AZ hosting the primary goes down, failover will occur and the standby slave will be promoted to master.

/img/AWS/Database/Untitled%207.png

Read Replicas

Read Replicas allow you to run multiple copies of your db. These copies only allow reads (no writes) and are intended to alleviate the workload of your primary db to improve performance.

/img/AWS/Database/Untitled%208.png

You can have up to 5 replicas of a database.

Each RR has its own DNS endpoint.

You can have Multi-AZ replicas, replicas in another region, or even replicas of other replicas.

/img/AWS/Database/Untitled%209.png

Replicas can be promoted to their own db, but this breaks replication.

There is no automatic failover: if the primary copy fails, you must manually update URLs to point at the copy.
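Because every read replica has its own DNS endpoint and nothing fails over automatically, the application has to decide where each query goes. A minimal sketch of that idea follows; the `EndpointRouter` class and the `*.example.internal` hostnames are made up for illustration and are not part of any AWS SDK.

```python
from itertools import cycle

class EndpointRouter:
    """Send writes to the primary and spread reads across read replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self._reads = cycle(self.replicas) if self.replicas else None

    def endpoint_for(self, sql):
        # Replicas are read-only, so only SELECT statements may go to them.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._reads is not None:
            return next(self._reads)  # simple round-robin over replicas
        return self.primary

    def promote(self, replica):
        """Manual step: point the application at a promoted replica."""
        self.replicas.remove(replica)
        self.primary = replica
        self._reads = cycle(self.replicas) if self.replicas else None

# Hypothetical endpoints, for illustration only.
router = EndpointRouter(
    "primary.example.internal",
    ["rr1.example.internal", "rr2.example.internal"],
)
first_read = router.endpoint_for("SELECT * FROM orders")
second_read = router.endpoint_for("select count(*) from orders")
write_target = router.endpoint_for("INSERT INTO orders VALUES (1)")
```

Calling `router.promote("rr1.example.internal")` models the manual URL update after the primary is lost: the promoted replica becomes the write target and leaves the read pool.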

Multi-AZ vs. Read Replicas

/img/AWS/Database/Untitled%2010.png

RDS Cheat Sheet

/img/AWS/Database/Untitled%2011.png

Aurora

Fully Managed Postgres or MySQL compatible database designed by default to scale and fine-tuned to be really fast.

Intro to Aurora

Combines the speed and availability of a high-end db with the simplicity and cost-effectiveness of an open source db.

/img/AWS/Database/Untitled%2012.png

Scaling with Aurora

Starts with 10GB of storage and scales in 10GB increments up to 64TB.

Storage is autoscaling.

Computing resources can scale all the way up to 32 vCPUs and 244GB of memory.
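The storage numbers above boil down to a small piece of arithmetic, sketched here. The function name `allocated_storage_gb` is hypothetical, and treating 1TB as 1024GB is an assumption; the limits themselves are the ones quoted in this note.

```python
import math

GB_PER_TB = 1024           # assumption: binary terabytes
STEP_GB = 10               # storage grows in 10GB increments
MAX_GB = 64 * GB_PER_TB    # upper limit quoted in this note

def allocated_storage_gb(data_gb):
    """Return the storage allocated for `data_gb` of data, in GB."""
    if data_gb < 0:
        raise ValueError("data size cannot be negative")
    if data_gb > MAX_GB:
        raise ValueError("data does not fit in a single 64TB volume")
    # Round up to the next 10GB step; even an empty database holds 10GB.
    steps = max(1, math.ceil(data_gb / STEP_GB))
    return min(steps * STEP_GB, MAX_GB)

print(allocated_storage_gb(11))
```

A database holding 11GB of data would sit on 20GB of allocated storage under this model, since the second 10GB increment has already been added.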

/img/AWS/Database/Untitled%2013.png

Availability with Aurora

/img/AWS/Database/Untitled%2014.png

Fault Tolerance and Durability

Backup and failover are handled automatically.

/img/AWS/Database/Untitled%2015.png

Storage is self-healing: data blocks and disks are continuously scanned for errors and repaired automatically.

Aurora Replicas

2 types of replicas available

Up to 15 replicas

/img/AWS/Database/Untitled%2016.png

Aurora Serverless

/img/AWS/Database/Untitled%2017.png

Serverless is more elastic.

Compared to running one writer and multiple readers, Serverless is less expensive.

Aurora Cheat Sheet

/img/AWS/Database/Untitled%2018.png

Redshift

Fully managed PB-size Data Warehouse

Analyze (run complex SQL queries on) massive amounts of data. Columnar store database.

What is Data Warehouse?

What is a database transaction?

A transaction symbolizes a unit of work performed within a database management system

ex. read and write
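To make "unit of work" concrete, here is a small sketch using Python's built-in `sqlite3` module rather than any AWS service. The `accounts` table and the `transfer` helper are invented for the example; the point is only that the reads and writes inside a transaction either all apply or none do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money as one unit of work: both updates apply, or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

ok = transfer(conn, "alice", "bob", 30)
rejected = transfer(conn, "alice", "bob", 500)
print(ok, rejected, balances(conn))
```

The second transfer fails halfway through, after the debit has been written, and the rollback restores the state left by the first transfer.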

/img/AWS/Database/Untitled%2019.png

Intro of Redshift

Lower cost

Used for Business Intelligence

OLAP

/img/AWS/Database/Untitled%2020.png

Use Case

/img/AWS/Database/Untitled%2021.png

different source

Columnar Storage stores data together as columns instead of rows
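A tiny illustration of the difference, in plain Python. The `orders` rows and the `to_columns` helper are hypothetical, and real columnar engines store compressed blocks on disk rather than Python lists; the sketch only shows why an aggregate needs to touch a single column.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120},
    {"order_id": 2, "region": "US", "amount": 80},
    {"order_id": 3, "region": "EU", "amount": 200},
]

def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns

# Column-oriented layout: all values of one field sit next to each other.
columns = to_columns(rows)

# An analytic query such as SUM(amount) reads one column and skips the rest.
total_amount = sum(columns["amount"])
print(columns["region"], total_amount)
```

Values in one column share a type and often repeat, which is also why columnar data tends to compress well.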

/img/AWS/Database/Untitled%2022.png

Configuration

Single Node

Nodes come in sizes of 160GB. You can launch a single node to get started with Redshift.

/img/AWS/Database/Untitled%2023.png

Multi-Node

You can launch a cluster of nodes with Multi-Node mode.

  • Leader Node: manages client connections and receives queries
  • Compute Node: stores data and performs queries; a cluster can have up to 128 compute nodes

/img/AWS/Database/Untitled%2024.png

Node Type and Sizes

There are two types of Nodes

/img/AWS/Database/Untitled%2025.png

Compression

/img/AWS/Database/Untitled%2026.png

Processing

Massively Parallel Processing (MPP)

Automatically distributes data and query loads across all nodes

Lets you easily add new nodes to your data warehouse while still maintaining fast query performance
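The MPP idea can be sketched as a toy model: rows are spread across nodes by hashing a key, each node aggregates its own slice, and a leader combines the partial results. This is an illustration only, not Redshift's actual distribution logic; `distribute`, `node_sum`, `parallel_sum` and the use of threads as stand-ins for compute nodes are all assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def distribute(rows, node_count, key):
    """Assign each row to a node by hashing its distribution key."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        slot = zlib.crc32(str(row[key]).encode()) % node_count
        nodes[slot].append(row)
    return nodes

def node_sum(node_rows, column):
    """Work done by one compute node: aggregate only its own slice."""
    return sum(row[column] for row in node_rows)

def parallel_sum(rows, node_count, key, column):
    """Leader role: fan the query out, then combine the partial sums."""
    nodes = distribute(rows, node_count, key)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(lambda part: node_sum(part, column), nodes))
    return sum(partials)

# Hypothetical data set: 100 orders with amounts 10, 20, ..., 1000.
orders = [{"id": i, "amount": i * 10} for i in range(1, 101)]
total = parallel_sum(orders, node_count=4, key="id", column="amount")
print(total)
```

Changing `node_count` redistributes the same rows over more slices without changing the result, which is the property that lets a cluster grow while queries stay fast.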

/img/AWS/Database/Untitled%2027.png

Back-up

Backups are enabled by default with a 1 day retention period.

The retention period can be modified up to 35 days.

/img/AWS/Database/Untitled%2028.png

Billing

/img/AWS/Database/Untitled%2029.png

Security

Data-in-transit: Encrypted using SSL

Data-at-rest: Encrypted using AES-256 encryption

Database Encryption can be applied using

  • KMS multi-tenant HSM
  • CloudHSM single-tenant HSM

/img/AWS/Database/Untitled%2030.png

Availability

Redshift (RS) runs in a single AZ.

To run in Multi-AZ you would have to run multiple RS Clusters in different AZs with same inputs.

Snapshots can be restored to a different AZ in the event an outage occurs.

/img/AWS/Database/Untitled%2031.png

[Supplement] EMR

Amazon EMR is the industry-leading cloud big data platform for processing vast amounts of data using open source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto.

Amazon EMR makes it easy to set up, operate, and scale your big data environments by automating time-consuming tasks like provisioning capacity and tuning clusters. With EMR you can run PB-scale analysis at less than half of the cost of traditional on-premises solutions and over 3x faster than standard Apache Spark.

You can run workloads on Amazon EC2 instances, on Amazon Elastic Kubernetes Service (EKS) clusters, or on-premises using EMR on AWS Outposts.

Redshift CheatSheet

/img/AWS/Database/Untitled%2032.png

DynamoDB

A key-value and document database (NoSQL) which guarantees consistent reads and writes at any scale.

Intro to DynamoDB

/img/AWS/Database/Untitled%2033.png

You define your read and write capacity.

/img/AWS/Database/Untitled%2034.png

Table Structure

/img/AWS/Database/Untitled%2035.png

Consistent Read

When data is updated, the update has to be written to all copies. It is possible for data to be inconsistent if you are reading from a copy which has yet to be updated. DynamoDB lets you choose the read consistency that meets your needs.
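The trade-off can be modelled with a toy replicated table in plain Python. The `ReplicatedTable` class is hypothetical and says nothing about how DynamoDB replicates internally; it only shows why a read from a lagging copy can return stale data. In the real API the choice is made per request with the `ConsistentRead` parameter.

```python
class ReplicatedTable:
    """Toy table with one up-to-date copy and lagging follower copies."""

    def __init__(self, follower_count=2):
        self.leader = {}
        self.followers = [{} for _ in range(follower_count)]
        self._pending = []  # writes not yet applied to the followers

    def put(self, key, value):
        # The write is acknowledged before every copy has caught up.
        self.leader[key] = value
        self._pending.append((key, value))

    def replicate(self):
        """Apply all pending writes to every follower copy."""
        for key, value in self._pending:
            for follower in self.followers:
                follower[key] = value
        self._pending.clear()

    def get(self, key, consistent_read=False):
        # A strongly consistent read uses the up-to-date copy; an
        # eventually consistent read may hit a copy that lags behind.
        source = self.leader if consistent_read else self.followers[0]
        return source.get(key)

table = ReplicatedTable()
table.put("user#1", "v1")
table.replicate()
table.put("user#1", "v2")  # not yet replicated

strong = table.get("user#1", consistent_read=True)
eventual = table.get("user#1")
table.replicate()
eventual_later = table.get("user#1")
print(strong, eventual, eventual_later)
```

Once replication catches up, the eventually consistent read returns the same value as the strongly consistent one.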

/img/AWS/Database/Untitled%2036.png

/img/AWS/Database/Untitled%2037.png

DynamoDB Accelerator (DAX)

Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for Amazon DynamoDB that delivers up to a 10 times performance improvement—from milliseconds to microseconds—even at millions of requests per second.

DAX does all the heavy lifting required to add in-memory acceleration to your DynamoDB tables, without requiring developers to manage cache invalidation, data population, or cluster management.
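The caching pattern behind an accelerator like this can be sketched as a read-through, write-through cache sitting in front of a table. The `ReadThroughCache` class and the dict standing in for the DynamoDB table are assumptions for illustration; this is not the DAX client API.

```python
class ReadThroughCache:
    """In-memory cache in front of a slower backing table."""

    def __init__(self, table):
        self.table = table  # stand-in for the DynamoDB table
        self.items = {}     # the in-memory cache
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.items:
            self.hits += 1            # served from memory
            return self.items[key]
        self.misses += 1              # fall through to the table
        value = self.table.get(key)
        if value is not None:
            self.items[key] = value   # populate the cache for next time
        return value

    def put(self, key, value):
        # Write-through: the table and the cache are updated together,
        # so the caller never has to invalidate cached entries by hand.
        self.table[key] = value
        self.items[key] = value

# Hypothetical backing data.
backing_table = {"sku#1": {"price": 10}}
cache = ReadThroughCache(backing_table)

first = cache.get("sku#1")    # miss: loaded from the table
second = cache.get("sku#1")   # hit: served from memory
cache.put("sku#2", {"price": 25})
third = cache.get("sku#2")    # hit: written through on put
print(cache.hits, cache.misses)
```

The application talks only to the cache object, which is the sense in which cache population and invalidation are taken off the developer's hands.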

DynamoDB CheatSheet

/img/AWS/Database/Untitled%2038.png

DMS

Database Migration Service

AWS Database Migration Service helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. The AWS Database Migration Service can migrate your data to and from most widely used commercial and open-source databases.

/img/AWS/Database/Untitled%2039.png