[AWS SAA-C02 Study Note] S3 & Snowball

S3

Overview

Simple Storage Service

object-based storage service (manages data as objects)

as opposed to other storage architectures:

  • file system, which manages data as files in a file hierarchy
  • block storage, which manages data as blocks within sectors and tracks

[S3 Object]

  • Key: the name of the object
  • Value: the data itself, as a sequence of bytes
  • Version ID: present when versioning is enabled
  • Metadata: additional information attached to the object

a single object can store from 0 bytes to 5 TB of data

[S3 Bucket]

holds objects, like a top-level folder

universal namespace: bucket names must be globally unique

Storage Classes and Comparison

! cheaper classes trade retrieval time and accessibility for lower storage cost !

11 9’s (11 nines) = 99.999999999%

9 9’s (9 nines) = 99.9999999%
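To get a feel for what the nines mean, here is a rough sketch. It assumes the durability figure can be read as an average annual per-object survival rate; the function name `expected_loss_per_year` is made up for this note.

```shell
# Expected number of objects lost per year for a given number of nines.
# Assumption: durability is treated as an average annual per-object survival rate.
expected_loss_per_year() {
  # $1 = number of objects stored, $2 = number of nines of durability
  awk -v n="$1" -v nines="$2" 'BEGIN { printf "%.6g\n", n * 10^(-nines) }'
}

# 10 million objects at 11 nines and at 9 nines
expected_loss_per_year 10000000 11
expected_loss_per_year 10000000 9
```

With 10,000,000 objects at 11 nines this works out to about one lost object every 10,000 years, which matches how AWS describes the figure.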

[Type]

standard (classes)

intelligent tiering

standard infrequently accessed (IA)

one zone IA

glacier

glacier deep archive

/img/AWS/S3/Untitled.png

S3 Glacier Retrieval Options

from minutes to hours

  • Expedited: Expedited retrievals allow you to quickly access your data when occasional urgent requests for a subset of archives are required. For all but the largest archives (250 MB+), data accessed using Expedited retrievals is typically made available within 1–5 minutes. Provisioned Capacity ensures that retrieval capacity for Expedited retrievals is available when you need it.
  • Standard: Standard retrievals allow you to access any of your archives within several hours. Standard retrievals typically complete within 3–5 hours. This is the default option for retrieval requests that do not specify the retrieval option.
  • Bulk: Bulk retrievals are S3 Glacier’s lowest-cost retrieval option, which you can use to retrieve large amounts, even petabytes, of data inexpensively in a day. Bulk retrievals typically complete within 5–12 hours.
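The retrieval option is chosen per restore request. A minimal sketch of building the request body for `aws s3api restore-object`; the helper name `restore_request` and the bucket/key in the comment are placeholders, not from the original note.

```shell
# Build the JSON body of a restore request for a chosen retrieval tier.
restore_request() {
  # $1 = days to keep the restored copy, $2 = tier: Expedited | Standard | Bulk
  case "$2" in
    Expedited|Standard|Bulk) ;;
    *) echo "unknown tier: $2" >&2; return 1 ;;
  esac
  printf '{"Days": %d, "GlacierJobParameters": {"Tier": "%s"}}\n' "$1" "$2"
}

restore_request 1 Bulk

# With real credentials the body would be passed to the CLI like this
# (bucket and key are placeholders):
# aws s3api restore-object --bucket my-bucket --key archive/2020.tar \
#   --restore-request "$(restore_request 1 Bulk)"
```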

S3 Security

By default, all new buckets are PRIVATE when created

Per-request logging can be turned on for a bucket

Log files are generated and saved in a different bucket (even a bucket in a different AWS account, if desired)

/img/AWS/S3/Untitled%201.png

Access control is configured using Bucket policies and Access Control Lists (ACL)

/img/AWS/S3/Untitled%202.png
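A bucket policy is a JSON document attached to the bucket. A minimal sketch that prints a policy allowing public read of every object in a bucket; the helper name `bucket_policy`, the `Sid` and the bucket name are illustrative only, and whether the policy takes effect also depends on the account's Block Public Access settings.

```shell
# Print a bucket policy that lets anyone read (GET) objects in the bucket.
bucket_policy() {
  # $1 = bucket name (placeholder)
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::$1/*"
    }
  ]
}
EOF
}

bucket_policy my-bucket

# Applying it would look like this (needs real credentials):
# bucket_policy my-bucket > policy.json
# aws s3api put-bucket-policy --bucket my-bucket --policy file://policy.json
```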

Encryption

encryption in transit - traffic between your local host and S3 is encrypted via SSL/TLS

server-side encryption (SSE) - encryption at rest

SSE key management options

  • SSE-S3 (sometimes written SSE-AES): S3 manages the keys, uses the AES-256 algorithm
  • SSE-KMS: envelope encryption, AWS KMS and you manage the keys
  • SSE-C: the customer provides the key

/img/AWS/S3/Untitled%203.png
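On upload, the server-side encryption mode is selected with flags on `aws s3 cp`. A small sketch that maps each mode to its flags; the helper name `sse_flags` and the key ID / key file arguments are placeholders for this note.

```shell
# Print the `aws s3 cp` flags for each server-side encryption mode.
sse_flags() {
  # $1 = s3 | kms | c, $2 = KMS key ID (kms) or path to a 256-bit key file (c)
  case "$1" in
    s3)  printf '%s\n' "--sse AES256" ;;
    kms) printf '%s\n' "--sse aws:kms --sse-kms-key-id $2" ;;
    c)   printf '%s\n' "--sse-c AES256 --sse-c-key fileb://$2" ;;
    *)   echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

sse_flags s3
sse_flags kms my-key-id
sse_flags c key.bin

# Example use (needs real credentials; bucket is a placeholder):
# aws s3 cp report.pdf s3://my-bucket/report.pdf $(sse_flags s3)
```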

Data Consistency

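  • [new objects] read-after-write consistency

when you upload a new object to S3, you can read it immediately

  • [overwrites (PUT) or deletes (DELETE)] eventual consistency (the model at the time of SAA-C02)

when you overwrote or deleted an object, it took time for S3 to replicate the change across AZs

if you read immediately, S3 could return an old copy. Since December 2020, S3 provides strong read-after-write consistency for all PUT and DELETE requests, so this stale read no longer occurs.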

Cross Region Replication (CRR)

/img/AWS/S3/Untitled%204.png

When enabled, any object that is uploaded will be automatically replicated to another region.

Versioning must be turned on for both the source and destination buckets.

S3 Versioning

  • stores all versions of an object in S3
  • once enabled it cannot be disabled, only suspended on the bucket
  • fully integrates with S3 lifecycle rules
  • MFA delete feature provides extra protection against deletion of your data

/img/AWS/S3/Untitled%205.png

/img/AWS/S3/Untitled%206.png

Lifecycle Management

Automates the process of moving objects to different storage classes and deleting objects altogether

can be used together with versioning

can be applied to both current and previous versions

/img/AWS/S3/Untitled%207.png

lifecycle rule

/img/AWS/S3/Untitled%208.png
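A lifecycle rule can also be written as JSON and applied from the CLI. A minimal sketch, assuming a made-up rule that moves objects under `logs/` to Standard-IA after 30 days, to Glacier after 90 days, deletes them after a year, and expires previous versions after 30 days; the rule ID, prefix and day counts are illustrative.

```shell
# Print a lifecycle configuration with one rule (all values are illustrative).
lifecycle_config() {
  cat <<'EOF'
{
  "Rules": [
    {
      "ID": "archive-then-expire",
      "Status": "Enabled",
      "Filter": {"Prefix": "logs/"},
      "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"}
      ],
      "Expiration": {"Days": 365},
      "NoncurrentVersionExpiration": {"NoncurrentDays": 30}
    }
  ]
}
EOF
}

lifecycle_config

# Applying it would look like this (needs real credentials):
# lifecycle_config > lifecycle.json
# aws s3api put-bucket-lifecycle-configuration --bucket my-bucket \
#   --lifecycle-configuration file://lifecycle.json
```

`NoncurrentVersionExpiration` is the part that acts on previous versions, so it only matters on a versioned bucket.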

Transfer Acceleration

Fast and secure transfer of files over long distances between your end users and an S3 bucket

uses CloudFront Edge Locations

Instead of uploading directly to your bucket, users use a distinct URL that routes the upload through an Edge Location

/img/AWS/S3/Untitled%209.png
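The distinct URL follows a fixed pattern based on the bucket name. A small sketch that builds it; the helper name `accelerate_endpoint` and the bucket name are placeholders.

```shell
# Build the Transfer Acceleration endpoint for a bucket.
# Note: the bucket name must be DNS-compliant and must not contain periods.
accelerate_endpoint() {
  # $1 = bucket name
  printf 'https://%s.s3-accelerate.amazonaws.com\n' "$1"
}

accelerate_endpoint my-bucket

# Acceleration has to be enabled on the bucket first (needs real credentials):
# aws s3api put-bucket-accelerate-configuration --bucket my-bucket \
#   --accelerate-configuration Status=Enabled
```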

Pre-signed URLs

generate a URL which provides you temporary access to an object to either upload or download object data.

provide access to private objects

generated with the AWS CLI or an AWS SDK

aws s3 presign s3://mybucket/myobject --expires-in 300
# output
...url...

ex. a web application needs to allow users to download files from a password-protected part of the app. The app generates a pre-signed URL which expires after 5 seconds.

MFA Delete

ensures users cannot delete objects from a bucket unless they provide their MFA code.

/img/AWS/S3/Untitled%2010.png

MFA Delete can only be enabled under these conditions:

  • the AWS CLI must be used to turn on MFA Delete
  • the bucket must have versioning turned on

Only the bucket owner logged in as root user can permanently delete object versions from the bucket
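Since MFA Delete is switched on from the CLI, a sketch of what that command looks like may help. The helper below only prints the command; the name `mfa_delete_command`, the bucket, the device ARN and the code are all placeholders.

```shell
# Print the CLI command that enables versioning together with MFA Delete.
# It has to be run with the root user's credentials and a current MFA code.
mfa_delete_command() {
  # $1 = bucket, $2 = serial number or ARN of the MFA device, $3 = current MFA code
  printf 'aws s3api put-bucket-versioning --bucket %s --versioning-configuration Status=Enabled,MFADelete=Enabled --mfa "%s %s"\n' "$1" "$2" "$3"
}

mfa_delete_command my-bucket arn:aws:iam::123456789012:mfa/root-account-mfa-device 123456
```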

Lab on S3

S3 in region free!

AWS CLI S3 command

# list all buckets
aws s3 ls
# list all folders and objects in bucket
aws s3 ls s3://liuyuchen777
# list all folders and objects in folder (note the trailing slash)
aws s3 ls s3://liuyuchen777/pic/
# download object
aws s3 cp s3://liuyuchen777/pic/001.jpg ~/Desktop/001.jpg
# upload object to s3 bucket (002.jpg is not in bucket yet)
aws s3 cp ~/Desktop/002.jpg s3://liuyuchen777/pic/002.jpg
# create pre-signed url for an object that expires in 300s
aws s3 presign s3://liuyuchen777/pic/001.jpg --expires-in 300

Cheatsheet on S3

/img/AWS/S3/Untitled%2011.png

/img/AWS/S3/Untitled%2012.png

/img/AWS/S3/Untitled%2013.png

AWS Snow Family

Snowball

Low cost: transferring 100 TB over the Internet costs hundreds of dollars, and Snowball costs only about 1/5 of that

Speed: a Snowball transfer takes less than a week
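For comparison, a back-of-the-envelope estimate of how long the same transfer takes over a network link. This is a sketch assuming decimal terabytes and a link that is fully used the whole time (real transfers are slower); the function name `transfer_days` is made up for this note.

```shell
# Estimate the days needed to push data over a network link.
# Assumptions: 1 TB = 10^12 bytes, link is 100% utilised, no protocol overhead.
transfer_days() {
  # $1 = data size in TB, $2 = link speed in Mbit/s
  awk -v tb="$1" -v mbps="$2" 'BEGIN {
    megabits = tb * 8 * 1000 * 1000   # 1 TB = 8,000,000 Mbit
    printf "%.1f\n", megabits / mbps / 86400
  }'
}

# 100 TB over a 100 Mbit/s link, then over a 1 Gbit/s link
transfer_days 100 100
transfer_days 100 1000
```

At 100 Mbit/s the 100 TB transfer needs roughly three months, which is why shipping a device wins.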

/img/AWS/S3/Untitled%2014.png

Features and Limitations:

  • E-Ink display
  • Tamper and weather proof
  • Data is encrypted end-to-end
  • Trusted Platform Module (TPM)
  • jobs must be completed within 90 days
  • Snowball can import to and export from S3

comes in two sizes: 50 TB and 80 TB

AWS Snowball Edge

more storage and with local processing

/img/AWS/S3/Untitled%2015.png

Snowball Edge Compute Optimized is a secure, rugged device that brings AWS computing and storage capabilities

Snowmobile

45-foot-long ruggedized shipping container, pulled by a semi-trailer truck

up to 100PB

/img/AWS/S3/Untitled%2016.png

imports data into S3 and S3 Glacier

Cheatsheet

/img/AWS/S3/Untitled%2017.png