[AWS SAA-C02 Study Note] S3 & Snowball
S3
Overview
Simple Storage Service (S3)
object-based storage service (manages data as objects)
as opposed to other storage architectures:
- file storage, which manages data as files in a file hierarchy
- block storage, which manages data as blocks within sectors and tracks
[S3 Object]
- Key: the name of the object
- Value: the data itself, a sequence of bytes
- Version ID: present when versioning is enabled
- Metadata: additional information attached to the object
an object can store from 0 bytes to 5 terabytes of data
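To make the four parts of an object concrete, here is a minimal Python sketch of the data model. The class name `S3Object`, its field names, and the size check are illustrative assumptions, not an AWS API, and the 5 TB limit is taken as 5 * 1024**4 bytes for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumption: "5 Terabytes" is treated as 5 * 1024**4 bytes in this sketch.
MAX_OBJECT_BYTES = 5 * 1024 ** 4


@dataclass
class S3Object:
    """Toy model of an S3 object (hypothetical class, not an AWS API)."""
    key: str                           # name of the object
    value: bytes                       # the data itself
    version_id: Optional[str] = None   # only set when versioning is enabled
    metadata: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # An object may hold anywhere from 0 bytes up to the size limit.
        if len(self.value) > MAX_OBJECT_BYTES:
            raise ValueError("object exceeds the 5 TB size limit")


photo = S3Object(key="pic/001.jpg", value=b"\xff\xd8",
                 metadata={"content-type": "image/jpeg"})
empty = S3Object(key="empty.txt", value=b"")  # 0-byte objects are allowed
print(photo.key, len(photo.value), photo.version_id)
```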
[S3 Bucket]
holds objects, like a top-level folder
universal namespace: bucket names must be globally unique
Storage Classes and Comparison
! trade retrieval time and accessibility for cheaper storage !
11 9’s (11 nines) = 99.999999999%
9 9’s (9 nines) = 99.9999999%
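The "nines" notation is easier to trust after a quick calculation. The sketch below, with hypothetical helper names, converts a count of nines into a percentage and, assuming the figure is read as annual durability, estimates how many objects you would expect to lose per year. The 10 million object workload is an illustrative assumption.

```python
from decimal import Decimal
from fractions import Fraction


def nines_as_percent(n: int) -> Decimal:
    """Percentage written with n nines in total, e.g. 4 -> 99.99."""
    return Decimal(100) - Decimal(10) ** (2 - n)


def expected_annual_loss(object_count: int, n: int) -> Fraction:
    """Expected number of objects lost per year at n nines of durability."""
    return object_count * Fraction(1, 10 ** n)


print(nines_as_percent(11))  # 99.999999999
print(nines_as_percent(9))   # 99.9999999

# Hypothetical workload: 10 million stored objects at 11 nines.
loss = expected_annual_loss(10_000_000, 11)
print(1 / loss)              # 10000, i.e. about one object lost every 10,000 years
```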
[Type]
standard (classes)
intelligent tiering
standard infrequently accessed (IA)
one zone IA
glacier
glacier deep archive
S3 Glacier Retrieval Options
from minutes to hours
- Expedited — Expedited retrievals allow you to quickly access your data when occasional urgent requests for a subset of archives are required. For all but the largest archives (250 MB+), data accessed using Expedited retrievals are typically made available within 1–5 minutes. Provisioned Capacity ensures that retrieval capacity for Expedited retrievals is available when you need it.
- Standard — Standard retrievals allow you to access any of your archives within several hours. Standard retrievals typically complete within 3–5 hours. This is the default option for retrieval requests that do not specify the retrieval option.
- Bulk — Bulk retrievals are S3 Glacier’s lowest-cost retrieval option, which you can use to retrieve large amounts, even petabytes, of data inexpensively in a day. Bulk retrievals typically complete within 5–12 hours.
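A retrieval tier is chosen per restore request. As a rough sketch, the helper below (a hypothetical function, not part of any SDK) builds the JSON body that could be passed to `aws s3api restore-object --restore-request`. The timing table just repeats the typical figures from the list above.

```python
import json

# Typical completion times as quoted in the notes above.
RETRIEVAL_TIERS = {
    "Expedited": "1-5 minutes",
    "Standard": "3-5 hours",
    "Bulk": "5-12 hours",
}


def build_restore_request(days: int, tier: str = "Standard") -> dict:
    """Build a restore request; `days` is how long the restored copy stays available."""
    if tier not in RETRIEVAL_TIERS:
        raise ValueError(f"unknown tier {tier!r}; expected one of {sorted(RETRIEVAL_TIERS)}")
    if days < 1:
        raise ValueError("days must be at least 1")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


request = build_restore_request(days=2, tier="Bulk")
print(json.dumps(request))
print("typically ready in", RETRIEVAL_TIERS["Bulk"])
```

Standard is the default argument here because the notes describe it as the default when no tier is specified.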
S3 Security
By default, all new buckets are PRIVATE when created
Per-request logging can be turned on for a bucket
Log files are generated and saved in a different bucket (even a bucket in a different AWS account if desired)
Access control is configured using Bucket policies and Access Control Lists (ACL)
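For a feel of what a bucket policy looks like, here is a minimal sketch that generates one granting a single principal read access to every object in a bucket. The function name, bucket name, and account ID are hypothetical; the resulting JSON is the kind of document you could save as `policy.json` and apply with `aws s3api put-bucket-policy --bucket <bucket> --policy file://policy.json`.

```python
import json


def read_only_policy(bucket: str, principal_arn: str) -> dict:
    """Bucket policy that lets one principal download objects from `bucket`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowGetObject",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# Hypothetical bucket name and account ID, for illustration only.
policy = read_only_policy("example-bucket", "arn:aws:iam::111122223333:root")
print(json.dumps(policy, indent=2))
```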
Encryption
encryption in transit - traffic between your local host and S3 is encrypted via SSL/TLS
server-side encryption (SSE) - encryption at rest
three ways to manage the keys:
- SSE-AES (officially SSE-S3): S3 manages the keys, uses the AES-256 algorithm
- SSE-KMS: envelope encryption, AWS KMS and you manage the keys
- SSE-C: customer-provided keys, you manage the keys
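On the wire, the three server-side options differ mainly in which request headers accompany an upload. The sketch below maps each option to its headers. The function name and the `mode` strings are made up for illustration, and the KMS key alias in the usage line is a placeholder.

```python
import base64
import hashlib
from typing import Dict, Optional


def sse_headers(mode: str, kms_key_id: Optional[str] = None,
                customer_key: Optional[bytes] = None) -> Dict[str, str]:
    """Return the request headers that select an SSE option for an upload."""
    if mode == "s3":      # S3-managed keys (SSE-AES in these notes)
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "kms":     # keys managed through AWS KMS
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "c":       # customer-provided 256-bit key
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 32-byte (256-bit) key")
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key":
                base64.b64encode(customer_key).decode("ascii"),
            "x-amz-server-side-encryption-customer-key-MD5":
                base64.b64encode(hashlib.md5(customer_key).digest()).decode("ascii"),
        }
    raise ValueError(f"unknown mode {mode!r}")


print(sse_headers("s3"))
print(sse_headers("kms", kms_key_id="alias/example-key"))  # placeholder alias
```

With SSE-C the key travels with every request, which is why that option only makes sense over an encrypted connection.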
Data Consistency
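- [new objects] read-after-write consistency
upload a new object to S3 and you are able to read it immediately
- [overwrite (PUT) or delete (DELETE)] eventual consistency in the original S3 model
when you overwrote or deleted an object, it took time for S3 to replicate the change across AZs
if you read immediately, S3 could return an old copy.
note: since December 2020, S3 provides strong read-after-write consistency for all PUT and DELETE operations, so stale reads like this no longer occur.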
Cross Region Replication (CRR)
When enabled, any object that is uploaded will be automatically replicated to a bucket in another region.
Versioning must be turned on for both the source and destination buckets.
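A replication setup boils down to a small configuration document attached to the source bucket. The sketch below builds a single-rule configuration and refuses to do so unless both buckets are versioned, mirroring the requirement above. The function name, rule ID, role ARN, and bucket name are hypothetical; the output is the kind of JSON accepted by `aws s3api put-bucket-replication --replication-configuration file://replication.json`.

```python
import json


def replication_config(role_arn: str, dest_bucket: str,
                       source_versioned: bool, dest_versioned: bool) -> dict:
    """Build a one-rule replication configuration covering a whole bucket."""
    # Replication requires versioning on both buckets, so fail early otherwise.
    if not (source_versioned and dest_versioned):
        raise ValueError("versioning must be enabled on both source and destination")
    return {
        "Role": role_arn,  # IAM role that S3 assumes to replicate objects
        "Rules": [
            {
                "ID": "replicate-everything",
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {},  # empty filter = all objects
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": f"arn:aws:s3:::{dest_bucket}"},
            }
        ],
    }


# Hypothetical role and destination bucket, for illustration only.
config = replication_config(
    "arn:aws:iam::111122223333:role/example-replication-role",
    "example-dest-bucket",
    source_versioned=True,
    dest_versioned=True,
)
print(json.dumps(config, indent=2))
```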
S3 Versioning
- stores all versions of an object in S3
- once enabled it cannot be disabled, only suspended on the bucket
- fully integrates with S3 lifecycle rules
- MFA delete feature provides extra protection against deletion of your data
Lifecycle Management
Automates the process of moving objects to different storage classes and deleting objects altogether
can be used together with versioning
can be applied to both current and previous versions
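A lifecycle rule is ultimately a JSON document. The following sketch builds one rule that moves current versions to cheaper classes over time, expires them, and also expires previous (noncurrent) versions. The function name, rule ID, and day counts are illustrative assumptions; the output could be applied with `aws s3api put-bucket-lifecycle-configuration --lifecycle-configuration file://lifecycle.json`.

```python
import json


def lifecycle_config(ia_days: int = 30, glacier_days: int = 90,
                     expire_days: int = 365,
                     noncurrent_expire_days: int = 180) -> dict:
    """One rule: tier down current versions, then expire current and previous versions."""
    if not ia_days < glacier_days < expire_days:
        raise ValueError("expected ia_days < glacier_days < expire_days")
    return {
        "Rules": [
            {
                "ID": "tier-down-then-expire",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix = every object
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_days},
                "NoncurrentVersionExpiration": {
                    "NoncurrentDays": noncurrent_expire_days
                },
            }
        ]
    }


print(json.dumps(lifecycle_config(), indent=2))
```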
Transfer Acceleration
Fast and secure transfer of files over long distances between your end users and an S3 bucket
uses CloudFront Edge Locations
Instead of uploading directly to your bucket, users upload to a distinct URL for an Edge Location, and the data is then routed to S3 over the AWS network
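The "distinct URL" is just a different hostname for the same bucket. As a small sketch, the hypothetical helpers below build the regular virtual-hosted-style URL and the accelerated one side by side. The bucket, region, and key are placeholders, and keys are assumed to need no URL encoding.

```python
def standard_url(bucket: str, region: str, key: str) -> str:
    """Regular virtual-hosted-style URL for an object."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"


def accelerate_url(bucket: str, key: str) -> str:
    """Transfer Acceleration URL, which enters AWS at a nearby Edge Location."""
    if "." in bucket:
        raise ValueError("accelerated buckets cannot have dots in their names")
    return f"https://{bucket}.s3-accelerate.amazonaws.com/{key}"


# Placeholder bucket, region, and key.
print(standard_url("example-bucket", "ap-northeast-1", "pic/001.jpg"))
print(accelerate_url("example-bucket", "pic/001.jpg"))
```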
Pre-signed URLs
generate a URL which provides you temporary access to an object to either upload or download object data.
provide access to private objects
by AWS CLI or AWS SDK
aws s3 presign s3://mybucket/myobject --expires-in 300
# output
...url...
e.g. a web application needs to let users download files from a password-protected part of your app, so it generates a pre-signed URL which expires after 5 seconds.
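The "temporary" part of a pre-signed URL is visible in its query string. Assuming a Signature Version 4 URL, which carries `X-Amz-Date` and `X-Amz-Expires` parameters, the sketch below works out when a given URL stops being valid. The function name is hypothetical, and the URL is a made-up example with placeholder credential and signature values.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse


def presigned_url_expiry(url: str) -> datetime:
    """Return the UTC time at which a SigV4 pre-signed URL stops working."""
    query = parse_qs(urlparse(url).query)
    signed_at = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    lifetime = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return signed_at.replace(tzinfo=timezone.utc) + lifetime


# Hypothetical URL; the credential and signature values are placeholders.
url = (
    "https://mybucket.s3.amazonaws.com/myobject"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Credential=EXAMPLE"
    "&X-Amz-Date=20240101T120000Z"
    "&X-Amz-Expires=300"
    "&X-Amz-SignedHeaders=host"
    "&X-Amz-Signature=EXAMPLE"
)
print(presigned_url_expiry(url).isoformat())  # 2024-01-01T12:05:00+00:00
```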
MFA Delete
ensures users cannot delete objects from a bucket unless they provide their MFA code.
MFA Delete can only be enabled under these conditions:
- the AWS CLI must be used to turn on MFA Delete
- the bucket must have versioning turned on
Only the bucket owner logged in as the root user can DELETE objects from the bucket
Lab on S3
The S3 console is region free (global view): no region needs to be selected to list buckets!
AWS CLI S3 command
# list all buckets
aws s3 ls
# list all folders and objects in bucket
aws s3 ls s3://liuyuchen777
# list all folders and objects in folder
aws s3 ls s3://liuyuchen777/pic
# download object
aws s3 cp s3://liuyuchen777/pic/001.jpg ~/Desktop/001.jpg
# upload object to s3 bucket (002.jpg is not in the bucket yet)
aws s3 cp ~/Desktop/002.jpg s3://liuyuchen777/pic/002.jpg
# create a pre-signed URL for an object that expires in 300s
aws s3 presign s3://liuyuchen777/pic/001.jpg --expires-in 300
AWS Snow Family
Snowball
Low cost: transferring 100 TB over the Internet costs hundreds of dollars, and Snowball costs only about 1/5 of that
Speed: a Snowball transfer takes less than a week
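The speed argument can be checked with back-of-the-envelope arithmetic. The sketch below estimates how long 100 TB would take over a network link, assuming decimal terabytes and a fully utilized link. The function name and the 100 Mbps and 1 Gbps link speeds are illustrative assumptions, not figures from the notes.

```python
def transfer_days(size_tb: float, link_mbps: float, utilization: float = 1.0) -> float:
    """Days needed to push `size_tb` terabytes through a `link_mbps` link."""
    bits = size_tb * 10 ** 12 * 8                       # decimal terabytes -> bits
    bits_per_second = link_mbps * 10 ** 6 * utilization
    return bits / bits_per_second / 86_400              # 86,400 seconds per day


for mbps in (100, 1000):
    print(f"100 TB over {mbps} Mbps: {transfer_days(100, mbps):.1f} days")
# 100 TB over 100 Mbps: 92.6 days
# 100 TB over 1000 Mbps: 9.3 days
```

Even on a dedicated 1 Gbps link the transfer takes longer than a week, which is the gap a shipped device is meant to close.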
Features and Limitations:
- E-Ink display
- Tamper and weather proof
- Data is encrypted end-to-end
- Trusted Platform Module (TPM)
- jobs must be completed within 90 days
- Snowball can import to and export from S3
comes in two sizes: 50 TB and 80 TB
AWS Snowball Edge
more storage, with local processing
Snowball Edge Compute Optimized is a secure, rugged device that brings AWS computing and storage capabilities
Snowmobile
45-foot-long ruggedized shipping container, pulled by a semi-trailer truck
up to 100 PB
for S3 and S3 Glacier