Skip to content

Latest commit

 

History

History
132 lines (118 loc) · 6.02 KB

File metadata and controls

132 lines (118 loc) · 6.02 KB

S3 Buckets

  • Amazon S3 allows people to store objects (files) in “buckets” (directories)
  • Buckets must have a globally unique name
    • Naming convention:
      • No uppercase
      • No underscore
      • 3-63 characters long
      • Not an IP
      • Must start with lowercase letter or number
  • Objects
    • Objects (files) have a Key. The key is the FULL path:
      • <my_bucket>/my_file.txt
      • <my_bucket>/my_folder/another_folder/my_file.txt
    • There’s no concept of “directories” within buckets (although the UI will trick you to think otherwise)
    • Just keys with very long names that contain slashes (“/“)
    • Object Values are the content of the body:
      • Max Size is 5TB
      • If uploading more than 100MB , must use “multipart upload”
    • Metadata (list of text key / value pairs - system or user metadata)
    • Tags (Unicode key / value pair - up to 10) - useful for security / lifecycle
    • Version ID (if versioning is enabled)
    • 3500 PUTS / 5500 GETS limit per prefix (this is why it's important to never group items with a fixed name like "user-123", "user-493", etc.)

AWS S3 - Versioning

  • Enabled at the bucket level
  • Same key overwrite will increment the “version”: 1, 2, 3
  • It is best practice to version your buckets
    • Protect against unintended deletes (ability to restore a version)
    • Easy roll back to previous versions
  • Any file that is not version prior to enabling versioning will have the version “null”

S3 Encryption for Objects

  • There are 4 methods of encrypt objects in S3
    • SSE-S3: encrypts S3 objects
      • Encryption using keys handled & managed by AWS S3
      • Object is encrypted server side
      • AES-256 encryption type
      • Must set header: “x-amz-server-side-encryption”:”AES256”
    • SSE-KMS: encryption using keys handled & managed by KMS
      • KMS Advantages: user control + audit trail
      • Object is encrypted server side
      • Maintain control of the rotation policy for the encryption keys
      • Must set header: “x-amz-server-side-encryption”:”aws:kms”
    • SSE-C: server-side encryption using data keys fully managed by the customer outside of AWS
      • Amazon S3 does not store the encryption key you provide
      • HTTPS must be used
      • Encryption key must be provided in HTTP headers, for every HTTP request made
    • Client Side Encryption
      • Client library such as the amazon S3 Encryption Client
      • Clients must encrypt data themselves before sending to S3
      • Clients must decrypt data themselves when retrieving from S3
      • Customer fully manages the keys and encryption cycle

Encryption in transit (SSL)

  • AWS S3 exposes:
    • HTTP endpoint: non encrypted
    • HTTPS endpoint: encryption in flight
  • You’re free to use the endpoint your want, but HTTPS is recommended
  • HTTPS is mandatory for SSE-C
  • Encryption in flight is also called SSL / TLS

S3 Security

  • User based
    • IAM policies - which API calls should be allowed for a specific user from IAM console
  • Resource based
    • Bucket policies - bucket wide rules from the S3 console - allows cross account
    • Object Access Control List (ACL) - finer grain
    • Bucket Access Control List (ACL) - less common
  • Networking
    • Support VPC endpoints (for instances in VPC without www internet)
  • Logging and Audit:
    • S3 access logs can be stored in other S3 buckets
    • API calls can be logged in AWS CloudTrail
  • User Security:
  • MFA (multi factor authentication) can be required in versioned buckets to delete objects
  • Signed URLs: URLS that are valid only for a limited time (ex: premium video services for logged in users)

S3 Bucket Policies

  • JSON based policies
    • Resources: buckets and objects
    • Actions: Set of API to Allow or Deny
    • Effect: Allow / Deny
    • Principal: The account or user to apply the policy to
  • Use S3 bucket for policy to:
    • Grant public access to the bucket
    • Force objects to be encrypted at upload
    • Grant access to another account (Cross Account)

S3 Websites

  • S3 can host static website and have them accessible on the world wide web
  • The website URL will be:
    • http://bucket-name.s3-website-AWS-region.amazonaws.com
    • OR
    • http://bucket-name.s3-website.AWS-region.amazonaws.com
  • If you get a 403 (forbidden) error, make sure the bucket policy allows public reads!

S3 Cors

  • If you request data from another S3 bucket, you need to enable CORS
  • Cross Origin Resource Sharing allows you to limit the number of websites that can request your files in S3 (and limit your costs)
  • This is a popular exam question

AWS S3 - Consistency Model

  • Read after write consistency for PUTS of new objects

    • As soon as an object is written, we can retrieve it e.g: (PUT 200 -> GET 200)
    • This is true, except if we did a GET before to see if the object existed

    ex: (GET 404 -> PUT 200 -> GET 404) - eventually consistent

  • Eventual Consistency for DELETES and PUTS of existing objects

    • If we read an object after updating, we might get the older version

    ex: (PUT 200 -> PUT 200 -> GET 200 (might be older version))

    • If we delete an object, we might still be able to retrieve it for a short time

    ex: (DELETE 200 -> GET 200)

AWS S3 - Other

  • S3 can send notifications on changes to
    • AWS SQS: queue service
    • AWS SNS: notification service
    • AWS Lambda: serverless service
  • S3 has a cross region replication feature (managed)

AWS S3 Performance

  • Faster upload of large objects (>100MB), use multipart upload
    • Parallelize PUTs for greater throughput
    • Maximize your network bandwidth
    • Decrease time to retry in case a part fails
  • Use CloudFront to cache S3 objects around the world (improves reads)
  • S3 Transfer Acceleration (uses edge locations) - just need to change the endpoint you write to, not the code
  • If using SSE-KMS encryption, you may be limited to your AWS limits for KMS usage (~100s - 1000s downloads / uploads per second)