- DynamoDB is schemaless
- use a DynamoDB Stream so that when an item is updated in the primary table it is also inserted into a secondary table
- When looking for a good partition key, use one with an automatically generated GUID
- For large tables, use queries instead of scans
- To work with search queries:
  - Specify a key condition expression in the query
  - Specify the partition key name and value in an equality condition
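The query rules above can be sketched as a small parameter builder; the dict shape matches DynamoDB's low-level `Query` API (which boto3's client exposes), but the table and attribute names here are made up for illustration:

```python
def build_query_params(table_name: str, pk_name: str, pk_value: str) -> dict:
    """Build parameters for a DynamoDB Query call.

    The partition key must appear in an equality condition inside
    the KeyConditionExpression; table/key names are hypothetical.
    """
    return {
        "TableName": table_name,
        # Equality condition on the partition key, as Query requires.
        "KeyConditionExpression": f"{pk_name} = :pk",
        "ExpressionAttributeValues": {":pk": {"S": pk_value}},
    }

params = build_query_params("Orders", "CustomerId", "C-1001")
print(params["KeyConditionExpression"])  # CustomerId = :pk
```

The same dict can be passed straight to a boto3 `client("dynamodb").query(**params)` call.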
- DAX (DynamoDB Accelerator) - provides in-memory caching for DynamoDB tables
  - improves response times for eventually consistent reads only
  - point your API at the DAX cluster instead of the table
- TTL (Time to Live) - allows you to define when items in a table expire so that they can be automatically deleted from the database
- DynamoDB offers full SSE-KMS encryption at rest
  - enable it during creation of the DynamoDB table
- indexes - enable fast queries on specific data columns, faster than querying the whole table
  - give you a different view of your data based on alternate partition/sort keys
- 2 types of indexes:
  - Local Secondary Index
    - must be created when you create the table
    - same partition key as your table
    - different sort key
  - Global Secondary Index
    - can be created at any time, at table creation or after
    - different partition key
    - different sort key
- Local Secondary Index
  - faster than a sequential scan
  - when to use:
    - when the table size is 20 GB or larger
    - the table's provisioned throughput is not fully used
    - sequential scan operations are too slow
- Global Tables - deploy a multi-region, multi-master database without having to build or maintain your own replication solution
- DynamoDB Streams - capture a time-ordered sequence of item-level modifications in any table and store this information in a log for 24 hours
- To encrypt an existing unencrypted table:
  - Create a new table with encryption enabled
  - Copy the data from the existing table to the new table
- Read-after-write consistency - if you write a new key to S3, you will be able to retrieve the object immediately afterwards; any newly created object or file is visible immediately, without any delay
- Objects stored in your bucket before you set the versioning state have a version ID of null.
- when you enable versioning, existing objects in your bucket do not change
- what changes is how S3 handles the objects in future requests
- 409 Conflict - the bucket name already exists (bucket names are globally unique)
- S3 buckets in ALL regions provide read-after-write consistency for PUTs of new objects and eventual consistency for overwrite PUTs and DELETEs
- S3 is perfect for uploading videos
- Can enable server-side encryption for all objects stored in S3 through a bucket policy
- Server access logs, when enabled, accumulate over time and take up more space in your S3 bucket
- Use a lifecycle configuration if you want to delete them over time
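A lifecycle rule for expiring old access logs can be sketched as a plain dict whose shape matches S3's `put_bucket_lifecycle_configuration` API; the prefix and retention period here are made-up examples:

```python
def log_expiry_rule(prefix: str, days: int) -> dict:
    """Lifecycle configuration that expires old access-log objects.

    The prefix and retention period are hypothetical; pass the
    result as LifecycleConfiguration to the S3 API.
    """
    return {
        "Rules": [{
            "ID": "expire-access-logs",
            "Filter": {"Prefix": prefix},   # only objects under this prefix
            "Status": "Enabled",
            "Expiration": {"Days": days},   # delete after N days
        }]
    }

config = log_expiry_rule("logs/", 90)
```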
- If you want to enforce MFA on an S3 bucket, add a Deny statement to the bucket policy with the condition `aws:MultiFactorAuthPresent` set to false (i.e., deny requests made without MFA)
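The usual shape of that policy, sketched as a Python dict (the bucket name is a placeholder); `BoolIfExists` also catches requests where the MFA key is absent entirely:

```python
# Bucket policy sketch: deny all S3 actions unless MFA was used.
# "example-bucket" is a placeholder name.
mfa_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyWithoutMFA",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": "arn:aws:s3:::example-bucket/*",
        "Condition": {
            # Matches when MFA is false OR the key is missing.
            "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
        },
    }],
}
```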
- Multipart upload - API that enables you to upload large objects in parts or make a copy of an existing object
  - 3-step process:
    - initiate the upload
    - upload the object parts
    - after all parts are uploaded, complete the multipart upload
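With boto3 the three steps map to `create_multipart_upload`, `upload_part`, and `complete_multipart_upload`; the part-splitting that feeds the upload step can be sketched like this (real parts other than the last must be at least 5 MB; a tiny part size is used here only to keep the sketch testable):

```python
def split_into_parts(data: bytes, part_size: int):
    """Split an object into numbered parts for a multipart upload.

    Part numbers start at 1, as S3 requires; part_size here is
    artificially small for illustration.
    """
    return [
        (i + 1, data[off:off + part_size])
        for i, off in enumerate(range(0, len(data), part_size))
    ]

parts = split_into_parts(b"x" * 10, 4)
print([n for n, _ in parts])  # [1, 2, 3]
```

Each `(part_number, chunk)` pair would become one `upload_part` call, and the returned ETags are collected for the final `complete_multipart_upload` request.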
- Multi-Object Delete - delete large numbers of objects from S3
- If KMS encryption is enabled, you are actually making extra KMS API calls, which can be throttled and cause performance issues
2 main approaches to performance optimization:
- GET-intensive workloads - use CloudFront
- Mixed workloads - avoid sequential key names for your S3 objects
  - Instead, add a random prefix like a hex hash to the key name to prevent multiple objects from being stored on the same partition
  - example: exampleawsbucket/12310sd-13-3-11/photo1.jpg
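The hex-hash prefix trick can be sketched with `hashlib` (this was S3's historical key-naming guidance; the key name below is a made-up example):

```python
import hashlib

def prefixed_key(key: str) -> str:
    """Prepend a short hex hash so sequential key names spread
    across S3 partitions instead of clustering on one."""
    prefix = hashlib.md5(key.encode()).hexdigest()[:4]
    return f"{prefix}/{key}"

print(prefixed_key("2014-01-01/photo1.jpg"))
```

The prefix is deterministic, so the application can recompute it from the original key whenever it needs to read the object back.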
- CloudFront - uses edge locations (where content is cached; you can also write content to them)
- Do not forget to give read permission on your `index.html` so EVERYONE has access for static web hosting - by default all S3 resources are private, and only the AWS account that created the resources can access them
- To allow read access, add a bucket policy that allows the s3:GetObject permission with a condition using the aws:Referer key, so that the GET request must originate from specific webpages
- Buckets can contain both encrypted and non-encrypted objects
- Use the `x-amz-server-side-encryption` request header to cause an object to be server-side encrypted
Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3) - each object is encrypted with a unique key employing strong multi-factor encryption
- As an additional safeguard, S3 encrypts the key itself with a master key that it regularly rotates
- S3 server-side encryption uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard (AES-256), to encrypt your data
Server-Side Encryption with AWS KMS-Managed Keys (SSE-KMS) - uses an envelope key (a key that protects your data's encryption key), which provides added protection against unauthorized access to your objects in S3
- similar to SSE-S3, but with additional benefits
  - provides an audit trail of when your key was last used and by whom
  - option of managing the encryption keys on your own
- Encryption process:
  - use the customer master key to generate a data key for the encryption process
  - use the plaintext data encryption key to encrypt the data locally, then erase it from memory
  - store the encrypted data key alongside the locally encrypted data
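The envelope pattern above can be sketched end to end; note the XOR function below is a toy stand-in cipher used only to show the key-wrapping flow, not real encryption (KMS uses AES under the hood):

```python
import secrets

def xor(data: bytes, key: bytes) -> bytes:
    """Toy stand-in cipher - NOT real encryption, illustration only."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def envelope_encrypt(plaintext: bytes, master_key: bytes):
    data_key = secrets.token_bytes(16)        # 1. generate a data key
    ciphertext = xor(plaintext, data_key)     # 2. encrypt data locally
    wrapped_key = xor(data_key, master_key)   # 3. wrap the data key with the master key
    # The plaintext data key is now discarded; the wrapped key is
    # stored alongside the locally encrypted data.
    return ciphertext, wrapped_key

def envelope_decrypt(ciphertext: bytes, wrapped_key: bytes, master_key: bytes):
    data_key = xor(wrapped_key, master_key)   # unwrap the data key
    return xor(ciphertext, data_key)

master = secrets.token_bytes(16)
ct, wk = envelope_encrypt(b"hello", master)
assert envelope_decrypt(ct, wk, master) == b"hello"
```

The point of the pattern: the master key never touches the bulk data, and losing the stored wrapped key is harmless without the master key.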
Server-Side Encryption with Customer-Provided Keys (SSE-C) - you manage the encryption keys, and S3 manages the encryption as it writes to disks and the decryption when you access your objects
- Total volume of data and number of objects you can store are unlimited
- Individual S3 objects - range in size from 0 bytes to 5 TB
- Largest object in a single PUT - 5 GB
- Objects larger than 100 MB - use multipart upload
- Max number of S3 buckets by default - 100 buckets per account
- Best practice for IAM is to create roles that have specific access to an AWS service, then give the user permission to the AWS service via the role
- AWS STS AssumeRole - API that assumes a role; you pass the ARN of the role to use
- To authenticate and authorize users to access images in an S3 bucket:
  - authenticate the user at the application level, and use STS tokens to grant access
  - use a key-based naming scheme derived from UserIDs for all user objects in a single S3 bucket
- For Elastic Container Service, to ensure that instances of containers cannot access other containers, use Task IAM Roles
- aws sts decode-authorization-message:
  - decodes additional information about the authorization status of an error message
  - creates a human-readable error message
- Get the context keys first
  - use the aws iam simulate-custom-policy command
- Cognito
  - sign-in & sign-up services
  - social sign-in with FB, Google, etc.
  - user directory management
  - comes with MFA
  - provides temporary AWS credentials for users who are guests (unauthenticated) and for users who have been authenticated and received a token
  - an Identity Pool is a store of user identity data specific to your account
  - supports authentication with SAML
  - gives developers control and insight into their data stored in Cognito
- Uses Route 53 and DNS for request routing
- ElastiCache - external in-memory cache key-value store, used for improving session management
  - makes it easy to deploy and run `Memcached` or `Redis`
  - keeps session state in a central location, so all web servers can share a single copy
    - allows the ELB to send requests to any web server, for better distribution
    - Auto Scaling can terminate web servers without losing session state info
- Redis
  - Redis sorted sets help with applications featuring leaderboards
  - use Redis if you want high availability
- Sticky sessions cause a non-even distribution of sessions across the ELB
  - the ELB sends every request from a specific user to the same web server
  - this greatly limits elasticity
  - the ELB cannot distribute traffic evenly and sends a disproportionate amount of traffic to one server
  - Auto Scaling cannot terminate web servers without losing some users' session state
- Access logs - capture detailed information about requests sent to your load balancer
- Can create an internal load balancer or an internet-facing load balancer
  - internet-facing load balancer - create it in a public subnet
- If you get an `UNPROTECTED PRIVATE KEY FILE` error, run `chmod 400 yourPem.pem`
- If you want to pass a bootstrap script to an instance, place the script in the User Data for the instance
- Elastic IP - a static IPv4 address designed for dynamic cloud computing
  - can mask the failure of an instance or software by rapidly remapping the address to another instance in your account
- Can share an AMI with specific AWS accounts without making the AMI public
  - can only be shared within a region
  - AMIs are a regional resource, so sharing an AMI makes it available in that region
  - to make it available in a different region, copy the AMI to that region and then share it
EBS - Elastic Block Storage
- By default, the root volume is deleted when the instance terminates
- Data on other EBS volumes persists after the instance terminates by default
- For an EC2 instance launched with default parameters, `DeleteOnTermination` will be true for the root volume
Instance Stores
- Data persists only during the life of the instance; data is lost when the instance is terminated
EBS Encryption
- Uses SSE-KMS encryption with Customer Master Keys (CMKs)
- A CMK is created automatically, but you can create your own
- RegisterImage - Final step when creating an AMI
- DescribeImages - Describe one or more images(AMIs) that are available to you
- EC2 will need EIP or public IP assigned to it in order to connect to the Internet and send data in or out
- There is no charge for the VPC itself - no hourly rate
- A route table contains a set of rules, called routes, that are used to determine where network traffic is directed
  - Each subnet MUST be associated with a route table
  - A subnet can only be associated with one route table
  - A route table can have multiple subnets
- Connect your VPC to remote networks by using VPN connection
SQS - fast, reliable, scalable, fully managed message queuing service
- makes it simple & cost-effective to decouple the components of a cloud application
- You can use SQS to transmit any volume of data without losing messages or requiring other services to be always available
Fanout - a common design pattern where a message published to an SNS topic is distributed to a number of SQS queues in parallel
- you can build applications that take advantage of parallel, asynchronous processing
- SQS messages can be delivered to applications that require immediate notification of an event, and the messages also persist in an SQS queue for other apps to process later
- a message attribute's Name, Type, and Value, and the message body, must not be empty or null
- If pricing has 2 tiers (customers & guests), use SQS and have the application process the high-priority (customer) queue first
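The two-tier pattern is simply "drain the high-priority queue before touching the low-priority one". A minimal sketch, with plain lists standing in for the two SQS queues (queue contents are made up):

```python
def next_message(high_queue: list, low_queue: list):
    """Return the next message to process, customers before guests.

    Lists stand in for two separate SQS queues; a real worker would
    issue ReceiveMessage against the high-priority queue first.
    """
    if high_queue:
        return high_queue.pop(0)
    if low_queue:
        return low_queue.pop(0)
    return None  # both queues empty

customers = ["order-1"]
guests = ["browse-1", "browse-2"]
print(next_message(customers, guests))  # order-1
```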
Lazy Loading - loads data into the cache only when requested
- Advantages
  - only requested data is cached
  - node failures are not fatal
- Disadvantages
  - cache miss penalty
  - stale data
Write-Through - adds or updates data in the cache whenever data is written to the database
- Advantages
  - data is never stale
- Disadvantages
  - write penalty: 2 trips (write to cache, write to DB)
  - missing data (a new or replaced node starts empty)
  - waste of resources, since some data is never read
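The two strategies above can be contrasted in a few lines; a dict stands in for the backing database, and the class names are invented for the sketch:

```python
class Cache:
    """Dict-backed sketch contrasting lazy loading and write-through."""

    def __init__(self, db: dict):
        self.db = db        # stand-in for the backing database
        self.store = {}     # the cache itself

    def get_lazy(self, key):
        # Lazy loading: populate the cache only on a miss.
        if key not in self.store:           # cache miss penalty paid here
            self.store[key] = self.db[key]  # load from the database
        return self.store[key]

    def put_write_through(self, key, value):
        # Write-through: two trips on every write, but never stale.
        self.store[key] = value
        self.db[key] = value

db = {"a": 1}
c = Cache(db)
print(c.get_lazy("a"))      # miss, loads from db, then cached
c.put_write_through("b", 2)
```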
Max SQS message size - 256 KB
- Use `SetQueueAttributes` to set the `MaximumMessageSize` attribute
- To send messages larger than 256 KB, use the Amazon SQS Extended Client Library for Java
Max SQS queues created - no limit
SQS free tier - 1 million requests per month
Max SQS visibility timeout - 12 hours
Is SQS PCI DSS certified? - yes
Is anonymous access allowed? - yes
Standard queues (default) - best-effort ordering; messages delivered at least once
- loose-FIFO capability
- receiving messages in the exact order sent is not guaranteed
FIFO queues (first-in, first-out) - ordering strictly preserved; messages delivered exactly once, no duplicates
Short polling - returns immediately, even if the message queue being polled is empty
Long polling - doesn't return a response until a message arrives in the message queue, or the long poll times out
- makes it inexpensive to retrieve messages from your SQS queue
- use it to reduce costs, because it reduces empty receives
- to enable: set the value of `ReceiveMessageWaitTimeSeconds` to greater than 0 and less than or equal to 20 seconds
- to be cost-effective, use long polling and make SQS API calls in batches
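Enabling long polling is a one-attribute change; a small builder that validates the 1-20 second range before producing the attributes dict for `SetQueueAttributes` (the validation helper itself is just an illustration):

```python
def long_polling_attributes(wait_seconds: int) -> dict:
    """Attributes dict for SetQueueAttributes enabling long polling.

    ReceiveMessageWaitTimeSeconds must be 1-20; 0 means short polling.
    """
    if not 0 < wait_seconds <= 20:
        raise ValueError("wait time must be between 1 and 20 seconds")
    # SQS attribute values are passed as strings.
    return {"ReceiveMessageWaitTimeSeconds": str(wait_seconds)}

print(long_polling_attributes(20))  # {'ReceiveMessageWaitTimeSeconds': '20'}
```

The same attribute can also be overridden per call by passing `WaitTimeSeconds` to `ReceiveMessage`.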
- What you can expect to see in an SNS message body:
  - Type
  - TopicArn
  - Subject
  - Signature
  - MessageId
  - Message
  - Timestamp
  - SignatureVersion
  - SigningCertURL
  - UnsubscribeURL
- Mobile push setup:
  - submit notification credentials to SNS
  - receive a registration ID for each mobile device
  - pass the device token to SNS
  - SNS creates a mobile subscription endpoint for each device
- Real time application and system monitoring
- track metrics, collect & monitor log files, set alarms
- High-resolution metrics - you can set an alarm and specify a high-resolution alarm with a period of 10 seconds or 30 seconds
- if error data is being received intermittently, collect and aggregate the results at regular intervals, then send the data to CloudWatch
- install the CloudWatch agent on an instance, then configure it to send the web server's logs to a central location in CloudWatch
Cross-Account Role
- You can configure access to AWS CodeCommit repositories for IAM users and groups in another AWS account.
- Create a cross-account role and give the role the privileges
- Provide the role ARN to the developers
- Fully managed build service in the cloud
- compiles your source code, runs unit tests, and produces artifacts that are ready to deploy
- Use the AWS CLI to specify different parameters that need to be run for the build
  - run the command with the `buildspec-location` property to set a new buildspec.yml file
- Continuous delivery service that enables you to model, visualize, and automate the steps required to release your serverless application
- If code will be picked up from an S3 bucket and you would like it encrypted at rest:
  - ensure server-side encryption is enabled on the S3 bucket
  - configure AWS KMS with customer-managed keys and use it for the S3 bucket encryption
- Use one account for the pipeline and another for AWS CodeDeploy for security reasons
  - to do so, you must create a customer master key in KMS and add cross-account access
- You can build a custom action for your pipeline
- CodePipeline wizard - creates an S3 artifact bucket and default AWS-managed SSE-KMS encryption keys
- If a failure is detected in the build stage, the entire process stops
- Jenkins - if you use Jenkins as a build provider, configure an EC2 instance with Jenkins installed, then allow an IAM role for EC2 to access CodePipeline
- Provides deployments according to established best-practice methods
- The AppSpec file can be in JSON or YAML, and can be changed in the console
  - tells which Lambda version to deploy
  - tells which functions are to be used as validation tests
- Specify the --with-decryption option; this allows the CodeDeploy service to decrypt the password so that it can be used in the application
- Use IAM roles to ensure the CodeDeploy service can access the KMS service
- 3 ways traffic can shift during a deployment:
  - Canary - traffic is shifted in two increments
  - Linear - traffic is shifted in equal increments
  - All-at-once - all traffic is shifted from the original Lambda function at once
- Can develop, build, and deploy applications on AWS
- Integrates AWS services for your project toolchain
- Helps manage the complete lifecycle of a project
- Can increase the limit on concurrent Lambda executions
  - e.g., for a recursive Lambda function
  - concurrency - when 2 task executions overlap
  - it is suggested to avoid recursive code altogether
- can create different environment variables in Lambda function to point to different services
- i.e. dev, test, production
- to access data in VPC, must configure:
- Subnet ID
- Security Group ID
- Can change the timeout for a Lambda function
- To validate if your code is working as expected:
- insert logging statements into your code
- Lambda automatically integrates with Amazon Cloudwatch Logs
- Need to enable in IAM role
- NOT Cloudwatch metrics, since metrics will only give the rate at which the function is executing, will not actually help you debug
- If the deployment package of a Lambda function has many external libraries:
  - selectively include only the libraries the function actually needs
- Default settings for a Lambda function are a 3-second timeout and 128 MB of memory
- Any Lambda function invoked asynchronously is retried twice before the event is discarded
- If retries fail, use Dead Letter Queue to direct unprocessed events to SQS or SNS
- X-Ray shows traces of a Lambda function, giving you a detailed level of tracing into your downstream services
  - use it if you would like to work out how to increase performance
  - if the app is hosted on an EC2 instance and you are unable to see X-Ray traces, make sure the X-Ray daemon is installed and ensure the IAM role attached to the instance has permission to upload data to X-Ray
  - to enable X-Ray, you must assign `AWSXrayWriteOnlyAccess` to the Lambda function so that it has access to the X-Ray service
- Captures API calls and sends them to an S3 bucket
- records what request was made, the source IP, who made the request, when the request was made, etc.
- Lambda@Edge - allows you to run code across AWS locations globally, without provisioning or managing servers, triggered by Amazon CloudFront requests
- an extension of Lambda; a compute service that lets you execute functions that customize the content CloudFront delivers
- Allows you to visualize and test serverless apps as a series of steps
- automatically triggers and tracks each step, and stops when there are errors
- logs the state of each step so you can diagnose what went wrong
- An alias points to a single function version
  - when the alias is updated to point to a different function version, all requests instantly go to the updated version
  - this exposes you to potential instabilities
  - `--routing-config` helps with this by allowing you to point to two different versions of a Lambda function and dictate what percentage of incoming traffic is sent to each version
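How weighted alias routing behaves can be sketched as a per-request weighted choice; the version numbers and weights below are made-up examples mirroring the `--routing-config` weight map (extra version mapped to a traffic fraction):

```python
def pick_version(draw: float, weights: dict) -> str:
    """Choose a Lambda version for one request.

    weights maps an additional version to its traffic fraction
    (like --routing-config); draw is a number in [0, 1), e.g. from
    random.random(). Remaining traffic goes to the primary version.
    """
    cumulative = 0.0
    for version, weight in weights.items():
        cumulative += weight
        if draw < cumulative:
            return version
    return "1"  # primary version the alias points at (hypothetical)

# Send 10% of traffic to version 2, the rest to version 1.
routing = {"2": 0.10}
print(pick_version(0.05, routing))  # 2
print(pick_version(0.50, routing))  # 1
```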
- RDS supports Transparent Data Encryption (TDE) to encrypt stored data on your DB instances running Microsoft SQL Server
- API stage - if customers need to switch to a different new API within a certain amount of time, use an API stage to create 'v2'
- stage variables - name-value pairs that you can define as configuration attributes associated with a deployment stage of an API
  - act like environment variables
- API Frontend Interaction
- Modify Method Request and Method Response
- API Backend Interaction
- Modify Integration Request and Integration Response
- If you need to interact with a backend (e.g., DynamoDB), you must create an integration request to forward the incoming method request
- For a client to call your API, you must create a deployment and associate a stage with it
- Define request and response data mappings if one content type is JSON and the other is XML
- To control access to API gateway use AWS Cognito User Pool or Lambda Authorizers
- Canary Release Deployment - api traffic separated to production release and canary release
- updated api features only visible in canary
- good for test coverage or performance
- Setting up a RESTful API:
  - an API Gateway with a Lambda function to process customer information
  - expose a GET method in API Gateway
- To customize error responses, set up a gateway response for the API
- Configuration files can be in YAML or JSON and saved in .ebextensions directory
- created and managed locally
- If you are currently on t1.micro and want to change to m4.large, use the Auto Scaling group CLI command
- When you create a web server environment, Elastic Beanstalk creates one or more EC2 virtual machines to run web apps on the platform you choose
- If planning to deploy in a worker role, use cron.yaml
- Runs on EC2 instances that have no persistent local storage
- A custom AMI can improve provisioning times when instances are launched in your environment if you need to install a lot of software that isn't included in the standard AMIs
- Every time you upload a new version of your application, it creates a new application version; if you don't delete old versions, you will reach the application version limit
  - a lifecycle policy helps by deleting old versions, or versions beyond the total limit
- If you can't see any relevant environments in the Beanstalk service (e.g., Docker), use custom platforms to create one from scratch
All at once – Deploy the new version to all instances simultaneously. All instances in your environment are out of service for a short time while the deployment occurs.
Rolling – Deploy the new version in batches. Each batch is taken out of service during the deployment phase, reducing your environment's capacity by the number of instances in a batch.
Rolling with additional batch – Deploy the new version in batches, but first launch a new batch of instances to ensure full capacity during the deployment process.
Immutable – a temporary Auto Scaling group is launched outside of your environment with a separate set of instances.
- old and new instances serve traffic until the new instances pass health checks
- then the new instances are moved to your current Auto Scaling group, and the temporary Auto Scaling group and old instances are terminated
Blue/Green Deployments - deploy new version to a separate environment, then swap CNAMEs to redirect traffic to the new version instantly
- ECS - highly scalable container orchestration service that supports docker containers
- Systems Manager Parameter Store - provides secure, hierarchical storage for configuration data management and secrets management
- can store data such as passwords, db strings, and license codes as parameter values
- Kinesis - ingest REAL TIME data, analyze, and persist streaming data
- If you have multiple shards for a stream, you cannot guarantee ordering across shards, only within a single shard
- Server side encryption is a feature in Amazon Kinesis
- Kinesis Data Analytics - query data in your stream
  - build streaming applications using SQL
  - can preprocess data with Lambda
- Kinesis Data Firehose - delivers real-time streaming data to S3, Redshift, Elasticsearch, and Splunk
  - if you need to transform data before it is sent to S3, use Lambda to transform it
  - server-side data encryption can be enabled for Kinesis Firehose
    - ONLY possible if you use a Kinesis stream as your data source
    - the data is then encrypted while stored in the Kinesis stream
- CloudFormation makes systems engineers' lives easier, whereas Elastic Beanstalk (which sets things up automatically) makes developers' lives easier
- define all resources needed for deployment
- if you want to deploy a Lambda function to multiple AWS accounts, use CloudFormation, because it's infrastructure, not development
- if a CloudFormation template has a huge list of resources, break it into smaller, manageable templates, then use AWS::CloudFormation::Stack to reference the other templates
- if you need to configure software on EC2 instances, like NGINX, use the cfn-init helper script
Route 53 Weighted
- allows you to associate multiple resources with one domain name or subdomain so that you can choose how much traffic is routed to each resource
- good for load balancing and testing new versions of software
- To compensate for network latency, use:
  - retries in application code
  - an exponential backoff algorithm
    - progressively longer waits between retries for consecutive error responses
    - can help stagger the rate of API calls
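The backoff schedule above is just "base times two to the attempt, capped"; a minimal sketch (the base, cap, and attempt count are arbitrary example values, and real clients usually add random jitter on top to stagger simultaneous callers):

```python
def backoff_delays(base=1, cap=20.0, attempts=6):
    """Exponential backoff: wait base * 2**attempt between retries,
    capped so waits don't grow without bound."""
    return [min(cap, base * 2 ** attempt) for attempt in range(attempts)]

print(backoff_delays())  # [1, 2, 4, 8, 16, 20.0]
```

A retry loop would sleep for `delays[n]` after the nth consecutive error response, resetting once a call succeeds.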
- Lambda (used to host code) and API Gateway (used to expose APIs that point to a Lambda function)
- EC2 (create the API on an EC2 instance) and Elastic Load Balancer (to do the routing)
- OpsWorks lets you use Chef and Puppet to automate how servers are configured, deployed, and managed across your Amazon EC2 instances or on-premises compute environments.
- Secure, hierarchical storage for configuration data management and secrets management
- can store passwords, database strings, and license codes
- Data warehouse