- DynamoDB is schemaless
- use a DynamoDB Stream so that when an item is updated in the primary table it is also inserted into a secondary table
- When looking for a good partition key, use one with an automatically generated GUID
- For large tables, use queries instead of scans
- To work with search queries:
  - Specify a key condition expression in the query
  - Specify the partition key name and value in an equality condition
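The query rules above can be sketched as a small parameter builder; the dict shape matches DynamoDB's low-level `Query` API (which boto3's client exposes), but the table and attribute names here are made up for illustration:

```python
def build_query_params(table_name: str, pk_name: str, pk_value: str) -> dict:
    """Build parameters for a DynamoDB Query call.

    The partition key must appear in an equality condition inside
    the KeyConditionExpression; table/key names are hypothetical.
    """
    return {
        "TableName": table_name,
        # Equality condition on the partition key, as Query requires.
        "KeyConditionExpression": f"{pk_name} = :pk",
        "ExpressionAttributeValues": {":pk": {"S": pk_value}},
    }

params = build_query_params("Orders", "CustomerId", "C-1001")
print(params["KeyConditionExpression"])  # CustomerId = :pk
```

The same dict can be passed straight to a boto3 `client("dynamodb").query(**params)` call.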
- DAX (DynamoDB Accelerator) - provides in-memory caching for DynamoDB tables
  - improves response times for eventually consistent reads only
  - point your API at the DAX cluster instead of the table
- TTL (Time to Live) - allows you to define when items in a table expire so that they can be automatically deleted from the database
- DynamoDB offers full SSE-KMS encryption at rest
  - enable it during creation of the DynamoDB table
- indexes - enable fast queries on specific data columns, faster than querying the whole table
  - give you a different view of your data based on alternate partition/sort keys
- 2 types of indexes:
  - Local Secondary Index
    - must be created when you create the table
    - same partition key as your table
    - different sort key
  - Global Secondary Index
    - can be created at any time, at table creation or after
    - different partition key
    - different sort key
- Local Secondary Index
  - faster than a sequential scan
  - when to use:
    - when the table size is 20 GB or larger
    - the table's provisioned throughput is not fully used
    - sequential scan operations are too slow
- Global Tables - deploy a multi-region, multi-master database without having to build or maintain your own replication solution
- DynamoDB Streams - capture a time-ordered sequence of item-level modifications in any table and store this information in a log for 24 hours
- To encrypt an existing unencrypted table:
  - Create a new table with encryption enabled
  - Copy the data from the existing table to the new table
- Read-after-write consistency - if you write a new key to S3, you will be able to retrieve the object immediately afterwards; any newly created object or file is visible immediately, without any delay
- Objects stored in your bucket before you set the versioning state have a version ID of null.
- when you enable versioning, existing objects in your bucket do not change
- what changes is how S3 handles the objects in future requests
- 409 Conflict - the bucket name already exists (bucket names are globally unique)
- S3 buckets in ALL regions provide read-after-write consistency for PUTs of new objects and eventual consistency for overwrite PUTs and DELETEs
- S3 is perfect for uploading videos
- Can enable server-side encryption for all objects stored in S3 through a bucket policy
- Server access logs, when enabled, accumulate over time and take up more space in your S3 bucket
- Use a lifecycle configuration if you want to delete them over time
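A lifecycle rule for expiring old access logs can be sketched as a plain dict whose shape matches S3's `put_bucket_lifecycle_configuration` API; the prefix and retention period here are made-up examples:

```python
def log_expiry_rule(prefix: str, days: int) -> dict:
    """Lifecycle configuration that expires old access-log objects.

    The prefix and retention period are hypothetical; pass the
    result as LifecycleConfiguration to the S3 API.
    """
    return {
        "Rules": [{
            "ID": "expire-access-logs",
            "Filter": {"Prefix": prefix},   # only objects under this prefix
            "Status": "Enabled",
            "Expiration": {"Days": days},   # delete after N days
        }]
    }

config = log_expiry_rule("logs/", 90)
```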
- If you want to enforce MFA on an S3 bucket, add a Deny statement to the bucket policy with the condition `aws:MultiFactorAuthPresent` set to false (i.e., deny requests made without MFA)
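The usual shape of that policy, sketched as a Python dict (the bucket name is a placeholder); `BoolIfExists` also catches requests where the MFA key is absent entirely:

```python
# Bucket policy sketch: deny all S3 actions unless MFA was used.
# "example-bucket" is a placeholder name.
mfa_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyWithoutMFA",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": "arn:aws:s3:::example-bucket/*",
        "Condition": {
            # Matches when MFA is false OR the key is missing.
            "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
        },
    }],
}
```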
- Multipart upload - API that enables you to upload large objects in parts or make a copy of an existing object
  - 3-step process:
    - initiate the upload
    - upload the object parts
    - after all parts are uploaded, complete the multipart upload
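With boto3 the three steps map to `create_multipart_upload`, `upload_part`, and `complete_multipart_upload`; the part-splitting that feeds the upload step can be sketched like this (real parts other than the last must be at least 5 MB; a tiny part size is used here only to keep the sketch testable):

```python
def split_into_parts(data: bytes, part_size: int):
    """Split an object into numbered parts for a multipart upload.

    Part numbers start at 1, as S3 requires; part_size here is
    artificially small for illustration.
    """
    return [
        (i + 1, data[off:off + part_size])
        for i, off in enumerate(range(0, len(data), part_size))
    ]

parts = split_into_parts(b"x" * 10, 4)
print([n for n, _ in parts])  # [1, 2, 3]
```

Each `(part_number, chunk)` pair would become one `upload_part` call, and the returned ETags are collected for the final `complete_multipart_upload` request.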
- Multi-Object Delete - delete large numbers of objects from S3
- If KMS encryption is enabled, you are actually making extra KMS API calls, which can be throttled and cause performance issues
2 main approaches to performance optimization:
- GET-intensive workloads - use CloudFront
- Mixed workloads - avoid sequential key names for your S3 objects
  - Instead, add a random prefix like a hex hash to the key name to prevent multiple objects from being stored on the same partition
  - example: exampleawsbucket/12310sd-13-3-11/photo1.jpg
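The hex-hash prefix trick can be sketched with `hashlib` (this was S3's historical key-naming guidance; the key name below is a made-up example):

```python
import hashlib

def prefixed_key(key: str) -> str:
    """Prepend a short hex hash so sequential key names spread
    across S3 partitions instead of clustering on one."""
    prefix = hashlib.md5(key.encode()).hexdigest()[:4]
    return f"{prefix}/{key}"

print(prefixed_key("2014-01-01/photo1.jpg"))
```

The prefix is deterministic, so the application can recompute it from the original key whenever it needs to read the object back.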
- CloudFront - uses edge locations (where content is cached; you can also write content to them)
- Do not forget to give read permission on your `index.html` so EVERYONE has access for static web hosting - by default all S3 resources are private, and only the AWS account that created the resources can access them
- To allow read access, add a bucket policy that allows the s3:GetObject permission with a condition using the aws:Referer key, so that the GET request must originate from specific webpages
- Buckets can contain both encrypted and non-encrypted objects
- Use the `x-amz-server-side-encryption` request header to cause an object to be server-side encrypted
Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3) - each object is encrypted with a unique key employing strong multi-factor encryption
- As an additional safeguard, S3 encrypts the key itself with a master key that it regularly rotates
- S3 server-side encryption uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard (AES-256), to encrypt your data
Server-Side Encryption with AWS KMS-Managed Keys (SSE-KMS) - uses an envelope key (a key that protects your data's encryption key), which provides added protection against unauthorized access to your objects in S3
- similar to SSE-S3, but with additional benefits
  - provides an audit trail of when your key was last used and by whom
  - option of managing the encryption keys on your own
- Encryption process:
  - use the customer master key to generate a data key for the encryption process
  - use the plaintext data encryption key to encrypt the data locally, then erase it from memory
  - store the encrypted data key alongside the locally encrypted data
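The envelope pattern above can be sketched end to end; note the XOR function below is a toy stand-in cipher used only to show the key-wrapping flow, not real encryption (KMS uses AES under the hood):

```python
import secrets

def xor(data: bytes, key: bytes) -> bytes:
    """Toy stand-in cipher - NOT real encryption, illustration only."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def envelope_encrypt(plaintext: bytes, master_key: bytes):
    data_key = secrets.token_bytes(16)        # 1. generate a data key
    ciphertext = xor(plaintext, data_key)     # 2. encrypt data locally
    wrapped_key = xor(data_key, master_key)   # 3. wrap the data key with the master key
    # The plaintext data key is now discarded; the wrapped key is
    # stored alongside the locally encrypted data.
    return ciphertext, wrapped_key

def envelope_decrypt(ciphertext: bytes, wrapped_key: bytes, master_key: bytes):
    data_key = xor(wrapped_key, master_key)   # unwrap the data key
    return xor(ciphertext, data_key)

master = secrets.token_bytes(16)
ct, wk = envelope_encrypt(b"hello", master)
assert envelope_decrypt(ct, wk, master) == b"hello"
```

The point of the pattern: the master key never touches the bulk data, and losing the stored wrapped key is harmless without the master key.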
Server-Side Encryption with Customer-Provided Keys (SSE-C) - you manage the encryption keys, and S3 manages the encryption as it writes to disks and the decryption when you access your objects
- Total volume of data and number of objects you can store are unlimited
- Individual S3 objects - range in size from 0 bytes to 5 TB
- Largest object in a single PUT - 5 GB
- Objects larger than 100 MB - use multipart upload
- Max number of S3 buckets by default - 100 buckets per account
- Best practice for IAM is to create roles that have specific access to an AWS service, then give the user permission to the AWS service via the role
- AWS STS AssumeRole - API that assumes a role; you pass the ARN of the role to use
- To authenticate and authorize users to access images in an S3 bucket:
  - authenticate the user at the application level, and use STS tokens to grant access
  - use a key-based naming scheme derived from UserIDs for all user objects in a single S3 bucket
- For Elastic Container Service, to ensure that instances of containers cannot access other containers, use Task IAM Roles
- aws sts decode-authorization-message:
  - decodes additional information about the authorization status of an error message
  - creates a human-readable error message
- Get the context keys first
  - use the aws iam simulate-custom-policy command
- Cognito
  - sign-in & sign-up services
  - social sign-in with FB, Google, etc.
  - user directory management
  - comes with MFA
  - provides temporary AWS credentials for users who are guests (unauthenticated) and for users who have been authenticated and received a token
  - an Identity Pool is a store of user identity data specific to your account
  - supports authentication with SAML
  - gives developers control and insight into their data stored in Cognito
- Uses Route 53 and DNS for request routing
- ElastiCache - external in-memory cache key-value store, used for improving session management
  - makes it easy to deploy and run `Memcached` or `Redis`
  - keeps session state in a central location, so all web servers can share a single copy
    - allows the ELB to send requests to any web server, for better distribution
    - Auto Scaling can terminate web servers without losing session state info
- Redis
  - Redis sorted sets help with applications featuring leaderboards
  - use Redis if you want high availability
- Sticky sessions cause a non-even distribution of sessions across the ELB
  - the ELB sends every request from a specific user to the same web server
  - this greatly limits elasticity
  - the ELB cannot distribute traffic evenly and sends a disproportionate amount of traffic to one server
  - Auto Scaling cannot terminate web servers without losing some users' session state
- Access logs - capture detailed information about requests sent to your load balancer
- Can create an internal load balancer or an internet-facing load balancer
  - internet-facing load balancer - create it in a public subnet
- If you get an `UNPROTECTED PRIVATE KEY FILE` error, run `chmod 400 yourPem.pem`
- If you want to pass a bootstrap script to an instance, place the script in the User Data for the instance
- Elastic IP - a static IPv4 address designed for dynamic cloud computing
  - can mask the failure of an instance or software by rapidly remapping the address to another instance in your account
- Can share an AMI with specific AWS accounts without making the AMI public
  - can only be shared within a region
  - AMIs are a regional resource, so sharing an AMI makes it available in that region
  - to make it available in a different region, copy the AMI to that region and then share it
EBS - Elastic Block Storage
- By default, the root volume is deleted when the instance terminates
- Data on other EBS volumes persists after the instance terminates by default
- For an EC2 instance launched with default parameters, `DeleteOnTermination` will be true for the root volume
Instance Stores
- Data persists only during the life of the instance; data is lost when the instance is terminated
EBS Encryption
- Uses SSE-KMS encryption with Customer Master Keys (CMKs)
- A CMK is created automatically, but you can create your own
- RegisterImage - Final step when creating an AMI
- DescribeImages - Describe one or more images(AMIs) that are available to you
- EC2 will need EIP or public IP assigned to it in order to connect to the Internet and send data in or out
- There is no charge for the VPC itself - no hourly rate
- A route table contains a set of rules, called routes, that are used to determine where network traffic is directed
  - Each subnet MUST be associated with a route table
  - A subnet can only be associated with one route table
  - A route table can have multiple subnets
- Connect your VPC to remote networks by using VPN connection
SQS - fast, reliable, scalable, fully managed message queuing service
- makes it simple & cost-effective to decouple the components of a cloud application
- You can use SQS to transmit any volume of data without losing messages or requiring other services to be always available
Fanout - a common design pattern where a message published to an SNS topic is distributed to a number of SQS queues in parallel
- you can build applications that take advantage of parallel, asynchronous processing
- SQS messages can be delivered to applications that require immediate notification of an event, and the messages also persist in an SQS queue for other apps to process later
- a message attribute's Name, Type, and Value, and the message body, must not be empty or null
- If pricing has 2 tiers (customers & guests), use SQS and have the application process the high-priority (customer) queue first
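The two-tier pattern is simply "drain the high-priority queue before touching the low-priority one". A minimal sketch, with plain lists standing in for the two SQS queues (queue contents are made up):

```python
def next_message(high_queue: list, low_queue: list):
    """Return the next message to process, customers before guests.

    Lists stand in for two separate SQS queues; a real worker would
    issue ReceiveMessage against the high-priority queue first.
    """
    if high_queue:
        return high_queue.pop(0)
    if low_queue:
        return low_queue.pop(0)
    return None  # both queues empty

customers = ["order-1"]
guests = ["browse-1", "browse-2"]
print(next_message(customers, guests))  # order-1
```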
Lazy Loading - loads data into the cache only when requested
- Advantages
  - only requested data is cached
  - node failures are not fatal
- Disadvantages
  - cache miss penalty
  - stale data
Write-Through - adds or updates data in the cache whenever data is written to the database
- Advantages
  - data is never stale
- Disadvantages
  - write penalty: 2 trips (write to cache, write to DB)
  - missing data (a new or replaced node starts empty)
  - waste of resources, since some data is never read
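The two strategies above can be contrasted in a few lines; a dict stands in for the backing database, and the class names are invented for the sketch:

```python
class Cache:
    """Dict-backed sketch contrasting lazy loading and write-through."""

    def __init__(self, db: dict):
        self.db = db        # stand-in for the backing database
        self.store = {}     # the cache itself

    def get_lazy(self, key):
        # Lazy loading: populate the cache only on a miss.
        if key not in self.store:           # cache miss penalty paid here
            self.store[key] = self.db[key]  # load from the database
        return self.store[key]

    def put_write_through(self, key, value):
        # Write-through: two trips on every write, but never stale.
        self.store[key] = value
        self.db[key] = value

db = {"a": 1}
c = Cache(db)
print(c.get_lazy("a"))      # miss, loads from db, then cached
c.put_write_through("b", 2)
```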
Max SQS message size - 256 KB
- Use `SetQueueAttributes` to set the `MaximumMessageSize` attribute
- To send messages larger than 256 KB, use the Amazon SQS Extended Client Library for Java
Max SQS queues created - no limit
SQS free tier - 1 million requests per month
Max SQS visibility timeout - 12 hours
Is SQS PCI DSS certified? - yes
Is anonymous access allowed? - yes
Standard queues (default) - best-effort ordering; messages delivered at least once
- loose-FIFO capability
- receiving messages in the exact order sent is not guaranteed
FIFO queues (first-in, first-out) - ordering strictly preserved; messages delivered exactly once, no duplicates
Short polling - returns immediately, even if the message queue being polled is empty
Long polling - doesn't return a response until a message arrives in the message queue, or the long poll times out
- makes it inexpensive to retrieve messages from your SQS queue
- use it to reduce costs, because it reduces empty receives
- to enable: set the value of `ReceiveMessageWaitTimeSeconds` to greater than 0 and less than or equal to 20 seconds
- to be cost-effective, use long polling and make SQS API calls in batches
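Enabling long polling is a one-attribute change; a small builder that validates the 1-20 second range before producing the attributes dict for `SetQueueAttributes` (the validation helper itself is just an illustration):

```python
def long_polling_attributes(wait_seconds: int) -> dict:
    """Attributes dict for SetQueueAttributes enabling long polling.

    ReceiveMessageWaitTimeSeconds must be 1-20; 0 means short polling.
    """
    if not 0 < wait_seconds <= 20:
        raise ValueError("wait time must be between 1 and 20 seconds")
    # SQS attribute values are passed as strings.
    return {"ReceiveMessageWaitTimeSeconds": str(wait_seconds)}

print(long_polling_attributes(20))  # {'ReceiveMessageWaitTimeSeconds': '20'}
```

The same attribute can also be overridden per call by passing `WaitTimeSeconds` to `ReceiveMessage`.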
- What you can expect to see in an SNS message body:
  - Type
  - TopicArn
  - Subject
  - Signature
  - MessageId
  - Message
  - Timestamp
  - SignatureVersion
  - SigningCertURL
  - UnsubscribeURL
- Mobile push setup:
  - submit notification credentials to SNS
  - receive a registration ID for each mobile device
  - pass the device token to SNS
  - SNS creates a mobile subscription endpoint for each device
- Real time application and system monitoring
- track metrics, collect & monitor log files, set alarms
- High-resolution metrics - you can set an alarm and specify a high-resolution alarm with a period of 10 seconds or 30 seconds
- if error data is being received intermittently, collect and aggregate the results at regular intervals, then send the data to CloudWatch
- install the CloudWatch agent on an instance, then configure it to send the web server's logs to a central location in CloudWatch
Cross-Account Role
- You can configure access to AWS CodeCommit repositories for IAM users and groups in another AWS account.
- Create a cross-account role and give the role the privileges
- Provide the role ARN to the developers
- Fully managed build service in the cloud
- compiles your source code, runs unit tests, and produces artifacts that are ready to deploy
- Use the AWS CLI to specify different parameters that need to be run for the build
  - run the command with the `buildspec-location` property to set a new buildspec.yml file
- Continuous delivery service that enables you to model, visualize, and automate the steps required to release your serverless application
- If code will be picked up from an S3 bucket and you would like it encrypted at rest:
  - ensure server-side encryption is enabled on the S3 bucket
  - configure AWS KMS with customer-managed keys and use it for the S3 bucket encryption
- Use one account for the pipeline and another for AWS CodeDeploy for security reasons
  - to do so, you must create a customer master key in KMS and add cross-account access
- You can build a custom action for your pipeline
- CodePipeline wizard - creates an S3 artifact bucket and default AWS-managed SSE-KMS encryption keys
- If a failure is detected in the build stage, the entire process stops
- Jenkins - if you use Jenkins as a build provider, configure an EC2 instance with Jenkins installed, then allow an IAM role for EC2 to access CodePipeline
- Provides deployments according to established best-practice methods
- The AppSpec file can be in JSON or YAML, and can be changed in the console
  - tells which Lambda version to deploy
  - tells which functions are to be used as validation tests
- Specify the --with-decryption option; this allows the CodeDeploy service to decrypt the password so that it can be used in the application
- Use IAM roles to ensure the CodeDeploy service can access the KMS service
- 3 ways traffic can shift during a deployment:
  - Canary - traffic is shifted in two increments
  - Linear - traffic is shifted in equal increments
  - All-at-once - all traffic is shifted from the original Lambda function at once
- Can develop, build, and deploy applications on AWS
- Integrates AWS services for your project toolchain
- Helps manage the complete lifecycle of a project
- Can increase the limit on concurrent Lambda executions
  - e.g., for a recursive Lambda function
  - concurrency - when 2 task executions overlap
  - it is suggested to avoid recursive code altogether
- can create different environment variables in Lambda function to point to different services
- i.e. dev, test, production
- to access data in VPC, must configure:
- Subnet ID
- Security Group ID
- Can change the timeout for a Lambda function
- To validate if your code is working as expected:
- insert logging statements into your code
- Lambda automatically integrates with Amazon Cloudwatch Logs
- Need to enable in IAM role
- NOT Cloudwatch metrics, since metrics will only give the rate at which the function is executing, will not actually help you debug
- If the deployment package of a Lambda function has many external libraries:
  - selectively include only the libraries the function actually needs
- Default settings for a Lambda function are a 3-second timeout and 128 MB of memory
- Any Lambda function invoked asynchronously is retried twice before the event is discarded
- If retries fail, use Dead Letter Queue to direct unprocessed events to SQS or SNS
- X-Ray shows traces of a Lambda function, giving you a detailed level of tracing into your downstream services
  - use it if you would like to work out how to increase performance
  - if the app is hosted on an EC2 instance and you are unable to see X-Ray traces, make sure the X-Ray daemon is installed and ensure the IAM role attached to the instance has permission to upload data to X-Ray
  - to enable X-Ray, you must assign `AWSXrayWriteOnlyAccess` to the Lambda function so that it has access to the X-Ray service
- Captures API calls and sends them to an S3 bucket
- records what request was made, the source IP, who made the request, when the request was made, etc.
- Lambda@Edge - allows you to run code across AWS locations globally, without provisioning or managing servers, triggered by Amazon CloudFront requests
- an extension of Lambda; a compute service that lets you execute functions that customize the content CloudFront delivers
- Allows you to visualize and test serverless apps as a series of steps
- automatically triggers and tracks each step, and stops when there are errors
- logs the state of each step so you can diagnose what went wrong
- An alias points to a single function version
  - when the alias is updated to point to a different function version, all requests instantly go to the updated version
  - this exposes you to potential instabilities
  - `--routing-config` helps with this by allowing you to point to two different versions of a Lambda function and dictate what percentage of incoming traffic is sent to each version
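How weighted alias routing behaves can be sketched as a per-request weighted choice; the version numbers and weights below are made-up examples mirroring the `--routing-config` weight map (extra version mapped to a traffic fraction):

```python
def pick_version(draw: float, weights: dict) -> str:
    """Choose a Lambda version for one request.

    weights maps an additional version to its traffic fraction
    (like --routing-config); draw is a number in [0, 1), e.g. from
    random.random(). Remaining traffic goes to the primary version.
    """
    cumulative = 0.0
    for version, weight in weights.items():
        cumulative += weight
        if draw < cumulative:
            return version
    return "1"  # primary version the alias points at (hypothetical)

# Send 10% of traffic to version 2, the rest to version 1.
routing = {"2": 0.10}
print(pick_version(0.05, routing))  # 2
print(pick_version(0.50, routing))  # 1
```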
- RDS supports Transparent Data Encryption (TDE) to encrypt stored data on your DB instances running Microsoft SQL Server
- API stage - if customers need to switch to a different new API within a certain amount of time, use an API stage to create 'v2'
- stage variables - name-value pairs that you can define as configuration attributes associated with a deployment stage of an API
  - act like environment variables
- API Frontend Interaction
- Modify Method Request and Method Response
- API Backend Interaction
- Modify Integration Request and Integration Response
- If you need to interact with a backend (e.g., DynamoDB), you must create an integration request to forward the incoming method request
- For a client to call your API, you must create a deployment and associate a stage with it
- Define request and response data mappings if one content type is JSON and the other is XML
- To control access to API gateway use AWS Cognito User Pool or Lambda Authorizers
- Canary Release Deployment - api traffic separated to production release and canary release
- updated api features only visible in canary
- good for test coverage or performance
- Setting up a RESTful API:
  - an API Gateway with a Lambda function to process customer information
  - expose a GET method in API Gateway
- To customize error responses, set up a gateway response for the API
- Configuration files can be in YAML or JSON and saved in .ebextensions directory
- created and managed locally
- If you are currently on t1.micro and want to change to m4.large, use the Auto Scaling group CLI command
- When you create a web server environment, Elastic Beanstalk creates one or more EC2 virtual machines to run web apps on the platform you choose
- If planning to deploy in a worker role, use cron.yaml
- Runs on EC2 instances that have no persistent local storage
- A custom AMI can improve provisioning times when instances are launched in your environment if you need to install a lot of software that isn't included in the standard AMIs
- Every time you upload a new version of your application, it creates a new application version; if you don't delete old versions, you will reach the application version limit
  - a lifecycle policy helps by deleting old versions, or versions beyond the total limit
- If you can't see any relevant environments in the Beanstalk service (e.g., Docker), use custom platforms to create one from scratch
All at once – Deploy the new version to all instances simultaneously. All instances in your environment are out of service for a short time while the deployment occurs.
Rolling – Deploy the new version in batches. Each batch is taken out of service during the deployment phase, reducing your environment's capacity by the number of instances in a batch.
Rolling with additional batch – Deploy the new version in batches, but first launch a new batch of instances to ensure full capacity during the deployment process.
Immutable – a temporary Auto Scaling group is launched outside of your environment with a separate set of instances.
- old and new instances serve traffic until the new instances pass health checks
- then the new instances are moved to your current Auto Scaling group, and the temporary Auto Scaling group and old instances are terminated
Blue/Green Deployments - deploy new version to a separate environment, then swap CNAMEs to redirect traffic to the new version instantly
- ECS - highly scalable container orchestration service that supports docker containers
- Systems Manager Parameter Store - provides secure, hierarchical storage for configuration data management and secrets management
- can store data such as passwords, db strings, and license codes as parameter values
- Kinesis - ingest REAL TIME data, analyze, and persist streaming data
- If you have multiple shards for a stream, you cannot guarantee ordering across shards, only within a single shard
- Server side encryption is a feature in Amazon Kinesis
- Kinesis Data Analytics - query data in your stream
  - build streaming applications using SQL
  - can preprocess data with Lambda
- Kinesis Data Firehose - delivers real-time streaming data to S3, Redshift, Elasticsearch, and Splunk
  - if you need to transform data before it is sent to S3, use Lambda to transform it
  - server-side data encryption can be enabled for Kinesis Firehose
    - ONLY possible if you use a Kinesis stream as your data source
    - the data is then encrypted while stored in the Kinesis stream
- CloudFormation makes systems engineers' lives easier, whereas Elastic Beanstalk (which sets things up automatically) makes developers' lives easier
- define all resources needed for deployment
- if you want to deploy a Lambda function to multiple AWS accounts, use CloudFormation, because it's infrastructure, not development
- if a CloudFormation template has a huge list of resources, break it into smaller, manageable templates, then use AWS::CloudFormation::Stack to reference the other templates
- if you need to configure software on EC2 instances, like NGINX, use the cfn-init helper script
Route 53 Weighted
- allows you to associate multiple resources with one domain name or subdomain so that you can choose how much traffic is routed to each resource
- good for load balancing and testing new versions of software
- To compensate for network latency, use:
  - retries in application code
  - an exponential backoff algorithm
    - progressively longer waits between retries for consecutive error responses
    - can help stagger the rate of API calls
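The backoff schedule above is just "base times two to the attempt, capped"; a minimal sketch (the base, cap, and attempt count are arbitrary example values, and real clients usually add random jitter on top to stagger simultaneous callers):

```python
def backoff_delays(base=1, cap=20.0, attempts=6):
    """Exponential backoff: wait base * 2**attempt between retries,
    capped so waits don't grow without bound."""
    return [min(cap, base * 2 ** attempt) for attempt in range(attempts)]

print(backoff_delays())  # [1, 2, 4, 8, 16, 20.0]
```

A retry loop would sleep for `delays[n]` after the nth consecutive error response, resetting once a call succeeds.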
- Lambda (used to host code) and API Gateway (used to expose APIs that point to a Lambda function)
- EC2 (create the API on an EC2 instance) and Elastic Load Balancer (to do the routing)
- OpsWorks lets you use Chef and Puppet to automate how servers are configured, deployed, and managed across your Amazon EC2 instances or on-premises compute environments.
- Secure, hierarchical storage for configuration data management and secrets management
- can store passwords, database strings, and license codes
- Data warehouse