[AWS Certificate]-ElastiCache

Amazon ElastiCache - Overview

Fully-managed in-memory data store (caching service, to boost DB read performance)
It is a remote caching service, or a side cache i.e. separate dedicated caching instance
Provides rapid access to data across distributed nodes
Two flavors (both are open-source key-value stores)
- Amazon ElastiCache for Redis
- Amazon ElastiCache for Memcached
Sub-millisecond latency for real-time applications
Redis supports complex data types, snapshots, replication, encryption, transactions, pub/sub messaging, transactional Lua scripting, and support for geospatial data
Memcached suitable for relatively simple applications like static website caching

Database caches

Store frequently accessed data (read operations)
Improve DB performance by taking most of the read load off the DB
Three types - integrated / local / remote caches
Database integrated cache (stores data within DB)
- Typically limited by available memory and resources
- Example - Integrated cache in Aurora
  - integrated and managed cache with built-in write-through capabilities
  - enabled by default and no code changes needed
Local cache (stores data within application)
Remote cache (stores data on dedicated servers)
- Typically built upon key/value NoSQL stores
- Example - Redis and Memcached
- Support up to a million requests per second per cache node
- Offer sub-millisecond latency
- Caching of data and managing its validity is managed by your application

ElastiCache use cases

Caching Strategies

Lazy loading - loads data into the cache only when necessary

Reactive approach
Only the queried data is cached (small size)
There is a cache miss penalty
Can contain stale data (use appropriate TTL)

Write through - loads data into the cache as it gets written to the DB

Proactive approach
Data is always current (never stale)
Results in cache churn (most data is never read, use TTL to save space)

Lazy loading with write through

Get the benefits of both strategies
Always use appropriate TTL

Lazy loading illustrated
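
The flow can be sketched in a few lines of Python. This is a minimal illustration, not ElastiCache client code: `TTLCache` is a hypothetical in-memory stand-in for the remote cache, `db` is a plain dict standing in for the database, and the `now` argument exists only so the TTL behavior can be shown without waiting.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a remote cache such as Redis or Memcached."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            del self._store[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key, value, ttl, now=None):
        now = time.monotonic() if now is None else now
        self._store[key] = (value, now + ttl)


def get_user(user_id, cache, db, ttl=300, now=None):
    """Lazy loading: check the cache first, fall back to the DB on a miss."""
    key = f"user:{user_id}"
    cached = cache.get(key, now=now)
    if cached is not None:
        return cached, "hit"
    value = db[user_id]                  # cache miss penalty: extra DB round trip
    cache.set(key, value, ttl, now=now)  # only queried data ends up in the cache
    return value, "miss"


db = {42: {"name": "Ada"}}                # hypothetical database table
cache = TTLCache()
first = get_user(42, cache, db, now=0)    # miss, loaded from the DB
second = get_user(42, cache, db, now=10)  # hit, served from the cache
db[42] = {"name": "Ada L."}               # the DB changes behind the cache
stale = get_user(42, cache, db, now=20)   # still a hit, so the old value is returned
fresh = get_user(42, cache, db, now=400)  # TTL has expired, reloaded from the DB
```

The `stale` read shows why lazy loading needs an appropriate TTL: nothing updates the cached copy until it expires.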

Write through illustrated
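
A minimal sketch of the write path, assuming plain dicts stand in for both the database and the remote cache (the function names and key format are made up for illustration). Every write goes to the DB and to the cache in the same operation, so reads never see an older value than the DB holds.

```python
def save_user(user_id, value, cache, db):
    """Write through: update the DB, then write the same value to the cache."""
    db[user_id] = value
    cache[f"user:{user_id}"] = value


def read_user(user_id, cache):
    """Reads are served from the cache, which the write path keeps current."""
    return cache.get(f"user:{user_id}")


db = {}     # hypothetical database table
cache = {}  # plain dict standing in for the remote cache

save_user(1, {"name": "Ada"}, cache, db)
save_user(1, {"name": "Ada L."}, cache, db)  # the update reaches the cache immediately
save_user(2, {"name": "Bob"}, cache, db)     # cached even if nobody ever reads it
current = read_user(1, cache)
```

The entry for user 2 illustrates cache churn: it occupies cache memory whether or not it is ever read, which is why a TTL is still useful with this strategy.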

User session store illustrated

User logs into any instance of the application
The application writes the session data into ElastiCache
The user hits another instance of our application
The instance retrieves the data and the user is already logged in
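
The four steps above can be sketched as follows. This is an illustration under simple assumptions: a shared dict plays the role of ElastiCache, and `AppInstance` and the `session:<token>` key format are hypothetical names, not part of any AWS SDK.

```python
import secrets


class AppInstance:
    """One application server; session state lives in the shared store, not locally."""

    def __init__(self, name, session_store):
        self.name = name
        self.sessions = session_store

    def login(self, username):
        token = secrets.token_hex(16)
        self.sessions[f"session:{token}"] = {"user": username}
        return token

    def current_user(self, token):
        session = self.sessions.get(f"session:{token}")
        return None if session is None else session["user"]


shared_store = {}  # stands in for the ElastiCache cluster
instance_a = AppInstance("a", shared_store)
instance_b = AppInstance("b", shared_store)

token = instance_a.login("alice")                # steps 1 and 2
user_seen_by_b = instance_b.current_user(token)  # steps 3 and 4
```

Because both instances read the same store, the second instance recognizes the session without the user logging in again.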

Redis Architecture - Cluster mode disabled

Redis clusters are generally placed in private subnets
Accessed from EC2 instance placed in a public subnet in a VPC
Cluster mode disabled - single shard
A shard has a primary node and 0-5 replicas
A shard with replicas is also called a replication group
Replicas can be deployed as Multi-AZ
Multi-AZ replicas support Auto-Failover capability
Single reader endpoint (auto updates replica endpoint changes)

Redis Architecture - Cluster mode enabled

Cluster mode enabled - multiple shards
Data is distributed across the available shards
A shard has a primary node and 0-5 replicas
Multi-AZ replicas support Auto-Failover capability
Max 90 nodes per cluster (from 90 shards with no replicas to 15 shards with 5 replicas each)
Minimum 3 shards recommended for HA
Use Nitro system-based node types for higher performance (e.g. M5 / R5 etc)
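
To make "data is distributed across the available shards" concrete, here is a small sketch of how a key maps to a shard. Redis Cluster hashes each key with CRC16 (XMODEM variant) modulo 16384 hash slots. The helper names are made up, hash tags (`{...}`) are ignored for brevity, and the equal contiguous slot ranges per shard are an assumption for illustration, since slot ranges can also be assigned unevenly.

```python
import binascii

TOTAL_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


def key_slot(key: str) -> int:
    """CRC16 (XMODEM) of the key, modulo 16384. Hash tags are not handled here."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % TOTAL_SLOTS


def shard_for_slot(slot: int, num_shards: int) -> int:
    """Assumes slots are split into equal contiguous ranges, one range per shard."""
    return slot * num_shards // TOTAL_SLOTS


slot = key_slot("user:1001")
shard = shard_for_slot(slot, num_shards=3)
print(f"user:1001 -> slot {slot} -> shard {shard}")
```

A cluster-aware client performs this mapping itself after discovering the slot layout, which is why the application does not need to know which shard holds a key.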

Redis Multi-AZ with Auto-Failover

Fails over to a replica node on outage
Minimal downtime (typically 3-6 minutes)
ASYNC replication (=can have some data loss due to replication lag)
Manual reboot does not trigger auto-failover
(other reboots/failures do)
You can simulate/test a failover using AWS console / CLI / API
During planned maintenance for auto-failover enabled clusters
- If cluster mode enabled - no write interruption
- If cluster mode disabled - brief write interruption (few seconds)

Redis Backup and Restore

Supports manual and automatic backups
Backups are a point-in-time copy of the entire Redis cluster, can't backup individual nodes
Can be used to warm start a new cluster (=preloaded data)
Can backup from primary node or from replica
Recommended to backup from a replica (ensures primary node performance)
Backups (also called snapshots) are stored in S3
Can export snapshots to your S3 buckets in the same region
Can then copy the exported snapshot to other region / account using S3 API

Redis Scaling

Cluster Mode Disabled

Vertical Scaling
- Scale up / scale down node type
- minimal downtime
Horizontal Scaling
- add/remove replica nodes
- if Multi-AZ with automatic failover is enabled, you cannot remove the last replica

Cluster Mode Enabled

Vertical Scaling (Online)
- scale up / scale down node type
- no downtime
Horizontal scaling (=resharding and shard rebalancing)
- allows partitioning across shards
- add/remove/rebalance shards
- resharding = change the number of shards as needed
- shard rebalancing = ensure that data is equally distributed across shards
- two modes - offline (with downtime) and online (no downtime)

Horizontal scaling - resharding / rebalancing

	Online Mode (=no downtime)	Offline Mode (=downtime)
Cluster availability during scaling up	YES	NO
Can scale out / scale in / rebalance	YES	YES
Can scale up / down (change node type)	NO	YES
Can upgrade engine version	NO	YES
Can specify the number of replica nodes in each shard independently	NO	YES
Can specify the keyspace for shards independently	NO	YES

Redis Replication

Cluster Mode Disabled	Cluster Mode Enabled
1 Shard	Up to 90 shards
0-5 replicas	0-5 replicas per shard
If 0 replicas, primary failure = total data loss	If 0 replicas, primary failure = total data loss in that shard
Multi-AZ supported	Multi-AZ required
Supports scaling	Supports partitioning
If primary load is read-heavy, you can scale the cluster (though up to 5 replicas max)	Good for write-heavy workloads (you get additional write endpoints, one per shard)

Redis - Global Datastore

Allows you to create cross region replicas for Redis
Single writer cluster (primary cluster), multiple reader clusters (secondary clusters)
Can replicate to up to two other regions
Improves local latency (bring data closer to your users)
Provides for DR (you can manually promote a secondary cluster to be primary, not automatic)
Not available for single node clusters (must convert it to a replication group first)
Security for cross-region communication is provided through VPC peering
Cluster cannot be modified / resized as usual
- you scale the clusters by modifying the global datastore
- all member clusters will get scaled
To modify a global datastore's parameters
- modify the parameter group of any member cluster
- Change gets applied to all member clusters automatically
Data is replicated cross-region in < 1 sec (typically, not an SLA)
RPO (typical) < 1 sec (amt of data loss due to disaster)
RTO (typical) < 1 min (time taken for DR)

Redis - Good things to know

Replica lag may grow and shrink over time. If a replica is too far behind the primary, reboot it
In case of latency/throughput issues, scaling out the cluster helps
In case of memory pressure, scaling out the cluster helps
If the cluster is over-scaled, you can scale in to reduce costs
In case of online scaling
- cluster remains available, but with some performance degradation
- level of degradation would depend on CPU utilization and amount of data
You cannot change Redis cluster mode after creating it (can create a new cluster and warm start it with existing data)
All nodes within a cluster are of the same instance type

Redis best practice

Cluster mode enabled - connect using the configuration endpoint (allows for auto-discovery of the shard and keyspace (slot) mapping)
Cluster mode disabled - use the primary endpoint for writes and the reader endpoint for reads (always kept up to date with any cluster changes)
Set the parameter reserved-memory-percent=25% (for background processes, non-data)
Keep socket timeout = 1 second (at least)
- Too low => numerous timeouts on high load
- Too high => application might take longer to detect connection issues
Keep DNS caching timeout low (TTL = 5-10 seconds recommended)
Do not use the "cache forever" option for DNS caching
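
For cluster mode disabled, the read/write split can be sketched as a small routing helper. Everything here is illustrative: the endpoint host names are hypothetical, and the set of read commands is only a subset chosen for the example. A real application would pass the chosen endpoint to its Redis client.

```python
# Illustrative subset of read-only Redis commands.
READ_COMMANDS = {"GET", "MGET", "EXISTS", "TTL", "ZRANGE"}


def pick_endpoint(command, primary_endpoint, reader_endpoint):
    """Route reads to the reader endpoint and everything else to the primary."""
    if command.upper() in READ_COMMANDS:
        return reader_endpoint
    return primary_endpoint


# Hypothetical endpoint names, for illustration only.
PRIMARY = "my-cache.example.internal"
READER = "my-cache-ro.example.internal"

read_target = pick_endpoint("get", PRIMARY, READER)
write_target = pick_endpoint("SET", PRIMARY, READER)
```

Since replication is asynchronous, a read sent to the reader endpoint right after a write may briefly return the older value.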

Redis use cases - Gaming Leaderboards

Use Redis sorted sets - automatically stores data sorted
Example - top 10 scores for a game
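
A rough sketch of the leaderboard idea, using an in-memory stand-in rather than a live Redis connection. The `SortedSet` class is hypothetical and mimics only a small part of what the Redis commands `ZADD` and `ZREVRANGE ... WITHSCORES` do (non-negative indexes only).

```python
class SortedSet:
    """In-memory stand-in that mimics a small part of Redis sorted-set behavior."""

    def __init__(self):
        self._scores = {}  # member -> score

    def zadd(self, member, score):
        """Like ZADD: insert the member or overwrite its score."""
        self._scores[member] = score

    def zrevrange_withscores(self, start, stop):
        """Like ZREVRANGE start stop WITHSCORES: highest score first, stop inclusive."""
        ranked = sorted(
            self._scores.items(),
            key=lambda item: (item[1], item[0]),
            reverse=True,
        )
        return ranked[start:stop + 1]


board = SortedSet()
board.zadd("alice", 1200)
board.zadd("bob", 950)
board.zadd("carol", 1500)
board.zadd("bob", 1300)  # bob improves his score

top_two = board.zrevrange_withscores(0, 1)
```

In Redis the sorted set keeps members ordered as they are written, so a top-10 query is `ZREVRANGE` with the range 0 to 9 and needs no sorting in the application.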

Redis use cases - Pub/sub messaging or queues

Redis use cases - Recommendation Data

Uses INCR or DECR in Redis
Using Redis hashes, you can maintain a list of who liked / disliked a product
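
One way to picture this, as a sketch with in-memory stand-ins: a counter per product plays the role of a Redis key changed with `INCR` / `DECR`, and a dict per product plays the role of a Redis hash mapping each user to their vote. The function name and data layout are assumptions for illustration.

```python
from collections import defaultdict

counters = defaultdict(int)  # stands in for Redis counters (INCR / DECR)
votes = defaultdict(dict)    # stands in for one Redis hash per product


def record_vote(product_id, user_id, liked):
    """Store the user's vote in the hash and adjust the product's net score."""
    delta = 1 if liked else -1
    previous = votes[product_id].get(user_id)
    if previous is not None:
        counters[product_id] -= previous  # undo the user's earlier vote
    votes[product_id][user_id] = delta    # hash field: user -> +1 or -1
    counters[product_id] += delta         # INCR for a like, DECR for a dislike


record_vote("p1", "alice", True)
record_vote("p1", "bob", True)
record_vote("p1", "carol", False)
record_vote("p1", "bob", False)  # bob changes his mind

net_score = counters["p1"]
likers = sorted(user for user, vote in votes["p1"].items() if vote == 1)
```

The counter answers "how popular is this product" cheaply, while the hash answers "who liked or disliked it".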

Memcached Overview

Simple in-memory key-value store with sub-millisecond latency
Automatic detection and recovery from cache node failures
Typical applications
- Session store (persistent as well as transient session data store)
- DB query results caching (relational or NoSQL DBs - RDS / DynamoDB etc.)
- Webpage caching
- API caching
- Object caching (images/files/metadata)
Well suited for web / mobile apps, gaming, IoT, ad-tech, and e-commerce

Memcached Architecture

Memcached cluster is generally placed in private subnet
Accessed from EC2 instance placed in a public subnet in a VPC
Allows access only from EC2 network (apps should be hosted on whitelisted EC2 instances)
Whitelist using security groups
Up to 20 nodes per cluster
Data is distributed across the available nodes
Replicas are not supported
Node failure = data loss
Nodes can be deployed as Multi-AZ (to reduce data loss)

Memcached Auto Discovery

Allows client to automatically identify nodes in your Memcached cluster
No need to manually connect to individual nodes
Simply connect to any one node (using configuration endpoint) and retrieve a list of all other nodes
The metadata (list of all nodes) gets updated dynamically as you add/remove nodes
Node failures are automatically detected, and nodes get replaced
Enabled by default (you must use Auto Discovery capable client)

Memcached Scaling

Vertical scaling not supported
- can resize by creating a new cluster and migrating your application
Horizontal scaling
- allows you to partition your data across multiple nodes
- up to 20 nodes per cluster and 100 nodes per region (soft limit)
- no need to change endpoints post scaling (if you use auto-discovery)
- must re-map at least some of your keyspace post scaling (evenly spread cache keys across all nodes)
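
The keyspace re-mapping can be illustrated with a quick experiment. This sketch assumes the client picks a node with simple modulo hashing (`hash(key) % number_of_nodes`), which is only one possible strategy. Clients that use consistent hashing typically move far fewer keys. The key names and node counts are made up.

```python
import hashlib


def node_for_key(key, num_nodes):
    """Modulo hashing: a stable hash of the key picks one of the nodes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


keys = [f"item:{i}" for i in range(10_000)]

# Scale out from 3 nodes to 4 and count the keys that now live on a different node.
moved = sum(1 for key in keys if node_for_key(key, 3) != node_for_key(key, 4))
moved_fraction = moved / len(keys)
print(f"{moved_fraction:.0%} of keys map to a different node after scaling out")
```

Every moved key is a cache miss on its first read after scaling, so the database sees extra load until the cache warms up again.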

Demo

Choosing between Redis and Memcached

Redis	Memcached
Sub-millisecond latency	Sub-millisecond latency
Supports complex data types (sorted sets, hashes, bitmaps, hyperloglog, geospatial index)	Supports only simple data types (string, objects)
Multi AZ with Auto-Failover, supports sharding	Multi-node for sharding
Read Replicas for scalability and HA	Non persistent
Data durability using AOF persistence	No backup and restore
Backup and restore features	Multi-threaded architecture

ElastiCache Security - Encryption

Memcached does not support encryption
Encryption at rest for Redis (using KMS)
Encryption in-transit for Redis (using TLS/SSL)
- Between server and client
- Is an optional feature
- Can have some performance impact
- Supports encrypted replication
Redis snapshots in S3 use S3's encryption capabilities

ElastiCache Security - Auth and Access Control

Authentication into the cache
- Redis AUTH - server can authenticate the clients (requires SSL/TLS enabled)
- Server Authentication - clients can authenticate that they are connecting to the right server
IAM
- IAM policies can be used for AWS API-level security (create cache, update cache etc.)
- ElastiCache doesn't support IAM permissions for actions within ElastiCache
  (which clients can access what)

ElastiCache Security - Network

Recommended to use private subnets
Control network access to ElastiCache through VPC security groups
ElastiCache Security Groups - allow you to control access to ElastiCache clusters running outside Amazon VPC
For clusters within Amazon VPC, simply use VPC security groups

ElastiCache Logging and Monitoring

Integrated with CloudWatch
- Host level metrics - CPU / Memory / Network
- Redis metrics - replication lag / engine CPU utilization / metrics from Redis INFO command
- 60-second granularity
ElastiCache Events
- Integrated with SNS
- Log of events related to cluster instances / SGs / PGs
- Available within ElastiCache console
API calls logged with CloudTrail

ElastiCache Pricing

Priced per node-hour consumed for each node type
Partial node-hours consumed are billed as full hours
Can use reserved nodes for upfront discounts (1-3 year terms)
Data transfer
- No charge for data transfer between EC2 and ElastiCache within AZ
- All other data transfer chargeable
Backup storage
- For automated and manual snapshots (per GB per month)
- Space for one snapshot is complimentary for each active Redis cluster
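
The node-hour rule can be worked through with a little arithmetic. The hourly rate below is a made-up number for illustration, not an actual AWS price, and the function names are hypothetical.

```python
import math


def node_hours_billed(hours_run):
    """Partial node-hours are billed as full hours, so round up."""
    return math.ceil(hours_run)


def node_cost(num_nodes, hours_run, hourly_rate):
    """On-demand cost for identical nodes that all ran for the same duration."""
    return num_nodes * node_hours_billed(hours_run) * hourly_rate


# Three nodes running for 10 hours 15 minutes at a hypothetical 0.10 per node-hour.
billed = node_hours_billed(10.25)
cost = node_cost(num_nodes=3, hours_run=10.25, hourly_rate=0.10)
print(f"{billed} node-hours billed per node, total {cost:.2f}")
```

Data transfer and backup storage beyond the complimentary snapshot would be added on top of this figure.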

Clark의 IT Container
