# DevOps Zero to Hero: Part 6 - AWS Fundamentals for DevOps

## Introduction

Amazon Web Services (AWS) is the world's most comprehensive cloud platform, offering over 200 fully featured services. As a DevOps engineer, understanding AWS fundamentals is crucial for building, deploying, and managing applications in the cloud. This part covers essential AWS services and best practices for DevOps workflows.
## AWS Global Infrastructure

### Regions and Availability Zones

- **Regions**: Physical locations around the world with clusters of data centers
- **Availability Zones (AZs)**: One or more discrete data centers within a region
- **Edge Locations**: Content delivery network (CDN) points of presence for CloudFront
- **Local Zones**: Extensions of regions closer to end users
### Choosing a Region

Consider these factors:

- **Latency**: Proximity to your users
- **Compliance**: Data sovereignty requirements
- **Service Availability**: Not all services are available in every region (see the sketch below)
- **Cost**: Pricing varies by region
- **Disaster Recovery**: Use multiple regions for high availability
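To check region coverage before you commit, you can query the public global-infrastructure parameters that AWS publishes in Systems Manager; a minimal sketch (Lambda is used here just as an example service):

```bash
# List all regions visible to your account
aws ec2 describe-regions --query 'Regions[].RegionName' --output text

# List the regions where a given service (here, Lambda) is available,
# using AWS's public global-infrastructure SSM parameters
aws ssm get-parameters-by-path \
  --path /aws/service/global-infrastructure/services/lambda/regions \
  --query 'Parameters[].Value' --output text
```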
## Identity and Access Management (IAM)

### Core Concepts

- **Users**: Individual identities with long-term credentials
- **Groups**: Collections of users with shared permissions
- **Roles**: Identities that services or users assume to obtain temporary credentials
- **Policies**: JSON documents defining permissions
### IAM Best Practices

An example least-privilege policy that grants read and write access to a single bucket:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}
```
### Creating IAM Resources with AWS CLI
```bash
# Create user
aws iam create-user --user-name devops-user
# Create access key
aws iam create-access-key --user-name devops-user
# Create group
aws iam create-group --group-name devops-team
# Add user to group
aws iam add-user-to-group --user-name devops-user --group-name devops-team
# Attach policy to group
aws iam attach-group-policy --group-name devops-team --policy-arn arn:aws:iam::aws:policy/PowerUserAccess
# Create role for EC2
aws iam create-role --role-name ec2-s3-access --assume-role-policy-document file://trust-policy.json
# Attach policy to role
aws iam attach-role-policy --role-name ec2-s3-access --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
```
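The `create-role` command above references a `trust-policy.json` that isn't shown; a minimal sketch of the standard trust policy that lets EC2 instances assume the role:

```bash
# Standard EC2 trust relationship for the role created above
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
```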
### IAM Security Best Practices

- Enable MFA for all users
- Use roles instead of access keys where possible
- Apply the least privilege principle
- Rotate credentials regularly
- Use policy conditions for additional security (see the sketch below)
- Enable CloudTrail for audit logging
- Use AWS Organizations for multi-account management
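As an example of a policy condition, this sketch denies every action unless the request was authenticated with MFA (the policy name `require-mfa` is just an illustration):

```bash
# Deny all actions when the request was not MFA-authenticated
cat > require-mfa-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyAllWithoutMFA",
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "BoolIfExists": { "aws:MultiFactorAuthPresent": "false" }
      }
    }
  ]
}
EOF
aws iam create-policy --policy-name require-mfa --policy-document file://require-mfa-policy.json
```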
## Compute Services

### EC2 (Elastic Compute Cloud)

#### Instance Types

- **General Purpose (t3, m5)**: Balanced compute, memory, and networking
- **Compute Optimized (c5)**: High-performance processors
- **Memory Optimized (r5, x1)**: In-memory databases
- **Storage Optimized (i3, d2)**: High sequential read/write
- **Accelerated Computing (p3, g4)**: GPU instances
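If you're unsure which family fits your workload, you can compare specs straight from the CLI; a quick sketch:

```bash
# Compare vCPUs and memory across a few candidate instance types
aws ec2 describe-instance-types \
  --instance-types t3.micro c5.large r5.large \
  --query 'InstanceTypes[].[InstanceType,VCpuInfo.DefaultVCpus,MemoryInfo.SizeInMiB]' \
  --output table
```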
#### EC2 User Data Script

```bash
#!/bin/bash
# This script runs once, on the instance's first boot

# Update system
yum update -y

# Install Docker
amazon-linux-extras install docker -y
service docker start
usermod -a -G docker ec2-user

# Install Docker Compose
curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose

# Install CloudWatch agent
wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
rpm -U ./amazon-cloudwatch-agent.rpm

# Pull and run application
docker pull myapp:latest
docker run -d -p 80:3000 --name app --restart always myapp:latest
```
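To use the script, save it (here as `user-data.sh`, an assumed filename) and pass it at launch; the AMI, key pair, and security group IDs below are placeholders:

```bash
# Launch an instance that runs the user data script on first boot
aws ec2 run-instances \
  --image-id ami-0c55b159cbfafe1f0 \
  --instance-type t3.micro \
  --key-name my-key-pair \
  --security-group-ids sg-12345678 \
  --user-data file://user-data.sh
```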
#### EC2 Launch Template

```bash
aws ec2 create-launch-template \
  --launch-template-name devops-template \
  --version-description "DevOps Web App Template" \
  --launch-template-data '{
    "ImageId": "ami-0c55b159cbfafe1f0",
    "InstanceType": "t3.micro",
    "KeyName": "my-key-pair",
    "SecurityGroupIds": ["sg-12345678"],
    "UserData": "IyEvYmluL2Jhc2gKZWNobyAiSGVsbG8gV29ybGQi",
    "IamInstanceProfile": {
      "Name": "ec2-s3-access"
    },
    "TagSpecifications": [{
      "ResourceType": "instance",
      "Tags": [
        {"Key": "Name", "Value": "DevOps-Instance"},
        {"Key": "Environment", "Value": "Production"}
      ]
    }]
  }'
```
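A launch template really shines as the basis of an Auto Scaling group; a minimal sketch (the group name and subnet IDs are placeholders):

```bash
# Create an Auto Scaling group from the launch template above
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name devops-asg \
  --launch-template 'LaunchTemplateName=devops-template,Version=$Latest' \
  --min-size 2 \
  --max-size 6 \
  --desired-capacity 2 \
  --vpc-zone-identifier "subnet-12345,subnet-67890"
```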
### ECS (Elastic Container Service)

#### Task Definition

Fargate tasks that pull images from ECR and write to CloudWatch Logs need a task execution role, referenced here as `ecsTaskExecutionRole`:

```json
{
  "family": "devops-app",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "web-app",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/devops-app:latest",
      "portMappings": [
        {
          "containerPort": 3000,
          "protocol": "tcp"
        }
      ],
      "essential": true,
      "environment": [
        {
          "name": "NODE_ENV",
          "value": "production"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/devops-app",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
```
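Assuming the JSON above is saved as `task-definition.json`, register it with:

```bash
aws ecs register-task-definition --cli-input-json file://task-definition.json
```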
#### ECS Service with Auto Scaling

```bash
# Create service
aws ecs create-service \
  --cluster production-cluster \
  --service-name devops-service \
  --task-definition devops-app:1 \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-12345,subnet-67890],securityGroups=[sg-12345],assignPublicIp=ENABLED}" \
  --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:...,containerName=web-app,containerPort=3000"

# Register scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/production-cluster/devops-service \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 2 \
  --max-capacity 10

# Create scaling policy
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --resource-id service/production-cluster/devops-service \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-name cpu-scaling \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    }
  }'
```
### Lambda Functions

#### Creating a Lambda Function

```python
# lambda_function.py
import json
import os
from datetime import datetime

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE_NAME'])


def lambda_handler(event, context):
    """Process incoming events and store them in DynamoDB."""
    try:
        # Parse the request body (API Gateway proxy format)
        body = json.loads(event.get('body', '{}'))

        # Prepare item
        item = {
            'id': context.aws_request_id,
            'timestamp': datetime.utcnow().isoformat(),
            'event_type': body.get('type', 'unknown'),
            'data': body,
            'processed': True
        }

        # Store in DynamoDB
        table.put_item(Item=item)

        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*'
            },
            'body': json.dumps({
                'message': 'Event processed successfully',
                'id': context.aws_request_id
            })
        }
    except Exception as e:
        print(f"Error: {str(e)}")
        return {
            'statusCode': 500,
            'body': json.dumps({'error': str(e)})
        }
```
#### Deploy Lambda with the CLI

```bash
# Package function
zip function.zip lambda_function.py

# Create function
aws lambda create-function \
  --function-name process-events \
  --runtime python3.9 \
  --role arn:aws:iam::123456789012:role/lambda-execution-role \
  --handler lambda_function.lambda_handler \
  --zip-file fileb://function.zip \
  --timeout 30 \
  --memory-size 256 \
  --environment Variables={TABLE_NAME=events-table}

# Create API Gateway trigger
aws apigatewayv2 create-api \
  --name events-api \
  --protocol-type HTTP \
  --target arn:aws:lambda:us-east-1:123456789012:function:process-events
```
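After a code change, you can redeploy and smoke-test the function; a sketch (the test payload is illustrative):

```bash
# Re-package and update the existing function
zip function.zip lambda_function.py
aws lambda update-function-code \
  --function-name process-events \
  --zip-file fileb://function.zip

# Invoke once to verify (CLI v2 needs the binary-format flag for raw JSON payloads)
aws lambda invoke \
  --function-name process-events \
  --cli-binary-format raw-in-base64-out \
  --payload '{"body": "{\"type\": \"test\"}"}' \
  response.json
cat response.json
```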
## Storage Services

### S3 (Simple Storage Service)

#### S3 Bucket Policies

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-static-website/*"
    },
    {
      "Sid": "DenyUnencryptedObjectUploads",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-secure-bucket/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    }
  ]
}
```
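Assuming the policy is saved as `bucket-policy.json`, apply it with:

```bash
aws s3api put-bucket-policy \
  --bucket my-static-website \
  --policy file://bucket-policy.json
```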
#### S3 Lifecycle Rules

```bash
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-app-logs \
  --lifecycle-configuration '{
    "Rules": [
      {
        "Id": "ArchiveOldLogs",
        "Status": "Enabled",
        "Transitions": [
          {
            "Days": 30,
            "StorageClass": "STANDARD_IA"
          },
          {
            "Days": 90,
            "StorageClass": "GLACIER"
          }
        ],
        "Expiration": {
          "Days": 365
        }
      }
    ]
  }'
```
#### S3 Static Website Hosting

```bash
# Create bucket
aws s3 mb s3://my-static-website

# Enable static website hosting
aws s3 website s3://my-static-website \
  --index-document index.html \
  --error-document error.html

# Upload files (requires the bucket's public access block to allow public ACLs)
aws s3 sync ./dist s3://my-static-website --acl public-read

# Create CloudFront distribution
aws cloudfront create-distribution \
  --origin-domain-name my-static-website.s3.amazonaws.com \
  --default-root-object index.html
```
### EBS (Elastic Block Store)

#### Volume Types

- **gp3**: General purpose SSD (3,000-16,000 IOPS)
- **gp2**: Previous-generation general purpose SSD
- **io2**: Provisioned IOPS SSD (up to 64,000 IOPS)
- **st1**: Throughput-optimized HDD
- **sc1**: Cold HDD
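Creating a gp3 volume, tagged so the snapshot policy below will pick it up; a sketch (the IOPS and throughput values match gp3's baseline of 3,000 IOPS and 125 MiB/s):

```bash
aws ec2 create-volume \
  --volume-type gp3 \
  --size 100 \
  --iops 3000 \
  --throughput 125 \
  --availability-zone us-east-1a \
  --tag-specifications 'ResourceType=volume,Tags=[{Key=Backup,Value=true}]'
```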
#### EBS Snapshots

```bash
# Create snapshot
aws ec2 create-snapshot \
  --volume-id vol-12345678 \
  --description "Daily backup $(date +%Y-%m-%d)"

# Create snapshot lifecycle policy
aws dlm create-lifecycle-policy \
  --execution-role-arn arn:aws:iam::123456789012:role/dlm-lifecycle-role \
  --description "Daily EBS snapshots" \
  --state ENABLED \
  --policy-details '{
    "PolicyType": "EBS_SNAPSHOT_MANAGEMENT",
    "ResourceTypes": ["VOLUME"],
    "TargetTags": [{"Key": "Backup", "Value": "true"}],
    "Schedules": [{
      "Name": "Daily Snapshots",
      "CreateRule": {
        "Interval": 24,
        "IntervalUnit": "HOURS",
        "Times": ["03:00"]
      },
      "RetainRule": {
        "Count": 7
      }
    }]
  }'
```
### EFS (Elastic File System)

```bash
# Create EFS file system
aws efs create-file-system \
  --creation-token my-efs \
  --performance-mode generalPurpose \
  --throughput-mode bursting \
  --encrypted

# Create mount targets (one per subnet/AZ)
aws efs create-mount-target \
  --file-system-id fs-12345678 \
  --subnet-id subnet-12345678 \
  --security-groups sg-12345678

# Mount on EC2 (requires the amazon-efs-utils package)
sudo mount -t efs -o tls fs-12345678:/ /mnt/efs
```
## Database Services

### RDS (Relational Database Service)

#### Multi-AZ RDS Setup

```bash
aws rds create-db-instance \
  --db-instance-identifier production-db \
  --db-instance-class db.t3.micro \
  --engine postgres \
  --engine-version 14.7 \
  --master-username admin \
  --master-user-password SecurePass123! \
  --allocated-storage 100 \
  --storage-type gp3 \
  --storage-encrypted \
  --vpc-security-group-ids sg-12345678 \
  --db-subnet-group-name production-subnet-group \
  --backup-retention-period 7 \
  --preferred-backup-window "03:00-04:00" \
  --preferred-maintenance-window "mon:04:00-mon:05:00" \
  --multi-az \
  --auto-minor-version-upgrade \
  --enable-performance-insights \
  --performance-insights-retention-period 7
```
#### RDS Read Replica

```bash
aws rds create-db-instance-read-replica \
  --db-instance-identifier production-db-read \
  --source-db-instance-identifier production-db \
  --db-instance-class db.t3.micro \
  --no-publicly-accessible
```
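Multi-AZ failover is worth testing before you rely on it; one way is a forced-failover reboot during a maintenance window. A sketch:

```bash
# Confirm Multi-AZ is enabled and note the current endpoint
aws rds describe-db-instances \
  --db-instance-identifier production-db \
  --query 'DBInstances[0].[MultiAZ,Endpoint.Address]'

# Reboot with failover to exercise the standby
aws rds reboot-db-instance \
  --db-instance-identifier production-db \
  --force-failover
```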
### DynamoDB

#### Create Table with a Global Secondary Index

```bash
aws dynamodb create-table \
  --table-name user-sessions \
  --attribute-definitions \
      AttributeName=user_id,AttributeType=S \
      AttributeName=session_id,AttributeType=S \
      AttributeName=timestamp,AttributeType=N \
  --key-schema \
      AttributeName=user_id,KeyType=HASH \
      AttributeName=session_id,KeyType=RANGE \
  --global-secondary-indexes '[
    {
      "IndexName": "SessionIndex",
      "KeySchema": [
        {"AttributeName": "session_id", "KeyType": "HASH"},
        {"AttributeName": "timestamp", "KeyType": "RANGE"}
      ],
      "Projection": {"ProjectionType": "ALL"},
      "ProvisionedThroughput": {
        "ReadCapacityUnits": 5,
        "WriteCapacityUnits": 5
      }
    }
  ]' \
  --billing-mode PROVISIONED \
  --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES \
  --tags Key=Environment,Value=Production
```

Note that provisioned throughput (used here so the auto scaling example below applies) is mutually exclusive with on-demand `PAY_PER_REQUEST` billing.
#### DynamoDB Auto Scaling

```bash
aws application-autoscaling register-scalable-target \
  --service-namespace dynamodb \
  --resource-id table/user-sessions \
  --scalable-dimension dynamodb:table:ReadCapacityUnits \
  --min-capacity 5 \
  --max-capacity 1000

aws application-autoscaling put-scaling-policy \
  --service-namespace dynamodb \
  --resource-id table/user-sessions \
  --scalable-dimension dynamodb:table:ReadCapacityUnits \
  --policy-name ReadScalingPolicy \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
    }
  }'
```
## Networking

### VPC (Virtual Private Cloud)

#### Complete VPC Setup

```bash
# Create VPC
aws ec2 create-vpc --cidr-block 10.0.0.0/16

# Create Internet Gateway
aws ec2 create-internet-gateway

# Attach IGW to VPC
aws ec2 attach-internet-gateway --vpc-id vpc-12345 --internet-gateway-id igw-12345

# Create public subnet
aws ec2 create-subnet --vpc-id vpc-12345 --cidr-block 10.0.1.0/24 --availability-zone us-east-1a

# Create private subnet
aws ec2 create-subnet --vpc-id vpc-12345 --cidr-block 10.0.10.0/24 --availability-zone us-east-1a

# Allocate an Elastic IP and create a NAT Gateway in the public subnet
aws ec2 allocate-address --domain vpc
aws ec2 create-nat-gateway --subnet-id subnet-12345 --allocation-id eipalloc-12345

# Create a public route table with a default route to the IGW
aws ec2 create-route-table --vpc-id vpc-12345
aws ec2 create-route --route-table-id rtb-12345 --destination-cidr-block 0.0.0.0/0 --gateway-id igw-12345
```
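The route tables still need to be associated with their subnets, and the private subnet needs a default route through the NAT gateway; a sketch (all IDs are placeholders, with `rtb-67890` standing in for a second, private route table):

```bash
# Associate the public route table with the public subnet
aws ec2 associate-route-table --route-table-id rtb-12345 --subnet-id subnet-12345

# Create a private route table and send outbound traffic via the NAT gateway
aws ec2 create-route-table --vpc-id vpc-12345
aws ec2 create-route --route-table-id rtb-67890 --destination-cidr-block 0.0.0.0/0 --nat-gateway-id nat-12345
aws ec2 associate-route-table --route-table-id rtb-67890 --subnet-id subnet-67890
```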
### Application Load Balancer

```bash
# Create ALB
aws elbv2 create-load-balancer \
  --name production-alb \
  --subnets subnet-12345 subnet-67890 \
  --security-groups sg-12345 \
  --scheme internet-facing \
  --type application \
  --ip-address-type ipv4

# Create target group
aws elbv2 create-target-group \
  --name production-targets \
  --protocol HTTP \
  --port 80 \
  --vpc-id vpc-12345 \
  --health-check-enabled \
  --health-check-path /health \
  --health-check-interval-seconds 30 \
  --health-check-timeout-seconds 5 \
  --healthy-threshold-count 2 \
  --unhealthy-threshold-count 3

# Create listener
aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:... \
  --protocol HTTPS \
  --port 443 \
  --certificates CertificateArn=arn:aws:acm:... \
  --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:...
```
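It's also common to add an HTTP listener that redirects to HTTPS; a sketch using the CLI's shorthand syntax:

```bash
# Redirect all HTTP traffic to HTTPS with a permanent redirect
aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:... \
  --protocol HTTP \
  --port 80 \
  --default-actions 'Type=redirect,RedirectConfig={Protocol=HTTPS,Port=443,StatusCode=HTTP_301}'
```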
### Route 53

```bash
# Create hosted zone
aws route53 create-hosted-zone \
  --name example.com \
  --caller-reference $(date +%s)

# Create A record
aws route53 change-resource-record-sets \
  --hosted-zone-id Z123456789 \
  --change-batch '{
    "Changes": [{
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "www.example.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "production-alb-123456.us-east-1.elb.amazonaws.com",
          "EvaluateTargetHealth": true
        }
      }
    }]
  }'
```
## Monitoring and Logging

### CloudWatch

#### Custom Metrics

```python
from datetime import datetime

import boto3

cloudwatch = boto3.client('cloudwatch')


def put_custom_metric(metric_name, value, unit='Count'):
    """Send a custom metric to CloudWatch."""
    response = cloudwatch.put_metric_data(
        Namespace='CustomApp',
        MetricData=[
            {
                'MetricName': metric_name,
                'Value': value,
                'Unit': unit,
                'Timestamp': datetime.utcnow()
            }
        ]
    )
    return response


# Example usage
put_custom_metric('RequestCount', 1)
put_custom_metric('ResponseTime', 250, 'Milliseconds')
```
#### CloudWatch Alarms

```bash
# CPU alarm
aws cloudwatch put-metric-alarm \
  --alarm-name high-cpu \
  --alarm-description "Alarm when CPU exceeds 80%" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 2 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:alerts

# Custom metric alarm
aws cloudwatch put-metric-alarm \
  --alarm-name high-error-rate \
  --alarm-description "Alarm when error rate exceeds 1%" \
  --metric-name ErrorRate \
  --namespace CustomApp \
  --statistic Average \
  --period 60 \
  --threshold 1 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 3
```
#### CloudWatch Logs Insights

```
# Find the 10 slowest API requests
fields @timestamp, @message
| filter @message like /Response time/
| parse @message /Response time: (?<duration>\d+)ms/
| sort duration desc
| limit 10

# Count errors by type
fields @timestamp, @message
| filter @message like /ERROR/
| parse @message /ERROR: (?<error_type>[^:]+)/
| stats count() by error_type

# Request rate per minute
fields @timestamp
| filter @message like /Request/
| stats count() by bin(1m)
```
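The same queries can be run from the CLI; a sketch (GNU `date` syntax, reusing the log group from the ECS example above):

```bash
# Start a query over the last hour and capture its ID
QUERY_ID=$(aws logs start-query \
  --log-group-name /ecs/devops-app \
  --start-time $(date -d '1 hour ago' +%s) \
  --end-time $(date +%s) \
  --query-string 'fields @timestamp, @message | filter @message like /ERROR/ | limit 20' \
  --query queryId --output text)

# Poll for results
aws logs get-query-results --query-id "$QUERY_ID"
```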
### X-Ray Tracing

```python
from aws_xray_sdk.core import patch_all, xray_recorder

# Patch all supported libraries (boto3, requests, etc.)
patch_all()


@xray_recorder.capture('process_request')
def process_request(request):
    # Add metadata (attached for debugging, not indexed)
    xray_recorder.current_subsegment().put_metadata('user_id', request.user_id)

    # Add annotation (indexed, so it is searchable in the X-Ray console)
    xray_recorder.current_subsegment().put_annotation('request_type', request.type)

    # Process request
    result = perform_operation(request)
    return result
```
## Security Best Practices

### AWS Systems Manager

#### Parameter Store

```bash
# Store secure parameter
aws ssm put-parameter \
  --name /production/database/password \
  --value "SecurePassword123!" \
  --type SecureString \
  --key-id alias/aws/ssm

# Retrieve parameter
aws ssm get-parameter \
  --name /production/database/password \
  --with-decryption
```
#### Session Manager

```bash
# Start session
aws ssm start-session --target i-1234567890abcdef0

# Port forwarding
aws ssm start-session \
  --target i-1234567890abcdef0 \
  --document-name AWS-StartPortForwardingSession \
  --parameters '{"portNumber":["3306"],"localPortNumber":["3306"]}'
```
### AWS Secrets Manager

```bash
# Create secret
aws secretsmanager create-secret \
  --name production/database \
  --description "Production database credentials" \
  --secret-string '{
    "username": "admin",
    "password": "SecurePass123!",
    "engine": "postgres",
    "host": "production-db.cluster-123456.us-east-1.rds.amazonaws.com",
    "port": 5432,
    "dbname": "appdb"
  }'

# Rotate secret
aws secretsmanager rotate-secret \
  --secret-id production/database \
  --rotation-lambda-arn arn:aws:lambda:us-east-1:123456789012:function:SecretsManagerRotation
```
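Applications and deploy scripts read the secret back with `get-secret-value`:

```bash
aws secretsmanager get-secret-value \
  --secret-id production/database \
  --query SecretString \
  --output text
```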
## Cost Optimization

### Cost Management Tools

```bash
# Query cost and usage, grouped by service (requires Cost Explorer to be enabled)
aws ce get-cost-and-usage \
  --time-period Start=2024-01-01,End=2024-01-31 \
  --granularity MONTHLY \
  --metrics "UnblendedCost" \
  --group-by Type=DIMENSION,Key=SERVICE

# Create budget with an 80% notification threshold
aws budgets create-budget \
  --account-id 123456789012 \
  --budget '{
    "BudgetName": "Monthly-Budget",
    "BudgetLimit": {
      "Amount": "1000",
      "Unit": "USD"
    },
    "TimeUnit": "MONTHLY",
    "BudgetType": "COST"
  }' \
  --notifications-with-subscribers '[
    {
      "Notification": {
        "NotificationType": "ACTUAL",
        "ComparisonOperator": "GREATER_THAN",
        "Threshold": 80,
        "ThresholdType": "PERCENTAGE"
      },
      "Subscribers": [
        {
          "SubscriptionType": "EMAIL",
          "Address": "admin@example.com"
        }
      ]
    }
  ]'
```
### Reserved Instances and Savings Plans

```bash
# Get RI recommendations
aws ce get-reservation-purchase-recommendation \
  --service "Amazon Elastic Compute Cloud - Compute" \
  --lookback-period-in-days THIRTY_DAYS \
  --term-in-years ONE_YEAR \
  --payment-option NO_UPFRONT

# Get Savings Plans recommendations
aws ce get-savings-plans-purchase-recommendation \
  --savings-plans-type COMPUTE_SP \
  --term-in-years ONE_YEAR \
  --payment-option NO_UPFRONT \
  --lookback-period-in-days THIRTY_DAYS
```
## Disaster Recovery

### Backup Strategies

```bash
# Create backup plan
aws backup create-backup-plan \
  --backup-plan '{
    "BackupPlanName": "DailyBackups",
    "Rules": [{
      "RuleName": "DailyRule",
      "TargetBackupVaultName": "Default",
      "ScheduleExpression": "cron(0 5 ? * * *)",
      "StartWindowMinutes": 60,
      "CompletionWindowMinutes": 120,
      "Lifecycle": {
        "DeleteAfterDays": 30
      }
    }]
  }'

# Assign resources to backup plan
aws backup create-backup-selection \
  --backup-plan-id plan-12345 \
  --backup-selection '{
    "SelectionName": "AllEC2",
    "IamRoleArn": "arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole",
    "Resources": ["arn:aws:ec2:*:*:instance/*"],
    "ListOfTags": [{
      "ConditionType": "STRINGEQUALS",
      "ConditionKey": "Backup",
      "ConditionValue": "true"
    }]
  }'
```
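For ad-hoc protection outside the schedule, an on-demand backup job can be started directly; a sketch (the ARNs are placeholders):

```bash
aws backup start-backup-job \
  --backup-vault-name Default \
  --resource-arn arn:aws:ec2:us-east-1:123456789012:instance/i-1234567890abcdef0 \
  --iam-role-arn arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole
```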
## Key Takeaways

- AWS provides comprehensive services for every layer of the technology stack
- IAM is fundamental to security - always follow the least privilege principle
- Choose the right compute service: EC2 for full control, ECS/Fargate for containers, Lambda for serverless
- Use managed services where possible to reduce operational overhead
- Implement proper monitoring and logging from day one
- Design for failure - use multiple AZs and regions for high availability
- Optimize costs with Reserved Instances, Savings Plans, and right-sizing
- Automate everything - infrastructure, deployments, backups, and scaling
## What's Next?

In Part 7, we'll deploy containers to Amazon ECS. You'll learn:

- ECS architecture and concepts
- Creating task definitions and services
- Load balancing with ALB
- Auto-scaling containers
- Blue-green deployments
- Service discovery
- ECS with Fargate vs. EC2
Ready to deploy containers at scale? Continue with Part 7: Deploying Containers to Amazon ECS!