home / skills / williamzujkowski / standards / serverless
This skill helps you design and implement serverless architectures in Python, delivering auto-scaling, pay-per-use benefits and rapid time-to-market.
npx playbooks add skill williamzujkowski/standards --skill serverlessReview the files below or copy the command above to add this skill to your agents.
---
name: serverless
category: cloud-native
difficulty: intermediate
tags:
- aws-lambda
- serverless
- cloud-functions
- faas
- event-driven
related:
- kubernetes
- microservices
- monitoring
version: 1.0.0
description: 'Benefits:'
---
# Serverless Computing
## Level 1: Quick Reference
### Serverless Benefits and Tradeoffs
**Benefits:**
- **Zero Server Management**: No OS patching, scaling, or capacity planning
- **Automatic Scaling**: Scales from zero to thousands of concurrent executions
- **Pay-Per-Use**: Only charged for actual execution time (100ms granularity)
- **Built-in HA**: Multi-AZ deployment by default
- **Fast Time-to-Market**: Focus on code, not infrastructure
**Tradeoffs:**
- **Cold Starts**: 100ms-5s latency for new container initialization
- **Execution Limits**: 15 min max (Lambda), 60 min (Cloud Functions)
- **Vendor Lock-in**: Platform-specific APIs and deployment models
- **Debugging Complexity**: Distributed tracing across ephemeral environments
- **State Management**: Stateless by design, requires external persistence
### Common Serverless Patterns
**1. API Backend (HTTP Trigger)**
```
Client → API Gateway → Lambda → Database
- RESTful APIs, GraphQL endpoints
- Authentication/authorization at gateway
- Response caching, throttling, API keys
```
**2. Event-Driven Processing (Event Trigger)**
```
S3 Upload → Lambda → Process File → Store Result
SQS Queue → Lambda → Transform Data → Publish Event
DynamoDB Stream → Lambda → Aggregate Metrics
```
**3. Scheduled Tasks (Cron Trigger)**
```
EventBridge Rule (cron) → Lambda → Cleanup/Report/Backup
- Data aggregation (hourly, daily)
- Automated backups and archival
- Health checks and monitoring
```
**4. Stream Processing**
```
Kinesis/Kafka → Lambda → Real-time Analytics → Dashboard
- Log processing and filtering
- Clickstream analysis
- IoT data ingestion
```
### Essential Serverless Checklist
**Architecture:**
- [ ] Function size < 50 MB (Lambda), memory 128-10240 MB
- [ ] Single responsibility per function
- [ ] Async patterns for long-running tasks (SQS, Step Functions)
- [ ] Idempotent handlers (retry-safe)
**Cold Start Optimization:**
- [ ] Provisioned concurrency for latency-sensitive APIs
- [ ] Minimize dependencies (slim packages)
- [ ] Initialize SDK clients outside handler
- [ ] Use compiled languages (Go, Rust) for sub-100ms starts
**Timeouts and Concurrency:**
- [ ] Set appropriate timeout (default 3s, max 15min Lambda)
- [ ] Configure reserved/unreserved concurrency limits
- [ ] Use SQS for rate limiting and backpressure
- [ ] Monitor throttles and errors (CloudWatch alarms)
**Security:**
- [ ] Least-privilege IAM roles (one per function)
- [ ] Secrets in AWS Secrets Manager/Parameter Store
- [ ] VPC configuration for database access (increases cold start)
- [ ] Environment variable encryption (AWS KMS)
**Observability:**
- [ ] Structured logging (JSON) with correlation IDs
- [ ] X-Ray tracing for distributed requests
- [ ] Custom CloudWatch metrics (business KPIs)
- [ ] Alerting on error rates, duration, throttles
**Cost Optimization:**
- [ ] Right-size memory (CPU scales with memory)
- [ ] Use ARM (Graviton2) for 20% cost reduction
- [ ] Delete unused functions and versions
- [ ] Monitor invocation count and duration trends
---
## Level 2:
>
> **📚 Full Examples**: See [REFERENCE.md](./REFERENCE.md) for complete code samples, detailed configurations, and production-ready implementations.
Implementation Guide
### AWS Lambda Fundamentals
**Runtime Lifecycle:**
1. **INIT Phase** (cold start):
- Download deployment package from S3
- Start runtime (Python, Node.js, Java, Go, etc.)
- Execute initialization code (outside handler)
- Load dependencies and establish connections
- Typical duration: 100ms-5s
2. **INVOKE Phase**:
- Execute handler function
- Process event payload
- Return response or error
- Billed duration: rounded up to nearest 1ms
3. **SHUTDOWN Phase**:
- Runtime shutdown after inactivity (~10-15 min)
- Connections closed, temp files deleted
**Lambda Handler Pattern:**
*See [REFERENCE.md](./REFERENCE.md#example-0) for complete implementation.*
**Lambda Layers:**
- Shared code libraries across multiple functions
- Max 5 layers per function, 250 MB total (unzipped)
- Common use cases: SDKs, custom libraries, config files
*See [REFERENCE.md](./REFERENCE.md#example-1) for complete implementation.*
**Deployment Packages:**
- **Zip Archive**: Up to 50 MB (direct upload), 250 MB (S3)
- **Container Image**: Up to 10 GB (ECR), cold start +1-2s
**Multi-Cloud Comparison:**
| Feature | AWS Lambda | Google Cloud Functions | Azure Functions |
|---------|-----------|----------------------|-----------------|
| **Max Duration** | 15 minutes | 60 minutes (2nd gen) | 10 minutes (Consumption) |
| **Memory** | 128 MB - 10 GB | 128 MB - 32 GB | 128 MB - 14 GB |
| **Cold Start** | 100-500ms (Python/Node) | 200-800ms | 150-600ms |
| **Concurrency** | 1000 default (soft limit) | 1000 per region | 200 per instance |
| **Pricing** | $0.20/1M requests + $0.0000166667/GB-s | $0.40/1M + $0.0000025/GB-s | $0.20/1M + $0.000016/GB-s |
| **Triggers** | 20+ (S3, DynamoDB, SQS, API Gateway) | 10+ (HTTP, Pub/Sub, Storage) | 15+ (HTTP, Queue, Blob) |
### Serverless Frameworks
**1. AWS SAM (Serverless Application Model):**
- CloudFormation extension for serverless resources
- Local testing with `sam local`
- Built-in CI/CD with CodePipeline integration
*See [REFERENCE.md](./REFERENCE.md#example-2) for complete implementation.*
**2. Serverless Framework:**
- Multi-cloud support (AWS, Azure, GCP)
- Plugin ecosystem (offline, webpack, domain-manager)
- Environment-based deployments
*See [REFERENCE.md](./REFERENCE.md#example-3) for complete implementation.*
**3. AWS CDK (Cloud Development Kit):**
- Infrastructure as code in Python/TypeScript/Java
- Higher-level constructs (L2/L3)
- Type safety and IDE autocomplete
*See [REFERENCE.md](./REFERENCE.md#example-4) for complete implementation.*
### Event Sources and Triggers
**1. API Gateway (HTTP/REST):**
*See [REFERENCE.md](./REFERENCE.md#example-5) for complete implementation.*
**2. S3 Events:**
*See [REFERENCE.md](./REFERENCE.md#example-6) for complete implementation.*
**3. SQS (Queue):**
*See [REFERENCE.md](./REFERENCE.md#example-7) for complete implementation.*
**4. EventBridge (Scheduled/Custom):**
*See [REFERENCE.md](./REFERENCE.md#example-8) for complete implementation.*
**5. DynamoDB Streams:**
*See [REFERENCE.md](./REFERENCE.md#example-9) for complete implementation.*
### Cold Start Optimization
**Understanding Cold Starts:**
- **Triggered by**: First invocation, scaling up, runtime updates
- **Components**: Download code → Start runtime → Init code → Invoke handler
- **Impact**: P99 latency spikes, user-facing API degradation
**Optimization Techniques:**
**1. Provisioned Concurrency:**
*See [REFERENCE.md](./REFERENCE.md#example-10) for complete implementation.*
**2. Minimize Package Size:**
*See [REFERENCE.md](./REFERENCE.md#example-11) for complete implementation.*
**3. Lazy Loading:**
*See [REFERENCE.md](./REFERENCE.md#example-12) for complete implementation.*
**4. Connection Pooling:**
*See [REFERENCE.md](./REFERENCE.md#example-13) for complete implementation.*
**5. Choose Fast Runtimes:**
```
Cold Start Benchmarks (512 MB):
- Go: 100-150ms
- Rust: 120-180ms
- Python 3.11: 200-300ms
- Node.js 20: 250-350ms
- Java 17: 1-2s (use Snapstart for sub-200ms)
- .NET 7: 500-800ms
```
**6. Lambda SnapStart (Java):**
*See [REFERENCE.md](./REFERENCE.md#example-15) for complete implementation.*
### State Management
**Ephemeral Storage (Local):**
*See [REFERENCE.md](./REFERENCE.md#example-16) for complete implementation.*
**DynamoDB (Key-Value):**
*See [REFERENCE.md](./REFERENCE.md#example-17) for complete implementation.*
**S3 (Large Objects):**
*See [REFERENCE.md](./REFERENCE.md#example-18) for complete implementation.*
**ElastiCache (Distributed Cache):**
*See [REFERENCE.md](./REFERENCE.md#example-19) for complete implementation.*
**Step Functions (Workflow State):**
*See [REFERENCE.md](./REFERENCE.md#example-20) for complete implementation.*
### Observability and Monitoring
**Structured Logging:**
*See [REFERENCE.md](./REFERENCE.md#example-21) for complete implementation.*
**X-Ray Tracing:**
*See [REFERENCE.md](./REFERENCE.md#example-22) for complete implementation.*
**Custom CloudWatch Metrics:**
*See [REFERENCE.md](./REFERENCE.md#example-23) for complete implementation.*
**CloudWatch Alarms:**
*See [REFERENCE.md](./REFERENCE.md#example-24) for complete implementation.*
### Security Best Practices
**IAM Roles and Permissions:**
*See [REFERENCE.md](./REFERENCE.md#example-25) for complete implementation.*
**Secrets Management:**
*See [REFERENCE.md](./REFERENCE.md#example-26) for complete implementation.*
**VPC Configuration:**
*See [REFERENCE.md](./REFERENCE.md#example-27) for complete implementation.*
**Input Validation:**
*See [REFERENCE.md](./REFERENCE.md#example-28) for complete implementation.*
### Testing Strategies
**Unit Tests:**
*See [REFERENCE.md](./REFERENCE.md#example-29) for complete implementation.*
**Integration Tests:**
*See [REFERENCE.md](./REFERENCE.md#example-30) for complete implementation.*
**Local Testing (SAM CLI):**
*See [REFERENCE.md](./REFERENCE.md#example-31) for complete implementation.*
### Cost Optimization
**Memory Sizing:**
*See [REFERENCE.md](./REFERENCE.md#example-32) for complete implementation.*
**ARM (Graviton2) Migration:**
*See [REFERENCE.md](./REFERENCE.md#example-33) for complete implementation.*
**Monitoring Costs:**
*See [REFERENCE.md](./REFERENCE.md#example-34) for complete implementation.*
**Cleanup and Governance:**
*See [REFERENCE.md](./REFERENCE.md#example-35) for complete implementation.*
## Examples
### Basic Usage
*See [REFERENCE.md](./REFERENCE.md#example-36) for complete implementation.*
### Advanced Usage
```python
// TODO: Add advanced example for serverless
// This example shows production-ready patterns
```
### Integration Example
```python
// TODO: Add integration example showing how serverless
// works with other systems and services
```
See `examples/serverless/` for complete working examples.
## Integration Points
This skill integrates with:
### Upstream Dependencies
- **Tools**: Common development tools and frameworks
- **Prerequisites**: Basic understanding of general concepts
### Downstream Consumers
- **Applications**: Production systems requiring serverless functionality
- **CI/CD Pipelines**: Automated testing and deployment workflows
- **Monitoring Systems**: Observability and logging platforms
### Related Skills
- See other skills in this category
### Common Integration Patterns
1. **Development Workflow**: How this skill fits into daily development
2. **Production Deployment**: Integration with production systems
3. **Monitoring & Alerting**: Observability integration points
## Common Pitfalls
### Pitfall 1: Insufficient Testing
**Problem:** Not testing edge cases and error conditions leads to production bugs
**Solution:** Implement comprehensive test coverage including:
- Happy path scenarios
- Error handling and edge cases
- Integration points with external systems
**Prevention:** Enforce minimum code coverage (80%+) in CI/CD pipeline
### Pitfall 2: Hardcoded Configuration
**Problem:** Hardcoding values makes applications inflexible and environment-dependent
**Solution:** Use environment variables and configuration management:
- Separate config from code
- Use environment-specific configuration files
- Never commit secrets to version control
**Prevention:** Use tools like dotenv, config validators, and secret scanners
### Pitfall 3: Ignoring Security Best Practices
**Problem:** Security vulnerabilities from not following established security patterns
**Solution:** Follow security guidelines:
- Input validation and sanitization
- Proper authentication and authorization
- Encrypted data transmission (TLS/SSL)
- Regular security audits and updates
**Prevention:** Use security linters, SAST tools, and regular dependency updates
**Best Practices:**
- Follow established patterns and conventions for serverless
- Keep dependencies up to date and scan for vulnerabilities
- Write comprehensive documentation and inline comments
- Use linting and formatting tools consistently
- Implement proper error handling and logging
- Regular code reviews and pair programming
- Monitor production metrics and set up alerts
---
## Level 3: Deep Dive Resources
### Official Documentation
- [AWS Lambda Developer Guide](https://docs.aws.amazon.com/lambda/)
- [Google Cloud Functions](https://cloud.google.com/functions/docs)
- [Azure Functions Documentation](https://learn.microsoft.com/azure/azure-functions/)
- [Serverless Framework](https://www.serverless.com/framework/docs)
- [AWS SAM](https://docs.aws.amazon.com/serverless-application-model/)
### Books and Courses
- **"Serverless Architectures on AWS"** by Peter Sbarski
- **"Production-Ready Serverless"** by Yan Cui
- **AWS Certified Solutions Architect** (includes serverless)
- **A Cloud Guru: AWS Lambda & Serverless**
### Tools and Libraries
- **AWS Lambda Powertools** (Python, TypeScript, Java)
- **Serverless Framework Plugins** (offline, webpack, prune)
- **Lumigo** (Observability and debugging)
- **Thundra** (APM for serverless)
- **AWS X-Ray** (Distributed tracing)
### Community Resources
- [Serverless Stack (SST)](https://sst.dev/)
- [Off-by-none Newsletter](https://offbynone.io/)
- [ServerlessLand Patterns](https://serverlessland.com/patterns)
- [AWS Samples GitHub](https://github.com/aws-samples)
### Bundled Resources
- `templates/lambda-function.py` - Production Lambda template
- `templates/sam-template.yaml` - SAM infrastructure template
- `templates/serverless.yml` - Serverless Framework config
- `templates/api-gateway.yaml` - API Gateway REST API
- `scripts/deploy-lambda.sh` - Automated deployment script
- `resources/serverless-patterns.md` - Common architecture patterns
This skill provides a concise, production-ready serverless development standard for starting projects quickly. It bundles battle-tested patterns, templates, and operational checklists focused on AWS Lambda and multi-cloud considerations. The goal is to help teams ship API backends, event-driven processors, and scheduled tasks with sensible defaults for performance, security, and cost.
The skill inspects common serverless concerns and supplies templates, configuration examples, and a checklist to enforce standards. It includes lifecycle guidance (init/invoke/shutdown), recommendations for cold-start optimization, observability patterns, and deployment options (SAM, Serverless Framework, CDK). It also offers integration points for CI/CD, monitoring, and downstream applications.
How do I reduce cold start latency?
Use provisioned concurrency for latency-sensitive endpoints, shrink deployment packages, lazy-load heavy libraries, and pick faster runtimes (Go, Rust, Python 3.11).
What should I do about long-running work?
Avoid long synchronous functions. Break work into async patterns: SQS, Step Functions, or background jobs. Use orchestration for retries and visibility.