multi-cloud-architecture skill

This skill helps you design cloud-agnostic multi-cloud architectures across AWS, Azure, and GCP using a decision framework.

This is most likely a fork of the multi-cloud-architecture skill from xfstudio
npx playbooks add skill sickn33/antigravity-awesome-skills --skill multi-cloud-architecture

SKILL.md
---
name: multi-cloud-architecture
description: Design multi-cloud architectures using a decision framework to select and integrate services across AWS, Azure, and GCP. Use when building multi-cloud systems, avoiding vendor lock-in, or leveraging best-of-breed services from multiple providers.
---

# Multi-Cloud Architecture

Decision framework and patterns for architecting applications across AWS, Azure, and GCP.

## Do not use this skill when

- The task is unrelated to multi-cloud architecture
- You need a different domain or tool outside this scope

## Instructions

- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.

## Purpose

Design cloud-agnostic architectures and make informed decisions about service selection across cloud providers.

## Use this skill when

- Designing multi-cloud strategies
- Migrating between cloud providers
- Selecting cloud services for specific workloads
- Implementing cloud-agnostic architectures
- Optimizing costs across providers

## Cloud Service Comparison

### Compute Services

| AWS | Azure | GCP | Use Case |
|-----|-------|-----|----------|
| EC2 | Virtual Machines | Compute Engine | IaaS VMs |
| ECS | Container Instances | Cloud Run | Containers |
| EKS | AKS | GKE | Kubernetes |
| Lambda | Azure Functions | Cloud Functions | Serverless |
| Fargate | Container Apps | Cloud Run | Managed containers |

### Storage Services

| AWS | Azure | GCP | Use Case |
|-----|-------|-----|----------|
| S3 | Blob Storage | Cloud Storage | Object storage |
| EBS | Managed Disks | Persistent Disk | Block storage |
| EFS | Azure Files | Filestore | File storage |
| S3 Glacier | Blob Storage (Archive tier) | Cloud Storage (Archive class) | Cold storage |

### Database Services

| AWS | Azure | GCP | Use Case |
|-----|-------|-----|----------|
| RDS | SQL Database | Cloud SQL | Managed SQL |
| DynamoDB | Cosmos DB | Firestore | NoSQL |
| Aurora | Azure Database for PostgreSQL/MySQL | Cloud Spanner | Distributed SQL |
| ElastiCache | Azure Cache for Redis | Memorystore | Caching |

**Reference:** See `references/service-comparison.md` for complete comparison
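
The comparison tables can also be kept as data so that equivalents are looked up rather than remembered. The following is a minimal sketch, assuming a hand-maintained mapping; `SERVICE_MAP` and `equivalent_service` are hypothetical names, and only a few rows from the tables are included.

```python
from typing import Optional

# A subset of rows from the comparison tables, keyed by use case.
# SERVICE_MAP is an illustrative structure, not part of any provider SDK.
SERVICE_MAP: dict[str, dict[str, str]] = {
    "Kubernetes": {"aws": "EKS", "azure": "AKS", "gcp": "GKE"},
    "Object storage": {"aws": "S3", "azure": "Blob Storage", "gcp": "Cloud Storage"},
    "Block storage": {"aws": "EBS", "azure": "Managed Disks", "gcp": "Persistent Disk"},
    "Managed SQL": {"aws": "RDS", "azure": "SQL Database", "gcp": "Cloud SQL"},
    "NoSQL": {"aws": "DynamoDB", "azure": "Cosmos DB", "gcp": "Firestore"},
}


def equivalent_service(service: str, target: str) -> Optional[str]:
    """Return the target provider's counterpart of `service`, or None if unknown."""
    for row in SERVICE_MAP.values():
        if service in row.values():
            return row.get(target)
    return None
```

For example, `equivalent_service("S3", "gcp")` returns `"Cloud Storage"`. The services in a row are rough counterparts, not drop-in replacements, so a lookup like this is a starting point for evaluation only.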

## Multi-Cloud Patterns

### Pattern 1: Single Provider with DR

- Primary workload in one cloud
- Disaster recovery in another
- Database replication across clouds
- Automated failover
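
The failover decision in this pattern can be sketched as a small rule: serve from the primary while it is healthy, and fail over to the DR site only if its replica is fresh enough. This is a simplified illustration, not a production failover controller; `Site`, `choose_active`, and the 30-second lag threshold are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    healthy: bool
    replication_lag_s: float  # how far this site's data trails the primary


def choose_active(primary: Site, dr: Site, max_lag_s: float = 30.0) -> str:
    """Pick the site that should serve traffic.

    The lag threshold is a hypothetical stand-in for a recovery point objective.
    """
    if primary.healthy:
        return primary.name
    if dr.healthy and dr.replication_lag_s <= max_lag_s:
        return dr.name
    raise RuntimeError("no site is safe to serve traffic")
```

Raising when neither site qualifies forces a human decision instead of silently serving stale data from a lagging replica.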

### Pattern 2: Best-of-Breed

- Use best service from each provider
- AI/ML on GCP
- Enterprise apps on Azure
- General compute on AWS

### Pattern 3: Geographic Distribution

- Serve users from nearest cloud region
- Data sovereignty compliance
- Global load balancing
- Regional failover
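
Region selection under this pattern combines three of the bullets above: latency, data sovereignty, and failover. The sketch below assumes latency measurements and jurisdiction labels are already available; `Region` and `pick_region` are hypothetical names, and a real deployment would normally delegate this to a global load balancer or DNS routing policy.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Region:
    provider: str
    name: str
    jurisdiction: str   # e.g. "EU" or "US"; labels are illustrative
    latency_ms: float   # measured from the user's location
    healthy: bool = True


def pick_region(regions: Iterable[Region],
                required_jurisdiction: Optional[str] = None) -> Region:
    """Return the lowest-latency healthy region that satisfies residency rules."""
    candidates = [
        r for r in regions
        if r.healthy
        and (required_jurisdiction is None or r.jurisdiction == required_jurisdiction)
    ]
    if not candidates:
        raise LookupError("no healthy region satisfies the residency constraint")
    return min(candidates, key=lambda r: r.latency_ms)
```

Marking a region unhealthy removes it from the candidates, so the next-nearest compliant region is chosen, which is the regional failover case.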

### Pattern 4: Cloud-Agnostic Abstraction

- Kubernetes for compute
- PostgreSQL for database
- S3-compatible storage (MinIO)
- Open source tools
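
Choosing among the four patterns can be framed as a simple rule. The priority order below is an assumption for illustration, not a rule stated by this skill, and `recommend_pattern` is a hypothetical helper; a real decision would also weigh team skills, cost, and compliance.

```python
def recommend_pattern(
    has_residency_requirements: bool = False,
    needs_specialized_services: bool = False,
    portability_is_priority: bool = False,
) -> str:
    """Map coarse requirements to one of the four patterns.

    The order of the checks is an assumed priority, chosen for the example.
    """
    if has_residency_requirements:
        return "Pattern 3: Geographic Distribution"
    if needs_specialized_services:
        return "Pattern 2: Best-of-Breed"
    if portability_is_priority:
        return "Pattern 4: Cloud-Agnostic Abstraction"
    # Default to the operationally simplest option.
    return "Pattern 1: Single Provider with DR"
```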

## Cloud-Agnostic Architecture

### Use Cloud-Native Alternatives

- **Compute:** Kubernetes (EKS/AKS/GKE)
- **Database:** PostgreSQL/MySQL (RDS/SQL Database/Cloud SQL)
- **Message Queue:** Apache Kafka (MSK / Event Hubs Kafka-compatible endpoint / Confluent Cloud)
- **Cache:** Redis (ElastiCache/Azure Cache/Memorystore)
- **Object Storage:** S3-compatible API
- **Monitoring:** Prometheus/Grafana
- **Service Mesh:** Istio/Linkerd
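
At the application level, the same idea means coding against a small interface instead of a provider SDK. The sketch below assumes a hypothetical `ObjectStore` interface with an in-memory backend for tests; a real backend would wrap an S3-compatible client behind the same two methods.

```python
from typing import Protocol


class ObjectStore(Protocol):
    """Minimal storage interface the application depends on (hypothetical)."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Test double; satisfies ObjectStore structurally, without inheritance."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def save_report(store: ObjectStore, report_id: str, body: str) -> str:
    """Application code: knows the interface, not the provider."""
    key = f"reports/{report_id}.txt"
    store.put(key, body.encode("utf-8"))
    return key
```

Because `save_report` only sees `ObjectStore`, switching providers means writing one new backend class; the calling code does not change.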

### Abstraction Layers

```
Application Layer
    ↓
Infrastructure Abstraction (Terraform)
    ↓
Cloud Provider APIs
    ↓
AWS / Azure / GCP
```

## Cost Comparison

### Compute Pricing Factors

- **AWS:** On-demand, Reserved, Spot, Savings Plans
- **Azure:** Pay-as-you-go, Reserved, Spot
- **GCP:** On-demand, Committed use discounts, Spot VMs (formerly Preemptible)

### Cost Optimization Strategies

1. Use reserved/committed capacity for steady workloads (savings of roughly 30-70% versus on-demand, depending on provider, term, and instance type)
2. Leverage spot/preemptible instances
3. Right-size resources
4. Use serverless for variable workloads
5. Optimize data transfer costs
6. Implement lifecycle policies
7. Use cost allocation tags
8. Monitor with cloud cost tools
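
Whether a commitment is worth it depends on utilization: a committed instance is billed for every hour, while an on-demand instance is billed only for hours used. The sketch below works through that arithmetic; the hourly rate and discount passed in are hypothetical inputs, not published prices.

```python
HOURS_PER_MONTH = 730  # common approximation: 8760 hours / 12 months


def on_demand_cost(hourly_rate: float, hours_used: float) -> float:
    """Monthly cost when paying only for hours actually used."""
    return hourly_rate * hours_used


def committed_cost(hourly_rate: float, discount: float) -> float:
    """Monthly cost of a commitment, billed for every hour at a discounted rate."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return hourly_rate * (1.0 - discount) * HOURS_PER_MONTH


def commitment_pays_off(hourly_rate: float, hours_used: float, discount: float) -> bool:
    """True when the commitment is cheaper than on-demand for this usage."""
    return committed_cost(hourly_rate, discount) < on_demand_cost(hourly_rate, hours_used)
```

The break-even point is a utilization of `1 - discount`: with a 30% discount, the commitment wins only if the instance runs more than 70% of the hours in the month.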

**Reference:** See `references/multi-cloud-patterns.md`

## Migration Strategy

### Phase 1: Assessment
- Inventory current infrastructure
- Identify dependencies
- Assess cloud compatibility
- Estimate costs
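
The dependency inventory from this phase can be turned into an ordering for later migration waves. The sketch below uses Python's standard `graphlib` module (Python 3.9+) and assumes that dependencies move before the workloads that rely on them, which is one possible policy and not the only one; `migration_waves` and the workload names in the usage note are hypothetical.

```python
from graphlib import TopologicalSorter


def migration_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group workloads into waves; each wave depends only on earlier waves.

    `depends_on` maps a workload to the workloads it requires.
    A dependency cycle raises graphlib.CycleError.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For an inventory such as `{"web": {"api"}, "api": {"db", "cache"}, "db": set(), "cache": set()}`, this yields three waves: `["cache", "db"]`, then `["api"]`, then `["web"]`.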

### Phase 2: Pilot
- Select pilot workload
- Implement in target cloud
- Test thoroughly
- Document learnings

### Phase 3: Migration
- Migrate workloads incrementally
- Maintain dual-run period
- Monitor performance
- Validate functionality

### Phase 4: Optimization
- Right-size resources
- Implement cloud-native services
- Optimize costs
- Enhance security

## Best Practices

1. **Use infrastructure as code** (Terraform/OpenTofu)
2. **Implement CI/CD pipelines** for deployments
3. **Design for failure** across clouds
4. **Use managed services** when possible
5. **Implement comprehensive monitoring**
6. **Automate cost optimization**
7. **Follow security best practices**
8. **Document cloud-specific configurations**
9. **Test disaster recovery** procedures
10. **Train teams** on multiple clouds

## Reference Files

- `references/service-comparison.md` - Complete service comparison
- `references/multi-cloud-patterns.md` - Architecture patterns

## Related Skills

- `terraform-module-library` - For IaC implementation
- `cost-optimization` - For cost management
- `hybrid-cloud-networking` - For connectivity

Overview

This skill designs multi-cloud architectures using a practical decision framework to select and integrate services across AWS, Azure, and GCP. It helps you evaluate trade-offs, pick best-of-breed services, and create cloud-agnostic designs that reduce vendor lock-in. The output includes patterns, migration phases, and actionable steps for implementation.

How this skill works

The skill clarifies goals, constraints, and required inputs, then applies a decision framework to map workloads to provider services and multi-cloud patterns. It compares compute, storage, and database options across providers and recommends patterns like single-provider with DR, best-of-breed, geographic distribution, or cloud-agnostic abstraction. It produces migration phases, cost optimization tactics, and validation steps to verify architecture and runbooks.

When to use it

  • Design a multi-cloud strategy to avoid vendor lock-in or leverage provider strengths
  • Migrate workloads between cloud providers or run dual-cloud operations
  • Select the most appropriate cloud services for specific workloads (compute, storage, database)
  • Build cloud-agnostic applications with portability and common abstractions
  • Optimize costs, fault tolerance, or regulatory compliance across regions/providers

Best practices

  • Start by clarifying goals, constraints, compliance, and failure scenarios before selecting services
  • Use infrastructure-as-code (Terraform/OpenTofu) and CI/CD pipelines to enforce consistency
  • Favor managed services for operational efficiency, but standardize on open APIs for portability
  • Design for failure: replicate critical data, automate failover, and test disaster recovery regularly
  • Adopt cloud-agnostic runtimes where appropriate (Kubernetes, PostgreSQL, S3-compatible storage) to reduce coupling
  • Track costs with tagging, reserved/committed capacity, and automated rightsizing; measure and iterate

Example use cases

  • Primary workload on AWS with automated disaster recovery in Azure for geographic resilience
  • Run AI/ML pipelines on GCP while hosting enterprise apps on Azure and general compute on AWS (best-of-breed)
  • Run Kubernetes clusters on each provider (EKS/AKS/GKE) with common CI/CD, monitoring (Prometheus/Grafana), and S3-compatible object storage
  • Migrate a legacy monolith: assessment, pilot in target cloud, incremental migration with dual-run validation
  • Implement global traffic routing to serve users from nearest cloud region for latency and sovereignty requirements

FAQ

How do I choose between a single-provider with DR and best-of-breed?

Use single-provider with DR when operational simplicity and unified support matter; choose best-of-breed when specific workload features or performance justify cross-provider complexity.

What are quick wins for multi-cloud cost optimization?

Enable reserved/committed pricing where predictable, use spot/preemptible instances for flexible workloads, right-size resources, and apply lifecycle policies for storage to reduce wasted spend.