home / skills / nilecui / skillsbase / deployment-pipeline-design

deployment-pipeline-design skill

needs review

/.cursor/skills/deployment-pipeline-design

This skill designs multi-stage CI/CD pipelines with approval gates, security checks, and deployment orchestration to accelerate safe software delivery.

npx playbooks add skill nilecui/skillsbase --skill deployment-pipeline-design

Review the files below or copy the command above to add this skill to your agents.

Files (1)

SKILL.md

8.3 KB

---
name: deployment-pipeline-design
description: Design multi-stage CI/CD pipelines with approval gates, security checks, and deployment orchestration. Use when architecting deployment workflows, setting up continuous delivery, or implementing GitOps practices.
---

# Deployment Pipeline Design

Architecture patterns for multi-stage CI/CD pipelines with approval gates and deployment strategies.

## Purpose

Design robust, secure deployment pipelines that balance speed with safety through proper stage organization and approval workflows.

## When to Use

- Design CI/CD architecture
- Implement deployment gates
- Configure multi-environment pipelines
- Establish deployment best practices
- Implement progressive delivery

## Pipeline Stages

### Standard Pipeline Flow

```
┌─────────┐   ┌──────┐   ┌─────────┐   ┌────────┐   ┌──────────┐
│  Build  │ → │ Test │ → │ Staging │ → │ Approve│ → │Production│
└─────────┘   └──────┘   └─────────┘   └────────┘   └──────────┘
```

### Detailed Stage Breakdown

1. **Source** - Code checkout
2. **Build** - Compile, package, containerize
3. **Test** - Unit, integration, security scans
4. **Staging Deploy** - Deploy to staging environment
5. **Integration Tests** - E2E, smoke tests
6. **Approval Gate** - Manual approval required
7. **Production Deploy** - Canary, blue-green, rolling
8. **Verification** - Health checks, monitoring
9. **Rollback** - Automated rollback on failure

## Approval Gate Patterns

### Pattern 1: Manual Approval

```yaml
# GitHub Actions
production-deploy:
  needs: staging-deploy
  environment:
    name: production
    url: https://app.example.com
  runs-on: ubuntu-latest
  steps:
    - name: Deploy to production
      run: |
        # Deployment commands
```

### Pattern 2: Time-Based Approval

```yaml
# GitLab CI
deploy:production:
  stage: deploy
  script:
    - deploy.sh production
  environment:
    name: production
  when: delayed
  start_in: 30 minutes
  only:
    - main
```

### Pattern 3: Multi-Approver

```yaml
# Azure Pipelines
stages:
- stage: Production
  dependsOn: Staging
  jobs:
  - deployment: Deploy
    environment:
      name: production
      resourceType: Kubernetes
    strategy:
      runOnce:
        preDeploy:
          steps:
          - task: ManualValidation@0
            inputs:
              notifyUsers: '[email protected]'
              instructions: 'Review staging metrics before approving'
```

**Reference:** See `assets/approval-gate-template.yml`

## Deployment Strategies

### 1. Rolling Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 1
```

**Characteristics:**
- Gradual rollout
- Zero downtime
- Easy rollback
- Best for most applications

### 2. Blue-Green Deployment

```yaml
# Blue (current)
kubectl apply -f blue-deployment.yaml
kubectl label service my-app version=blue

# Green (new)
kubectl apply -f green-deployment.yaml
# Test green environment
kubectl label service my-app version=green

# Rollback if needed
kubectl label service my-app version=blue
```

**Characteristics:**
- Instant switchover
- Easy rollback
- Doubles infrastructure cost temporarily
- Good for high-risk deployments

### 3. Canary Deployment

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app
spec:
  replicas: 10
  strategy:
    canary:
      steps:
      - setWeight: 10
      - pause: {duration: 5m}
      - setWeight: 25
      - pause: {duration: 5m}
      - setWeight: 50
      - pause: {duration: 5m}
      - setWeight: 100
```

**Characteristics:**
- Gradual traffic shift
- Risk mitigation
- Real user testing
- Requires service mesh or similar

### 4. Feature Flags

```python
from flagsmith import Flagsmith

flagsmith = Flagsmith(environment_key="API_KEY")

if flagsmith.has_feature("new_checkout_flow"):
    # New code path
    process_checkout_v2()
else:
    # Existing code path
    process_checkout_v1()
```

**Characteristics:**
- Deploy without releasing
- A/B testing
- Instant rollback
- Granular control

## Pipeline Orchestration

### Multi-Stage Pipeline Example

```yaml
name: Production Pipeline

on:
  push:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build application
        run: make build
      - name: Build Docker image
        run: docker build -t myapp:${{ github.sha }} .
      - name: Push to registry
        run: docker push myapp:${{ github.sha }}

  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Unit tests
        run: make test
      - name: Security scan
        run: trivy image myapp:${{ github.sha }}

  deploy-staging:
    needs: test
    runs-on: ubuntu-latest
    environment:
      name: staging
    steps:
      - name: Deploy to staging
        run: kubectl apply -f k8s/staging/

  integration-test:
    needs: deploy-staging
    runs-on: ubuntu-latest
    steps:
      - name: Run E2E tests
        run: npm run test:e2e

  deploy-production:
    needs: integration-test
    runs-on: ubuntu-latest
    environment:
      name: production
    steps:
      - name: Canary deployment
        run: |
          kubectl apply -f k8s/production/
          kubectl argo rollouts promote my-app

  verify:
    needs: deploy-production
    runs-on: ubuntu-latest
    steps:
      - name: Health check
        run: curl -f https://app.example.com/health
      - name: Notify team
        run: |
          curl -X POST ${{ secrets.SLACK_WEBHOOK }} \
            -d '{"text":"Production deployment successful!"}'
```

## Pipeline Best Practices

1. **Fail fast** - Run quick tests first
2. **Parallel execution** - Run independent jobs concurrently
3. **Caching** - Cache dependencies between runs
4. **Artifact management** - Store build artifacts
5. **Environment parity** - Keep environments consistent
6. **Secrets management** - Use secret stores (Vault, etc.)
7. **Deployment windows** - Schedule deployments appropriately
8. **Monitoring integration** - Track deployment metrics
9. **Rollback automation** - Auto-rollback on failures
10. **Documentation** - Document pipeline stages

## Rollback Strategies

### Automated Rollback

```yaml
deploy-and-verify:
  steps:
    - name: Deploy new version
      run: kubectl apply -f k8s/

    - name: Wait for rollout
      run: kubectl rollout status deployment/my-app

    - name: Health check
      id: health
      run: |
        for i in {1..10}; do
          if curl -sf https://app.example.com/health; then
            exit 0
          fi
          sleep 10
        done
        exit 1

    - name: Rollback on failure
      if: failure()
      run: kubectl rollout undo deployment/my-app
```

### Manual Rollback

```bash
# List revision history
kubectl rollout history deployment/my-app

# Rollback to previous version
kubectl rollout undo deployment/my-app

# Rollback to specific revision
kubectl rollout undo deployment/my-app --to-revision=3
```

## Monitoring and Metrics

### Key Pipeline Metrics

- **Deployment Frequency** - How often deployments occur
- **Lead Time** - Time from commit to production
- **Change Failure Rate** - Percentage of failed deployments
- **Mean Time to Recovery (MTTR)** - Time to recover from failure
- **Pipeline Success Rate** - Percentage of successful runs
- **Average Pipeline Duration** - Time to complete pipeline

### Integration with Monitoring

```yaml
- name: Post-deployment verification
  run: |
    # Wait for metrics stabilization
    sleep 60

    # Check error rate
    ERROR_RATE=$(curl -s "$PROMETHEUS_URL/api/v1/query?query=rate(http_errors_total[5m])" | jq '.data.result[0].value[1]')

    if (( $(echo "$ERROR_RATE > 0.01" | bc -l) )); then
      echo "Error rate too high: $ERROR_RATE"
      exit 1
    fi
```

## Reference Files

- `references/pipeline-orchestration.md` - Complex pipeline patterns
- `assets/approval-gate-template.yml` - Approval workflow templates

## Related Skills

- `github-actions-templates` - For GitHub Actions implementation
- `gitlab-ci-patterns` - For GitLab CI implementation
- `secrets-management` - For secrets handling

Overview

This skill designs multi-stage CI/CD pipelines that balance delivery speed with deployment safety. It defines stage organization, approval gates, security checks, progressive delivery strategies, and rollback patterns for reliable production rollouts. Use it to architect pipelines across GitHub Actions, GitLab CI, Azure Pipelines, or Kubernetes-based deployments.

How this skill works

The skill inspects desired environments, test suites, and release risk to recommend a staged flow: source → build → test → staging → approval → production → verification → rollback. It maps approval patterns (manual, time-delayed, multi-approver) and deployment strategies (rolling, blue-green, canary, feature flags) to implementation snippets and orchestration patterns. It also integrates monitoring checks, automated rollback triggers, and artifact/secrets handling guidance.

When to use it

Architect a new CI/CD workflow for multiple environments
Add approval gates or multi-approver controls before production
Implement progressive delivery (canary, blue-green, rolling)
Define automated rollback and verification steps
Adopt GitOps practices or convert pipelines to declarative orchestration

Best practices

Fail fast: run quick unit and lint checks early in the pipeline
Parallelize independent jobs to reduce lead time
Keep environment parity between staging and production
Integrate security scans and secret stores into pipeline steps
Automate verification and rollback based on health and metrics
Document pipeline stages, deployment windows, and owner responsibilities

Example use cases

Design a pipeline that builds, scans, deploys to staging, runs E2E tests, then requires manual approval for production
Implement a canary rollout using Argo Rollouts with timed weight increases and automated metrics checks
Add a multi-approver manual validation step in Azure Pipelines for high-risk releases
Configure time-delayed production deployment in GitLab for scheduled releases
Use feature flags to ship code behind switches for controlled exposure and instant rollback

FAQ

How do I choose between canary, blue-green, and rolling?

Choose based on risk and cost: rolling works for most apps with zero downtime; blue-green gives instant switchover but doubles infra temporarily; canary reduces risk by shifting traffic gradually and is best when you can observe metrics during rollout.

When should I require manual approval?

Require manual approval for high-risk changes (database migrations, breaking API changes) or when business stakeholders must sign off. Use multi-approver gates for critical systems and time-delays for scheduled windows.

How do I automate rollback safely?

Automate rollback by adding health and metric checks post-deploy; if thresholds (error rate, latency, failed health probes) breach, trigger a rollback command or undo deployment. Keep playbooks and quick manual rollback commands as backups.