
iac skill


This skill helps you master infrastructure as code with Terraform, Ansible, and CloudFormation for automated, reliable deployments.

npx playbooks add skill pluginagentmarketplace/custom-plugin-devops --skill iac

Review the files below or copy the command above to add this skill to your agents.

Files (7)
SKILL.md
2.5 KB
---
name: iac-skill
description: Infrastructure as Code with Terraform, Ansible, and CloudFormation.
sasmp_version: "1.3.0"
bonded_agent: 04-infrastructure-as-code
bond_type: PRIMARY_BOND

parameters:
  - name: tool
    type: string
    required: false
    enum: ["terraform", "ansible", "cloudformation", "pulumi"]
    default: "terraform"
  - name: operation
    type: string
    required: true
    enum: ["plan", "apply", "destroy", "validate", "import"]

retry_config:
  strategy: exponential_backoff
  initial_delay_ms: 1000
  max_retries: 3

observability:
  logging: structured
  metrics: enabled
---

# Infrastructure as Code Skill

## Overview
Master IaC with Terraform, Ansible, and CloudFormation for automated infrastructure.

## Parameters
| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
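| tool | string | No | terraform | IaC tool to use: `terraform`, `ansible`, `cloudformation`, or `pulumi` |
| operation | string | Yes | - | Operation to run: `plan`, `apply`, `destroy`, `validate`, or `import` |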
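
The parameter rules above can be checked before any tool is invoked. The following is a minimal sketch, not the skill's actual implementation: `validate_params` is a hypothetical helper name, and the allowed values are copied from the enums in the front matter.

```bash
# Hypothetical helper: validate `tool` and `operation` against the enums
# declared in the skill's front matter.
validate_params() {
    tool="${1:-terraform}"   # `tool` is optional and defaults to terraform
    operation="${2:-}"       # `operation` is required

    case "$tool" in
        terraform|ansible|cloudformation|pulumi) ;;
        *) echo "invalid tool: $tool" >&2; return 1 ;;
    esac

    case "$operation" in
        plan|apply|destroy|validate|import) ;;
        "") echo "operation is required" >&2; return 1 ;;
        *) echo "invalid operation: $operation" >&2; return 1 ;;
    esac

    # Print the normalized pair so callers can capture it.
    echo "$tool $operation"
}
```

For example, `validate_params "" plan` falls back to the default tool, while `validate_params terraform` fails because no operation was given.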

## Core Topics

### MANDATORY
- Terraform HCL syntax and providers
- State management and locking
- Modules and workspaces
- Ansible playbooks and roles
- Inventory management

### OPTIONAL
- CloudFormation templates
- Pulumi and CDK
- Testing IaC (terratest)
- Secret management

### ADVANCED
- Custom providers
- Complex module design
- Multi-cloud strategies
- Drift detection

## Quick Reference

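```bash
# Terraform
terraform init                           # download providers, configure the backend
terraform plan -out=plan.tfplan          # preview changes and save the plan
terraform apply plan.tfplan              # apply exactly the saved plan
terraform destroy                        # tear down ALL managed resources
terraform fmt -recursive                 # format .tf files in this directory tree
terraform validate                       # check configuration for internal consistency
terraform state list                     # list resources tracked in state
terraform import aws_instance.web i-123  # adopt an existing resource (address, then ID)

# State Management
terraform state mv old new               # move or rename a resource address in state
terraform state rm resource              # stop tracking a resource without destroying it
terraform force-unlock LOCK_ID           # release a stale lock; confirm nothing else is running

# Ansible
ansible-playbook -i inventory playbook.yml    # run a playbook against an inventory
ansible-playbook playbook.yml --check --diff  # dry run, showing what would change
ansible-playbook playbook.yml --tags nginx    # run only tasks tagged nginx
ansible all -m ping -i inventory              # check connectivity to all hosts
ansible-vault encrypt secrets.yml             # encrypt a secrets file in place
```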

## Troubleshooting

### Common Failures
| Symptom | Root Cause | Solution |
|---------|------------|----------|
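| State lock error | Another operation holds the lock, or a failed run left it behind | Wait for the other run to finish, or `terraform force-unlock LOCK_ID` if the lock is stale |
| Resource already exists | Resource was created outside Terraform or is missing from state | Import it into state, or delete it and let Terraform recreate it |
| Provider authentication error | Missing, expired, or wrong credentials | Check the active credentials (e.g., `AWS_PROFILE`) |
| Cycle error | Circular dependency between resources | Restructure references to break the cycle |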
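
For transient failures such as a lock briefly held by another run, the `retry_config` in the front matter (exponential backoff, 1000 ms initial delay, 3 retries) suggests a retry loop. How the agent runtime actually applies that config is not described here, so the following is only an illustrative sketch; `retry_with_backoff` is a hypothetical helper, and the delay is given in whole seconds.

```bash
# Hypothetical helper.
# Usage: retry_with_backoff MAX_RETRIES INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
    max_retries="$1"
    delay="$2"
    shift 2
    attempt=0
    while :; do
        # Run the command; stop as soon as it succeeds.
        "$@" && return 0
        if [ "$attempt" -ge "$max_retries" ]; then
            echo "giving up after $attempt retries: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        delay=$((delay * 2))   # double the wait before each new retry
    done
}
```

A call such as `retry_with_backoff 3 1 terraform plan -out=plan.tfplan` makes up to four attempts, waiting 1, 2, and 4 seconds between them. Retrying is safest for read-only steps like `plan`. For lock contention specifically, Terraform's own `-lock-timeout` option (for example `terraform plan -lock-timeout=60s`) may be simpler.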

### Debug Checklist
1. Validate: `terraform validate`
2. Check state: `terraform state list`
3. Debug: `TF_LOG=DEBUG terraform plan`
4. Verify credentials
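
The checklist above can be sketched as a single function that stops at the first failing step. This is an illustration under assumptions: `run_step` and `debug_checklist` are hypothetical names, and the credential check assumes the AWS provider with the AWS CLI installed.

```bash
# Hypothetical helper: run one labeled step and report if it fails.
run_step() {
    label="$1"
    shift
    echo "==> $label"
    "$@" || { echo "FAILED: $label" >&2; return 1; }
}

# Walk the debug checklist in order, stopping at the first failure.
debug_checklist() {
    run_step "validate configuration" terraform validate || return 1
    run_step "list resources in state" terraform state list || return 1
    # TF_LOG=DEBUG makes Terraform emit verbose logs for this run only.
    run_step "plan with debug logging" env TF_LOG=DEBUG terraform plan || return 1
    # Assumption: AWS credentials; this prints the identity currently in use.
    run_step "verify AWS credentials" aws sts get-caller-identity || return 1
}
```

Run `debug_checklist` from the directory containing the Terraform configuration; the label of the first failing step shows where to look.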

### Recovery Procedures

#### Corrupted State
1. Restore a previous version of the state file from the versioned S3 bucket
2. Or rebuild the state by running `terraform import` for each resource
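
The first option can be sketched with the AWS CLI, assuming the state lives in an S3 bucket with versioning enabled. The function names, bucket, key, and version ID are all hypothetical placeholders for your backend's values.

```bash
# Hypothetical helpers for inspecting older copies of a state object.

# List stored versions: version ID, last-modified time, and latest flag.
# Usage: list_state_versions BUCKET KEY
list_state_versions() {
    aws s3api list-object-versions \
        --bucket "$1" --prefix "$2" \
        --query 'Versions[].[VersionId,LastModified,IsLatest]' \
        --output text
}

# Download one specific version to a local file for inspection.
# Usage: fetch_state_version BUCKET KEY VERSION_ID OUTPUT_FILE
fetch_state_version() {
    aws s3api get-object \
        --bucket "$1" --key "$2" --version-id "$3" \
        "$4" >/dev/null
}
```

After checking the downloaded file, `terraform state push` can upload it to the backend. That command compares lineage and serial numbers and may reject an older copy, so back up the current state first and review Terraform's state documentation before overriding those checks.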

## Resources
- [Terraform Docs](https://developer.hashicorp.com/terraform/docs)
- [Ansible Docs](https://docs.ansible.com)

Overview

This skill teaches Infrastructure as Code (IaC) workflows using Terraform, Ansible, and CloudFormation to automate provisioning, configuration, and lifecycle management. It focuses on practical tooling: HCL syntax, state handling, modules, playbooks, inventories, and template design. The goal is repeatable, auditable infrastructure for CI/CD, deployments, and recovery scenarios.

How this skill works

The skill inspects IaC configurations and common operational patterns, offering command references, troubleshooting steps, and recovery procedures for state and drift issues. It guides you through init/plan/apply/destroy cycles, Ansible playbook runs, state management operations, and encrypting secrets with ansible-vault. It also highlights advanced topics like custom providers, multi-cloud strategies, and testing approaches.

When to use it

  • Bootstrapping cloud environments and standardized resources with Terraform
  • Configuring servers, roles, and deployments using Ansible playbooks and inventories
  • Managing and recovering Terraform state in remote backends (locks, restores, imports)
  • Automating CI/CD pipelines for infrastructure changes and safe rollouts
  • Documenting and modularizing infrastructure for reuse across teams

Best practices

  • Keep state in a remote backend with versioning and locking (e.g., S3 + DynamoDB)
  • Split infrastructure into reusable modules and separate environments with workspaces
  • Use terraform fmt and validate in CI to catch syntax and formatting issues early
  • Encrypt secrets with ansible-vault and avoid embedding credentials in code
  • Run terraform plan and Ansible's --check --diff mode before applying to preview changes and detect drift
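
As an illustration of the first practice, the sketch below prints an S3 backend block with DynamoDB locking. `write_backend_config` is a hypothetical generator and all four arguments are placeholders. Versioning itself is enabled on the bucket, not in this block, and locking options have changed across Terraform releases, so check the S3 backend documentation for your version.

```bash
# Hypothetical generator.
# Usage: write_backend_config BUCKET KEY REGION LOCK_TABLE
write_backend_config() {
    cat <<EOF
terraform {
  backend "s3" {
    bucket         = "$1"
    key            = "$2"
    region         = "$3"
    dynamodb_table = "$4"
    encrypt        = true
  }
}
EOF
}
```

One possible use is `write_backend_config my-state-bucket prod/terraform.tfstate us-east-1 tf-locks > backend.tf`, followed by `terraform init` to configure the backend.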

Example use cases

  • Create a VPC, subnets, and common security groups with Terraform modules for reuse
  • Deploy application servers with Ansible roles and inventory targeting by environment
  • Recover from a corrupted Terraform state by restoring a previous version from a versioned S3 bucket or by importing resources
  • Integrate terraform plan/apply steps into CI pipelines with automated approvals
  • Implement drift detection by comparing live infrastructure to the declared configuration and automating remediation
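
The drift detection use case can be sketched with `terraform plan -detailed-exitcode`, which exits with 0 when nothing would change, 1 on error, and 2 when the plan contains changes. `check_drift` is a hypothetical wrapper; note that exit code 2 covers both real drift and configuration changes that have not been applied yet.

```bash
# Hypothetical wrapper: report whether live infrastructure matches the config.
# Returns 0 when in sync, 2 when the plan contains changes, 1 on error.
check_drift() {
    plan_rc=0
    terraform plan -detailed-exitcode -input=false >/dev/null 2>&1 || plan_rc=$?
    case "$plan_rc" in
        0) echo "in sync" ;;
        2) echo "drift or pending changes"; return 2 ;;
        *) echo "plan failed" >&2; return 1 ;;
    esac
}
```

A scheduled CI job could call `check_drift` and alert, or trigger a reviewed apply, whenever it returns 2.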

FAQ

How do I handle a state lock caused by a failed operation?

If another operation is still running, wait for it to complete. If the lock was left behind by a failed or interrupted run, use terraform force-unlock with the lock ID, but only after confirming that no active process is modifying the state.

When should I import resources instead of recreating them?

Import when resources already exist in the cloud and you want to adopt them into IaC without downtime, or when deleting and recreating them would be disruptive.