
infrastructure-software-upgrades skill


This skill guides you through reliable infrastructure upgrades by identifying current versions, researching changes, and validating rollback and tests.

npx playbooks add skill plurigrid/asi --skill infrastructure-software-upgrades

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
1.9 KB
---
name: infrastructure-software-upgrades
description: Generic guidelines on how to perform infrastructure component upgrades in a reliable way
license: MIT
tags:
  - upgrades
  - maintenance
  - devops
  - operations
metadata:
  author: Stakpak <[email protected]>
  version: "1.0.2"
---

When upgrading infrastructure components such as Kubernetes clusters, Jenkins, Kubernetes operators, or databases, always consider the following:

1) Find the current version of the component you want to upgrade

2) Research the newer versions of this component that are available online. Prefer stable versions in production, unless the goal is to test newer unstable versions or there is another reason to install a nightly/unstable build; this depends on why the user wants the upgrade in the first place

3) Once you know the target version, research the changelogs of every version between the current version and the target version. You must check each version in the upgrade sequence for breaking changes; this is crucial to make sure you do not miss any important steps

4) Identify any potential breaking changes, dependencies, and specific steps or tools required while performing the upgrade

5) Always try to perform upgrades in a non-destructive and reversible manner: take backups or perform rolling deployments when necessary to avoid downtime. If the user wants a YOLO upgrade anyway, still keep a rollback plan ready

6) Write down the upgrade plan in a markdown file for future reference or for approval when needed

7) Once you get confirmation or approval on the plan, proceed with the execution, keeping in mind that you might need to revisit and update the plan as you hit unexpected issues during the upgrade

8) After you're done, you MUST test that the component and its dependencies are functioning properly: check health checks and functionality if possible (you are not truly done until you do this)
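Step 3 asks for a changelog review of every release between the current and target versions. A minimal Python sketch of building that checklist is shown below. It assumes purely numeric dotted version strings, and the release list and the `parse_version` and `versions_to_review` helpers are hypothetical names used only for illustration; real components may use other versioning schemes.

```python
from typing import List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.28.3' or 'v1.28.3' into (1, 28, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def versions_to_review(available: List[str], current: str, target: str) -> List[str]:
    """Return every release after `current` up to and including `target`, oldest first."""
    low, high = parse_version(current), parse_version(target)
    if high <= low:
        raise ValueError("target must be newer than current")
    in_range = [v for v in available if low < parse_version(v) <= high]
    return sorted(in_range, key=parse_version)


# Hypothetical release list, deliberately unordered.
releases = ["2.4.0", "2.10.1", "2.5.0", "2.9.0", "3.0.0", "2.3.7"]
for version in versions_to_review(releases, current="2.4.0", target="2.10.1"):
    print(f"[ ] read changelog for {version}")
```

Comparing parsed tuples rather than raw strings matters here: as plain text, "2.10.1" would sort before "2.9.0" and the checklist would come out in the wrong order.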

Overview

This skill provides clear, practical guidelines for performing infrastructure component upgrades reliably and safely. It focuses on version discovery, impact analysis, reversible execution, and post-upgrade verification to minimize downtime and avoid regressions. Use it to plan upgrades for clusters, orchestration tools, databases, CI systems, and operators.

How this skill works

The skill guides you through discovering the current and target versions, researching intermediate release notes for breaking changes, and identifying dependencies and required tools. It emphasizes creating a reversible upgrade plan, obtaining approval, executing with observability, and verifying health and functionality after the upgrade. The approach mixes research, documentation, and cautious execution to reduce risk.
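One way to picture a reversible upgrade plan is as a list of steps that each carry their own undo action. The sketch below is illustrative only: the `Step` class, `run_plan`, and the simulated steps are hypothetical, and a real plan would call the component's actual backup, upgrade, and restore tooling.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    apply: Callable[[], None]
    rollback: Callable[[], None]


def run_plan(steps: List[Step]) -> bool:
    """Apply steps in order; on failure, undo the completed steps in reverse order."""
    done: List[Step] = []
    for step in steps:
        try:
            step.apply()
            done.append(step)
        except Exception as error:
            print(f"step '{step.name}' failed: {error}")
            for finished in reversed(done):
                finished.rollback()
            return False
    return True


log: List[str] = []


def fail_migration() -> None:
    raise RuntimeError("simulated failure")


# Simulated plan: the third step fails, which triggers the rollback path.
plan = [
    Step("snapshot", lambda: log.append("snapshot taken"),
         lambda: log.append("snapshot kept for restore")),
    Step("upgrade", lambda: log.append("upgraded"),
         lambda: log.append("downgraded")),
    Step("migrate", fail_migration, lambda: log.append("migration reverted")),
]
ok = run_plan(plan)
print(ok, log)
```

In this sketch only steps that completed are rolled back; whether a half-applied step also needs cleanup depends on the component and should be decided in the plan itself.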

When to use it

  • Upgrading Kubernetes clusters, operators, or control plane components
  • Upgrading CI/CD platforms like Jenkins or build infrastructure
  • Applying database engine or schema version upgrades
  • Performing rolling upgrades for stateful services in production
  • Testing upgrades in non-production environments before a wide rollout

Best practices

  • Verify current version and precisely define the target version before any changes
  • Read changelogs for every intermediate release to catch breaking changes
  • Prefer stable production releases unless testing unstable features is required
  • Design reversible, non-destructive steps: backups, snapshots, and rolling updates
  • Document the upgrade plan and required approvals before execution
  • Validate health checks and end-to-end functionality after completing the upgrade
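The last practice, validating health after the upgrade, often amounts to polling a probe until the component reports healthy or a retry budget runs out. The following is a rough sketch under that assumption; `wait_until_healthy` and the simulated probe are hypothetical, and a real probe would query the component's own health endpoint or status command.

```python
import time
from typing import Callable


def wait_until_healthy(probe: Callable[[], bool], attempts: int = 5,
                       delay_seconds: float = 2.0) -> bool:
    """Call `probe` until it returns True or the attempts are exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            if probe():
                return True
        except Exception as error:
            # A probe that raises is treated as "not healthy yet".
            print(f"attempt {attempt}: probe raised {error}")
        if attempt < attempts:
            time.sleep(delay_seconds)
    return False


# Simulated probe: unhealthy twice, then healthy.
responses = iter([False, False, True])
healthy = wait_until_healthy(lambda: next(responses), attempts=5, delay_seconds=0.01)
print("healthy" if healthy else "unhealthy: consider rollback")
```

A bounded retry count keeps the check from hanging indefinitely, so a persistent failure surfaces as a clear signal to consult the rollback plan.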

Example use cases

  • Planning a Kubernetes minor version upgrade, including kubeadm, kubelet, and control plane components
  • Upgrading a database engine with pre-checks for deprecated features and a tested rollback path
  • Updating Jenkins and plugins: check plugin compatibility across all intermediate versions
  • Rolling operator upgrade in a cluster: review CRD changes and operator-specific migration steps
  • Staged rollout from staging to production with documented approvals and automated health checks

FAQ

How do I pick a target version?

Prefer stable, supported releases for production. If you need new features, choose a tested release and include extra validation in the plan.
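As a rough illustration of preferring stable releases, the sketch below picks the newest version that carries no pre-release suffix. It assumes tags of the form `MAJOR.MINOR.PATCH` with suffixes such as `-rc.1` marking unstable builds; that naming convention, the `latest_stable` helper, and the sample tags are assumptions, and some projects signal stability differently.

```python
import re
from typing import List, Optional

# Matches plain 'X.Y.Z' (optionally prefixed with 'v'); anything with a suffix is skipped.
STABLE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def latest_stable(candidates: List[str]) -> Optional[str]:
    """Return the highest version without a pre-release suffix, or None if there is none."""
    stable = []
    for name in candidates:
        match = STABLE.match(name)
        if match:
            stable.append((tuple(int(group) for group in match.groups()), name))
    if not stable:
        return None
    return max(stable)[1]


# Hypothetical tags for illustration.
tags = ["1.9.0", "1.10.0-rc.1", "1.10.0", "1.11.0-beta.2"]
print(latest_stable(tags))
```

Whether the newest stable release is also still supported by the vendor is a separate check that this sketch does not cover.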

What if the changelog is unclear about breaking changes?

Test the upgrade in an isolated environment, reach out to vendor/community channels, and add conservative rollback points to the plan.