ops-verify-health skill

/expo/copy-overwrite/.claude/skills/ops-verify-health

This skill automates health checks for frontend and backend services across environments, reporting response times and status to guide remediation.

npx playbooks add skill codyswanngt/lisa --skill ops-verify-health

SKILL.md

---
name: ops-verify-health
description: Health check all services across environments. Checks frontend URLs, backend GraphQL endpoints, and reports response times.
allowed-tools:
  - Bash
  - Read
---

# Ops: Verify Health

Health check all services across environments.

**Argument**: `$ARGUMENTS` — environment(s) to check (default: `all`). Options: `local`, `dev`, `staging`, `production`, `all`

## Discovery

Read these files to build the environment URL table:

1. `e2e/constants.ts` — frontend URLs per environment
2. `.env.localhost`, `.env.development`, `.env.staging`, `.env.production` — `EXPO_PUBLIC_GRAPHQL_BASE_URL` for backend GraphQL URLs
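
As a rough sketch of the second discovery step, the helper below pulls `EXPO_PUBLIC_GRAPHQL_BASE_URL` out of one env file. The function name `read_graphql_url` is hypothetical, and it assumes plain `KEY=value` lines with optional surrounding quotes; files with `export` prefixes or multi-line values would need a real dotenv parser.

```bash
# Hypothetical helper: print the value of EXPO_PUBLIC_GRAPHQL_BASE_URL
# from a dotenv-style file. Returns non-zero if the file does not exist.
read_graphql_url() {
  local env_file=$1
  [ -f "$env_file" ] || return 1
  # Take the last matching assignment, keep everything after the first "=",
  # then strip one leading and one trailing quote character if present.
  grep -E '^EXPO_PUBLIC_GRAPHQL_BASE_URL=' "$env_file" \
    | tail -1 \
    | cut -d= -f2- \
    | sed -e 's/^["'\'']//' -e 's/["'\'']$//'
}
```

For example, `read_graphql_url .env.staging` would print the staging GraphQL URL, or nothing if the variable is not set in that file.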

## Health Checks

For each environment, run these checks:

### Frontend Check

```bash
curl -sf -o /dev/null --max-time 10 -w "HTTP %{http_code} in %{time_total}s\n" "{frontend_url}"
```

Verify that the response contains Expo/React markers:

```bash
curl -sf {frontend_url} | head -20
```
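
Eyeballing the first 20 lines works for manual runs; for scripted runs, a pattern match may be easier to act on. The sketch below is one possible approach. The function name `has_app_markers` and the pattern itself are assumptions, since the exact markers depend on what the app actually serves.

```bash
# Hypothetical helper: succeed if the HTML on stdin contains a likely
# React/Expo web marker. The pattern is a guess; adjust it per project.
has_app_markers() {
  grep -qiE 'id="root"|expo|react'
}
```

A possible invocation is `curl -s --max-time 10 "{frontend_url}" | has_app_markers && echo "markers found"`.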

### Backend GraphQL Check

```bash
curl -sf {graphql_url} -X POST \
  -H "Content-Type: application/json" \
  -d '{"query":"{ __typename }"}' \
  -w "\nHTTP %{http_code} in %{time_total}s\n"
```
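
A GraphQL server can answer HTTP 200 and still return only an `errors` payload, so it may be worth checking the body as well as the status code. The following is a minimal sketch, assuming the body is piped in on stdin; `graphql_body_ok` is a hypothetical name, and the check uses `grep` rather than a JSON parser, so it is deliberately loose.

```bash
# Hypothetical helper: succeed if a GraphQL response body (stdin)
# mentions both a "data" key and the "__typename" field.
# This is a text match, not JSON validation.
graphql_body_ok() {
  local body
  body=$(cat)
  printf '%s\n' "$body" | grep -q '"data"' &&
    printf '%s\n' "$body" | grep -q '"__typename"'
}
```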

### GraphQL Introspection (detailed check)

```bash
curl -sf {graphql_url} -X POST \
  -H "Content-Type: application/json" \
  -d '{"query":"{ __schema { queryType { name } } }"}' \
  -w "\nHTTP %{http_code} in %{time_total}s\n"
```

## Full Health Check Script

Run all checks for the specified environment(s):

```bash
check_env() {
  local ENV=$1
  local FE_URL=$2
  local BE_URL=$3

  echo "=== $ENV ==="

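  local FE_RESULT FE_STATUS FE_TIME BE_RESULT BE_STATUS BE_TIME

  # Frontend: a single request yields both status and timing.
  # No -f and no "|| echo" fallback: curl prints the -w output even when the
  # request fails (status 000 if there was no response), so a fallback would
  # be appended to it and corrupt the value (e.g. "404000").
  FE_RESULT=$(curl -s -o /dev/null -w "%{http_code} %{time_total}" --max-time 10 "$FE_URL" 2>/dev/null)
  FE_STATUS=${FE_RESULT%% *}
  FE_TIME=${FE_RESULT##* }
  echo "Frontend: HTTP ${FE_STATUS:-000} (${FE_TIME:-n/a}s)"

  # Backend: the response body comes first; the -w line is always last,
  # so the real status (e.g. 500) is reported instead of being masked as 000.
  BE_RESULT=$(curl -s --max-time 10 "$BE_URL" -X POST \
    -H "Content-Type: application/json" \
    -d '{"query":"{ __typename }"}' \
    -w "\n%{http_code} %{time_total}" 2>/dev/null)
  BE_STATUS=$(echo "$BE_RESULT" | tail -1 | awk '{print $1}')
  BE_TIME=$(echo "$BE_RESULT" | tail -1 | awk '{print $2}')
  echo "Backend:  HTTP ${BE_STATUS:-000} (${BE_TIME:-n/a}s)"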
  echo ""
}
```
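
The script above only defines `check_env`; something still has to expand `$ARGUMENTS` and call it once per environment. One way to do that is sketched below. The name `resolve_envs` is hypothetical, and the comma- or space-separated input format is an assumption, since the skill does not say how multiple environments are passed.

```bash
# Hypothetical helper: expand the skill argument into a space-separated
# list of environments. An empty argument or "all" selects every environment.
resolve_envs() {
  local arg=${1:-all}
  if [ "$arg" = "all" ]; then
    echo "local dev staging production"
  else
    # Accept "dev,staging", "dev, staging" or "dev staging".
    echo "$arg" | tr ',' ' ' | tr -s ' '
  fi
}

# Possible usage, with URLs taken from the discovery table:
#   for env in $(resolve_envs "$ARGUMENTS"); do
#     check_env "$env" "$fe_url" "$be_url"
#   done
```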

## EAS Update Status

Check the latest OTA updates deployed to each branch:

```bash
eas update:list --branch {env} --limit 3
```

## Output Format

Report results as a table:

| Service | Environment | Status | Response Time | Details |
|---------|-------------|--------|---------------|---------|
| Frontend | dev | UP (200) | 0.45s | HTML contains React root |
| Backend | dev | UP (200) | 0.32s | GraphQL responds `__typename` |

Flag any service with status != 200 or response time > 5s as a concern.
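
The flagging rule can be expressed as a small predicate. This is a sketch under the thresholds stated above; `is_concern` is a hypothetical name, and treating a missing or non-numeric timing as a concern is an added assumption.

```bash
# Hypothetical helper: succeed (exit 0) if a result should be flagged.
# $1 = HTTP status code, $2 = response time in seconds.
is_concern() {
  local status=$1 time=$2
  # Any status other than 200 is a concern.
  [ "$status" = "200" ] || return 0
  # A missing or non-numeric timing is treated as a concern too.
  case "$time" in
    ''|*[!0-9.]*) return 0 ;;
  esac
  # The shell cannot compare decimals, so let awk do it.
  awk -v t="$time" 'BEGIN { exit !(t > 5) }'
}
```

For instance, `is_concern "$BE_STATUS" "$BE_TIME" && echo "Backend needs attention"` could mark a row before it goes into the table.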

Overview

This skill runs automated health checks across all environments to verify frontend URLs and backend GraphQL endpoints and to report response times. It discovers environment endpoints from project constants and .env files, then executes quick HTTP and GraphQL probes to surface status and timing metrics. Results are presented in a concise table highlighting failures and slow responses.

How this skill works

The skill reads frontend URLs from environment constants and GraphQL base URLs from environment files to build an environment table. For each environment it performs a frontend HTTP probe and inspects the first lines of HTML for framework markers. It posts a minimal GraphQL query for basic liveness and optionally runs an introspection query for schema validation. Response codes and timings are collected and flagged when status != 200 or time > 5s.

When to use it

Before or after deployments to verify services are reachable in each environment.
During CI/CD pipeline stages to gate promotions based on service health.
For periodic monitoring or on-demand checks when investigating incidents.
When validating new environment configuration or DNS changes.

Best practices

Limit max-time for curl probes (e.g., 10s) to avoid long-running checks.
Run the lightweight __typename query for liveness and use introspection only when deeper validation is needed.
Flag any non-200 status or response time over 5s as a concern and surface details in the report.
Keep environment URL mappings current in constants and .env files so checks remain accurate.
Include `eas update:list` to verify OTA deployments for each branch when relevant.

Example use cases

Run health checks across local, dev, staging, and production before a release.
Integrate into CI to prevent promoting a build if the frontend or backend is down.
Use during incident triage to quickly determine which environments or services are affected.
Schedule nightly checks and send the generated table to the ops channel for review.

FAQ

What endpoints are probed?

The skill probes frontend URLs and backend GraphQL base URLs discovered from project constants and environment files.

How are slow or failing services flagged?

Any service returning a status other than 200 or taking longer than 5 seconds is flagged as a concern in the report.