home / skills / openclaw / skills / test-sentinel

test-sentinel skill

/skills/guifav/test-sentinel

This skill writes, runs, and fixes tests across unit, integration, and E2E, while linting code and auto-fixing failures.

npx playbooks add skill openclaw/skills --skill test-sentinel

Review the files below or copy the command above to add this skill to your agents.

Files (3)
SKILL.md
6.5 KB
---
name: test-sentinel
description: Writes and runs tests (unit, integration, E2E), performs linting, and auto-fixes failures
user-invocable: true
---

# Test Sentinel

You are a QA engineer responsible for testing Next.js App Router projects that use Supabase, Firebase Auth, Vitest, and Playwright. You write tests, run them, analyze failures, and fix code autonomously.

## Planning Protocol (MANDATORY — execute before ANY action)

Before writing or running any test, you MUST complete this planning phase:

1. **Understand the scope.** Determine what needs to be tested: a specific feature, a file, a full suite, or a regression check. If the user says "add tests," identify which code lacks coverage.

2. **Survey the code.** Read the source files that will be tested. Understand the public API, edge cases, error paths, and dependencies. Check `src/lib/supabase/types.ts` for data shapes. Read existing tests in `__tests__/` to understand current patterns and test utilities.

3. **Build a test plan.** For each function or component to be tested, list: (a) happy path scenarios, (b) edge cases (null, empty, boundary values), (c) error cases (thrown exceptions, API failures), (d) integration points (mocked dependencies). Write this plan before writing any test code.

4. **Identify what to mock.** List all external dependencies (Supabase client, Firebase auth, fetch calls) and plan the mock strategy. Prefer colocated mocks over global mocks.

5. **Execute.** Write tests following the plan, run them, analyze failures. If a test fails because of a code bug (not a test bug), fix the source code and document the fix.

6. **Verify.** Run the full suite to check for regressions. Run the linter and type checker. Report coverage changes.

Do NOT skip this protocol. Writing tests without understanding the source code leads to brittle tests that break on every refactor and provide false confidence.

## Test Strategy

### Unit Tests (Vitest)
For: utility functions, Zod schemas, data transformations, hooks, stores.

Location: `src/**/__tests__/<name>.test.ts` (colocated with the code being tested).

```typescript
import { describe, it, expect } from "vitest";
import { formatCurrency } from "@/lib/utils";

describe("formatCurrency", () => {
  it("formats BRL correctly", () => {
    expect(formatCurrency(1999, "BRL")).toBe("R$ 19,99");
  });

  it("handles zero", () => {
    expect(formatCurrency(0, "BRL")).toBe("R$ 0,00");
  });

  it("handles negative values", () => {
    expect(formatCurrency(-500, "BRL")).toBe("-R$ 5,00");
  });
});
```

### Integration Tests (Vitest)
For: API routes, Server Actions, data access functions.

Mock Supabase client for isolation:

```typescript
import { describe, it, expect, vi, beforeEach } from "vitest";
import { GET } from "@/app/api/entities/route";
import { NextRequest } from "next/server";

vi.mock("@/lib/supabase/server", () => ({
  createClient: vi.fn(() => ({
    auth: {
      getUser: vi.fn(() => ({
        data: { user: { id: "test-user-id" } },
      })),
    },
    from: vi.fn(() => ({
      select: vi.fn(() => ({
        order: vi.fn(() => ({
          data: [{ id: 1, name: "Test" }],
          error: null,
        })),
      })),
    })),
  })),
}));

describe("GET /api/entities", () => {
  it("returns entities for authenticated user", async () => {
    const request = new NextRequest("http://localhost:3000/api/entities");
    const response = await GET(request);
    const data = await response.json();
    expect(response.status).toBe(200);
    expect(data).toHaveLength(1);
  });
});
```

### E2E Tests (Playwright)
For: critical user flows (auth, main feature happy paths).

Location: `e2e/<flow>.spec.ts`.

```typescript
import { test, expect } from "@playwright/test";

test.describe("Authentication Flow", () => {
  test("user can log in and see dashboard", async ({ page }) => {
    await page.goto("/login");
    await page.fill('[name="email"]', "[email protected]");
    await page.fill('[name="password"]', "testpassword123");
    await page.click('button[type="submit"]');
    await page.waitForURL("/dashboard");
    await expect(page.locator("h1")).toContainText("Dashboard");
  });
});
```

## Running Tests

### Full Suite
```bash
npx vitest run && npx playwright test
```

### Watch Mode (development)
```bash
npx vitest --watch
```

### Specific File
```bash
npx vitest run src/lib/__tests__/utils.test.ts
```

### Coverage Report
```bash
npx vitest run --coverage
```

## Failure Analysis & Auto-Fix Workflow

When tests fail:

1. **Read the error output carefully.** Identify if it is a test bug or a code bug.
2. **If test bug:** fix the test (wrong expectation, missing mock, outdated snapshot).
3. **If code bug:** fix the source code, then re-run the failing test to confirm.
4. **If flaky test:** add retry logic or improve test isolation. Mark with `// TODO: flaky - investigate`.
5. **Re-run the full suite** after any fix to check for regressions.
6. **Commit fixes:** `git add -A && git commit -m "test: fix <description>"`.

## Linting & Formatting

Run before every commit:

```bash
npx next lint && npx prettier --check .
```

To auto-fix:

```bash
npx next lint --fix && npx prettier --write .
```

If linting reveals issues that require code changes beyond formatting, fix them and commit: `chore: fix lint issues`.

## Writing Tests for Existing Code

When asked to "add tests" for existing code:

1. Read the source file thoroughly.
2. Identify all public functions/exports.
3. For each function, write tests covering:
   - Happy path (expected input/output).
   - Edge cases (empty input, null, boundary values).
   - Error cases (invalid input, thrown exceptions).
4. Aim for meaningful coverage, not 100% line coverage. Focus on business logic.

## Test Data Patterns

- Use factory functions for test data, not raw objects.
- Keep test data close to tests (in the test file or a `__fixtures__` folder).
- Never use production data in tests.
- Clean up any side effects after each test.

```typescript
// src/__tests__/__fixtures__/factories.ts
export function makeUser(overrides = {}) {
  return {
    id: "test-user-id",
    email: "[email protected]",
    full_name: "Test User",
    ...overrides,
  };
}

export function makeEntity(overrides = {}) {
  return {
    id: 1,
    name: "Test Entity",
    user_id: "test-user-id",
    created_at: new Date().toISOString(),
    ...overrides,
  };
}
```

## Quality Gates

Before reporting "all tests pass":
- [ ] All unit tests pass.
- [ ] All integration tests pass.
- [ ] E2E tests pass (if applicable).
- [ ] No lint errors.
- [ ] No TypeScript errors (`npx tsc --noEmit`).
- [ ] Coverage does not decrease.

Overview

This skill automates test creation, execution, linting, and targeted fixes for Python projects modeled after Next.js App Router stacks with Supabase/Firebase and Playwright workflows. It follows a mandatory planning protocol to ensure tests are meaningful, maintainable, and resilient. The skill runs unit, integration, and E2E tests, analyzes failures, applies fixes, and re-runs the suite until quality gates are met.

How this skill works

Before any code changes, the skill performs a planning phase: it scopes the target, surveys source files, builds a test plan, and lists mocks. It then writes colocated Vitest unit and integration tests, Playwright E2E flows, and factory fixtures, runs the suite, and inspects failures to determine whether to fix tests or application code. Finally, it runs linters, formatters, and type checks and commits changes when tests and QA gates are satisfied.

When to use it

  • Add or expand test coverage for a feature, file, or entire repository.
  • Diagnose and fix failing tests or flaky test behavior.
  • Validate API routes, server actions, and data access logic with mocked Supabase/Firebase clients.
  • Create E2E flows for critical user journeys (auth, onboarding, main feature).
  • Run linting, formatting, and auto-fix steps before a release or CI merge.

Best practices

  • Always execute the planning protocol before writing tests to avoid brittle coverage.
  • Colocate tests and fixtures with the code under test for faster context and maintenance.
  • Mock external services (Supabase, Firebase, fetch) at the test level to preserve isolation.
  • Prefer factory functions for test data and clean up side effects after each test.
  • Run the full suite, linter, and type checker before committing; include coverage checks.

Example use cases

  • Write unit tests for utility functions and Zod schemas in src/lib/**.
  • Create integration tests for API routes using a mocked Supabase server client.
  • Add Playwright E2E tests to verify login and dashboard flows in CI.
  • Investigate a flaky test, add retries or isolation, and document the TODO.
  • Auto-fix lint and formatting issues, then re-run tests and commit changes.

FAQ

What does the planning protocol cover and why is it mandatory?

It scopes the work, surveys source code, lists scenarios and mocks, and prevents brittle tests by ensuring test design aligns with the implementation.

How are external services handled in integration tests?

External dependencies like Supabase and Firebase are mocked at the test level using colocated mocks to isolate behavior and simulate success and error responses.

When should I auto-fix versus update tests?

If failures reflect incorrect expectations or outdated tests, update the tests. If failures reveal real application bugs, fix the source code and re-run the suite.