qe-test-execution skill

This skill orchestrates parallel QE test execution with smart selection, retry logic, and comprehensive result aggregation to accelerate CI pipelines.

npx playbooks add skill proffesor-for-testing/agentic-qe --skill qe-test-execution

SKILL.md

---
name: "QE Test Execution"
description: "Parallel test execution orchestration with intelligent scheduling, retry logic, and comprehensive result aggregation."
---

# QE Test Execution

## Purpose

Guide the use of v3's test execution capabilities, including parallel orchestration, smart test selection, flaky test handling, and distributed execution across multiple environments.

## Activation

- When running test suites
- When optimizing test execution time
- When handling flaky tests
- When setting up CI/CD test pipelines
- When executing tests across environments

## Quick Start

```bash
# Run all tests with parallelization
aqe test run --parallel --workers 4

# Run affected tests only
aqe test run --affected --since HEAD~1

# Run with retry for flaky tests
aqe test run --retry 3 --retry-delay 1000

# Run specific test types
aqe test run --type unit,integration --exclude e2e
```

## Agent Workflow

```typescript
// Orchestrate test execution
Task("Execute test suite", `
  Run the full test suite with:
  - 4 parallel workers
  - Retry flaky tests up to 3 times
  - Generate JUnit report
  - Fail fast on critical tests
  Report results and any failures.
`, "qe-test-executor")

// Smart test selection
Task("Run affected tests", `
  Analyze changes in PR #123 and:
  - Identify affected test files
  - Run only relevant tests
  - Include integration tests for changed modules
  - Report coverage delta
`, "qe-test-selector")
```

## Execution Strategies

### 1. Parallel Execution

```typescript
await testExecutor.runParallel({
  suites: ['unit', 'integration'],
  workers: 4,
  distribution: 'by-file',  // or 'by-test', 'by-duration'
  isolation: 'process',
  sharding: {
    enabled: true,
    total: 4,
    index: Number(process.env.SHARD_INDEX)  // env vars are strings; convert to a number
  }
});
```
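
To illustrate what `distribution: 'by-duration'` could mean in practice, the sketch below balances test files across shards with a greedy longest-first assignment. The `TestFile` and `Shard` types and the `shardByDuration` function are hypothetical names used only for this illustration; they are not part of the aqe API, and the actual scheduler may use a different heuristic.

```typescript
// Hypothetical types for illustration; not part of the aqe API.
interface TestFile {
  path: string;
  durationMs: number; // historical duration from previous runs
}

interface Shard {
  files: TestFile[];
  totalMs: number;
}

// Greedy longest-first assignment: sort files by duration (longest first)
// and always give the next file to the shard with the least work so far.
function shardByDuration(files: TestFile[], shardCount: number): Shard[] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  const shards: Shard[] = Array.from({ length: shardCount }, () => ({
    files: [],
    totalMs: 0,
  }));
  const sorted = [...files].sort((a, b) => b.durationMs - a.durationMs);
  for (const file of sorted) {
    // Find the shard with the smallest accumulated duration.
    const lightest = shards.reduce((min, s) => (s.totalMs < min.totalMs ? s : min));
    lightest.files.push(file);
    lightest.totalMs += file.durationMs;
  }
  return shards;
}

const exampleShards = shardByDuration(
  [
    { path: "a.test.ts", durationMs: 900 },
    { path: "b.test.ts", durationMs: 500 },
    { path: "c.test.ts", durationMs: 400 },
    { path: "d.test.ts", durationMs: 100 },
  ],
  2,
);
```

With these sample durations, the first shard receives `a.test.ts` and `d.test.ts` (1000 ms) and the second receives `b.test.ts` and `c.test.ts` (900 ms). Splitting by file count alone would ignore the 900 ms outlier, which is why duration-based distribution tends to help suites with a few slow files.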

### 2. Smart Test Selection

```typescript
await testExecutor.runAffected({
  changes: gitChanges,
  selection: {
    direct: true,      // Tests for changed files
    transitive: true,  // Tests for dependents
    integration: true  // Integration tests touching changed code
  },
  fallback: 'full-suite'  // If analysis fails
});
```
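
One way to picture the `direct` and `transitive` options is a walk over a reverse dependency graph, starting from the changed files and collecting every test file reached. The sketch below assumes such a graph is already available; the `ReverseDeps` type and `selectAffectedTests` function are hypothetical and do not describe how `runAffected` is actually implemented.

```typescript
// Hypothetical reverse dependency graph: file -> files that import it.
type ReverseDeps = Map<string, string[]>;

// Collect the test files reachable from the changed files.
// With transitive = false, only direct importers of changed files are followed.
function selectAffectedTests(
  changed: string[],
  reverseDeps: ReverseDeps,
  isTest: (file: string) => boolean,
  transitive: boolean,
): string[] {
  const visited = new Set<string>(changed);
  let frontier = [...changed];
  let depth = 0;
  while (frontier.length > 0 && (transitive || depth < 1)) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const dependent of reverseDeps.get(file) ?? []) {
        if (!visited.has(dependent)) {
          visited.add(dependent);
          next.push(dependent);
        }
      }
    }
    frontier = next;
    depth += 1;
  }
  // A changed test file is itself selected, since it starts in `visited`.
  return Array.from(visited).filter(isTest).sort();
}

const depGraph: ReverseDeps = new Map([
  ["src/math.ts", ["src/math.test.ts", "src/cart.ts"]],
  ["src/cart.ts", ["src/cart.test.ts"]],
]);
const isTestFile = (f: string): boolean => f.endsWith(".test.ts");

const directOnly = selectAffectedTests(["src/math.ts"], depGraph, isTestFile, false);
const withTransitive = selectAffectedTests(["src/math.ts"], depGraph, isTestFile, true);
```

In this example `directOnly` contains only `src/math.test.ts`, while `withTransitive` also picks up `src/cart.test.ts` because `src/cart.ts` imports the changed module. If the graph cannot be built, falling back to the full suite (as `fallback: 'full-suite'` suggests) avoids silently skipping tests.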

### 3. Flaky Test Handling

```typescript
await testExecutor.handleFlaky({
  detection: {
    enabled: true,
    threshold: 0.1,  // 10% flake rate
    window: 100      // Last 100 runs
  },
  strategy: {
    retry: 3,
    quarantine: true,
    notify: ['#flaky-tests']
  }
});
```
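
The `threshold` and `window` settings can be read as "flag a test when its flake rate over the last N runs exceeds the threshold". The sketch below makes that concrete under one assumption: a run counts as flaky when the test passed only after a retry. The `Outcome` type and `detectFlaky` function are hypothetical names, and the real detector may define flakiness differently.

```typescript
// Hypothetical run record; "flaky" means the test passed only after a retry.
type Outcome = "pass" | "fail" | "flaky";

interface FlakeReport {
  rate: number;
  runsConsidered: number;
  isFlaky: boolean;
}

// Compute the flake rate over the most recent `window` runs
// and compare it against the threshold.
function detectFlaky(runs: Outcome[], threshold = 0.1, window = 100): FlakeReport {
  if (!Number.isInteger(window) || window < 1) {
    throw new RangeError("window must be a positive integer");
  }
  const recent = runs.slice(-window);
  if (recent.length === 0) {
    return { rate: 0, runsConsidered: 0, isFlaky: false };
  }
  const flakyRuns = recent.filter((o) => o === "flaky").length;
  const rate = flakyRuns / recent.length;
  return { rate, runsConsidered: recent.length, isFlaky: rate > threshold };
}

// 3 flaky runs out of 20 is a 15% flake rate, above the 10% threshold.
const runHistory: Outcome[] = [
  ...Array<Outcome>(17).fill("pass"),
  ...Array<Outcome>(3).fill("flaky"),
];
const flakeReport = detectFlaky(runHistory);
```

Counting only pass-after-retry runs keeps a consistently failing test from being labeled flaky; such a test is simply broken and should not be quarantined.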

## Execution Configuration

```yaml
execution:
  parallel:
    workers: auto  # CPU cores - 1
    timeout: 30000
    bail: false

  retry:
    count: 2
    delay: 1000
    only_failed: true

  reporting:
    formats: [junit, json, html]
    include_timing: true
    include_logs: true

  environments:
    - name: node-18
      image: node:18-alpine
    - name: node-20
      image: node:20-alpine
```
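
To show how the `retry` settings might interact, the sketch below runs every test once and then re-runs only the failures, which is one plausible reading of `only_failed: true`. The `RetryConfig` and `RetryOutcome` types and the `runWithRetry` function are hypothetical, and the `delay` between attempts is left out to keep the example synchronous.

```typescript
// Hypothetical shapes mirroring the `retry` settings; not the aqe API.
interface RetryConfig {
  count: number; // maximum number of re-runs after the first attempt
}

interface RetryOutcome {
  passed: string[]; // passed on the first attempt
  flaky: string[]; // failed at first, passed on a retry
  failed: string[]; // still failing after all retries
}

// Attempt 0 runs everything; later attempts re-run only what is still failing.
function runWithRetry(
  tests: string[],
  runTest: (name: string, attempt: number) => boolean,
  config: RetryConfig,
): RetryOutcome {
  const outcome: RetryOutcome = { passed: [], flaky: [], failed: [] };
  let pending = [...tests];
  for (let attempt = 0; attempt <= config.count && pending.length > 0; attempt++) {
    const stillFailing: string[] = [];
    for (const test of pending) {
      if (!runTest(test, attempt)) {
        stillFailing.push(test);
      } else if (attempt === 0) {
        outcome.passed.push(test);
      } else {
        outcome.flaky.push(test);
      }
    }
    pending = stillFailing;
  }
  outcome.failed = pending;
  return outcome;
}

// Simulated runner: "search" always passes, "login" passes from its
// second attempt onward, and "checkout" never passes.
const simulatedRunner = (name: string, attempt: number): boolean =>
  name === "search" || (name === "login" && attempt >= 1);

const retryResult = runWithRetry(["search", "login", "checkout"], simulatedRunner, { count: 2 });
```

Here `search` lands in `passed`, `login` in `flaky`, and `checkout` in `failed`. Tracking pass-after-retry separately keeps retries from hiding instability in the final report.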

## CI/CD Integration

```yaml
# GitHub Actions example
test:
  runs-on: ubuntu-latest
  strategy:
    matrix:
      shard: [1, 2, 3, 4]
  steps:
    - uses: actions/checkout@v4
    - name: Run tests
      run: |
        aqe test run \
          --shard ${{ matrix.shard }}/4 \
          --parallel \
          --report junit
    - name: Upload results
      uses: actions/upload-artifact@v4
      with:
        name: test-results-${{ matrix.shard }}
        path: reports/
```
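
The workflow passes a spec such as `2/4` to `--shard`. How aqe interprets it is not documented here, so the sketch below is only one plausible reading: a 1-based shard index followed by the total, with files assigned round-robin over a sorted list. `ShardSpec`, `parseShardSpec`, and `filesForShard` are hypothetical names.

```typescript
// Hypothetical representation of a shard spec such as "2/4"
// (assumed to be a 1-based index followed by the total shard count).
interface ShardSpec {
  index: number;
  total: number;
}

function parseShardSpec(spec: string): ShardSpec {
  const match = /^(\d+)\/(\d+)$/.exec(spec.trim());
  if (!match) {
    throw new Error(`Invalid shard spec "${spec}", expected "index/total"`);
  }
  const index = Number(match[1]);
  const total = Number(match[2]);
  if (total < 1 || index < 1 || index > total) {
    throw new RangeError(`Shard index ${index} is out of range for ${total} shard(s)`);
  }
  return { index, total };
}

// Round-robin over a sorted file list, so every runner derives the same
// partition independently and each file lands in exactly one shard.
function filesForShard(files: string[], spec: ShardSpec): string[] {
  return [...files].sort().filter((_, i) => i % spec.total === spec.index - 1);
}

const allFiles = ["d.test.ts", "a.test.ts", "c.test.ts", "b.test.ts", "e.test.ts"];
const shardTwo = filesForShard(allFiles, parseShardSpec("2/4"));
```

With five files and four shards, shard 2 receives only `b.test.ts`, while shard 1 receives both `a.test.ts` and `e.test.ts`. Sorting before partitioning matters because runners in a CI matrix may discover files in different orders.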

## Result Aggregation

```typescript
interface ExecutionResults {
  summary: {
    total: number;
    passed: number;
    failed: number;
    skipped: number;
    flaky: number;
    duration: number;
  };
  shards: ShardResult[];
  failures: TestFailure[];
  flakyTests: FlakyTest[];
  coverage: CoverageReport;
  timing: TimingAnalysis;
}
```

## Coordination

- **Primary Agents**: qe-test-executor, qe-test-selector, qe-flaky-detector
- **Coordinator**: qe-test-execution-coordinator
- **Related Skills**: qe-test-generation, qe-coverage-analysis

## Overview

This skill orchestrates parallel test execution with intelligent scheduling, retry logic for flaky tests, and comprehensive result aggregation. It enables distributed runs across environments, smart test selection, and configurable reporting to speed up feedback loops. Use it to reduce CI time, isolate flaky behavior, and produce unified test metrics for pipelines.

## How this skill works

The skill coordinates specialized agents to run tests in parallel, shard workloads, and select only affected tests based on source changes. It applies retry and quarantine strategies for flaky tests, aggregates shard results into a single execution summary, and emits reports in formats like JUnit, JSON, and HTML. Configuration options let you tune workers, sharding, timeouts, and reporting for CI/CD integration.

## When to use it

- Running large test suites where parallelization and sharding reduce wall-clock time
- Optimizing CI pipelines to run only affected tests from a change or PR
- Handling flaky tests with automated retry, quarantine, and notifications
- Distributing tests across multiple environments or Node images
- Generating consolidated reports for test results, coverage, and timing

## Best practices

- Use smart test selection to run affected and transitive tests to save CI time
- Shard by duration or file to balance worker load; prefer duration for long-tail suites
- Enable retry for suspected flakiness and quarantine repeatedly failing tests
- Emit machine-readable reports (JUnit/JSON) and upload artifacts per shard for aggregation
- Set bail/timeout policies for critical tests to fail fast when necessary

## Example use cases

- Run all unit and integration suites with 4 workers and collect a single JUnit report
- Execute only tests affected by a PR to provide fast feedback on changes
- Configure CI matrix to shard tests across runners and upload per-shard artifacts
- Detect flaky tests using a historical window and automatically retry or quarantine them
- Compare coverage delta after running affected tests to evaluate risk of changes

## FAQ

### How are flaky tests detected?

Flaky detection uses a configurable window of historical runs and a flake-rate threshold; tests whose flake rate exceeds the threshold are flagged for retry or quarantine.

### How do shards get aggregated?

Each shard emits a report and artifact; the coordinator collects shard results to produce a single ExecutionResults summary with failures, flaky tests, coverage, and timing.
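
As a rough illustration of the aggregation step, the sketch below merges per-shard summaries into a single summary with the same fields as `ExecutionResults.summary`. The `ShardSummary` type and `mergeSummaries` function are hypothetical, since the real `ShardResult` shape is not shown here, and taking the slowest shard as the overall duration assumes that shards run in parallel.

```typescript
// Hypothetical, simplified per-shard summary; the real ShardResult type
// is not shown in this document.
interface ShardSummary {
  total: number;
  passed: number;
  failed: number;
  skipped: number;
  flaky: number;
  duration: number; // milliseconds
}

// Counts are summed across shards. Duration is the slowest shard,
// on the assumption that shards run in parallel.
function mergeSummaries(shards: ShardSummary[]): ShardSummary {
  const merged: ShardSummary = {
    total: 0,
    passed: 0,
    failed: 0,
    skipped: 0,
    flaky: 0,
    duration: 0,
  };
  for (const shard of shards) {
    merged.total += shard.total;
    merged.passed += shard.passed;
    merged.failed += shard.failed;
    merged.skipped += shard.skipped;
    merged.flaky += shard.flaky;
    merged.duration = Math.max(merged.duration, shard.duration);
  }
  return merged;
}

const mergedSummary = mergeSummaries([
  { total: 120, passed: 115, failed: 2, skipped: 3, flaky: 1, duration: 42000 },
  { total: 80, passed: 80, failed: 0, skipped: 0, flaky: 0, duration: 37500 },
]);
```

For these two shards the merged summary reports 200 tests, 195 passed, 2 failed, 3 skipped, 1 flaky, and a duration of 42000 ms. If shards ran sequentially instead, summing the durations would be the more accurate choice.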