
scientific-method skill


This skill applies the scientific method to computational research and software engineering, improving hypothesis testing and reproducibility while helping you avoid common methodological pitfalls such as p-hacking, HARKing, and confirmation bias.

npx playbooks add skill omer-metin/skills-for-antigravity --skill scientific-method

Review the files below or copy the command above to add this skill to your agents.

Files (4)
SKILL.md
1.2 KB
---
name: scientific-method
description: The scientific method applied to computational research, data science, and experimental software engineering. Covers hypothesis formulation, experimental design, controls, reproducibility, and avoiding common methodological pitfalls like p-hacking, HARKing, and confirmation bias. Use when hypothesis testing, experimental design, reproducibility, or statistical validity is mentioned.
---

# Scientific Method

## Identity

You are a research methodology advisor who applies the scientific method to computational experiments, data analysis, and software engineering decisions.

## Reference System Usage

You must ground your responses in the provided reference files, treating them as the source of truth for this domain:

* **For Creation:** Always consult **`references/patterns.md`**. This file dictates *how* things should be built. Ignore generic approaches if a specific pattern exists here.
* **For Diagnosis:** Always consult **`references/sharp_edges.md`**. This file lists the critical failures and "why" they happen. Use it to explain risks to the user.
* **For Review:** Always consult **`references/validations.md`**. This contains the strict rules and constraints. Use it to validate user inputs objectively.

**Note:** If a user's request conflicts with the guidance in these files, politely correct them using the information provided in the references.

Overview

This skill applies the scientific method to computational research, data science, and experimental software engineering. It guides hypothesis formulation, experimental design, controls, and reproducibility while highlighting common methodological pitfalls. The skill emphasizes concrete, repeatable workflows that reduce bias and increase trust in results.

How this skill works

The skill inspects project plans, experimental setups, data collection procedures, and analysis scripts to identify gaps in design, controls, and reproducibility. It maps issues to specific failure modes and prescriptive fixes, such as adding controls, preregistering hypotheses, or improving randomization. Responses are grounded in the canonical guidance files: consult references/patterns.md for construction patterns, references/sharp_edges.md for common failures, and references/validations.md for objective validation rules.

When to use it

  • Designing an experiment, A/B test, or benchmark for models or software.
  • Preparing analysis plans to avoid p-hacking, HARKing, or confirmation bias.
  • Auditing reproducibility and repeatability of code, data, or pipelines.
  • Creating or reviewing control groups, randomization, and blinding procedures.
  • Validating statistical choices, sample sizes, and multiple-comparison controls.
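As an illustration of the last point, here is a minimal sketch of one standard multiple-comparison control, the Holm-Bonferroni step-down procedure. The function name and example p-values are illustrative, not taken from this skill's reference files:

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null is rejected."""
    m = len(p_values)
    # Sort p-values ascending, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Compare the k-th smallest p-value against alpha / (m - k).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: stop at the first p-value that fails
    return reject

# Example: three hypothesis tests from one experiment.
print(holm_bonferroni([0.01, 0.04, 0.30]))  # → [True, False, False]
```

Holm's procedure controls the family-wise error rate at alpha while being uniformly more powerful than a plain Bonferroni correction.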

Best practices

  • State clear, falsifiable hypotheses and predefine outcome metrics before data collection.
  • Use predefined patterns from references/patterns.md for experiment structure and data handling.
  • Include controls and randomization; document all preprocessing and selection criteria.
  • Preregister analysis plans where possible and separate exploratory from confirmatory analyses.
  • Validate assumptions and constraints against references/validations.md before drawing conclusions.
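One lightweight way to put the preregistration practice above into code is to freeze the analysis plan and record a content hash before data collection, so confirmatory analyses can later be checked against the plan. This is a generic sketch; the plan fields and values are illustrative, not prescribed by the reference files:

```python
import hashlib
import json

# The frozen analysis plan, written before any data is collected.
plan = {
    "hypothesis": "Variant B increases conversion rate vs. variant A",
    "primary_metric": "conversion_rate",
    "test": "two-proportion z-test",
    "alpha": 0.05,
    "sample_size_per_group": 3839,
    "stopping_rule": "fixed n, no interim looks",
}

# Canonical JSON (sorted keys) makes the hash reproducible.
blob = json.dumps(plan, sort_keys=True).encode("utf-8")
digest = hashlib.sha256(blob).hexdigest()
print(f"preregistered plan hash: {digest[:16]}")
```

Committing the plan and its hash to version control before the experiment runs gives a cheap, auditable boundary between exploratory and confirmatory work.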

Example use cases

  • Reviewing a model evaluation that lacks a held-out test set and recommending reproducible splits.
  • Designing an A/B test with proper randomization, power calculation, and stopping rules.
  • Auditing a data science pipeline for undisclosed preprocessing steps that could produce bias.
  • Converting exploratory findings into a confirmatory experiment with preregistration and controls.
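For the first use case, a reproducible split can be sketched by hashing a stable record ID rather than relying on a random number generator, so the assignment survives re-runs, reordering, and appended data. The function and field names here are illustrative assumptions:

```python
import hashlib

def split_bucket(record_id: str, test_fraction: float = 0.2) -> str:
    """Map a stable ID to 'train' or 'test' deterministically."""
    h = hashlib.sha256(record_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a fraction in [0, 1).
    fraction = int.from_bytes(h[:8], "big") / 2**64
    return "test" if fraction < test_fraction else "train"

ids = [f"user-{i}" for i in range(1000)]
test_ids = [i for i in ids if split_bucket(i) == "test"]
print(f"{len(test_ids)} of {len(ids)} records held out")
```

Because the bucket depends only on the ID, a record never migrates between train and test across runs, which is the property a held-out evaluation actually needs.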

FAQ

How does this skill handle exploratory vs confirmatory analyses?

The skill distinguishes them explicitly: it recommends labeling exploratory steps, avoiding confirmatory claims from them, and preregistering confirmatory analyses using patterns from references/patterns.md.

What if my team lacks statistical expertise?

The skill provides concrete validation checks from references/validations.md and suggests minimal, high-impact steps—like power estimation, simple controls, and blinded evaluation—that reduce major risks until formal expertise can be engaged.
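Power estimation, the first of those minimal steps, can be done back-of-envelope with only the standard library. This sketch approximates the per-arm sample size for a two-sided two-proportion test; the baseline and effect values are illustrative:

```python
import math
from statistics import NormalDist

def n_per_group(p1: float, p2: float, alpha: float = 0.05,
                power: float = 0.8) -> int:
    """Approximate sample size per arm for a two-sided two-proportion test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(power)            # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Detecting a lift from 10% to 12% conversion:
print(n_per_group(0.10, 0.12))  # → 3839 per group
```

Even a rough estimate like this catches the most common failure, underpowered experiments, before any data is collected.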