This skill applies GATK best practices for germline and somatic variant calling with joint genotyping support.
npx playbooks add skill a5c-ai/babysitter --skill gatk-variant-caller
---
name: gatk-variant-caller
description: GATK best practices skill for germline and somatic variant calling with joint genotyping
allowed-tools:
- Read
- Write
- Glob
- Grep
- Edit
- WebFetch
- WebSearch
- Bash
metadata:
version: "1.0"
category: bioinformatics
tags:
- variant-analysis
- gatk
- snv
- indel
---
# GATK Variant Caller Skill
## Purpose
Provide GATK best practices for germline and somatic variant calling with joint genotyping support.
## Capabilities
- HaplotypeCaller execution
- Base quality score recalibration (BQSR)
- Variant quality score recalibration (VQSR)
- Joint genotyping across cohorts
- GVCF generation and management
- Mutect2 somatic calling
## Usage Guidelines
- Follow GATK best practices workflow
- Apply BQSR for improved accuracy
- Use VQSR for quality filtering when sample count permits
- Generate GVCFs for scalable joint calling
- Select Mutect2 for somatic variants
- Document resource bundles and versions
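
The BQSR and GVCF guidelines above can be sketched as a small command builder. This is a minimal illustration, not the skill's actual implementation: the function names, file names, and the recalibration-table naming convention are hypothetical, and it assumes `gatk` is on the `PATH`. The script only prints the commands (a dry run).

```python
import shlex
from typing import List


def bqsr_commands(ref: str, bam: str, known_sites: List[str], out_bam: str) -> List[List[str]]:
    """Build the two BQSR steps: model systematic errors, then apply the model."""
    table = out_bam + ".recal.table"  # hypothetical naming convention
    recalibrate = ["gatk", "BaseRecalibrator", "-R", ref, "-I", bam, "-O", table]
    for vcf in known_sites:
        recalibrate += ["--known-sites", vcf]
    apply = ["gatk", "ApplyBQSR", "-R", ref, "-I", bam,
             "--bqsr-recal-file", table, "-O", out_bam]
    return [recalibrate, apply]


def gvcf_command(ref: str, bam: str, out_gvcf: str) -> List[str]:
    """Build a HaplotypeCaller call in GVCF mode (-ERC GVCF)."""
    return ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam,
            "-O", out_gvcf, "-ERC", "GVCF"]


# Placeholder file names, for illustration only.
commands = bqsr_commands("ref.fasta", "sample1.bam", ["dbsnp.vcf.gz"], "sample1.recal.bam")
commands.append(gvcf_command("ref.fasta", "sample1.recal.bam", "sample1.g.vcf.gz"))
for cmd in commands:
    print(shlex.join(cmd))  # dry run only
```

To execute rather than print, each list could be passed to `subprocess.run(cmd, check=True)`. Building argument lists instead of shell strings avoids quoting problems with paths that contain spaces.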
## Dependencies
- GATK4
- Picard
## Process Integration
- Whole Genome Sequencing Pipeline (wgs-analysis-pipeline)
- Clinical Variant Interpretation (clinical-variant-interpretation)
- Tumor Molecular Profiling (tumor-molecular-profiling)
- Rare Disease Diagnostic Pipeline (rare-disease-diagnostics)
This skill implements GATK best practices for germline and somatic variant calling with support for joint genotyping. It packages automated steps like BQSR, HaplotypeCaller, Mutect2, VQSR, and GVCF management into a reproducible workflow suitable for cohort-scale analysis. The skill is designed to integrate with orchestration tools to run deterministic, resumable pipelines.
The skill runs data preprocessing (including Base Quality Score Recalibration), then executes HaplotypeCaller to produce per-sample GVCFs for germline analysis. For somatic workflows, it runs Mutect2 with tumor-normal handling and filtering. It performs joint genotyping across the aggregated GVCFs to produce cohort VCFs and supports VQSR for cohort-aware variant quality modeling on the result. The implementation expects GATK4 and Picard as runtime dependencies.
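
The joint-genotyping and somatic branches described above might look like the following sketch. It is illustrative only: the helper names, sample name, interval, and file paths are hypothetical, and GenomicsDBImport is used as one way to aggregate GVCFs (the source does not say which aggregation tool the skill uses). The script prints the commands without running them.

```python
import shlex
from typing import List, Optional


def joint_genotyping_commands(ref: str, gvcfs: List[str], workspace: str,
                              interval: str, out_vcf: str) -> List[List[str]]:
    """Consolidate per-sample GVCFs into a GenomicsDB workspace, then genotype jointly."""
    consolidate = ["gatk", "GenomicsDBImport",
                   "--genomicsdb-workspace-path", workspace, "-L", interval]
    for gvcf in gvcfs:
        consolidate += ["-V", gvcf]
    genotype = ["gatk", "GenotypeGVCFs", "-R", ref,
                "-V", "gendb://" + workspace, "-O", out_vcf]
    return [consolidate, genotype]


def mutect2_commands(ref: str, tumor_bam: str, unfiltered_vcf: str, filtered_vcf: str,
                     normal_bam: Optional[str] = None,
                     normal_sample: Optional[str] = None) -> List[List[str]]:
    """Call somatic variants with Mutect2, then filter them with FilterMutectCalls."""
    call = ["gatk", "Mutect2", "-R", ref, "-I", tumor_bam]
    if normal_bam is not None:
        if normal_sample is None:
            raise ValueError("normal_sample is required when normal_bam is given")
        # -normal takes the sample name from the normal BAM's read group, not a file path.
        call += ["-I", normal_bam, "-normal", normal_sample]
    call += ["-O", unfiltered_vcf]
    filt = ["gatk", "FilterMutectCalls", "-R", ref,
            "-V", unfiltered_vcf, "-O", filtered_vcf]
    return [call, filt]


# Placeholder inputs, for illustration only.
germline = joint_genotyping_commands(
    "ref.fasta", ["s1.g.vcf.gz", "s2.g.vcf.gz"], "cohort_db", "chr20", "cohort.vcf.gz")
somatic = mutect2_commands(
    "ref.fasta", "tumor.bam", "somatic.unfiltered.vcf.gz", "somatic.filtered.vcf.gz",
    normal_bam="normal.bam", normal_sample="NORMAL_SM")
for cmd in germline + somatic:
    print(shlex.join(cmd))  # dry run only
```

Omitting `normal_bam` yields a tumor-only Mutect2 call. A production somatic run would typically also pass a germline resource and a panel of normals, which are left out here to keep the sketch short.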
What dependencies are required?
GATK4 and Picard are required. Also supply the reference FASTA, known-sites VCFs, and resource bundle files described in the GATK best-practices documentation, and make sure all of them are based on the same reference build.
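
GATK tools expect the reference FASTA to be accompanied by a `.fai` index and a `.dict` sequence dictionary. As a hedged illustration, a pre-flight check along these lines could catch missing files before a long run starts. The function name is hypothetical, and the sketch assumes an uncompressed FASTA whose dictionary replaces the file extension (`ref.fasta` with `ref.dict`).

```python
import tempfile
from pathlib import Path
from typing import List


def missing_reference_files(fasta: str) -> List[str]:
    """Return the reference files GATK expects that are not present on disk."""
    path = Path(fasta)
    expected = [
        path,                       # the FASTA itself
        Path(str(path) + ".fai"),   # index, e.g. from `samtools faidx`
        path.with_suffix(".dict"),  # dictionary, e.g. from `gatk CreateSequenceDictionary`
    ]
    return [str(p) for p in expected if not p.exists()]


# Demonstration on a throwaway directory that holds only the FASTA.
with tempfile.TemporaryDirectory() as tmp:
    fasta = Path(tmp) / "ref.fasta"
    fasta.write_text(">chr1\nACGT\n")
    missing = missing_reference_files(str(fasta))

print([Path(m).name for m in missing])
```

The same pattern extends to known-sites VCFs and their index files if the pipeline should fail fast on incomplete resource bundles.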
When should I use VQSR versus hard filters?
Use VQSR when the cohort is large enough to train its model and curated training resources (known and truth variant sets) are available for your reference. For small datasets where VQSR is unreliable, use carefully tuned hard filters instead.
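
For the hard-filter path, a sketch like the one below builds a `VariantFiltration` command for SNPs. The thresholds follow the commonly cited starting values from GATK's hard-filtering guidance and are assumptions to tune for your data, not fixed rules. The function name, filter names, and file names are hypothetical, and the input is assumed to be a SNP-only VCF (for example, one produced with `SelectVariants`).

```python
import shlex
from typing import List, Sequence, Tuple

# (filter name, JEXL expression) pairs; starting points only, tune per dataset.
SNP_HARD_FILTERS: Sequence[Tuple[str, str]] = (
    ("QD2", "QD < 2.0"),
    ("QUAL30", "QUAL < 30.0"),
    ("SOR3", "SOR > 3.0"),
    ("FS60", "FS > 60.0"),
    ("MQ40", "MQ < 40.0"),
    ("MQRankSum-12.5", "MQRankSum < -12.5"),
    ("ReadPosRankSum-8", "ReadPosRankSum < -8.0"),
)


def hard_filter_command(in_vcf: str, out_vcf: str,
                        filters: Sequence[Tuple[str, str]] = SNP_HARD_FILTERS) -> List[str]:
    """Build a VariantFiltration command that annotates failing sites in FILTER."""
    cmd = ["gatk", "VariantFiltration", "-V", in_vcf, "-O", out_vcf]
    for name, expression in filters:
        cmd += ["-filter", expression, "--filter-name", name]
    return cmd


# Placeholder file names, for illustration only.
cmd = hard_filter_command("cohort.snps.vcf.gz", "cohort.snps.filtered.vcf.gz")
print(shlex.join(cmd))  # dry run only
```

`VariantFiltration` marks failing records rather than removing them, so downstream steps can still inspect which filter each site failed.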