This skill helps manage data versions, provenance, and lineage to support reproducible research across experiments and collaborations.
npx playbooks add skill a5c-ai/babysitter --skill data-versioning-manager
---
name: data-versioning-manager
description: Skill for managing data versions and provenance
allowed-tools:
  - Bash
  - Read
  - Write
metadata:
  specialization: scientific-discovery
  domain: science
  category: Reproducibility
  skill-id: SK-SCIDISC-025
---
# Data Versioning Manager Skill
## Purpose
Manage data versions, track provenance, and ensure data lineage for reproducible scientific research.
## Capabilities
- Version datasets
- Track data lineage
- Document transformations
- Enable rollback
- Support collaboration
- Generate provenance records
## Usage Guidelines
1. Initialize versioning
2. Track data changes
3. Document transformations
4. Create snapshots
5. Manage branches
6. Export provenance
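
The first four steps can be sketched in a few lines of Python. This is only an illustrative sketch, not the skill's actual implementation: the store layout (`objects/` plus `history.jsonl`) and the function names `init_store`, `file_digest`, and `snapshot` are assumptions made for this example.

```python
import hashlib
import json
import shutil
from pathlib import Path


def init_store(root: Path) -> Path:
    """Step 1: create an empty version store (hypothetical layout)."""
    (root / "objects").mkdir(parents=True, exist_ok=True)
    (root / "history.jsonl").touch(exist_ok=True)
    return root


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(root: Path, data_file: Path, note: str) -> str:
    """Steps 2-4: fingerprint the data, document the change, store a snapshot."""
    digest = file_digest(data_file)
    target = root / "objects" / digest
    if not target.exists():  # identical content is stored only once
        shutil.copyfile(data_file, target)
    entry = {"version": digest, "file": data_file.name, "note": note}
    with (root / "history.jsonl").open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return digest
```

Using the content hash as the version identifier means two snapshots of unchanged data resolve to the same stored object, while any edit to the file produces a new version.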
## Process Integration
Works within scientific discovery workflows for:
- Data management
- Reproducibility support
- Enabling collaboration
- Audit compliance
## Configuration
- Version control system
- Storage backends
- Metadata schemas
- Access controls
## Output Artifacts
- Version histories
- Provenance records
- Transformation logs
- Data snapshots
## Overview
This skill manages data versions and documents provenance to make datasets reproducible and auditable. It provides deterministic, resumable controls for dataset snapshots, branching, and rollback. The goal is to preserve lineage and transformations so teams can reproduce results and meet compliance requirements.
The skill records every change as a versioned snapshot and stores metadata that describes source, parameters, and transformation steps. It links versions into a lineage graph so you can trace origin and dependencies across branches. Rollback, export of provenance records, and access control are supported to enforce reproducibility and collaboration policies.
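
One way to picture the lineage graph is as a set of version records that each point at their parents. The sketch below is a hypothetical model, not the skill's internal data structure: the `Version` class and `trace_lineage` function are names invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Version:
    """One node in the lineage graph (hypothetical record shape)."""
    version_id: str
    parents: list[str] = field(default_factory=list)  # versions this one was derived from
    metadata: dict = field(default_factory=dict)      # source, parameters, transformation steps


def trace_lineage(versions: dict[str, Version], start: str) -> list[str]:
    """Return every ancestor of `start`, nearest first, without duplicates."""
    seen: set[str] = set()
    order: list[str] = []
    queue = list(versions[start].parents)
    while queue:
        current = queue.pop(0)  # breadth-first walk towards the origin
        if current in seen:
            continue
        seen.add(current)
        order.append(current)
        queue.extend(versions[current].parents)
    return order
```

A version with more than one parent represents a merge of branches, so the same walk also answers which upstream datasets a merged result depends on.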
## How does rollback work?
Rollback restores a selected snapshot as the active dataset and records the action in the provenance log so the change is traceable.
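
A minimal sketch of that behaviour, assuming an in-memory store: the `DatasetStore` class and its method names are hypothetical and stand in for whatever backend is configured. The point it illustrates is that rollback never deletes history; it switches the active version and appends a log entry.

```python
from datetime import datetime, timezone
from typing import Optional


class DatasetStore:
    """In-memory stand-in for a snapshot store (hypothetical)."""

    def __init__(self) -> None:
        self.snapshots: dict = {}           # version id -> dataset bytes
        self.active: Optional[str] = None   # currently active version id
        self.provenance_log: list = []      # append-only action log

    def _record(self, action: str, version_id: str, actor: str) -> None:
        # Log the action before the active pointer moves, so "previous" is accurate.
        self.provenance_log.append({
            "action": action,
            "version": version_id,
            "previous": self.active,
            "actor": actor,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })

    def commit(self, version_id: str, content: bytes, actor: str) -> None:
        """Store a new snapshot and make it the active version."""
        self.snapshots[version_id] = content
        self._record("commit", version_id, actor)
        self.active = version_id

    def rollback(self, version_id: str, actor: str) -> bytes:
        """Restore an existing snapshot as the active dataset and log the action."""
        if version_id not in self.snapshots:
            raise KeyError(f"unknown snapshot: {version_id}")
        self._record("rollback", version_id, actor)
        self.active = version_id
        return self.snapshots[version_id]
```

Because later snapshots stay in the store, a rollback can itself be undone by rolling forward to the version named in the log entry's `previous` field.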
## What metadata should I capture?
Capture source identifiers, timestamps, tool versions, transformation parameters, responsible actors, and checksums to ensure integrity and traceability.
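
As a rough illustration, the fields listed above could be gathered into a single record like this. The field names and the functions `build_provenance_record` and `verify_checksum` are assumptions for the example, not a schema defined by the skill.

```python
import hashlib
import platform
from datetime import datetime, timezone
from typing import Optional


def build_provenance_record(
    data: bytes,
    source_id: str,
    actor: str,
    parameters: dict,
    tool_versions: Optional[dict] = None,
) -> dict:
    """Assemble one provenance record for a dataset (hypothetical schema)."""
    return {
        "source_id": source_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Defaults to the running interpreter version if no tools are listed.
        "tool_versions": tool_versions or {"python": platform.python_version()},
        "parameters": dict(parameters),
        "actor": actor,
        "checksum": {
            "algorithm": "sha256",
            "value": hashlib.sha256(data).hexdigest(),
        },
    }


def verify_checksum(data: bytes, record: dict) -> bool:
    """Check that the data still matches the checksum stored in its record."""
    return hashlib.sha256(data).hexdigest() == record["checksum"]["value"]
```

Storing the checksum alongside the other fields lets a later reader confirm that the bytes they hold are the bytes the record describes before relying on the rest of the metadata.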