
This skill aligns the Python version and installed dependencies with what an NLP research repo declares, before installation, to maximize reproducibility.

npx playbooks add skill benchflow-ai/skillsbench --skill nlp-research-repo-package-installment

Review the files below or copy the command above to add this skill to your agents.

---
name: nlp-research-repo-package-installment
version: "1.0"
description: Align Python version and repo-declared dependencies (requirements.txt / environment.yml) before installing packages for NLP research code reproduction.
---

# NLP Research Repo Package Installment

When reproducing an NLP research repo, **always align the environment to the repo’s declared dependencies first**. Most failures come from **Python version mismatch** or installing packages without following `requirements.txt` / `environment.yml`.

## What to do (must run before any install)

1. **Read the repo dependency files**
- Prefer `environment.yml` / `environment.yaml` (often pins **Python** + channels + non-pip deps)
- Otherwise use `requirements.txt` (pip deps)
- If both exist, treat `environment.yml` as the base, `requirements.txt` as supplemental unless README says otherwise

2. **Log the current environment (Python version is critical)**  
Write `/root/python_int.txt` containing:
- `python -VV` *(required; Python version is often the root cause)*
- `python -m pip --version`
- `python -m pip freeze`

3. **Compare & decide**
- If the repo expects a specific Python major/minor and the current Python does not match, it’s usually best to **set up a matching environment** before installing dependencies.
        Example: set up a fresh Python 3.11 environment (Docker / Ubuntu) with uv
        # Install uv
        apt-get update
        apt-get install -y --no-install-recommends curl ca-certificates
        rm -rf /var/lib/apt/lists/*
        curl -LsSf https://astral.sh/uv/0.9.7/install.sh | sh
        export PATH="/root/.local/bin:$PATH"

        # Install Python + create a venv
        # (--seed installs pip into the venv; uv venvs omit pip by default)
        uv python install 3.11.8
        uv venv --seed --python 3.11.8 /opt/py311

        # Use the new Python for installs/runs
        /opt/py311/bin/python -VV
        /opt/py311/bin/python -m pip install -U pip setuptools wheel

- Prefer installing from the repo’s dependency files (avoid random upgrades), then run a quick import/smoke test.
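
The compare step above can be sketched as a small shell function. This is a minimal sketch, not part of the skill itself: `python_minor_matches` is a hypothetical helper name, and it assumes the interpreter prints a first line of the form `Python X.Y.Z ...` for `-VV`.

```shell
# Hypothetical helper: compare the repo's expected major.minor with an interpreter.
# Usage: python_minor_matches <major.minor> [interpreter]
python_minor_matches() {
    local required="$1" py="${2:-python}" current
    # Reduce "Python 3.11.8 (main, ...)" to "3.11".
    current="$("$py" -VV 2>&1 | sed -n '1s/^Python \([0-9]*\.[0-9]*\).*/\1/p')"
    if [ "$current" = "$required" ]; then
        echo "match: Python $current"
    else
        echo "mismatch: repo expects $required, found ${current:-unknown}"
        return 1
    fi
}
```

For example, `python_minor_matches 3.11 /opt/py311/bin/python || echo "set up a matching environment first"` would gate the install on the check. The non-zero return on mismatch lets the function be used directly in `if` or `||` chains.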

Overview

This skill aligns the Python runtime and repository-declared dependencies before installing packages for NLP research reproduction. It enforces reading dependency files, logging the current Python environment, and creating a matching environment when versions differ. The goal is reliable installs and a higher chance of reproducing results.

How this skill works

It inspects environment.yml / environment.yaml and requirements.txt to determine pinned Python versions and package constraints. It logs the active Python version, pip version, and installed packages to /root/python_int.txt for audit. If the repo requires a different Python major/minor, it recommends creating a matching interpreter (system, Docker, or virtualenv) before installing dependencies and running smoke tests.
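
As an illustration of the inspection step, the sketch below pulls a Python major.minor out of a conda-style environment file. The helper name `declared_python_version` is hypothetical, and the pattern assumes an unquoted dependency entry such as `- python=3.11.8`. Quoted entries or unusual layouts would need a real YAML parser.

```shell
# Hypothetical helper: print the major.minor declared for Python in environment.yml.
# Prints nothing if no "- python=..." style entry is found.
declared_python_version() {
    local file="${1:-environment.yml}"
    # Matches "- python=3.11.8", "- python>=3.9", "- python ==3.10";
    # skips "- python-dateutil=2.8" because "-" is not a version operator.
    sed -n 's/^[[:space:]]*-[[:space:]]*python[[:space:]]*[=<>]\{1,2\}[[:space:]]*\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' "$file" | head -n 1
}
```

Note that an entry such as `python>=3.9` is a lower bound rather than a pin, so the printed value is a starting point for the comparison, not a guarantee of the authors' exact version.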

When to use it

  • Reproducing experiments from an NLP research repository
  • Before running any installation steps for a downloaded repo
  • When dependency files (environment.yml or requirements.txt) are present
  • If tests or imports fail with cryptic errors suggesting version mismatch
  • When preparing CI or Docker images for research code

Best practices

  • Prefer environment.yml if present; it often pins Python and non-pip deps
  • Always record python -VV, python -m pip --version, and python -m pip freeze to /root/python_int.txt
  • If Python major/minor differs, create a fresh matching environment rather than force-upgrading packages
  • Install only from repo-declared files first; avoid blind pip upgrades of everything
  • After install, run quick import or smoke tests to verify critical modules load
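
The recording practice can be wrapped in one function so that it is run the same way every time. This is a sketch under stated assumptions: `log_python_env` is a hypothetical helper, and making the output path and interpreter into arguments is an illustrative choice. The defaults are the path and command named above.

```shell
# Hypothetical helper: write the three required snapshots to a log file.
# Usage: log_python_env [output_file] [interpreter]
log_python_env() {
    local out="${1:-/root/python_int.txt}" py="${2:-python}"
    {
        # Full interpreter version and build info.
        "$py" -VV 2>&1
        # pip version; fall back to a marker line if pip is missing.
        "$py" -m pip --version 2>&1 || echo "pip not available"
        # Installed packages with exact versions.
        "$py" -m pip freeze 2>&1 || echo "pip freeze failed"
    } > "$out"
}
```

Calling `log_python_env` with no arguments logs the default `python`. Passing a second argument, for example `/opt/py311/bin/python`, logs a specific interpreter instead.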

Example use cases

  • Cloning a paper's repo and ensuring Python 3.11 matches the authors' environment before running training scripts
  • Building a Docker image for reproducible NLP experiments that uses environment.yml as the source of truth
  • CI job that checks out a repo, logs current Python/pip state, and switches to the declared Python before dependency install
  • Local research environment setup where requirements.txt supplements an environment.yml base
  • Troubleshooting import errors by comparing logged environment state to repo-declared constraints

FAQ

What if both environment.yml and requirements.txt exist?

Treat environment.yml as the base environment (it may pin Python and channels). Use requirements.txt as supplemental unless the README instructs otherwise.
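
One way to encode this precedence is a small shell function that reports which file is the base and which is supplemental. This is a minimal sketch: `pick_dependency_files` is a hypothetical helper, and it only checks for the file names mentioned above in the repo root. It does not read the README for overrides.

```shell
# Hypothetical helper: report the base and supplemental dependency files in a repo.
# Usage: pick_dependency_files [repo_dir]
pick_dependency_files() {
    local repo="${1:-.}" base="" extra="" f
    # environment.yml / environment.yaml is the base when present.
    for f in environment.yml environment.yaml; do
        if [ -f "$repo/$f" ]; then
            base="$f"
            break
        fi
    done
    # requirements.txt is supplemental only if an environment file was found.
    if [ -f "$repo/requirements.txt" ]; then
        if [ -n "$base" ]; then
            extra="requirements.txt"
        else
            base="requirements.txt"
        fi
    fi
    echo "base=${base:-none} supplemental=${extra:-none}"
}
```

Run against a repo containing both files, the function reports `environment.yml` as the base and `requirements.txt` as supplemental.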

Can I upgrade Python in-place to match the repo?

Upgrading the system Python in place is risky. Prefer creating a new interpreter (venv, Docker, or a tool like uv) that matches the required major.minor, then install dependencies there.