
read-arxiv-paper skill

/.claude/skills/read-arxiv-paper

This skill fetches and summarizes an arXiv paper from a provided URL, highlighting key concepts and practical implications for your projects.

npx playbooks add skill karpathy/nanochat --skill read-arxiv-paper

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
---
name: read-arxiv-paper
description: Use this skill when asked to read an arXiv paper given an arXiv URL
---

You will be given a URL of an arxiv paper, for example:

https://www.arxiv.org/abs/2601.07372

### Part 1: Normalize the URL

The goal is to fetch the TeX source of the paper (not the PDF!). The source URL always looks like this:

https://www.arxiv.org/src/2601.07372

Notice the `/src/` in the URL. Once you have the source URL, continue with the parts below.
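A minimal sketch of the normalization step, assuming new-style arXiv ids such as `2601.07372` (optionally with a version suffix like `v2`). The helper name `to_src_url` and the regular expression are illustrative assumptions, not part of the skill; old-style ids such as `hep-th/9901001` are not handled here.

```python
import re

# Hypothetical helper: matches the id in /abs/, /pdf/ or /src/ style arXiv URLs.
_ARXIV_ID = re.compile(r"arxiv\.org/(?:abs|pdf|src)/(\d{4}\.\d{4,5}(?:v\d+)?)")


def to_src_url(url: str) -> str:
    """Rewrite an arXiv URL so that it points at the TeX source endpoint."""
    match = _ARXIV_ID.search(url)
    if match is None:
        raise ValueError(f"could not find an arXiv id in {url!r}")
    return f"https://www.arxiv.org/src/{match.group(1)}"


print(to_src_url("https://www.arxiv.org/abs/2601.07372"))
# prints https://www.arxiv.org/src/2601.07372
```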

### Part 2: Download the paper source

Fetch the URL to a local `.tar.gz` file. A good location is `~/.cache/nanochat/knowledge/{arxiv_id}.tar.gz`.

(If the file already exists, there is no need to re-download it.)
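One way the cached download could look, as a sketch only. The function name `download_source` and the injectable `fetch` parameter are assumptions made for illustration (the parameter makes the function easy to exercise without network access); the cache location is the one suggested above.

```python
from pathlib import Path
from urllib.request import urlretrieve

CACHE_DIR = Path.home() / ".cache" / "nanochat" / "knowledge"


def download_source(arxiv_id: str, cache_dir: Path = CACHE_DIR, fetch=urlretrieve) -> Path:
    """Return the cached archive path, downloading only when it is missing."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    archive = cache_dir / f"{arxiv_id}.tar.gz"
    if not archive.exists():
        # fetch(url, filename) is expected to write the response body to filename.
        fetch(f"https://www.arxiv.org/src/{arxiv_id}", str(archive))
    return archive
```

Calling `download_source("2601.07372")` twice would hit the network at most once, since the second call finds the archive already in the cache.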

### Part 3: Unpack the file in that folder

Unpack the contents into the `~/.cache/nanochat/knowledge/{arxiv_id}` directory.
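A hedged sketch of the unpacking step using Python's standard `tarfile` module. The name `unpack_source` is hypothetical, and the path check is only a basic guard against archive members that would land outside the destination directory. This assumes the download really is a tar archive, which may not hold for every paper.

```python
import tarfile
from pathlib import Path


def unpack_source(archive: Path, dest: Path) -> Path:
    """Extract a (possibly compressed) tar archive into dest and return dest."""
    dest.mkdir(parents=True, exist_ok=True)
    root = dest.resolve()
    with tarfile.open(archive, "r:*") as tar:  # "r:*" auto-detects the compression
        for member in tar.getmembers():
            target = (root / member.name).resolve()
            # Refuse members such as "../x" or "/etc/x" that escape the destination.
            if target != root and root not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(root)
    return dest
```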

### Part 4: Locate the entrypoint

A LaTeX source usually has an entrypoint, such as `main.tex` or a similarly named file.
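One possible heuristic for finding the entrypoint, offered as an assumption rather than the skill's prescribed method: treat any `.tex` file containing `\documentclass` as a candidate and prefer conventional filenames among them. The function name `find_entrypoint` and the preferred-name list are illustrative.

```python
from pathlib import Path

# Assumed list of conventional entrypoint names, checked in this order.
PREFERRED_NAMES = ("main.tex", "paper.tex", "ms.tex")


def find_entrypoint(source_dir: Path) -> Path:
    """Guess which .tex file is the top-level document."""
    candidates = [
        path
        for path in sorted(source_dir.rglob("*.tex"))
        if "\\documentclass" in path.read_text(encoding="utf-8", errors="ignore")
    ]
    if not candidates:
        raise FileNotFoundError(f"no \\documentclass found under {source_dir}")
    for name in PREFERRED_NAMES:
        for path in candidates:
            if path.name == name:
                return path
    # Fall back to the first candidate in sorted order.
    return candidates[0]
```

The heuristic can be fooled, for example by a bundled template file or a commented-out `\documentclass`, so the result is worth a quick manual check.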

### Part 5: Read the paper

Once you've found the entrypoint, read its contents, then recurse through all other relevant source files to read the whole paper.
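The recursion could be sketched as follows, assuming the paper pulls in its other files via `\input{...}` or `\include{...}` with paths relative to the entrypoint's directory. The function name `read_tex` is hypothetical. Missing files and repeated includes expand to an empty string, and commented-out includes are not filtered in this simplified version.

```python
import re
from pathlib import Path

_INCLUDE = re.compile(r"\\(?:input|include)\{([^}]+)\}")


def read_tex(path: Path, root: Path, seen=None) -> str:
    """Return the text of path with \\input and \\include targets inlined."""
    seen = set() if seen is None else seen
    path = path.resolve()
    if path in seen or not path.is_file():
        return ""  # already inlined once, or the file is missing
    seen.add(path)
    text = path.read_text(encoding="utf-8", errors="ignore")

    def expand(match: re.Match) -> str:
        child = root / match.group(1).strip()
        if child.suffix != ".tex":
            # LaTeX allows the .tex extension to be omitted.
            child = child.with_name(child.name + ".tex")
        return read_tex(child, root, seen)

    return _INCLUDE.sub(expand, text)
```

A typical call would be `read_tex(entrypoint, entrypoint.parent)`, where `entrypoint` is the top-level `.tex` file.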

### Part 6: Report

Once you've read the paper, write a summary of it to a markdown file at `./knowledge/summary_{tag}.md`. Note that 1) this uses the local knowledge directory (it's easier for me to open and reference here), not `~/.cache`, and 2) you should generate a reasonable `tag`, e.g. `conditional_memory` or whatever seems appropriate given the paper. Make sure the tag doesn't exist yet so you're not overwriting files.
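A small sketch of the "don't overwrite" check. The function name `summary_path` and the numeric-suffix fallback are assumptions for illustration; picking a different descriptive tag by hand is equally valid.

```python
from pathlib import Path


def summary_path(tag: str, knowledge_dir: Path = Path("knowledge")) -> Path:
    """Return a summary path under knowledge_dir that does not exist yet."""
    knowledge_dir.mkdir(parents=True, exist_ok=True)
    candidate = knowledge_dir / f"summary_{tag}.md"
    suffix = 2
    while candidate.exists():
        # Assumed fallback: append _2, _3, ... until the name is free.
        candidate = knowledge_dir / f"summary_{tag}_{suffix}.md"
        suffix += 1
    return candidate
```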

As for the summary itself, remember that you're processing this paper within the context of the nanochat repository, so most often we will be interested in how to apply the paper and its lessons to the nanochat project. Therefore, you should feel free to "remind yourself" of the related nanochat code by reading the relevant parts, and then explicitly make the connection between the paper and nanochat: how it might relate, and what we might be inspired by or want to try.

Overview

This skill automates reading an arXiv paper given its URL and produces a concise, actionable summary saved into the local knowledge folder. It fetches the paper source (TeX), unpacks it, locates the entrypoint, recursively reads the included source files, and creates a summary focused on applying the paper to the nanochat project. The output is a markdown file with a unique tag so existing notes are not overwritten.

How this skill works

The skill normalizes an arXiv abstract URL to the /src/ endpoint, downloads the tar.gz TeX source to a cache location, and unpacks it into a per-paper cache directory. It locates the LaTeX entrypoint (for example main.tex), reads that file and recursively loads included source files, and then synthesizes the paper content while relating findings to nanochat. Finally it writes a summary markdown file to ./knowledge with a unique tag-based filename.

When to use it

  • You need a structured, code-aware summary of an arXiv paper rather than just a PDF skim.
  • You want to extract implementation details, algorithms, or experimental protocols from LaTeX sources.
  • You plan to map research ideas directly to the nanochat codebase or implementation experiments.
  • You prefer reproducible caching of paper sources for offline review and auditing.
  • You want access to the figures, macros, and bibliography files that ship in the source tarball.

Best practices

  • Always provide the canonical arXiv abstract URL (https://arxiv.org/abs/<id>); the skill will normalize to /src/ automatically.
  • Check the cache directory (~/.cache/nanochat/knowledge) for an existing .tar.gz to avoid re-downloading.
  • Verify the detected LaTeX entrypoint and scan for common alternate entry filenames (main.tex, paper.tex, ms.tex).
  • Keep the ./knowledge folder under version control so generated summaries are tracked and reviewed.
  • Use descriptive tags for summary filenames to avoid collisions and to make later search easier.

Example use cases

  • Summarize a new foundation-model architecture and list changes to try in nanochat's training pipeline.
  • Extract pseudocode and hyperparameters from a methods section for rapid reproduction experiments.
  • Identify useful data augmentation or loss functions to integrate with nanochat components.
  • Capture experimental setups and suggested ablations to design follow-up benchmark runs.
  • Collect related-work claims and citations to cross-link with existing project literature notes.

FAQ

What if the paper has supplementary PDFs or images not in TeX?

The skill will extract any included files packaged in the source tarball; if supplementary materials are external, note them in the summary and provide the external links for manual retrieval.

Will this overwrite existing summaries?

No. The skill chooses a tag-based filename and is instructed to check that the tag does not already exist in ./knowledge; if it does, a different tag should be chosen so existing files are not overwritten.