evaluating-paper-relevance skill

/skills/research/evaluating-paper-relevance

This skill applies a two-stage abstract screening and deep-dive extraction to identify papers with the exact data and methods you need.

npx playbooks add skill kthorn/research-superpower --skill evaluating-paper-relevance

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
18.8 KB
---
name: Evaluating Paper Relevance
description: Two-stage paper screening - abstract scoring then deep dive for specific data extraction
when_to_use: After literature search returns results. When need to determine if paper contains specific data. When screening papers for relevance. When extracting methods, results, data from papers.
version: 1.0.0
---

# Evaluating Paper Relevance

## Overview

Two-stage screening process: quick abstract scoring followed by deep dive into promising papers.

**Core principle:** Precision over breadth. Find papers that actually contain the specific data/methods user needs, not just topically related papers.

## When to Use

Use this skill when:
- Have list of papers from search
- Need to determine which papers have relevant data
- User asks for specific information (measurements, protocols, datasets, etc.)
- Screening papers one-by-one
- Any research domain (medicinal chemistry, genomics, ecology, computational methods, etc.)

## Choosing Your Approach

**Small searches (<50 papers):**
- Manual screening with progress reporting
- Use papers-reviewed.json + SUMMARY.md only
- No helper scripts needed
- Report progress to user for every paper

**Large searches (50-150 papers):**
- Consider helper scripts (screen_papers.py + deep_dive_papers.py)
- Use Progressive Enhancement Pattern (see Helper Scripts section)
- Create README.md with methodology
- May want TOP_PRIORITY_PAPERS.md for quick reference
- Use richer JSON structure (evaluated-papers.json categorized by relevance)
- Consider using subagent-driven-review skill for parallel screening

**Very large searches (>150 papers):**
- Definitely use helper scripts with Progressive Enhancement Pattern
- Create full auxiliary documentation suite (README.md, TOP_PRIORITY_PAPERS.md)
- Consider citation network analysis
- Plan for multi-week timeline
- Strongly consider subagent-driven-review skill for parallelization
- May need multiple consolidation checkpoints

## Two-Stage Process

### Stage 1: Abstract Screening (Fast)

**Goal:** Quickly identify promising papers

**Score 0-10 based on:**
- **Keywords match (0-3 points)**: Does abstract mention key terms relevant to the query?
- **Data type match (0-4 points)**: Does it mention the specific information user needs?
  - Examples: measurements (IC50, expression levels, population sizes), protocols, datasets, structures, sequences, code
- **Specificity (0-3 points)**: Is it specific to user's question or just general background/review?

**Decision rules:**
- Score < 5: Skip (not relevant)
- Score 5-6: Note in summary as "possibly relevant" but skip for now
- Score ≥ 7: Proceed to Stage 2 (deep dive)
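
The rubric and decision rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill itself: the `AbstractScore` class, the `decide` function, and the returned labels are hypothetical names, and assigning the three component scores still requires reading the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractScore:
    keywords: int      # 0-3: key terms from the query appear in the abstract
    data_type: int     # 0-4: the specific data the user needs is mentioned
    specificity: int   # 0-3: focused on the question, not general background

    def __post_init__(self) -> None:
        # Reject component scores outside the ranges the rubric allows.
        limits = {"keywords": 3, "data_type": 4, "specificity": 3}
        for name, upper in limits.items():
            value = getattr(self, name)
            if not 0 <= value <= upper:
                raise ValueError(f"{name} must be in 0..{upper}, got {value}")

    @property
    def total(self) -> int:
        return self.keywords + self.data_type + self.specificity


def decide(total: int) -> str:
    """Map a 0-10 total to one of the three decisions (labels are hypothetical)."""
    if total >= 7:
        return "deep_dive"
    if total >= 5:
        return "possibly_relevant"
    return "skip"


score = AbstractScore(keywords=3, data_type=3, specificity=2)
print(score.total, decide(score.total))  # 8 deep_dive
```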

**IMPORTANT: Report to user for EVERY paper:**
```
📄 [N/Total] Screening: "Paper Title"
   Abstract score: 8 → Fetching full text...
```

or

```
📄 [N/Total] Screening: "Paper Title"
   Abstract score: 4 → Skipping (insufficient relevance)
```

**Never screen silently** - user needs to see progress happening

### Stage 2: Deep Dive (Thorough)

**Goal:** Extract specific data/methods from promising papers

#### 1. Check ChEMBL (for medicinal chemistry papers)

**If paper describes medicinal chemistry / SAR data:**

Use `skills/research/checking-chembl` to check if paper is in ChEMBL database:
```bash
curl -s "https://www.ebi.ac.uk/chembl/api/data/document.json?doi=$doi"
```

**If found in ChEMBL:**
- Note ChEMBL ID and activity count in SUMMARY.md
- Report to user: "✓ ChEMBL: CHEMBL3870308 (45 data points)"
- Structured SAR data available without PDF parsing

**Continue to full text fetch for context, methods, discussion.**
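
For scripted use, the lookup can be split into building the URL and reading the decoded JSON. The sketch below assumes the response is shaped like `{"documents": [{"document_chembl_id": ...}]}`; check that against a real response before relying on it. The function names are hypothetical, and no network call is made here.

```python
from typing import Optional
from urllib.parse import urlencode

CHEMBL_DOCUMENT_URL = "https://www.ebi.ac.uk/chembl/api/data/document.json"


def chembl_document_url(doi: str) -> str:
    """Build the lookup URL, percent-encoding the DOI."""
    return f"{CHEMBL_DOCUMENT_URL}?{urlencode({'doi': doi})}"


def extract_chembl_id(payload: dict) -> Optional[str]:
    """Return the first document ID in a decoded response, or None if absent.

    Assumption: the payload has a "documents" list whose items carry
    a "document_chembl_id" key.
    """
    documents = payload.get("documents") or []
    if not documents:
        return None
    return documents[0].get("document_chembl_id")


print(chembl_document_url("10.1234/example.2023"))
print(extract_chembl_id({"documents": []}))  # None
```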

#### 2. Fetch Full Text

**Try in order:**

**A. PubMed Central (free full text):**
```bash
# Check if available in PMC
curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pmc&term=PMID[PMID]&retmode=json"

# If found, fetch full text XML via API
curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=PMCID&rettype=full&retmode=xml"

# Or fetch HTML directly (note: use pmc.ncbi.nlm.nih.gov, not www.ncbi.nlm.nih.gov/pmc)
curl "https://pmc.ncbi.nlm.nih.gov/articles/PMCID/"
```

**B. DOI resolution:**
```bash
# Try publisher link
curl -L "https://doi.org/10.1234/example.2023"
# May hit paywall - check response
```

**C. Unpaywall (MANDATORY if paywalled):**
**CRITICAL: If step B hits a paywall, you MUST immediately try Unpaywall before giving up.**

Use `skills/research/finding-open-access-papers` to find free OA version:
```bash
curl "https://api.unpaywall.org/v2/DOI?email=USER_EMAIL"
# Often finds versions in repositories, preprint servers, author copies
# IMPORTANT: Ask user for their email if not already provided - do NOT use a placeholder or made-up address
```
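
A hedged Python sketch of the same lookup, split into URL construction and response handling. It assumes the decoded response exposes `best_oa_location` with `url_for_pdf` and `url` keys; the function names are hypothetical, and the email check is only a rough guard against passing an empty or obviously invalid value.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def unpaywall_url(doi: str, email: str) -> str:
    """Build the Unpaywall v2 lookup URL for a DOI."""
    if "@" not in email:
        raise ValueError("a real contact email from the user is required")
    return f"https://api.unpaywall.org/v2/{quote(doi, safe='/')}?{urlencode({'email': email})}"


def best_oa_url(payload: dict) -> Optional[str]:
    """Pick a full-text URL from a decoded response, preferring a direct PDF.

    Assumption: "best_oa_location" is either missing, null, or a dict with
    optional "url_for_pdf" and "url" keys.
    """
    location = payload.get("best_oa_location") or {}
    return location.get("url_for_pdf") or location.get("url")
```

Returning `None` from `best_oa_url` corresponds to the "no open access version available" case, at which point the evaluation continues with the abstract only.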

Report to user:
```
⚠️  Paper behind paywall, checking Unpaywall...
✓ Found open access version at [repository/preprint server]
```

or

```
⚠️  Paper behind paywall, checking Unpaywall...
✗ No open access version available - continuing with abstract only
```

**D. Preprints (direct):**
- Check bioRxiv: `https://www.biorxiv.org/content/10.1101/{doi}`
- Check arXiv (for computational papers)

**If full text unavailable AFTER trying Unpaywall:**
- Note in SUMMARY.md: "⚠️ Full text behind paywall - no OA version found via Unpaywall"
- Continue with abstract-only evaluation (limited)

**CRITICAL: Do NOT skip Unpaywall check. Many paywalled papers have free versions in repositories.**
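
The "try in order" rule can be made explicit with a small fallback loop. This sketch takes the fetchers as arguments so that the order (PMC, DOI, Unpaywall, preprints) is visible and no source is skipped silently. `fetch_full_text` and the stub fetchers are hypothetical; real fetchers would wrap the requests shown above.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Fetcher = Callable[[str], Optional[str]]  # takes a DOI, returns full text or None


def fetch_full_text(
    doi: str, sources: Sequence[Tuple[str, Fetcher]]
) -> Tuple[Optional[str], List[str]]:
    """Try each source in order; return (text, names of the sources attempted)."""
    attempted: List[str] = []
    for name, fetch in sources:
        attempted.append(name)
        try:
            text = fetch(doi)
        except OSError:
            # Network failures count as a miss, so the next source is still tried.
            text = None
        if text:
            return text, attempted
    return None, attempted


# Stub fetchers stand in for real PMC, DOI, Unpaywall, and preprint lookups.
sources = [
    ("pmc", lambda doi: None),
    ("doi", lambda doi: None),
    ("unpaywall", lambda doi: "<full text>"),
    ("preprint", lambda doi: None),
]
text, attempted = fetch_full_text("10.1234/example.2023", sources)
print(attempted)  # ['pmc', 'doi', 'unpaywall']
```

The returned list of attempted sources makes it easy to confirm that Unpaywall was tried before a paper is recorded as abstract-only.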

#### 2. Scan for Relevant Content

**Focus on sections:**
- **Methods**: Experimental procedures, protocols
- **Results**: Data tables, figures, measurements
- **Tables/Figures**: Often contain the specific data user needs
- **Supplementary Information**: Additional data, extended methods

**What to look for (adapt to research domain):**
- Specific data user requested
  - **Medicinal chemistry**: IC50 values, compound structures, SAR data
  - **Genomics**: Gene expression levels, sequences, variant data
  - **Ecology**: Population measurements, species counts, environmental parameters
  - **Computational**: Algorithms, code availability, performance benchmarks
  - **Clinical**: Patient outcomes, treatment protocols, sample sizes
- Methods/protocols described in detail
- Statistical analysis and significance
- Data availability statements
- Code/data repositories mentioned

**Use grep/text search (adapt search terms):**
```bash
# Examples for different domains
grep -iwE "IC50|Ki|MIC" paper.xml                    # Medicinal chemistry (-w keeps "Ki" from matching inside "kinase")
grep -i "expression\|FPKM\|RNA-seq" paper.xml        # Genomics
grep -i "abundance\|population\|sampling" paper.xml  # Ecology
grep -i "algorithm\|github\|code" paper.xml          # Computational
```
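
When the search runs inside a helper script rather than a shell, a Python equivalent might look like the sketch below. It matches whole words only and returns line numbers so hits can be cited. `find_terms` is a hypothetical helper, and the term lists remain domain-specific choices.

```python
import re
from typing import Iterable, List, Tuple


def find_terms(text: str, terms: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs that contain any term as a whole word."""
    terms = list(terms)
    if not terms:
        return []
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


sample = (
    "Kinase assays were run in triplicate.\n"
    "Compound 7 showed IC50 = 12 nM.\n"
    "Ki values are listed in Table 2."
)
for number, line in find_terms(sample, ["IC50", "Ki", "MIC"]):
    print(number, line)
# 2 Compound 7 showed IC50 = 12 nM.
# 3 Ki values are listed in Table 2.
```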

#### 3. Extract Findings

**Create structured extraction (adapt to research domain):**

**Example 1: Medicinal chemistry**
```json
{
  "doi": "10.1234/medchem.2023",
  "title": "Novel kinase inhibitors...",
  "relevance_score": 9,
  "findings": {
    "data_found": [
      "IC50 values for compounds 1-12 (Table 2)",
      "Selectivity data (Figure 3)",
      "Synthesis route (Scheme 1)"
    ],
    "key_results": [
      "Compound 7: IC50 = 12 nM",
      "10-step synthesis, 34% yield"
    ]
  }
}
```

**Example 2: Genomics**
```json
{
  "doi": "10.1234/genomics.2023",
  "title": "Gene expression in disease...",
  "relevance_score": 8,
  "findings": {
    "data_found": [
      "RNA-seq data for 50 samples (GEO: GSE12345)",
      "Differential expression results (Table 1)",
      "Gene set enrichment analysis (Figure 4)"
    ],
    "key_results": [
      "123 genes upregulated (FDR < 0.05)",
      "Pathway enrichment: immune response"
    ]
  }
}
```

**Example 3: Computational methods**
```json
{
  "doi": "10.1234/compbio.2023",
  "title": "Novel alignment algorithm...",
  "relevance_score": 9,
  "findings": {
    "data_found": [
      "Algorithm pseudocode (Methods)",
      "Code repository (github.com/user/tool)",
      "Benchmark results (Table 2)"
    ],
    "key_results": [
      "10x faster than BLAST",
      "98% accuracy on test dataset"
    ]
  }
}
```

#### 4. Download Materials

**PDFs:**
```bash
# If PDF available
curl -L -o "papers/$(echo "$doi" | tr '/' '_').pdf" "https://doi.org/$doi"
```

**Supplementary data:**
```bash
# Download SI files if URLs found (replace '/' in the DOI so the file stays inside papers/)
curl -o "papers/$(echo "$doi" | tr '/' '_')_supp.zip" "https://publisher.com/supp/file.zip"
```
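
DOIs contain `/`, so they cannot be used directly as file names. A small helper, sketched here with hypothetical names, keeps the naming consistent between PDFs and supplementary files:

```python
import re
from pathlib import Path


def doi_to_filename(doi: str, suffix: str = ".pdf") -> str:
    """Turn a DOI into a flat, filesystem-safe file name."""
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", doi.strip())
    return f"{safe}{suffix}"


def paper_path(doi: str, folder: str = "papers", suffix: str = ".pdf") -> Path:
    """Path for a downloaded file inside the papers/ folder."""
    return Path(folder) / doi_to_filename(doi, suffix)


print(paper_path("10.1234/example.2023").as_posix())
# papers/10.1234_example.2023.pdf
print(paper_path("10.1234/example.2023", suffix="_supp.zip").name)
# 10.1234_example.2023_supp.zip
```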

#### 5. Update Tracking Files

**CRITICAL: Use ONLY papers-reviewed.json and SUMMARY.md. Do NOT create custom tracking files.**

**CRITICAL: Add EVERY paper to papers-reviewed.json, regardless of score. This prevents re-reviewing papers and tracks complete search history.**

**Add to papers-reviewed.json:**

**For relevant papers (score ≥7):**
```json
{
  "10.1234/example.2023": {
    "pmid": "12345678",
    "status": "relevant",
    "score": 9,
    "source": "pubmed_search",
    "timestamp": "2025-10-11T10:30:00Z",
    "found_data": ["IC50 values", "synthesis methods"],
    "has_full_text": true,
    "chembl_id": "CHEMBL1234567"
  }
}
```

**For not-relevant papers (score <7):**
```json
{
  "10.1234/another.2023": {
    "pmid": "12345679",
    "status": "not_relevant",
    "score": 4,
    "source": "pubmed_search",
    "timestamp": "2025-10-11T10:31:00Z",
    "reason": "no activity data, review paper"
  }
}
```

**Always add papers even if skipped** - this prevents re-processing and documents what was already checked.
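
A minimal sketch of how a script might maintain papers-reviewed.json in the shape shown above. The helper names are hypothetical. The status follows the two examples (score ≥7 is `relevant`, anything lower is `not_relevant`), and extra fields such as `pmid`, `found_data`, or `reason` are passed through unchanged.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_paper(path: Path, doi: str, score: int, source: str, **extra) -> dict:
    """Add or overwrite one DOI entry in the tracking file and return the entry."""
    reviewed = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    entry = {
        "status": "relevant" if score >= 7 else "not_relevant",
        "score": score,
        "source": source,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        **extra,
    }
    reviewed[doi] = entry
    path.write_text(json.dumps(reviewed, indent=2), encoding="utf-8")
    return entry


def already_reviewed(path: Path, doi: str) -> bool:
    """Check the tracking file before screening a paper again."""
    return path.exists() and doi in json.loads(path.read_text(encoding="utf-8"))
```

Calling `already_reviewed` before screening and `record_paper` after every decision, including skips, keeps the file a complete record of the search.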

**Add to SUMMARY.md (examples for different domains):**

**Medicinal chemistry example:**
```markdown
### [Novel kinase inhibitors with improved selectivity](https://doi.org/10.1234/medchem.2023) (Score: 9)

**DOI:** [10.1234/medchem.2023](https://doi.org/10.1234/medchem.2023)
**PMID:** [12345678](https://pubmed.ncbi.nlm.nih.gov/12345678/)
**ChEMBL:** [CHEMBL1234567](https://www.ebi.ac.uk/chembl/document_report_card/CHEMBL1234567/)

**Key Findings:**
- IC50 values for 12 inhibitors (Table 2)
- Compound 7: IC50 = 12 nM, >80-fold selectivity
- Synthesis route (Scheme 1, page 4)

**Files:** PDF, supplementary data
```

**Genomics example:**
```markdown
### [Transcriptomic analysis of disease progression](https://doi.org/10.1234/genomics.2023) (Score: 8)

**DOI:** [10.1234/genomics.2023](https://doi.org/10.1234/genomics.2023)
**PMID:** [23456789](https://pubmed.ncbi.nlm.nih.gov/23456789/)
**Data:** [GEO: GSE12345](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE12345)

**Key Findings:**
- RNA-seq data: 50 samples, 3 conditions
- 123 differentially expressed genes (FDR < 0.05)
- Immune pathway enrichment (Figure 3)

**Files:** PDF, supplementary tables with gene lists
```

**Computational methods example:**
```markdown
### [Fast sequence alignment with novel algorithm](https://doi.org/10.1234/compbio.2023) (Score: 9)

**DOI:** [10.1234/compbio.2023](https://doi.org/10.1234/compbio.2023)
**Code:** [github.com/user/tool](https://github.com/user/tool)

**Key Findings:**
- New alignment algorithm (pseudocode in Methods)
- 10x faster than BLAST, 98% accuracy
- Benchmark datasets available

**Files:** PDF, code repository linked
```

**IMPORTANT: Always make DOIs and PMIDs clickable links:**
- DOI format: `[10.1234/example.2023](https://doi.org/10.1234/example.2023)`
- PMID format: `[12345678](https://pubmed.ncbi.nlm.nih.gov/12345678/)`
- Makes papers easy to access directly from SUMMARY.md
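
If SUMMARY.md entries are generated by a script, the link formats above are easy to enforce in one place. The helpers below are a sketch with hypothetical names; they reproduce the heading, DOI, PMID, and Key Findings layout of the examples and omit the domain-specific lines (ChEMBL, Data, Code, Files).

```python
from typing import List


def doi_link(doi: str) -> str:
    return f"[{doi}](https://doi.org/{doi})"


def pmid_link(pmid: str) -> str:
    return f"[{pmid}](https://pubmed.ncbi.nlm.nih.gov/{pmid}/)"


def summary_entry(title: str, doi: str, score: int, findings: List[str], pmid: str = "") -> str:
    """Render one SUMMARY.md entry with clickable DOI and PMID links."""
    lines = [
        f"### [{title}](https://doi.org/{doi}) (Score: {score})",
        "",
        f"**DOI:** {doi_link(doi)}",
    ]
    if pmid:
        lines.append(f"**PMID:** {pmid_link(pmid)}")
    lines += ["", "**Key Findings:**"]
    lines += [f"- {item}" for item in findings]
    return "\n".join(lines) + "\n"


print(doi_link("10.1234/example.2023"))
# [10.1234/example.2023](https://doi.org/10.1234/example.2023)
```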

## Progress Reporting

**CRITICAL: Report to user as you work - never work silently!**

**For every paper, report:**
1. **Start screening:** `📄 [N/Total] Screening: "Title..."`
2. **Abstract score:** `Abstract score: X/10`
3. **Decision:** What you're doing next (fetching full text / skipping / etc)

**For relevant papers, report findings immediately (adapt to domain):**

**Medicinal chemistry example:**
```
📄 [15/127] Screening: "Selective BTK inhibitors..."
   Abstract score: 8 → Fetching full text...
   ✓ Found IC50 data for 8 compounds (Table 2)
   ✓ Selectivity data vs 50 kinases (Figure 3)
   → Added to SUMMARY.md
```

**Genomics example:**
```
📄 [23/89] Screening: "Gene expression in liver disease..."
   Abstract score: 9 → Fetching full text...
   ✓ RNA-seq data available (GEO: GSE12345)
   ✓ 123 DEGs identified (Table 1, FDR < 0.05)
   → Added to SUMMARY.md
```

**Computational methods example:**
```
📄 [7/45] Screening: "Novel phylogenetic algorithm..."
   Abstract score: 8 → Fetching full text...
   ✓ Code available (github.com/user/tool)
   ✓ Benchmark results (10x faster, Table 2)
   → Added to SUMMARY.md
```

**Update user every 5-10 papers with summary:**
```
📊 Progress: Reviewed 30/127 papers
   - Highly relevant: 3
   - Relevant: 5
   - Currently screening paper 31...
```
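
In a helper script, the per-paper report and the periodic summary trigger could be produced by small functions like these. The names are hypothetical, the wording for scores of 5-6 is an assumption (the examples above only show the fetch and skip cases), and the default interval of 5 is simply the lower end of the 5-10 range.

```python
def next_action(score: int) -> str:
    """Describe the next step for an abstract score."""
    if score >= 7:
        return "Fetching full text..."
    if score >= 5:
        return "Possibly relevant, noted for later"  # assumed wording
    return "Skipping (insufficient relevance)"


def screening_report(index: int, total: int, title: str, score: int) -> str:
    """Format the two-line report printed for every paper."""
    return (
        f'📄 [{index}/{total}] Screening: "{title}"\n'
        f"   Abstract score: {score} → {next_action(score)}"
    )


def summary_due(reviewed: int, every: int = 5) -> bool:
    """True when a periodic progress summary should be printed."""
    return reviewed > 0 and reviewed % every == 0


print(screening_report(15, 127, "Selective BTK inhibitors...", 8))
```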

**Why this matters:** User needs to see work happening and provide feedback/corrections early

## Integration with Other Skills

**For medicinal chemistry papers:**
- **Use `skills/research/checking-chembl`** to find curated SAR data
- Check BEFORE attempting to parse activity tables from PDFs
- ~30-40% of medicinal chemistry papers have ChEMBL data

**During full text fetching:**
- **If paywalled: MANDATORY to use `skills/research/finding-open-access-papers` (Unpaywall)**
- Do NOT skip this step - Unpaywall finds ~50% of paywalled papers for free

**After finding relevant paper:**
1. **Check ChEMBL** (if medicinal chemistry)
2. **Extract findings** to SUMMARY.md
3. **Download files** to papers/ folder
4. **Call traversing-citations skill** to find related papers
5. **Update papers-reviewed.json** to avoid re-processing

## Scoring Rubric

| Score | Meaning | Action |
|-------|---------|--------|
| 0-4 | Not relevant | Skip, brief note in summary |
| 5-6 | Possibly relevant | Note for later, skip deep dive for now |
| 7-8 | Relevant | Deep dive, extract data, add to summary |
| 9-10 | Highly relevant | Deep dive, extract data, follow citations, highlight in summary |

## Helper Scripts (Optional)

**When screening many papers (>20), consider creating a helper script:**

**Benefits:**
- Batch processing with rate limiting
- Consistent scoring logic
- Save intermediate results
- Resume after interruption

**Create in research session folder:**
```python
# research-sessions/YYYY-MM-DD-query/screen_papers.py
```

**Key components:**
1. **Fetch abstracts** - PubMed efetch with error handling
2. **Score abstracts** - Implement scoring rubric (0-10)
3. **Rate limiting** - 500ms delay between API calls (or longer if running parallel subagents)
4. **Save results** - JSON with scored papers categorized by relevance
5. **Progress reporting** - Print status as it runs
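
The rate-limiting component can be sketched as a generator that spaces out calls. This is an illustration under stated assumptions: `rate_limited` is a hypothetical helper, `call` stands in for whatever function fetches one abstract, and the `sleep` and `clock` parameters exist only so the timing can be tested without waiting.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def rate_limited(
    items: Iterable[str],
    call: Callable[[str], T],
    min_interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Tuple[str, T]]:
    """Yield (item, call(item)), keeping at least min_interval seconds between call starts."""
    last_start = None
    for item in items:
        if last_start is not None:
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        yield item, call(item)


# Stand-in for a real abstract fetch; prints progress as results arrive.
for pmid, abstract in rate_limited(["111", "222"], lambda p: f"abstract {p}", min_interval=0.01):
    print(pmid, abstract)
```

Because results are yielded one at a time, the caller can save intermediate results and print status after each paper.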

### Progressive Enhancement Pattern (Recommended for 50+ papers)

**For large-scale screening, use two-script pattern:**

**Script 1: Abstract Screening** (`screen_papers.py`)
- Batch fetch abstracts
- Score using rubric (0-10)
- Categorize by relevance
- Output: `evaluated-papers.json` with basic metadata

**Script 2: Deep Dive** (`deep_dive_papers.py`)
- Read Script 1 output
- Fetch full text for highly relevant papers (score ≥8)
- Extract domain-specific data (measurements, protocols, datasets, etc.)
- Update same JSON file with enhanced metadata

**Benefits:**
- **Can run steps independently** - Score abstracts once, re-run deep dive multiple times
- **Resume if interrupted** - No need to re-fetch abstracts if deep dive fails
- **Re-run deep dive without re-scoring abstracts** - Adjust extraction logic, keep scores
- **Consistent and reproducible** - Same scoring logic applied to all papers
- **Save API calls** - Abstract screening happens once, deep dive only on relevant papers

**Script design:**
- Parameterize keywords and data types for specific query
- Progressive enhancement - add detail to same JSON file
- Include rate limiting (500ms between API calls for single script, longer if parallel)
- Keep scripts with research session for reproducibility
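
The core of the pattern is that the second script reads the first script's JSON, enriches qualifying entries in place, and can be re-run safely. The sketch below assumes a flat file layout mapping each DOI to an entry with a `score` key; the real evaluated-papers.json may be organized differently (for example, grouped by relevance category). `deep_dive_pass`, the `deep_dive` key, and `extract` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Callable


def deep_dive_pass(path: Path, extract: Callable[[str], dict], threshold: int = 8) -> int:
    """Enrich qualifying entries in the shared JSON file; return how many were updated.

    Assumption: the file maps DOI -> {"score": int, ...}, as written by the
    screening pass.
    """
    papers = json.loads(path.read_text(encoding="utf-8"))
    updated = 0
    for doi, entry in papers.items():
        if entry.get("score", 0) < threshold or "deep_dive" in entry:
            continue  # below threshold, or already enriched on an earlier run
        entry["deep_dive"] = extract(doi)
        updated += 1
        # Write after every paper so an interrupted run can resume where it stopped.
        path.write_text(json.dumps(papers, indent=2), encoding="utf-8")
    return updated
```

To re-run the deep dive with adjusted extraction logic, remove the `deep_dive` keys (or use a different key) and call the function again; the abstract scores are left untouched.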

**When NOT to create helper script:**
- Few papers (<20)
- One-off quick searches
- Manual screening is faster

## Common Mistakes

- **Not tracking all papers:** Only adding relevant papers to papers-reviewed.json → Add EVERY paper regardless of score to prevent re-review
- **Skipping Unpaywall:** Hitting paywall and giving up → ALWAYS check Unpaywall first, many papers have free versions
- **Creating unnecessary files for small searches:** For <50 papers, use ONLY papers-reviewed.json and SUMMARY.md. For large searches (>100 papers), structured evaluated-papers.json and auxiliary files (README.md, TOP_PRIORITY_PAPERS.md) add significant value and should be used.
- **Too strict:** Skipping papers that mention data indirectly → Re-read abstract carefully
- **Too lenient:** Deep diving into tangentially related papers → Focus on specific data user needs
- **Missing supplementary data:** Many papers hide key data in SI → Always check for supplementary files
- **Silent screening:** User can't see progress → Report EVERY paper as you screen it
- **No periodic summaries:** User loses big picture → Update every 5-10 papers
- **Non-clickable DOIs/PMIDs:** Plain text identifiers → Always use markdown links
- **Re-reviewing papers:** Wastes time → Always check papers-reviewed.json first
- **Not using helper scripts:** Manually screening 100+ papers → Consider batch script

## Quick Reference

| Task | Action |
|------|--------|
| Check if reviewed | Look up DOI in papers-reviewed.json |
| Score abstract | Keywords (0-3) + Data type (0-4) + Specificity (0-3) |
| Get full text | Try PMC → DOI → Unpaywall → Preprints |
| Find data | Grep for terms, focus on Methods/Results/Tables |
| Download PDF | `curl -L -o papers/FILE.pdf URL` |
| Update tracking | Add to papers-reviewed.json + SUMMARY.md |

## Next Steps

After evaluating paper:
- If score ≥ 7: Call `skills/research/traversing-citations`
- Continue to next paper in search results
- Check if reached 50 papers or 5 minutes → ask user to continue or stop

## Auxiliary Files (for large searches >100 papers)

### README.md Template

**Use this structure for research projects with 100+ papers:**

1. **Project Overview**
   - Query description
   - Target molecules/topics
   - Date completed

2. **Quick Start Guide**
   - Where to start reading
   - Priority lists

3. **File Inventory**
   - Description of each file
   - What each is used for

4. **Key Findings Summary**
   - Statistics
   - Top findings
   - Coverage by category

5. **Methodology**
   - Scoring rubric
   - Decision rules
   - Data sources

6. **Next Steps**
   - Recommended actions
   - Priority order

### TOP_PRIORITY_PAPERS.md Template

**For datasets with >50 relevant papers, create curated priority list:**

- Organized by tier (Tier 1: Must-read, Tier 2: High-value, etc.)
- Include score, DOI, key findings summary
- Note full text availability
- Suggest reading order

**Example structure:**
```markdown
# Top Priority Papers

## Tier 1: Must-Read (Score 10)

### [Paper Title](https://doi.org/10.xxxx/yyyy) (Score: 10)

**DOI:** [10.xxxx/yyyy](https://doi.org/10.xxxx/yyyy)
**PMID:** [12345678](https://pubmed.ncbi.nlm.nih.gov/12345678/)
**Full text:** ✓ PMC12345678

**Key Findings:**
- Finding 1
- Finding 2

---

## Tier 2: High-Value (Score 8-9)

[Additional papers organized by priority...]
```

Overview

This skill performs two-stage paper screening: a fast abstract scoring pass followed by targeted deep dives for papers that pass a relevance threshold. The goal is precision over breadth: find papers that actually contain the data or methods requested, then extract and document those findings. It includes mandatory progress reporting and structured tracking of results.

How this skill works

First, it scores each abstract 0–10 using keyword match, data-type match, and specificity. Papers scoring ≥7 advance to a deep dive where full text is fetched (PubMed Central, DOI resolution, Unpaywall), ChEMBL is checked for medicinal-chemistry SAR, and methods/results/tables/supplementary files are searched. Findings are recorded in a structured summary and every paper is logged to prevent re-review.

When to use it

  • You have a search result list and must prioritize which papers to inspect.
  • You need specific measurements, protocols, datasets, or code from the literature.
  • Screening a small to large corpus (single-review or batch workflows).
  • When you must document progress for collaborators or maintain audit trails.
  • When medicinal chemistry, genomics, ecology, computational, or clinical data is required.

Best practices

  • Always report progress for every paper; never screen silently. Show the abstract score and decision.
  • Use the 0–10 rubric: keywords (0–3), data-type (0–4), specificity (0–3); skip <5, hold 5–6, deep-dive ≥7.
  • If paywalled, always try Unpaywall before giving up; ask user for an email if needed for API calls.
  • For medicinal chemistry, check ChEMBL before parsing PDFs to leverage curated SAR data.
  • Log every paper into papers-reviewed.json and summarize relevant finds in SUMMARY.md to avoid duplication.

Example use cases

  • Quickly triage 30 search hits to find papers with IC50 or SAR tables for a drug project.
  • Screen genomics hits to locate RNA-seq datasets and GEO accession numbers for meta-analysis.
  • Locate computational-method papers with code repositories and benchmark tables for replication.
  • Process a large literature sweep with helper scripts that batch abstract fetch, scoring, and resumable deep dives.
  • Identify ecology studies reporting population counts and measurement protocols for a systematic review.

FAQ

What if full text is behind a paywall?

Always call Unpaywall to search for author copies, preprints, or repository versions. If no OA version is found, note the limitation and continue using the abstract only.

How strict is the scoring threshold?

Use the rubric as guidance: <5 = skip, 5–6 = possibly relevant (defer), ≥7 = deep dive. Adjust thresholds with user agreement for more or less sensitivity.

Do I need to download every PDF?

Download PDFs and supplementary files only for papers scored ≥7 or if the user explicitly requests full-text retrieval; always record availability in SUMMARY.md.