bowtie2-alignment skill

/read-alignment/bowtie2-alignment

This skill assists with DNA read alignment using Bowtie2 in end-to-end or local mode, with settings suited to ChIP-seq and ATAC-seq workflows.

npx playbooks add skill gptomics/bioskills --skill bowtie2-alignment

---
name: bio-read-alignment-bowtie2-alignment
description: Align short reads using Bowtie2 with local or end-to-end modes. Supports gapped alignment. Use when aligning ChIP-seq, ATAC-seq, or when flexible alignment modes are needed.
tool_type: cli
primary_tool: bowtie2
---

## Version Compatibility

Reference examples tested with: samtools 1.19+

Before using code patterns, verify installed versions match. If versions differ:
- CLI: `<tool> --version` then `<tool> --help` to confirm flags

If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
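
The version check described above can be scripted. The following is a minimal sketch, assuming both tools are expected on `PATH`; `report_tool` is a hypothetical helper name, not part of either tool.

```bash
# Print the first version line of a tool, or a note if it is not installed.
# report_tool is a hypothetical helper name used only for this sketch.
report_tool() {
    tool=$1
    if command -v "$tool" >/dev/null 2>&1; then
        # bowtie2 and samtools both accept --version
        printf '%s: %s\n' "$tool" "$("$tool" --version 2>&1 | head -n 1)"
    else
        printf '%s: not found\n' "$tool"
        return 1
    fi
}

for tool in bowtie2 samtools; do
    report_tool "$tool" || true
done
```

If a tool is reported as missing or older than expected, confirm the flags with `<tool> --help` before running the examples.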

# Bowtie2 Alignment

**"Align DNA reads with Bowtie2"** → Map short reads to a reference genome using Bowtie2's end-to-end or local alignment modes.
- CLI: `bowtie2 -x index -1 R1.fq -2 R2.fq | samtools sort -o aligned.bam`

## Build Index

```bash
# Build index from reference FASTA
bowtie2-build reference.fa reference_index

# With threads (faster)
bowtie2-build --threads 8 reference.fa reference_index

# Creates: reference_index.1.bt2, .2.bt2, .3.bt2, .4.bt2, .rev.1.bt2, .rev.2.bt2
```
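
Index building can take a long time on large genomes, so it may be worth skipping when the index is already present. This is a sketch under the assumption that the presence of the first index file is a good enough signal; both function names are hypothetical. Note that `bowtie2-build` writes `.bt2l` files instead of `.bt2` for very large references.

```bash
# Succeed if a Bowtie2 index with the given basename already exists.
# Small indexes end in .bt2, large indexes in .bt2l.
bt2_index_exists() {
    prefix=$1
    [ -f "${prefix}.1.bt2" ] || [ -f "${prefix}.1.bt2l" ]
}

# Build the index only when it is missing (hypothetical helper name).
build_index_if_missing() {
    fasta=$1
    prefix=$2
    if bt2_index_exists "$prefix"; then
        echo "Index ${prefix} found, skipping build"
    else
        bowtie2-build --threads 8 "$fasta" "$prefix"
    fi
}
```

Usage: `build_index_if_missing reference.fa reference_index`.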

## Basic Alignment

```bash
# Paired-end reads
bowtie2 -p 8 -x reference_index -1 reads_1.fq.gz -2 reads_2.fq.gz -S aligned.sam

# Single-end reads
bowtie2 -p 8 -x reference_index -U reads.fq.gz -S aligned.sam

# Direct to sorted BAM
bowtie2 -p 8 -x reference_index -1 r1.fq.gz -2 r2.fq.gz | \
    samtools sort -@ 4 -o aligned.sorted.bam -
```
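
When Bowtie2 is piped into `samtools sort`, the shell reports only the exit status of the last command unless `pipefail` is set. The wrapper below is a sketch of one way to guard the streaming pattern; `align_to_sorted_bam` and the log file name are assumptions made for illustration.

```bash
# Hypothetical wrapper around the "direct to sorted BAM" pattern above.
# The body runs in a subshell, so exit and set -o do not affect the caller.
align_to_sorted_bam() (
    index=$1; r1=$2; r2=$3; out=$4
    for f in "$r1" "$r2"; do
        if [ ! -r "$f" ]; then
            echo "missing input: $f" >&2
            exit 1
        fi
    done
    # Without pipefail, the pipeline's status is that of samtools sort alone
    set -o pipefail
    bowtie2 -p 8 -x "$index" -1 "$r1" -2 "$r2" 2> "${out%.bam}.bowtie2.log" |
        samtools sort -@ 4 -o "$out" -
)
```

Usage: `align_to_sorted_bam reference_index r1.fq.gz r2.fq.gz aligned.sorted.bam`. The alignment summary ends up in `aligned.sorted.bowtie2.log`.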

## Alignment Modes

```bash
# End-to-end mode (default) - align entire read
bowtie2 --end-to-end -x index -1 r1.fq -2 r2.fq -S aligned.sam

# Local mode - soft-clip ends for better alignment
bowtie2 --local -x index -1 r1.fq -2 r2.fq -S aligned.sam
```

## Sensitivity Presets

```bash
# Very fast (less sensitive)
bowtie2 --very-fast -x index -1 r1.fq -2 r2.fq -S aligned.sam

# Fast
bowtie2 --fast -x index -1 r1.fq -2 r2.fq -S aligned.sam

# Sensitive (default)
bowtie2 --sensitive -x index -1 r1.fq -2 r2.fq -S aligned.sam

# Very sensitive (slower but more accurate)
bowtie2 --very-sensitive -x index -1 r1.fq -2 r2.fq -S aligned.sam

# Local mode equivalents: --very-fast-local, --fast-local, --sensitive-local, --very-sensitive-local
bowtie2 --local --very-sensitive-local -x index -1 r1.fq -2 r2.fq -S aligned.sam
```

## ChIP-seq Alignment

```bash
# Typical ChIP-seq settings
bowtie2 -p 8 \
    --very-sensitive \
    --no-mixed \
    --no-discordant \
    -x index -1 chip_1.fq.gz -2 chip_2.fq.gz | \
    samtools view -bS -q 30 -F 4 - | \
    samtools sort -o chip.sorted.bam -
```
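
To make the filter in the pipeline above concrete, the sketch below expresses `-q 30 -F 4` as an awk filter over uncompressed SAM text. It is for illustration and spot checks only, not a replacement for `samtools view`; `filter_sam_mapq` is a hypothetical helper name.

```bash
# Keep header lines and mapped records with MAPQ >= the given threshold.
# Reads SAM text on stdin, writes SAM text on stdout.
filter_sam_mapq() {
    min_mapq=$1
    awk -v q="$min_mapq" -F '\t' '
        /^@/ { print; next }              # keep header lines
        int($2 / 4) % 2 == 1 { next }     # FLAG bit 0x4 set: read unmapped (-F 4)
        $5 >= q { print }                 # MAPQ is column 5 (-q)
    '
}
```

Usage: `filter_sam_mapq 30 < aligned.sam > filtered.sam`.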

## ATAC-seq Alignment

```bash
# ATAC-seq with size selection
# -X 2000 raises the maximum fragment length from the default of 500
bowtie2 -p 8 \
    --very-sensitive \
    -X 2000 \
    --no-mixed \
    --no-discordant \
    -x index -1 atac_1.fq.gz -2 atac_2.fq.gz | \
    samtools view -bS -q 30 - | \
    samtools sort -o atac.sorted.bam -
```

## Fragment Size Options

```bash
# Set expected insert size range
# -I: minimum fragment length, -X: maximum fragment length
bowtie2 -p 8 \
    -I 100 \
    -X 500 \
    -x index -1 r1.fq -2 r2.fq -S aligned.sam
```
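
Before settling on `-I` and `-X`, it can help to look at the fragment lengths Bowtie2 actually observed. The following sketch summarizes the TLEN column (column 9) of properly paired reads from SAM text; `tlen_summary` is a hypothetical helper, and counting only positive TLEN values is an assumption made so that each pair is counted once.

```bash
# Summarize observed fragment lengths from SAM text on stdin.
tlen_summary() {
    awk -F '\t' '
        /^@/ { next }                          # skip header lines
        int($2 / 2) % 2 == 1 && $9 > 0 {       # FLAG bit 0x2 (proper pair), leftmost mate
            n++; sum += $9
            if (min == "" || $9 < min) min = $9
            if ($9 > max) max = $9
        }
        END {
            if (n) printf "pairs=%d min=%d mean=%.1f max=%d\n", n, min, sum / n, max
            else   print "pairs=0"
        }
    '
}
```

Usage: `tlen_summary < aligned.sam`. If the maximum sits at the `-X` limit, the limit may be cutting off longer fragments.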

## Read Group and Output Options

```bash
# Add read group
bowtie2 -p 8 \
    --rg-id sample1 \
    --rg SM:sample1 \
    --rg PL:ILLUMINA \
    --rg LB:lib1 \
    -x index -1 r1.fq -2 r2.fq -S aligned.sam
```
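
Bowtie2 only writes the `@RG` header line when `--rg-id` is given, so it is worth confirming that the read group made it into the output. A minimal sketch, assuming an uncompressed SAM file; `sam_read_groups` is a hypothetical helper name.

```bash
# Print the @RG header lines of a SAM file.
sam_read_groups() {
    # Header lines come first, so stop at the first alignment record
    awk '/^@RG/ { print } !/^@/ { exit }' "$1"
}
```

Usage: `sam_read_groups aligned.sam`. For the command above, the output should contain one line carrying `ID:sample1`, `SM:sample1`, `PL:ILLUMINA`, and `LB:lib1`.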

## Multi-mapping Reads

```bash
# Report up to k alignments per read
bowtie2 -k 5 -x index -1 r1.fq -2 r2.fq -S aligned.sam

# Report all alignments
bowtie2 -a -x index -1 r1.fq -2 r2.fq -S aligned.sam
```

## Output Unmapped Reads

```bash
# Write pairs that fail to align concordantly to separate gzipped files
# (% is replaced by 1 and 2: unmapped_1.fq.gz, unmapped_2.fq.gz)
bowtie2 -p 8 \
    --un-conc-gz unmapped_%.fq.gz \
    -x index -1 r1.fq.gz -2 r2.fq.gz -S aligned.sam
```

## Key Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| -p | 1 | Number of threads |
| -x | - | Index basename |
| -1/-2 | - | Paired-end reads |
| -U | - | Single-end reads |
| -I | 0 | Min fragment length |
| -X | 500 | Max fragment length |
| -k | off | Report up to k alignments per read (by default only the best alignment is reported) |
| --no-mixed | off | Suppress unpaired alignments for paired reads |
| --no-discordant | off | Suppress discordant alignments |

## Alignment Statistics

```bash
# Bowtie2 prints alignment summary to stderr
bowtie2 -p 8 -x index -1 r1.fq -2 r2.fq -S aligned.sam 2> alignment_stats.txt
```

Example output (abbreviated; the full summary also reports discordant and single-mate alignments):
```
1000000 reads; of these:
  1000000 (100.00%) were paired; of these:
    50000 (5.00%) aligned concordantly 0 times
    900000 (90.00%) aligned concordantly exactly 1 time
    50000 (5.00%) aligned concordantly >1 times
95.00% overall alignment rate
```

## Related Skills

- read-qc/fastp-workflow - Preprocess reads before alignment
- alignment-files/alignment-sorting - Post-alignment processing
- chip-seq/peak-calling - ChIP-seq analysis
- atac-seq/atac-peak-calling - ATAC-seq analysis

Overview

This skill aligns short DNA reads to a reference genome using Bowtie2, supporting both end-to-end and local alignment modes and gapped alignment. It is designed for ChIP-seq, ATAC-seq, or any project that needs flexible sensitivity and multi-mapping control. The outputs are SAM/BAM files compatible with standard downstream tools (samtools, peak callers).

How this skill works

The skill builds or reuses a Bowtie2 index from a reference FASTA, then runs Bowtie2 on single- or paired-end FASTQ inputs with user-selected presets, fragment-size limits, and reporting options. It can stream results into samtools to produce sorted BAMs, filter by mapping quality, and extract unmapped reads or multiple alignments per read. Alignment summaries are captured from Bowtie2 stderr for quick QC.

When to use it

  • Aligning ChIP-seq reads where concordant paired-end mapping and high sensitivity are critical.
  • Aligning ATAC-seq reads with a controlled fragment-size range (-X 2000) and a high-sensitivity preset.
  • Any short-read mapping job needing local soft-clipping or strict end-to-end behavior.
  • When you need to report multiple alignments per read (multi-mapping) or output unmapped reads.
  • Pipelined workflows that stream Bowtie2 directly into samtools for sorted BAM outputs.

Best practices

  • Build the Bowtie2 index with bowtie2-build and match thread counts to available CPU cores.
  • Choose sensitivity presets based on speed vs accuracy: very-sensitive for peak calling, very-fast for quick checks.
  • Use --no-mixed and --no-discordant for ChIP/ATAC paired-end data to avoid spurious alignments.
  • Stream Bowtie2 into samtools sort to avoid large intermediate SAM files and apply MAPQ filtering.
  • Record Bowtie2 stderr to capture alignment statistics for QC and reporting.
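
Matching the thread count to the machine can be done at run time rather than hard-coding `-p 8`. A small sketch, assuming `nproc` or `getconf` is available; `detect_threads` is a hypothetical helper that falls back to 1 when neither is.

```bash
# Print the number of available CPU cores, or 1 if it cannot be determined.
detect_threads() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

threads=$(detect_threads)
# Example: bowtie2 -p "$threads" -x index -1 r1.fq.gz -2 r2.fq.gz -S aligned.sam
```

When Bowtie2 streams into `samtools sort -@ N`, both tools share the same cores, so it may make sense to split the count between them.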

Example use cases

  • ChIP-seq: bowtie2 --very-sensitive --no-mixed --no-discordant -x index -1 chip_1.fq.gz -2 chip_2.fq.gz | samtools view -bS -q 30 -F 4 - | samtools sort -o chip.sorted.bam -
  • ATAC-seq: bowtie2 --very-sensitive -X 2000 --no-mixed --no-discordant -x index -1 atac_1.fq.gz -2 atac_2.fq.gz | samtools view -bS -q 30 - | samtools sort -o atac.sorted.bam -
  • Paired-end aligned BAM directly: bowtie2 -p 8 -x index -1 r1.fq.gz -2 r2.fq.gz | samtools sort -@ 4 -o aligned.sorted.bam -
  • Report multi-mappers: bowtie2 -k 5 -x index -1 r1.fq -2 r2.fq -S aligned.sam
  • Extract unmapped reads: bowtie2 --un-conc-gz unmapped_%.fq.gz -x index -1 r1.fq.gz -2 r2.fq.gz -S /dev/null

FAQ

How do I choose between local and end-to-end modes?

Use end-to-end to require full-read alignment when reads are high quality; use local to allow soft-clipping at ends when adapters or low-quality tails remain.

How can I avoid very large SAM files?

Pipe Bowtie2 output into samtools view/sort to produce compressed BAMs on the fly and reduce disk I/O and storage.