home / skills / starlitnightly / omicverse / single-trajectory
This skill helps you reproduce and extend single-trajectory analyses by integrating PAGA, Palantir, VIA, velocity coupling, and fate scoring notebooks.
npx playbooks add skill starlitnightly/omicverse --skill single-trajectoryReview the files below or copy the command above to add this skill to your agents.
---
name: single-trajectory-analysis
title: Single-trajectory analysis
description: "Trajectory & RNA velocity: PAGA, Palantir, VIA, dynamo, scVelo, latentvelo, graphvelo backends via ov.single.Velo. Pseudotime, stream plots."
---
# Single-trajectory analysis skill
## Overview
This skill describes how to reproduce and extend the single-trajectory analysis workflow in `omicverse`, combining graph-based trajectory inference, RNA velocity coupling, and downstream fate scoring notebooks.
## Trajectory setup
- **PAGA (Partition-based graph abstraction)**
- Build a neighborhood graph (`pp.neighbors`) on the preprocessed AnnData object.
- Use `tl.paga` to compute cluster connectivity and `tl.draw_graph` or `tl.umap` with `init_pos='paga'` for embedding.
- Interpret edge weights to prioritize branch resolution and seed paths.
- **Palantir**
- Run `Palantir` on diffusion components, seeding with manually selected start cells (e.g., naïve T cells).
- Extract pseudotime, branch probabilities, and differentiation potential for subsequent overlays.
- **VIA**
- Execute `via.VIA` on the kNN graph to identify lineage progression with automatic root selection or user-defined roots.
- Export terminal states and pseudotime for cross-validation against PAGA and Palantir results.
## Velocity coupling (VIA + scVelo)
- Use `scv.pp.filter_and_normalize`, `scv.pp.moments`, and `scv.tl.velocity` to generate velocity layers.
- Provide VIA with `adata.layers['velocity']` to refine lineage directionality (`via.VIA(..., velocity_weight=...)`).
- Compare VIA pseudotime with scVelo latent time (`scv.tl.latent_time`) to validate directionality and root selection.
## Advanced RNA Velocity Backends (ov.single.Velo)
OmicVerse provides a unified `Velo` class wrapping 4 velocity backends. Use this when you need more than basic scVelo:
### Backend selection guide
| Backend | Best for | GPU? | Prerequisites |
|---------|----------|------|---------------|
| **scvelo** | Standard velocity analysis | No | spliced/unspliced layers |
| **dynamo** | Kinetics modeling, vector fields | No | spliced/unspliced layers |
| **latentvelo** | VAE-based, batch correction, complex dynamics | Yes (torchdiffeq) | celltype_key, batch_key optional |
| **graphvelo** | Refinement layer on top of any backend | No | base velocity + connectivities |
### Unified Velo pipeline
```python
import omicverse as ov
velo = ov.single.Velo(adata)
# 1. Filter (scvelo backend) or preprocess (dynamo backend)
velo.filter_genes(min_shared_counts=20) # For scvelo
# velo.preprocess(recipe='monocle', n_neighbors=30, n_pcs=30) # For dynamo
# 2. Compute moments
velo.moments(backend='scvelo', n_pcs=30, n_neighbors=30)
# backend: 'scvelo' or 'dynamo'
# 3. Fit kinetic parameters
velo.dynamics(backend='scvelo')
# 4. Calculate velocity
velo.cal_velocity(method='scvelo')
# method: 'scvelo', 'dynamo', 'latentvelo', 'graphvelo'
# 5. Build velocity graph and project to embedding
velo.velocity_graph(basis='umap')
velo.velocity_embedding(basis='umap')
```
### latentvelo specifics (deep learning velocity)
latentvelo uses a VAE + neural ODE to learn latent dynamics. It handles batch effects and complex trajectories better than classical scVelo:
```python
velo.cal_velocity(
method='latentvelo',
celltype_key='cell_type', # Optional: AnnotVAE uses cell type info
batch_key='batch', # Optional: batch correction
velocity_key='velocity_S',
n_top_genes=2000,
latentvelo_VAE_kwargs={}, # Pass custom VAE hyperparameters
)
# Requires: pip install torchdiffeq
# Uses GPU if available, falls back to CPU
```
### graphvelo specifics (refinement layer)
GraphVelo refines velocity estimates from any base method by leveraging the cell graph structure. Run it after scvelo or dynamo:
```python
# First: compute base velocity with scvelo or dynamo
velo.cal_velocity(method='scvelo')
# Then: refine with graphvelo
velo.graphvelo(
xkey='Ms', # Spliced moments key
vkey='velocity_S', # Base velocity key to refine
basis_keys=['X_umap', 'X_pca'], # Project to multiple embeddings
gene_subset=None, # Optional: restrict to gene subset
)
```
## Downstream fate scoring notebooks
- **CellFateGenie**: For pseudotime-associated gene discovery, use `search_skills('CellFateGenie fate genes')` to load the dedicated CellFateGenie skill.
- **`t_metacells.ipynb`**: Aggregate metacell trajectories for robustness checks and meta-state differential expression.
- **`t_cytotrace.ipynb`**: Integrate CytoTRACE differentiation potential with velocity-informed lineages for maturation scoring.
## Required preprocessing
1. Quality control: remove low-quality cells/genes, apply doublet filtering.
2. Normalization & log transformation (`sc.pp.normalize_total`, `sc.pp.log1p`).
3. Highly variable gene selection tailored to immune datasets (`sc.pp.highly_variable_genes`).
4. Batch correction if necessary (e.g., `scvi-tools`, `bbknn`).
5. Compute PCA, neighbor graph, and embedding (UMAP/FA) used by all trajectory methods.
6. For velocity: compute moments on the same neighbor graph before running VIA coupling.
## Parameter tuning
- Neighbor graph `n_neighbors` and `n_pcs` should be harmonized across PAGA, VIA, and Palantir to maintain consistency.
- In VIA, adjust `knn`, `too_big_factor`, and `root_user` for datasets with uneven sampling.
- Palantir requires careful start cell selection; use marker genes and velocity arrows to confirm.
- For PAGA, tweak `threshold` to control edge sparsity; ensure connected components reflect biological branches.
- Velocity estimation: compare `mode='stochastic'` vs `mode='dynamical'` in scVelo; recalibrate if terminal states disagree with VIA.
## Visualization and export
1. Overlay PAGA edges on UMAP (`scv.pl.paga`) and annotate branch labels.
2. Plot Palantir pseudotime and branch probabilities on embeddings.
3. Visualize VIA trajectories using `via.plot_fates` and `via.plot_scatter`.
4. Export pseudotime tables and fate probabilities to CSV for downstream notebooks.
5. Save high-resolution figures (PNG/SVG) and notebook artifacts for reproducibility.
6. Update notebooks with consistent color schemes and metadata columns before sharing.
## Defensive Validation Patterns
```python
# Before PAGA: verify neighbor graph exists
assert 'neighbors' in adata.uns, "Neighbor graph required. Run sc.pp.neighbors(adata) first."
# Before VIA velocity coupling: verify velocity layers exist
if 'velocity' not in adata.layers:
print("WARNING: velocity layer missing. Run scv.tl.velocity(adata) first for VIA coupling.")
assert 'spliced' in adata.layers and 'unspliced' in adata.layers, \
"Missing spliced/unspliced layers. Check loom/H5AD import preserved velocity layers."
# Before Palantir: verify PCA/diffusion components
assert 'X_pca' in adata.obsm, "PCA required. Run ov.pp.pca(adata) first."
```
## Troubleshooting tips
- **Missing velocity layers**: re-run `scv.pp.moments` and `scv.tl.velocity` ensuring `adata.layers['spliced']`/`['unspliced']` exist; verify loom/H5AD import preserved layers.
- **Disconnected PAGA graph**: inspect neighbor graph or adjust `n_neighbors`; confirm batch correction didn’t fragment the manifold.
- **Palantir convergence issues**: reduce diffusion components or reinitialize start cells; ensure no NaN values in data matrix.
- **VIA terminal states unstable**: increase iterations (`cluster_graph_pruning_iter`), or provide manual terminal state hints based on marker expression.
- **Notebook kernel memory errors**: downsample cells or precompute summaries (metacells) before rerunning.
- **latentvelo `ImportError: torchdiffeq`**: Install with `pip install torchdiffeq`. Required for neural ODE backend.
- **graphvelo returns NaN velocities**: Ensure base velocity (scvelo/dynamo) was computed first. graphvelo refines — it doesn't compute from scratch.
- **dynamo `preprocess` fails**: dynamo expects spliced/unspliced layers. Verify with `'spliced' in adata.layers`.
This skill guides reproducible single-trajectory analysis workflows that combine PAGA, Palantir, VIA, RNA velocity coupling, and downstream fate scoring notebooks. It explains preprocessing, parameter harmonization, velocity integration, and exportable outputs so you can reproduce and extend trajectory results across immune and multi-omics datasets. The focus is on practical steps, cross-validation strategies, and troubleshooting common failures.
The workflow builds a common neighbor graph and embedding, then runs complementary trajectory methods: PAGA for global branch structure, Palantir for probabilistic pseudotime and branch probabilities, and VIA for automatic lineage detection with optional velocity coupling. scVelo produces velocity layers and latent time that VIA can consume to refine directionality. Outputs (pseudotime, branch/fate probabilities, terminal states) are exported for fate scoring and visualization notebooks.
What preprocessing is required before trajectory analysis?
Perform QC (filter low-quality cells/genes, doublet removal), normalization and log1p, select highly variable genes, correct batches if needed, compute PCA and a neighbor graph, then derive UMAP/FA embeddings.
How do I validate directionality between methods?
Compare VIA pseudotime and roots with scVelo latent time and velocity vectors; use marker genes and Palantir start cells as orthogonal checks.
What to do if PAGA is disconnected?
Inspect and adjust neighbor graph parameters (n_neighbors, n_pcs), check batch correction effects, or relax PAGA threshold to reconnect biologically related components.