home / skills / starlitnightly / omicverse / spatial-tutorials
This skill guides you through spatial transcriptomics tutorials using OmicVerse, covering preprocessing, deconvolution, and downstream modelling across
npx playbooks add skill starlitnightly/omicverse --skill spatial-tutorialsReview the files below or copy the command above to add this skill to your agents.
---
name: spatial-transcriptomics-tutorials-with-omicverse
title: Spatial transcriptomics tutorials with omicverse
description: "Spatial transcriptomics: Visium/HD, Stereo-seq, Slide-seq preprocessing (crop, rotate, cellpose), deconvolution (Tangram, cell2location, Starfysh), clustering (GraphST, STAGATE), integration, trajectory, communication."
---
# Spatial Transcriptomics with OmicVerse
This skill covers spatial analysis workflows organized into three stages: Preprocessing, Deconvolution, and Downstream Analysis. Each stage includes the critical function calls, parameter guidance, and common pitfalls.
## Defensive Validation: Always Check Spatial Coordinates First
Before ANY spatial operation, verify that spatial coordinates exist and are numeric:
```python
# Required check before spatial analysis
assert 'spatial' in adata.obsm, \
"Missing adata.obsm['spatial']. Load with ov.io.spatial.read_visium() or set manually."
# Cast to float64 to prevent coordinate precision issues during rotation/cropping
adata.obsm['spatial'] = adata.obsm['spatial'].astype('float64')
```
## Stage 1: Preprocessing
### Crop, Rotate, and Align Coordinates
Load Visium data and manipulate spatial coordinates for region selection and alignment:
```python
import scanpy as sc, omicverse as ov
ov.plot_set()
adata = sc.datasets.visium_sge(sample_id="V1_Breast_Cancer_Block_A_Section_1")
library_id = list(adata.uns['spatial'].keys())[0]
# Cast coordinates before manipulation
adata.obsm['spatial'] = adata.obsm['spatial'].astype('float64')
# Crop to region of interest
adata_crop = ov.space.crop_space_visium(adata, crop_loc=(0, 0), crop_area=(1000, 1000),
library_id=library_id, scale=1)
# Rotate and auto-align
adata_rot = ov.space.rotate_space_visium(adata, angle=45, library_id=library_id)
ov.space.map_spatial_auto(adata_rot, method='phase')
# For manual refinement: ov.space.map_spatial_manual(adata_rot, ...)
```
### Visium HD Cell Segmentation
Segment Visium HD bins into cells using cellpose:
```python
adata = ov.space.read_visium_10x(path="binned_outputs/square_002um/",
source_image_path="tissue_image.btf")
ov.pp.filter_genes(adata, min_cells=3)
ov.pp.filter_cells(adata, min_counts=1)
# H&E-based segmentation
adata = ov.space.visium_10x_hd_cellpose_he(adata, mpp=0.3, gpu=True, buffer=150)
# Expand labels to neighboring bins
ov.space.visium_10x_hd_cellpose_expand(adata, labels_key='labels_he',
expanded_labels_key='labels_he_expanded', max_bin_distance=4)
# Gene-expression-driven seeds
ov.space.visium_10x_hd_cellpose_gex(adata, obs_key="n_counts_adjusted", mpp=0.3, sigma=5)
# Merge labels and aggregate to cell-level
ov.space.salvage_secondary_labels(adata, primary_label='labels_he_expanded',
secondary_label='labels_gex', labels_key='labels_joint')
cdata = ov.space.bin2cell(adata, labels_key='labels_joint')
```
## Stage 2: Deconvolution
### Critical API Reference: Method Selection
```python
# CORRECT — Tangram: method passed to deconvolution() call
decov_obj = ov.space.Deconvolution(adata_sc=sc_adata, adata_sp=sp_adata,
celltype_key='Subset', result_dir='result/tangram')
decov_obj.preprocess_sc(max_cells=5000)
decov_obj.preprocess_sp()
decov_obj.deconvolution(method='tangram', num_epochs=1000)
# CORRECT — cell2location: method passed at INIT time
cell2_obj = ov.space.Deconvolution(adata_sc=sc_adata, adata_sp=sp_adata,
celltype_key='Subset', result_dir='result/c2l',
method='cell2location')
cell2_obj.deconvolution(max_epochs=30000)
cell2_obj.save_model('result/c2l/model')
```
Note: For cell2location, the `method` parameter is set at initialization, not at the `deconvolution()` call. For Tangram, it's passed to `deconvolution()`.
### Starfysh Archetypal Deconvolution
```python
from omicverse.external.starfysh import AA, utils, plot_utils
visium_args = utils.prepare_data(adata_path="data/counts.h5ad",
signature_path="data/signatures.csv",
min_cells=10, filter_hvg=True, n_top_genes=3000)
adata, adata_normed = visium_args.get_adata()
aa_model = AA.ArchetypalAnalysis(adata_orig=adata_normed)
aa_model.fit(k=12, n_init=10)
visium_args = utils.refine_anchors(visium_args, aa_model, add_marker=True)
model, history = utils.run_starfysh(visium_args, poe=False, n_repeat=5, lr=5e-3, max_epochs=500)
```
## Stage 3: Downstream Analysis
### Spatial Clustering
```python
# GraphST + mclust
ov.utils.cluster(adata, use_rep='graphst|original|X_pca', method='mclust', n_components=7)
# STAGATE
ov.utils.cluster(adata, use_rep='STAGATE', method='mclust', n_components=7)
# Merge small clusters
ov.space.merge_cluster(adata, groupby='mclust', resolution=0.5)
```
Algorithm choice: GraphST and STAGATE require precalculated latent spaces. For standard clustering without spatial-aware embeddings, use Leiden/Louvain directly.
### Multi-Slice Integration (STAligner)
```python
import anndata as ad
Batch_list = [ov.read(p) for p in slice_paths]
adata_concat = ad.concat(Batch_list, label='slice_name', keys=section_ids)
STAligner_obj = ov.space.pySTAligner(adata=adata_concat, batch_key='slice_name',
hidden_dims=[256, 64], use_gpu=True)
STAligner_obj.train_STAligner_subgraph(nepochs=800, lr=1e-3)
STAligner_obj.train()
adata_aligned = STAligner_obj.predicted()
sc.pp.neighbors(adata_aligned, use_rep='STAligner')
```
### Spatial Trajectories
**SpaceFlow** — pseudo-spatial maps:
```python
sf_obj = ov.space.pySpaceFlow(adata)
sf_obj.train(spatial_regularization_strength=0.1, num_epochs=300, patience=50)
sf_obj.cal_pSM(n_neighbors=20, resolution=1.0)
```
**STT** — transition dynamics:
```python
STT_obj = ov.space.STT(adata, spatial_loc='xy_loc', region='Region', n_neighbors=20)
STT_obj.stage_estimate()
STT_obj.train(n_states=9, n_iter=15, weight_connectivities=0.5)
```
### Cell Communication (COMMOT + FlowSig)
```python
df_cellchat = ov.external.commot.pp.ligand_receptor_database(species='human', database='cellchat')
df_cellchat = ov.external.commot.pp.filter_lr_database(df_cellchat, adata, min_expr_frac=0.05)
ov.external.commot.tl.spatial_communication(adata, lr_database=df_cellchat,
distance_threshold=500, result_prefix='cellchat')
# FlowSig network
adata.layers['normalized'] = adata.X.copy()
ov.external.flowsig.tl.construct_intercellular_flow_network(
adata, commot_output_key='commot-cellchat',
flowsig_output_key='flowsig-cellchat', edge_threshold=0.7)
```
### Structural Layers (GASTON) and Slice Alignment (SLAT)
**GASTON** — iso-depth estimation:
```python
gas_obj = ov.space.GASTON(adata)
A = gas_obj.prepare_inputs(n_pcs=50)
gas_obj.load_rescale(A)
gas_obj.train(hidden_dims=[64, 32], dropout=0.1, max_epochs=2000)
gaston_isodepth, gaston_labels = gas_obj.cal_iso_depth(n_layers=5)
```
**SLAT** — cross-slice alignment:
```python
from omicverse.external.scSLAT.model import Cal_Spatial_Net, load_anndatas, run_SLAT, spatial_match
Cal_Spatial_Net(adata1, k_cutoff=20, model='KNN')
Cal_Spatial_Net(adata2, k_cutoff=20, model='KNN')
edges, features = load_anndatas([adata1, adata2], feature='DPCA', check_order=False)
embeddings, *_ = run_SLAT(features, edges, LGCN_layer=5)
best, index, distance = spatial_match(embeddings, adatas=[adata1, adata2])
```
## Troubleshooting
- **`ValueError: spatial coordinates out of bounds` after rotation**: Cast `adata.obsm['spatial']` to `float64` BEFORE calling `rotate_space_visium`. Integer coordinates lose precision during trigonometric rotation.
- **Cellpose segmentation fails with memory error**: For large `.btf` images, use `backend='tifffile'` to memory-map the image. Reduce `buffer` parameter if GPU memory is insufficient.
- **Gene ID overlap failure in Tangram/cell2location**: Harmonise identifiers (ENSEMBL vs gene symbols) between `adata_sc` and `adata_sp` before calling `preprocess_sc`/`preprocess_sp`. Drop non-overlapping genes.
- **`mclust` clustering error**: Requires `rpy2` and the R `mclust` package. If R bindings are unavailable, switch to `method='louvain'` or `method='leiden'`.
- **STAligner/SpaceFlow embeddings collapse to a single point**: Verify `adata.obsm['spatial']` exists and coordinates are scaled appropriately. Tune learning rate (try `lr=5e-4`) and regularisation strength.
- **FlowSig returns empty network**: Build spatial neighbor graphs before Moran's I filtering. Increase bootstraps or lower `edge_threshold` (try 0.5) if the network is too sparse.
- **GASTON `RuntimeError` in training**: Provide a writable `out_dir` path. PyTorch nondeterminism may cause variation between runs—set `torch.manual_seed()` for reproducibility.
- **SLAT alignment has many low-quality matches**: Regenerate spatial graphs with a higher `k_cutoff` value. Inspect `low_quality_index` flags and filter cells with high distance scores.
- **STT pathway enrichment fails**: `gseapy` needs network access for gene set downloads. Cache gene sets locally with `ov.utils.geneset_prepare()` and pass the dictionary directly.
## Dependencies
- Core: `omicverse`, `scanpy`, `anndata`, `squidpy`, `numpy`, `matplotlib`
- Segmentation: `cellpose`, `opencv-python`/`tifffile`, optional GPU PyTorch
- Deconvolution: `tangram-sc`, `cell2location`, `pytorch-lightning`; Starfysh needs `torch`, `scikit-learn`
- Downstream: `scikit-learn`, `commot`, `flowsig`, `gseapy`, torch-backed modules (STAligner, SpaceFlow, GASTON, SLAT)
## Examples
- "Crop and rotate my Visium slide, then run cellpose segmentation on the HD data and aggregate to cell-level AnnData."
- "Deconvolve my lymph node spatial data with Tangram and cell2location, compare proportions, and plot cell-type maps."
- "Integrate three DLPFC slices with STAligner, cluster with STAGATE, and infer communication with COMMOT+FlowSig."
## References
- Quick copy/paste commands: [`reference.md`](reference.md)
This skill guides users through omicverse spatial transcriptomics tutorials for preprocessing, deconvolution, and downstream modelling across Visium, Visium HD, Stereo‑seq, and Slide‑seq datasets. It organizes staged notebooks into Preprocess, Deconvolution, and Downstream workflows and surfaces key functions and troubleshooting tips for each step. Follow the recipes to prepare data, run Tangram/cell2location/Starfysh, and apply spatial clustering, alignment, and trajectory methods.
The skill points to notebook workflows that load spatial slides, rescale and align coordinates, and perform Visium HD segmentation with Cellpose or gene‑guided seeding. It wraps unified deconvolution via ov.space.Deconvolution (Tangram/cell2location) and archetypal deconvolution with Starfysh. Downstream modules include spatial clustering, multi‑slice alignment (STAligner), pseudo‑spatial modelling (SpaceFlow/STT), communication inference (COMMOT/FlowSig), GASTON layer extraction, and SLAT alignment.
What causes coordinate mismatches after rotation?
Ensure scalefactors are applied when plotting and cast adata.obsm['spatial'] to float64 before running automatic mapping.
How do I fix Cellpose runtime errors on large images?
Verify image paths, use tifffile backend for memory mapping, adjust mpp and buffer, and match CUDA/PyTorch builds for GPU acceleration.
Why do Tangram/cell2location runs fail to align genes?
Harmonize gene IDs between scRNA and spatial matrices (ENS vs symbols) and remove non‑overlapping genes before preprocess steps.