An MCP (Model Context Protocol) server for deep probabilistic analysis of single-cell omics data with scvi-tools, driven by natural language.
Configuration
{
  "mcpServers": {
    "hyennnnnnn-scvi-mcp": {
      "command": "/path/to/scvi-mcp/venv/bin/python",
      "args": [
        "-m",
        "scvi_mcp",
        "run",
        "--data",
        "/path/to/data.h5ad"
      ]
    }
  }
}

The server exposes the modeling and annotation capabilities of scvi-tools to any MCP client. Through it you can train models, extract latent representations, annotate cell types, and integrate multi-modal data.
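The configuration entry shown above can also be produced with a short script, which helps when the interpreter and data paths differ between machines. This is a minimal sketch: the helper name `build_mcp_config` is hypothetical, and the paths are the same placeholders used in the configuration.

```python
import json


def build_mcp_config(venv_python: str, data_path: str, server_name: str = "scvi") -> dict:
    """Return an mcpServers entry that launches scvi_mcp over stdio."""
    return {
        "mcpServers": {
            server_name: {
                "command": venv_python,
                "args": ["-m", "scvi_mcp", "run", "--data", data_path],
            }
        }
    }


# Placeholder paths, as in the configuration above; replace with your own.
config = build_mcp_config(
    venv_python="/path/to/scvi-mcp/venv/bin/python",
    data_path="/path/to/data.h5ad",
)
print(json.dumps(config, indent=2))
```

The printed JSON can be pasted into the client's configuration file.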
Connect your MCP client to the scvi MCP server to run common workflows. Make sure the server starts correctly, then trigger actions from the client that map to the available analysis tools. Typical use cases include training an SCVI model, obtaining latent representations, performing differential expression, annotating cell types with SCANVI, integrating protein data with TOTALVI, and analyzing chromatin accessibility with PEAKVI.
# Prerequisites
- Python 3.8+ and a virtual environment tool (venv)
- Access to the MCP server command via the config block below
# Install the MCP server package
cd /path/to/scvi-mcp
python3 -m venv venv
source venv/bin/activate
pip install -e .
# Start the server (example for the provided MCP command is shown in the configuration)

Configure your MCP client to reach the scvi MCP server using the stdio method shown below. This uses a local Python environment to invoke the MCP runner and pass the required data file path.
{
  "mcpServers": {
    "scvi": {
      "command": "/path/to/scvi-mcp/venv/bin/python",
      "args": ["-m", "scvi_mcp", "run", "--data", "/path/to/data.h5ad"]
    }
  }
}

- Use a dedicated virtual environment for the MCP server to keep dependencies isolated from your system Python.
- Point the --data argument to an H5AD file containing your single-cell data with the appropriate pre-processing steps.
- When you train models or run analyses, save outputs (latents, residuals, predictions) to deterministic locations to enable reproducibility.
- Limit access to the MCP server to trusted clients and consider authenticating client requests if you expose the server over a network.
- Regularly update the scvi-tools MCP package to incorporate fixes and improvements, and rebuild the environment as needed.
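One way to get deterministic output locations is to derive the directory name from the inputs of a run. The sketch below is an illustration only: `run_directory` and the `outputs` base directory are hypothetical names, and the hash covers the data path and settings, not the file contents.

```python
import hashlib
import json
from pathlib import Path


def run_directory(base_dir: str, data_path: str, params: dict) -> Path:
    """Derive a stable output directory from the data path and model settings."""
    # Sort keys so the same settings always serialize to the same string.
    payload = json.dumps({"data": data_path, "params": params}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    return Path(base_dir) / f"run-{digest}"


out_dir = run_directory("outputs", "/path/to/data.h5ad", {"model": "SCVI", "n_latent": 10})
print(out_dir / "latent.npy")
```

Repeating a run with the same data path and settings then writes to the same directory, while any change in settings produces a new one.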
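If the MCP server cannot start: verify that the Python path in the command exists and points to the virtual environment's interpreter, that the data file path is correct, and that the file is readable. Also check that pip install -e . completed successfully and was run with the virtual environment activated. The configuration calls the environment's interpreter by its full path, so the environment does not need to be activated in the client's own shell.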
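These start-up checks can be scripted. The following is a rough sketch using only the standard library; `preflight` and `can_import` are hypothetical helper names, and the module name `scvi_mcp` is taken from the configuration.

```python
import os
import subprocess
import sys


def preflight(python_path: str, data_path: str) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if not os.path.isfile(python_path):
        problems.append(f"Python interpreter not found: {python_path}")
    elif not os.access(python_path, os.X_OK):
        problems.append(f"Python interpreter is not executable: {python_path}")
    if not os.path.isfile(data_path):
        problems.append(f"Data file not found: {data_path}")
    elif not os.access(data_path, os.R_OK):
        problems.append(f"Data file is not readable: {data_path}")
    return problems


def can_import(python_path: str, module: str = "scvi_mcp") -> bool:
    """Check whether the given interpreter can import the module."""
    result = subprocess.run([python_path, "-c", f"import {module}"], capture_output=True)
    return result.returncode == 0


# Placeholder paths from the configuration; replace with your own.
for problem in preflight("/path/to/scvi-mcp/venv/bin/python", "/path/to/data.h5ad"):
    print(problem, file=sys.stderr)
```

If `preflight` reports nothing, calling `can_import` with the same interpreter path shows whether the editable install succeeded.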
The server exposes tools spanning SCVI, SCANVI, TOTALVI, and PEAKVI for single-cell analysis and multi-modal data. Ask your MCP client to list the server's tools for the exact tool names; the tool descriptions at the end of this document show how they map to model setup, training, evaluation, and prediction.
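The exact tool names can be read from the running server. The sketch below assumes the official MCP Python SDK (the `mcp` package) is installed in the environment that runs it; the function names `list_tool_names` and `main` are hypothetical, and the paths are the placeholders from the configuration.

```python
import asyncio


async def list_tool_names(command: str, args: list) -> list:
    """Launch the server over stdio and return the tool names it advertises."""
    # Imported lazily so this module loads even without the `mcp` package.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(command=command, args=args)
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]


def main() -> None:
    """Print one tool name per line, using the placeholder paths."""
    names = asyncio.run(
        list_tool_names(
            "/path/to/scvi-mcp/venv/bin/python",
            ["-m", "scvi_mcp", "run", "--data", "/path/to/data.h5ad"],
        )
    )
    print("\n".join(names))
```

Calling `main()` after replacing the placeholder paths prints the names that the server reports.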
Basic SCVI workflow: setup data, create an SCVI model with latent dimensions, train, extract latent representations, and run differential expression between cell types.
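For orientation, the basic SCVI workflow corresponds roughly to the following direct scvi-tools calls. This is a sketch of the underlying library usage, not the server's own code: the function name `run_scvi_workflow` and the `cell_type` column are assumptions, and the MCP tool names are not shown here.

```python
def run_scvi_workflow(
    h5ad_path,
    groupby="cell_type",
    group1=None,
    group2=None,
    n_latent=10,
    max_epochs=None,
    save_dir=None,
):
    """Set up, train, and query an SCVI model; returns (model, latent, de_results)."""
    # Imported lazily: both are heavy third-party dependencies.
    import anndata as ad
    import scvi

    adata = ad.read_h5ad(h5ad_path)

    # Assumption: adata.X holds raw counts, which is what SCVI models.
    scvi.model.SCVI.setup_anndata(adata)

    model = scvi.model.SCVI(adata, n_latent=n_latent)
    model.train(max_epochs=max_epochs)  # None lets scvi-tools choose a default

    latent = model.get_latent_representation()
    adata.obsm["X_scVI"] = latent  # keep the embedding next to the data

    # With group1=None, each group in `groupby` is compared against the rest.
    de_results = model.differential_expression(groupby=groupby, group1=group1, group2=group2)

    if save_dir is not None:
        model.save(save_dir, overwrite=True)
    return model, latent, de_results
```

Passing `group1` and `group2` restricts the comparison to two specific cell types.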
SCANVI cell type annotation: setup with labeled cell types, create a SCANVI model from a trained SCVI model, train SCANVI, and predict cell types.
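In scvi-tools terms, the SCANVI annotation steps look roughly like the sketch below. The function name `annotate_with_scanvi`, the `cell_type` column, the `Unknown` placeholder label, and the epoch count are assumptions for illustration; the server's tools may use different defaults.

```python
def annotate_with_scanvi(
    scvi_model,
    labels_key="cell_type",
    unlabeled_category="Unknown",
    max_epochs=20,
):
    """Build a SCANVI model from a trained SCVI model and predict cell types."""
    # Imported lazily: scvi-tools is a heavy third-party dependency.
    import scvi

    # Cells whose label equals `unlabeled_category` are treated as unlabeled.
    scanvi_model = scvi.model.SCANVI.from_scvi_model(
        scvi_model,
        unlabeled_category=unlabeled_category,
        labels_key=labels_key,
    )
    scanvi_model.train(max_epochs=max_epochs)

    # One predicted label per cell, for labeled and unlabeled cells alike.
    predictions = scanvi_model.predict()
    return scanvi_model, predictions
```

The `scvi_model` argument is a trained SCVI model, such as the one produced by the basic workflow.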
MIT License.
scvi-tools and Scanpy.
- Prepare AnnData objects for SCVI analyses, including basic preprocessing and metadata alignment.
- Instantiate an SCVI model with chosen latent dimensionality and architecture.
- Train the SCVI model on the prepared dataset and monitor convergence.
- Extract the learned latent representation for downstream analyses and visualization.
- Compute normalized expression values from the SCVI model outputs.
- Perform differential expression analysis between specified cell groups.
- Persist the trained SCVI model to disk for later reuse.
- Load a previously saved SCVI model for inference or continuation.
- Prepare AnnData for SCANVI semi-supervised annotation, aligning labels and features.
- Create a SCANVI model using a trained SCVI backbone.
- Create a SCANVI model from an existing SCVI model to enable semi-supervised labeling.
- Predict cell types for unlabeled cells using the SCANVI model.
- Prepare AnnData for TOTALVI, integrating RNA and protein modalities.
- Create a TOTALVI model for joint RNA-protein analysis.
- Estimate protein foreground probabilities for denoising protein measurements.
- Prepare AnnData for PEAKVI to analyze scATAC-seq data.
- Create a PEAKVI model for chromatin accessibility analysis.
- Identify differential chromatin accessibility between groups.
- Compute the Evidence Lower Bound for model evaluation.
- Assess reconstruction error as a model diagnostic.
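The two evaluation steps at the end of the list map onto methods that trained scvi-tools models provide. A minimal sketch, assuming `model` is any trained scvi-tools model (the helper name `evaluate_model` is hypothetical):

```python
def evaluate_model(model):
    """Collect the ELBO and reconstruction error reported by a trained model."""
    # Both methods evaluate on the model's own AnnData when called without arguments.
    return {
        "elbo": model.get_elbo(),
        "reconstruction_error": model.get_reconstruction_error(),
    }
```

These values are most useful for comparing models on the same dataset, for example two SCVI models with different latent dimensionality.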