
local-llm-ops skill

/toolchains/ai/ops/local-llm-ops

This skill helps you manage local LLM operations on Apple Silicon with Ollama, from setup to benchmarks and diagnostics.

npx playbooks add skill bobmatnyc/claude-mpm-skills --skill local-llm-ops


SKILL.md
---
name: local-llm-ops
description: Local LLM operations with Ollama on Apple Silicon, including setup, model pulls, chat launchers, benchmarks, and diagnostics.
version: 1.0.0
category: toolchain
author: Claude MPM Team
license: MIT
progressive_disclosure:
  entry_point:
    summary: "Run local LLMs with Ollama: setup venv, start service, pull models, launch chat, benchmark, and diagnose."
    when_to_use: "Operating local LLMs on macOS, running Ollama-based chat sessions, or benchmarking models for speed/latency."
    quick_start: "1. ./setup_chatbot.sh 2. ./chatllm 3. ollama pull mistral (if no models)"
tags:
  - llm
  - ollama
  - local
  - benchmark
  - chat
  - ops
---

# Local LLM Ops (Ollama)

## Overview

Your `localLLM` repo provides a full local LLM toolchain on Apple Silicon: setup scripts, a rich CLI chat launcher, benchmarks, and diagnostics. The operational path is: install Ollama, ensure the service is running, initialize the venv, pull models, then launch chat or benchmarks.

## Quick Start

```bash
./setup_chatbot.sh
./chatllm
```

If no models are present:

```bash
ollama pull mistral
```

## Setup Checklist

1. Install Ollama: `brew install ollama`
2. Start the service: `brew services start ollama`
3. Run setup: `./setup_chatbot.sh`
4. Verify service: `curl http://localhost:11434/api/version`
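
The checklist above can be scripted as an idempotent sketch (assumes Homebrew on macOS and that you run it from the repo root; each step is guarded so reruns are safe):

```shell
#!/bin/sh
# Idempotent setup sketch for the checklist above.

# 1. Install Ollama if it is not already on PATH.
if ! command -v ollama >/dev/null 2>&1; then
  brew install ollama || echo "brew install failed; install Ollama manually"
fi

# 2. Start the service (a no-op if it is already running).
brew services start ollama 2>/dev/null || echo "could not start the ollama service"

# 3. Wait until the HTTP API answers, up to N attempts one second apart.
wait_for_ollama() {
  attempts="${1:-10}"; i=0
  while [ "$i" -lt "$attempts" ]; do
    curl -sf http://localhost:11434/api/version >/dev/null 2>&1 && return 0
    i=$((i + 1)); sleep 1
  done
  return 1
}
wait_for_ollama 5 || echo "Ollama API not reachable yet"

# 4. Run the repo's setup script if present.
if [ -x ./setup_chatbot.sh ]; then ./setup_chatbot.sh; fi
```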

## Chat Launchers

- `./chatllm` (primary launcher)
- `./chat` or `./chat.py` (alternate launchers)
- Aliases: `./install_aliases.sh` then `llm`, `llm-code`, `llm-fast`

Task modes:

```bash
./chat -t coding -m codellama:70b
./chat -t creative -m llama3.1:70b
./chat -t analytical
```

## Benchmark Workflow

Benchmarks are scripted in `scripts/run_benchmarks.sh`:

```bash
./scripts/run_benchmarks.sh
```

This runs `bench_ollama.py` with:

- `benchmarks/prompts.yaml`
- `benchmarks/models.yaml`
- Multiple runs and max token limits
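
The exact schema of these YAML files is defined by `bench_ollama.py`; as a rough sketch only (the field names below are illustrative, not the actual schema), the two files might look like:

```yaml
# Illustrative only -- the real schema is whatever bench_ollama.py parses.
# benchmarks/models.yaml
models:
  - mistral
  - llama3.1:70b

# benchmarks/prompts.yaml (separate file)
prompts:
  - id: summarize
    text: "Summarize the following paragraph in one sentence: ..."
```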

## Diagnostics

Run the built-in diagnostic script when setup fails:

```bash
./diagnose.sh
```

Common fixes:

- Re-run `./setup_chatbot.sh`
- Ensure `ollama` is in PATH
- Pull at least one model: `ollama pull mistral`
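
When `./diagnose.sh` is unavailable, or you want to triage by hand, the three common failure modes above can be checked directly:

```shell
# Manual triage sketch covering the common failure modes above.

# 1. Is the binary on PATH?
if command -v ollama >/dev/null 2>&1; then
  echo "ollama found at $(command -v ollama)"
else
  echo "ollama missing -- try: brew install ollama"
fi

# 2. Is the service answering?
if curl -sf http://localhost:11434/api/version >/dev/null 2>&1; then
  echo "service is up"
else
  echo "service down -- try: brew services start ollama"
fi

# 3. Is at least one model pulled?
if ollama list 2>/dev/null | grep -q .; then
  echo "models present"
else
  echo "no models -- try: ollama pull mistral"
fi
```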

## Operational Notes

- Virtualenv lives in `.venv`
- Chat configs and sessions live under `~/.localllm/`
- Ollama API runs at `http://localhost:11434`
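
The API endpoint above is also useful for scripting; for example, listing the locally pulled models without going through a launcher:

```shell
# List models known to the local Ollama instance (requires the service to be running).
curl -s http://localhost:11434/api/tags || echo "service not reachable on :11434"
```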

## Related Skills

- `toolchains/universal/infrastructure/docker`

## Overview

This skill provides a complete local LLM operations toolkit for Apple Silicon using Ollama, including setup scripts, model management, chat launchers, benchmarks, and diagnostics. It streamlines getting a local LLM service running, pulling models, and running chat or benchmarking workflows. The tooling focuses on reproducible local development with a simple CLI surface.

## How this skill works

The skill guides you through installing Ollama, starting the Ollama service, creating a Python virtualenv, and pulling models onto the local host. It exposes lightweight CLI launchers (primary: `./chatllm`) for task-specific sessions, plus scripts for automated benchmarks and diagnostics. The diagnostic scripts inspect service availability, PATH, and model presence, while the benchmark scripts run repeatable tests using YAML-configured prompts and models.
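
Under the hood, the launchers talk to the Ollama HTTP API; a direct one-shot (non-streaming) request can be sketched as follows (assumes the service is up and `mistral` has been pulled):

```shell
# One-shot completion via the Ollama API; falls through with a message if the
# service is not running.
curl -s http://localhost:11434/api/generate \
  -d '{"model": "mistral", "prompt": "Say hello in one sentence.", "stream": false}' \
  || echo "Ollama service not reachable"
```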

## When to use it

- Setting up a local LLM environment on Apple Silicon for development or testing.
- Launching quick chat sessions with different task modes or model choices.
- Pulling and managing models locally to avoid remote API costs and latency.
- Running repeatable benchmarks across models and prompts to compare performance.
- Diagnosing service, PATH, or model issues when a chat launcher fails.

## Best practices

- Install Ollama via Homebrew and keep the service running with `brew services start ollama`.
- Use the included setup script to initialize the virtualenv (`.venv`) and install Python dependencies.
- Pull at least one model (e.g., `ollama pull mistral`) before launching a chat session.
- Store chat sessions and configs under the default `~/.localllm/` for portability and backups.
- Run `./diagnose.sh` when you hit failures, and re-run setup if PATH or dependencies are missing.

## Example use cases

- Developer wanting an offline coding assistant: run `./setup_chatbot.sh`, pull a code model, then `./chat -t coding -m codellama:70b`.
- Researcher comparing models: run `./scripts/run_benchmarks.sh` to execute `bench_ollama.py` with `prompts.yaml` and `models.yaml`.
- QA troubleshooting a failed chat launcher: run `curl http://localhost:11434/api/version` and `./diagnose.sh` to locate the issue.
- Power user creating shortcuts: install aliases with `./install_aliases.sh` to enable the `llm`, `llm-code`, and `llm-fast` commands.
- Ops running repeated load tests: configure runs and max tokens in the benchmarks YAML and automate via CI.
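
The last use case, automating benchmarks, might be wrapped for CI roughly like this (the results directory layout and log name are illustrative, not part of the repo):

```shell
#!/bin/sh
# CI-style benchmark wrapper sketch: skip gracefully if the service is down,
# then archive results under a timestamped directory (illustrative layout).
run_bench() {
  curl -sf http://localhost:11434/api/version >/dev/null 2>&1 || {
    echo "Ollama service not running; aborting benchmark run" >&2
    return 1
  }
  run_dir="bench-results/$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$run_dir"
  ./scripts/run_benchmarks.sh | tee "$run_dir/run.log"
}

run_bench || echo "benchmark run skipped"
```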

## FAQ

**What if the chat launcher reports no models?**

Pull a model with `ollama pull <model>` (for example, `ollama pull mistral`), verify the service is running, then retry the launcher.

**How do I verify Ollama is accessible?**

Confirm the service with `curl http://localhost:11434/api/version` and ensure `ollama` is on your PATH.