This skill batch-processes Obsidian vaults, extracting wikilinks, normalizing tags, and managing frontmatter for scalable migrations and metadata consistency.
Copy the command below to add this skill to your agents:

```bash
npx playbooks add skill zpankz/mcp-skillset --skill obsidian-process
```
---
name: obsidian-process
description: This skill should be used when batch processing Obsidian markdown vaults. Handles wikilink extraction, tag normalization, frontmatter CRUD operations, and vault analysis. Use for vault-wide transformations, link auditing, tag standardization, metadata management, and migration workflows. Integrates with obsidian-markdown for syntax validation and obsidian-data-importer for structured imports.
---
# Obsidian Process
Modular Python batch processing toolkit for Obsidian vault operations with dry-run support, backup/rollback capabilities, and comprehensive reporting.
## System Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                    BatchProcessor (Base)                     │
│  - find_markdown_files()  - read_file()  - write_file()     │
│  - VaultContext (rollback)  - ProcessingResult (output)     │
└──────────────────────────┬──────────────────────────────────┘
                           │
     ┌─────────────────────┼─────────────────────┐
     │                     │                     │
     ▼                     ▼                     ▼
┌──────────┐         ┌──────────┐         ┌──────────────┐
│ Wikilink │         │   Tag    │         │ Frontmatter  │
│Extractor │         │Normalizer│         │  Processor   │
└──────────┘         └──────────┘         └──────────────┘
```
## Quick Start
### Extract All Wikilinks
```bash
python scripts/wikilink_extractor.py --vault /path/to/vault --output links.json
```
### Normalize Tags
```bash
python scripts/tag_normalizer.py --vault /path/to/vault --case lower --dry-run
```
### Process Frontmatter
```bash
python scripts/frontmatter_processor.py --vault /path/to/vault --operation add --key status --value draft
```
## Modules
### 1. BatchProcessor (base class)
All processors inherit from `BatchProcessor`:
```python
from batch_processor import BatchProcessor, ProcessingResult, VaultContext
class MyProcessor(BatchProcessor):
    def process(self) -> ProcessingResult:
        files = self.find_markdown_files()
        with VaultContext(self):  # backs up files, rolls back on error
            for file in files:
                content = self.read_file(file)
                new_content = content  # transform content here
                self.write_file(file, new_content)
        return ProcessingResult(success=True, files_processed=len(files), ...)
```
**Key Features**:
- `--dry-run`: Preview changes without modifying files
- `--verbose`: Detailed logging output
- Automatic backup before modifications
- Rollback on errors via `VaultContext` (sketched below)
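A minimal sketch of how a `VaultContext`-style manager can provide backup and rollback. This is illustrative only; the class name, fields, and behavior of the shipped `batch_processor.py` may differ:

```python
import shutil
from pathlib import Path

class VaultContextSketch:
    """Illustrative rollback context: back up files before writes,
    restore the backups if the managed block raises."""

    def __init__(self, processor):
        self.processor = processor
        self.backups: dict[Path, Path] = {}

    def backup(self, path: Path) -> None:
        # Called before the first write to each file
        if path not in self.backups:
            bak = path.with_suffix(path.suffix + ".bak")
            shutil.copy2(path, bak)
            self.backups[path] = bak

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            # Roll back: restore every backed-up file
            for original, bak in self.backups.items():
                shutil.copy2(bak, original)
        return False  # re-raise the original exception
```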
### 2. WikilinkExtractor
Extracts and analyzes all wikilink formats:
| Format | Example | Captured Fields |
|--------|---------|-----------------|
| Basic | `[[Note]]` | target |
| Alias | `[[Note\|Display]]` | target, display_text |
| Header | `[[Note#Section]]` | target, header |
| Block | `[[Note^abc123]]` | target, block_id |
| Embed | `![[Image.png]]` | target, is_embedded |
**Output**: JSON with `link_index`, `backlink_index`, `statistics`
```bash
# Build link graph
python scripts/wikilink_extractor.py --vault ./vault --output graph.json
# Find broken links
python scripts/wikilink_extractor.py --vault ./vault --find-broken
```
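For orientation, a single regex can capture all five variants from the table above. This is a hedged sketch; the shipped extractor's pattern and group names may differ:

```python
import re

# Illustrative pattern for the wikilink variants in the table above.
WIKILINK = re.compile(
    r"(?P<embed>!)?"                       # optional ![[...]] embed marker
    r"\[\["
    r"(?P<target>[^\[\]|#^]+)"             # note name
    r"(?:#(?P<header>[^\[\]|^]+))?"        # optional #Section
    r"(?:\^(?P<block_id>[^\[\]|]+))?"      # optional ^block-id
    r"(?:\|(?P<display_text>[^\[\]]+))?"   # optional |alias
    r"\]\]"
)

for m in WIKILINK.finditer("See [[Note#Setup|the setup]] and ![[Image.png]]"):
    print({k: v for k, v in m.groupdict().items() if v})
```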
### 3. TagNormalizer
Handles both inline (`#tag`) and frontmatter tags:
**Operations**:
- Case normalization: `lower`, `upper`, `title`
- Pattern-based rules via JSON config
- Hierarchy parsing: `#project/active/urgent` → 3 levels (see the sketch below)
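A quick sketch of what hierarchy parsing means in practice (illustrative, not the shipped code):

```python
def tag_levels(tag: str) -> list[str]:
    """Expand a hierarchical tag into its cumulative levels (sketch)."""
    parts = tag.lstrip("#").split("/")
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]

print(tag_levels("#project/active/urgent"))
# ['project', 'project/active', 'project/active/urgent']  -> 3 levels
```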
```bash
# Normalize all tags to lowercase
python scripts/tag_normalizer.py --vault ./vault --case lower
# Apply custom rules
python scripts/tag_normalizer.py --vault ./vault --rules rules.json
```
**Rules JSON Format**:
```json
{
  "rules": [
    {"pattern": "^todo$", "replacement": "task", "case_sensitive": false},
    {"pattern": "^WIP$", "replacement": "in-progress"}
  ]
}
```
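Rule application plausibly reduces to first-match regex substitution. A sketch under that assumption; the shipped normalizer may differ, for example in its default for `case_sensitive`:

```python
import re

rules = [
    {"pattern": "^todo$", "replacement": "task", "case_sensitive": False},
    {"pattern": "^WIP$", "replacement": "in-progress"},
]

def apply_rules(tag: str, rules: list[dict]) -> str:
    """Apply the first matching rule to a tag (assumed semantics)."""
    for rule in rules:
        # Assumption: rules are case-sensitive unless the rule says otherwise
        flags = 0 if rule.get("case_sensitive", True) else re.IGNORECASE
        if re.search(rule["pattern"], tag, flags):
            return re.sub(rule["pattern"], rule["replacement"], tag, flags=flags)
    return tag

print(apply_rules("TODO", rules))  # -> task
print(apply_rules("WIP", rules))   # -> in-progress
```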
### 4. FrontmatterProcessor
Full YAML frontmatter CRUD with templates and validators:
**Operations**:
- `add`: Add/update field (merges with existing)
- `remove`: Delete field from frontmatter
- `update`: Same as add (explicit intent)
- `validate`: Check against registered validators
- `template`: Apply predefined templates
```bash
# Add created date to all files
python scripts/frontmatter_processor.py --vault ./vault \
--operation add --key created --value "2024-01-01"
# Apply article template
python scripts/frontmatter_processor.py --vault ./vault \
--operation template --template article
# Validate all frontmatter
python scripts/frontmatter_processor.py --vault ./vault --operation validate
```
**Built-in Templates**:
- `basic`: created, tags, aliases
- `article`: created, modified, tags, aliases, author, published, draft
- `meeting`: created, tags, attendees, date, location
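Under the hood, the `add` operation plausibly parses the YAML block, merges the key, and re-serializes. A sketch assuming PyYAML is available; the shipped processor may handle edge cases differently:

```python
import re
import yaml  # assumes PyYAML

FM_RE = re.compile(r"^---\n(.*?)\n---\n", re.DOTALL)

def add_field(text: str, key: str, value) -> str:
    """Sketch of `add`: merge a key into existing frontmatter,
    creating the block if the note has none."""
    m = FM_RE.match(text)
    if m:
        data = yaml.safe_load(m.group(1)) or {}
        body = text[m.end():]
    else:
        data, body = {}, text
    data[key] = value  # add/update semantics: last write wins
    return "---\n" + yaml.safe_dump(data, sort_keys=False) + "---\n" + body

print(add_field("# My Note\n", "status", "draft"))
```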
## Integration Patterns
### With obsidian-markdown Skill
Use obsidian-process for batch operations, then validate with obsidian-markdown:
```
1. obsidian-process: Normalize tags vault-wide
2. obsidian-markdown: Validate Obsidian syntax compliance
3. obsidian-process: Generate validation report
```
### With obsidian-data-importer Skill
Process imported data after ingestion:
```
1. obsidian-data-importer: Convert CSV/JSON to markdown
2. obsidian-process: Apply frontmatter templates
3. obsidian-process: Build wikilink index
4. obsidian-markdown: Validate final output
```
## Common Workflows
### Vault Health Check
```bash
# 1. Extract all wikilinks
python scripts/wikilink_extractor.py --vault ./vault --output links.json
# 2. Identify orphaned notes (no incoming links)
python scripts/wikilink_extractor.py --vault ./vault --find-orphans
# 3. Validate frontmatter
python scripts/frontmatter_processor.py --vault ./vault --operation validate
```
### Tag Migration
```bash
# 1. Preview changes with dry-run
python scripts/tag_normalizer.py --vault ./vault --case lower --dry-run
# 2. Apply normalization
python scripts/tag_normalizer.py --vault ./vault --case lower
# 3. Report statistics
python scripts/tag_normalizer.py --vault ./vault --stats-only
```
### Frontmatter Standardization
```bash
# 1. Apply template to all notes
python scripts/frontmatter_processor.py --vault ./vault \
--operation template --template basic --dry-run
# 2. Add custom fields
python scripts/frontmatter_processor.py --vault ./vault \
--operation add --key project --value "[[Projects/Main]]"
```
## ProcessingResult Schema
All processors return standardized results:
```python
from dataclasses import dataclass
from typing import Any, Dict, List

@dataclass
class ProcessingResult:
    success: bool
    files_processed: int
    files_modified: int
    errors: List[str]
    warnings: List[str]
    metadata: Dict[str, Any]  # Operation-specific data
    timestamp: str
```
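Because it is a flat dataclass, a result serializes directly to JSON for `--output` reports. A usage sketch with illustrative values, continuing from the definition above:

```python
import dataclasses
import json
from datetime import datetime, timezone

result = ProcessingResult(
    success=True,
    files_processed=120,  # illustrative values
    files_modified=37,
    errors=[],
    warnings=["notes/inbox.md: duplicate tag #todo"],
    metadata={"operation": "tag_normalize", "case": "lower"},
    timestamp=datetime.now(timezone.utc).isoformat(),
)
print(json.dumps(dataclasses.asdict(result), indent=2))
```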
## Safety Features
| Feature | Description |
|---------|-------------|
| `--dry-run` | Preview all changes without writing |
| Backup | Automatic `.bak` files before modification |
| Rollback | `VaultContext` reverts on exceptions |
| Logging | Timestamped operation logs |
| Validation | Pre-write content validation |
## Extension Guide
To add a new processor:
```python
from pathlib import Path

from batch_processor import BatchProcessor, ProcessingResult

class MyProcessor(BatchProcessor):
    def __init__(self, vault_path, dry_run=False, verbose=False):
        super().__init__(vault_path, dry_run, verbose)
        # Custom initialization goes here

    def process_file(self, file_path: Path) -> bool:
        content = self.read_file(file_path)
        new_content = content  # transform content here
        return self.write_file(file_path, new_content)

    def process(self) -> ProcessingResult:
        for file in self.find_markdown_files():
            self.process_file(file)
        return ProcessingResult(success=True, ...)
```
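Assuming the constructor signature above, running the new processor is a one-liner:

```python
result = MyProcessor("/path/to/vault", dry_run=True, verbose=True).process()
print(result.success, result.files_processed)
```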
## CLI Reference
All scripts share common arguments:
```
--vault PATH Path to Obsidian vault (required)
--dry-run Preview changes without modifying
--verbose, -v Enable detailed logging
--output PATH Output file for results (JSON)
```
### wikilink_extractor.py
```
--find-broken Report links to non-existent notes
--find-orphans Report notes with no incoming links
--stats-only Output statistics without full index
```
### tag_normalizer.py
```
--case {lower,upper,title} Case normalization mode
--rules PATH JSON file with transformation rules
--stats-only Output tag statistics only
```
### frontmatter_processor.py
```
--operation {add,remove,update,validate,template}
--key KEY Frontmatter key for add/remove/update
--value VALUE Value to set (supports JSON arrays/objects)
--template NAME Template name for template operation
```
## LSP Integration
**When to use**: Semantic analysis that needs graph awareness, such as backlink chains, cross-vault renaming, and context-aware broken-link detection.
**Setup**:
```bash
export ENABLE_LSP_TOOL=1
# Requires: markdown-oxide in PATH (cargo install markdown-oxide)
```
**Core Capabilities** (via `scripts/lsp_integration.py`):
| Operation | Purpose |
|-----------|---------|
| `findReferences` | Find all backlinks to a note |
| `goToDefinition` | Resolve wikilink targets |
| `workspaceSymbol` | Search vault for tags/headings |
| `diagnostics` | Detect broken links |
| `rename` | Cross-vault symbol renaming |
**Key Classes**:
- `LSPClient`: Low-level LSP protocol wrapper (JSON-RPC framing sketched below)
- `MarkdownOxideIntegration`: High-level Obsidian operations
- `RecursiveLSP`: Graph traversal (backlink chains, orphan detection, strongly connected components)
- `VaultLSPAnalyzer`: Extends BatchProcessor with LSP-powered vault analysis
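For orientation, LSP traffic is JSON-RPC over stdio with `Content-Length` framing. A minimal sketch of the plumbing a wrapper like `LSPClient` sits on; the real class also reads responses and manages the initialize/shutdown lifecycle:

```python
import json
import subprocess

class MiniLSPClient:
    """Bare-bones JSON-RPC-over-stdio framing (illustrative only)."""

    def __init__(self, server_cmd=("markdown-oxide",)):
        self.proc = subprocess.Popen(
            server_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE
        )
        self.next_id = 1

    def send_request(self, method: str, params: dict) -> None:
        msg = {"jsonrpc": "2.0", "id": self.next_id, "method": method, "params": params}
        self.next_id += 1
        body = json.dumps(msg).encode()
        # Every LSP message is prefixed with a Content-Length header
        header = f"Content-Length: {len(body)}\r\n\r\n".encode()
        self.proc.stdin.write(header + body)
        self.proc.stdin.flush()
```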
**CLI Quick Reference**:
```bash
# Analyze entire vault
python scripts/lsp_integration.py analyze-vault --vault ~/vault --output analysis.json
# Find recursive backlinks
python scripts/lsp_integration.py recursive-backlinks --vault ~/vault --file note.md --depth 3
```
**Detailed Documentation**: See `references/lsp-patterns.md` for:
- Complete LSP method reference
- Recursive query patterns (backlink chains, reference graphs, transitive closure)
- Batch processing strategies with caching
- Integration patterns with WikilinkExtractor, TagNormalizer, VaultAnalyzer
- Performance optimization and troubleshooting
## FAQ

**Can I preview changes without modifying files?**
Yes. Use `--dry-run` to simulate all changes and produce reports without writing any files.

**How does rollback work if something fails mid-run?**
Processors use `VaultContext`, which creates backups before writes and restores them if an exception occurs.

**Can I apply complex tag mappings?**
Yes. `TagNormalizer` accepts a JSON rules file with regex patterns and replacements for deterministic, case-aware mappings.