This skill helps you create, edit, and analyze DOCX documents programmatically, leveraging Pandoc, OOXML, and docx-js workflows for efficient document
To add this skill to your agents, run `npx playbooks add skill plurigrid/asi --skill docx`.
---
name: docx
description: Comprehensive document creation, editing, and analysis with support for
version: 1.0.0
---
# DOCX Processing
## Workflow Decision Tree
- **Reading/Analyzing**: Use text extraction or raw XML access
- **Creating New Document**: Use docx-js (JavaScript)
- **Editing Existing**: Use OOXML editing or redlining workflow
## Reading Content
### Text Extraction with Pandoc
```bash
# Convert to markdown with tracked changes
pandoc --track-changes=all file.docx -o output.md
```
### Raw XML Access
```bash
# Unpack document
unzip document.docx -d unpacked/
# Key files:
# word/document.xml - Main content
# word/comments.xml - Comments
# word/media/ - Images
```
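Once the archive is unpacked, the visible text lives in `<w:t>` elements inside `<w:p>` paragraphs of `word/document.xml`. The following is a minimal sketch, not a full parser: `extractParagraphs` and `decodeXmlEntities` are hypothetical helper names, the input is assumed to be the XML already read into a string, and the regex approach assumes paragraphs are not nested (text boxes would break it).

```javascript
// Sketch: pull visible text out of the XML string read from word/document.xml.
// Regex-based on purpose; a real XML parser is safer for production use.

// Decode the five predefined XML entities (numeric references are not handled).
function decodeXmlEntities(s) {
  return s
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&'); // last, so "&amp;lt;" becomes "&lt;" and not "<"
}

// Return one string per <w:p> paragraph, joining the text of its <w:t> runs.
// <w:delText> (tracked deletions) is deliberately not matched.
function extractParagraphs(documentXml) {
  const paragraphs = [];
  const paraRe = /<w:p[ >][\s\S]*?<\/w:p>/g; // "<w:p>" or "<w:p attr...>", not "<w:pPr>"
  const textRe = /<w:t(?:\s[^>]*)?>([\s\S]*?)<\/w:t>/g; // "<w:t>" only, not "<w:tab/>" or "<w:tbl>"
  for (const [para] of documentXml.matchAll(paraRe)) {
    let text = '';
    for (const m of para.matchAll(textRe)) text += decodeXmlEntities(m[1]);
    paragraphs.push(text);
  }
  return paragraphs;
}
```

Pass it the contents of `unpacked/word/document.xml`, for example via `fs.readFileSync(path, 'utf8')`. For anything beyond quick inspection, the Pandoc route above is likely the more robust choice.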
## Creating New Documents (docx-js)
```javascript
// ES module syntax: run as .mjs or with "type": "module" so top-level await works.
import { Document, Paragraph, TextRun, Packer } from 'docx';
import fs from 'fs';

const doc = new Document({
  sections: [{
    children: [
      new Paragraph({
        children: [
          new TextRun({ text: "Hello ", bold: true }),
          new TextRun({ text: "World", italics: true })
        ]
      })
    ]
  }]
});

// Serialize the document to a Buffer and write it to disk
const buffer = await Packer.toBuffer(doc);
fs.writeFileSync('document.docx', buffer);
```
## Editing Existing Documents
### Simple Edits
1. Unpack: `unzip doc.docx -d unpacked/`
2. Edit `word/document.xml`
3. Repack: `cd unpacked && zip -r ../edited.docx .`
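Step 2 is where most mistakes happen: a naive search-and-replace over `word/document.xml` can corrupt tags or insert unescaped `&` and `<` characters. The sketch below restricts the replacement to the contents of `<w:t>` elements and escapes both strings first. `replaceInTextRuns` and `escapeXmlText` are hypothetical helper names, and the approach assumes the search text sits inside a single run; Word often splits a phrase across several runs, in which case nothing is matched and the returned `count` is 0.

```javascript
// Escape text so it is safe as XML element content.
function escapeXmlText(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Replace `search` with `replacement` inside <w:t> elements only.
// Returns the new XML plus the number of replacements made.
function replaceInTextRuns(documentXml, search, replacement) {
  if (!search) throw new Error('search must be a non-empty string');
  const from = escapeXmlText(search);
  const to = escapeXmlText(replacement);
  let count = 0;
  const xml = documentXml.replace(
    /(<w:t(?:\s[^>]*)?>)([\s\S]*?)(<\/w:t>)/g,
    (match, open, text, close) => {
      // split/join avoids the special "$" patterns of String.prototype.replace
      const parts = text.split(from);
      count += parts.length - 1;
      return open + parts.join(to) + close;
    }
  );
  return { xml, count };
}
```

Checking `count` before repacking gives a cheap signal that the edit actually landed; a count of 0 usually means the phrase spans more than one run.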
### Tracked Changes (Redlining)
For professional documents, use tracked changes:
```xml
<!-- Both elements go inside a <w:p>; each w:id must be unique among revision marks -->
<!-- Deletion -->
<w:del w:id="1" w:author="Author" w:date="2025-01-01T00:00:00Z">
  <w:r><w:delText>old text</w:delText></w:r>
</w:del>
<!-- Insertion -->
<w:ins w:id="2" w:author="Author" w:date="2025-01-01T00:00:00Z">
  <w:r><w:t>new text</w:t></w:r>
</w:ins>
```
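Writing these elements by hand is error-prone, so a small generator can help. The sketch below builds a deletion followed by an insertion for a single replacement. `trackedReplacement`, `escapeXml`, and the `firstId` option are hypothetical names introduced here; the `w:id` attribute is included because the WordprocessingML schema expects one on revision marks, and choosing values that do not collide with existing ones is left to the caller.

```javascript
// Escape text for use inside XML element content or double-quoted attributes.
function escapeXml(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build <w:del> + <w:ins> markup that replaces oldText with newText.
// `date` is a Date; `firstId` is the w:id for the deletion, firstId + 1 for the insertion.
function trackedReplacement(oldText, newText, { author, date, firstId }) {
  // Word-style timestamp without milliseconds, e.g. 2025-01-01T00:00:00Z
  const stamp = date.toISOString().replace(/\.\d{3}Z$/, 'Z');
  const attrs = (id) =>
    `w:id="${id}" w:author="${escapeXml(author)}" w:date="${stamp}"`;
  return (
    `<w:del ${attrs(firstId)}>` +
    `<w:r><w:delText xml:space="preserve">${escapeXml(oldText)}</w:delText></w:r>` +
    `</w:del>` +
    `<w:ins ${attrs(firstId + 1)}>` +
    `<w:r><w:t xml:space="preserve">${escapeXml(newText)}</w:t></w:r>` +
    `</w:ins>`
  );
}
```

The returned string is meant to take the place of the original `<w:r>` inside its paragraph. This sketch drops run formatting (`<w:rPr>`); copying the original run properties into both new runs would keep the redline visually consistent.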
## Converting to Images
```bash
# DOCX to PDF
soffice --headless --convert-to pdf document.docx
# PDF to images
pdftoppm -jpeg -r 150 document.pdf page
```
## Best Practices
- Use Pandoc for text extraction
- Use docx-js for creating new documents
- For legal/business docs, always use tracked changes
- Preserve original RSIDs when editing
## Scientific Skill Interleaving
This skill connects to the K-Dense-AI/claude-scientific-skills ecosystem:
### Graph Theory
- **networkx** [○] via bicomodule
- Universal graph hub
### Bibliography References
- `general`: 734 citations in bib.duckdb
## SDF Interleaving
This skill connects to **Software Design for Flexibility** (Hanson & Sussman, 2021):
### Primary Chapter: 10. Adventure Game Example
**Concepts**: autonomous agent, game, synthesis
### GF(3) Balanced Triad
```
docx (○) + SDF.Ch10 (+) + [balancer] (−) = 0
```
**Skill Trit**: 0 (ERGODIC - coordination)
### Connection Pattern
Adventure games synthesize techniques. This skill integrates multiple patterns.
## Cat# Integration
This skill maps to **Cat# = Comod(P)** as a bicomodule in the equipment structure:
```
Trit: 0 (ERGODIC)
Home: Prof
Poly Op: ⊗
Kan Role: Adj
Color: #26D826
```
### GF(3) Naturality
The skill participates in triads satisfying:
```
(-1) + (0) + (+1) ≡ 0 (mod 3)
```
This ensures compositional coherence in the Cat# equipment structure.
This skill provides comprehensive DOCX document creation, editing, and analysis tools focused on reliable text extraction, OOXML-level edits, and programmatic document generation. It combines practical workflows for reading content, producing new documents with docx-js, editing existing files via raw XML or tracked changes, and converting outputs to images or PDF. The goal is predictable, auditable changes suitable for business, legal, and scientific workflows.
For reading and analysis, it extracts text with Pandoc or inspects raw OOXML by unpacking the .docx ZIP and reading word/document.xml and related parts. For creation, it demonstrates using docx-js to build sections, paragraphs, and runs programmatically and export a .docx. For edits, it supports simple XML edits followed by repacking, and professional redlining via w:ins and w:del elements so that changes are recorded as tracked revisions. Conversion to raster images is a two-step process: DOCX→PDF with headless LibreOffice, then PDF→JPEG with pdftoppm.
Can I edit tracked changes programmatically?
Yes. Insert w:ins and w:del elements into the document XML with proper author and date attributes, and leave existing RSID attributes untouched so the document's revision-session history stays intact.
When should I use raw OOXML vs a library like docx-js?
Use docx-js for creating new documents and templates. Use raw OOXML for precise edits, repairs, or when you must preserve complex Word-specific features not exposed by libraries.