home / skills / bobmatnyc / claude-mpm-skills / xlsx

xlsx skill

safe

This skill helps you manipulate Excel files programmatically across Python and JavaScript, enabling reading, writing, formatting, and data transformations.

npx playbooks add skill bobmatnyc/claude-mpm-skills --skill xlsx

Review the files below or copy the command above to add this skill to your agents.

Files (2)

SKILL.md

3.2 KB

---
name: xlsx
description: Working with Excel files programmatically.
updated_at: 2025-10-30T17:00:00Z
tags: [excel, xlsx, spreadsheet, data]
progressive_disclosure:
  entry_point:
    summary: "Working with Excel files programmatically."
    when_to_use: "When working with xlsx or related functionality."
    quick_start: "1. Review the core concepts below. 2. Apply patterns to your use case. 3. Follow best practices for implementation."
---
# Excel/XLSX Manipulation

Working with Excel files programmatically.

## Python (openpyxl)

### Reading Excel
```python
from openpyxl import load_workbook

wb = load_workbook('data.xlsx')
ws = wb.active  # Get active sheet

# Read cell
value = ws['A1'].value

# Iterate rows
for row in ws.iter_rows(min_row=2, values_only=True):
    print(row)
```

### Writing Excel
```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.title = "Data"

# Write data
ws['A1'] = 'Name'
ws['B1'] = 'Age'
ws.append(['John', 30])
ws.append(['Jane', 25])

wb.save('output.xlsx')
```

### Formatting
```python
from openpyxl.styles import Font, PatternFill

# Bold header
ws['A1'].font = Font(bold=True)

# Background color
ws['A1'].fill = PatternFill(start_color="FFFF00", fill_type="solid")

# Number format
ws['B2'].number_format = '0.00'  # Two decimals
```

### Formulas
```python
# Add formula
ws['C2'] = '=A2+B2'

# Sum column
ws['D10'] = '=SUM(D2:D9)'
```

## Python (pandas)

### Reading Excel
```python
import pandas as pd

# Read sheet
df = pd.read_excel('data.xlsx', sheet_name='Sheet1')

# Read multiple sheets
dfs = pd.read_excel('data.xlsx', sheet_name=None)
```

### Writing Excel
```python
# Write DataFrame
df.to_excel('output.xlsx', index=False)

# Multiple sheets
with pd.ExcelWriter('output.xlsx') as writer:
    df1.to_excel(writer, sheet_name='Sheet1')
    df2.to_excel(writer, sheet_name='Sheet2')
```

### Data Transformation
```python
# Filter
filtered = df[df['Age'] > 25]

# Group by
grouped = df.groupby('Department')['Salary'].mean()

# Pivot
pivot = df.pivot_table(values='Sales', index='Region', columns='Product')
```

## JavaScript (xlsx)

```javascript
import XLSX from 'xlsx';

// Read file
const workbook = XLSX.readFile('data.xlsx');
const sheetName = workbook.SheetNames[0];
const worksheet = workbook.Sheets[sheetName];

// Convert to JSON
const data = XLSX.utils.sheet_to_json(worksheet);

// Write file
const newWorksheet = XLSX.utils.json_to_sheet(data);
const newWorkbook = XLSX.utils.book_new();
XLSX.utils.book_append_sheet(newWorkbook, newWorksheet, 'Data');
XLSX.writeFile(newWorkbook, 'output.xlsx');
```

## Common Operations

### CSV to Excel
```python
import pandas as pd

df = pd.read_csv('data.csv')
df.to_excel('data.xlsx', index=False)
```

### Excel to CSV
```python
df = pd.read_excel('data.xlsx')
df.to_csv('data.csv', index=False)
```

### Merging Excel Files
```python
dfs = []
for file in ['file1.xlsx', 'file2.xlsx', 'file3.xlsx']:
    df = pd.read_excel(file)
    dfs.append(df)

combined = pd.concat(dfs, ignore_index=True)
combined.to_excel('merged.xlsx', index=False)
```

## Remember
- Close workbooks after use
- Handle large files in chunks
- Validate data before writing
- Use pandas for data analysis, openpyxl for formatting

Overview

This skill provides practical, code-first techniques for working with Excel (XLSX) files programmatically across Python and JavaScript. It covers reading, writing, formatting, formulas, data transforms, and common workflows like CSV conversion and file merging. Use it to automate spreadsheets, prepare reports, or integrate Excel into data pipelines.

How this skill works

The skill demonstrates using openpyxl for cell-level manipulation and formatting in Python, pandas for efficient tabular reads/writes and transformations, and the xlsx package for JavaScript file-level operations and JSON conversion. Examples show reading sheets, iterating rows, writing cells and formulas, styling headers, and exporting between CSV and XLSX. It emphasizes closing workbooks, chunking large files, and validating data before writes.

When to use it

Automating reports that require precise cell formatting or formulas (use openpyxl).
Data analysis, filtering, grouping, pivoting, or bulk IO (use pandas).
Web or Node.js workflows that need to read/write Excel or convert to JSON (use xlsx).
Converting CSV datasets to Excel or exporting Excel to CSV for downstream tools.
Merging multiple workbooks into a single consolidated file for BI ingestion.

Best practices

Choose the right tool: pandas for tabular analysis, openpyxl for formatting, xlsx for JavaScript environments.
Stream or chunk large files to avoid memory spikes; use iterators where available.
Always close or save workbooks explicitly to prevent corruption.
Validate column types and missing values before writing to prevent Excel errors.
Keep formulas minimal when exporting for downstream processing; prefer computed values if consumers don't support Excel formulas.

Example use cases

Generate monthly sales reports with bold headers, colored cells, and SUM formulas using openpyxl.
Load multiple departmental Excel files, concat with pandas, compute aggregates, and write a merged workbook.
Convert CSV exports from legacy systems into formatted Excel workbooks for stakeholders.
Expose Excel data to a web app by converting sheets to JSON with the xlsx JavaScript library.
Apply number formatting and write two-decimal currency values for financial exports.

FAQ

When should I use pandas vs openpyxl?

Use pandas for data analysis, fast IO, and transformations. Use openpyxl when you need cell-level control, styling, or to preserve complex workbook features.

How do I handle very large Excel files?

Process in chunks, use pandas' iterator options or read specific columns/ranges, and avoid loading entire workbooks into memory with openpyxl when possible.