
regex-tester skill

/regex-tester

This skill tests, explains, and generates regular expressions, with match visualization and pattern generation from examples, to speed up regex debugging.

npx playbooks add skill dkyazzentwatwa/chatgpt-skills --skill regex-tester

---
name: regex-tester
description: Test, debug, and explain regular expressions. Visualize matches, generate patterns from examples, and convert between regex flavors.
---

# Regex Tester

Test and debug regular expressions with detailed match visualization, plain-English explanations, and pattern generation from examples.

## Quick Start

```python
from scripts.regex_tester import RegexTester

# Test pattern
tester = RegexTester()
result = tester.test(r"\d{3}-\d{4}", "Call 555-1234 today")
print(result['matches'])  # ['555-1234']

# Explain pattern
explanation = tester.explain(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
print(explanation)

# Generate pattern from examples
pattern = tester.generate_pattern(["555-1234", "123-4567", "999-0000"])
print(pattern)  # r"\d{3}-\d{4}"
```

## Features

- **Pattern Testing**: Test against text with detailed results
- **Match Visualization**: See exactly what matched and where
- **Pattern Explanation**: Plain-English breakdown of regex
- **Pattern Generation**: Create patterns from example matches
- **Find & Replace**: Test substitution patterns
- **Common Patterns**: Library of pre-built patterns
- **Validation**: Check pattern syntax before use

## API Reference

### Testing

```python
tester = RegexTester()

# Basic test
result = tester.test(r"\d+", "There are 42 items")
# {
#     'pattern': r'\d+',
#     'text': 'There are 42 items',
#     'matches': ['42'],
#     'match_count': 1,
#     'positions': [(10, 12)],  # (start, end)
#     'groups': []
# }

# With flags
result = tester.test(r"hello", "Hello World", ignore_case=True)

# All matches with groups
result = tester.test(r"(\d{3})-(\d{4})", "555-1234 and 999-0000")
# matches: ['555-1234', '999-0000']
# groups: [('555', '1234'), ('999', '0000')]
```
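
To see how a result of this shape could be assembled from the standard library alone, here is a minimal sketch built on `re.finditer`. `run_test` is a hypothetical helper written for illustration, not the skill's actual implementation, and its handling of `groups` is an assumption based on the sample output above.

```python
import re

def run_test(pattern, text, ignore_case=False):
    """Hypothetical helper: build a result dict shaped like the one shown above."""
    flags = re.IGNORECASE if ignore_case else 0
    found = list(re.finditer(pattern, text, flags))
    return {
        "pattern": pattern,
        "text": text,
        "matches": [m.group(0) for m in found],
        "match_count": len(found),
        "positions": [m.span() for m in found],  # (start, end) per match
        # One tuple of captured groups per match; stays empty without groups
        "groups": [m.groups() for m in found if m.groups()],
    }

result = run_test(r"(\d{3})-(\d{4})", "555-1234 and 999-0000")
print(result["matches"])  # ['555-1234', '999-0000']
print(result["groups"])   # [('555', '1234'), ('999', '0000')]
```

Using `finditer` rather than `findall` keeps the full match text, the span, and the groups available from the same match object.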

### Explanation

```python
explanation = tester.explain(r"\b\w+@\w+\.\w{2,}\b")
# Returns:
# \b - Word boundary
# \w+ - One or more word characters
# @ - Literal '@'
# \w+ - One or more word characters
# \. - Literal '.'
# \w{2,} - Two or more word characters
# \b - Word boundary
```
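
One way such a breakdown could be produced for very simple patterns is a token-by-token walk with a small lookup table. The sketch below is an assumption about the general approach, not the skill's code: `explain_simple`, `TOKEN`, and `CLASSES` are hypothetical names, and anything beyond `\b`, `\w`, `\d`, `\s`, literals, and basic quantifiers is rejected rather than guessed at.

```python
import re

# An atom (escape sequence or single character) plus an optional quantifier
TOKEN = re.compile(r"(\\.|.)(\{\d+(?:,\d*)?\}|[+*?])?")
CLASSES = {r"\w": "word character", r"\d": "digit", r"\s": "whitespace character"}

def describe(noun, quant):
    """Turn a quantifier such as '+' or '{2,}' into a short phrase."""
    if quant == "":
        return f"One {noun}"
    if quant == "?":
        return f"Optional {noun}"
    if quant == "+":
        return f"One or more {noun}s"
    if quant == "*":
        return f"Zero or more {noun}s"
    low, _, high = quant[1:-1].partition(",")
    if "," not in quant:
        return f"Exactly {low} {noun}s"
    if high == "":
        return f"{low} or more {noun}s"
    return f"Between {low} and {high} {noun}s"

def explain_simple(pattern):
    """Hypothetical explainer for a tiny subset of regex syntax."""
    lines = []
    for atom, quant in TOKEN.findall(pattern):
        if atom == r"\b":
            desc = "Word boundary"
        elif atom in CLASSES:
            desc = describe(CLASSES[atom], quant)
        elif atom in ".()[]|^$" or (len(atom) == 2 and atom[1].isalnum()):
            # Groups, classes, anchors, and other escapes need a real parser
            raise ValueError(f"token {atom!r} is outside this sketch")
        else:
            desc = f"Literal '{atom[-1]}'"
            if quant:
                desc += f" with quantifier {quant}"
        lines.append(f"{atom}{quant} - {desc}")
    return lines

print("\n".join(explain_simple(r"\b\w+@\w+\.\w{2,}\b")))
```

A table-driven walk like this stops working as soon as nesting appears, which is consistent with the note under Limitations that explanations work best with common patterns.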

### Pattern Generation

```python
# Generate pattern from examples
examples = ["john@example.com", "jane@example.org", "bob@example.net"]
pattern = tester.generate_pattern(examples)
# Suggests: r"[a-z]+@[a-z]+\.(com|org|net)"

# With negative examples (what NOT to match)
pattern = tester.generate_pattern(
    positive=["555-1234", "999-0000"],
    negative=["555-12", "12345"]
)
```
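
The heuristic behind example-driven generation is not documented here, so the following is only a sketch of one plausible approach: reduce each example to runs of character classes, require all examples to share the same sequence of classes, and reject the candidate if it fully matches any negative example. `generate_simple`, `signature`, and `char_class` are hypothetical names.

```python
import re
import string
from itertools import groupby

SPECIAL = set(r".^$*+?{}[]\|()")  # characters that need escaping as literals

def char_class(ch):
    """Map one character to a regex fragment that generalizes it."""
    if ch in string.digits:
        return r"\d"
    if ch in string.ascii_letters:
        return "[A-Za-z]"
    return "\\" + ch if ch in SPECIAL else ch

def signature(example):
    """Collapse an example into (class, run_length) pairs."""
    return [(cls, len(list(run))) for cls, run in groupby(example, key=char_class)]

def generate_simple(positive, negative=()):
    """Return a candidate pattern, or None when the examples disagree in shape."""
    signatures = [signature(example) for example in positive]
    if not signatures:
        return None
    shapes = [[cls for cls, _ in sig] for sig in signatures]
    if any(shape != shapes[0] for shape in shapes):
        return None
    parts = []
    for i, cls in enumerate(shapes[0]):
        counts = [sig[i][1] for sig in signatures]
        low, high = min(counts), max(counts)
        if low == high == 1:
            quantifier = ""
        elif low == high:
            quantifier = "{%d}" % low
        else:
            quantifier = "{%d,%d}" % (low, high)
        parts.append(cls + quantifier)
    candidate = "".join(parts)
    # A candidate that also accepts a negative example is too loose
    if any(re.fullmatch(candidate, item) for item in negative):
        return None
    return candidate

print(generate_simple(["555-1234", "999-0000"], negative=["555-12", "12345"]))
# \d{3}-\d{4}
```

This sketch never proposes alternations such as `(com|org|net)`, so it is narrower than the suggestions shown above.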

### Find & Replace

```python
result = tester.replace(
    pattern=r"(\d{3})-(\d{4})",
    replacement=r"(\1) \2",
    text="Call 555-1234"
)
# "Call (555) 1234"
```
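
The `\1` and `\2` references in the replacement follow the same syntax as Python's `re.sub`. As a point of comparison, the sketch below runs the same substitution directly with the standard library; it assumes nothing about how `tester.replace` is implemented.

```python
import re

text = "Call 555-1234"

# Numbered backreferences, as in the example above
formatted = re.sub(r"(\d{3})-(\d{4})", r"(\1) \2", text)
print(formatted)  # Call (555) 1234

# Named groups are referenced with \g<name> in the replacement
named = re.sub(r"(?P<prefix>\d{3})-(?P<line>\d{4})", r"(\g<prefix>) \g<line>", text)
print(named)  # Call (555) 1234

# re.subn also reports how many substitutions were made
masked, count = re.subn(r"\d", "#", text)
print(masked, count)  # Call ###-#### 7
```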

### Validation

```python
# Check if pattern is valid
is_valid, error = tester.validate(r"[invalid")
# is_valid: False
# error: "unterminated character set"
```
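
A syntax check of this kind can be done with the standard library by compiling the pattern and catching `re.error`. The sketch below is a minimal version under that assumption; `validate_pattern` is a hypothetical name, and the message text is whatever `re.error` reports.

```python
import re

def validate_pattern(pattern):
    """Return (is_valid, error_message); the message is None for valid patterns."""
    try:
        re.compile(pattern)
    except re.error as exc:
        # exc.msg is the message without the trailing position information
        return False, exc.msg
    return True, None

print(validate_pattern(r"\d+"))      # (True, None)
print(validate_pattern(r"[invalid"))
```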

### Common Patterns

```python
# Get pre-built patterns
email = tester.patterns['email']
phone = tester.patterns['phone_us']
url = tester.patterns['url']
ip = tester.patterns['ipv4']
```

## CLI Usage

```bash
# Test pattern
python regex_tester.py --pattern "\d+" --text "There are 42 items"

# Test from file
python regex_tester.py --pattern "\w+@\w+\.\w+" --file emails.txt

# Explain pattern
python regex_tester.py --explain "\b[A-Z][a-z]+\b"

# Find and replace
python regex_tester.py --pattern '(\d+)' --replace '[\1]' --text "Item 42"

# Generate pattern
python regex_tester.py --generate "555-1234,999-0000,123-4567"

# Interactive mode
python regex_tester.py --interactive
```

### CLI Arguments

| Argument | Description | Default |
|----------|-------------|---------|
| `--pattern` | Regex pattern | - |
| `--text` | Text to test against | - |
| `--file` | File to test against | - |
| `--explain` | Pattern to explain | - |
| `--replace` | Replacement pattern | - |
| `--generate` | Comma-separated examples to generate a pattern from | - |
| `--ignore-case` | Case insensitive | False |
| `--multiline` | Multiline mode | False |
| `--interactive` | Interactive mode | False |
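
The table maps naturally onto an `argparse` definition. The following is a hedged sketch of how such a parser might look, with option names taken from the table and `--explain` taking a pattern as in the usage examples; `build_parser` and `flags_from_args` are hypothetical, and the real script may wire its arguments differently.

```python
import argparse
import re

def build_parser():
    """Hypothetical parser mirroring the options listed in the table."""
    parser = argparse.ArgumentParser(prog="regex_tester.py")
    parser.add_argument("--pattern", help="Regex pattern")
    parser.add_argument("--text", help="Text to test against")
    parser.add_argument("--file", help="File to test against")
    parser.add_argument("--explain", metavar="PATTERN", help="Pattern to explain")
    parser.add_argument("--replace", help="Replacement pattern")
    parser.add_argument("--generate", help="Comma-separated examples")
    parser.add_argument("--ignore-case", action="store_true")
    parser.add_argument("--multiline", action="store_true")
    parser.add_argument("--interactive", action="store_true")
    return parser

def flags_from_args(args):
    """Combine the boolean options into a single re flags value."""
    flags = 0
    if args.ignore_case:
        flags |= re.IGNORECASE
    if args.multiline:
        flags |= re.MULTILINE
    return flags

args = build_parser().parse_args(
    ["--pattern", "hello", "--text", "Hello World", "--ignore-case"]
)
print(re.findall(args.pattern, args.text, flags_from_args(args)))  # ['Hello']
```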

## Examples

### Email Validation

```python
tester = RegexTester()

email_pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

test_emails = [
    "user@example.com",
    "invalid.email",
    "user@domain",
    "first.last@example.org"
]

for email in test_emails:
    result = tester.test(email_pattern, email)
    status = "Valid" if result['matches'] else "Invalid"
    print(f"{email}: {status}")
```

### Extract Data from Log

```python
tester = RegexTester()

log_line = '[2024-01-15 14:30:22] ERROR: Connection timeout (192.168.1.100)'
pattern = r'\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (\w+): (.+) \((\d+\.\d+\.\d+\.\d+)\)'

result = tester.test(pattern, log_line)
if result['groups']:
    timestamp, level, message, ip = result['groups'][0]
    print(f"Time: {timestamp}")
    print(f"Level: {level}")
    print(f"Message: {message}")
    print(f"IP: {ip}")
```

### Phone Number Formatter

```python
tester = RegexTester()

# Normalize various phone formats
phones = ["5551234567", "(555) 123-4567", "555.123.4567"]
# Formatting characters to strip: parentheses, dots, whitespace, hyphens
separators = r"[\(\)\.\s-]"

for phone in phones:
    # Remove all formatting
    clean = tester.replace(separators, "", phone)
    # Format consistently
    formatted = tester.replace(
        r"(\d{3})(\d{3})(\d{4})",
        r"(\1) \2-\3",
        clean
    )
    print(f"{phone} -> {formatted}")
```

### Understand Complex Pattern

```python
tester = RegexTester()

# Explain a complex pattern
pattern = r"(?:https?://)?(?:www\.)?([a-zA-Z0-9-]+)\.([a-zA-Z]{2,})(?:/\S*)?"

explanation = tester.explain(pattern)
print(explanation)
# (?:https?://)? - Optional non-capturing group: 'http://' or 'https://'
# (?:www\.)? - Optional non-capturing group: 'www.'
# ([a-zA-Z0-9-]+) - Capturing group 1: domain name
# \. - Literal '.'
# ([a-zA-Z]{2,}) - Capturing group 2: TLD
# (?:/\S*)? - Optional non-capturing group: path
```

## Common Patterns Library

| Name | Pattern | Matches |
|------|---------|---------|
| `email` | Complex | user@example.com |
| `phone_us` | `\d{3}[-.]?\d{3}[-.]?\d{4}` | 555-123-4567 |
| `url` | Complex | https://example.com |
| `ipv4` | `\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}` | 192.168.1.1 |
| `date_iso` | `\d{4}-\d{2}-\d{2}` | 2024-01-15 |
| `time_24h` | `\d{2}:\d{2}(:\d{2})?` | 14:30:00 |
| `hex_color` | `#[0-9A-Fa-f]{6}` | #FF5733 |
| `zipcode_us` | `\d{5}(-\d{4})?` | 12345-6789 |
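
Several of these patterns check shape only, which matters when they are used for validation rather than extraction. The sketch below copies a few patterns from the table into a plain dict (the `PATTERNS` dict and the `is_ipv4` helper are illustrative, not the skill's `tester.patterns` object) and shows `ipv4` accepting out-of-range octets unless a numeric check is added.

```python
import re

# Patterns and sample values copied from the table above
PATTERNS = {
    "phone_us": r"\d{3}[-.]?\d{3}[-.]?\d{4}",
    "ipv4": r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}",
    "date_iso": r"\d{4}-\d{2}-\d{2}",
    "time_24h": r"\d{2}:\d{2}(:\d{2})?",
    "hex_color": r"#[0-9A-Fa-f]{6}",
    "zipcode_us": r"\d{5}(-\d{4})?",
}
SAMPLES = {
    "phone_us": "555-123-4567",
    "ipv4": "192.168.1.1",
    "date_iso": "2024-01-15",
    "time_24h": "14:30:00",
    "hex_color": "#FF5733",
    "zipcode_us": "12345-6789",
}

# fullmatch requires the whole string to fit the pattern
results = {name: bool(re.fullmatch(PATTERNS[name], SAMPLES[name])) for name in PATTERNS}
print(results)

def is_ipv4(text):
    """Shape check from the table plus a range check on each octet."""
    if not re.fullmatch(PATTERNS["ipv4"], text):
        return False
    return all(int(part) <= 255 for part in text.split("."))

print(bool(re.fullmatch(PATTERNS["ipv4"], "999.999.999.999")))  # True
print(is_ipv4("999.999.999.999"))                               # False
```

The same caveat applies to `date_iso` and `time_24h`, which accept strings such as `2024-13-45` because they constrain digit counts and not values.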

## Dependencies

No external dependencies. The skill uses only the `re` module from the Python standard library.

## Limitations

- Pattern generation is heuristic (may not be optimal)
- Patterns follow Python `re` syntax, so some advanced features do not carry over to other regex flavors
- Explanation works best with common patterns
- Very long patterns may have simplified explanations

Overview

This skill provides a compact toolkit for testing, debugging, and explaining regular expressions. It offers match visualization, plain-English breakdowns, pattern generation from examples, find-and-replace testing, and a library of common patterns. It targets practical regex development and quick validation in Python environments.

How this skill works

The skill runs patterns against input text to return matches, group captures, positions, and counts. It produces human-readable explanations for pattern components and visualizes exactly what and where each match occurs. It can infer a candidate regex from positive (and optional negative) examples and apply substitution rules to test replacements. Syntax validation and pre-built common patterns are included for rapid use.
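
As an illustration of what match visualization can look like in plain text, the sketch below underlines each match with carets using the spans from `re.finditer`. The output format and the `underline_matches` name are assumptions made for this example, and it handles single-line text only.

```python
import re

def underline_matches(pattern, text):
    """Return the text followed by a line of carets under every match."""
    marks = [" "] * len(text)
    for match in re.finditer(pattern, text):
        for index in range(match.start(), match.end()):
            marks[index] = "^"
    return text + "\n" + "".join(marks).rstrip()

print(underline_matches(r"\d{3}-\d{4}", "Call 555-1234 today"))
# Call 555-1234 today
#      ^^^^^^^^
```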

When to use it

  • Validate and debug a regex against sample text
  • Generate a candidate pattern from example matches
  • Explain a complex or unfamiliar regex in plain English
  • Test find-and-replace rules before applying them to data
  • Normalize or extract structured data (emails, phones, timestamps)

Best practices

  • Provide representative positive and negative examples when generating patterns to reduce false matches
  • Start with pre-built common patterns and adapt them instead of writing from scratch
  • Use validation to catch syntax errors before running large replacements
  • Prefer non-capturing groups, written (?:...), when you don’t need the captured values; this keeps the group output simpler
  • Test patterns against edge cases (empty strings, unexpected characters, long inputs)
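
The point about non-capturing groups is easiest to see with `re.findall`, which returns group contents instead of whole matches as soon as a pattern contains a capturing group. A short standard-library sketch:

```python
import re

text = "a.com b.org"

# One capturing group: findall returns only the group contents
with_capture = re.findall(r"\w+\.(com|org)", text)
print(with_capture)     # ['com', 'org']

# Non-capturing group: findall returns the whole matches
without_capture = re.findall(r"\w+\.(?:com|org)", text)
print(without_capture)  # ['a.com', 'b.org']
```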

Example use cases

  • Extract timestamps, log levels, messages, and IPs from server logs for parsing or alerting
  • Validate user-submitted emails and phone numbers in form processing or ETL pipelines
  • Generate a concise regex from sample data when migrating or reverse-engineering formats
  • Preview and apply consistent phone-number or identifier formatting using find-and-replace rules
  • Explain and document complex URL, email, or custom identifier patterns for team handoff

FAQ

Which regex flavors are supported?

The tool is built around Python’s re behavior; it can help convert concepts between flavors but advanced, flavor-specific features may require manual adjustments.
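
One concrete flavor difference is named groups: Python's `re` spells them `(?P<name>...)`, while flavors such as JavaScript and .NET use `(?<name>...)`. The sketch below shows a naive manual adjustment of that one construct. `to_python_named_groups` is a hypothetical helper, not part of the skill, and it does not account for escaped parentheses or character classes.

```python
import re

# Matches "(?<name>" but not the lookbehind openers "(?<=" and "(?<!"
NAMED_GROUP = re.compile(r"\(\?<([A-Za-z_]\w*)>")

def to_python_named_groups(pattern):
    """Rewrite (?<name>...) groups into Python's (?P<name>...) spelling."""
    return NAMED_GROUP.sub(r"(?P<\1>", pattern)

converted = to_python_named_groups(r"(?<year>\d{4})-(?<month>\d{2})")
print(converted)  # (?P<year>\d{4})-(?P<month>\d{2})
print(re.fullmatch(converted, "2024-01").groupdict())
# {'year': '2024', 'month': '01'}
```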

How reliable is pattern generation?

Pattern generation is heuristic and aims to produce practical candidates. It performs best with varied positive examples and explicit negative examples for tuning.
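
Because generated patterns are candidates, it helps to check them against both example sets before use. The sketch below is one way to do that with the standard library; `check_candidate` is a hypothetical helper and not part of the skill's API.

```python
import re

def check_candidate(pattern, positive, negative=()):
    """Return (missed, false_hits) for a candidate pattern."""
    compiled = re.compile(pattern)
    # Positive examples the pattern fails to match in full
    missed = [item for item in positive if not compiled.fullmatch(item)]
    # Negative examples the pattern wrongly accepts
    false_hits = [item for item in negative if compiled.fullmatch(item)]
    return missed, false_hits

positive = ["555-1234", "999-0000"]
negative = ["555-12", "12345"]

print(check_candidate(r"\d+-\d+", positive, negative))      # ([], ['555-12'])
print(check_candidate(r"\d{3}-\d{4}", positive, negative))  # ([], [])
```

An empty pair means the candidate accepts every positive example and rejects every negative one, which is the minimum bar before applying it to real data.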