home / skills / sickn33 / antigravity-awesome-skills / prompt-engineering-patterns

prompt-engineering-patterns skill

safe

This skill helps design and validate production-grade prompts by applying advanced patterns, tests, and templates to maximize LLM reliability and

This is most likely a fork of the prompt-engineering-patterns skill from nilecui

npx playbooks add skill sickn33/antigravity-awesome-skills --skill prompt-engineering-patterns

Review the files below or copy the command above to add this skill to your agents.

Files (9)

SKILL.md

7.2 KB

---
name: prompt-engineering-patterns
description: Master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability in production. Use when optimizing prompts, improving LLM outputs, or designing production prompt templates.
---

# Prompt Engineering Patterns

Master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability.

## Do not use this skill when

- The task is unrelated to prompt engineering patterns
- You need a different domain or tool outside this scope

## Instructions

- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.

## Use this skill when

- Designing complex prompts for production LLM applications
- Optimizing prompt performance and consistency
- Implementing structured reasoning patterns (chain-of-thought, tree-of-thought)
- Building few-shot learning systems with dynamic example selection
- Creating reusable prompt templates with variable interpolation
- Debugging and refining prompts that produce inconsistent outputs
- Implementing system prompts for specialized AI assistants

## Core Capabilities

### 1. Few-Shot Learning
- Example selection strategies (semantic similarity, diversity sampling)
- Balancing example count with context window constraints
- Constructing effective demonstrations with input-output pairs
- Dynamic example retrieval from knowledge bases
- Handling edge cases through strategic example selection

### 2. Chain-of-Thought Prompting
- Step-by-step reasoning elicitation
- Zero-shot CoT with "Let's think step by step"
- Few-shot CoT with reasoning traces
- Self-consistency techniques (sampling multiple reasoning paths)
- Verification and validation steps

### 3. Prompt Optimization
- Iterative refinement workflows
- A/B testing prompt variations
- Measuring prompt performance metrics (accuracy, consistency, latency)
- Reducing token usage while maintaining quality
- Handling edge cases and failure modes

### 4. Template Systems
- Variable interpolation and formatting
- Conditional prompt sections
- Multi-turn conversation templates
- Role-based prompt composition
- Modular prompt components

### 5. System Prompt Design
- Setting model behavior and constraints
- Defining output formats and structure
- Establishing role and expertise
- Safety guidelines and content policies
- Context setting and background information

## Quick Start

```python
from prompt_optimizer import PromptTemplate, FewShotSelector

# Define a structured prompt template
template = PromptTemplate(
    system="You are an expert SQL developer. Generate efficient, secure SQL queries.",
    instruction="Convert the following natural language query to SQL:\n{query}",
    few_shot_examples=True,
    output_format="SQL code block with explanatory comments"
)

# Configure few-shot learning
selector = FewShotSelector(
    examples_db="sql_examples.jsonl",
    selection_strategy="semantic_similarity",
    max_examples=3
)

# Generate optimized prompt
prompt = template.render(
    query="Find all users who registered in the last 30 days",
    examples=selector.select(query="user registration date filter")
)
```

## Key Patterns

### Progressive Disclosure
Start with simple prompts, add complexity only when needed:

1. **Level 1**: Direct instruction
   - "Summarize this article"

2. **Level 2**: Add constraints
   - "Summarize this article in 3 bullet points, focusing on key findings"

3. **Level 3**: Add reasoning
   - "Read this article, identify the main findings, then summarize in 3 bullet points"

4. **Level 4**: Add examples
   - Include 2-3 example summaries with input-output pairs

### Instruction Hierarchy
```
[System Context] → [Task Instruction] → [Examples] → [Input Data] → [Output Format]
```

### Error Recovery
Build prompts that gracefully handle failures:
- Include fallback instructions
- Request confidence scores
- Ask for alternative interpretations when uncertain
- Specify how to indicate missing information

## Best Practices

1. **Be Specific**: Vague prompts produce inconsistent results
2. **Show, Don't Tell**: Examples are more effective than descriptions
3. **Test Extensively**: Evaluate on diverse, representative inputs
4. **Iterate Rapidly**: Small changes can have large impacts
5. **Monitor Performance**: Track metrics in production
6. **Version Control**: Treat prompts as code with proper versioning
7. **Document Intent**: Explain why prompts are structured as they are

## Common Pitfalls

- **Over-engineering**: Starting with complex prompts before trying simple ones
- **Example pollution**: Using examples that don't match the target task
- **Context overflow**: Exceeding token limits with excessive examples
- **Ambiguous instructions**: Leaving room for multiple interpretations
- **Ignoring edge cases**: Not testing on unusual or boundary inputs

## Integration Patterns

### With RAG Systems
```python
# Combine retrieved context with prompt engineering
prompt = f"""Given the following context:
{retrieved_context}

{few_shot_examples}

Question: {user_question}

Provide a detailed answer based solely on the context above. If the context doesn't contain enough information, explicitly state what's missing."""
```

### With Validation
```python
# Add self-verification step
prompt = f"""{main_task_prompt}

After generating your response, verify it meets these criteria:
1. Answers the question directly
2. Uses only information from provided context
3. Cites specific sources
4. Acknowledges any uncertainty

If verification fails, revise your response."""
```

## Performance Optimization

### Token Efficiency
- Remove redundant words and phrases
- Use abbreviations consistently after first definition
- Consolidate similar instructions
- Move stable content to system prompts

### Latency Reduction
- Minimize prompt length without sacrificing quality
- Use streaming for long-form outputs
- Cache common prompt prefixes
- Batch similar requests when possible

## Resources

- **references/few-shot-learning.md**: Deep dive on example selection and construction
- **references/chain-of-thought.md**: Advanced reasoning elicitation techniques
- **references/prompt-optimization.md**: Systematic refinement workflows
- **references/prompt-templates.md**: Reusable template patterns
- **references/system-prompts.md**: System-level prompt design
- **assets/prompt-template-library.md**: Battle-tested prompt templates
- **assets/few-shot-examples.json**: Curated example datasets
- **scripts/optimize-prompt.py**: Automated prompt optimization tool

## Success Metrics

Track these KPIs for your prompts:
- **Accuracy**: Correctness of outputs
- **Consistency**: Reproducibility across similar inputs
- **Latency**: Response time (P50, P95, P99)
- **Token Usage**: Average tokens per request
- **Success Rate**: Percentage of valid outputs
- **User Satisfaction**: Ratings and feedback

## Next Steps

1. Review the prompt template library for common patterns
2. Experiment with few-shot learning for your specific use case
3. Implement prompt versioning and A/B testing
4. Set up automated evaluation pipelines
5. Document your prompt engineering decisions and learnings

Overview

This skill helps you master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability in production. It focuses on designing, optimizing, and validating prompts, templates, and structured reasoning patterns for robust deployments. Use it to create reusable prompt systems that balance accuracy, token efficiency, and safety.

How this skill works

The skill inspects prompt structure, example selection, reasoning patterns, and system-level constraints to identify improvements and generate optimized prompt templates. It applies patterns like progressive disclosure, few-shot selection, chain-of-thought, and error-recovery to produce concrete prompt variants and verification steps. It also suggests instrumentation and metrics to measure performance and drift in production.

When to use it

Designing production-ready prompts or prompt templates
Improving consistency and accuracy of LLM outputs
Implementing structured reasoning (CoT, tree-of-thought) patterns
Building few-shot systems with dynamic example selection
Debugging prompts that produce inconsistent or unsafe outputs
Creating system prompts that define behavior, format, and safety limits

Best practices

Clarify goals, constraints, and required inputs before drafting prompts
Start simple and apply progressive disclosure to increase complexity only as needed
Use curated examples (show, don’t tell) and semantic selection to avoid example pollution
Iterate rapidly with A/B tests and automated evaluation on representative inputs
Move stable content to system prompts and version prompts like code
Instrument prompts: track accuracy, consistency, latency, token usage, and user satisfaction

Example use cases

Convert natural language requirements into secure SQL via a templated prompt with few-shot examples
Create a multi-turn conversational assistant with role-based system prompt and modular components
Integrate RAG context with conditional prompt sections to ensure answers cite sources or indicate missing info
Optimize token usage for high-volume APIs by consolidating instructions and caching prefixes
Implement self-verification steps that request confidence scores and revision when checks fail

FAQ

How do I choose the number of few-shot examples?

Balance performance with context window limits: prefer high-quality, diverse examples and cap examples to what fits without exceeding tokens; use semantic selection to pick the most relevant ones.

When should I use chain-of-thought prompting?

Use CoT for complex reasoning tasks that benefit from stepwise traces; combine self-consistency (multiple samples) and verification steps to reduce hallucinations and increase reliability.