home / skills / eyadsibai / ltk / prompt-engineering

prompt-engineering skill

/plugins/ltk-core/skills/prompt-engineering

This skill helps you optimize prompts, design RAG systems, and build effective agent workflows for reliable LLM outputs.

npx playbooks add skill eyadsibai/ltk --skill prompt-engineering

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
5.3 KB
---
name: prompt-engineering
description: Use when "writing prompts", "prompt optimization", "few-shot learning", "chain of thought", or asking about "RAG systems", "agent workflows", "LLM integration", "prompt templates"
version: 1.0.0
---

# Prompt Engineering Guide

Effective prompts, RAG systems, and agent workflows.

## When to Use

- Optimizing LLM prompts
- Building RAG systems
- Designing agent workflows
- Creating few-shot examples
- Structuring chain-of-thought reasoning

---

## Prompt Structure

### Core Components

| Component | Purpose | Include When |
|-----------|---------|--------------|
| **Role/Context** | Set expertise, persona | Complex domain tasks |
| **Task** | Clear instruction | Always |
| **Format** | Output structure | Need structured output |
| **Examples** | Few-shot learning | Pattern demonstration needed |
| **Constraints** | Boundaries, rules | Need to limit scope |

### Prompt Patterns

| Pattern | Use Case | Key Concept |
|---------|----------|-------------|
| **Chain of Thought** | Complex reasoning | "Think step by step" |
| **Few-Shot** | Pattern learning | 2-5 input/output examples |
| **Role Playing** | Domain expertise | "You are an expert X" |
| **Structured Output** | Parsing needed | Specify JSON/format exactly |
| **Self-Consistency** | Improve accuracy | Generate multiple, vote |

---

## Chain of Thought Variants

| Variant | Description | When to Use |
|---------|-------------|-------------|
| **Standard CoT** | "Think step by step" | Math, logic problems |
| **Zero-Shot CoT** | Just add "step by step" | Quick reasoning boost |
| **Structured CoT** | Numbered steps | Complex multi-step |
| **Self-Ask** | Ask sub-questions | Research-style tasks |
| **Tree of Thought** | Explore multiple paths | Creative/open problems |

**Key concept**: CoT works because it forces the model to show intermediate reasoning, reducing errors in the final answer.

---

## Few-Shot Learning

### Example Selection

| Criteria | Why |
|----------|-----|
| **Representative** | Cover common cases |
| **Diverse** | Show range of inputs |
| **Edge cases** | Handle boundaries |
| **Consistent format** | Teach output pattern |

### Number of Examples

| Count | Trade-off |
|-------|-----------|
| 0 (zero-shot) | Less context, more creative |
| 2-3 | Good balance for most tasks |
| 5+ | Complex patterns, use tokens |

**Key concept**: Examples teach format more than content. The model learns "how" to respond, not "what" facts to include.

---

## RAG System Design

### Architecture Flow

Query → Embed → Search → Retrieve → Augment Prompt → Generate

### Chunking Strategies

| Strategy | Best For | Trade-off |
|----------|----------|-----------|
| **Fixed size** | General documents | May split sentences |
| **Sentence-based** | Precise retrieval | Many small chunks |
| **Paragraph-based** | Context preservation | May be too large |
| **Semantic** | Mixed content | More complex |

### Retrieval Quality Factors

| Factor | Impact |
|--------|--------|
| **Chunk size** | Too small = no context, too large = noise |
| **Overlap** | Prevents splitting important content |
| **Metadata filtering** | Narrows search space |
| **Re-ranking** | Improves relevance of top-k |
| **Hybrid search** | Combines keyword + semantic |

**Key concept**: RAG quality depends more on retrieval quality than generation quality. Fix retrieval first.

---

## Agent Patterns

### ReAct Pattern

| Step | Description |
|------|-------------|
| **Thought** | Reason about what to do |
| **Action** | Call a tool |
| **Observation** | Process tool result |
| **Repeat** | Until task complete |

### Tool Design Principles

| Principle | Why |
|-----------|-----|
| **Single purpose** | Clear when to use |
| **Good descriptions** | Model selects correctly |
| **Structured inputs** | Reliable parsing |
| **Informative outputs** | Model understands result |
| **Error messages** | Guide retry attempts |

---

## Prompt Optimization

### Token Efficiency

| Technique | Savings |
|-----------|---------|
| Remove redundant instructions | 10-30% |
| Use abbreviations in examples | 10-20% |
| Compress context with summaries | 50%+ |
| Remove verbose explanations | 20-40% |

### Quality Improvement

| Technique | Effect |
|-----------|--------|
| Add specific examples | Reduces errors |
| Specify output format | Enables parsing |
| Include edge cases | Handles boundaries |
| Add confidence scoring | Calibrates uncertainty |

---

## Common Task Patterns

| Task | Key Prompt Elements |
|------|---------------------|
| **Extraction** | List fields, specify format (JSON), handle missing |
| **Classification** | List categories, one-shot per category, single answer |
| **Summarization** | Specify length, focus areas, format (bullets/prose) |
| **Generation** | Style guide, length, constraints, examples |
| **Q&A** | Context placement, "based only on context" |

---

## Best Practices

| Practice | Why |
|----------|-----|
| Be specific and explicit | Reduces ambiguity |
| Provide clear examples | Shows expected format |
| Specify output format | Enables parsing |
| Test with diverse inputs | Find edge cases |
| Iterate based on failures | Targeted improvement |
| Separate instructions from data | Prevent injection |

## Resources

- Anthropic Prompt Engineering: <https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering>
- OpenAI Cookbook: <https://cookbook.openai.com/>

Overview

This skill helps authors, engineers, and product teams craft, test, and optimize prompts for large language models and agent workflows. It focuses on prompt structure, few-shot example design, chain-of-thought strategies, RAG architectures, and tool-driven agents to improve reliability and accuracy. Use it to produce repeatable prompt patterns and retrieval pipelines that scale.

How this skill works

The skill inspects intent, desired output format, and task complexity to recommend prompt components: role/context, task, format, examples, and constraints. It suggests prompt patterns (few-shot, CoT, structured output), chunking and retrieval strategies for RAG systems, and agent/tool designs like ReAct. It also provides token-efficiency tips and iterative tuning steps to validate prompts with diverse inputs.

When to use it

  • Writing or refining prompts for specific LLM tasks
  • Designing RAG pipelines and document chunking strategies
  • Creating few-shot examples or structured output templates
  • Building agent workflows and tool integrations (ReAct style)
  • Improving model accuracy with chain-of-thought techniques

Best practices

  • Define role/context and a single clear task to reduce ambiguity
  • Specify exact output format (JSON, CSV, bullets) to enable parsing
  • Use 2–5 representative few-shot examples; include edge cases and consistent formatting
  • Apply Chain-of-Thought variants for multi-step reasoning; prefer structured or self-ask when complex
  • Optimize tokens: remove redundancy, summarize context, and use concise examples
  • Separate instructions from data to reduce prompt injection risks and iterate based on failures

Example use cases

  • Create a JSON extraction prompt for invoice processing with field definitions and missing-value rules
  • Design a RAG search flow: embed, search, retrieve top-k, re-rank, augment prompt, generate answer
  • Build an agent that calls a calendar API using ReAct: thought → action → observation loop
  • Craft few-shot templates for classification tasks with one example per category and consistent labels
  • Implement structured CoT for multi-step math or diagnostic reasoning with numbered steps

FAQ

How many few-shot examples should I include?

Start with 2–3 representative examples for most tasks; use 5+ only for complex patterns demanding more token budget.

When should I prefer retrieval tuning over prompt changes?

Fix retrieval first: if returned context is irrelevant or fragmented, improve chunking, overlap, metadata filters, or re-ranking before changing the prompt.