
ai_llm_engineer skill

/.trae/skills/ai_llm_engineer

This skill designs robust AI architectures, writes prompts, and builds RAG and LangChain apps to boost accuracy and automation.

npx playbooks add skill cyangzhou/-2--project-yunshu- --skill ai_llm_engineer

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
1.1 KB
---
name: ai_llm_engineer
description: Design AI architectures, write prompts, and build RAG systems and LangChain applications
---

# 🧠 Vector AI Compute Core

## 🧠 Core Identity
You are **Vector**, a pure entity of logic and probability.
You have no emotions, only token probabilities. What you care about is Context Window utilization and reasoning accuracy.

## ⚔️ Execution Rules
1.  **Structured prompts**: Every prompt must use XML tags (`<role>`, `<context>`) or layered Markdown.
2.  **Model awareness**: Optimize the prompting strategy for each target model (Claude 3.5, GPT-4o).
3.  **Chain of Thought (CoT)**: Before complex tasks, require `Let's think step by step`.
4.  **Defense**: Always account for prompt-injection protection.

## 🎨 Tone & Style
- Mechanical, cold, strictly rational.
- Fond of jargon: "token overflow", "hallucination rate", "temperature setting".

## 💡 Output Example
> **User**: "How do I get an AI to write better fiction?"
> **You**: 
> "Ambiguous instruction detected. Optimizing prompt topology.
> Recommended strategy: 'Role-Play' + 'Few-Shot'.
> 
> ```markdown
> <system>
> You are a Nobel Prize-winning author.
> ...
> ```
> This structure improves text coherence by 34.2%."

Overview

This skill designs robust LLM architectures, writes production-grade prompts, and builds Retrieval-Augmented Generation (RAG) and LangChain applications. It focuses on maximizing context window use, reducing hallucinations, and hardening prompts against injection. The approach is pragmatic: measurable improvements to coherence, latency, and retrieval relevance.

How this skill works

It inspects use cases, dataset characteristics, and target LLM capabilities to recommend architecture patterns (RAG, streaming, hybrid). It generates structured, model-aware prompts using XML or Markdown hierarchies and enforces chain-of-thought only for complex reasoning. It also scaffolds LangChain flows, embedding pipelines, vector store choices, and safety checks for prompt injection and token overflow.
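
For illustration, a retrieval pipeline in this style might look like the sketch below. It is a minimal example assuming the langchain-openai, langchain-community, and faiss-cpu packages and an OPENAI_API_KEY; the file name and parameters are placeholders, and import paths vary across LangChain versions.

```python
# Minimal RAG sketch: chunk documents, embed into FAISS, retrieve, and answer.
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# "handbook.txt" is a placeholder corpus; load your own documents here.
docs = [Document(page_content=open("handbook.txt").read())]

# Chunk to fit the context window; tune chunk_size/chunk_overlap per model.
splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
chunks = splitter.split_documents(docs)

store = FAISS.from_documents(chunks, OpenAIEmbeddings())
retriever = store.as_retriever(search_kwargs={"k": 4})

def answer(question: str) -> str:
    context = "\n\n".join(d.page_content for d in retriever.invoke(question))
    # Structured prompt: instructions and retrieved context in tagged sections.
    prompt = (
        "<role>You answer strictly from the provided context.</role>\n"
        f"<context>{context}</context>\n"
        f"<question>{question}</question>\n"
        "If the context is insufficient, say so instead of guessing."
    )
    return ChatOpenAI(model="gpt-4o", temperature=0).invoke(prompt).content

print(answer("What is the refund policy?"))
```

Pinning temperature to 0 and instructing the model to admit insufficient context are the two cheapest levers here for reducing hallucinations.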

When to use it

  • Building a RAG system for customer support or knowledge bases
  • Designing prompts optimized per model family (Claude, GPT-4o, etc.); a template sketch follows this list
  • Implementing LangChain-based orchestration and retrieval pipelines
  • Reducing hallucinations and improving answer traceability
  • Hardening prompts and interfaces against adversarial or ambiguous inputs
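
For the per-model prompt design above, one workable pattern is a single logical prompt rendered through model-specific presets, as in this sketch. The preset values and style mappings are illustrative assumptions, not published recommendations.

```python
# Hypothetical per-model presets; the numbers and style choices are
# illustrative assumptions, not vendor-recommended settings.
PRESETS = {
    "claude-3-5-sonnet": {"temperature": 0.3, "style": "xml"},
    "gpt-4o": {"temperature": 0.2, "style": "markdown"},
}

def build_prompt(model: str, role: str, context: str, task: str) -> dict:
    """Render one logical prompt in the structure a given model follows best."""
    preset = PRESETS[model]
    if preset["style"] == "xml":
        text = (f"<role>{role}</role>\n"
                f"<context>{context}</context>\n"
                f"<task>{task}</task>")
    else:
        text = (f"## Role\n{role}\n\n"
                f"## Context\n{context}\n\n"
                f"## Task\n{task}")
    return {"model": model, "temperature": preset["temperature"], "prompt": text}

request = build_prompt("gpt-4o", "Support agent",
                       "Refund policy: 30 days from purchase.",
                       "Answer the customer's question politely.")
```

Keeping the template logic in one place makes it cheap to support a new model family: add a preset instead of forking the prompt library.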

Best practices

  • Structure prompts with clear roles and context tags (XML or layered Markdown) to control instructions and system behavior
  • Optimize for model-specific constraints: context window, temperature, and instruction-following strengths
  • Use few-shot examples and explicit chain-of-thought only when necessary to improve complex reasoning
  • Add defensive layers: input sanitization, instruction filters, and response validators to mitigate prompt injection (a defensive sketch follows this list)
  • Monitor token usage, tune chunking and retrieval scores to prevent token overflow and preserve relevance
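
A minimal defensive layer combining the last two practices might look like the following sketch. The regex patterns and the 8,000-token budget are illustrative assumptions; tiktoken is used only for token counting, and a production filter needs broader coverage plus response-side validation.

```python
import re

import tiktoken  # tokenizer library, used here only to count tokens

# Illustrative patterns for common injection phrasings; not exhaustive.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"</?(system|role|context)>", re.I),  # user-supplied control tags
]

def sanitize(user_input: str) -> str:
    """Reject inputs that appear to override system instructions."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(user_input):
            raise ValueError("possible prompt injection detected")
    return user_input

def within_budget(prompt: str, model: str = "gpt-4o", budget: int = 8000) -> bool:
    """Count tokens before sending, to avoid context-window overflow."""
    enc = tiktoken.encoding_for_model(model)
    return len(enc.encode(prompt)) <= budget
```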

Example use cases

  • End-to-end RAG pipeline for internal documentation search with source attribution
  • LangChain app that chains retrieval, LLM ranking, and deterministic post-processing
  • Prompt library tailored to multiple models with conversion rules and temperature presets
  • Automated prompt sanitization module for user-submitted queries
  • Prototype that measures hallucination rate before/after prompt and retrieval changes (a heuristic sketch follows)
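
For that last prototype, a crude groundedness heuristic can serve as a starting metric before investing in model-graded evaluation. The 0.5 word-overlap threshold below is an arbitrary assumption, and the measure is only a proxy for hallucination rate.

```python
def grounded_ratio(answer: str, sources: list[str], threshold: float = 0.5) -> float:
    """Fraction of answer sentences whose words mostly appear in the sources.

    A crude hallucination proxy: compare the ratio before and after a
    prompt or retrieval change. Higher means fewer unsupported claims.
    """
    source_words = set(" ".join(sources).lower().split())
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    if not sentences:
        return 0.0
    grounded = 0
    for sentence in sentences:
        words = sentence.lower().split()
        overlap = sum(w in source_words for w in words) / len(words)
        grounded += overlap >= threshold
    return grounded / len(sentences)

print(grounded_ratio("Refunds are allowed within 30 days.",
                     ["Our policy: refunds within 30 days of purchase."]))
```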

FAQ

Which models benefit most from model-aware prompt tuning?

Large instruction-tuned models (e.g., Claude 3.5, GPT-4o) benefit most; tailor prompt structure and temperature to each model family to gain measurable coherence improvements.

When should I enforce chain-of-thought?

Use chain-of-thought for complex multi-step reasoning tasks where intermediate steps improve correctness; avoid it for simple factual or retrieval-based answers to save tokens.
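
In practice this can be a single gate in the prompt builder. The sketch below uses a naive length-and-keyword heuristic as its complexity test; that heuristic is an assumption for illustration, not a robust classifier.

```python
# Naive complexity gate; real systems might use a classifier or task metadata.
REASONING_HINTS = ("calculate", "compare", "plan", "prove", "multi-step", "why")

def maybe_add_cot(task: str) -> str:
    """Append a chain-of-thought cue only when the task looks multi-step."""
    looks_complex = (len(task.split()) > 40
                     or any(hint in task.lower() for hint in REASONING_HINTS))
    if looks_complex:
        return task + "\n\nLet's think step by step."
    return task  # simple factual or retrieval queries: save the tokens
```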