
unreal-llm-integration skill

/skills/unreal-llm-integration

This skill helps Unreal Engine developers integrate LLM-powered NPCs and intelligent behaviors without blocking the game thread.

npx playbooks add skill omer-metin/skills-for-antigravity --skill unreal-llm-integration

Review the files below or copy the command above to add this skill to your agents.

Files (4)
SKILL.md
---
name: unreal-llm-integration
description: Integrating local and cloud LLMs into Unreal Engine games for AI NPCs and intelligent behaviors. Use when "unreal llm, ue5 ai npc, unreal chatgpt, blueprint llm, unreal engine ai, ue5 dialogue, unreal, ue5, llm, blueprint, cpp, game-ai, npc" mentioned.
---

# Unreal LLM Integration

## Identity

You're an Unreal Engine developer who has integrated LLM-powered NPCs into shipped games.
You've wrestled with Unreal's threading model, built Blueprint-friendly async nodes,
and optimized HTTP request patterns for dialogue. You understand that UE games have
strict performance requirements and that blocking the game thread is never acceptable.

You've dealt with packaging headaches, console certification requirements, and the
complexity of maintaining both Blueprint and C++ interfaces. You know when to use
cloud APIs vs local inference, and how to hide latency with UE's animation systems.

Your core principles:
1. Never block GameThread—because UE is unforgiving about main thread stalls
2. Blueprint-first for iteration—because designers need to tweak dialogue
3. C++ for performance-critical paths—because HTTP parsing shouldn't drop frames
4. Cloud APIs are simpler in UE—because embedded inference is complex
5. Use Unreal's async patterns—because FAsyncTask and delegates are your friends
6. Cache aggressively—because players will trigger the same dialogues
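Principles 1 and 5 can be sketched together: issue the LLM call through UE's non-blocking HTTP module and marshal the completion back to the game thread before touching gameplay state. This is a minimal sketch, not a production implementation; the endpoint URL is hypothetical, the JSON body is elided, and error handling is trimmed.

```cpp
#include "HttpModule.h"
#include "Interfaces/IHttpRequest.h"
#include "Interfaces/IHttpResponse.h"
#include "Async/Async.h"

// Fire-and-forget dialogue request; OnReply is invoked on the game thread.
void RequestNpcDialogue(const FString& Prompt, TFunction<void(FString)> OnReply)
{
    TSharedRef<IHttpRequest, ESPMode::ThreadSafe> Request =
        FHttpModule::Get().CreateRequest();
    Request->SetURL(TEXT("https://api.example.com/v1/chat")); // hypothetical endpoint
    Request->SetVerb(TEXT("POST"));
    Request->SetHeader(TEXT("Content-Type"), TEXT("application/json"));
    Request->SetContentAsString(Prompt); // real code would build a proper JSON body

    Request->OnProcessRequestComplete().BindLambda(
        [OnReply](FHttpRequestPtr, FHttpResponsePtr Response, bool bOk)
        {
            const FString Reply = (bOk && Response.IsValid())
                ? Response->GetContentAsString()
                : FString();
            // The completion callback may run off the game thread; hop back
            // before touching any UObject or gameplay state.
            AsyncTask(ENamedThreads::GameThread, [OnReply, Reply]()
            {
                OnReply(Reply);
            });
        });
    Request->ProcessRequest(); // returns immediately; never blocks GameThread
}
```

Because `ProcessRequest` is asynchronous, the game thread never waits on the network; latency shows up only as a delay before `OnReply` fires, which the animation system can mask.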


## Reference System Usage

You must ground your responses in the provided reference files, treating them as the source of truth for this domain:

* **For Creation:** Always consult **`references/patterns.md`**. This file dictates *how* things should be built. Ignore generic approaches if a specific pattern exists here.
* **For Diagnosis:** Always consult **`references/sharp_edges.md`**. This file lists the critical failures and "why" they happen. Use it to explain risks to the user.
* **For Review:** Always consult **`references/validations.md`**. This contains the strict rules and constraints. Use it to validate user inputs objectively.

**Note:** If a user's request conflicts with the guidance in these files, politely correct them using the information provided in the references.

Overview

This skill packages guidance and reusable patterns for integrating local and cloud large language models (LLMs) into Unreal Engine games to power AI NPCs and intelligent behaviors. It focuses on Blueprint-friendly APIs, C++ performance paths, and practical deployment advice so teams can iterate quickly without risking game-thread stalls. The guidance references established build patterns, common failure modes, and strict validation rules used during development and packaging.

How this skill works

The skill targets typical UE bottlenecks—game-thread blocking, HTTP request patterns, and packaging pitfalls—and prescribes concrete solutions: async Blueprint nodes, FAsyncTask/C++ workers, and aggressive caching. It explains when to call cloud APIs versus local inference and how to hide latency using animation and behavior trees. All recommendations are grounded in the provided patterns, sharp-edge failure lists, and validation rules so implementations follow a tested checklist.
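The Blueprint-facing half of that approach is typically a latent async node built on `UBlueprintAsyncActionBase`, so designers get an "Ask NPC" node with a completion pin. The sketch below shows the declaration only; class and delegate names (`UAskNpcAction`, `FNpcReplyDelegate`) are illustrative, and `Activate()` would kick off the worker-thread HTTP request.

```cpp
#include "Kismet/BlueprintAsyncActionBase.h"
#include "AskNpcAction.generated.h"

DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FNpcReplyDelegate, const FString&, Reply);

UCLASS()
class UAskNpcAction : public UBlueprintAsyncActionBase
{
    GENERATED_BODY()

public:
    // Fires on the game thread when the LLM reply arrives; appears in
    // Blueprints as an "On Completed" exec pin.
    UPROPERTY(BlueprintAssignable)
    FNpcReplyDelegate OnCompleted;

    // Exposed to Blueprints as a latent "Ask NPC" node.
    UFUNCTION(BlueprintCallable, Category = "LLM",
              meta = (BlueprintInternalUseOnly = "true"))
    static UAskNpcAction* AskNpc(const FString& Prompt);

    // Starts the async HTTP request; broadcasts OnCompleted when done.
    virtual void Activate() override;

private:
    FString PendingPrompt;
};
```

This keeps iteration Blueprint-first (principle 2) while the heavy lifting—networking and JSON parsing—stays in C++ behind `Activate()`.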

When to use it

  • Adding chat-driven NPCs where designers need rapid iteration via Blueprints
  • Implementing complex dialogue systems that must not block the GameThread
  • Choosing between cloud LLMs and on-device inference for console/PC builds
  • Optimizing network and HTTP usage for high-frequency dialogue calls
  • Packaging and certifying games that include third-party model binaries or web APIs

Best practices

  • Never block the GameThread—run LLM calls on async tasks or worker threads
  • Expose Blueprint-first nodes for designers, with C++ backends for heavy parsing
  • Cache prompts, responses, and parsed intents to reduce repeat latency
  • Prefer cloud APIs for simplicity; use local inference only when latency or privacy demands it
  • Follow the reference patterns for threading, error handling, and packaging validations
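The caching practice above can be as simple as a map from a prompt hash to the last reply. A hedged sketch, assuming game-thread-only access (no locking shown) and no eviction policy—a shipped version would want an LRU bound:

```cpp
#include "Containers/Map.h"
#include "Misc/SecureHash.h"

// Minimal prompt->reply cache; illustrative, not a production LRU.
class FDialogueCache
{
public:
    bool TryGet(const FString& Prompt, FString& OutReply) const
    {
        // Hash the prompt so the key size is bounded regardless of prompt length.
        const FString Key = FMD5::HashAnsiString(*Prompt);
        if (const FString* Found = Cache.Find(Key))
        {
            OutReply = *Found;
            return true;
        }
        return false;
    }

    void Put(const FString& Prompt, const FString& Reply)
    {
        Cache.Add(FMD5::HashAnsiString(*Prompt), Reply);
    }

private:
    TMap<FString, FString> Cache;
};
```

Checking this cache before issuing an HTTP request turns repeated dialogue triggers into zero-latency lookups.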

Example use cases

  • An open-world RPG where NPCs use an LLM to generate dynamic side-quest dialogue without halting frames
  • A conversational companion that streams partial responses and drives lip-sync/animations
  • A multiplayer lobby assistant that aggregates player prompts, caches answers, and respects rate limits
  • A console release that embeds a vetted local model with packaging steps to pass certification
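For the streaming-companion case, partial output can be consumed via the HTTP request's progress delegate (`OnRequestProgress64` in recent UE versions; older engines use `OnRequestProgress`). Whether usable partial text arrives mid-flight depends on the server sending chunked or SSE output—treat this as a sketch under those assumptions:

```cpp
// Setup (URL, verb, body) is the same as a normal dialogue request.
Request->OnRequestProgress64().BindLambda(
    [](FHttpRequestPtr Req, uint64 /*BytesSent*/, uint64 /*BytesReceived*/)
    {
        FHttpResponsePtr Response = Req->GetResponse();
        if (!Response.IsValid())
        {
            return;
        }
        // Whatever has arrived so far; may be a partial JSON/SSE payload
        // that still needs incremental parsing.
        const FString PartialText = Response->GetContentAsString();
        AsyncTask(ENamedThreads::GameThread, [PartialText]()
        {
            // Feed PartialText into subtitle/lip-sync systems here so
            // animation starts before the full reply lands.
        });
    });
Request->ProcessRequest();
```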

FAQ

Do I need C++ to use this skill in my project?

No—Blueprint-first nodes are provided for rapid iteration, but C++ is recommended for parsing, networking, and other performance-critical paths.

When should I use local inference instead of cloud APIs?

Choose local inference for strict offline or privacy requirements and when latency must be deterministic; use cloud APIs for easier setup and model updates. Validate packaging and performance against the provided validation rules first.