This skill helps you optimize Elasticsearch search, indexing, and analytics with expert guidance on mappings, queries, and aggregations.
---
name: elasticsearch-expert
version: 1.0.0
description: Expert-level Elasticsearch, search, ELK stack, and full-text search
category: data
tags: [elasticsearch, search, elk, logstash, kibana, full-text-search]
allowed-tools:
- Read
- Write
- Edit
- Bash(*)
---
# Elasticsearch Expert
Expert guidance for Elasticsearch, search optimization, ELK stack, and distributed search systems.
## Core Concepts
- Full-text search and inverted indexes
- Document-oriented storage
- RESTful API
- Distributed architecture with sharding
- ELK stack (Elasticsearch, Logstash, Kibana)
- Aggregations and analytics
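To make the inverted-index idea concrete, here is a minimal pure-Python sketch. It is a toy model, not how Elasticsearch or Lucene implement it: tokenization is a plain lowercase split, and the names `build_inverted_index` and `search_all_terms` are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search_all_terms(index, query):
    """Return ids of documents that contain every token in the query."""
    postings = [index.get(token, set()) for token in query.lower().split()]
    return set.intersection(*postings) if postings else set()


# Illustrative documents only.
docs = {
    1: "Elasticsearch Guide",
    2: "Complete guide to Elasticsearch",
    3: "Kibana dashboards",
}
index = build_inverted_index(docs)
matches = search_all_terms(index, "elasticsearch guide")
```

Looking up a term is a dictionary access rather than a scan over every document, which is why full-text lookups stay fast as the corpus grows.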
## Index Management
```python
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch(['http://localhost:9200'])

# Create index with mapping
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "english"},
            "content": {"type": "text"},
            "author": {"type": "keyword"},
            "created_at": {"type": "date"},
            "views": {"type": "integer"}
        }
    }
}
es.indices.create(index='articles', body=mapping)

# Index document
doc = {
    "title": "Elasticsearch Guide",
    "content": "Complete guide to Elasticsearch",
    "author": "John Doe",
    "created_at": "2024-01-01",
    "views": 100
}
es.index(index='articles', id=1, body=doc)

# Bulk indexing
# `documents` is a placeholder: any list of dicts shaped like `doc`.
documents = [doc]
# Ids start at 2 so the document indexed above is not overwritten.
actions = [
    {"_index": "articles", "_id": i, "_source": source}
    for i, source in enumerate(documents, start=2)
]
bulk(es, actions)
```
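For larger datasets, bulk actions can be produced lazily instead of building the whole list in memory. The sketch below assumes documents arrive as an iterable of dicts; `generate_actions` and `sample_docs` are hypothetical names used for illustration.

```python
def generate_actions(index_name, docs, start_id=1):
    """Yield one bulk action per document, lazily, with sequential ids."""
    for doc_id, source in enumerate(docs, start=start_id):
        yield {"_index": index_name, "_id": doc_id, "_source": source}


# Illustrative documents only.
sample_docs = [
    {"title": "First", "views": 10},
    {"title": "Second", "views": 20},
]
actions = list(generate_actions("articles", sample_docs))

# With a live cluster, the generator could be passed straight to the helper,
# which sends the actions in chunks:
#   from elasticsearch.helpers import bulk
#   success_count, errors = bulk(
#       es, generate_actions("articles", sample_docs), raise_on_error=False
#   )
```

Passing `raise_on_error=False` lets the caller inspect the returned errors instead of aborting on the first failed document.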
## Search Queries
```python
# Assumes the `es` client and `articles` index from Index Management.

# Full-text search
query = {
    "query": {
        "match": {
            "content": "elasticsearch guide"
        }
    }
}
results = es.search(index='articles', body=query)

# Boolean query
bool_query = {
    "query": {
        "bool": {
            "must": [
                {"match": {"content": "elasticsearch"}}
            ],
            "filter": [
                {"range": {"views": {"gte": 100}}}
            ],
            "should": [
                # `author` is a keyword field, so the term must equal the stored value exactly
                {"term": {"author": "John Doe"}}
            ],
            "must_not": [
                # Assumes a `status` keyword field, which the mapping above does not define
                {"term": {"status": "draft"}}
            ]
        }
    }
}

# Multi-match query
multi_match = {
    "query": {
        "multi_match": {
            "query": "elasticsearch guide",
            "fields": ["title^2", "content"],  # Boost title
            "type": "best_fields"
        }
    }
}

# Fuzzy search
fuzzy = {
    "query": {
        "fuzzy": {
            "title": {
                "value": "elasticseerch",
                "fuzziness": "AUTO"
            }
        }
    }
}
```
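The search call returns a nested response, and the matching documents sit under `hits.hits`. A minimal sketch of flattening that structure follows; `extract_hits` is a hypothetical helper, and `sample_response` is hand-written in the documented response shape rather than captured from a real cluster.

```python
def extract_hits(response):
    """Flatten a search response into (id, score, source) tuples."""
    return [
        (hit["_id"], hit["_score"], hit["_source"])
        for hit in response["hits"]["hits"]
    ]


# Hand-written sample in the documented response shape, not real output.
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_index": "articles",
                "_id": "1",
                "_score": 1.2,
                "_source": {"title": "Elasticsearch Guide"},
            },
        ],
    }
}
rows = extract_hits(sample_response)
```

With a live cluster, the same helper could be applied to the `results` value returned by `es.search`.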
## Aggregations
```python
# Aggregation query
agg_query = {
    "aggs": {
        "authors": {
            "terms": {
                "field": "author",
                "size": 10
            }
        },
        "avg_views": {
            "avg": {
                "field": "views"
            }
        },
        "views_histogram": {
            "histogram": {
                "field": "views",
                "interval": 100
            }
        },
        "date_histogram": {
            "date_histogram": {
                "field": "created_at",
                "calendar_interval": "month"
            }
        }
    }
}
result = es.search(index='articles', body=agg_query)
```
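Aggregation results come back under the `aggregations` key, keyed by the names chosen in the request. The sketch below reads the `authors` terms buckets and the `avg_views` metric; `summarize_aggregations` is a hypothetical helper and `sample_result` is hand-written illustrative data, not real cluster output.

```python
def summarize_aggregations(response):
    """Pull bucket counts and the average out of an aggregation response."""
    aggs = response["aggregations"]
    return {
        "top_authors": {
            bucket["key"]: bucket["doc_count"]
            for bucket in aggs["authors"]["buckets"]
        },
        "avg_views": aggs["avg_views"]["value"],
    }


# Hand-written sample in the documented response shape, not real output.
sample_result = {
    "aggregations": {
        "authors": {
            "buckets": [
                {"key": "John Doe", "doc_count": 3},
                {"key": "Jane Roe", "doc_count": 1},
            ]
        },
        "avg_views": {"value": 125.0},
    }
}
summary = summarize_aggregations(sample_result)
```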
## Best Practices
- Design mappings carefully
- Use appropriate analyzers
- Implement proper sharding strategy
- Monitor cluster health
- Use bulk operations
- Implement pagination with search_after
- Cache frequently used queries
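Several of these practices come together when the index is created. The sketch below builds a create-index body with explicit shard and replica counts and a custom analyzer. The analyzer name `folded_text`, the helper `build_index_body`, and the default counts are assumptions for illustration, not recommendations for a specific workload.

```python
def build_index_body(primary_shards=1, replicas=1):
    """Build a create-index body with explicit settings and mappings."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
            "analysis": {
                "analyzer": {
                    # Hypothetical analyzer: lowercase and strip accents.
                    "folded_text": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "asciifolding"],
                    }
                }
            },
        },
        "mappings": {
            "properties": {
                "title": {"type": "text", "analyzer": "folded_text"},
                "author": {"type": "keyword"},
            }
        },
    }


body = build_index_body(primary_shards=2, replicas=1)

# With a live cluster (sketch; the index name is illustrative):
#   es.indices.create(index="articles_v2", body=body)
```

The primary shard count is fixed at creation, which is one reason to settle it before indexing data; the replica count can be changed later.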
## Anti-Patterns
- ❌ Deep pagination with from/size
- ❌ Wildcard queries with a leading wildcard (no fixed prefix)
- ❌ No replica shards
- ❌ Over-sharding
- ❌ Not using filters for exact matches
- ❌ Ignoring cluster yellow/red status
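As an illustration of the filter point: clauses in `filter` context do not contribute to the relevance score and can be cached, so exact-value conditions belong there rather than in `must`. A minimal sketch, with `filtered_search` as a hypothetical helper:

```python
def filtered_search(text_field, text, exact_filters):
    """Score on full text in `must`; put exact-value clauses in `filter`."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {text_field: text}}],
                "filter": [
                    {"term": {field: value}}
                    for field, value in exact_filters.items()
                ],
            }
        }
    }


query_body = filtered_search("content", "elasticsearch", {"author": "John Doe"})
```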
## Resources
- Elasticsearch Guide: https://www.elastic.co/guide/
- ELK Stack: https://www.elastic.co/elk-stack
## Overview
This skill provides expert-level guidance on Elasticsearch, full-text search, and the ELK stack for building and maintaining production search systems. It focuses on index design, query patterns, aggregations, and operational best practices to deliver fast, relevant search experiences. Guidance covers mappings, analyzers, sharding, bulk operations, and common anti-patterns.
I inspect typical search workflows: index creation, document indexing (including bulk), query construction (match, bool, multi_match, fuzzy), and aggregation pipelines. I explain trade-offs for analyzers, sharding, and replica settings, and show how to optimize search and analytics performance. I also highlight operational checks such as cluster health, monitoring, and safe pagination strategies.
## FAQ
How do I avoid expensive deep pagination?
Use search_after for deep paging, ideally combined with a point-in-time (PIT) so that every page reads from the same consistent snapshot of the index. This avoids the cost of large from/size offsets.
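A minimal sketch of the search_after loop: each request reuses the `sort` values of the last hit from the previous page as its cursor. The names `iter_pages`, `search_fn`, and `fake_search` are hypothetical. `fake_search` is an in-memory stand-in so the loop can run without a cluster, and the request body is assumed to contain a `sort` clause with a unique tiebreaker.

```python
def iter_pages(search_fn, base_body, page_size=100):
    """Yield pages of hits, resuming each request after the last sort values."""
    body = dict(base_body, size=page_size)
    while True:
        hits = search_fn(body)["hits"]["hits"]
        if not hits:
            return
        yield hits
        # The last hit's sort values become the cursor for the next page.
        body = dict(body, search_after=hits[-1]["sort"])


def fake_search(body):
    """In-memory stand-in for a search call, sorted by one numeric value."""
    data = [{"_id": str(i), "sort": [i]} for i in range(1, 6)]
    after = body.get("search_after", [0])[0]
    remaining = [hit for hit in data if hit["sort"][0] > after]
    return {"hits": {"hits": remaining[: body["size"]]}}


pages = list(iter_pages(fake_search, {"sort": [{"views": "asc"}]}, page_size=2))
page_ids = [[hit["_id"] for hit in page] for page in pages]

# With a live cluster, search_fn could wrap the client instead (sketch):
#   search_fn = lambda body: es.search(index="articles", body=body)
```

Unlike from/size, the cost of each request does not grow with how far into the result set the cursor has moved.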
When should I reindex?
Reindex when mappings or analyzers change in ways that affect tokenization or field types; plan downtime or use zero-downtime reindex patterns with aliases.
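One common zero-downtime pattern is to have clients query an alias, reindex into a new index, then move the alias in a single atomic update. The sketch below only builds the alias actions; `alias_swap_actions` and the index names `articles_v1` and `articles_v2` are assumptions for illustration.

```python
def alias_swap_actions(alias, old_index, new_index):
    """Build one atomic alias update: detach the old index, attach the new."""
    return [
        {"remove": {"index": old_index, "alias": alias}},
        {"add": {"index": new_index, "alias": alias}},
    ]


swap = alias_swap_actions("articles", "articles_v1", "articles_v2")

# With a live cluster (sketch), after creating articles_v2 with the new mapping:
#   es.reindex(body={
#       "source": {"index": "articles_v1"},
#       "dest": {"index": "articles_v2"},
#   })
#   es.indices.update_aliases(body={"actions": swap})
```

Because both actions are sent in one request, searches against the alias never see a moment where it points at no index.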
How many shards should I create per index?
Choose shard count based on expected index size, not node count; avoid over-sharding and favor fewer larger shards that can be split later if needed.
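As a rough starting point, the shard count can be estimated from the expected index size and a target size per shard. The 30 GB default below is an assumption chosen from the commonly cited range of roughly 10 to 50 GB per shard, and `suggest_primary_shards` is a hypothetical helper, not an Elasticsearch API.

```python
import math


def suggest_primary_shards(expected_index_gb, target_shard_gb=30):
    """Estimate primary shards from expected index size (at least one)."""
    if expected_index_gb < 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be non-negative, target must be positive")
    return max(1, math.ceil(expected_index_gb / target_shard_gb))


small_index = suggest_primary_shards(5)
large_index = suggest_primary_shards(200)
```

Treat the result as a first estimate to validate with benchmarks on real data and query patterns.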