
news-aggregator-skill skill

/skills/cclank/news-aggregator-skill

This skill fetches real-time tech and finance news from multiple sources, filters by keywords, and provides deep analyses for daily briefings.

This is most likely a fork of the news-aggregator-skill skill from cclank
npx playbooks add skill openclaw/skills --skill news-aggregator-skill

Review the files below or copy the command above to add this skill to your agents.

---
name: news-aggregator-skill
description: "Comprehensive news aggregator that fetches, filters, and deeply analyzes real-time content from 8 major sources: Hacker News, GitHub Trending, Product Hunt, 36Kr, Tencent News, WallStreetCN, V2EX, and Weibo. Best for 'daily scans', 'tech news briefings', 'finance updates', and 'deep interpretations' of hot topics."
---

# News Aggregator Skill

Fetch real-time hot news from multiple sources.

## Tools

### fetch_news.py

**Usage:**

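### Single Source (Limit 10)
```bash
# Example: up to 10 items from one source (Hacker News here)
python3 scripts/fetch_news.py --source hackernews --limit 10
```

### Global Scan (Option 12) - **Broad Fetch Strategy**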
> **NOTE**: This strategy is specifically for the "Global Scan" scenario where we want to catch all trends.

```bash
# 1. Fetch broadly (Massive pool for Semantic Filtering)
python3 scripts/fetch_news.py --source all --limit 15 --deep

# 2. SEMANTIC FILTERING:
# Agent manually filters the broad list (approx 120 items) for user's topics.
```

### Single Source & Combinations (Smart Keyword Expansion)
**CRITICAL**: You MUST automatically expand the user's simple keywords to cover the entire domain field.
*   User: "AI" -> Agent uses: `--keyword "AI,LLM,GPT,Claude,Generative,Machine Learning,RAG,Agent"`
*   User: "Android" -> Agent uses: `--keyword "Android,Kotlin,Google,Mobile,App"`
*   User: "Finance" -> Agent uses: `--keyword "Finance,Stock,Market,Economy,Crypto,Gold"`

```bash
# Example: User asked for "AI news from HN" (Note the expanded keywords)
python3 scripts/fetch_news.py --source hackernews --limit 20 --keyword "AI,LLM,GPT,DeepSeek,Agent" --deep
```
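
The expansion rule above can be sketched as a small lookup table. This is a minimal illustration, not part of `fetch_news.py`: the names `EXPANSIONS` and `expand_keywords` are hypothetical, and the table reuses only the three expansions listed in this section. Topics without an entry pass through unchanged.

```python
# Hypothetical helper for illustration; fetch_news.py does not ship this.
EXPANSIONS = {
    "ai": ["AI", "LLM", "GPT", "Claude", "Generative", "Machine Learning", "RAG", "Agent"],
    "android": ["Android", "Kotlin", "Google", "Mobile", "App"],
    "finance": ["Finance", "Stock", "Market", "Economy", "Crypto", "Gold"],
}


def expand_keywords(topic: str) -> str:
    """Return the comma-separated value to pass to --keyword."""
    cleaned = topic.strip()
    # Unknown topics are treated as specific terms and left as-is.
    terms = EXPANSIONS.get(cleaned.lower(), [cleaned])
    return ",".join(terms)


if __name__ == "__main__":
    print(expand_keywords("AI"))
    print(expand_keywords("DeepSeek"))
```

In practice the agent expands keywords with its own judgment; a static table like this only covers the topics someone has already thought of.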

### Specific Keyword Search
Pass a single, unexpanded term to `--keyword` only when it is very specific and unique (e.g., "DeepSeek", "OpenAI").
```bash
python3 scripts/fetch_news.py --source all --limit 10 --keyword "DeepSeek" --deep
```

**Arguments:**

- `--source`: One of `hackernews`, `weibo`, `github`, `36kr`, `producthunt`, `v2ex`, `tencent`, `wallstreetcn`, `all`.
- `--limit`: Max items per source (default 10).
- `--keyword`: Comma-separated filters (e.g. "AI,GPT").
- `--deep`: **[NEW]** Enable deep fetching. Downloads and extracts the main text content of the articles.

**Output:**
JSON array. If `--deep` is used, each item also contains a `content` field holding the extracted article text.
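
As a rough sketch of how a caller might drive the script from Python: the flags and source names below come from the argument list above, while `build_command`, `parse_output`, and `fetch` are hypothetical helper names. It also assumes the JSON array is printed to standard output, which this document does not state explicitly.

```python
import json
import subprocess

# Source names as listed under Arguments.
VALID_SOURCES = {
    "hackernews", "weibo", "github", "36kr", "producthunt",
    "v2ex", "tencent", "wallstreetcn", "all",
}


def build_command(source, limit=10, keyword=None, deep=False):
    """Assemble the argv list for scripts/fetch_news.py."""
    if source not in VALID_SOURCES:
        raise ValueError(f"unknown source: {source}")
    cmd = ["python3", "scripts/fetch_news.py", "--source", source, "--limit", str(limit)]
    if keyword:
        cmd += ["--keyword", keyword]
    if deep:
        cmd.append("--deep")
    return cmd


def parse_output(stdout_text):
    """Parse the script output, which is documented to be a JSON array."""
    items = json.loads(stdout_text)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array")
    return items


def fetch(source, **kwargs):
    """Run the script and return its items (assumes JSON on stdout)."""
    result = subprocess.run(
        build_command(source, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return parse_output(result.stdout)
```

Apart from `content`, the item fields are not documented here, so callers should inspect the returned dictionaries rather than rely on particular keys.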

## Interactive Menu

When the user says **"news-aggregator-skill 如意如意"** (or similar "menu/help" triggers):
1.  **READ** the content of `templates.md` in the skill directory.
2.  **DISPLAY** the list of available commands to the user exactly as they appear in the file.
3.  **GUIDE** the user to select a number or copy the command to execute.

### Smart Time Filtering & Reporting (CRITICAL)
If the user requests a specific time window (e.g., "past X hours") and the results are sparse (< 5 items):
1.  **Prioritize User Window**: First, list all items that strictly fall within the user's requested time (Time < X).
2.  **Smart Fill**: If the list is short, you MUST include high-value/high-heat items from a wider range (e.g. past 24h) to ensure the report provides at least 5 meaningful insights.
3.  **Annotation**: Clearly mark these older items (e.g., "⚠️ 18h ago", "🔥 24h Hot") so the user knows they are supplementary.
4.  **High Value**: Always prioritize "SOTA", "Major Release", or "High Heat" items even if they slightly exceed the time window.
5.  **GitHub Trending Exception**: For purely list-based sources like **GitHub Trending**, strictly return the valid items from the fetched list (e.g. Top 10). **List ALL fetched items**. Do **NOT** perform "Smart Fill".
    *   **Deep Analysis (Required)**: For EACH item, you **MUST** leverage your AI capabilities to analyze:
        *   **Core Value (核心价值)**: What specific problem does it solve? Why is it trending?
        *   **Inspiration (启发思考)**: What technical or product insights can be drawn?
        *   **Scenarios (场景标签)**: 3-5 keywords (e.g. `#RAG #LocalFirst #Rust`).
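
The smart-fill rule above can be sketched in code. This is an illustration under stated assumptions, not the skill's implementation: the item keys `age_hours` and `heat`, the `supplementary` and `note` fields, and the function name `smart_fill` are all hypothetical. List-based sources such as GitHub Trending would bypass this function entirely.

```python
MIN_ITEMS = 5  # sparse-result threshold from the rule above


def smart_fill(items, window_hours, wide_hours=24):
    """Keep in-window items, then top up with the hottest older ones.

    Each item is a dict with hypothetical 'age_hours' and 'heat' keys.
    """
    in_window = [
        dict(it, supplementary=False)
        for it in items
        if it["age_hours"] < window_hours
    ]
    if len(in_window) >= MIN_ITEMS:
        return in_window

    # Candidates outside the requested window but inside the wider range.
    older = [it for it in items if window_hours <= it["age_hours"] <= wide_hours]
    older.sort(key=lambda it: it["heat"], reverse=True)

    needed = MIN_ITEMS - len(in_window)
    fill = [
        # Annotate so the reader can tell these are supplementary.
        dict(it, supplementary=True, note=f"⚠️ {it['age_hours']}h ago")
        for it in older[:needed]
    ]
    return in_window + fill
```

If the wider range still holds too few items, the sketch returns fewer than five rather than widening further; whether to widen again is a judgment the agent makes.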

### Response Guidelines (CRITICAL)

**Format & Style:**
- **Language**: Simplified Chinese (简体中文).
- **Style**: Magazine/Newsletter style (e.g., "The Economist" or "Morning Brew" vibe). Professional, concise, yet engaging.
- **Structure**:
    - **Global Headlines**: Top 3-5 most critical stories across all domains.
    - **Tech & AI**: Specific section for AI, LLM, and Tech items.
    - **Finance / Social**: Other strong categories if relevant.
- **Item Format**:
    - **Title**: **MUST be a Markdown Link** to the original URL.
        - ✅ Correct: `### 1. [OpenAI Releases GPT-5](https://...)`
        - ❌ Incorrect: `### 1. OpenAI Releases GPT-5`
    - **Metadata Line**: Must include Source, **Time/Date**, and Heat/Score.
    - **1-Liner Summary**: A punchy, "so what?" summary.
    - **Deep Interpretation (Bulleted)**: 2-3 bullet points explaining *why* this matters, technical details, or context. (Required for "Deep Scan").
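
The item format above might be rendered along these lines. The function name `format_item` and the dictionary keys (`title`, `url`, `source`, `time`, `heat`, `summary`, `insights`) are hypothetical, as is the `|` separator in the metadata line; only the linked-title heading follows the "Correct" example given above.

```python
def format_item(index, item):
    """Render one report entry as Markdown (field names are assumptions)."""
    lines = [
        # Title must be a Markdown link to the original URL.
        f"### {index}. [{item['title']}]({item['url']})",
        # Metadata line: source, time/date, and heat/score.
        f"{item['source']} | {item['time']} | {item['heat']}",
        # One-line "so what?" summary.
        item["summary"],
    ]
    # Deep interpretation bullets, when present.
    lines += [f"- {point}" for point in item.get("insights", [])]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_item(1, {
        "title": "OpenAI Releases GPT-5",
        "url": "https://example.com/article",  # placeholder URL
        "source": "Hacker News",
        "time": "2h ago",
        "heat": "512 points",
        "summary": "A placeholder one-line summary.",
        "insights": ["Placeholder insight one.", "Placeholder insight two."],
    }))
```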

**Output Artifact:**
- Always save the full report to the `reports/` directory with a timestamped filename (e.g., `reports/hn_news_YYYYMMDD_HHMM.md`).
- Present the full report content to the user in the chat.
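
A minimal sketch of the timestamped filename, assuming the `YYYYMMDD_HHMM` pattern maps onto `strftime("%Y%m%d_%H%M")` in local time. The helper names `report_path` and `save_report` are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def report_path(prefix, now=None, root="reports"):
    """Build a path like reports/hn_news_YYYYMMDD_HHMM.md."""
    now = now or datetime.now()
    return Path(root) / f"{prefix}_{now.strftime('%Y%m%d_%H%M')}.md"


def save_report(markdown, prefix, root="reports"):
    """Write the report, creating the reports directory if needed."""
    path = report_path(prefix, root=root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(markdown, encoding="utf-8")
    return path
```

Two reports generated within the same minute would share a filename under this scheme, so the later one overwrites the earlier.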

Overview

This skill is a comprehensive news aggregator that fetches, filters, and deeply analyzes real-time content from eight major sources: Hacker News, GitHub Trending, Product Hunt, 36Kr, Tencent News, WallStreetCN, V2EX, and Weibo. It combines broad discovery with semantic filtering, keyword expansion, and optional deep content extraction to produce compact, actionable reports. The skill outputs structured JSON and supports deep AI-driven interpretation for each item.

How this skill works

The skill crawls the selected sources and returns a JSON array of items; with the --deep flag, it also downloads each article and extracts the full text into a content field. When you request a broad topic, the agent automatically expands your keywords into domain-relevant variants, and when a requested time window yields fewer than five items, smart filling adds high-value items from a wider range. For each fetched item, the skill produces a short summary and an AI-generated deep interpretation covering core value, technical or product insights, and scenario tags.

When to use it

  • Daily briefings to catch top tech and finance headlines.
  • Deep scans when you need context and analysis for trending topics.
  • Targeted searches using specific unique keywords (e.g., product names or vulnerabilities).
  • Weekly or ad-hoc investor/market updates focused on Chinese and global sources.
  • GitHub Trending snapshots when you want raw lists of top projects without semantic filling.

Best practices

  • Use --deep for research or briefing outputs to get full-text context and better AI interpretation.
  • Provide short topic keywords; the agent will expand them, but very specific names should be used with --keyword for precision.
  • When requesting a time window, allow the smart-fill behavior to surface high-value items if results are sparse.
  • For list-heavy sources like GitHub Trending, expect strict top-N returns and no semantic augmentation.
  • Save or archive generated reports for audit and follow-up; filenames include timestamps for easy lookup.

Example use cases

  • Morning tech briefing: fetch top 20 items across Hacker News, Product Hunt, and GitHub with deep analysis for a team standup.
  • Investor alert: scan WallStreetCN, Tencent News, and 36Kr for finance-related keywords, prioritize major releases and hot signals.
  • Product scouting: use expanded keywords for AI or Android to discover relevant launches and inspiration across sources.
  • Incident discovery: search unique vulnerability names with --keyword and --deep to collect full-text evidence.
  • Weekly archive: run a global scan and store the full report JSON and markdown summary for research history.

FAQ

What output formats are available?

Primary output is JSON; when --deep is used each item includes a content field with extracted text. Reports can also be saved as timestamped markdown.

How does keyword expansion work?

Common topic keywords are auto-expanded into domain-relevant synonyms and related terms to improve recall (e.g., "AI" expands to LLM, GPT, generative, machine learning).

How does time-window smart filling behave?

If fewer than five items fall inside the requested window, the agent adds high-value or high-heat items from a wider window and clearly marks them as supplementary.