
system-monitor skill

/skills/zerofire03/system-monitor

This skill evaluates local server health by reporting CPU, RAM, and GPU usage and availability.

npx playbooks add skill openclaw/skills --skill system-monitor

Review the files below or copy the command above to add this skill to your agents.

SKILL.md
---
name: system_monitor
description: Check the current CPU, RAM, and GPU status of the local server.
---

# System Monitor Skill

Use this skill when the user asks about the server's health, hardware usage, or GPU status.

## Usage
To check the system status, run the local monitor script:
`./monitor.sh`

Overview

This skill checks the current CPU, RAM, and GPU status of the local server and reports a point-in-time snapshot of hardware utilization, so you can quickly assess load, memory pressure, and GPU availability. It runs on the host it monitors and produces a short summary rather than a continuous feed.

How this skill works

The skill runs a local monitor script (`./monitor.sh`) that collects metrics from the operating system and GPU drivers. It inspects CPU usage, per-core load, memory and swap usage, and GPU utilization and temperature via standard system interfaces (e.g., procfs, psutil-like APIs, and nvidia-smi where available). The output is formatted into a short status report highlighting bottlenecks and resource availability.
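The contents of `monitor.sh` are not shown on this page, so the following is only a minimal sketch of how such a script might gather the metrics described above on a Linux host. The function names (`cpu_load`, `mem_summary`, `gpu_summary`, `report`) are hypothetical, and the real script may collect more (per-core load, swap) or use different tools.

```sh
#!/bin/sh
# Hypothetical sketch; the actual monitor.sh may differ.

# CPU: 1-, 5-, and 15-minute load averages from procfs (Linux only).
cpu_load() {
    if [ -r /proc/loadavg ]; then
        cut -d ' ' -f 1-3 /proc/loadavg
    else
        echo "unavailable"
    fi
}

# RAM: total and available memory in MiB (/proc/meminfo reports kB).
mem_summary() {
    if [ -r /proc/meminfo ]; then
        awk '/^MemTotal:/ {t=$2} /^MemAvailable:/ {a=$2}
             END {printf "total=%dMiB available=%dMiB\n", t/1024, a/1024}' /proc/meminfo
    else
        echo "unavailable"
    fi
}

# GPU: name, utilization, memory, and temperature via nvidia-smi, if installed.
gpu_summary() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi \
            --query-gpu=name,utilization.gpu,memory.used,memory.total,temperature.gpu \
            --format=csv,noheader || echo "nvidia-smi failed"
    else
        echo "no nvidia-smi found"
    fi
}

# Print the three sections as one short status report.
report() {
    echo "CPU load (1/5/15 min): $(cpu_load)"
    echo "Memory: $(mem_summary)"
    echo "GPU: $(gpu_summary)"
}
```

Calling `report` prints one line per subsystem. On hosts without procfs or without `nvidia-smi`, the affected section falls back to a placeholder instead of failing.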

When to use it

  • Before starting resource-heavy jobs to confirm available CPU, RAM, and GPU capacity.
  • When investigating slow application performance or unexpected load spikes.
  • During deployment or CI runs to verify the target server meets hardware requirements.
  • For routine health checks and capacity planning to avoid outages.
  • After system updates or driver changes that might affect GPU behavior.

Best practices

  • Run the monitor from a user account with permission to query GPU drivers to get complete GPU details.
  • Use the tool before launching batch jobs and log the output for historical comparison.
  • Combine periodic snapshots with longer-term monitoring for trend analysis rather than relying on single samples.
  • If GPU metrics are missing, verify that the vendor drivers are installed and that nvidia-smi or an equivalent tool is available.
  • Interpret short-lived spikes cautiously: correlate them with process lists or job schedules to find the cause.
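One way to keep output for historical comparison, as suggested above, is to append each snapshot to a log file under a UTC timestamp. This is a sketch only: `log_snapshot` is a hypothetical helper and the log path in the usage note is an assumption, not part of the skill.

```sh
# log_snapshot LOGFILE COMMAND [ARGS...]
# Append a UTC timestamp header and the command's output (stdout and stderr)
# to LOGFILE. Hypothetical helper, not shipped with the skill.
log_snapshot() (
    logfile=$1
    shift
    {
        printf '=== %s ===\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
        "$@"
    } >>"$logfile" 2>&1
)
```

For example, `log_snapshot "$HOME/system-monitor.log" ./monitor.sh` could be run before each batch job, or from a scheduler, to build up a series of snapshots that can be compared later.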

Example use cases

  • Check CPU and memory usage prior to submitting a parallel training job to avoid OOM or CPU contention.
  • Quickly determine whether a GPU is idle or in use before assigning a compute task.
  • Capture a system snapshot during a performance incident to support root-cause analysis.
  • Validate that a newly provisioned server instance exposes the expected CPU, RAM, and GPU resources.
  • Run a pre-deployment health check to ensure resources meet application minimums.
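A pre-job or pre-deployment check like those above can be reduced to comparing available memory against a minimum. The sketch below assumes a Linux host with `/proc/meminfo`. The helper names and the threshold in the usage note are hypothetical, and the skill itself may report memory differently.

```sh
# mem_available_mib [MEMINFO_FILE]
# Print MemAvailable in MiB; exit non-zero if the field is missing.
mem_available_mib() {
    awk '/^MemAvailable:/ {printf "%d\n", $2/1024; found=1}
         END {exit !found}' "${1:-/proc/meminfo}"
}

# require_mem MIN_MIB [MEMINFO_FILE]
# Succeed only if at least MIN_MIB of memory is available.
# Exit status: 0 = enough, 1 = not enough, 2 = memory info unreadable.
require_mem() (
    min=$1
    avail=$(mem_available_mib "${2:-/proc/meminfo}") || {
        echo "cannot read memory info" >&2
        exit 2
    }
    if [ "$avail" -ge "$min" ]; then
        echo "ok: ${avail} MiB available (need ${min})"
    else
        echo "insufficient: ${avail} MiB available (need ${min})" >&2
        exit 1
    fi
)
```

A job launcher could then gate on it, for instance `require_mem 8192 && ./train.sh`, where `train.sh` stands in for whatever job is being started.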

FAQ

What does the monitor script require to report GPU stats?

GPU stats typically require vendor drivers and command-line tools such as nvidia-smi; ensure those are installed and the executing user can access them.

Can this skill run remotely?

This skill is designed for local execution on the target server. For remote checks, run the monitor over a secure shell session or integrate with remote monitoring tooling.
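As a rough illustration of the SSH option, the helper below runs the monitor script on another host and prints its output locally. The function name is hypothetical, and the install directory is an assumption, since the source does not say where `monitor.sh` lives on the target server.

```sh
# remote_snapshot HOST DIR
# Run ./monitor.sh inside DIR on HOST over SSH.
# DIR (the skill's install directory on the remote host) is an assumption.
remote_snapshot() (
    host=$1
    dir=$2
    ssh "$host" "cd '$dir' && ./monitor.sh"
)
```

Usage might look like `remote_snapshot admin@gpu-box /opt/skills/system-monitor`, with both arguments replaced by values for your environment. The remote account still needs permission to query the GPU drivers.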