---
name: glm-output
description: Read and process GLM output files. Use when you need to extract temperature data from NetCDF output, convert depth coordinates, or calculate RMSE against observations.
license: MIT
---
# GLM Output Guide
## Overview
GLM produces NetCDF output containing simulated water temperature profiles. Processing this output requires understanding GLM's height-based coordinate system and matching simulated records to observations.
## Output File
After running GLM, results are in `output/output.nc`:
| Variable | Description | Shape |
|----------|-------------|-------|
| `time` | Hours since simulation start | (n_times,) |
| `z` | Height from lake bottom (not depth!) | (n_times, n_layers, 1, 1) |
| `temp` | Water temperature (°C) | (n_times, n_layers, 1, 1) |
## Reading Output with Python
```python
from netCDF4 import Dataset

# Variables are returned as masked arrays; inactive layers come back masked
nc = Dataset('output/output.nc', 'r')
time = nc.variables['time'][:]   # hours since simulation start
z = nc.variables['z'][:]         # layer height above the lake bottom (m)
temp = nc.variables['temp'][:]   # water temperature (°C)
nc.close()
```
## Coordinate Conversion
**Important**: GLM `z` is height from lake bottom, not depth from surface.
```python
# Convert to depth from surface
# Set LAKE_DEPTH based on lake_depth in &init_profiles section of glm3.nml
LAKE_DEPTH = <lake_depth_from_nml>
depth_from_surface = LAKE_DEPTH - z
```
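As a quick sanity check, the conversion can be applied to a whole array at once. The values below are hypothetical: a 25 m lake and three layer heights taken from one time step of `z`.

```python
import numpy as np

# Hypothetical example values: 25 m lake, three layer heights (m above bottom)
LAKE_DEPTH = 25.0
z_step = np.array([2.5, 12.0, 24.5])

depth_from_surface = LAKE_DEPTH - z_step
# The ordering flips relative to z: the layer 24.5 m above the bottom
# sits 0.5 m below the surface.
```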
## Complete Output Processing
```python
from netCDF4 import Dataset
import numpy as np
import pandas as pd
from datetime import datetime

def read_glm_output(nc_path, lake_depth):
    """Read GLM output into a tidy dataframe of (datetime, depth, temp_sim)."""
    nc = Dataset(nc_path, 'r')
    time = nc.variables['time'][:]
    z = nc.variables['z'][:]
    temp = nc.variables['temp'][:]
    nc.close()
    # Simulation start; must match the `start` entry in the &time
    # section of glm3.nml
    start_date = datetime(2009, 1, 1, 12, 0, 0)
    records = []
    for t_idx in range(len(time)):
        hours = float(time[t_idx])
        date = pd.Timestamp(start_date) + pd.Timedelta(hours=hours)
        heights = z[t_idx, :, 0, 0]
        temps = temp[t_idx, :, 0, 0]
        for d_idx in range(len(heights)):
            h_val = heights[d_idx]
            t_val = temps[d_idx]
            # Skip masked (inactive) layers
            if np.ma.is_masked(h_val) or np.ma.is_masked(t_val):
                continue
            depth = lake_depth - float(h_val)  # height above bottom -> depth
            if 0 <= depth <= lake_depth:
                records.append({
                    'datetime': date,
                    'depth': round(depth),
                    'temp_sim': float(t_val)
                })
    df = pd.DataFrame(records)
    # Average duplicate layers that round to the same integer depth
    df = df.groupby(['datetime', 'depth']).agg({'temp_sim': 'mean'}).reset_index()
    return df
```
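Rather than hardcoding `start_date`, the reference timestamp can often be recovered from the `time` variable's CF-style `units` attribute (e.g. `"hours since 2009-01-01 12:00:00"`); `netCDF4.num2date` can also convert time values directly. A minimal string-parsing sketch, assuming that attribute format:

```python
from datetime import datetime

def parse_start_date(time_units):
    """Parse the reference timestamp out of a CF-style units string.

    Assumes the common GLM form "hours since YYYY-MM-DD HH:MM:SS".
    """
    ref = time_units.split('since', 1)[1].strip()
    return datetime.strptime(ref, '%Y-%m-%d %H:%M:%S')
```

In practice, read `nc.variables['time'].units` and pass it to this helper instead of hardcoding the date.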
## Reading Observations
```python
def read_observations(obs_path):
    """Load observed temperatures; expects datetime, depth, and temp columns."""
    df = pd.read_csv(obs_path)
    df['datetime'] = pd.to_datetime(df['datetime'])
    # Round depths to integers so they match the simulated records
    df['depth'] = df['depth'].round().astype(int)
    df = df.rename(columns={'temp': 'temp_obs'})
    return df[['datetime', 'depth', 'temp_obs']]
```
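A tiny in-memory example of the expected CSV layout, run through the same steps as the reader above. The column names (`datetime`, `depth`, `temp`) and values here are illustrative assumptions; adjust them to your observation file.

```python
import io
import pandas as pd

# Hypothetical two-row observation file
csv_text = """datetime,depth,temp
2009-01-02 12:00:00,0.4,4.1
2009-01-02 12:00:00,9.6,4.0
"""
df = pd.read_csv(io.StringIO(csv_text))
df['datetime'] = pd.to_datetime(df['datetime'])
df['depth'] = df['depth'].round().astype(int)  # 0.4 -> 0, 9.6 -> 10
df = df.rename(columns={'temp': 'temp_obs'})
```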
## Calculating RMSE
```python
def calculate_rmse(sim_df, obs_df):
    """RMSE between simulated and observed temps on matched (datetime, depth)."""
    merged = pd.merge(obs_df, sim_df, on=['datetime', 'depth'], how='inner')
    if len(merged) == 0:
        # Sentinel value: no simulated record matched any observation
        return 999.0
    return float(np.sqrt(np.mean((merged['temp_sim'] - merged['temp_obs'])**2)))

# Usage: take lake_depth from the &init_profiles section of glm3.nml
sim_df = read_glm_output('output/output.nc', lake_depth=25)
obs_df = read_observations('field_temp_oxy.csv')
rmse = calculate_rmse(sim_df, obs_df)
print(f"RMSE: {rmse:.2f} °C")
```
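The RMSE logic can be verified on synthetic frames where the answer is known by construction: if every simulated value differs from the observed value by exactly 1 °C, the RMSE must be exactly 1.0. The frames below are made-up stand-ins for `sim_df`/`obs_df`.

```python
import numpy as np
import pandas as pd

# Two matched (datetime, depth) pairs with a constant 1 °C error
when = pd.Timestamp('2009-01-02 12:00:00')
obs = pd.DataFrame({'datetime': [when, when], 'depth': [0, 10],
                    'temp_obs': [4.0, 4.0]})
sim = pd.DataFrame({'datetime': [when, when], 'depth': [0, 10],
                    'temp_sim': [5.0, 5.0]})

merged = pd.merge(obs, sim, on=['datetime', 'depth'], how='inner')
rmse = float(np.sqrt(np.mean((merged['temp_sim'] - merged['temp_obs'])**2)))
```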
## Common Issues
| Issue | Cause | Solution |
|-------|-------|----------|
| RMSE very high | Wrong depth conversion | Use `lake_depth - z`, not `z` directly |
| No matched observations | Datetime mismatch | Check datetime format consistency |
| Empty merged dataframe | Depth rounding issues | Round depths to integers |
## Best Practices
- Check `lake_depth` in `&init_profiles` section of `glm3.nml`
- Always convert z to depth from surface before comparing with observations
- Round depths to integers for matching
- Group by datetime and depth to handle duplicate records
- Check number of matched observations after merge
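To act on the last point, a left merge with `indicator=True` shows how many observation rows failed to match any simulated record, which quickly distinguishes datetime mismatches from depth-rounding problems. The frames below are synthetic stand-ins for `obs_df`/`sim_df`.

```python
import pandas as pd

# Hypothetical frames: two observations, only one matching simulated record
when = pd.Timestamp('2009-01-02 12:00:00')
obs_df = pd.DataFrame({'datetime': [when, when], 'depth': [0, 10],
                       'temp_obs': [4.0, 4.0]})
sim_df = pd.DataFrame({'datetime': [when], 'depth': [0],
                       'temp_sim': [5.0]})

# indicator=True adds a _merge column: 'both' or 'left_only' per row
checked = pd.merge(obs_df, sim_df, on=['datetime', 'depth'],
                   how='left', indicator=True)
n_unmatched = int((checked['_merge'] == 'left_only').sum())
```

Inspect the `left_only` rows to see which datetimes or depths are falling through.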