home / skills / benchflow-ai / skillsbench / obspy-data-api
/tasks/seismic-phase-picking/environment/skills/obspy-data-api
This skill explains how to access and manipulate ObsPy waveform, event, and inventory data via the core Data API for seismology workflows.
npx playbooks add skill benchflow-ai/skillsbench --skill obspy-data-apiReview the files below or copy the command above to add this skill to your agents.
---
name: obspy-data-api
description: An overview of the core data API of ObsPy, a Python framework for processing seismological data. It is useful for parsing common seismological file formats, or manipulating custom data into standard objects for downstream use cases such as ObsPy's signal processing routines or SeisBench's modeling API.
---
# ObsPy Data API
## Waveform Data
### Summary
Seismograms of various formats (e.g. SAC, MiniSEED, GSE2, SEISAN, Q, etc.) can be imported into a `Stream` object using the `read()` function.
Streams are list-like objects which contain multiple `Trace` objects, i.e. gap-less continuous time series and related header/meta information.
Each Trace object has the attribute `data` pointing to a NumPy `ndarray` of the actual time series and the attribute `stats` which contains all meta information in a dict-like `Stats` object. Both attributes `starttime` and `endtime` of the Stats object are `UTCDateTime` objects.
A multitude of helper methods are attached to `Stream` and `Trace` objects for handling and modifying the waveform data.
### Stream and Trace Class Structure
**Hierarchy:** `Stream` → `Trace` (multiple)
**Trace - DATA:**
- `data` → NumPy array
- `stats`:
- `network`, `station`, `location`, `channel` — Determine physical location and instrument
- `starttime`, `sampling_rate`, `delta`, `endtime`, `npts` — Interrelated
**Trace - METHODS:**
- `taper()` — Tapers the data.
- `filter()` — Filters the data.
- `resample()` — Resamples the data in the frequency domain.
- `integrate()` — Integrates the data with respect to time.
- `remove_response()` — Deconvolves the instrument response.
### Example
A `Stream` with an example seismogram can be created by calling `read()` without any arguments. Local files can be read by specifying the filename, files stored on http servers (e.g. at https://examples.obspy.org) can be read by specifying their URL.
```python
>>> from obspy import read
>>> st = read()
>>> print(st)
3 Trace(s) in Stream:
BW.RJOB..EHZ | 2009-08-24T00:20:03.000000Z - ... | 100.0 Hz, 3000 samples
BW.RJOB..EHN | 2009-08-24T00:20:03.000000Z - ... | 100.0 Hz, 3000 samples
BW.RJOB..EHE | 2009-08-24T00:20:03.000000Z - ... | 100.0 Hz, 3000 samples
>>> tr = st[0]
>>> print(tr)
BW.RJOB..EHZ | 2009-08-24T00:20:03.000000Z - ... | 100.0 Hz, 3000 samples
>>> tr.data
array([ 0. , 0.00694644, 0.07597424, ..., 1.93449584,
0.98196204, 0.44196924])
>>> print(tr.stats)
network: BW
station: RJOB
location:
channel: EHZ
starttime: 2009-08-24T00:20:03.000000Z
endtime: 2009-08-24T00:20:32.990000Z
sampling_rate: 100.0
delta: 0.01
npts: 3000
calib: 1.0
...
>>> tr.stats.starttime
UTCDateTime(2009, 8, 24, 0, 20, 3)
```
## Event Metadata
Event metadata are handled in a hierarchy of classes closely modelled after the de-facto standard format [QuakeML](https://quake.ethz.ch/quakeml/). See `read_events()` and `Catalog.write()` for supported formats.
### Event Class Structure
**Hierarchy:** `Catalog` → `events` → `Event` (multiple)
**Event contains:**
- `origins` → `Origin` (multiple)
- `latitude`, `longitude`, `depth`, `time`, ...
- `magnitudes` → `Magnitude` (multiple)
- `mag`, `magnitude_type`, ...
- `picks`
- `focal_mechanisms`
## Station Metadata
Station metadata are handled in a hierarchy of classes closely modelled after the de-facto standard format [FDSN StationXML](https://www.fdsn.org/xml/station/) which was developed as a human readable XML replacement for Dataless SEED. See `read_inventory()` and `Inventory.write()` for supported formats.
### Inventory Class Structure
**Hierarchy:** `Inventory` → `networks` → `Network` → `stations` → `Station` → `channels` → `Channel`
**Network:**
- `code`, `description`, ...
**Station:**
- `code`, `latitude`, `longitude`, `elevation`, `start_date`, `end_date`, ...
**Channel:**
- `code`, `location_code`, `latitude`, `longitude`, `elevation`, `depth`, `dip`, `azimuth`, `sample_rate`, `start_date`, `end_date`, `response`, ...
## Classes & Functions
| Class/Function | Description |
|----------------|-------------|
| `read` | Read waveform files into an ObsPy `Stream` object. |
| `Stream` | List-like object of multiple ObsPy `Trace` objects. |
| `Trace` | An object containing data of a continuous series, such as a seismic trace. |
| `Stats` | A container for additional header information of an ObsPy `Trace` object. |
| `UTCDateTime` | A UTC-based datetime object. |
| `read_events` | Read event files into an ObsPy `Catalog` object. |
| `Catalog` | Container for `Event` objects. |
| `Event` | Describes a seismic event which does not necessarily need to be a tectonic earthquake. |
| `read_inventory` | Function to read inventory files. |
| `Inventory` | The root object of the `Network` → `Station` → `Channel` hierarchy. |
## Modules
| Module | Description |
|--------|-------------|
| `obspy.core.trace` | Module for handling ObsPy `Trace` and `Stats` objects. |
| `obspy.core.stream` | Module for handling ObsPy `Stream` objects. |
| `obspy.core.utcdatetime` | Module containing a UTC-based datetime class. |
| `obspy.core.event` | Module handling event metadata. |
| `obspy.core.inventory` | Module for handling station metadata. |
| `obspy.core.util` | Various utilities for ObsPy. |
| `obspy.core.preview` | Tools for creating and merging previews. |
This skill provides an overview of the core data API of ObsPy, a Python framework for processing seismological data. It explains waveform, event, and station metadata objects and the common functions used to read and manipulate them. The content is focused on how to parse common seismological file formats and convert custom data into standard ObsPy objects for downstream processing.
The skill describes three main object hierarchies: Stream → Trace for waveform time series, Catalog → Event for event metadata, and Inventory → Network → Station → Channel for station metadata. It explains how read(), read_events(), and read_inventory() parse files into those objects, and how Trace.data (NumPy arrays) and Trace.stats (metadata with UTCDateTime) are used. It also summarizes key methods available on Trace and Stream for filtering, tapering, resampling, integrating, and removing instrument response.
What object holds waveform samples and metadata?
A Trace holds waveform samples in data (NumPy array) and metadata in stats, and multiple Traces form a Stream.
How do I get time information for a trace?
Use tr.stats.starttime and tr.stats.endtime which are UTCDateTime objects; sampling_rate and delta describe time spacing.