materials-database-querier skill

/plugins/babysitter/skills/babysit/process/specializations/domains/science/nanotechnology/skills/materials-database-querier

This skill provides unified access to multiple materials databases, retrieving structure and property data across repositories to streamline materials discovery.

npx playbooks add skill a5c-ai/babysitter --skill materials-database-querier

SKILL.md

---
name: materials-database-querier
description: Materials database query skill for accessing structure and property data from multiple repositories
allowed-tools:
  - Read
  - Write
  - Glob
  - Grep
  - Bash
metadata:
  specialization: nanotechnology
  domain: science
  category: computational
  priority: high
  phase: 6
  tools-libraries:
    - pymatgen
    - Materials Project API
    - AFLOW REST API
---

# Materials Database Querier

## Purpose

The Materials Database Querier skill provides unified access to multiple materials databases for structure and property retrieval, enabling comprehensive materials search and data aggregation across repositories.

## Capabilities

- Materials Project API integration
- AFLOW database queries
- ICSD/CSD structure retrieval
- NOMAD repository access
- Cross-database searches
- Property aggregation and comparison

## Usage Guidelines

### Database Query Workflow

1. **Query Design**
   - Define search criteria
   - Select target databases
   - Set property filters

2. **Data Retrieval**
   - Execute queries
   - Handle pagination
   - Aggregate results

3. **Data Processing**
   - Standardize formats
   - Compare across sources
   - Export for analysis
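
The "Handle pagination" step in the workflow above can be sketched as a small generator. This is only an illustration, not the skill's actual implementation: `fetch_page` is a hypothetical callable standing in for one database request, and the `offset`/`page_size` scheme is an assumption, since each real API paginates in its own way.

```python
from typing import Callable, Dict, Iterator, List

Record = Dict[str, object]


def iter_pages(
    fetch_page: Callable[[int, int], List[Record]],
    page_size: int = 100,
    limit: int = 1000,
) -> Iterator[Record]:
    """Yield records page by page until the source is exhausted or `limit` is hit.

    `fetch_page(offset, page_size)` is a hypothetical stand-in for one API call.
    """
    offset = 0
    yielded = 0
    while yielded < limit:
        page = fetch_page(offset, page_size)
        if not page:
            break  # source exhausted
        for record in page:
            if yielded >= limit:
                return
            yield record
            yielded += 1
        if len(page) < page_size:
            break  # a short page means this was the last one
        offset += len(page)


# In-memory stand-in for a remote database (made-up IDs, not real entries).
_FAKE_DB = [{"id": f"demo-{i}"} for i in range(250)]


def fake_fetch(offset: int, page_size: int) -> List[Record]:
    return _FAKE_DB[offset : offset + page_size]


results = list(iter_pages(fake_fetch, page_size=100, limit=220))
```

Passing the request's `limit` into the generator stops fetching as soon as enough records have arrived, which avoids pulling pages that would be discarded anyway.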

## Process Integration

- Machine Learning Materials Discovery Pipeline
- DFT Calculation Pipeline for Nanomaterials
- Structure-Property Correlation Analysis

## Input Schema

```json
{
  "query_type": "composition|structure|property",
  "databases": ["materials_project", "aflow", "icsd"],
  "criteria": {
    "elements": ["string"],
    "property_range": {"property": "string", "min": "number", "max": "number"}
  },
  "limit": "number"
}
```
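
A request can be checked against this schema before any database is contacted. The sketch below is one possible validator, written under assumptions: the function name `validate_query` is hypothetical, the set of accepted database keys is taken only from the schema example, and which fields are required is a guess, since the schema does not say.

```python
ALLOWED_QUERY_TYPES = {"composition", "structure", "property"}
# Keys taken from the schema example; any other key is treated as unknown here.
KNOWN_DATABASES = {"materials_project", "aflow", "icsd"}


def validate_query(request: dict) -> list:
    """Return a list of problems found in `request`; an empty list means valid."""
    errors = []
    if request.get("query_type") not in ALLOWED_QUERY_TYPES:
        errors.append("query_type must be one of composition, structure, property")
    databases = request.get("databases")
    if not isinstance(databases, list) or not databases:
        errors.append("databases must be a non-empty list")
    else:
        unknown = [d for d in databases if d not in KNOWN_DATABASES]
        if unknown:
            errors.append(f"unknown databases: {unknown}")
    criteria = request.get("criteria", {})
    if not all(isinstance(e, str) for e in criteria.get("elements", [])):
        errors.append("criteria.elements must be a list of strings")
    rng = criteria.get("property_range")
    if rng is not None:
        lo, hi = rng.get("min"), rng.get("max")
        if not isinstance(rng.get("property"), str):
            errors.append("property_range.property must be a string")
        if not all(isinstance(v, (int, float)) for v in (lo, hi)):
            errors.append("property_range.min and max must be numbers")
        elif lo > hi:
            errors.append("property_range.min must not exceed max")
    limit = request.get("limit")
    if limit is not None and (not isinstance(limit, int) or limit <= 0):
        errors.append("limit must be a positive integer")
    return errors


example = {
    "query_type": "property",
    "databases": ["aflow"],
    "criteria": {
        "elements": ["Si"],
        "property_range": {"property": "formation_energy", "min": -2.0, "max": 0.0},
    },
    "limit": 50,
}
problems = validate_query(example)
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue with a request at once.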

## Output Schema

```json
{
  "materials": [{
    "id": "string",
    "formula": "string",
    "structure_file": "string",
    "properties": {
      "bandgap": "number",
      "formation_energy": "number"
    },
    "source": "string"
  }],
  "total_found": "number",
  "query_metadata": {
    "databases_searched": ["string"],
    "query_time": "number"
  }
}
```
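
One way to keep results in this shape is to mirror the schema with dataclasses. The class names `Material` and `QueryResult` are hypothetical, the sample values are placeholders rather than real database entries, and treating `query_time` as seconds is an assumption (the schema only says "number").

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class Material:
    id: str
    formula: str
    structure_file: str
    source: str
    # Missing property values are kept as None rather than guessed.
    properties: Dict[str, Optional[float]] = field(default_factory=dict)


@dataclass
class QueryResult:
    materials: List[Material]
    total_found: int
    databases_searched: List[str]
    query_time: float  # assumed to be seconds

    def to_dict(self) -> dict:
        """Serialize to the nested layout of the output schema."""
        return {
            "materials": [asdict(m) for m in self.materials],
            "total_found": self.total_found,
            "query_metadata": {
                "databases_searched": self.databases_searched,
                "query_time": self.query_time,
            },
        }


# Placeholder record for illustration only.
record = Material(
    id="example-001",
    formula="TiO2",
    structure_file="example-001.cif",
    source="materials_project",
    properties={"bandgap": 3.0, "formation_energy": None},
)
result = QueryResult([record], total_found=1,
                     databases_searched=["materials_project"], query_time=0.5)
payload = result.to_dict()
```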

Overview

This skill provides unified querying of multiple materials databases to retrieve crystal structures and property data. It aggregates results from sources such as Materials Project, AFLOW, ICSD/CSD, and NOMAD, standardizes formats, and returns searchable materials records. The skill is designed for integration into discovery pipelines and analysis workflows to speed up structure-property lookups.

How this skill works

You provide a query type (composition, structure, or property), target databases, and search criteria including element lists and property ranges. The skill executes parallel queries, handles pagination, and merges results into a consistent output schema with identifiers, structure files, properties, and provenance. It also supports property aggregation, cross-database comparison, and exporting results for downstream ML or DFT pipelines.
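
The parallel fan-out and merge described above might look roughly like the following. This is a sketch under stated assumptions, not the skill's code: each adapter is a hypothetical function that takes the criteria and returns a list of records, the stub adapters return made-up IDs, and the `failed` entry in the metadata is an addition that is not part of the documented output schema.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Record = Dict[str, object]
Adapter = Callable[[dict], List[Record]]


def query_all(adapters: Dict[str, Adapter], criteria: dict, limit: int) -> dict:
    """Run every adapter concurrently and merge results, tagging each record's source."""
    start = time.perf_counter()
    # Leaving the `with` block waits for all submitted calls to finish.
    with ThreadPoolExecutor(max_workers=max(1, len(adapters))) as pool:
        futures = {name: pool.submit(fn, criteria) for name, fn in adapters.items()}
    merged: List[Record] = []
    failed: Dict[str, str] = {}
    for name, future in futures.items():
        try:
            records = future.result()
        except Exception as exc:  # one failing source should not sink the query
            failed[name] = str(exc)
            continue
        merged.extend({**rec, "source": name} for rec in records)
    return {
        "materials": merged[:limit],
        "total_found": len(merged),
        "query_metadata": {
            "databases_searched": [n for n in adapters if n not in failed],
            "query_time": time.perf_counter() - start,
            "failed": failed,  # not in the documented schema; added for illustration
        },
    }


# Stub adapters standing in for real database clients (IDs are made up).
def mp_stub(criteria: dict) -> List[Record]:
    return [{"id": "stub-mp-1", "formula": "FeTiO3"}]


def aflow_stub(criteria: dict) -> List[Record]:
    return [{"id": "stub-aflow-1", "formula": "FeTiO3"}]


def broken_stub(criteria: dict) -> List[Record]:
    raise RuntimeError("service unavailable")


result = query_all(
    {"materials_project": mp_stub, "aflow": aflow_stub, "icsd": broken_stub},
    {"elements": ["Fe", "Ti", "O"]},
    limit=10,
)
```

Threads suit this case because the work is dominated by waiting on network responses. Tagging each record with its source at merge time preserves provenance even after results from different databases are combined.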

When to use it

- Searching for materials by composition or element set across multiple repositories
- Comparing property values (bandgap, formation energy, etc.) from different data sources
- Populating datasets for ML training or validation with standardized structure files
- Feeding structure lists into DFT or high-throughput calculation pipelines
- Rapidly locating candidate materials that meet property ranges for screening

Best practices

- Specify target databases to reduce latency and control provenance
- Use property_range filters to narrow results before aggregation
- Limit requests and use pagination handling for large searches
- Standardize units and file formats immediately after retrieval
- Record query_metadata for reproducibility and audit trails

Example use cases

- Find all oxide compositions containing Fe and Ti with bandgaps between 1 and 3 eV across Materials Project and AFLOW
- Aggregate formation energy values for a set of structures from ICSD and NOMAD for consistency checks
- Export CIFs and properties for a candidate list to feed a DFT automation pipeline
- Build a training dataset of structures and bandgaps for ML model development
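
As an illustration, the first use case (Fe-Ti oxides with bandgaps between 1 and 3 eV) could be written as a request in the input schema and paired with a local filter. The `matches` helper and the `elements` field on each record are hypothetical, and the candidate records are placeholders, not real database entries.

```python
request = {
    "query_type": "composition",
    "databases": ["materials_project", "aflow"],
    "criteria": {
        "elements": ["Fe", "Ti", "O"],
        "property_range": {"property": "bandgap", "min": 1.0, "max": 3.0},
    },
    "limit": 100,
}


def matches(record: dict, criteria: dict) -> bool:
    """True if the record has all required elements and its property is in range."""
    required = set(criteria.get("elements", []))
    if not required.issubset(record.get("elements", [])):
        return False
    rng = criteria.get("property_range")
    if rng is None:
        return True
    value = record.get("properties", {}).get(rng["property"])
    # Records lacking the property are excluded rather than assumed to match.
    return value is not None and rng["min"] <= value <= rng["max"]


# Placeholder candidates for illustration only.
candidates = [
    {"id": "stub-1", "elements": ["Fe", "Ti", "O"], "properties": {"bandgap": 2.1}},
    {"id": "stub-2", "elements": ["Fe", "O"], "properties": {"bandgap": 2.0}},
    {"id": "stub-3", "elements": ["Fe", "Ti", "O"], "properties": {"bandgap": 3.5}},
    {"id": "stub-4", "elements": ["Fe", "Ti", "O"], "properties": {}},
]
selected = [c["id"] for c in candidates if matches(c, request["criteria"])]
```

Here only `stub-1` passes: `stub-2` lacks Ti, `stub-3` is outside the bandgap range, and `stub-4` has no bandgap value.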

FAQ

Which databases are supported?

Supported sources include Materials Project, AFLOW, ICSD/CSD, and NOMAD. Additional repositories can be added through adapters.

How are conflicting property values handled?

The skill returns all source values and provides aggregated summaries; users can select aggregation rules (mean, median, source-priority) during processing.
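
The three aggregation rules named above can be sketched with the standard library. The function name `aggregate`, its signature, and the sample values are assumptions made for illustration; the skill's actual interface may differ.

```python
from statistics import mean, median
from typing import Dict, List, Optional


def aggregate(
    values_by_source: Dict[str, float],
    rule: str = "mean",
    priority: Optional[List[str]] = None,
) -> float:
    """Collapse per-source values for one property into a single number."""
    if not values_by_source:
        raise ValueError("no values to aggregate")
    if rule == "mean":
        return mean(values_by_source.values())
    if rule == "median":
        return median(values_by_source.values())
    if rule == "source-priority":
        # Return the value from the first listed source that reported one.
        for source in priority or []:
            if source in values_by_source:
                return values_by_source[source]
        raise ValueError("no prioritized source has a value")
    raise ValueError(f"unknown rule: {rule}")


# Placeholder bandgap values in eV, not real database entries.
bandgap = {"materials_project": 2.0, "aflow": 2.25, "nomad": 3.25}

by_mean = aggregate(bandgap, "mean")
by_median = aggregate(bandgap, "median")
by_priority = aggregate(bandgap, "source-priority", priority=["icsd", "aflow"])
```

With these placeholder values the mean is 2.5, the median is 2.25, and the priority rule falls through the missing `icsd` entry to return the `aflow` value of 2.25.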