
This skill provides unified access to multiple materials databases, retrieving structure and property data across repositories to streamline materials discovery.

npx playbooks add skill a5c-ai/babysitter --skill materials-database-querier

Review the files below or copy the command above to add this skill to your agents.

SKILL.md
---
name: materials-database-querier
description: Materials database query skill for accessing structure and property data from multiple repositories
allowed-tools:
  - Read
  - Write
  - Glob
  - Grep
  - Bash
metadata:
  specialization: nanotechnology
  domain: science
  category: computational
  priority: high
  phase: 6
  tools-libraries:
    - pymatgen
    - Materials Project API
    - AFLOW REST API
---

# Materials Database Querier

## Purpose

The Materials Database Querier skill provides unified access to multiple materials databases for structure and property retrieval, enabling comprehensive materials search and data aggregation across repositories.

## Capabilities

- Materials Project API integration
- AFLOW database queries
- ICSD/CSD structure retrieval
- NOMAD repository access
- Cross-database searches
- Property aggregation and comparison

## Usage Guidelines

### Database Query Workflow

1. **Query Design**
   - Define search criteria
   - Select target databases
   - Set property filters

2. **Data Retrieval**
   - Execute queries
   - Handle pagination
   - Aggregate results

3. **Data Processing**
   - Standardize formats
   - Compare across sources
   - Export for analysis
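
The "Handle pagination" step above can be sketched as a generic loop. This is a minimal sketch, not the skill's actual implementation: `fetch_all`, `PageFetcher`, and `fake_fetch_page` are hypothetical names, and the offset/page-size interface is an assumption (real databases may paginate with cursors or page numbers instead).

```python
from typing import Callable, Dict, List

# Hypothetical page fetcher: takes (offset, page_size) and returns a list of records.
PageFetcher = Callable[[int, int], List[Dict]]


def fetch_all(fetch_page: PageFetcher, page_size: int = 100, limit: int = 1000) -> List[Dict]:
    """Collect records page by page until the source is exhausted or `limit` is reached."""
    results: List[Dict] = []
    offset = 0
    while len(results) < limit:
        # Never request more records than are still needed to reach the limit.
        page = fetch_page(offset, min(page_size, limit - len(results)))
        if not page:
            break  # an empty page signals the end of the result set
        results.extend(page)
        offset += len(page)
    return results


# Stand-in for a real database client: 250 placeholder records served in slices.
_FAKE_DB = [{"id": f"demo-{i}"} for i in range(250)]


def fake_fetch_page(offset: int, page_size: int) -> List[Dict]:
    return _FAKE_DB[offset:offset + page_size]


records = fetch_all(fake_fetch_page, page_size=100, limit=220)
print(len(records))
```

Passing the fetcher in as a callable keeps the loop independent of any one database client, so the same function could wrap each source.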

## Process Integration

- Machine Learning Materials Discovery Pipeline
- DFT Calculation Pipeline for Nanomaterials
- Structure-Property Correlation Analysis

## Input Schema

```json
{
  "query_type": "composition|structure|property",
  "databases": ["materials_project", "aflow", "icsd"],
  "criteria": {
    "elements": ["string"],
    "property_range": {"property": "string", "min": "number", "max": "number"}
  },
  "limit": "number"
}
```
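
A query can be checked against this schema before it is sent. The sketch below is illustrative only: `validate_query` is a hypothetical helper, and it assumes that `criteria` fields and `limit` are optional, which the schema itself does not state.

```python
ALLOWED_QUERY_TYPES = {"composition", "structure", "property"}


def validate_query(query: dict) -> list:
    """Return a list of human-readable problems; an empty list means the query looks valid."""
    problems = []
    if query.get("query_type") not in ALLOWED_QUERY_TYPES:
        problems.append("query_type must be one of composition, structure, property")
    databases = query.get("databases")
    if not isinstance(databases, list) or not databases:
        problems.append("databases must be a non-empty list")
    criteria = query.get("criteria", {})
    elements = criteria.get("elements", [])
    if not all(isinstance(e, str) for e in elements):
        problems.append("criteria.elements must contain only strings")
    prop_range = criteria.get("property_range")
    if prop_range is not None:
        low, high = prop_range.get("min"), prop_range.get("max")
        if not all(isinstance(v, (int, float)) for v in (low, high)):
            problems.append("property_range min and max must be numbers")
        elif low > high:
            problems.append("property_range min must not exceed max")
    limit = query.get("limit")
    if limit is not None and (not isinstance(limit, int) or limit <= 0):
        problems.append("limit must be a positive integer")
    return problems


example_query = {
    "query_type": "composition",
    "databases": ["materials_project", "aflow"],
    "criteria": {
        "elements": ["Si", "O"],
        "property_range": {"property": "bandgap", "min": 0.5, "max": 2.0},
    },
    "limit": 25,
}
print(validate_query(example_query))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.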

## Output Schema

```json
{
  "materials": [{
    "id": "string",
    "formula": "string",
    "structure_file": "string",
    "properties": {
      "bandgap": "number",
      "formation_energy": "number"
    },
    "source": "string"
  }],
  "total_found": "number",
  "query_metadata": {
    "databases_searched": ["string"],
    "query_time": "number"
  }
}
```

## Overview

This skill provides unified querying of multiple materials databases to retrieve crystal structures and property data. It aggregates results from sources such as Materials Project, AFLOW, ICSD/CSD, and NOMAD, standardizes formats, and returns materials records in a common schema. The skill is designed for integration into discovery pipelines and analysis workflows, where it speeds up structure-property lookups.

## How this skill works

You provide a query type (composition, structure, or property), target databases, and search criteria including element lists and property ranges. The skill executes parallel queries, handles pagination, and merges results into a consistent output schema with identifiers, structure files, properties, and provenance. It also supports property aggregation, cross-database comparison, and exporting results for downstream ML or DFT pipelines.
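
The fan-out and merge described above could look roughly like the following. This is a sketch under stated assumptions: the adapter functions, the `ADAPTERS` registry, and `run_query` are hypothetical, the records they return are made-up placeholders rather than real database entries, and `query_time` is assumed to be in seconds.

```python
import time
from concurrent.futures import ThreadPoolExecutor


# Hypothetical adapters: each takes the criteria dict and returns a list of records.
# The returned values are placeholders, not real database entries.
def query_materials_project(criteria):
    return [{"id": "mp-demo-1", "formula": "FeTiO3", "properties": {"bandgap": 2.0}}]


def query_aflow(criteria):
    return [{"id": "aflow-demo-1", "formula": "FeTiO3", "properties": {"bandgap": 2.5}}]


ADAPTERS = {"materials_project": query_materials_project, "aflow": query_aflow}


def run_query(databases, criteria):
    """Query the selected databases concurrently and merge results into the output schema."""
    start = time.perf_counter()
    selected = [name for name in databases if name in ADAPTERS]
    materials = []
    with ThreadPoolExecutor(max_workers=max(1, len(selected))) as pool:
        futures = {name: pool.submit(ADAPTERS[name], criteria) for name in selected}
        for name, future in futures.items():
            for record in future.result():
                materials.append({**record, "source": name})  # tag provenance
    return {
        "materials": materials,
        "total_found": len(materials),
        "query_metadata": {
            "databases_searched": selected,
            "query_time": time.perf_counter() - start,  # seconds (assumed unit)
        },
    }


result = run_query(["materials_project", "aflow"], {"elements": ["Fe", "Ti", "O"]})
print(result["total_found"], [m["source"] for m in result["materials"]])
```

Threads suit this case because each adapter would spend most of its time waiting on network I/O.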

## When to use it

  • Searching for materials by composition or element set across multiple repositories
  • Comparing property values (bandgap, formation energy, etc.) from different data sources
  • Populating datasets for ML training or validation with standardized structure files
  • Feeding structure lists into DFT or high-throughput calculation pipelines
  • Rapidly locating candidate materials that meet property ranges for screening

## Best practices

  • Specify target databases to reduce latency and control provenance
  • Use property_range filters to narrow results before aggregation
  • Set a limit on each request and rely on pagination handling for large searches
  • Standardize units and file formats immediately after retrieval
  • Record query_metadata for reproducibility and audit trails
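
As an illustration of the unit-standardization practice above, energies can be normalized to a single unit as soon as they are retrieved. The helper `to_ev` and the set of input units are hypothetical; which units each database actually reports is not specified here and should be checked per source.

```python
# 1 eV per particle corresponds to roughly 96.485 kJ/mol.
EV_TO_KJ_PER_MOL = 96.485


def to_ev(value: float, unit: str) -> float:
    """Convert an energy value to eV from one of a few assumed input units."""
    if unit == "eV":
        return value
    if unit == "meV":
        return value / 1000.0
    if unit == "kJ/mol":
        return value / EV_TO_KJ_PER_MOL
    raise ValueError(f"unsupported unit: {unit}")


print(to_ev(1500.0, "meV"))
```

Raising on unknown units, instead of passing values through, keeps mislabeled data from silently entering a comparison.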

## Example use cases

  • Find all oxide compositions containing Fe and Ti with bandgaps between 1 and 3 eV across Materials Project and AFLOW
  • Aggregate formation energy values for a set of structures from ICSD and NOMAD for consistency checks
  • Export CIFs and properties for a candidate list to feed a DFT automation pipeline
  • Build a training dataset of structures and bandgaps for ML model development
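
The first use case above maps onto the input schema roughly as follows. This is a sketch: the `limit` value is arbitrary, the helper `within_range` is hypothetical, and the returned records are placeholders used only to show a client-side range check.

```python
import json

# Fe-Ti oxides with bandgaps between 1 and 3 eV, across two databases.
query = {
    "query_type": "composition",
    "databases": ["materials_project", "aflow"],
    "criteria": {
        "elements": ["Fe", "Ti", "O"],
        "property_range": {"property": "bandgap", "min": 1.0, "max": 3.0},
    },
    "limit": 50,  # arbitrary cap chosen for illustration
}


def within_range(record: dict, prop_range: dict) -> bool:
    """True if the record reports the property and it falls inside [min, max]."""
    value = record["properties"].get(prop_range["property"])
    return value is not None and prop_range["min"] <= value <= prop_range["max"]


# Placeholder records, not real database entries.
returned = [
    {"id": "demo-1", "properties": {"bandgap": 0.4}},
    {"id": "demo-2", "properties": {"bandgap": 2.2}},
    {"id": "demo-3", "properties": {}},
]
kept = [r["id"] for r in returned if within_range(r, query["criteria"]["property_range"])]
print(json.dumps(query, indent=2))
print(kept)
```

Re-checking the range on the client side is a cheap safeguard when sources differ in how strictly they apply filters.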

## FAQ

Which databases are supported?

Supported sources include Materials Project, AFLOW, ICSD/CSD, and NOMAD. Additional repositories can be added through adapters.

How are conflicting property values handled?

The skill returns all source values and provides aggregated summaries; users can select aggregation rules (mean, median, source-priority) during processing.
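
The three aggregation rules named above can be illustrated with a small helper. This is a sketch, not the skill's confirmed interface: the function `aggregate`, its `rule` and `priority` parameters, and the bandgap values are all hypothetical.

```python
from statistics import mean, median


def aggregate(values_by_source: dict, rule: str = "mean", priority=None) -> float:
    """Combine per-source values for one property using the chosen rule."""
    if not values_by_source:
        raise ValueError("no values to aggregate")
    values = list(values_by_source.values())
    if rule == "mean":
        return mean(values)
    if rule == "median":
        return median(values)
    if rule == "source-priority":
        # Return the value from the first preferred source that reported one.
        for source in priority or []:
            if source in values_by_source:
                return values_by_source[source]
        raise ValueError("none of the prioritized sources reported a value")
    raise ValueError(f"unknown rule: {rule}")


# Placeholder bandgaps in eV, not real database values.
bandgaps = {"materials_project": 2.0, "aflow": 3.25, "nomad": 2.25}

print(aggregate(bandgaps, "mean"))
print(aggregate(bandgaps, "median"))
print(aggregate(bandgaps, "source-priority", priority=["icsd", "aflow"]))
```

Keeping the per-source dictionary alongside the aggregate preserves provenance, so a reader can see which source contributed which value.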