gemini-api-dev skill

This skill helps you integrate the Gemini API, covering multimodal content, function calling, and structured outputs, with SDKs for Python, JavaScript/TypeScript, and Go.

npx playbooks add skill sickn33/antigravity-awesome-skills --skill gemini-api-dev

---
name: gemini-api-dev
description: Use this skill when building applications with Gemini models or the Gemini API, working with multimodal content (text, images, audio, video), implementing function calling, using structured outputs, or needing current model specifications. Covers SDK usage (google-genai for Python, @google/genai for JavaScript/TypeScript, google.golang.org/genai for Go), model selection, and API capabilities.
---

# Gemini API Development Skill

## Overview

The Gemini API provides access to Google's most advanced AI models. Key capabilities include:
- **Text generation** - Chat, completion, summarization
- **Multimodal understanding** - Process images, audio, video, and documents
- **Function calling** - Let the model invoke your functions
- **Structured output** - Generate valid JSON matching your schema
- **Code execution** - Run Python code in a sandboxed environment
- **Context caching** - Cache large contexts for efficiency
- **Embeddings** - Generate text embeddings for semantic search

## Current Gemini Models

- `gemini-3-pro-preview`: 1M-token context window; complex reasoning, coding, research
- `gemini-3-flash-preview`: 1M-token context window; fast, balanced performance, multimodal
- `gemini-3-pro-image-preview`: 65k input / 32k output tokens; image generation and editing


> [!IMPORTANT]
> Models like `gemini-2.5-*`, `gemini-2.0-*`, `gemini-1.5-*` are legacy and deprecated. Use the new models above. Your knowledge is outdated.
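
A small guard can catch legacy model IDs before a request is sent. The sketch below assumes only the three legacy prefixes named in the note above; the helper names `is_legacy_model` and `resolve_model` and the choice of fallback model are illustrative and not part of any SDK.

```python
# Illustrative helpers, not part of the google-genai SDK.
LEGACY_PREFIXES = ("gemini-2.5-", "gemini-2.0-", "gemini-1.5-")
DEFAULT_MODEL = "gemini-3-flash-preview"  # assumption: fall back to the fast model


def is_legacy_model(model: str) -> bool:
    """Return True if the model ID starts with one of the legacy prefixes."""
    return model.startswith(LEGACY_PREFIXES)


def resolve_model(requested: str) -> str:
    """Swap a legacy model ID for the default; pass other IDs through unchanged."""
    return DEFAULT_MODEL if is_legacy_model(requested) else requested
```

For example, `resolve_model("gemini-1.5-pro")` returns `"gemini-3-flash-preview"`, while a current model ID is returned as given.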

## SDKs

- **Python**: `google-genai`, installed with `pip install google-genai`
- **JavaScript/TypeScript**: `@google/genai`, installed with `npm install @google/genai`
- **Go**: `google.golang.org/genai`, installed with `go get google.golang.org/genai`

> [!WARNING]
> Legacy SDKs `google-generativeai` (Python) and `@google/generative-ai` (JS) are deprecated. Migrate to the new SDKs above as soon as possible by following the [SDK migration guide](https://ai.google.dev/gemini-api/docs/migrate.md.txt).

## Quick Start

### Python
```python
from google import genai

# With no arguments, the client reads the API key from the GEMINI_API_KEY environment variable.
client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Explain quantum computing"
)
print(response.text)
```

### JavaScript/TypeScript
```typescript
import { GoogleGenAI } from "@google/genai";

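// With an empty options object, the client reads the API key from the GEMINI_API_KEY environment variable.
const ai = new GoogleGenAI({});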
const response = await ai.models.generateContent({
  model: "gemini-3-flash-preview",
  contents: "Explain quantum computing"
});
console.log(response.text);
```

### Go
```go
package main

import (
	"context"
	"fmt"
	"log"
	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, nil)
	if err != nil {
		log.Fatal(err)
	}

	resp, err := client.Models.GenerateContent(ctx, "gemini-3-flash-preview", genai.Text("Explain quantum computing"), nil)
	if err != nil {
		log.Fatal(err)
	}

	// Text is a method on the response, so it must be called.
	fmt.Println(resp.Text())
}
```

## API spec (source of truth)

**Always use the latest REST API discovery spec as the source of truth for API definitions** (request/response schemas, parameters, methods). Fetch the spec when implementing or debugging API integration:

- **v1beta** (default): `https://generativelanguage.googleapis.com/$discovery/rest?version=v1beta`  
  Use this unless the integration is explicitly pinned to v1. The official SDKs (google-genai, @google/genai, google.golang.org/genai) target v1beta.
- **v1**: `https://generativelanguage.googleapis.com/$discovery/rest?version=v1`  
  Use only when the integration is specifically set to v1.

When in doubt, use v1beta. Refer to the spec for exact field names, types, and supported operations.
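
As a rough illustration of working with the spec, the sketch below builds the discovery URL for a given version and lists the methods a spec declares. It assumes the document follows the usual Google API discovery layout, with nested `resources` objects that each carry a `methods` map; the helper names are hypothetical, and the real document should be checked before relying on this traversal.

```python
import json
import urllib.request

BASE = "https://generativelanguage.googleapis.com/$discovery/rest"


def discovery_url(version: str = "v1beta") -> str:
    """Build the discovery spec URL; v1beta is the default recommended above."""
    if version not in ("v1beta", "v1"):
        raise ValueError(f"unsupported API version: {version}")
    return f"{BASE}?version={version}"


def fetch_spec(version: str = "v1beta") -> dict:
    """Download and parse the discovery document (needs network access)."""
    with urllib.request.urlopen(discovery_url(version), timeout=30) as resp:
        return json.load(resp)


def list_methods(node: dict, prefix: str = "") -> list[str]:
    """Recursively collect dotted method names from nested `resources`.

    Assumption: the spec uses the `resources` / `methods` keys.
    """
    names = []
    for res_name, res in node.get("resources", {}).items():
        path = f"{prefix}{res_name}."
        names += [path + method for method in res.get("methods", {})]
        names += list_methods(res, path)
    return sorted(names)
```

Calling `list_methods(fetch_spec())` would print a flat overview of the operations in the v1beta spec, which can be a quick way to confirm that a method exists before writing a request against it.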

## How to use the Gemini API

For detailed API documentation, fetch from the official docs index:

**llms.txt URL**: `https://ai.google.dev/gemini-api/docs/llms.txt`

This index contains links to all documentation pages in `.md.txt` format. Use web fetch tools to:

1. Fetch `llms.txt` to discover available documentation pages
2. Fetch specific pages (e.g., `https://ai.google.dev/gemini-api/docs/function-calling.md.txt`)
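
The two steps above can be sketched with the standard library alone. This example assumes the index lists pages as Markdown-style links (`[title](url)`), which is the common `llms.txt` convention but is not confirmed here; `fetch_text` and `extract_doc_links` are hypothetical helper names.

```python
import re
import urllib.request

LLMS_TXT_URL = "https://ai.google.dev/gemini-api/docs/llms.txt"
# Matches Markdown links whose target ends in .md.txt: [title](url)
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+\.md\.txt)\)")


def fetch_text(url: str) -> str:
    """Fetch a URL and decode the body as UTF-8 (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def extract_doc_links(index_text: str) -> dict[str, str]:
    """Map each link title in the index to its .md.txt URL."""
    return {title.strip(): url for title, url in LINK_RE.findall(index_text)}
```

A typical flow would be `links = extract_doc_links(fetch_text(LLMS_TXT_URL))`, followed by `fetch_text(links[title])` for whichever page is needed.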

### Key Documentation Pages 

> [!IMPORTANT]
> These are not all of the documentation pages. Use the `llms.txt` index to discover the rest.

- [Models](https://ai.google.dev/gemini-api/docs/models.md.txt)
- [Google AI Studio quickstart](https://ai.google.dev/gemini-api/docs/ai-studio-quickstart.md.txt)
- [Nano Banana image generation](https://ai.google.dev/gemini-api/docs/image-generation.md.txt)
- [Function calling with the Gemini API](https://ai.google.dev/gemini-api/docs/function-calling.md.txt)
- [Structured outputs](https://ai.google.dev/gemini-api/docs/structured-output.md.txt)
- [Text generation](https://ai.google.dev/gemini-api/docs/text-generation.md.txt)
- [Image understanding](https://ai.google.dev/gemini-api/docs/image-understanding.md.txt)
- [Embeddings](https://ai.google.dev/gemini-api/docs/embeddings.md.txt)
- [Interactions API](https://ai.google.dev/gemini-api/docs/interactions.md.txt)
- [SDK migration guide](https://ai.google.dev/gemini-api/docs/migrate.md.txt)

Overview

This skill helps developers integrate and build applications with Gemini models and the Gemini API. It focuses on multimodal capabilities (text, images, audio, video), function calling, structured outputs, and up-to-date model selection. It covers SDK usage for Python, JavaScript/TypeScript, and Go, and points to the authoritative API discovery spec for precise implementation details.

How this skill works

The skill summarizes the current Gemini models and the recommended SDKs (google-genai for Python, @google/genai for JS/TS, google.golang.org/genai for Go), and gives quick-start code examples for generating content. It directs you to fetch the latest REST API discovery spec (v1beta by default) and the llms.txt documentation index so your implementation uses the official request/response schemas. It also links to documentation for function calling, structured outputs, embeddings, and migration from the legacy SDKs.

When to use it

  • Building chat, completion, summarization, or research assistants with Gemini models
  • Processing or generating multimodal content (images, audio, video, documents)
  • Implementing function calling where the model triggers backend functions
  • Producing strict structured outputs (valid JSON matching a schema) for downstream pipelines
  • Migrating from legacy Google generative SDKs to the new google-genai/@google/genai
  • When you need authoritative API field names and schema definitions for production integrations
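
For the structured-output case, a downstream pipeline usually wants to reject anything that does not match the expected shape. The sketch below is a minimal standard-library check, not the API's own schema mechanism: the `INVOICE_SCHEMA` mapping of field names to Python types and the `parse_structured` helper are hypothetical, and a production system would more likely use a full schema validator.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
INVOICE_SCHEMA = {"vendor": str, "total": float, "paid": bool}


def parse_structured(raw: str, schema: dict[str, type]) -> dict:
    """Parse model output as JSON and check required fields and their types."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected in schema.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # Accept ints where floats are expected, but never treat bools as numbers.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected) or (expected is not bool and isinstance(value, bool)):
            raise ValueError(f"wrong type for {field}: {type(value).__name__}")
        data[field] = value
    return data
```

Every failure surfaces as a `ValueError`, so the caller can retry the request or route the document for manual review with a single `except` clause.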

Best practices

  • Prefer the v1beta discovery spec as the source of truth unless pinned to v1
  • Choose the model by capability: gemini-3-pro-preview for complex reasoning, gemini-3-flash-preview for speed and multimodal balance
  • Use structured output and schema validation on the client to reject malformed responses
  • Implement function calling with strict input/output validation and least-privilege backends
  • Cache large contexts where supported to reduce token usage and latency
  • Migrate off legacy SDKs immediately and follow the official migration guide
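
The function-calling practice above can be illustrated with an allowlisted dispatcher. This sketch assumes the model's request has already been reduced to a function name plus a dictionary of arguments; how the SDK exposes that call is not shown here and should be taken from the function-calling documentation. The `get_weather` tool, the `TOOLS` registry, and `dispatch` are all hypothetical.

```python
from typing import Any, Callable


def get_weather(city: str) -> dict:
    """Hypothetical backend function; returns a stub result."""
    return {"city": city, "forecast": "sunny"}


# Allowlist: function name -> (callable, required argument names and types).
TOOLS: dict[str, tuple[Callable[..., Any], dict[str, type]]] = {
    "get_weather": (get_weather, {"city": str}),
}


def dispatch(name: str, args: dict[str, Any]) -> Any:
    """Run a model-requested call only if the name and arguments validate."""
    if name not in TOOLS:
        raise PermissionError(f"function not allowed: {name}")
    func, spec = TOOLS[name]
    if set(args) != set(spec):
        raise ValueError(f"expected arguments {sorted(spec)}, got {sorted(args)}")
    for key, expected in spec.items():
        if not isinstance(args[key], expected):
            raise TypeError(f"{key} must be {expected.__name__}")
    return func(**args)
```

Keeping the registry explicit means the model can only reach functions that were deliberately exposed, which is one simple way to approximate a least-privilege backend.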

Example use cases

  • A multimodal customer support agent that analyzes screenshots and generates step-by-step fixes
  • An automated code assistant that calls backend linting or test-run functions via model-invoked function calls
  • A document ingestion pipeline that extracts structured JSON data from PDFs using the structured-output features
  • Semantic search over product catalogs using embeddings generated by Gemini models
  • Image generation/editing workflows with gemini-3-pro-image-preview for creative assets
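
For the semantic search use case, ranking usually comes down to comparing a query embedding against stored embeddings. The sketch below shows only that comparison step, using cosine similarity on plain Python lists; it assumes the embedding vectors have already been obtained from the API, and the two-dimensional vectors in the usage note are made-up toy values.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("zero-length vector")
    return dot / norm


def rank(query: list[float], catalog: dict[str, list[float]]) -> list[str]:
    """Return catalog keys ordered from most to least similar to the query."""
    return sorted(catalog, key=lambda k: cosine_similarity(query, catalog[k]), reverse=True)
```

With a toy catalog such as `{"shoes": [1.0, 0.0], "hat": [0.0, 1.0], "boots": [0.9, 0.1]}`, the query `[1.0, 0.0]` ranks `shoes` first, then `boots`, then `hat`.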

FAQ

Which SDK should I use for Python projects?

Use the google-genai SDK (pip install google-genai). It targets the v1beta API and replaces the legacy google-generativeai package.

How do I ensure my integration stays compatible with the API?

Fetch the latest REST API discovery spec (v1beta by default) and use it as the source of truth for request/response schemas and supported methods.