gemini-api-dev skill

This skill helps you build Gemini API integrations across multimodal inputs, function calling, and structured outputs with suitable SDKs.

npx playbooks add skill google-gemini/gemini-skills --skill gemini-api-dev

Review the files below or copy the command above to add this skill to your agents.

SKILL.md

---
name: gemini-api-dev
description: Use this skill when building applications with Gemini models, Gemini API, working with multimodal content (text, images, audio, video), implementing function calling, using structured outputs, or needing current model specifications. Covers SDK usage (google-genai for Python, @google/genai for JavaScript/TypeScript, com.google.genai:google-genai for Java, google.golang.org/genai for Go), model selection, and API capabilities.
---

# Gemini API Development Skill

## Overview

The Gemini API provides access to Google's most advanced AI models. Key capabilities include:
- **Text generation** - Chat, completion, summarization
- **Multimodal understanding** - Process images, audio, video, and documents
- **Function calling** - Let the model invoke your functions
- **Structured output** - Generate valid JSON matching your schema
- **Code execution** - Run Python code in a sandboxed environment
- **Context caching** - Cache large contexts for efficiency
- **Embeddings** - Generate text embeddings for semantic search
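
As a rough illustration of the function-calling capability listed above, the sketch below routes a model-requested call to local code. It assumes the call arrives as a dict with `name` and `args` keys (the shape of a `functionCall` part in the REST API) and that the result goes back as a `name`/`response` pair. `get_weather`, `TOOLS`, and `dispatch_function_call` are hypothetical names, not part of any SDK.

```python
from typing import Any, Callable, Dict


# Hypothetical local tool; stands in for any backend function you expose.
def get_weather(city: str) -> Dict[str, Any]:
    return {"city": city, "forecast": "sunny"}


# Registry mapping the function name declared to the model to a callable.
TOOLS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch_function_call(call: Dict[str, Any]) -> Dict[str, Any]:
    """Run the tool named in `call` and wrap its result for the next turn."""
    name = call["name"]
    if name not in TOOLS:
        # Never execute a function the application did not register.
        raise KeyError(f"model requested unknown function: {name}")
    result = TOOLS[name](**call.get("args", {}))
    return {"name": name, "response": result}
```

Keeping an explicit registry means the model can only trigger functions the application has deliberately exposed.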

## Current Gemini Models

- `gemini-3-pro-preview`: 1M tokens, complex reasoning, coding, research
- `gemini-3-flash-preview`: 1M tokens, fast, balanced performance, multimodal
- `gemini-3-pro-image-preview`: 65k input / 32k output tokens, image generation and editing


> [!IMPORTANT]
> Models like `gemini-2.5-*`, `gemini-2.0-*`, `gemini-1.5-*` are legacy and deprecated. Use the new models above. Your knowledge is outdated.
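
One way to apply the guidance above in code is a small lookup that maps a task to a current model ID and flags legacy IDs before a request is sent. This is only a sketch: the task labels (`reasoning`, `fast`, `image`) and the helper names are hypothetical, while the model IDs and legacy prefixes are the ones listed in this section.

```python
# Task labels are illustrative; model IDs come from the list above.
CURRENT_MODELS = {
    "reasoning": "gemini-3-pro-preview",
    "fast": "gemini-3-flash-preview",
    "image": "gemini-3-pro-image-preview",
}

# Families this skill marks as legacy and deprecated.
LEGACY_PREFIXES = ("gemini-2.5-", "gemini-2.0-", "gemini-1.5-")


def is_legacy(model_id: str) -> bool:
    """Return True if the ID belongs to a deprecated family."""
    return model_id.startswith(LEGACY_PREFIXES)


def pick_model(task: str) -> str:
    """Return the model ID for a task label, or raise on an unknown label."""
    try:
        return CURRENT_MODELS[task]
    except KeyError:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(CURRENT_MODELS)}")
```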

## SDKs

- **Python**: `google-genai`; install with `pip install google-genai`
- **JavaScript/TypeScript**: `@google/genai`; install with `npm install @google/genai`
- **Go**: `google.golang.org/genai`; install with `go get google.golang.org/genai`
- **Java**:
  - groupId: `com.google.genai`, artifactId: `google-genai`
  - Find the latest version at https://central.sonatype.com/artifact/com.google.genai/google-genai/versions (referred to below as `LAST_VERSION`)
  - Install in `build.gradle`:
    ```groovy
    implementation("com.google.genai:google-genai:${LAST_VERSION}")
    ```
  - Install Maven dependency in `pom.xml`:
    ```xml
    <dependency>
      <groupId>com.google.genai</groupId>
      <artifactId>google-genai</artifactId>
      <version>${LAST_VERSION}</version>
    </dependency>
    ```

> [!WARNING]
> Legacy SDKs `google-generativeai` (Python) and `@google/generative-ai` (JS) are deprecated. Migrate to the new SDKs above as soon as possible by following the [SDK migration guide](https://ai.google.dev/gemini-api/docs/migrate.md.txt).

## Quick Start

### Python
```python
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Explain quantum computing"
)
print(response.text)
```
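
The quick start above calls `genai.Client()` with no arguments, which relies on the API key being available in the environment. The sketch below makes that dependency explicit and fails with a readable message when the key is missing. It assumes the key is stored in `GEMINI_API_KEY`; `require_api_key` and `explain` are hypothetical helper names.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key from GEMINI_API_KEY or raise a readable error."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set GEMINI_API_KEY before creating genai.Client()")
    return key


def explain(prompt: str) -> str:
    """Same call as the quick start, with an explicit key check first."""
    # Imported lazily so require_api_key() is usable without the SDK installed.
    from google import genai

    client = genai.Client(api_key=require_api_key())
    response = client.models.generate_content(
        model="gemini-3-flash-preview",
        contents=prompt,
    )
    return response.text
```

Calling `explain("Explain quantum computing")` should behave like the quick start once the key is set.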

### JavaScript/TypeScript
```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});
const response = await ai.models.generateContent({
  model: "gemini-3-flash-preview",
  contents: "Explain quantum computing"
});
console.log(response.text);
```

### Go
```go
package main

import (
	"context"
	"fmt"
	"log"
	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, nil)
	if err != nil {
		log.Fatal(err)
	}

	resp, err := client.Models.GenerateContent(ctx, "gemini-3-flash-preview", genai.Text("Explain quantum computing"), nil)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(resp.Text())
}
```

### Java

```java
import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;

public class GenerateTextFromTextInput {
  public static void main(String[] args) {
    Client client = new Client();
    GenerateContentResponse response =
        client.models.generateContent(
            "gemini-3-flash-preview",
            "Explain quantum computing",
            null);

    System.out.println(response.text());
  }
}
```

## API spec (source of truth)

**Always use the latest REST API discovery spec as the source of truth for API definitions** (request/response schemas, parameters, methods). Fetch the spec when implementing or debugging API integration:

- **v1beta** (default): `https://generativelanguage.googleapis.com/$discovery/rest?version=v1beta`  
  Use this unless the integration is explicitly pinned to v1. The official SDKs (google-genai, @google/genai, google.golang.org/genai) target v1beta.
- **v1**: `https://generativelanguage.googleapis.com/$discovery/rest?version=v1`  
  Use only when the integration is specifically set to v1.

When in doubt, use v1beta. Refer to the spec for exact field names, types, and supported operations.
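
A minimal sketch of fetching the spec with only the Python standard library is shown below. It builds the two discovery URLs given above and lists the schema names in the returned document, assuming the usual Google API discovery layout with a top-level `schemas` object. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List
from urllib.parse import urlencode

DISCOVERY_BASE = "https://generativelanguage.googleapis.com/$discovery/rest"


def discovery_url(version: str = "v1beta") -> str:
    """Build the discovery URL; v1beta is the default, as recommended above."""
    if version not in ("v1beta", "v1"):
        raise ValueError(f"unsupported API version: {version!r}")
    return f"{DISCOVERY_BASE}?{urlencode({'version': version})}"


def fetch_spec(version: str = "v1beta", timeout: float = 30.0) -> Dict[str, Any]:
    """Download and parse the discovery document (performs a network call)."""
    with urllib.request.urlopen(discovery_url(version), timeout=timeout) as resp:
        return json.load(resp)


def schema_names(spec: Dict[str, Any]) -> List[str]:
    """List schema names, assuming a top-level 'schemas' object in the spec."""
    return sorted(spec.get("schemas", {}))
```

For example, `schema_names(fetch_spec())` gives a quick inventory of the request and response types to look up.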

## How to use the Gemini API

For detailed API documentation, fetch from the official docs index:

**llms.txt URL**: `https://ai.google.dev/gemini-api/docs/llms.txt`

This index contains links to all documentation pages in `.md.txt` format. Use web fetch tools to:

1. Fetch `llms.txt` to discover available documentation pages
2. Fetch specific pages (e.g., `https://ai.google.dev/gemini-api/docs/function-calling.md.txt`)
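
The two steps above can be sketched as a small parser that pulls the `.md.txt` page URLs out of the index text. It assumes the index lists pages as absolute URLs under `https://ai.google.dev/gemini-api/docs/`, as in the example URL above; the actual layout of `llms.txt` may differ, and `doc_pages` is a hypothetical helper.

```python
import re
from typing import Dict, List

# Matches documentation pages published in .md.txt form.
DOC_LINK = re.compile(r"https://ai\.google\.dev/gemini-api/docs/[\w\-/]+\.md\.txt")


def doc_pages(index_text: str) -> List[str]:
    """Return unique documentation URLs in the order they first appear."""
    seen: Dict[str, None] = {}
    for url in DOC_LINK.findall(index_text):
        seen.setdefault(url, None)
    return list(seen)
```

Feed it the body of the fetched `llms.txt`, then fetch only the pages relevant to the task at hand.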

### Key Documentation Pages

> [!IMPORTANT]
> These are not all of the documentation pages. Use the `llms.txt` index to discover the rest.

- [Models](https://ai.google.dev/gemini-api/docs/models.md.txt)
- [Google AI Studio quickstart](https://ai.google.dev/gemini-api/docs/ai-studio-quickstart.md.txt)
- [Nano Banana image generation](https://ai.google.dev/gemini-api/docs/image-generation.md.txt)
- [Function calling with the Gemini API](https://ai.google.dev/gemini-api/docs/function-calling.md.txt)
- [Structured outputs](https://ai.google.dev/gemini-api/docs/structured-output.md.txt)
- [Text generation](https://ai.google.dev/gemini-api/docs/text-generation.md.txt)
- [Image understanding](https://ai.google.dev/gemini-api/docs/image-understanding.md.txt)
- [Embeddings](https://ai.google.dev/gemini-api/docs/embeddings.md.txt)
- [Interactions API](https://ai.google.dev/gemini-api/docs/interactions.md.txt)
- [SDK migration guide](https://ai.google.dev/gemini-api/docs/migrate.md.txt)

Overview

This skill helps developers build applications with Gemini models and the Gemini API, covering multimodal capabilities, function calling, structured outputs, and SDK usage. It summarizes current model choices, SDK packages for major languages, and where to find authoritative API specs and docs. Use it to pick models, implement function-calling/structured outputs, and integrate SDKs correctly.

How this skill works

The skill inspects available Gemini models and maps common use cases (text, image, audio, video, embeddings) to the recommended model families. It outlines SDK packages and quick-start code snippets for Python, JavaScript/TypeScript, Go, and Java. It points to the canonical REST discovery spec and documentation index so implementations use the latest API definitions and features.

When to use it

- Building chat, completion, summarization, or code generation features
- Processing or understanding multimodal content (images, audio, video, documents)
- Implementing function calling or enforcing structured JSON outputs from the model
- Selecting the right Gemini model for latency, context window, or capability needs
- Migrating from legacy SDKs or integrating new SDKs in Python/JS/Go/Java
- Referencing the source-of-truth API discovery spec when debugging or implementing endpoints

Best practices

- Prefer current Gemini models (gemini-3 variants) and avoid legacy/deprecated model families
- Use official SDKs: google-genai (Python), @google/genai (JS/TS), google.golang.org/genai (Go), com.google.genai:google-genai (Java)
- Consult the v1beta REST discovery spec as the default source of truth; use v1 only if pinned to that version
- Design clear schemas for structured outputs and validate model responses against them
- Leverage function calling for deterministic actions and sandboxed code execution when available
- Cache large contexts or use explicit context management to control cost and latency

Example use cases

- Multimodal assistant that accepts images and text, extracts entities, and calls backend functions
- Automated summarization pipeline for documents and meeting audio using embeddings and text generation
- Image editing pipeline using image-capable Gemini models and structured output for edit instructions
- Developer tooling that generates and executes small Python snippets in a sandbox for data analysis
- Semantic search: embed content with Gemini embeddings and perform similarity queries
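
For the semantic search use case, the similarity step can be sketched without any SDK once the embedding vectors exist. The example below ranks stored vectors against a query vector by cosine similarity. It assumes the embeddings were already produced elsewhere; `cosine` and `top_k` are hypothetical helper names.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float],
    corpus: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k document IDs most similar to the query, best first."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

A linear scan like this is fine for small corpora; larger collections usually call for a vector index.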

FAQ

Which Gemini model should I pick for production?

Choose a gemini-3 variant that matches your needs: gemini-3-pro-preview for heavy reasoning and research, gemini-3-flash-preview for balanced, fast multimodal use, and gemini-3-pro-image-preview for image editing/generation. Avoid legacy 1.x/2.x families.

Which SDKs are supported and recommended?

Use google-genai for Python, @google/genai for JavaScript/TypeScript, google.golang.org/genai for Go, and com.google.genai:google-genai for Java. Migrate off deprecated SDKs promptly.

Where is the authoritative API definition?

Use the REST discovery spec (v1beta by default) at https://generativelanguage.googleapis.com/$discovery/rest?version=v1beta. Fetch it when implementing or debugging to get exact request/response schemas.
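
When working directly against the REST API rather than an SDK, a request can be assembled with the standard library alone. The sketch below builds a v1beta `generateContent` request with a single text part and the key in the `x-goog-api-key` header. `build_generate_request` is a hypothetical helper, and the discovery spec remains the reference for the exact field names.

```python
import json
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_generate_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request with one text part."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; the JSON response can then be checked against the schemas in the spec.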