
fal MCP Server

A Model Context Protocol (MCP) server for interacting with fal.ai models and services.

Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "derekalia-fal": {
      "url": "http://127.0.0.1:6274/mcp/",
      "headers": {
        "FAL_KEY": "YOUR_FAL_API_KEY_HERE",
        "MCP_TRANSPORT": "http"
      }
    }
  }
}

This MCP server lets you work with fal.ai models from your MCP client: you can list and search models, fetch a model's schema, and generate content, over either HTTP or a local transport. It also supports real-time streaming and queue management, so long-running generation requests can be tracked from your tools and applications.

How to use

Connect to the MCP server from your client to start using fal.ai models. You can run the server in HTTP transport mode for remote access or in development mode to test interactively with an inspector UI.

How to run the server

Run the server in HTTP transport mode using your fal.ai API key. This starts the MCP server and exposes a URL you can connect to from your client or IDE.

./run_http.sh YOUR_FAL_API_KEY

How to connect your client

Configure your MCP client to point at the local HTTP endpoint shown by the server. Use the following connection snippet in your client configuration.

{
  "Fal": {
    "url": "http://127.0.0.1:6274/mcp/"
  }
}
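
As a rough sketch of this step, the entry can be merged into an existing client configuration with a few lines of Python. The `add_fal_server` helper and its `section` parameter are illustrative names, not part of the server. Where your client keeps its configuration file, and under which key it lists servers, depends on the client.

```python
import json

FAL_MCP_URL = "http://127.0.0.1:6274/mcp/"  # endpoint from the snippet above


def add_fal_server(config, name="Fal", url=FAL_MCP_URL, section=None):
    """Return a copy of `config` with an entry for the fal MCP server.

    `section` is a hypothetical parameter: some clients nest servers under a
    key such as "mcpServers", others keep them at the top level.
    """
    updated = json.loads(json.dumps(config))  # deep copy of JSON-compatible data
    target = updated if section is None else updated.setdefault(section, {})
    target[name] = {"url": url}
    return updated
```

Calling `add_fal_server({})` returns the same structure as the connection snippet, and passing `section="mcpServers"` nests the entry the way the installation example does. The input dictionary is left unmodified, so you can inspect the result before writing it back to disk.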

Development mode with MCP Inspector

For testing and debugging, you can run the server in development mode to access an interactive web UI that lets you test all tools.

fastmcp dev main.py

Environment and defaults

The server requires a fal.ai API key. It can be configured to use HTTP transport by default, or stdio mode for local development.

To avoid passing the API key on every launch, export it as an environment variable (for example, from your shell profile) and then run the HTTP server without the key argument.
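
One way to approach this from a launcher script is sketched below, under two assumptions. The variable is assumed to be named `FAL_KEY` (the name used in the configuration headers earlier), and `run_http.sh` is assumed to read the key from the environment when no argument is given. Neither is confirmed here, so check the script before relying on it. `build_launch` is a hypothetical helper.

```python
import os


def build_launch(environ=None, script="./run_http.sh"):
    """Return (command, env) for starting the HTTP server without a key argument.

    Assumption: the server reads the key from an environment variable named
    FAL_KEY; that name is taken from the client configuration, not from the
    script itself.
    """
    env = dict(os.environ if environ is None else environ)
    if not env.get("FAL_KEY"):
        # Fail early with a clear message instead of starting without a key.
        raise RuntimeError("FAL_KEY is not set; export it before launching")
    return [script], env
```

To start the server, pass the result to the standard library: `command, env = build_launch()` followed by `subprocess.run(command, env=env, check=True)`. Keeping the key out of the command line also keeps it out of shell history and process listings.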

Security and access

Keep your fal.ai API key secure. Do not expose it in client-side code or logs. When running in HTTP mode, limit access to trusted networks or use additional authentication mechanisms as needed.

Notes

Streaming support and queue management are available, allowing you to monitor long-running model generation tasks and retrieve results when ready.
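
The monitoring flow can be sketched as a polling loop over the `status`, `result`, and `cancel` tools. Everything specific here is an assumption: `call_tool` stands in for whatever function your MCP client uses to invoke a tool, the `request_id` argument name is hypothetical, and the `"COMPLETED"` status string is assumed rather than taken from this server's documentation. Check each tool's actual schema before use.

```python
import time


def wait_for_result(call_tool, request_id, interval=1.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll `status` until the request completes, then fetch it with `result`.

    Assumptions (not confirmed by the server's docs): tools take a
    `request_id` argument, and `status` returns a dict whose "status" field
    becomes "COMPLETED" when the output is ready.
    """
    deadline = clock() + timeout
    while True:
        state = call_tool("status", {"request_id": request_id})
        if state.get("status") == "COMPLETED":
            return call_tool("result", {"request_id": request_id})
        if clock() >= deadline:
            # Give up and cancel the request rather than polling forever.
            call_tool("cancel", {"request_id": request_id})
            raise TimeoutError(f"request {request_id} did not finish in {timeout}s")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays. In normal use you would pass just `call_tool` and the identifier returned by a queued `generate` call.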

Available tools

models

List available models with optional pagination

search

Search for models by keywords

schema

Get OpenAPI schema for a specific model

generate

Generate content using a model with optional queueing

result

Get result from a queued request

status

Check status of a queued request

cancel

Cancel a queued request

upload

Upload a file to the fal.ai CDN
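
MCP clients invoke tools like these through JSON-RPC 2.0 `tools/call` requests. As an illustration, the sketch below builds such a request body. The helper name `tool_call_request` is made up for this example, and the `keywords` argument for `search` is a guess at the parameter name, so consult the tool's input schema as reported by the server for the real one. A real session also needs the MCP initialization handshake first, which an MCP client library normally handles for you.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique within a session


def tool_call_request(tool, arguments=None):
    """Build the JSON-RPC 2.0 body for an MCP `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments or {}},
    }


# Hypothetical argument name; the real one comes from the tool's schema.
request = tool_call_request("search", {"keywords": "image generation"})
body = json.dumps(request)  # serialized form, ready to send over the transport
```

The same helper works for any tool in the list, for example `tool_call_request("models")` for a call with no arguments.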