Launch Lakeflow compute jobs by spawning scalable runs via an MCP server with per-run arguments and secure environment handling.
Configuration
{
  "mcpServers": {
    "arahimi-hims-lakeflow-mcp": {
      "command": "uv",
      "args": [
        "run",
        "--quiet",
        "--directory",
        "/path/to/lakeflow-mcp",
        "python",
        "lakeflow.py"
      ],
      "env": {
        "DATABRICKS_HOST": "https://hims-machine-learning-staging-workspace.cloud.databricks.com",
        "DATABRICKS_TOKEN": "<your token>"
      }
    }
  }
}

You can run massively parallel compute jobs on the Lakeflow platform through an MCP server. This setup lets you spawn scalable data processing tasks, control how many workers run, and pass per-run arguments, all while keeping secrets secure and organized. The MCP server configuration shown here lets you launch Lakeflow jobs from an agent you control and manage their lifecycle programmatically.
You use the MCP server to start, monitor, and retrieve logs for Lakeflow compute jobs. Start by configuring a local MCP client to communicate with your Lakeflow setup, then issue commands to create a job from source, trigger multiple runs with different arguments, and monitor or fetch logs for each run.
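Before issuing commands, it can help to sanity-check the client configuration. The sketch below is a minimal, hypothetical preflight check, assuming the config shape shown on this page; the host URL in the embedded JSON is a placeholder, and the required-env-var list is inferred from the Databricks variables the server expects.

```python
import json

# Env vars the Lakeflow MCP server needs to reach Databricks (an assumption
# based on the config shown on this page).
REQUIRED_ENV = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")

def validate_mcp_config(config: dict, server_name: str) -> list[str]:
    """Return a list of problems found in one mcpServers entry."""
    problems = []
    server = config.get("mcpServers", {}).get(server_name)
    if server is None:
        return [f"server '{server_name}' not found in mcpServers"]
    if not server.get("command"):
        problems.append("missing 'command'")
    env = server.get("env", {})
    for key in REQUIRED_ENV:
        value = env.get(key, "")
        if not value or value.startswith("<"):  # e.g. the "<your token>" placeholder
            problems.append(f"env var {key} is missing or still a placeholder")
    return problems

config = json.loads("""
{
  "mcpServers": {
    "lakeflow": {
      "command": "uv",
      "args": ["run", "--quiet", "--directory", "/path/to/lakeflow-mcp",
               "python", "lakeflow.py"],
      "env": {
        "DATABRICKS_HOST": "https://example.cloud.databricks.com",
        "DATABRICKS_TOKEN": "<your token>"
      }
    }
  }
}
""")

# The placeholder token is flagged until a real token is filled in.
print(validate_mcp_config(config, "lakeflow"))
```

A check like this catches the most common setup failure, launching the server with the token placeholder still in place, before any Databricks call is attempted.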
Prerequisites you need before configuring the MCP server:
- Access to Databricks for Lakeflow execution
- A working environment where you can store MCP configuration at the expected path
- A client that can communicate with the MCP server (the MCP client is driven by the provided commands)
# Step 1: Ensure you have access to Databricks
# Step 2: Prepare MCP configuration for Lakeflow in your MCP client
# Step 3: Place the MCP config at the expected path (see below)

The MCP server uses a JSON configuration to define how to launch Lakeflow as a local process. It specifies the runtime command, its arguments, and the environment variables required to connect to Databricks. You will manage jobs by creating a job from source, triggering runs with different argument sets, and monitoring run state and logs.
{
  "mcpServers": {
    "lakeflow": {
      "command": "uv",
      "args": [
        "run",
        "--quiet",
        "--directory",
        "/path/to/lakeflow-mcp",
        "python",
        "lakeflow.py"
      ],
      "env": {
        "DATABRICKS_HOST": "https://hims-machine-learning-staging-workspace.cloud.databricks.com",
        "DATABRICKS_TOKEN": "<your token>"
      }
    }
  }
}

With the MCP server configured, you can instruct the agent to launch and manage Lakeflow runs. For example, you can request multiple copies of the same Lakeflow job to run concurrently, each with distinct arguments, and then gather their logs and results as needed.
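As a rough illustration, the fan-out of concurrent runs with distinct arguments could look like the sketch below. `trigger_run` is a stand-in for the server's run-trigger tool; its name and signature are assumptions, not the server's actual API.

```python
from concurrent.futures import ThreadPoolExecutor

def trigger_run(job_id: str, args: dict) -> dict:
    # Stand-in for the MCP run-trigger tool: a real implementation would
    # submit the run to Lakeflow; here we just echo what would be sent so
    # the fan-out logic itself is visible.
    return {"job_id": job_id, "args": args, "state": "PENDING"}

def spawn_runs(job_id: str, per_run_args: list[dict], max_workers: int = 4) -> list[dict]:
    """Submit one run per argument set, in parallel, and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(trigger_run, job_id, a) for a in per_run_args]
        return [f.result() for f in futures]

# Launch three copies of the same (hypothetical) job, one per data shard.
runs = spawn_runs("job-123", [{"shard": i} for i in range(3)])
for run in runs:
    print(run["args"], run["state"])
```

The per-run argument dict is what makes each copy of the job distinct; everything else about the job definition is shared.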
The server exposes four operations:
- Builds, uploads, and prepares a Lakeflow job from your local source, returning a job ID for subsequent actions.
- Starts one or more parallel runs of the prepared Lakeflow job with specified arguments.
- Lists all runs associated with a given Lakeflow job ID, showing status and metadata.
- Retrieves logs for a specific run, enabling debugging and monitoring.
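The run-listing and log-retrieval operations described above compose naturally into a monitor loop: poll until every run reaches a terminal state, then collect logs. The sketch below uses stubbed stand-ins (`list_runs`, `get_run_logs` are hypothetical names, and the stub pretends runs finish after two polls) purely to show the loop's shape.

```python
import time

def list_runs(job_id: str, _state={"tick": 0}) -> list[dict]:
    # Stub with deliberate mutable-default state: pretend each run
    # transitions RUNNING -> SUCCESS after two polls.
    _state["tick"] += 1
    done = _state["tick"] >= 2
    return [{"run_id": f"{job_id}-r{i}", "state": "SUCCESS" if done else "RUNNING"}
            for i in range(2)]

def get_run_logs(run_id: str) -> str:
    return f"logs for {run_id}"  # stub for the log-retrieval tool

def wait_for_runs(job_id: str, poll_seconds: float = 0.01) -> dict:
    """Poll until every run is terminal, then fetch logs keyed by run ID."""
    while True:
        runs = list_runs(job_id)
        if all(r["state"] == "SUCCESS" for r in runs):
            return {r["run_id"]: get_run_logs(r["run_id"]) for r in runs}
        time.sleep(poll_seconds)

logs = wait_for_runs("job-123")
print(sorted(logs))
```

A real loop would also handle failed and cancelled terminal states rather than waiting only for success; the single-state check here keeps the sketch short.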