This read-only MCP Server allows you to connect to Apache Spark data from Claude Desktop through CData JDBC Drivers. For full CRUD support, check out the first managed MCP platform: CData Connect AI (https://www.cdata.com/ai/).
Configuration
{
  "mcpServers": {
    "cdatasoftware-apache-spark-mcp-server-by-cdata": {
      "url": "https://mcp.example.com/mcp"
    }
  }
}
You run a local, read-only MCP server that exposes live Apache Spark data through the CData JDBC Driver. This lets you ask natural language questions and retrieve up-to-date Spark data without writing SQL, while keeping data access isolated to a simple, secure interface.
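The entry above points at a URL, whereas the server described on this page is launched locally over stdio. As a rough sketch, a stdio-style entry for a client such as Claude Desktop usually names a command and its arguments instead. The snippet below builds one in Python. The "command"/"args" layout is an assumption based on Claude Desktop's config format, and the paths are the same placeholders used in the run command later on this page.

```python
import json


def stdio_entry(jar_path: str, prp_path: str, java: str = "java") -> dict:
    """Build an mcpServers entry that launches the server as a child process.

    Assumption: the client accepts the "command"/"args" layout used by
    Claude Desktop for stdio servers; other clients may differ.
    """
    return {
        "mcpServers": {
            "cdatasoftware-apache-spark-mcp-server-by-cdata": {
                "command": java,
                "args": ["-jar", jar_path, prp_path],
            }
        }
    }


if __name__ == "__main__":
    # Placeholder paths; replace with the real locations on your machine.
    entry = stdio_entry(
        "/PATH/TO/CDataMCP-jar-with-dependencies.jar",
        "/PATH/TO/apache-spark.prp",
    )
    print(json.dumps(entry, indent=2))
```

The printed JSON can be merged into the client's configuration file by hand; check your client's documentation for the exact file location and accepted keys.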
You will connect an MCP client to the local server and start asking questions about your Spark data. The server exposes a small set of tools that let you discover available tables and columns, then run read-only queries. Use natural language to request data, for example asking about correlations, counts, or upcoming events. The client will invoke the built-in tools behind the scenes, so you don’t need to craft SQL manually.
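To make "behind the scenes" concrete: MCP clients invoke tools by sending JSON-RPC 2.0 requests with the method tools/call. The sketch below builds such a request. The tool name run_query, its sql argument, and the Customers table are hypothetical placeholders, not names confirmed by this page; a real client reads the actual tool names and argument schemas from the server's tools/list response.

```python
import json
from itertools import count

_ids = count(1)  # each request in a session gets a fresh JSON-RPC id


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


if __name__ == "__main__":
    # Hypothetical tool and argument names, for illustration only.
    print(tool_call("run_query", {"sql": "SELECT * FROM Customers LIMIT 5"}))
```

In normal use the MCP client generates these messages for you; the sketch only shows what a natural-language request is eventually translated into.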
Prerequisites: install a Java runtime (JRE or JDK) and Maven before you begin. Maven is used to build the MCP server, and building from source generally requires a JDK rather than a JRE alone.
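A quick way to confirm both prerequisites are reachable is to look for their executables on PATH. This is a minimal sketch, assuming the commands are named java and mvn on your system; it only checks that they can be found, not which versions are installed.

```python
import shutil


def missing_tools(tools=("java", "mvn")) -> list:
    """Return the names of required executables that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("java and mvn found on PATH")
```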
1. Clone the MCP server repository and navigate into the project folder.
git clone https://github.com/cdatasoftware/apache-spark-mcp-server-by-cdata.git
cd apache-spark-mcp-server-by-cdata
2. Build the MCP server package to produce the runnable JAR.
mvn clean install
3. Obtain and install the CData JDBC Driver for Apache Spark to enable Spark access.
4. License the JDBC Driver to enable driver usage.
# Example commands shown in setup flow
# Locate the driver and license it as part of your installation
# Directory paths will vary by OS
5. Configure the JDBC connection to your Spark data source. Launch the driver's connection string utility with the command below, test the connection, and copy the final connection string.
java -jar cdata.jdbc.sparksql.jar
6. Create a .prp file (for example, apache-spark.prp) with the properties the MCP server needs to expose the connection. Include the server name, driver details, and the JDBC URL you copied.
Prefix=sparksql
ServerName=CDataSparkSQL
ServerVersion=1.0
DriverPath=PATH\\TO\\cdata.jdbc.sparksql.jar
DriverClass=cdata.jdbc.sparksql.SparkSQLDriver
JdbcUrl=jdbc:sparksql:InitiateOAuth=GETANDREFRESH;
Tables=
Run the MCP server on the same machine as the client. The server operates in stdio mode, so the client must run on the same host.
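Before launching, a quick sanity check of the .prp file can catch a missing property. The sketch below treats the file as plain key=value lines and checks for the keys shown in the example above. It assumes every one of those keys except Tables is required, which this page implies but does not state, and it does not handle the full Java properties syntax (escapes, line continuations).

```python
REQUIRED_KEYS = ("Prefix", "ServerName", "ServerVersion",
                 "DriverPath", "DriverClass", "JdbcUrl")


def parse_prp(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":
            continue
        # Split on the first "=" only, so JDBC URLs keep their own "=" signs.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_keys(props: dict) -> list:
    """Return assumed-required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not props.get(key)]


if __name__ == "__main__":
    # DriverPath is deliberately left out of this sample.
    sample = (
        "Prefix=sparksql\n"
        "ServerName=CDataSparkSQL\n"
        "ServerVersion=1.0\n"
        "DriverClass=cdata.jdbc.sparksql.SparkSQLDriver\n"
        "JdbcUrl=jdbc:sparksql:InitiateOAuth=GETANDREFRESH;\n"
        "Tables=\n"
    )
    print("Missing keys:", missing_keys(parse_prp(sample)))
```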
Start the server, passing the .prp file as its argument:
java -jar /PATH/TO/CDataMCP-jar-with-dependencies.jar /PATH/TO/apache-spark.prp
Once running, the server exposes three tools:
One tool retrieves a list of tables available in the data source. It returns CSV with a header row of column names.
Another retrieves a list of columns for a specified table. It returns CSV with a header row of column names.
A third executes a SQL SELECT query against the Spark data source and returns the results.
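Because the table and column tools return CSV with a header row, a client can turn a result into records with Python's standard csv module. The sample payload below is invented for illustration; the actual column names come from the driver.

```python
import csv
import io


def parse_tool_csv(payload: str) -> list:
    """Convert CSV text (first row = column names) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(payload)))


if __name__ == "__main__":
    # Invented sample; a real response carries the driver's own columns.
    sample = "Catalog,Schema,Table\nCData,Spark,Customers\nCData,Spark,Orders\n"
    for row in parse_tool_csv(sample):
        print(row["Table"])
```

Each row becomes a dictionary keyed by the header names, so later code can refer to fields by name instead of by position.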