
This skill helps you implement Vulkan compute workflows by generating shaders, compiling to SPIR-V, and configuring pipelines, bindings, and synchronization.

npx playbooks add skill a5c-ai/babysitter --skill vulkan-compute

---
name: vulkan-compute
description: Vulkan compute shader development and pipeline configuration. Generate GLSL/HLSL compute shaders, compile to SPIR-V, configure compute pipelines, manage descriptor sets and resource bindings, implement memory barriers and synchronization.
allowed-tools: Bash(*) Read Write Edit Glob Grep WebFetch
metadata:
  author: babysitter-sdk
  version: "1.0.0"
  category: compute-shaders
  backlog-id: SK-004
---

# vulkan-compute

You are **vulkan-compute** - a specialized skill for Vulkan compute shader development and pipeline configuration. This skill provides expert capabilities for GPU compute using the Vulkan API.

## Overview

This skill supports the following Vulkan compute operations:
- Generate GLSL/HLSL compute shaders
- Compile shaders to SPIR-V bytecode
- Configure Vulkan compute pipelines
- Manage descriptor sets and resource bindings
- Handle push constants and specialization constants
- Configure workgroup dimensions and dispatch
- Implement memory barriers and synchronization
- Support Vulkan validation layers for debugging

## Prerequisites

- Vulkan SDK 1.3+
- glslangValidator or glslc (SPIR-V compiler)
- SPIRV-Tools (optional)
- Vulkan-capable GPU

## Capabilities

### 1. GLSL Compute Shader Generation

Generate GLSL compute shaders:

```glsl
#version 450

// Workgroup size specification
layout(local_size_x = 256, local_size_y = 1, local_size_z = 1) in;

// Buffer bindings
layout(set = 0, binding = 0) readonly buffer InputBuffer {
    float inputData[];
};

layout(set = 0, binding = 1) writeonly buffer OutputBuffer {
    float outputData[];
};

// Push constants for runtime parameters
layout(push_constant) uniform PushConstants {
    uint dataSize;
    float multiplier;
} pc;

void main() {
    uint gid = gl_GlobalInvocationID.x;

    if (gid < pc.dataSize) {
        outputData[gid] = inputData[gid] * pc.multiplier;
    }
}
```
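
To check GPU results for the shader above, it can help to keep a host-side mirror of the push constant block and a CPU reference of the same computation. The following is a minimal sketch, not part of the skill itself: `PushConstants` matches the name used in the later C snippets, while `scale_reference` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side mirror of the shader's push_constant block.
   Both members are 4-byte scalars, so the C layout matches the
   GLSL offsets 0 and 4 without padding. */
typedef struct PushConstants {
    uint32_t dataSize;
    float    multiplier;
} PushConstants;

_Static_assert(sizeof(PushConstants) == 8, "push constant block must be 8 bytes");
_Static_assert(offsetof(PushConstants, multiplier) == 4, "multiplier must sit at offset 4");

/* CPU reference of the kernel (hypothetical helper):
   out[i] = in[i] * multiplier for every i < dataSize. */
static void scale_reference(const float *in, float *out, const PushConstants *pc) {
    for (uint32_t i = 0; i < pc->dataSize; ++i) {
        out[i] = in[i] * pc->multiplier;
    }
}
```

Comparing the mapped output buffer against `scale_reference` after a dispatch is a cheap way to catch binding or bounds mistakes early.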

### 2. SPIR-V Compilation

Compile shaders to SPIR-V:

```bash
# Using glslangValidator (-S sets the stage, since a bare .glsl extension does not imply one)
glslangValidator -V -S comp compute.glsl -o compute.spv

# Using glslc (Google's compiler)
glslc -fshader-stage=compute compute.glsl -o compute.spv

# With optimization (the stage flag is still required for a .glsl file)
glslc -O -fshader-stage=compute compute.glsl -o compute.spv

# Generate human-readable SPIR-V
spirv-dis compute.spv -o compute.spvasm

# Validate SPIR-V
spirv-val compute.spv

# Optimize SPIR-V
spirv-opt -O compute.spv -o compute_opt.spv
```
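
Once `compute.spv` exists, the application has to read it into memory before creating a shader module. The sketch below shows one way to do that with basic sanity checks; `load_spirv` is a hypothetical helper name, and the checks (whole 32-bit words, at least a 5-word header, the SPIR-V magic number `0x07230203`) are a cheap pre-filter rather than a substitute for `spirv-val`.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define SPIRV_MAGIC 0x07230203u

/* Reads a whole SPIR-V file into a malloc'd word array (hypothetical helper).
   Returns NULL on failure. On success, *size_bytes holds the byte count,
   which is the value expected in VkShaderModuleCreateInfo::codeSize.
   The caller owns the returned memory and must free() it. */
static uint32_t *load_spirv(const char *path, size_t *size_bytes) {
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;

    uint32_t *words = NULL;
    if (fseek(f, 0, SEEK_END) == 0) {
        long len = ftell(f);
        /* A module is a sequence of 32-bit words with a 5-word header. */
        if (len >= 20 && len % 4 == 0 && fseek(f, 0, SEEK_SET) == 0) {
            words = malloc((size_t)len);
            if (words && fread(words, 1, (size_t)len, f) != (size_t)len) {
                free(words);
                words = NULL;
            }
            /* Reject files that do not start with the magic number
               (this also rejects byte-swapped modules). */
            if (words && words[0] != SPIRV_MAGIC) {
                free(words);
                words = NULL;
            }
            if (words) *size_bytes = (size_t)len;
        }
    }
    fclose(f);
    return words;
}
```

The returned pointer and size map directly onto the `pCode` and `codeSize` fields used when creating the shader module.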

### 3. Compute Pipeline Creation

Create Vulkan compute pipelines:

```c
// Load SPIR-V shader
VkShaderModuleCreateInfo shaderInfo = {
    .sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO,
    .codeSize = spirvSize,
    .pCode = spirvCode
};
VkShaderModule shaderModule;
vkCreateShaderModule(device, &shaderInfo, NULL, &shaderModule);

// Pipeline layout with descriptor set and push constants
VkPushConstantRange pushConstantRange = {
    .stageFlags = VK_SHADER_STAGE_COMPUTE_BIT,
    .offset = 0,
    .size = sizeof(PushConstants)
};

VkPipelineLayoutCreateInfo layoutInfo = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
    .setLayoutCount = 1,
    .pSetLayouts = &descriptorSetLayout,
    .pushConstantRangeCount = 1,
    .pPushConstantRanges = &pushConstantRange
};
VkPipelineLayout pipelineLayout;
vkCreatePipelineLayout(device, &layoutInfo, NULL, &pipelineLayout);

// Create compute pipeline
VkComputePipelineCreateInfo pipelineInfo = {
    .sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
    .stage = {
        .sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
        .stage = VK_SHADER_STAGE_COMPUTE_BIT,
        .module = shaderModule,
        .pName = "main"
    },
    .layout = pipelineLayout
};
VkPipeline computePipeline;
vkCreateComputePipelines(device, VK_NULL_HANDLE, 1, &pipelineInfo, NULL, &computePipeline);
```

### 4. Descriptor Set Management

Configure resource bindings:

```c
// Descriptor set layout
VkDescriptorSetLayoutBinding bindings[] = {
    {
        .binding = 0,
        .descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
        .descriptorCount = 1,
        .stageFlags = VK_SHADER_STAGE_COMPUTE_BIT
    },
    {
        .binding = 1,
        .descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
        .descriptorCount = 1,
        .stageFlags = VK_SHADER_STAGE_COMPUTE_BIT
    }
};

VkDescriptorSetLayoutCreateInfo layoutInfo = {
    .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
    .bindingCount = 2,
    .pBindings = bindings
};
VkDescriptorSetLayout descriptorSetLayout;
vkCreateDescriptorSetLayout(device, &layoutInfo, NULL, &descriptorSetLayout);

// Update the descriptor set (descriptorSet is assumed to be already allocated from a descriptor pool)
VkDescriptorBufferInfo inputBufferInfo = {
    .buffer = inputBuffer,
    .offset = 0,
    .range = VK_WHOLE_SIZE
};

VkDescriptorBufferInfo outputBufferInfo = {
    .buffer = outputBuffer,
    .offset = 0,
    .range = VK_WHOLE_SIZE
};

VkWriteDescriptorSet writes[] = {
    {
        .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
        .dstSet = descriptorSet,
        .dstBinding = 0,
        .descriptorCount = 1,
        .descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
        .pBufferInfo = &inputBufferInfo
    },
    {
        .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
        .dstSet = descriptorSet,
        .dstBinding = 1,
        .descriptorCount = 1,
        .descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
        .pBufferInfo = &outputBufferInfo
    }
};
vkUpdateDescriptorSets(device, 2, writes, 0, NULL);
```

### 5. Specialization Constants

Customize shader constants at pipeline creation time, without recompiling the shader source:

```glsl
// In shader
layout(constant_id = 0) const uint WORKGROUP_SIZE = 256;
layout(constant_id = 1) const bool USE_FAST_MATH = false;

layout(local_size_x_id = 0) in;
```

```c
// In C code
VkSpecializationMapEntry entries[] = {
    {0, 0, sizeof(uint32_t)},  // WORKGROUP_SIZE
    {1, sizeof(uint32_t), sizeof(VkBool32)}  // USE_FAST_MATH
};

struct {
    uint32_t workgroupSize;
    VkBool32 useFastMath;
} specData = {512, VK_TRUE};

VkSpecializationInfo specInfo = {
    .mapEntryCount = 2,
    .pMapEntries = entries,
    .dataSize = sizeof(specData),
    .pData = &specData
};

// Use in pipeline creation
pipelineInfo.stage.pSpecializationInfo = &specInfo;
```
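
The map entries above hard-code their offsets. A slightly more robust variant derives them from the data struct with `offsetof`, so the entries stay correct if members are added or reordered. This is a sketch only: `Bool32` and `MapEntry` are stand-ins that mirror `VkBool32` and `VkSpecializationMapEntry` so the example builds without the Vulkan headers, and `SpecData` and `kSpecEntries` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins mirroring VkBool32 and VkSpecializationMapEntry
   (assumption: real code would use the types from <vulkan/vulkan.h>). */
typedef uint32_t Bool32;
typedef struct MapEntry {
    uint32_t constant_id;
    uint32_t offset;
    size_t   size;
} MapEntry;

/* Backing storage for the two specialization constants. */
typedef struct SpecData {
    uint32_t workgroup_size;  /* constant_id = 0 */
    Bool32   use_fast_math;   /* constant_id = 1, booleans are 4 bytes wide */
} SpecData;

/* Offsets come from the struct layout instead of manual arithmetic. */
static const MapEntry kSpecEntries[] = {
    {0, offsetof(SpecData, workgroup_size), sizeof(uint32_t)},
    {1, offsetof(SpecData, use_fast_math),  sizeof(Bool32)},
};
```

With the real Vulkan types, the same initializer shape applies to a `VkSpecializationMapEntry` array.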

### 6. Compute Dispatch

Execute compute work:

```c
// Record command buffer
vkCmdBindPipeline(commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipeline);
vkCmdBindDescriptorSets(commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE,
    pipelineLayout, 0, 1, &descriptorSet, 0, NULL);
vkCmdPushConstants(commandBuffer, pipelineLayout, VK_SHADER_STAGE_COMPUTE_BIT,
    0, sizeof(PushConstants), &pushConstants);

// Dispatch: round up so every element is covered (256 must match the shader's local_size_x)
uint32_t groupCountX = (dataSize + 255) / 256;
vkCmdDispatch(commandBuffer, groupCountX, 1, 1);

// Indirect dispatch
vkCmdDispatchIndirect(commandBuffer, indirectBuffer, 0);
```
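
The group count passed to `vkCmdDispatch` is a rounded-up division of the element count by the local size. The sketch below factors that into two small helpers with hypothetical names (`group_count`, `dispatch_fits`); the limit argument is assumed to come from `VkPhysicalDeviceLimits::maxComputeWorkGroupCount[0]`, which the caller queries.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of workgroups needed to cover `count` items: ceil(count / local_size).
   Written without `count + local_size - 1` so it cannot overflow uint32_t.
   Returns 0 when there is nothing to dispatch or local_size is invalid. */
static uint32_t group_count(uint32_t count, uint32_t local_size) {
    if (count == 0 || local_size == 0) return 0;
    return (count - 1) / local_size + 1;
}

/* True when the X group count stays within the device's per-dimension limit
   (maxComputeWorkGroupCount[0], supplied by the caller). */
static bool dispatch_fits(uint32_t groups, uint32_t max_group_count_x) {
    return groups <= max_group_count_x;
}
```

When `dispatch_fits` returns false, the work can be split across several dispatches or spread over the Y and Z dimensions.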

### 7. Memory Barriers and Synchronization

Synchronize shader writes with later shader or transfer reads:

```c
// Buffer memory barrier
VkBufferMemoryBarrier barrier = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER,
    .srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT,
    .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,
    .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .buffer = buffer,
    .offset = 0,
    .size = VK_WHOLE_SIZE
};

vkCmdPipelineBarrier(commandBuffer,
    VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
    VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
    0, 0, NULL, 1, &barrier, 0, NULL);

// Memory barrier for compute-to-transfer
VkMemoryBarrier memoryBarrier = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
    .srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT,
    .dstAccessMask = VK_ACCESS_TRANSFER_READ_BIT
};

vkCmdPipelineBarrier(commandBuffer,
    VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
    VK_PIPELINE_STAGE_TRANSFER_BIT,
    0, 1, &memoryBarrier, 0, NULL, 0, NULL);
```

### 8. Validation Layers

Debug with validation:

```c
// Enable validation layers
const char* validationLayers[] = {
    "VK_LAYER_KHRONOS_validation"
};

VkInstanceCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
    .enabledLayerCount = 1,
    .ppEnabledLayerNames = validationLayers
};

// Debug messenger setup (requires the VK_EXT_debug_utils instance extension)
VkDebugUtilsMessengerCreateInfoEXT debugInfo = {
    .sType = VK_STRUCTURE_TYPE_DEBUG_UTILS_MESSENGER_CREATE_INFO_EXT,
    .messageSeverity = VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT |
                       VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT,
    .messageType = VK_DEBUG_UTILS_MESSAGE_TYPE_VALIDATION_BIT_EXT |
                   VK_DEBUG_UTILS_MESSAGE_TYPE_PERFORMANCE_BIT_EXT,
    .pfnUserCallback = debugCallback
};
```

## Process Integration

This skill integrates with the following processes:
- `compute-shader-development.js` - Compute shader workflows

## Output Format

```json
{
  "operation": "compile-shader",
  "status": "success",
  "input": "compute.glsl",
  "output": "compute.spv",
  "spirv_size": 1024,
  "workgroup_size": [256, 1, 1],
  "bindings": [
    {"binding": 0, "type": "storage_buffer", "access": "readonly"},
    {"binding": 1, "type": "storage_buffer", "access": "writeonly"}
  ],
  "push_constants_size": 8,
  "artifacts": ["compute.spv", "compute.spvasm"]
}
```

## Dependencies

- Vulkan SDK 1.3+
- glslangValidator or glslc
- SPIRV-Tools (optional)

## Constraints

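- Workgroup size is limited by the device (`maxComputeWorkGroupInvocations`: at least 128 is guaranteed, 1024 is common on desktop GPUs)
- The number of bound descriptor sets is limited (`maxBoundDescriptorSets`: at least 4 is guaranteed)
- Push constant size is limited (`maxPushConstantsSize`: at least 128 bytes is guaranteed)
- The SPIR-V version must be one the target Vulkan version accepts (Vulkan 1.0: SPIR-V 1.0; 1.1: up to SPIR-V 1.3; 1.2: up to 1.5; 1.3: up to 1.6)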
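
These limits can be checked up front before creating a pipeline. The following is a minimal sketch under stated assumptions: `ComputeLimits` is a hypothetical stand-in for the relevant fields of `VkPhysicalDeviceLimits` (so the example builds without the Vulkan headers), and `kSpecMinimums` holds the required minimum values from the Vulkan 1.0 through 1.3 limits table; a real application would fill the struct from `vkGetPhysicalDeviceProperties`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the compute-related fields of VkPhysicalDeviceLimits. */
typedef struct ComputeLimits {
    uint32_t max_work_group_size[3];      /* maxComputeWorkGroupSize */
    uint32_t max_work_group_invocations;  /* maxComputeWorkGroupInvocations */
    uint32_t max_push_constants_size;     /* maxPushConstantsSize */
} ComputeLimits;

/* Required minimums (Vulkan 1.0 to 1.3); devices usually report more. */
static const ComputeLimits kSpecMinimums = { {128, 128, 64}, 128, 128 };

/* True when a local size fits both the per-dimension limits and the
   total-invocation limit. */
static bool local_size_ok(const ComputeLimits *lim,
                          uint32_t x, uint32_t y, uint32_t z) {
    if (x == 0 || y == 0 || z == 0) return false;
    if (x > lim->max_work_group_size[0] ||
        y > lim->max_work_group_size[1] ||
        z > lim->max_work_group_size[2]) return false;
    /* Multiply stepwise in 64 bits so the product cannot overflow. */
    uint64_t xy = (uint64_t)x * y;
    if (xy > lim->max_work_group_invocations) return false;
    return xy * z <= lim->max_work_group_invocations;
}

/* True when a push constant block of `size` bytes is a multiple of 4
   and fits within the device limit. */
static bool push_constants_ok(const ComputeLimits *lim, uint32_t size) {
    return size % 4 == 0 && size <= lim->max_push_constants_size;
}
```

Note that the 256-wide workgroup used throughout this skill exceeds the guaranteed minimum of 128 invocations, so it relies on the device reporting a larger limit.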

Overview

This skill provides expert assistance for Vulkan compute shader development and pipeline configuration. It helps generate GLSL/HLSL compute shaders, compile them to SPIR-V, and set up Vulkan compute pipelines with correct descriptor bindings, push/specialization constants, and synchronization. The goal is reliable, debuggable GPU compute workflows ready for production or research use.

How this skill works

The skill generates compute shader code and compilation commands, then produces SPIR-V artifacts using glslangValidator or glslc. It outputs concrete Vulkan setup snippets for pipeline creation, descriptor set layouts, push and specialization constant wiring, and dispatch patterns. It also provides memory barrier and validation-layer guidance to ensure correct execution and easy debugging.

When to use it

  • Developing or prototyping GPU compute kernels for data-parallel workloads
  • Creating or updating Vulkan compute pipelines and descriptor layouts
  • Converting GLSL/HLSL shader source to SPIR-V with optimization and validation
  • Integrating push constants, specialization constants, or indirect dispatch into compute workflows
  • Diagnosing synchronization issues or validation layer errors

Best practices

  • Match SPIR-V version and capabilities to the target Vulkan runtime and device limits
  • Prefer specialization constants for compile-time tunables and push constants for small runtime parameters
  • Use validation layers during development and enable debug callbacks to catch synchronization mistakes
  • Choose workgroup sizes respecting device limits and memory access alignment to maximize throughput
  • Always insert precise memory barriers when crossing shader/transfer/host stages and prefer fine-grained buffer barriers

Example use cases

  • Generate a GLSL compute shader that multiplies an input buffer by a runtime multiplier and compile to compute.spv
  • Create pipeline layout with descriptor set for input/output storage buffers and a push-constant range for parameter passing
  • Use specialization constants to tune workgroup size at pipeline creation without re-compiling shader source
  • Record command buffer that binds pipeline, descriptor set, pushes constants, and dispatches compute work (including indirect dispatch)
  • Insert buffer memory barriers to synchronize shader writes before a transfer read or host-visible readback

FAQ

Which compiler should I use to produce SPIR-V?

Use glslangValidator or glslc; both produce correct SPIR-V. Use spirv-opt for extra optimization and spirv-val to validate output.

How do I pick workgroup sizes?

Start with power-of-two local sizes that are multiples of the device's subgroup size (commonly 32 or 64), keep total local invocations under the device limit (commonly 1024), and benchmark candidate sizes on the target hardware.