This skill helps secure AI/ML infrastructure by protecting APIs, model storage, and compute resources with defense-in-depth practices.
Install: `npx playbooks add skill pluginagentmarketplace/custom-plugin-ai-red-teaming --skill infrastructure-security`
---
name: infrastructure-security
version: "2.0.0"
description: Securing AI/ML infrastructure including model storage, API endpoints, and compute resources
sasmp_version: "1.3.0"
bonded_agent: 06-api-security-tester
bond_type: PRIMARY_BOND
# Schema Definitions
input_schema:
  type: object
  required: [assessment_scope]
  properties:
    assessment_scope:
      type: string
      enum: [api, storage, compute, network, full]
    environment:
      type: string
      enum: [cloud, on_prem, hybrid]
output_schema:
  type: object
  properties:
    vulnerabilities:
      type: array
    compliance_status:
      type: object
    recommendations:
      type: array
# Framework Mappings
owasp_llm_2025: [LLM03, LLM10]
nist_ai_rmf: [Govern, Manage]
---
# AI Infrastructure Security
Protect **AI/ML infrastructure** from attacks targeting model storage, APIs, and compute resources.
## Quick Reference
```yaml
Skill: infrastructure-security
Agent: 06-api-security-tester
OWASP: LLM03 (Supply Chain), LLM10 (Unbounded Consumption)
NIST: Govern, Manage
Use Case: Secure AI deployment infrastructure
```
## Infrastructure Attack Surface
```
[External Threats]
↓
[API Gateway]  →  [Load Balancer]   →  [Inference Servers]
      ↓                  ↓                      ↓
[Rate Limit]      [DDoS Protection]      [Model Storage]
      ↓                  ↓                      ↓
[Auth/AuthZ]      [TLS Termination]      [Secrets Manager]
```
## Security Layers
### 1. API Security
```yaml
Authentication:
  methods:
    - API keys (rotation: 90 days)
    - OAuth 2.0 / OIDC
    - mTLS for service-to-service
  requirements:
    - Strong key generation
    - Secure transmission
    - Revocation capability

Rate Limiting:
  per_user: 100 req/min
  per_ip: 1000 req/min
  burst: 50
  cost_based: true  # Token-aware limiting

Input Validation:
  max_length: 4096 tokens
  content_type: application/json
  schema_validation: strict
  encoding: UTF-8 normalized
```
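The input validation settings above could be enforced along these lines. This is a minimal sketch, not the skill's actual implementation: the function name `validate_request`, the single-field `prompt` schema, and the whitespace-based token counter are all assumptions made for illustration. "UTF-8 normalized" is read here as strict UTF-8 decoding followed by NFC normalization.

```python
import json
import unicodedata

MAX_TOKENS = 4096  # max_length from the config above


def validate_request(body: bytes, content_type: str, count_tokens=None) -> dict:
    """Return the cleaned payload, or raise ValueError if any check fails."""
    # content_type: application/json (parameters such as charset are ignored)
    if content_type.split(";")[0].strip().lower() != "application/json":
        raise ValueError("unsupported content type")
    try:
        text = body.decode("utf-8")  # strict decoding rejects invalid UTF-8
    except UnicodeDecodeError as exc:
        raise ValueError("body is not valid UTF-8") from exc
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError("body is not valid JSON") from exc
    # Strict schema (hypothetical): exactly one string field named "prompt".
    if not isinstance(payload, dict) or set(payload) != {"prompt"}:
        raise ValueError("unexpected or missing fields")
    if not isinstance(payload["prompt"], str):
        raise ValueError("prompt must be a string")
    prompt = unicodedata.normalize("NFC", payload["prompt"])
    # Crude stand-in tokenizer; pass the model's real tokenizer in practice.
    counter = count_tokens or (lambda s: len(s.split()))
    if counter(prompt) > MAX_TOKENS:
        raise ValueError("prompt exceeds token limit")
    return {"prompt": prompt}
```

Rejecting unknown fields outright, rather than silently dropping them, is one reading of `schema_validation: strict`; a deployment with a richer request schema would likely use a schema library instead.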
```python
# API Security Configuration
class APISecurityConfig:
    def __init__(self):
        self.auth_config = {
            'type': 'oauth2',
            'token_expiry': 3600,
            'refresh_enabled': True,
        }
        # 'window' is in seconds, so 100 requests per 60 s matches 100 req/min
        self.rate_limits = {
            'default': {'requests': 100, 'window': 60},
            'premium': {'requests': 1000, 'window': 60},
            'burst_multiplier': 2,
        }
        self.input_validation = {
            'max_tokens': 4096,
            'blocked_patterns': self._load_blocked_patterns(),
            'sanitization': True,
        }

    def _load_blocked_patterns(self):
        # Placeholder: load deny-list patterns from your own policy source
        return []
```
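One common way to enforce limits like these is a token bucket. The sketch below is an assumption about how the configuration might be applied, not something the skill prescribes: the `TokenBucket` class is hypothetical, and `burst_multiplier` is interpreted as scaling the bucket capacity. Passing a `cost` greater than 1 gives a simple form of the token-aware (`cost_based`) limiting mentioned in the YAML.

```python
import time


class TokenBucket:
    """Token bucket: `rate` units refill per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested
        self.level = capacity       # start with a full bucket
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Spend `cost` units if available; return whether the call is allowed."""
        now = self.clock()
        elapsed = now - self.updated
        self.level = min(self.capacity, self.level + elapsed * self.rate)
        self.updated = now
        if cost <= self.level:
            self.level -= cost
            return True
        return False


def bucket_for(tier: dict, burst_multiplier: float = 2) -> TokenBucket:
    """Build a bucket from a tier such as {'requests': 100, 'window': 60}."""
    rate = tier["requests"] / tier["window"]
    # Assumption: the burst multiplier scales how much can be spent at once.
    return TokenBucket(rate=rate, capacity=tier["requests"] * burst_multiplier)
```

For cost-based limiting, a caller might charge `bucket.allow(cost=prompt_tokens / 100)` so that long prompts drain the bucket faster than short ones; the divisor is arbitrary and would need tuning.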
### 2. Model Protection
```yaml
Storage Security:
  encryption: AES-256-GCM
  access_control: RBAC
  audit_logging: enabled
  backup: encrypted, offsite

Theft Prevention:
  query_limits: 10000/day per user
  output_perturbation: enabled
  watermarking: model and output
  access_logging: all queries
```
```python
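class RateLimitError(Exception):
    """Raised when a user exceeds the allowed query volume."""


class ModelProtection:
    # The helper methods used below (log_query, exceeds_limit,
    # add_perturbation, apply_watermark, _generate_watermark) are
    # deployment-specific and must be supplied by the implementer.
    def __init__(self, model):
        self.model = model
        self.watermark = self._generate_watermark()

    def protected_inference(self, input_data, user_id):
        # Log the query
        self.log_query(user_id, input_data)

        # Check query limits
        if self.exceeds_limit(user_id):
            raise RateLimitError("Query limit exceeded")

        # Run inference
        output = self.model(input_data)

        # Add output perturbation (anti-extraction)
        output = self.add_perturbation(output)

        # Apply watermark
        output = self.apply_watermark(output)
        return output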
```
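The skill does not say how `add_perturbation` works. One plausible approach for a classifier that returns class probabilities is to add small noise and round the scores, which coarsens the signal available to an extraction attack while leaving the predicted label alone. The function below is a sketch under that assumption; its name, the `scale` and `decimals` parameters, and the list-of-floats output format are all hypothetical.

```python
import random


def perturb_probabilities(probs, scale=0.01, decimals=2, rng=random):
    """Add bounded noise to a probability vector, renormalize, and round.

    Assumes `probs` sums to roughly 1 and `scale` is small relative to
    the gap between the top class and the rest.
    """
    top = max(range(len(probs)), key=probs.__getitem__)
    noisy = [max(p + rng.uniform(-scale, scale), 0.0) for p in probs]
    # If noise reordered the classes, push the original top class back
    # in front so the predicted label is unchanged before rounding.
    if max(range(len(noisy)), key=noisy.__getitem__) != top:
        noisy[top] = max(noisy) + scale
    total = sum(noisy)
    if total == 0:
        return list(probs)
    return [round(p / total, decimals) for p in noisy]
```

Larger `scale` or fewer `decimals` leak less per query but make the reported confidences less useful to legitimate callers, which is the fidelity trade-off this layer has to tune.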
### 3. Network Security
```yaml
Network Configuration:
  internal_only: true
  vpc_isolation: enabled
  firewall_rules:
    - allow: internal_services
    - deny: all_external (except API gateway)

TLS Configuration:
  version: "1.3"
  cipher_suites: [TLS_AES_256_GCM_SHA384]
  certificate_rotation: 90 days
  mtls: service_to_service
```
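For a Python service, the TLS settings above might translate into an `ssl.SSLContext` roughly as follows. This is a sketch using only the standard library; the function name `build_server_context` and its parameters are assumptions, and the file paths would come from your own secrets management.

```python
import ssl


def build_server_context(certfile=None, keyfile=None, client_ca=None):
    """Server-side TLS context that refuses anything below TLS 1.3."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    if certfile:
        # Server certificate and private key (rotate per the 90-day policy)
        ctx.load_cert_chain(certfile, keyfile)
    if client_ca:
        # mTLS: require clients to present a certificate signed by this CA
        ctx.load_verify_locations(cafile=client_ca)
        ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx
```

Note that `SSLContext.set_ciphers()` in the standard library does not control TLS 1.3 suites, so restricting the suite list to `TLS_AES_256_GCM_SHA384` would likely have to be done at the gateway or load balancer rather than in this context.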
### 4. Compute Security
```yaml
Container Security:
  base_image: distroless
  user: non-root
  filesystem: read-only
  capabilities: minimal
  seccomp: enabled

Resource Limits:
  cpu: 4 cores max
  memory: 16GB max
  gpu_memory: 24GB max
  disk: ephemeral only

Isolation:
  runtime: gvisor
  network: namespace isolated
  secrets: mounted, not in env
```
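The container settings above can be checked mechanically before deployment. The sketch below assumes the workload is described by a Kubernetes-style `securityContext` dictionary; the skill itself does not name an orchestrator, so the field names and the `audit_security_context` function are assumptions for illustration.

```python
def audit_security_context(sc: dict) -> list[str]:
    """Return a list of problems found in a Kubernetes-style securityContext."""
    problems = []
    # user: non-root
    if sc.get("runAsNonRoot") is not True:
        problems.append("container may run as root")
    # filesystem: read-only
    if sc.get("readOnlyRootFilesystem") is not True:
        problems.append("root filesystem is writable")
    # capabilities: minimal (interpreted here as dropping ALL)
    if "ALL" not in sc.get("capabilities", {}).get("drop", []):
        problems.append("capabilities are not dropped")
    # seccomp: enabled
    profile = sc.get("seccompProfile", {}).get("type")
    if profile not in ("RuntimeDefault", "Localhost"):
        problems.append("no seccomp profile set")
    return problems
```

An empty list means the four container checks passed; resource limits and runtime isolation (gVisor) live elsewhere in a pod spec and are not covered by this sketch.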
## Security Checklist
```yaml
API Layer:
  - [ ] Strong authentication (OAuth2/mTLS)
  - [ ] Rate limiting implemented
  - [ ] Input validation enabled
  - [ ] Error messages sanitized
  - [ ] Logging comprehensive

Storage Layer:
  - [ ] Encryption at rest
  - [ ] Access controls configured
  - [ ] Audit logging enabled
  - [ ] Backup encryption

Network Layer:
  - [ ] TLS 1.3 enforced
  - [ ] Internal VPC only
  - [ ] Firewall rules configured
  - [ ] DDoS protection enabled

Compute Layer:
  - [ ] Non-root containers
  - [ ] Resource limits set
  - [ ] Secrets in vault
  - [ ] Immutable infrastructure
```
## Vulnerability Testing
```python
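from dataclasses import dataclass

import requests


@dataclass
class Finding:
    name: str
    severity: str


class InfrastructureSecurityTester:
    # test_rate_limits, test_input_validation and test_error_disclosure are
    # not shown; each is expected to return a Finding or None.
    def test_api_security(self, endpoint):
        results = []
        # Test authentication bypass
        results.append(self.test_auth_bypass(endpoint))
        # Test rate limiting
        results.append(self.test_rate_limits(endpoint))
        # Test input validation
        results.append(self.test_input_validation(endpoint))
        # Test error handling
        results.append(self.test_error_disclosure(endpoint))
        # Keep only the checks that produced a finding
        return [r for r in results if r is not None]

    def test_auth_bypass(self, endpoint):
        payloads = [
            {'Authorization': ''},
            {'Authorization': 'Bearer invalid'},
            {'Authorization': 'Bearer ' + 'a' * 1000},
        ]
        for payload in payloads:
            response = requests.get(endpoint, headers=payload, timeout=10)
            # Anything other than 401 means the bad credential was not rejected
            if response.status_code != 401:
                return Finding("auth_bypass", "CRITICAL")
        return None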
```
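The tester above calls `test_rate_limits` without showing it. A rate-limit probe could look something like the sketch below, which is an assumption rather than the skill's own code: `probe_rate_limit` is a hypothetical helper, and it takes a `send` callable returning an HTTP status code so the logic can be exercised without a live endpoint.

```python
def probe_rate_limit(send, limit=100, overshoot=20):
    """Send up to limit + overshoot requests.

    Returns how many requests were accepted before the first HTTP 429,
    or None if the endpoint never throttled.
    """
    for sent in range(limit + overshoot):
        if send() == 429:
            return sent
    return None
```

Against a real service, `send` might be `lambda: requests.get(endpoint, headers=auth, timeout=10).status_code`. A result of `None`, or a number well above the documented limit, would suggest the "Rate limiting bypassable" finding. Only run this against systems you are authorized to test.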
## Severity Classification
```yaml
CRITICAL:
  - Authentication bypass
  - Model theft possible
  - Data exposure

HIGH:
  - Rate limiting bypassable
  - Weak encryption
  - Insufficient logging

MEDIUM:
  - Missing input validation
  - Verbose error messages
  - Outdated dependencies

LOW:
  - Non-optimal configurations
  - Minor policy gaps
```
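The severity levels above lend themselves to simple triage logic. The sketch below sorts findings by severity and decides whether a release should be blocked; the `(name, severity)` tuple format, the function names, and the default `HIGH` threshold are assumptions, not part of the skill.

```python
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def triage(findings):
    """Sort (name, severity) pairs so the most severe come first."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f[1]])


def should_block_release(findings, threshold="HIGH"):
    """True if any finding is at or above the threshold severity."""
    cutoff = SEVERITY_ORDER[threshold]
    return any(SEVERITY_ORDER[sev] <= cutoff for _, sev in findings)
```

A CI/CD security gate could call `should_block_release` on the tester's output and fail the pipeline when it returns `True`.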
## Troubleshooting
```yaml
Issue: API rate limiting not effective
Solution: Implement token-based limits, add IP reputation
Issue: Model extraction detected
Solution: Lower query limits, add output perturbation
Issue: High latency from security layers
Solution: Optimize validation, use caching, async logging
```
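The "async logging" suggestion for latency can be approximated in Python with the standard library's queue-based handlers, which move handler I/O off the request thread. This is a minimal sketch; the `make_async_logger` helper is hypothetical, and the real handlers (file, SIEM forwarder, and so on) are left to the deployment.

```python
import logging
import logging.handlers
import queue


def make_async_logger(name, *handlers):
    """Return (logger, listener); log calls only enqueue the record.

    The listener thread forwards records to `handlers` in the background.
    Call listener.stop() at shutdown to flush the queue.
    """
    q = queue.Queue(-1)  # unbounded queue
    listener = logging.handlers.QueueListener(q, *handlers)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.handlers.QueueHandler(q))
    logger.propagate = False  # avoid duplicate output via the root logger
    listener.start()
    return logger, listener
```

Because records sit in memory until the listener drains them, a crash can lose the most recent entries; for audit-grade query logs that trade-off may or may not be acceptable.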
## Integration Points
| Component | Purpose |
|-----------|---------|
| Agent 06 | Security testing |
| Agent 08 | CI/CD security gates |
| /test api | Security scanning |
| SIEM | Security monitoring |
---
**Protect AI infrastructure with defense-in-depth security.**
This skill helps secure AI/ML infrastructure by hardening model storage, API endpoints, networking, and compute resources. It provides practical controls, testing routines, and a checklist to reduce risks like model theft, authentication bypass, and resource exhaustion. The focus is defense-in-depth with repeatable configurations and testable validations.
The skill inspects and enforces layered protections across API, model, network, and compute surfaces. It validates authentication, rate limiting, input schemas, storage encryption, container isolation, and TLS settings, and includes vulnerability tests to simulate auth bypass, rate-limit abuse, and input validation failures. Results map to severity levels and actionable remediations.
## FAQ
**What are the primary threats this skill addresses?**
It targets authentication bypass, model theft and extraction, unbounded resource consumption, data exposure, and misconfiguration across API, storage, network, and compute layers.

**How do I prevent model extraction without degrading utility?**
Combine query limits, output perturbation, watermarking, strict logging, and adaptive throttling; tune perturbation to balance fidelity and anti-extraction effectiveness.