---
name: huggingface-js
description: Runs ML models in the browser and Node.js with Transformers.js and Hugging Face Inference API. Use when adding local inference, embeddings, or calling hosted models without GPU servers.
---
# Hugging Face JavaScript
Run ML models locally with Transformers.js or via the Inference API. Supports text generation, embeddings, image classification, speech recognition, and more.
## Transformers.js (Local Inference)
Run models directly in the browser or Node.js using ONNX Runtime.
```bash
npm install @huggingface/transformers
```
### Text Generation
```typescript
import { pipeline } from '@huggingface/transformers';

const generator = await pipeline('text-generation', 'Xenova/gpt2');
const result = await generator('The quick brown fox', {
  max_new_tokens: 50,
});
console.log(result[0].generated_text);
```
### Text Classification (Sentiment)
```typescript
import { pipeline } from '@huggingface/transformers';

const classifier = await pipeline(
  'text-classification',
  'Xenova/distilbert-base-uncased-finetuned-sst-2-english'
);
const result = await classifier('I love this product!');
// [{ label: 'POSITIVE', score: 0.9998 }]
```
### Embeddings
```typescript
import { pipeline } from '@huggingface/transformers';

const embedder = await pipeline(
  'feature-extraction',
  'Xenova/all-MiniLM-L6-v2'
);
const result = await embedder('Hello, world!', {
  pooling: 'mean',
  normalize: true,
});
const embedding = Array.from(result.data);
// [0.123, -0.456, ...] - 384 dimensions
```
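Embeddings are usually compared with cosine similarity. The sketch below is a hypothetical helper (`cosineSimilarity` is not part of Transformers.js) that works on plain arrays such as the `embedding` above. Because the example requests `normalize: true`, those vectors should already be unit length, in which case the dot product alone would give the same value.

```typescript
// Hypothetical helper, not a Transformers.js API.
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Guard against zero vectors to avoid dividing by zero
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 2-dimensional vectors stand in for real 384-dimensional embeddings
console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Scores close to 1 indicate similar texts, so ranking documents against a query amounts to sorting by this value.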
### Question Answering
```typescript
import { pipeline } from '@huggingface/transformers';

const qa = await pipeline(
  'question-answering',
  'Xenova/distilbert-base-cased-distilled-squad'
);
// Transformers.js takes the question and context as positional arguments
const result = await qa(
  'What is the capital of France?',
  'France is a country in Europe. Paris is the capital of France.'
);
console.log(result);
// e.g. { answer: 'Paris', score: 0.98 }
```
### Translation
```typescript
import { pipeline } from '@huggingface/transformers';

const translator = await pipeline(
  'translation',
  'Xenova/nllb-200-distilled-600M'
);
const result = await translator('Hello, how are you?', {
  src_lang: 'eng_Latn',
  tgt_lang: 'fra_Latn',
});
console.log(result[0].translation_text);
```
### Speech Recognition (Whisper)
```typescript
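import { pipeline } from '@huggingface/transformers';

const transcriber = await pipeline(
  'automatic-speech-recognition',
  'Xenova/whisper-tiny.en'
);
// A path or URL input is decoded with the Web Audio API, so this form targets the browser.
// In Node.js, decode the file yourself and pass the samples as a Float32Array.
const result = await transcriber('./audio.mp3');
console.log(result.text);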
```
### Image Classification
```typescript
import { pipeline } from '@huggingface/transformers';

const classifier = await pipeline(
  'image-classification',
  'Xenova/vit-base-patch16-224'
);
const result = await classifier('https://example.com/cat.jpg');
// [{ label: 'tabby cat', score: 0.95 }, ...]
```
### Object Detection
```typescript
import { pipeline } from '@huggingface/transformers';

const detector = await pipeline(
  'object-detection',
  'Xenova/detr-resnet-50'
);
const result = await detector('https://example.com/image.jpg');
// [{ label: 'cat', score: 0.98, box: { xmin, ymin, xmax, ymax } }, ...]
```
### Zero-Shot Classification
```typescript
import { pipeline } from '@huggingface/transformers';

const classifier = await pipeline(
  'zero-shot-classification',
  'Xenova/bart-large-mnli'
);
const result = await classifier(
  'This is a tutorial about machine learning',
  ['education', 'politics', 'sports']
);
console.log(result);
// { labels: ['education', ...], scores: [0.95, ...] }
```
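The `labels` and `scores` arrays in the result are parallel. A small hypothetical helper can pair them up and keep only the confident labels; the name `labelsAbove` and the default threshold of 0.5 are assumptions for illustration, not library behavior.

```typescript
// Shape of the zero-shot result shown above (only the fields used here)
interface ZeroShotResult {
  labels: string[];
  scores: number[];
}

// Hypothetical helper: pair each label with its score, keep those at or above
// the threshold, and sort from most to least confident.
function labelsAbove(
  result: ZeroShotResult,
  threshold = 0.5
): { label: string; score: number }[] {
  return result.labels
    .map((label, i) => ({ label, score: result.scores[i] }))
    .filter((entry) => entry.score >= threshold)
    .sort((a, b) => b.score - a.score);
}

const example: ZeroShotResult = {
  labels: ['education', 'politics', 'sports'],
  scores: [0.95, 0.03, 0.02],
};
console.log(labelsAbove(example)); // [ { label: 'education', score: 0.95 } ]
```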
## Hugging Face Inference API
Call hosted models without local computation.
```bash
npm install @huggingface/inference
```
### Setup
```typescript
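import { HfInference } from '@huggingface/inference';

// Newer releases of @huggingface/inference also export InferenceClient
// (used in the Inference Endpoints section); HfInference remains for backward compatibility.
const hf = new HfInference(process.env.HF_ACCESS_TOKEN);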
```
### Text Generation
```typescript
const result = await hf.textGeneration({
  model: 'meta-llama/Llama-2-7b-chat-hf',
  inputs: 'What is the meaning of life?',
  parameters: {
    max_new_tokens: 100,
    temperature: 0.7,
  },
});
console.log(result.generated_text);
```
### Streaming Text Generation
```typescript
const stream = hf.textGenerationStream({
  model: 'meta-llama/Llama-2-7b-chat-hf',
  inputs: 'Tell me a story',
  parameters: {
    max_new_tokens: 200,
  },
});
for await (const chunk of stream) {
  process.stdout.write(chunk.token.text);
}
```
### Chat Completion
```typescript
const result = await hf.chatCompletion({
  model: 'meta-llama/Llama-2-7b-chat-hf',
  messages: [
    { role: 'user', content: 'Hello!' },
  ],
  max_tokens: 100,
});
console.log(result.choices[0].message.content);
```
### Embeddings
```typescript
const result = await hf.featureExtraction({
  model: 'sentence-transformers/all-MiniLM-L6-v2',
  inputs: 'Hello, world!',
});
console.log(result); // embedding vector
```
### Image Generation
```typescript
import fs from 'node:fs';

const result = await hf.textToImage({
  model: 'stabilityai/stable-diffusion-2',
  inputs: 'A futuristic city at sunset',
  parameters: {
    negative_prompt: 'blurry, low quality',
  },
});
// result is a Blob
const buffer = Buffer.from(await result.arrayBuffer());
fs.writeFileSync('output.png', buffer);
```
### Image Classification
```typescript
import fs from 'node:fs';

const result = await hf.imageClassification({
  model: 'google/vit-base-patch16-224',
  // fs.openAsBlob requires a recent Node.js release (added in v19.8)
  data: await fs.openAsBlob('cat.jpg'),
});
console.log(result);
// [{ label: 'tabby cat', score: 0.95 }, ...]
```
### Speech Recognition
```typescript
import fs from 'node:fs';

const result = await hf.automaticSpeechRecognition({
  model: 'openai/whisper-large-v3',
  data: await fs.openAsBlob('audio.mp3'),
});
console.log(result.text);
```
## Inference Endpoints
Use Inference Endpoints for dedicated, hosted model deployments.
```typescript
import { InferenceClient } from '@huggingface/inference';

const client = new InferenceClient(process.env.HF_ACCESS_TOKEN);
const endpoint = client.endpoint('https://your-endpoint.endpoints.huggingface.cloud');
const result = await endpoint.textGeneration({
  inputs: 'Hello, world!',
});
```
## Next.js Integration
```typescript
// app/api/generate/route.ts
import { HfInference } from '@huggingface/inference';
import { NextResponse } from 'next/server';

const hf = new HfInference(process.env.HF_ACCESS_TOKEN);

export async function POST(request: Request) {
  const { prompt } = await request.json();
  const result = await hf.textGeneration({
    model: 'meta-llama/Llama-2-7b-chat-hf',
    inputs: prompt,
    parameters: {
      max_new_tokens: 200,
    },
  });
  return NextResponse.json({ text: result.generated_text });
}
```
### Streaming Response
```typescript
// app/api/stream/route.ts
import { HfInference } from '@huggingface/inference';

const hf = new HfInference(process.env.HF_ACCESS_TOKEN);

export async function POST(request: Request) {
  const { prompt } = await request.json();
  const stream = hf.textGenerationStream({
    model: 'meta-llama/Llama-2-7b-chat-hf',
    inputs: prompt,
    parameters: { max_new_tokens: 200 },
  });
  const encoder = new TextEncoder();
  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        controller.enqueue(encoder.encode(chunk.token.text));
      }
      controller.close();
    },
  });
  return new Response(readable, {
    headers: { 'Content-Type': 'text/plain' },
  });
}
```
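On the client, the plain-text stream from this route can be read incrementally. A minimal sketch, assuming the route is served at `/api/stream`; `readTextStream` is a hypothetical helper built only on the standard `ReadableStream` and `TextDecoder` APIs.

```typescript
// Hypothetical helper: read a byte stream chunk by chunk, decode it as UTF-8,
// call onChunk for each decoded piece, and return the full text at the end.
async function readTextStream(
  body: ReadableStream<Uint8Array>,
  onChunk: (text: string) => void
): Promise<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let full = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters split across chunks intact
    const text = decoder.decode(value, { stream: true });
    full += text;
    onChunk(text);
  }
  // Flush any bytes still buffered in the decoder
  const tail = decoder.decode();
  if (tail) {
    full += tail;
    onChunk(tail);
  }
  return full;
}

// Usage sketch (assumes the route above is served at /api/stream):
// const response = await fetch('/api/stream', {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json' },
//   body: JSON.stringify({ prompt: 'Tell me a story' }),
// });
// if (response.body) await readTextStream(response.body, (t) => console.log(t));
```

Appending each chunk to the UI as it arrives is what gives the streaming route its benefit over waiting for the full response.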
## Browser Usage
Transformers.js also runs in the browser. It uses WebAssembly (WASM) by default and can use WebGPU for acceleration where the browser supports it.
```html
<script type="module">
  import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers';

  const classifier = await pipeline('text-classification');
  const result = await classifier('I love this!');
  console.log(result);
</script>
```
### With WebGPU
```typescript
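import { pipeline, env } from '@huggingface/transformers';

// Optional: run the WASM backend in a web worker so it does not block the main thread
env.backends.onnx.wasm.proxy = true;

// WebGPU is selected with the `device` option; replace 'model-name' with a real model ID
const classifier = await pipeline('text-classification', 'model-name', {
  device: 'webgpu',
});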
```
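Not every browser exposes WebGPU, so it can help to choose the device at runtime. A minimal sketch, assuming a fallback to the `'wasm'` device is acceptable; `pickDevice` is a hypothetical helper. The presence of `navigator.gpu` only indicates that the API exists, not that a usable GPU adapter is available.

```typescript
// Hypothetical helper: prefer WebGPU when the navigator object exposes the API,
// otherwise fall back to WASM. Takes the navigator as an argument so it can be
// exercised outside a browser.
function pickDevice(nav: object | undefined): 'webgpu' | 'wasm' {
  return nav !== undefined && 'gpu' in nav ? 'webgpu' : 'wasm';
}

console.log(pickDevice({ gpu: {} })); // webgpu
console.log(pickDevice({}));          // wasm

// Usage sketch with Transformers.js (replace 'model-name' with a real model ID):
// const classifier = await pipeline('text-classification', 'model-name', {
//   device: pickDevice(globalThis.navigator),
// });
```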
## Configuration
```typescript
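import { env } from '@huggingface/transformers';

// Cache settings
env.cacheDir = './models';             // where downloaded models are cached
env.localModelPath = './local-models'; // where local model files are looked up

// Disable remote models (offline mode)
env.allowRemoteModels = false;

// Or disable local models instead. Disable only one of the two at a time,
// otherwise there is no source left for models that are not already cached.
// env.allowLocalModels = false;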
```
## Available Tasks
| Task | Pipeline | Example Model |
|------|----------|---------------|
| Text Classification | text-classification | distilbert-base-uncased-finetuned-sst-2-english |
| Text Generation | text-generation | gpt2, llama |
| Question Answering | question-answering | distilbert-base-cased-distilled-squad |
| Summarization | summarization | t5-small |
| Translation | translation | nllb-200-distilled-600M |
| Feature Extraction | feature-extraction | all-MiniLM-L6-v2 |
| Image Classification | image-classification | vit-base-patch16-224 |
| Object Detection | object-detection | detr-resnet-50 |
| Speech Recognition | automatic-speech-recognition | whisper-tiny |
| Zero-Shot Classification | zero-shot-classification | bart-large-mnli |
## Environment Variables
```bash
HF_ACCESS_TOKEN=hf_xxxxxxxx
```
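A missing token can surface later as a rate-limit or authentication error from the API, so it can be checked at startup. A minimal sketch; `requireHfToken` is a hypothetical helper, and only the variable name `HF_ACCESS_TOKEN` comes from this document.

```typescript
// Hypothetical helper: read HF_ACCESS_TOKEN from an env-like object and fail
// fast with a clear message when it is missing or blank.
function requireHfToken(env: Record<string, string | undefined>): string {
  const token = env.HF_ACCESS_TOKEN?.trim();
  if (!token) {
    throw new Error('HF_ACCESS_TOKEN is not set');
  }
  return token;
}

console.log(requireHfToken({ HF_ACCESS_TOKEN: 'hf_xxxxxxxx' })); // hf_xxxxxxxx

// Usage sketch in Node.js:
// const hf = new HfInference(requireHfToken(process.env));
```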
## Best Practices
1. **Cache models** - Download once, reuse
2. **Use WebGPU** - Faster inference in browsers
3. **Choose small models** - For client-side use
4. **Stream responses** - Better UX for generation
5. **Use Inference API** - For large models
6. **Consider endpoints** - For production workloads
7. **Use quantized models** - Smaller and faster (look for ONNX models)
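
For the first practice, one common pattern is to create each pipeline once and share it across calls. A minimal sketch; `lazySingleton` is a hypothetical helper, not part of either library.

```typescript
// Hypothetical helper: wrap an async factory so it runs at most once.
// Concurrent callers share the same in-flight promise.
function lazySingleton<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = factory().catch((err) => {
        cached = undefined; // allow a retry after a failed load
        throw err;
      });
    }
    return cached;
  };
}

// Usage sketch with Transformers.js:
// const getEmbedder = lazySingleton(() =>
//   pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2')
// );
// const embedder = await getEmbedder(); // loads the model on the first call only
```

Caching the promise rather than the resolved value means that requests arriving while the model is still loading do not trigger a second load.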