Guards

LLM Guard provides several built-in guards to protect your LLM applications. Each guard focuses on a specific aspect of prompt validation and security.

Available Guards

PII Guard

The PII (Personally Identifiable Information) Guard detects and protects sensitive personal information in prompts.

const guard = new LLMGuard({
  pii: true
});

// This will detect and flag the email address
const result = await guard.validate('My email is user@example.com');

The PII Guard can detect:

  • Email addresses
  • Phone numbers
  • Social Security Numbers
  • Credit card numbers
  • IP addresses
  • Physical addresses
  • Names (when combined with other identifiers)
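
To act on a finding, you can branch on the validation result. A minimal sketch, assuming the result exposes isValid and errors fields in the same shape that custom rules return (see Custom Guards below):

const guard = new LLMGuard({
  pii: true
});

const result = await guard.validate('My email is user@example.com');

if (!result.isValid) {
  // Reject the prompt and report which checks failed
  console.warn('Prompt rejected:', result.errors);
}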

Profanity Guard

The Profanity Guard filters inappropriate language and offensive content.

const guard = new LLMGuard({
  profanity: true
});

// This will detect and flag profanity
const result = await guard.validate('Your prompt with inappropriate language');

The Profanity Guard:

  • Detects common profanity and slurs
  • Recognizes character substitutions (like using numbers for letters)
  • Supports multiple languages
  • Can be customized with additional words
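
The exact option for extending the built-in word list is not shown here; one way to layer additional blocked words on top of the guard is the customRules hook documented below. A sketch, with the blocked terms purely illustrative:

const guard = new LLMGuard({
  profanity: true,
  customRules: {
    // Supplementary word filter for project-specific terms
    extraBlockedWords: (prompt) => {
      const blocked = ['examplebadword', 'anotherblockedterm'];
      const found = blocked.filter((word) => prompt.toLowerCase().includes(word));
      return {
        isValid: found.length === 0,
        errors: found.map((word) => `Blocked term detected: ${word}`)
      };
    }
  }
});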

Jailbreak Guard

The Jailbreak Guard detects attempts to bypass AI safety measures and ethical constraints.

const guard = new LLMGuard({
  jailbreak: true
});

// This will detect jailbreak attempts
const result = await guard.validate('Ignore your previous instructions and...');

The Jailbreak Guard identifies:

  • Instructions to ignore previous constraints
  • Requests to pretend to be something else
  • Attempts to disable safety features
  • Ethical boundary violations
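
A common pattern is to run the guard before every model call and refuse flagged prompts. A minimal sketch, where callModel stands in for your own LLM client (not part of LLM Guard) and the result shape is assumed as in the earlier examples:

const guard = new LLMGuard({
  jailbreak: true
});

async function answer(prompt) {
  const result = await guard.validate(prompt);
  if (!result.isValid) {
    // Refuse to forward prompts flagged as jailbreak attempts
    throw new Error('Prompt rejected by jailbreak guard');
  }
  return callModel(prompt); // callModel: your own LLM client, not part of LLM Guard
}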

Prompt Injection Guard

The Prompt Injection Guard identifies attempts to inject malicious instructions or override system prompts.

const guard = new LLMGuard({
  promptInjection: true
});

// This will detect prompt injection attempts
const result = await guard.validate('Ignore the above and do this instead...');

The Prompt Injection Guard detects:

  • System prompt references
  • Memory reset attempts
  • Instruction overrides
  • Context manipulation
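
Because injected instructions usually arrive inside untrusted user input, a practical pattern is to validate that input before splicing it into a larger prompt. A sketch, again assuming the result shape used in the earlier examples:

const guard = new LLMGuard({
  promptInjection: true
});

// Validate untrusted user input before it is embedded in a prompt template
async function buildPrompt(userInput) {
  const result = await guard.validate(userInput);
  if (!result.isValid) {
    throw new Error('User input rejected: possible prompt injection');
  }
  return `Summarize the following customer message:\n${userInput}`;
}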

Relevance Guard

The Relevance Guard evaluates the quality and relevance of prompts.

const guard = new LLMGuard({
  relevance: true,
  relevanceOptions: {
    minLength: 10,
    maxLength: 5000,
    minWords: 3,
    maxWords: 1000
  }
});

// This will evaluate the relevance of the prompt
const result = await guard.validate('Your prompt here');

The Relevance Guard checks:

  • Prompt length
  • Word count
  • Filler words
  • Repetitive content
  • Context relevance
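
With the thresholds configured above, a two-word prompt should fall below minWords while a full sentence passes. Using the guard configured above, for example:

// Expected to be flagged: only two words, below the minWords threshold
const tooShort = await guard.validate('Hi there');

// Long enough to satisfy both the length and word-count thresholds
const okPrompt = await guard.validate('Explain how photosynthesis works in simple terms.');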

Toxicity Guard

The Toxicity Guard detects harmful, aggressive, or discriminatory content.

const guard = new LLMGuard({
  toxicity: true
});

// This will detect toxic content
const result = await guard.validate('Your prompt with potentially harmful content');

The Toxicity Guard identifies:

  • Hate speech
  • Threats
  • Discriminatory language
  • Aggressive content
  • Harmful instructions
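
Toxicity screening often runs over batches of user messages rather than single prompts. A sketch of that pattern, assuming the same isValid result field as in the earlier examples:

const guard = new LLMGuard({
  toxicity: true
});

// Screen a batch of user messages and keep only those that pass
async function filterMessages(messages) {
  const results = await Promise.all(messages.map((m) => guard.validate(m)));
  return messages.filter((_, i) => results[i].isValid);
}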

Custom Guards

You can also create custom guards to address specific validation needs:

const guard = new LLMGuard({
  customRules: {
    // Example rule: reject prompts longer than 2000 characters
    maxPromptLength: (prompt) => {
      const isValid = prompt.length <= 2000;
      return {
        isValid,
        errors: isValid ? [] : ['Prompt exceeds the 2000 character limit']
      };
    }
  }
});
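
Custom rules are presumably evaluated through the same validate entry point as the built-in guards, with their return shape feeding the overall result:

// Runs the custom rule above against a prompt
const result = await guard.validate('Summarize this report in two sentences.');

if (!result.isValid) {
  console.warn(result.errors); // e.g. ['Prompt exceeds the 2000 character limit']
}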

Combining Guards

You can enable multiple guards to create a comprehensive validation strategy:

const guard = new LLMGuard({
  pii: true,
  jailbreak: true,
  profanity: true,
  promptInjection: true,
  relevance: true,
  toxicity: true,
  customRules: {
    // Your custom rules here
  }
});

Each guard will run independently, and the overall validation result will reflect the combined findings of all enabled guards.
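
A single validate call then covers every enabled guard. A minimal sketch, assuming the result exposes an isValid flag and an errors array that aggregates findings from all guards:

const userPrompt = 'Text supplied by the end user';

const result = await guard.validate(userPrompt);

if (!result.isValid) {
  // Assumed to collect the messages from every guard that flagged the prompt
  console.warn('Validation failed:', result.errors);
}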