# JGuardrails
The first Java guardrails library for LLM applications.
JGuardrails is a framework-agnostic toolkit that adds programmable safety rails to any Java LLM application. It works with Spring AI, LangChain4j, or any custom LLM client, with no vendor lock-in.
A system prompt is a request. Guardrails are enforcement.
- Why JGuardrails
- How It Works
- Installation
- Quick Start
- Built-in Rails
- Fluent API Reference
- YAML Configuration
- Custom Patterns and Engines
- Spring AI Integration
- LangChain4j Integration
- Custom Rails
- Audit Logging
- Metrics
- Running Examples
- Building from Source
- What's New in 1.0.0
## Why JGuardrails

| | System Prompt | JGuardrails |
|---|---|---|
| Enforcement | Soft (the LLM can ignore it) | Hard (enforced at code level) |
| Jailbreak resistance | No | Yes |
| PII masking | Not possible | Built-in |
| Audit trail | None | Every block/modify is logged |
| Added latency | 0 ms | 1–5 ms (pattern mode) |
| Framework dependency | LLM-specific | Framework-agnostic |
Common problems JGuardrails solves:
- User tries to jailbreak the LLM → request is blocked before it ever reaches the model
- User pastes an email / phone / credit card number → PII is masked before being sent to the LLM
- LLM returns a toxic response → response is blocked before reaching the user
- User asks about forbidden topics → blocked by keyword matching
- You need a full history of all blocks and modifications → audit log included out of the box
- Detection is pattern-based (regex + Aho-Corasick) without semantic understanding; JGuardrails is a guardrail layer, not a complete security solution.
- Officially tuned and tested languages for jailbreak/toxicity (regex): EN / RU / DE / FR / ES / PL / IT. Keyword detection also covers JA / ZH / AR / HI / TR / KO.
- Obfuscated toxicity (full leet, heavy spacing, reversed text) and sophisticated social-engineering prompts can still pass.
- PII patterns are intentionally conservative and may sometimes mask technical identifiers (UUIDs, ticket numbers, etc.).
- For high-risk or regulated use cases, JGuardrails should be combined with additional LLM- or ML-based safety systems.
## How It Works

```
User Input → [InputRail 1] → [InputRail 2] → ... → Your LLM
                                                       ↓
User ← [OutputRail 1] ← [OutputRail 2] ← ... ← LLM Response
```
Each rail returns one of three decisions:
| Decision | Meaning |
|---|---|
| PASS | Text continues to the next rail unchanged |
| BLOCK | Chain stops; user receives the blockedResponse message |
| MODIFY | Text is transformed (e.g., PII masked) and forwarded to the next rail |
Important: the pipeline does not call the LLM itself. Your code calls the LLM; the pipeline only processes text before and after.
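The chain semantics above can be sketched in plain Java. This is an illustrative model only: `Kind`, `Decision`, and `Chain` are hypothetical types invented for this example, not JGuardrails' API.

```java
import java.util.*;
import java.util.function.Function;

// Illustrative sketch of the rail-chain semantics. Kind, Decision and Chain
// are hypothetical types for this example, NOT JGuardrails' actual API.
enum Kind { PASS, BLOCK, MODIFY }

record Decision(Kind kind, String text) {
    static Decision pass(String t)   { return new Decision(Kind.PASS, t); }
    static Decision block()          { return new Decision(Kind.BLOCK, null); }
    static Decision modify(String t) { return new Decision(Kind.MODIFY, t); }
}

final class Chain {
    // Runs rails in order: BLOCK short-circuits the chain,
    // MODIFY feeds the transformed text to the next rail.
    static Optional<String> run(String input, List<Function<String, Decision>> rails) {
        String text = input;
        for (Function<String, Decision> rail : rails) {
            Decision d = rail.apply(text);
            if (d.kind() == Kind.BLOCK) return Optional.empty(); // caller returns blockedResponse
            text = d.text();
        }
        return Optional.of(text);
    }

    public static void main(String[] args) {
        List<Function<String, Decision>> rails = List.of(
            t -> t.contains("attack") ? Decision.block() : Decision.pass(t),
            t -> Decision.modify(t.replace("secret", "[MASKED]"))
        );
        System.out.println(Chain.run("my secret plan", rails)); // Optional[my [MASKED] plan]
        System.out.println(Chain.run("attack now", rails));     // Optional.empty
    }
}
```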
## Installation

Gradle (Kotlin DSL):
Step 1: add JitPack to `settings.gradle.kts`:

```kotlin
dependencyResolutionManagement {
    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
    repositories {
        mavenCentral()
        maven { url = uri("https://jitpack.io") }
    }
}
```

Step 2: add dependencies to `build.gradle.kts`:
```kotlin
dependencies {
    // Core + built-in detectors (required)
    implementation("com.github.Ratila1:JGuardrails:v1.0.0")

    // Spring AI adapter (optional)
    implementation("com.github.Ratila1.JGuardrails:jguardrails-spring-ai:v1.0.0")

    // LangChain4j adapter (optional)
    implementation("com.github.Ratila1.JGuardrails:jguardrails-langchain4j:v1.0.0")

    // LLM-as-judge support (optional)
    implementation("com.github.Ratila1.JGuardrails:jguardrails-llm:v1.0.0")
}
```

Gradle (Groovy DSL):
```groovy
// settings.gradle
dependencyResolutionManagement {
    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
    repositories {
        mavenCentral()
        maven { url 'https://jitpack.io' }
    }
}
```

```groovy
// build.gradle
dependencies {
    implementation 'com.github.Ratila1:JGuardrails:v1.0.0'
}
```

Maven:
```xml
<!-- pom.xml -->
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>com.github.Ratila1.JGuardrails</groupId>
        <artifactId>jguardrails-core</artifactId>
        <version>v1.0.0</version>
    </dependency>
    <dependency>
        <groupId>com.github.Ratila1.JGuardrails</groupId>
        <artifactId>jguardrails-detectors</artifactId>
        <version>v1.0.0</version>
    </dependency>
</dependencies>
```

Tip: replace `v1.0.0` with `master-SNAPSHOT` to always get the latest build from the master branch.
Alternatively, publish a local build:

```shell
git clone https://github.com/Ratila1/JGuardrails.git
cd JGuardrails
./gradlew publishToMavenLocal
```

Then in your project:

```kotlin
repositories {
    mavenLocal()
    mavenCentral()
}

dependencies {
    implementation("io.jguardrails:jguardrails-core:1.0.0")
    implementation("io.jguardrails:jguardrails-detectors:1.0.0")
}
```

## Quick Start

10 lines to get started:
```java
import io.jguardrails.core.RailContext;
import io.jguardrails.detectors.input.jailbreak.JailbreakDetector;
import io.jguardrails.detectors.input.pii.*;
import io.jguardrails.detectors.output.toxicity.ToxicityChecker;
import io.jguardrails.pipeline.GuardrailPipeline;

GuardrailPipeline pipeline = GuardrailPipeline.builder()
    .addInputRail(new JailbreakDetector())
    .addInputRail(PiiMasker.builder().entities(PiiEntity.EMAIL, PiiEntity.PHONE).build())
    .addOutputRail(new ToxicityChecker())
    .blockedResponse("I'm unable to process this request.")
    .build();

// Full cycle with your LLM in one line:
String safeResponse = pipeline.execute(
    userMessage,
    RailContext.empty(),
    processedInput -> myLlmClient.chat(processedInput) // ← your LLM here
);
```

## Built-in Rails

### JailbreakDetector (input)

Detects prompt injection and jailbreak attempts locally, with no API calls required.
Uses a hybrid engine: regex for complex structural patterns, Aho-Corasick automaton for O(n) literal phrase matching. Both engines run in parallel and the result with the earlier position in text wins.
```java
JailbreakDetector detector = JailbreakDetector.builder()
    .sensitivity(JailbreakDetector.Sensitivity.HIGH) // LOW | MEDIUM | HIGH
    .build();
```

What it blocks:

- "Ignore previous instructions..." / "Forget all prior instructions..."
- "You are now DAN" / "Act as if you are..." / "Pretend to be..."
- "Developer mode enabled" / "Jailbreak mode" / "bypass safety filter"
- Delimiter injection: `` ```system``` ``, `[SYSTEM]`, `<<<override>>>`
- Patterns in English, Russian, German, French, Spanish, Polish, Italian
- Literal phrases in Japanese (Aho-Corasick)
```java
// Add your own patterns:
JailbreakDetector detector = JailbreakDetector.builder()
    .sensitivity(JailbreakDetector.Sensitivity.MEDIUM)
    .addCustomPattern("reveal.*system.*prompt")
    .addCustomPattern("bypass.*filter")
    .build();

// Load patterns from your own YAML file (replaces defaults):
detector = JailbreakDetector.builder()
    .patternsFromFile(myFile, "my_jailbreak_section")
    .build();

// Extend defaults with extra patterns from a YAML file:
detector = JailbreakDetector.builder()
    .addPatternsFromFile(myFile, "extra_jailbreak_section")
    .build();
```

### PiiMasker (input)

Masks personally identifiable information before it reaches the LLM.
```java
PiiMasker masker = PiiMasker.builder()
    .entities(
        PiiEntity.EMAIL,         // john@example.com → [EMAIL REDACTED]
        PiiEntity.PHONE,         // +1 555 000 1234 → [PHONE REDACTED]
        PiiEntity.CREDIT_CARD,   // 4276 1234 5678 9012 → [CREDIT_CARD REDACTED]
        PiiEntity.SSN,           // 123-45-6789 → [SSN REDACTED]
        PiiEntity.IBAN,          // DE89370400440532013000 → [IBAN REDACTED]
        PiiEntity.IP_ADDRESS,    // 192.168.1.1 → [IP_ADDRESS REDACTED]
        PiiEntity.DATE_OF_BIRTH  // 01/01/1990 → [DATE_OF_BIRTH REDACTED]
    )
    .strategy(PiiMaskingStrategy.REDACT)          // Full replacement (default)
    // .strategy(PiiMaskingStrategy.MASK_PARTIAL) // j***@g***.com | +1***1234
    // .strategy(PiiMaskingStrategy.HASH)         // [EMAIL:a3f8c2d1e4b5]
    .build();
```

### TopicFilter (input)

Blocks or allows requests based on topic keyword matching.
```java
// BLOCKLIST mode: listed topics are blocked, everything else is allowed:
TopicFilter blocklistFilter = TopicFilter.builder()
    .blockTopics("politics", "religion", "violence", "adult", "drugs")
    .build();

// ALLOWLIST mode: only listed topics are allowed, everything else is blocked:
TopicFilter allowlistFilter = TopicFilter.builder()
    .allowTopics("banking", "payments", "account")
    .build();

// Custom topics with your own keywords:
TopicFilter customFilter = TopicFilter.builder()
    .mode(TopicFilter.Mode.BLOCKLIST)
    .customTopic("competitors", "CompetitorX", "RivalCorp", "OtherProduct")
    .customTopic("legal_risk", "lawsuit", "litigation", "court", "sue")
    .build();
```

Built-in topics: politics, religion, violence, adult, drugs, medical_advice, financial_advice

### InputLengthValidator (input)
Blocks inputs that exceed configured length limits (prevents context-overflow attacks).
```java
InputLengthValidator validator = InputLengthValidator.builder()
    .maxCharacters(5000) // 0 = disabled
    .maxWords(800)       // 0 = disabled
    .build();
```

### ToxicityChecker (output)

Blocks toxic LLM responses before they reach the user.
Uses the same hybrid engine as JailbreakDetector: regex for structural patterns, Aho-Corasick for literal phrases. Multilingual keyword matching (ZH / JA / AR / HI / TR / KO) runs as a second phase via KeywordMatcher.
```java
ToxicityChecker checker = ToxicityChecker.builder()
    .categories(
        ToxicityChecker.Category.PROFANITY,          // offensive language
        ToxicityChecker.Category.HATE_SPEECH,        // discrimination, hate speech
        ToxicityChecker.Category.THREATS,            // threats and incitement to violence
        ToxicityChecker.Category.SELF_HARM,          // self-harm content
        ToxicityChecker.Category.THIRD_PERSON_ABUSE  // insults / death wishes about absent persons
    )
    .addBlockedWord("my_custom_word")
    .build();
```

Supported languages:
- Regex patterns: EN / RU / FR / DE / ES / PL / IT
- Keyword (Aho-Corasick): EN + JA (hate phrases, threats, aggressive dismissals)
- Multilingual keyword phase: ZH / JA / AR / HI / TR / KO
THIRD_PERSON_ABUSE category covers:
- pronoun + copula + insult: "he is an idiot", "she is worthless"
- dehumanising phrases: "waste of space", "not worth anything"
- third-person death wishes: "she should die", "he doesn't deserve to live"
```java
// Disable multilingual detection (e.g., for performance):
ToxicityChecker checker = ToxicityChecker.builder()
    .multilingualEnabled(false)
    .build();

// Load patterns from your own YAML file:
checker = ToxicityChecker.builder()
    .patternsFromFile(myFile, "my_toxicity_section")
    .build();

// Extend defaults:
checker = ToxicityChecker.builder()
    .addPatternsFromFile(myFile, "extra_toxicity_section")
    .build();

// Replace multilingual keywords:
checker = ToxicityChecker.builder()
    .keywordsFromFile(myKeywordsFile)
    .build();

// Add keywords on top of defaults:
checker = ToxicityChecker.builder()
    .addKeywordsFromFile(myKeywordsFile)
    .build();
```

### OutputPiiScanner (output)

Masks PII in LLM responses (in case the model recalls personal data from training).
```java
OutputPiiScanner scanner = OutputPiiScanner.builder()
    .entities(PiiEntity.EMAIL, PiiEntity.PHONE, PiiEntity.CREDIT_CARD)
    .strategy(PiiMaskingStrategy.MASK_PARTIAL)
    .build();
```

### OutputLengthValidator (output)

Limits the length of LLM responses.
```java
OutputLengthValidator validator = OutputLengthValidator.builder()
    .maxCharacters(2000)
    .truncate(true) // true = truncate with "...", false = block
    .build();
```

### JsonSchemaValidator (output)

Validates that the LLM returned valid JSON (useful for structured output).
```java
JsonSchemaValidator validator = JsonSchemaValidator.builder()
    .requireValidJson(true)
    .build();
```

## Fluent API Reference

```java
GuardrailPipeline pipeline = GuardrailPipeline.builder()
    // Input rails, executed in priority order (lower number = earlier)
    .addInputRail(InputLengthValidator.builder().maxCharacters(5000).build())   // priority=5
    .addInputRail(JailbreakDetector.builder().build())                          // priority=10
    .addInputRail(PiiMasker.builder().entities(PiiEntity.EMAIL).build())        // priority=20
    .addInputRail(TopicFilter.builder().blockTopics("violence").build())        // priority=30

    // Output rails
    .addOutputRail(ToxicityChecker.builder().build())                           // priority=10
    .addOutputRail(OutputPiiScanner.builder().build())                          // priority=20
    .addOutputRail(OutputLengthValidator.builder().maxCharacters(2000).build()) // priority=30

    // Static blocked-response message:
    .blockedResponse("I'm unable to process this request.")

    // Or dynamic, based on context:
    // .onBlocked(ctx -> "Blocked for session: " + ctx.getSessionId().orElse("unknown"))

    // Fail strategy on rail exception:
    //   true  = fail-open: skip the broken rail and continue (lenient)
    //   false = fail-closed: block the request (safe default)
    .failOpen(false)

    .auditLogger(new DefaultAuditLogger())
    .metrics(new DefaultMetrics())
    .build();
```

Use this when you need full control over each step:
```java
RailContext context = RailContext.builder()
    .sessionId("session-abc123")
    .userId("user-456")
    .attribute("language", "en")
    .build();

// Step 1: process input
PipelineExecutionResult inputResult = pipeline.processInput(userMessage, context);
if (inputResult.isBlocked()) {
    return inputResult.getText(); // returns blockedResponse; do not call the LLM
}

// Step 2: call your LLM (the pipeline never does this itself)
String llmResponse = myLlmClient.chat(inputResult.getText()); // text may be modified

// Step 3: process output
PipelineExecutionResult outputResult = pipeline.processOutput(llmResponse, userMessage, context);
return outputResult.getText(); // safe response (or blockedResponse if blocked)
```

Or run the full cycle in one call:

```java
String response = pipeline.execute(
    userMessage,
    context,
    processedInput -> myLlmClient.chat(processedInput)
);
```

Inspecting the result:

```java
PipelineExecutionResult result = pipeline.processInput(userMessage, context);

result.isBlocked();        // whether the pipeline blocked this request
result.getText();          // final text (or blockedResponse if blocked)
result.getOriginalText();  // original text before any rails ran
result.getExecutionTime(); // Duration: total pipeline execution time
result.getRailResults();   // List<RailResult>: result from every rail

// Details of the blocking rail (if blocked):
result.getBlockingResult().ifPresent(r -> {
    System.out.println("Blocked by: " + r.railName());
    System.out.println("Reason: " + r.reason());
    System.out.println("Confidence: " + r.confidence()); // 0.0–1.0
    System.out.println("Metadata: " + r.metadata());
});
```

Working with `RailContext`:

```java
RailContext context = RailContext.builder()
    .sessionId("ses-123")           // session ID for audit
    .userId("usr-456")              // user ID for audit
    .addHistory("previous message") // conversation history
    .attribute("region", "EU")      // arbitrary attributes
    .attribute("role", "admin")
    .build();

// Inside any rail, read and write attributes:
context.getAttribute("region", String.class);   // Optional<String>
context.setAttribute("detectedLanguage", "en"); // visible to downstream rails
```

## YAML Configuration

Configure the entire pipeline via YAML without recompiling.
```yaml
jguardrails:
  # Behavior when a rail throws an exception
  #   closed = block the request (safer, default)
  #   open   = skip the broken rail and continue
  fail-strategy: closed

  # Message returned when a request is blocked
  blocked-response: "I'm unable to process this request. Please rephrase and try again."

  # Input rails, executed in priority order (lower = earlier)
  input-rails:
    - type: input-length
      enabled: true
      priority: 5
      config:
        max-characters: 8000

    - type: jailbreak-detect
      enabled: true
      priority: 10
      config:
        sensitivity: high # low | medium | high
        mode: pattern     # pattern | llm-judge | hybrid

    - type: pii-mask
      enabled: true
      priority: 20
      config:
        entities:
          - EMAIL
          - PHONE
          - CREDIT_CARD
          - IBAN
        strategy: redact  # redact | mask-partial | hash

    - type: topic-filter
      enabled: true
      priority: 30
      config:
        mode: blocklist   # blocklist | allowlist
        topics:
          - violence
          - adult
          - drugs
        custom-topics:
          competitors:
            - "CompetitorName"
            - "RivalProduct"

  # Output rails
  output-rails:
    - type: toxicity-check
      enabled: true
      priority: 10
      config:
        categories:
          - PROFANITY
          - HATE_SPEECH
          - THREATS
          - SELF_HARM
          - THIRD_PERSON_ABUSE

    - type: output-pii-scan
      enabled: true
      priority: 20
      config:
        entities:
          - EMAIL
          - PHONE
        strategy: mask-partial

    - type: output-length
      enabled: true
      priority: 30
      config:
        max-characters: 3000
        truncate: true    # true = truncate with "...", false = block

  # Audit logging
  audit:
    enabled: true
    log-level: INFO
    include-original-text: false # keep false for privacy

  # Metrics
  metrics:
    enabled: true
```

Loading the configuration:

```java
// From classpath (src/main/resources/)
GuardrailConfig config = YamlConfigLoader.loadFromClasspath("guardrails.yml");

// From filesystem
GuardrailConfig fileConfig = YamlConfigLoader.load(Path.of("/etc/myapp/guardrails.yml"));

// From any InputStream
GuardrailConfig streamConfig = YamlConfigLoader.loadFromStream(inputStream);
```

## Custom Patterns and Engines

JGuardrails 1.0.0 exposes the full pattern-matching stack so you can extend or replace every part.
Each entry in a pattern YAML file can declare `type: REGEX` (default) or `type: KEYWORD`:

```yaml
my_section:
  # Regex: compiled to java.util.regex.Pattern, supports \b, lookaheads, etc.
  - id: MY_REGEX_PATTERN
    flags: CI
    pattern: "ignore\\s+all\\s+instructions"

  # Keyword: matched by the Aho-Corasick O(n) engine, case-insensitive.
  # Preferred for literal phrases and for CJK / Arabic / Devanagari scripts
  # where \b word boundaries are undefined.
  - id: MY_KEYWORD_PHRASE
    type: KEYWORD
    pattern: "bypass safety filter"
```
```java
// PatternSpec carries id, category, and type (REGEX or KEYWORD):
PatternSpec spec = new PatternSpec("MY_ID", "my_category", PatternSpec.Type.KEYWORD);

// RegexPatternEngine matches one REGEX spec against text:
RegexPatternEngine regexEngine = RegexPatternEngine.builder()
    .register("MY_ID", Pattern.compile("my regex", Pattern.CASE_INSENSITIVE))
    .build();

// KeywordAutomatonEngine: Aho-Corasick multi-phrase matching:
KeywordAutomatonEngine kwEngine = new KeywordAutomatonEngine(
    Map.of("KW_1", "bypass filter", "KW_2", "ignore instructions")
);

// CompositePatternEngine routes each spec to the correct sub-engine by type:
CompositePatternEngine engine = new CompositePatternEngine(regexEngine, kwEngine);

// findFirst() dispatches KEYWORD specs to Aho-Corasick and REGEX specs to the
// regex engine, returning the earliest match in the text:
Optional<MatchedSpec> hit = engine.findFirst(text, activeSpecs);
hit.ifPresent(ms -> {
    System.out.println("Matched id: " + ms.spec().id());
    System.out.println("Category: " + ms.spec().category());
    System.out.println("Engine type: " + ms.spec().type());
    System.out.println("Matched text: " + ms.result().matchedText());
    System.out.println("Position: " + ms.result().start() + "-" + ms.result().end());
});
```
```java
// Implement TextPatternEngine with your own logic (an ML model, a Bloom filter, etc.):
TextPatternEngine myEngine = new TextPatternEngine() {
    @Override
    public MatchResult find(String text, PatternSpec spec) { /* ... */ }
};

JailbreakDetector detector = JailbreakDetector.builder()
    .engine(myEngine)
    .build();
```
```java
// Replace all default patterns with patterns from a file:
JailbreakDetector detector = JailbreakDetector.builder()
    .patternsFromFile(Path.of("my-patterns.yml"), "custom_section")
    .build();

// Add patterns on top of defaults:
detector = JailbreakDetector.builder()
    .addPatternsFromFile(Path.of("extra-patterns.yml"), "extra_section")
    .build();

// Same API for ToxicityChecker:
ToxicityChecker checker = ToxicityChecker.builder()
    .addPatternsFromFile(Path.of("extra-toxicity.yml"), "extra_threats")
    .build();

// Replace / extend multilingual keywords:
checker = ToxicityChecker.builder()
    .keywordsFromFile(Path.of("my-keywords.yml"))       // replace
    .build();
checker = ToxicityChecker.builder()
    .addKeywordsFromFile(Path.of("extra-keywords.yml")) // extend
    .build();
```
```java
// Load specs (id + category + type) from a YAML section:
List<PatternSpec> specs = PatternLoader.loadSpecs("my-resource.yml", "my_section");

// Build engines directly:
RegexPatternEngine regexEngine = PatternLoader.buildRegexEngine("my.yml", "sec1", "sec2");
KeywordAutomatonEngine kwEngine = PatternLoader.buildKeywordEngine("my.yml", "sec1");
CompositePatternEngine composite = PatternLoader.buildCompositeEngine("my.yml", "sec1", "sec2");

// From a filesystem path:
RegexPatternEngine fromFile = PatternLoader.buildRegexEngineFromFile(path, "section");
```

Language support:

| Language | Code | Jailbreak | Toxicity | Engine |
|---|---|---|---|---|
| English | EN | ✅ regex | ✅ regex + keywords | Regex + Aho-Corasick |
| Russian | RU | ✅ regex | ✅ regex | Regex |
| French | FR | ✅ regex | ✅ regex | Regex |
| German | DE | ✅ regex | ✅ regex | Regex |
| Spanish | ES | ✅ regex | ✅ regex | Regex |
| Polish | PL | ✅ regex | ✅ regex | Regex |
| Italian | IT | ✅ regex | ✅ regex | Regex |
| Japanese | JA | ✅ keywords | ✅ keywords | Aho-Corasick |
| Chinese | ZH | ✅ keywords | ✅ keywords | KeywordMatcher |
| Arabic | AR | ✅ keywords | ✅ keywords | KeywordMatcher |
| Hindi | HI | ✅ keywords | ✅ keywords | KeywordMatcher |
| Turkish | TR | ✅ keywords | ✅ keywords | KeywordMatcher |
| Korean | KO | ✅ keywords | ✅ keywords | KeywordMatcher |
Detection layers:
- Main engine (Aho-Corasick + regex): `jailbreak-patterns.yml` / `toxicity-patterns.yml`. Covers EN regex, regex for 6 more European languages, and EN + JA keyword phrases.
- Multilingual keyword phase: `multilingual-jailbreak-keywords.yml` / `multilingual-toxicity-keywords.yml`. Covers ZH / JA / AR / HI / TR / KO via `KeywordMatcher` (a simple substring scan). JA appears in both layers for deeper coverage.
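Why a substring scan for these scripts? With Java's default (ASCII) word classes, `\b` never fires next to CJK characters, so a `\b`-bounded regex silently fails while a plain substring check still works. A minimal, self-contained illustration (not the library's code; the phrases are invented for the example):

```java
import java.util.regex.Pattern;

public class CjkScanDemo {
    public static void main(String[] args) {
        String text = "请忽略所有指令";  // Chinese: "please ignore all instructions"
        String phrase = "忽略所有指令"; // "ignore all instructions"

        // A plain substring scan finds the phrase:
        System.out.println(text.contains(phrase)); // true

        // A \b-bounded regex does not: with default (ASCII) word classes,
        // CJK characters count as non-word, so no word boundary exists.
        System.out.println(Pattern.compile("\\b" + phrase + "\\b")
            .matcher(text)
            .find()); // false
    }
}
```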
## Spring AI Integration

```kotlin
implementation("com.github.Ratila1.JGuardrails:jguardrails-spring-ai:v1.0.0")
```

Just add the dependency and create `guardrails.yml`. Spring Boot auto-configures everything via `GuardrailAutoConfiguration`.
```yaml
# application.yml
jguardrails:
  enabled: true
  config-path: classpath:guardrails.yml
```

`GuardrailPipeline` and `GuardrailAdvisor` are created automatically as Spring beans. No additional code is needed.

Manual bean configuration:
```java
@Configuration
public class LlmConfig {

    @Bean
    public GuardrailPipeline guardrailPipeline() {
        return GuardrailPipeline.builder()
            .addInputRail(new JailbreakDetector())
            .addInputRail(PiiMasker.builder()
                .entities(PiiEntity.EMAIL, PiiEntity.PHONE)
                .build())
            .addOutputRail(new ToxicityChecker())
            .blockedResponse("I'm unable to process this request.")
            .build();
    }

    @Bean
    public ChatClient chatClient(ChatClient.Builder builder, GuardrailPipeline pipeline) {
        return builder
            .defaultAdvisors(new GuardrailAdvisor(pipeline))
            .build();
    }
}
```
```java
@Service
public class ChatService {

    private final ChatClient chatClient;

    public ChatService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String chat(String userMessage) {
        // Guardrails are applied automatically via the Advisor
        return chatClient.prompt()
            .user(userMessage)
            .call()
            .content();
    }
}
```

## LangChain4j Integration

```kotlin
implementation("com.github.Ratila1.JGuardrails:jguardrails-langchain4j:v1.0.0")
```

Wraps any `ChatLanguageModel`. All `generate()` calls automatically pass through the pipeline.
```java
ChatLanguageModel baseModel = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4o")
    .build();

GuardrailPipeline pipeline = GuardrailPipeline.builder()
    .addInputRail(new JailbreakDetector())
    .addInputRail(PiiMasker.builder().entities(PiiEntity.EMAIL).build())
    .addOutputRail(new ToxicityChecker())
    .blockedResponse("Request blocked.")
    .build();

// Wrap the model; guardrails are applied transparently
ChatLanguageModel guardedModel = new GuardrailChatModelFilter(baseModel, pipeline);
String response = guardedModel.generate("Tell me about Java 21");
```

With AiServices:

```java
interface MyAssistant {
    String chat(String userMessage);
}

MyAssistant assistant = AiServices.builder(MyAssistant.class)
    .chatLanguageModel(model)
    .build();

GuardrailAiServiceInterceptor interceptor = new GuardrailAiServiceInterceptor(pipeline);

// Wrap the AiService call:
String response = interceptor.intercept(
    userInput,
    processedInput -> assistant.chat(processedInput)
);
```

## Custom Rails

Creating a custom rail requires implementing a single method.
An input rail that blocks:

```java
public class CompanyPolicyRail implements InputRail {

    @Override
    public String name() {
        return "company-policy";
    }

    @Override
    public int priority() {
        return 50;
    }

    @Override
    public RailResult process(String input, RailContext context) {
        if (input.toLowerCase().contains("confidential")) {
            return RailResult.block(name(), "Input contains restricted keyword 'confidential'");
        }
        return RailResult.pass(input, name());
    }
}
```

An output rail that modifies:

```java
public class DisclaimerRail implements OutputRail {

    private static final String DISCLAIMER =
        "\n\n*This response was generated by AI and does not constitute professional advice.*";

    @Override
    public String name() { return "disclaimer-appender"; }

    @Override
    public int priority() { return 200; } // run last

    @Override
    public RailResult process(String output, String originalInput, RailContext context) {
        return RailResult.modify(output + DISCLAIMER, name(), "Appended legal disclaimer");
    }
}
```

A rail that can be toggled at runtime:

```java
public class ToggleableRail implements InputRail {

    private volatile boolean enabled = true;

    @Override
    public String name() { return "toggleable"; }

    @Override
    public boolean isEnabled() { return enabled; } // pipeline checks this before calling process()

    public void setEnabled(boolean enabled) { this.enabled = enabled; }

    @Override
    public RailResult process(String input, RailContext context) {
        // your logic
        return RailResult.pass(input, name());
    }
}
```

Register custom rails like any built-in rail:
```java
GuardrailPipeline pipeline = GuardrailPipeline.builder()
    .addInputRail(new CompanyPolicyRail())
    .addOutputRail(new DisclaimerRail())
    .build();
```

## Audit Logging

By default, `DefaultAuditLogger` writes to SLF4J:
- `WARN` for every block
- `INFO` for every modification
```
[GUARDRAIL AUDIT] BLOCKED by rail='jailbreak-detector' reason='Prompt injection detected' at 2024-...
[GUARDRAIL AUDIT] MODIFIED by rail='pii-masker' reason='Masked 2 PII entities' at 2024-...
```
For tests, use `InMemoryAuditLogger`:

```java
InMemoryAuditLogger auditLogger = new InMemoryAuditLogger();
GuardrailPipeline pipeline = GuardrailPipeline.builder()
    .addInputRail(new JailbreakDetector())
    .auditLogger(auditLogger)
    .build();

pipeline.processInput("bad input", context);

// Assert in tests:
assertThat(auditLogger.getEntries()).hasSize(1);
assertThat(auditLogger.getEntries().get(0).getType()).isEqualTo(AuditEntry.Type.BLOCKED);
assertThat(auditLogger.getEntries().get(0).getRailName()).isEqualTo("jailbreak-detector");

// Filter by type:
List<AuditEntry> blocks = auditLogger.getEntries(AuditEntry.Type.BLOCKED);
List<AuditEntry> modifications = auditLogger.getEntries(AuditEntry.Type.MODIFIED);
```

A custom `AuditLogger` can write anywhere, e.g. a database:
```java
public class DatabaseAuditLogger implements AuditLogger {

    private final AuditRepository repo;

    public DatabaseAuditLogger(AuditRepository repo) {
        this.repo = repo;
    }

    @Override
    public void log(AuditEntry entry) {
        repo.save(new AuditRecord(
            entry.getTimestamp(),
            entry.getType().name(),
            entry.getRailName(),
            entry.getReason()
        ));
    }
}
```

## Metrics

`DefaultMetrics` keeps in-memory counters (thread-safe, backed by `LongAdder`).
```java
DefaultMetrics metrics = new DefaultMetrics();
GuardrailPipeline pipeline = GuardrailPipeline.builder()
    .metrics(metrics)
    .build();

MetricsSnapshot snapshot = metrics.getSnapshot();
snapshot.totalBlocked();   // long: total requests blocked
snapshot.totalModified();  // long: total texts modified
snapshot.totalPassed();    // long: total requests passed
snapshot.totalErrors();    // long: rail errors
snapshot.blockedByRail();  // Map<String, Long>: per rail
snapshot.modifiedByRail(); // Map<String, Long>
```

Bridging to Micrometer:
```java
public class MicrometerGuardrailMetrics implements GuardrailMetrics {

    private final MeterRegistry registry;

    public MicrometerGuardrailMetrics(MeterRegistry registry) {
        this.registry = registry;
    }

    @Override
    public void recordBlock(String railName) {
        registry.counter("guardrail.blocks", "rail", railName).increment();
    }

    @Override
    public void recordModification(String railName) {
        registry.counter("guardrail.modifications", "rail", railName).increment();
    }

    @Override
    public void recordPass(String railName) {
        registry.counter("guardrail.passes", "rail", railName).increment();
    }

    @Override
    public void recordError(String railName) {
        registry.counter("guardrail.errors", "rail", railName).increment();
    }
}
```

## Running Examples

All examples are in the `jguardrails-examples` module and require no LLM API key.
```shell
# Basic example: jailbreak detection, PII masking, toxicity check
./gradlew :jguardrails-examples:run -PmainClass=io.jguardrails.examples.BasicExample

# YAML configuration example with audit log output
./gradlew :jguardrails-examples:run -PmainClass=io.jguardrails.examples.YamlConfigExample

# Custom rails: language restriction + disclaimer appender
./gradlew :jguardrails-examples:run -PmainClass=io.jguardrails.examples.CustomRailExample
```

## Building from Source
```shell
git clone https://github.com/Ratila1/JGuardrails.git
cd JGuardrails

# Build all modules
./gradlew build

# Run all tests
./gradlew test

# Run a specific test class
./gradlew :jguardrails-detectors:test --tests "*.JailbreakDetectorTest"

# Publish to local Maven (~/.m2)
./gradlew publishToMavenLocal
```

Module layout:

```
JGuardrails/
├── jguardrails-core/        # Interfaces, pipeline, config, audit, metrics
├── jguardrails-detectors/   # Built-in rails (jailbreak, PII, toxicity, ...)
├── jguardrails-llm/         # LLM clients for LLM-as-judge
├── jguardrails-spring-ai/   # Spring AI Advisor + AutoConfiguration
├── jguardrails-langchain4j/ # LangChain4j ChatLanguageModel wrapper
└── jguardrails-examples/    # Runnable examples with no external dependencies
```

## What's New in 1.0.0
A new KeywordAutomatonEngine implements multi-keyword matching via the Aho-Corasick automaton. All registered keywords are scanned in a single O(n + m) pass over the input text, where n = text length and m = total keyword length. This replaces the previous per-pattern regex loop for literal phrases and is particularly efficient when many keywords need to be checked simultaneously.
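The single-pass idea can be illustrated with a compact, self-contained sketch: build a trie of keywords, add BFS failure links, then scan the text once. This is illustrative only; the library's `KeywordAutomatonEngine` is its own implementation, and the sketch assumes lowercase keywords.

```java
import java.util.*;

// Minimal Aho-Corasick sketch (illustrative, not the library's implementation).
final class AhoCorasick {
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        Node fail;
        final List<String> hits = new ArrayList<>();
    }
    private final Node root = new Node();

    AhoCorasick(Collection<String> keywords) {
        for (String raw : keywords) {                // 1) build the trie
            String kw = raw.toLowerCase();
            Node n = root;
            for (char c : kw.toCharArray()) n = n.next.computeIfAbsent(c, k -> new Node());
            n.hits.add(kw);
        }
        Deque<Node> queue = new ArrayDeque<>();      // 2) failure links via BFS
        for (Node child : root.next.values()) { child.fail = root; queue.add(child); }
        while (!queue.isEmpty()) {
            Node cur = queue.poll();
            for (Map.Entry<Character, Node> e : cur.next.entrySet()) {
                Node f = cur.fail;
                while (f != null && !f.next.containsKey(e.getKey())) f = f.fail;
                e.getValue().fail = (f == null) ? root : f.next.get(e.getKey());
                e.getValue().hits.addAll(e.getValue().fail.hits);
                queue.add(e.getValue());
            }
        }
    }

    /** Returns keyword -> first match position, scanning the text once. */
    Map<String, Integer> scan(String text) {
        Map<String, Integer> found = new LinkedHashMap<>();
        Node state = root;
        for (int i = 0; i < text.length(); i++) {
            char c = Character.toLowerCase(text.charAt(i));
            while (state != root && !state.next.containsKey(c)) state = state.fail;
            state = state.next.getOrDefault(c, root);
            for (String kw : state.hits) found.putIfAbsent(kw, i - kw.length() + 1);
        }
        return found;
    }

    public static void main(String[] args) {
        AhoCorasick ac = new AhoCorasick(List.of("bypass safety filter", "developer mode"));
        System.out.println(ac.scan("Enable developer mode and bypass safety filter"));
        // {developer mode=7, bypass safety filter=26}
    }
}
```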
CompositePatternEngine routes each PatternSpec to the correct sub-engine at findFirst() time based on its declared type. REGEX specs go to RegexPatternEngine; KEYWORD specs go to KeywordAutomatonEngine. Both engines run in parallel during findFirst() and the result with the earlier position in text wins. JailbreakDetector and ToxicityChecker both use this composite engine by default.
PatternSpec now carries a Type field (REGEX or KEYWORD). YAML entries default to REGEX; entries with type: KEYWORD are compiled as Aho-Corasick keywords rather than regex patterns.
TextPatternEngine now declares a findFirst(String text, List<PatternSpec> specs) default method that iterates specs and returns the first match. KeywordAutomatonEngine overrides it with a true single-pass Aho-Corasick scan. CompositePatternEngine overrides it to partition by type, run both sub-engines, and return the earliest positional match.
The bundled YAML files (jailbreak-patterns.yml, toxicity-patterns.yml) now support type: KEYWORD entries. Literal jailbreak phrases ("bypass safety filter", "developer mode enabled", etc.) and toxicity phrases ("kill yourself", "i hate you", etc.) are now KEYWORD entries matched by Aho-Corasick.
Japanese jailbreak and toxicity phrases are now defined directly in jailbreak-patterns.yml and toxicity-patterns.yml as type: KEYWORD entries and matched by the Aho-Corasick engine, the same engine used for all literal phrase matching. Japanese also remains in the multilingual keyword files for double coverage.
A new ToxicityChecker.Category.THIRD_PERSON_ABUSE detects derogatory content about absent third parties: insults ("he is an idiot"), dehumanising phrases ("waste of space"), and death wishes ("she should die"). Subjects are restricted to human-referencing pronouns and references to avoid false positives on abstract narrative text. Patterns use UNICODE_CHARACTER_CLASS for correct \b handling across all 7 supported languages.
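The effect of `UNICODE_CHARACTER_CLASS` on `\b` can be seen with plain `java.util.regex`. This is an illustrative example (the word and text are invented, not a bundled pattern): by default Java's `\w` covers only ASCII, so `\b` never fires around Cyrillic letters.

```java
import java.util.regex.Pattern;

public class BoundaryDemo {
    public static void main(String[] args) {
        String text = "он идиот"; // Russian: "he is an idiot"
        String regex = "\\bидиот\\b";

        // Default: \w is ASCII-only, Cyrillic letters are "non-word",
        // so no word boundary exists and the pattern never matches:
        System.out.println(Pattern.compile(regex).matcher(text).find()); // false

        // With UNICODE_CHARACTER_CLASS, \w covers Unicode letters and \b works:
        System.out.println(Pattern.compile(regex, Pattern.UNICODE_CHARACTER_CLASS)
            .matcher(text).find()); // true
    }
}
```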
New public methods on `PatternLoader`:

| Method | Description |
|---|---|
| `buildRegexEngine(resource, sections...)` | Build `RegexPatternEngine` from classpath YAML (skips KEYWORD entries) |
| `buildKeywordEngine(resource, sections...)` | Build `KeywordAutomatonEngine` from classpath YAML (skips REGEX entries) |
| `buildCompositeEngine(resource, sections...)` | Build a combined `CompositePatternEngine` from classpath YAML |
| `buildRegexEngineFromFile(path, section)` | Same, from a filesystem path |
| `buildKeywordEngineFromFile(path, section)` | Same, from a filesystem path |
| `buildCompositeEngineFromFile(path, section)` | Same, from a filesystem path |
| `loadSpecs(resource, section)` | Load `List<PatternSpec>` with type information |
| `loadAllKeywordsFromFile(path)` | Load all keyword strings from a YAML file |
JailbreakDetector.Builder and ToxicityChecker.Builder now expose:
```java
// Plug in a fully custom engine:
.engine(myTextPatternEngine)

// Replace defaults with patterns from a YAML file (regex + keyword):
.patternsFromFile(path, sectionKey)

// Extend defaults with extra patterns from a YAML file:
.addPatternsFromFile(path, sectionKey)

// ToxicityChecker only: replace / extend multilingual keywords:
.keywordsFromFile(path)
.addKeywordsFromFile(path)

// ToxicityChecker only: toggle multilingual detection:
.multilingualEnabled(boolean)
```

| Language | Before 1.0.0 | 1.0.0 |
|---|---|---|
| EN / RU / FR / DE / ES / PL / IT | regex | regex + Aho-Corasick (literal phrases) |
| JA | multilingual keyword phase only | main engine (Aho-Corasick) + multilingual keyword phase |
| ZH / AR / HI / TR / KO | multilingual keyword phase | multilingual keyword phase (unchanged) |