diff --git a/assets/images/posts/agentic/wasm.png b/assets/images/posts/agentic/wasm.png
Binary file (210 KB)
diff --git a/_posts/2025-11-19-wasm-ai-agents.adoc b/_posts/2025-11-19-wasm-ai-agents.adoc
@@ -0,0 +1,158 @@
+---
+layout: post
+title: 'Polyglot AI Agents: WebAssembly Meets the JVM'
+date: 2025-11-19T00:00:00Z
+tags: ai llm agents wasm jvm
+synopsis: 'Showcase how to run multi-language WebAssembly AI agents in a self-contained, enterprise-ready way with Quarkus'
+author: andreatp, mariofusco
+---
+:imagesdir: /assets/images/posts/agentic
+
+In two recent posts about WASM agents, Davide Eynard and Baris Guler show, respectively, https://blog.mozilla.ai/wasm-agents-ai-agents-running-in-your-browser/[how to bring agentic frameworks into the browser] and https://blog.mozilla.ai/3w-for-in-browser-ai-webllm-wasm-webworkers/[how to extend them] with in-browser inference and support for multiple programming languages. This article explores a different point of view: is it possible to leverage the JVM's polyglot capabilities to create a server-side equivalent that is equally self-contained but optimized for enterprise deployment? The result is a blueprint that showcases how the JVM can serve as a viable polyglot runtime for AI agents, combining the performance benefits of WebAssembly with the reliability and maturity of Java's ecosystem.
+
+== Why the JVM for AI Agents?
+
+The browser-based approach has clear advantages, such as privacy, offline capability, and user control, but enterprise environments often require different characteristics: centralized management, resource optimization, security controls, and integration with existing infrastructure.
+
+The JVM is one of the most widely used runtimes in the world, powering everything from enterprise applications to mobile development. Its platform independence enables applications to run seamlessly across diverse operating systems without changes, while its robust memory management with automatic garbage collection simplifies development and reduces memory leaks. The JVM's security model enforces strict policies including bytecode verification and sandboxing, and it supports multiple programming languages beyond Java, fostering a versatile and expansive ecosystem. Furthermore, it offers several compelling advantages for AI agent deployment:
+
+*Enterprise-Grade Infrastructure*: Built-in monitoring, profiling, debugging tools, and enterprise security features that are battle-tested in production environments.
+
+*Self-Contained Deployment*: Everything runs within JVM boundaries, with no external dependencies and no complex toolchain management, just a single JAR file that contains all the AI capabilities.
+
+*Polyglot Capabilities*: By leveraging WebAssembly on the JVM, we provide a unified execution model that supports multiple languages such as Rust, Go, Python, and JavaScript, while ensuring that each module runs in a securely isolated and sandboxed memory environment. This allows diverse agents to coexist within the same process without sharing memory and with strong safety boundaries.
+
+== Architecture: The JVM as a Polyglot AI Runtime
+
+Our architecture demonstrates how the JVM can serve as a unified runtime for multi-language AI agents.
+
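+A *REST API layer* exposes an endpoint that handles HTTP requests and routes them by path. In our example the path follows the format `/hello/{language}/{lang}/{name}`, as in `/hello/rust/en/Alice`: `{language}` selects the programming language of the agent, `{lang}` is a language code such as `en` or `fr`, and `{name}` is a name passed to the agent.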
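+
+As a rough illustration of this path format, the following plain-Java sketch splits a request path into its three parameters. The `AgentRoute` class, its field names, and the set of accepted agent languages (taken from the example requests later in this post) are assumptions made for illustration; the blueprint itself may well rely on the framework's path parameters instead.
+
```java
import java.util.Set;

/** Hypothetical sketch: splits a path such as /hello/rust/en/Alice into its three parameters. */
final class AgentRoute {

    // Agent languages as they appear in the example requests of this post (an assumption).
    private static final Set<String> AGENT_LANGUAGES = Set.of("rust", "go", "py", "js");

    final String language; // programming language of the agent, e.g. "rust"
    final String lang;     // language code, e.g. "en"
    final String name;     // name passed to the agent, e.g. "Alice"

    private AgentRoute(String language, String lang, String name) {
        this.language = language;
        this.lang = lang;
        this.name = name;
    }

    static AgentRoute parse(String path) {
        // A leading slash yields an empty first element: ["", "hello", language, lang, name]
        String[] parts = path.split("/");
        if (parts.length != 5 || !parts[0].isEmpty() || !"hello".equals(parts[1])) {
            throw new IllegalArgumentException("Expected /hello/{language}/{lang}/{name} but got: " + path);
        }
        if (!AGENT_LANGUAGES.contains(parts[2])) {
            throw new IllegalArgumentException("Unknown agent language: " + parts[2]);
        }
        return new AgentRoute(parts[2], parts[3], parts[4]);
    }

    public static void main(String[] args) {
        AgentRoute route = parse("/hello/rust/en/Alice");
        System.out.println(route.language + " " + route.lang + " " + route.name); // prints "rust en Alice"
    }
}
```
+
+Any path that does not have exactly this shape, or that names an unknown agent language, is rejected with an `IllegalArgumentException`.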
+
+A *service layer* makes different kinds of services available: a ChatService, which takes care of prompting the LLM, and an individual service for each agent type. Note that the services are language-specific (e.g. one for Rust, another for Go, and so on), while the agents themselves are parameterized by the values passed via the REST API.
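+
+One way to picture this layer is a small registry that maps each agent language to its own service and forwards the remaining parameters to it. The sketch below is hypothetical: `AgentDispatcher`, `AgentService`, and `greet` are illustrative names, and the lambdas stand in for services that would really call into WASM modules.
+
```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a service layer that selects an agent by implementation language. */
final class AgentDispatcher {

    /** Stand-in for a language-specific service; a real one would invoke a WASM module. */
    interface AgentService {
        String greet(String lang, String name);
    }

    private final Map<String, AgentService> services = new LinkedHashMap<>();

    void register(String language, AgentService service) {
        services.put(language, service);
    }

    String dispatch(String language, String lang, String name) {
        AgentService service = services.get(language);
        if (service == null) {
            throw new IllegalArgumentException("No agent registered for language: " + language);
        }
        // The service is chosen by language; lang and name are passed through as parameters.
        return service.greet(lang, name);
    }

    public static void main(String[] args) {
        AgentDispatcher dispatcher = new AgentDispatcher();
        // Placeholder implementations, not the blueprint's actual agents.
        dispatcher.register("rust", (lang, name) -> "[rust/" + lang + "] Hello, " + name);
        dispatcher.register("go", (lang, name) -> "[go/" + lang + "] Hello, " + name);
        System.out.println(dispatcher.dispatch("rust", "en", "Alice")); // prints "[rust/en] Hello, Alice"
    }
}
```
+
+Keeping the lookup separate from the services means that adding an agent in another language only requires one more `register` call.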
+
+A *WebAssembly runtime layer* takes care of integrating WASM modules built in different programming languages, relying on pure https://github.com/dylibso/chicory[Chicory] for Rust/Go modules and on the https://github.com/extism/chicory-sdk[Chicory SDK] from https://extism.org/[Extism] for Python.
+
+An *AI integration layer* takes care of actually integrating the LLM into our system, using https://github.com/langchain4j/langchain4j[LangChain4j] as a Java integration framework, https://github.com/jlama-ai/jlama[JLama] (an LLM inference engine written in Java) for model inference, and TinyLlama-1.1B-Chat-v1.0 as a lightweight model for efficient local processing. LangChain4j integrates with both local and cloud-based language models, and its modular architecture makes it easy to switch between model providers. JLama provides inference and memory management designed for the JVM environment. Because both JLama and Chicory run entirely within the JVM, the whole system stays self-contained within the Java ecosystem. TinyLlama-1.1B-Chat-v1.0 is a compact model that works well for demo purposes and can run on development machines, showing that local inference does not always require massive resources.
+
+[.text-center]
+.JVM as a Polyglot AI Agent Runtime
+image::wasm.png[width=50%, align="center", alt="JVM as a Polyglot AI Agent Runtime"]
+
+== The Polyglot Advantage
+
+Each language brings its strengths to the AI agent ecosystem:
+
+*Rust*: High-performance systems programming with memory safety, compiled to WebAssembly for maximum efficiency
+
+*Go*: Efficient, with clean syntax, leveraging TinyGo for WASM compilation with WASI support
+
+*Python*: Flexible scripting via PyO3 compilation to WASM, maintaining Python's expressiveness
+
+*JavaScript*: Dynamic scripting with QuickJS integration, enabling runtime flexibility
+
+The beauty of this approach is that all these languages run within the same JVM process and share its resources, while each module keeps its own isolated memory and its individual characteristics. This means that, whatever language you used to build your agent, it can be integrated into this system, possibly together with other agents written in different languages.
+
+== Performance and Resource Efficiency
+
+The JVM approach offers several performance advantages:
+
+*Memory Efficiency*: The JVM's garbage collector manages memory for all languages uniformly, eliminating the memory management complexity of multi-runtime approaches.
+
+*Resource Sharing*: All agents share the same JVM process, reducing memory overhead and enabling efficient resource utilization.
+
+*Enterprise Monitoring*: Built-in JVM monitoring tools provide visibility into agent performance, memory usage, and execution patterns.
+
+*Scalability*: The stateless design allows for horizontal scaling across multiple JVM instances, with load balancing and container orchestration.
+
+== Try It Yourself
+
+Getting started is straightforward. The commands below clone the repository, start the application in Quarkus dev mode, and call each agent:
+
+[source,shell]
+----
+# Clone the repository
+git clone https://github.com/mozilla-ai/wasm-java-agents-blueprint.git
+cd wasm-java-agents-blueprint
+
+# Start the application
+./mvnw quarkus:dev
+
+# Test the polyglot agents
+curl -X PUT "http://localhost:8080/hello/rust/en/Alice" \
+     -H "Content-Type: text/plain" \
+     --data "Tell me about yourself"
+
+curl -X PUT "http://localhost:8080/hello/go/fr/Bob" \
+     -H "Content-Type: text/plain" \
+     --data "What can you do?"
+
+curl -X PUT "http://localhost:8080/hello/py/de/Charlie" \
+     -H "Content-Type: text/plain" \
+     --data "Explain your capabilities"
+
+curl -X PUT "http://localhost:8080/hello/js/es/Diana" \
+     -H "Content-Type: text/plain" \
+     --data "How do you work?"
+----
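+
+The same requests can be issued from Java code. The sketch below builds the equivalent of the `curl` calls above with the JDK's built-in `java.net.http` API (Java 11 or later). The `AgentRequests` class and the `agentRequest` helper are illustrative names, and the sketch assumes that the path segments need no URL encoding.
+
```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch: builds the same kind of PUT request as the curl commands above. */
final class AgentRequests {

    static HttpRequest agentRequest(String baseUrl, String language, String lang,
                                    String name, String prompt) {
        // Assumes language, lang and name contain no characters that need URL encoding.
        URI uri = URI.create(baseUrl + "/hello/" + language + "/" + lang + "/" + name);
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "text/plain")
                .PUT(HttpRequest.BodyPublishers.ofString(prompt))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = agentRequest(
                "http://localhost:8080", "rust", "en", "Alice", "Tell me about yourself");
        // Only prints the request line; nothing is sent over the network here.
        System.out.println(request.method() + " " + request.uri());
    }
}
```
+
+To actually send the request while the application is running, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and read the reply from the returned response's `body()`.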
+
+== What This Enables
+
+The JVM approach opens up several enterprise-focused use cases:
+
+*Centralized AI Services*: Deploy AI agents as microservices within existing Java infrastructure, leveraging existing monitoring, security, and deployment tools.
+
+*Multi-Language AI Pipelines*: Build complex AI workflows that combine agents developed in different programming languages.
+
+*Enterprise Integration*: Seamless integration with existing Java-based systems, databases, and enterprise middleware.
+
+*Resource Optimization*: Efficient resource utilization through shared JVM processes and optimized garbage collection.
+
+*Security and Compliance*: Leverage JVM security features and enterprise-grade access controls for sensitive AI applications.
+
+== Enhancement Areas
+
+Several architectural improvements could significantly enhance this JVM-based approach. The current implementation works well, but there's room for optimization and new capabilities:
+
+*Multi-Model Agent Orchestration*: Enable agents to work together across languages, with Rust agents handling performance-critical tasks, Python agents managing data processing, and JavaScript agents providing dynamic behavior.
+
+*Performance Optimization*: Add JVM-specific optimizations like GraalVM native compilation for reduced memory footprint and faster startup times, plus advanced garbage collection tuning for WASM workloads.
+
+*Distributed Agent Coordination*: Extend the architecture to support distributed agent execution across multiple JVM instances with shared state and message passing between agents.
+
+*Enhanced Monitoring and Observability*: Integrate with enterprise monitoring tools like Micrometer, Prometheus, and distributed tracing to provide comprehensive visibility into agent performance and behavior.
+
+*Dynamic Agent Loading*: Support hot-swapping of agent implementations without service restarts, enabling A/B testing and gradual rollouts of new agent capabilities.
+
+*Integration with Enterprise Middleware*: Enhanced integration with message queues, event streams, and enterprise service buses for building complex AI workflows.
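+
+The *Dynamic Agent Loading* idea could be prototyped in many ways. As one possibility, and not something the current blueprint implements, the sketch below keeps the active agent behind an `AtomicReference` so that a new implementation can be installed without a restart. `SwappableAgent` and its methods are hypothetical names, and the lambdas stand in for real agent implementations.
+
```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Hypothetical sketch: an agent slot whose implementation can be replaced at runtime. */
final class SwappableAgent {

    private final AtomicReference<UnaryOperator<String>> current;

    SwappableAgent(UnaryOperator<String> initial) {
        this.current = new AtomicReference<>(initial);
    }

    /** Each call uses whichever implementation is installed at the moment it starts. */
    String handle(String prompt) {
        return current.get().apply(prompt);
    }

    /** Atomically installs a new implementation; calls already running finish on the old one. */
    void swap(UnaryOperator<String> replacement) {
        current.set(replacement);
    }

    public static void main(String[] args) {
        SwappableAgent agent = new SwappableAgent(prompt -> "v1: " + prompt);
        System.out.println(agent.handle("hello")); // prints "v1: hello"
        agent.swap(prompt -> "v2: " + prompt);
        System.out.println(agent.handle("hello")); // prints "v2: hello"
    }
}
```
+
+A gradual rollout or A/B test would need more than this, for example routing only a share of the requests to the new implementation, but the atomic swap is the core of the restart-free update.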
+
+== Conclusion
+
+The JVM's polyglot capabilities, combined with WebAssembly and modern AI frameworks, create a viable platform for deploying AI agents in enterprise environments. By leveraging WASI for secure WebAssembly execution and LangChain4j for AI integration, we can create self-contained, efficient, and scalable AI agent systems that integrate seamlessly with existing Java infrastructure.
+
+Both the browser-based 3W approach and our JVM-based approach represent different but complementary visions for AI agent deployment. The browser approach prioritizes user control and privacy, while the JVM approach emphasizes enterprise integration and resource efficiency.
+
+The key insight is that the choice of runtime, whether browser or JVM, should be driven by the specific requirements of your use case. For consumer applications and privacy-sensitive scenarios, the browser approach is ideal. For enterprise deployments and resource-intensive applications, the JVM approach provides the necessary infrastructure and performance characteristics.
+
+The future of AI agent deployment isn't about choosing between the available approaches; it's about having the right tool for the right job. As we continue to explore the boundaries of what's possible with AI agents, the JVM's mature ecosystem and polyglot capabilities provide a solid foundation for building the next generation of enterprise AI applications.
+
+== Links and Resources
+
+This blueprint is built on top of several excellent open-source projects. Here are the key technologies and resources:
+
+=== Technologies
+
+- https://quarkus.io/[Quarkus] - Modern, cloud-native Java framework with fast startup times
+- https://github.com/langchain4j/langchain4j[LangChain4j] - Java AI framework for LLM integration
+- https://github.com/jlama-ai/jlama[JLama] - LLM inference engine for Java, used for local inference
+- https://github.com/dylibso/chicory[Chicory] - Pure Java WebAssembly runtime
+- https://github.com/extism/chicory-sdk[Extism Chicory SDK] - Extism SDK for Chicory WebAssembly runtime
+- https://github.com/extism/python-pdk[Extism Python PDK] - Python Plugin Development Kit for Extism
+- https://github.com/roastedroot/quickjs4j[QuickJS4j] - JavaScript execution within the JVM
+- https://tinygo.org/[TinyGo] - Go compiler for WebAssembly
+- https://github.com/PyO3/pyo3[PyO3] - Rust bindings for Python
+
+=== Related Projects
+
+- https://github.com/mozilla-ai/wasm-agents-blueprint[wasm-agents-blueprint] - Original browser-based WASM agents blueprint
+- https://github.com/hwclass/wasm-browser-agents-blueprint[wasm-browser-agents-blueprint] - Browser-native AI agents with WebLLM + WASM + WebWorkers
+- https://developer-hub.mozilla.ai/[Mozilla.ai Blueprints Hub] - Collection of AI agent blueprints and examples