[models] Support for Qwen3 models #37
Merged
31 commits
df3a6aa Add initial support for Qwen3 models on CPU (orionpapadakis)
fa30068 Move ModelLoader classes to model.loader package (orionpapadakis)
489d1ee Move State classes to inference.state package (orionpapadakis)
6e8b0f5 Move Weights to inference.weights package (orionpapadakis)
388aecd [WIP] Refactor Weight class for modularity and extensibility (orionpapadakis)
c1ae6bc Add weights for qwen3 in tornado format (orionpapadakis)
3652383 Refactor Model design. Abandon Records, adopt interface with abstract… (orionpapadakis)
7893bfa Increase bytecode size (orionpapadakis)
68e2d70 [WIP] Add a initial Tornado inference implementation for Qwen3 with c… (orionpapadakis)
e1eed87 [WIP] Cleanup (orionpapadakis)
6b01570 Use optimized tornado kernel for Attention (orionpapadakis)
7fa548e Optimize Qcur rmsnorm (orionpapadakis)
86378ad Apply optimizations to Kcur rmsnorm and rename some Qcur fields (orionpapadakis)
e6b279c Add an optimized kernel for attention (orionpapadakis)
9f5929a Cleanup and add some comments in forwardJavaQwen3 (orionpapadakis)
5f3b6c2 Cleanup InferenceEngine (orionpapadakis)
717257a Fix naming consistency of generateTokensXXX methods and add comments (orionpapadakis)
4a29063 Cleanup dbg buffers functionality (orionpapadakis)
c4cd588 Clean up Qwen3State (orionpapadakis)
4541c14 Cleanup model (orionpapadakis)
e1a4632 Provide an optimized rmsnorm kernel that fuses steps 1 and 2 (orionpapadakis)
6224ae0 Cleanup Qwen3Kernels (orionpapadakis)
5b5e9e3 Cleanup Qwen3TornadoVMLayerPlanner (orionpapadakis)
7dc5056 General cleanup (orionpapadakis)
7b5f052 Point to latest tornadovm (orionpapadakis)
480a1f0 Additional cleanup (orionpapadakis)
5224381 Move things around for smooth interactive mode and consistency (orionpapadakis)
dabbdfb Remove duplicative kernel (orionpapadakis)
04ba434 Refactor and improve code formatting across multiple files (orionpapadakis)
16f5114 Finalize review comments (orionpapadakis)
2fd98ef Update README.md (orionpapadakis)
Submodule tornadovm updated from a81afa to 6e29a5
@@ -0,0 +1,12 @@

```java
package com.example.auxiliary;

/** mask of a byte-sequence in UTF-8 encoding */
public record Utf8Mask(int mask, int pattern, int len) {
    //@formatter:off
    public static final Utf8Mask[] MASKS = {
        new Utf8Mask(0b11100000, 0b11000000, 2),
        new Utf8Mask(0b11110000, 0b11100000, 3),
        new Utf8Mask(0b11111000, 0b11110000, 4)
    };
    //@formatter:on
}
```
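For readers of this diff, here is a minimal sketch of how a table like `MASKS` can be used to find the length of a UTF-8 sequence from its lead byte: a byte `b` starts an `m.len()`-byte sequence when `(b & m.mask()) == m.pattern()`. The record is copied here only so the sketch compiles on its own. `Utf8MaskDemo` and `sequenceLength` are hypothetical names that do not appear in the PR, and how the PR's tokenizer actually consumes `MASKS` is not shown in this view.

```java
import java.nio.charset.StandardCharsets;

/** Standalone copy of the record from the diff, so this sketch compiles on its own. */
record Utf8Mask(int mask, int pattern, int len) {
    static final Utf8Mask[] MASKS = {
        new Utf8Mask(0b11100000, 0b11000000, 2),
        new Utf8Mask(0b11110000, 0b11100000, 3),
        new Utf8Mask(0b11111000, 0b11110000, 4)
    };
}

final class Utf8MaskDemo {
    /**
     * Hypothetical helper (not part of the PR): returns the total byte length of the
     * UTF-8 sequence introduced by leadByte, 1 for an ASCII byte, or -1 when the byte
     * matches none of the lead-byte patterns (for example a continuation byte).
     */
    static int sequenceLength(int leadByte) {
        int b = leadByte & 0xFF;              // undo sign extension of Java bytes
        if ((b & 0b10000000) == 0) {
            return 1;                         // 0xxxxxxx: single-byte (ASCII)
        }
        for (Utf8Mask m : Utf8Mask.MASKS) {
            if ((b & m.mask()) == m.pattern()) {
                return m.len();               // 110xxxxx, 1110xxxx or 11110xxx
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // "a", e-acute, euro sign, and an emoji, written with escapes to stay encoding-safe.
        byte[] bytes = "a\u00E9\u20AC\uD83D\uDE00".getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < bytes.length; ) {
            int len = sequenceLength(bytes[i]);
            System.out.printf("offset %d: %d-byte sequence%n", i, len);
            i += len;
        }
    }
}
```

The sketch checks only the lead byte. It does not validate the continuation bytes or reject overlong encodings, which a full decoder would need to do.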
src/main/java/com/example/core/model/tensor/F32FloatTensor.java (48 additions, 0 deletions)
@@ -0,0 +1,48 @@

```java
package com.example.core.model.tensor;

import com.example.core.model.GGMLType;
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorSpecies;

import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public final class F32FloatTensor extends FloatTensor {
    final int size;
    final MemorySegment segment;

    public F32FloatTensor(int size, MemorySegment segment) {
        this.size = size;
        this.segment = segment;
    }

    @Override
    public int size() {
        return size;
    }

    @Override
    public GGMLType type() {
        return GGMLType.F32;
    }

    @Override
    public MemorySegment asMemorySegment() {
        return null; // this tensor does not expose its backing segment
    }

    @Override
    public float getFloat(int index) {
        return segment.get(ValueLayout.JAVA_FLOAT, (long) index * Float.BYTES); // long offset avoids int overflow
    }

    @Override
    public void setFloat(int index, float value) {
        segment.set(ValueLayout.JAVA_FLOAT, (long) index * Float.BYTES, value); // long offset avoids int overflow
    }

    @Override
    protected FloatVector getFloatVector(VectorSpecies<Float> species, int offset) {
        throw new UnsupportedOperationException("getFloatVector is not yet implemented.");
    }
}
```
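`getFloat` and `setFloat` address elements by byte offset, that is, the element index scaled by `Float.BYTES`. The sketch below shows why that product is safest when computed in `long` arithmetic: an `int` product wraps once the index exceeds `Integer.MAX_VALUE / 4`, and the wrapped value is only widened afterwards. `ByteOffsetDemo`, `intOffset`, and `longOffset` are hypothetical names used for illustration and are not part of the PR.

```java
final class ByteOffsetDemo {
    /** Offset computed in int arithmetic; the product wraps before it is widened to long. */
    static long intOffset(int index) {
        return index * Float.BYTES;
    }

    /** Offset computed after widening the index to long; no wrap-around for any int index. */
    static long longOffset(int index) {
        return (long) index * Float.BYTES;
    }

    public static void main(String[] args) {
        int small = 1_000;
        int large = 600_000_000; // 2.4e9 bytes, beyond Integer.MAX_VALUE

        System.out.println(intOffset(small) + " vs " + longOffset(small));
        System.out.println(intOffset(large) + " vs " + longOffset(large));
    }
}
```

For small indices the two methods agree. For the large index, the `int` version yields a negative offset, which `MemorySegment` accessors would reject as out of bounds.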