Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Kotlin multiplatform support #7969

Merged
merged 11 commits into from
May 26, 2023
Merged

Conversation

paulovap
Copy link
Collaborator

This PR adds a new kotlin code generator and new runtime for Kotlin multi platform.

Previous implementation of the kotlin generator relies on java libraries and limited to JVM only implementation. This new generator breaks away for that approach by creating a pure kotlin runtime, which can be used, in theory, on any platform that Kotlin multiple forms supports.

Aside that, the implementation adds more type safety in the API at little to no cost, due to the value class types on kotlin.

For now the new kotlin code can be generated with the flag --kotlin-kmp on flatc. This will change as soon as we publish the libraries on maven, making it the default implementation in the future. Deprecating the old one over time.

Summary of features:

  • 1:1 feature parity with old kotlin implementation
  • add type safe API for passing offsets as parameters
  • enums are represented as value classes
  • Supports iOS and MacOS as targets, more will come later.

What is missing:

  • As the old Kotlin and Java API, unions can only return Tables.
  • Support Java's ByteBuffer
  • Support Okio
  • Release on Maven

The first implementation of the Kotlin code generation was made years
ago at the time Kotlin Multiplaform was not stable and Kotlin is mostly
used on JVM-based targets. For this reason the generated code uses java
based runtime.

That design decision comes with many drawbacks, leaving the code
generated more java-like and making it impossible to use more advanced
features of the Kotlin language.

In this change we are adding two parts: A pure, multi-plaform, Kotlin
runtime and a new code generator to accompany it.
Now that we have a new runtime the accepts unsigned types, we don't
need to code generate casting back and from signed scalars. This
MR removes this from both code generations and adds the necessary
API to the runtime.
Currently, kotlin was following Java's approach of representing objects,
vectors, tables as "Int" (the position of it in the buffer). This change
replaces naked Int with Offset<T>, offering a type-safe API. So,
instead of

fun Table.createTable(b: FlatBufferBuilder, subTable: Int)

We will have

fun Table.createTable(b: FlatBufferBuilder, subTable: Offset<SubTable>)

Making impossible to accidentally switch parameters.

The performance should be similar to use Int as we are using value
class for Offset and ArrayOffset, which most of the time translate to
Int in the bytecode.
Add builder constructor to make create of table more ergonomic.
For example the movie sample for the test set could be written as:

Movie.createMovie(fbb,
    mainCharacterType = Character_.MuLan,
    mainCharacter = att) {
    charactersType = charsType
    this.characters = characters
}

instead of:

Movie.startMovie(fbb)
Movie.addMainCharacterType(fbb, Character_.MuLan)
Movie.addMainCharacter(fbb, att as Offset<Any>)
Movie.addCharactersType(fbb, charsType)
Movie.addCharacters(fbb, charsVec)
Movie.endMovie(fbb)
Moving to flatbuffer enums to value class adds type safety for parameters
with minimum to no performance impact.
Just a small change on the APIs that receive union as parameters,
creating a typealias UnionOffset to avoid using Offset<Any>. To "convert"
an table offset to an union, one just call Offset.toUnion().
1 - Add benchmark comparing serialization between Java & Kotlin
2 - ReadWriteBuffer does not auto-grow (thus avoid check size in every op)
3 - Add specialized add functions on FlatBufferBuilder to avoid boxing
offsets.
4 - Remove a few Kotlin syntax sugar that generated performance penalties.
@github-actions github-actions bot added c++ CI Continuous Integration codegen Involving generating code from schema kotlin labels May 20, 2023
@paulovap
Copy link
Collaborator Author

Added a small benchmark comparing with Java implementation.

jvm summary:
Benchmark                                          Mode  Cnt         Score         Error  Units
FlatbufferBenchmark.monstersDeserializationJava    avgt    5  60202904,433 ± 2571267,473  ns/op
FlatbufferBenchmark.monstersDeserializationKotlin  avgt    5  63491321,720 ±  582610,015  ns/op
FlatbufferBenchmark.monstersSerializationJava      avgt    5  43302673,950 ±  195710,100  ns/op
FlatbufferBenchmark.monstersSerializationKotlin    avgt    5  46153101,229 ±   87102,836  ns/op

@mikeholler
Copy link
Contributor

This is excellent! It's been on my list to review this and I'm glad to find it was already merged.

@dbaileychess this is an important functionality addition. What do you think about creating a new tag so that we can start using it without needing to publish on our own? I'm not sure what the release overhead looks like but if it's easy I'd appreciate it and I'm sure others would, too.

@paulovap
Copy link
Collaborator Author

This is excellent! It's been on my list to review this and I'm glad to find it was already merged.

@dbaileychess this is an important functionality addition. What do you think about creating a new tag so that we can start using it without needing to publish on our own? I'm not sure what the release overhead looks like but if it's easy I'd appreciate it and I'm sure others would, too.

It will take some time to get the whole maven setup right, but this is in progress on #8014

@mikeholler
Copy link
Contributor

Thanks, I've subscribed for updates on that ticket. I also want to say that I've done a prototype where I've built flatc and interned the library code manually and verified everything worked with our current schema. We previously had manually written common layer, with specific implementation code for JS, JVM, and Native that used generated flatc code for each language. It was such a mess and I was able to delete three of those layers mostly seamlessly. So @paulovap just wanted to say thank you and that it's nice to see it working so well.

I do think I found a small bug that I need to narrow down where, in JS, null is returned from a function that's supposed to return a non-nullable Byte, which was strange. I was able to work around it but I want to isolate it and see if I can reproduce it into a bug report. Should be able to get to that in the next few days but I don't think it should stop or halt release.

@paulovap
Copy link
Collaborator Author

paulovap commented Jun 28, 2023

So @paulovap just wanted to say thank you and that it's nice to see it working so well.

I am glad it will be of use :)

I do think I found a small bug that I need to narrow down where, in JS, null is returned from a function that's supposed to return a non-nullable Byte, which was strange. I was able to work around it but I want to isolate it and see if I can reproduce it into a bug report. Should be able to get to that in the next few days but I don't think it should stop or halt release.

Interesting. If you can isolate the test case, I can definitely fix it.

jochenparm pushed a commit to jochenparm/flatbuffers that referenced this pull request Oct 29, 2024
* [Kotlin] Introduction to Kotlin Multiplaform

The first implementation of the Kotlin code generation was made years
ago at the time Kotlin Multiplaform was not stable and Kotlin is mostly
used on JVM-based targets. For this reason the generated code uses java
based runtime.

That design decision comes with many drawbacks, leaving the code
generated more java-like and making it impossible to use more advanced
features of the Kotlin language.

In this change we are adding two parts: A pure, multi-plaform, Kotlin
runtime and a new code generator to accompany it.

* [Kotlin] Remove scalar sign cast from code generation

Now that we have a new runtime the accepts unsigned types, we don't
need to code generate casting back and from signed scalars. This
MR removes this from both code generations and adds the necessary
API to the runtime.

* [Kotlin] Use offset on public API to represent buffer position

Currently, kotlin was following Java's approach of representing objects,
vectors, tables as "Int" (the position of it in the buffer). This change
replaces naked Int with Offset<T>, offering a type-safe API. So,
instead of

fun Table.createTable(b: FlatBufferBuilder, subTable: Int)

We will have

fun Table.createTable(b: FlatBufferBuilder, subTable: Offset<SubTable>)

Making impossible to accidentally switch parameters.

The performance should be similar to use Int as we are using value
class for Offset and ArrayOffset, which most of the time translate to
Int in the bytecode.

* [Kotlin] Add builder for tables

Add builder constructor to make create of table more ergonomic.
For example the movie sample for the test set could be written as:

Movie.createMovie(fbb,
    mainCharacterType = Character_.MuLan,
    mainCharacter = att) {
    charactersType = charsType
    this.characters = characters
}

instead of:

Movie.startMovie(fbb)
Movie.addMainCharacterType(fbb, Character_.MuLan)
Movie.addMainCharacter(fbb, att as Offset<Any>)
Movie.addCharactersType(fbb, charsType)
Movie.addCharacters(fbb, charsVec)
Movie.endMovie(fbb)

* [Kotlin] Move enum types to value class

Moving to flatbuffer enums to value class adds type safety for parameters
with minimum to no performance impact.

* [Kotlin] Simplify Union parameters to avoid naked casting

Just a small change on the APIs that receive union as parameters,
creating a typealias UnionOffset to avoid using Offset<Any>. To "convert"
an table offset to an union, one just call Offset.toUnion().

* [Kotlin] Apply clang-format on kotlin code generators

* [Kotlin] Update kotlin generator to follow official naming conventions

Updating directory, package and enum naming to follow Kotlin official
convention.

https://kotlinlang.org/docs/coding-conventions.html#naming-rules

* [Kotlin] Add fixes to improve performance

1 - Add benchmark comparing serialization between Java & Kotlin
2 - ReadWriteBuffer does not auto-grow (thus avoid check size in every op)
3 - Add specialized add functions on FlatBufferBuilder to avoid boxing
offsets.
4 - Remove a few Kotlin syntax sugar that generated performance penalties.

* [Kotlin] Remove builder from Kotlin KMP and add some optimizations
to avoid boxing of Offset classes

---------

Co-authored-by: Derek Bailey <derekbailey@google.com>
jochenparm pushed a commit to jochenparm/flatbuffers that referenced this pull request Oct 29, 2024
* [Kotlin] Introduction to Kotlin Multiplaform

The first implementation of the Kotlin code generation was made years
ago at the time Kotlin Multiplaform was not stable and Kotlin is mostly
used on JVM-based targets. For this reason the generated code uses java
based runtime.

That design decision comes with many drawbacks, leaving the code
generated more java-like and making it impossible to use more advanced
features of the Kotlin language.

In this change we are adding two parts: A pure, multi-plaform, Kotlin
runtime and a new code generator to accompany it.

* [Kotlin] Remove scalar sign cast from code generation

Now that we have a new runtime the accepts unsigned types, we don't
need to code generate casting back and from signed scalars. This
MR removes this from both code generations and adds the necessary
API to the runtime.

* [Kotlin] Use offset on public API to represent buffer position

Currently, kotlin was following Java's approach of representing objects,
vectors, tables as "Int" (the position of it in the buffer). This change
replaces naked Int with Offset<T>, offering a type-safe API. So,
instead of

fun Table.createTable(b: FlatBufferBuilder, subTable: Int)

We will have

fun Table.createTable(b: FlatBufferBuilder, subTable: Offset<SubTable>)

Making impossible to accidentally switch parameters.

The performance should be similar to use Int as we are using value
class for Offset and ArrayOffset, which most of the time translate to
Int in the bytecode.

* [Kotlin] Add builder for tables

Add builder constructor to make create of table more ergonomic.
For example the movie sample for the test set could be written as:

Movie.createMovie(fbb,
    mainCharacterType = Character_.MuLan,
    mainCharacter = att) {
    charactersType = charsType
    this.characters = characters
}

instead of:

Movie.startMovie(fbb)
Movie.addMainCharacterType(fbb, Character_.MuLan)
Movie.addMainCharacter(fbb, att as Offset<Any>)
Movie.addCharactersType(fbb, charsType)
Movie.addCharacters(fbb, charsVec)
Movie.endMovie(fbb)

* [Kotlin] Move enum types to value class

Moving to flatbuffer enums to value class adds type safety for parameters
with minimum to no performance impact.

* [Kotlin] Simplify Union parameters to avoid naked casting

Just a small change on the APIs that receive union as parameters,
creating a typealias UnionOffset to avoid using Offset<Any>. To "convert"
an table offset to an union, one just call Offset.toUnion().

* [Kotlin] Apply clang-format on kotlin code generators

* [Kotlin] Update kotlin generator to follow official naming conventions

Updating directory, package and enum naming to follow Kotlin official
convention.

https://kotlinlang.org/docs/coding-conventions.html#naming-rules

* [Kotlin] Add fixes to improve performance

1 - Add benchmark comparing serialization between Java & Kotlin
2 - ReadWriteBuffer does not auto-grow (thus avoid check size in every op)
3 - Add specialized add functions on FlatBufferBuilder to avoid boxing
offsets.
4 - Remove a few Kotlin syntax sugar that generated performance penalties.

* [Kotlin] Remove builder from Kotlin KMP and add some optimizations
to avoid boxing of Offset classes

---------

Co-authored-by: Derek Bailey <derekbailey@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ CI Continuous Integration codegen Involving generating code from schema kotlin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants