Sentence Embeddings in Android

An Android library that provides a port to sentence-transformers, which are used to generate sentence embeddings (fixed-size vectors for text/sentences)

Read the blog: From Python To Android: HF Sentence Transformers (Embeddings)

Updates

2025-06

Add support for 16 KB page-size Android devices by updating project NDK version to r28b
Modify Gradle scripts of sentence_embeddings and model2vec modules to publish the AAR as a package on Maven Central

2025-03

Move the Rust source code from the libs branch to the main branch: We now use the rust-android-plugin to initiate cargo build from Gradle
Removed Git LFS: The ONNX models present in app/src/main/assets have been removed from the repository. Instead, app/build.gradle.kts downloads the models and tokenizer configs from HuggingFace using download_model.sh shell script.
Add Model2Vec: Model2Vec provides static sentence-embeddings through a fast-lookup
Remove Jitpack: A GitHub CI script now builds AARs for model2vec and sentence_embeddings Gradle modules that can be included in other projects

2024-08

Along with token_ids and attention_mask, the native library now also returns token_type_ids to support additional models like the bge-small-en-v1.5 (issue #3)

Supported Models

To add more models, refer the Adding New Models section.

Installation

Maven Artifacts

Include the following in your build.gradle script,

dependencies {
    // ... other packages
    
    // To use sentence-embeddings
    implementation 'io.gitlab.shubham0204:sentence-embeddings:v6'
    
    // To also use model2vec
    implementation 'io.gitlab.shubham0204:model2vec:v6'
}

Using the AAR from the Releases directly

The AARs for the sentence_embeddings and model2vec modules are available in the Releases which can be downloaded. Add the AARs to the app/libs directory and then in app/build.gradle.kts,

dependencies {
    // ...
    // Add one or both of them as needed
    implementation(file("libs/sentence_embeddings.aar"))
    implementation(file("libs/model2vec.aar"))
    // ...
}

Building the Project

Set up Android NDK version r27c

# Using the nttld/setup-ndk action
# Example manual equivalent:
wget https://dl.google.com/android/repository/android-ndk-r27c-linux.zip
unzip android-ndk-r27c-linux.zip
export ANDROID_NDK_HOME=/path/to/android-ndk-r27c

Install Rust targets for Android

rustup target add aarch64-linux-android armv7-linux-androideabi i686-linux-android x86_64-linux-android

Build the Rust code
```
./gradlew cargoBuild --stacktrace
```

Build AAR for sentence_embeddings module

./gradlew :sentence_embeddings:assembleRelease --stacktrace

Build AAR for model2vec module

./gradlew :model2vec:assembleRelease --stacktrace

Build APK for app module

./gradlew :app:assembleRelease --stacktrace

Build APK for app-model2vec module

./gradlew :app-model2vec:assembleRelease --stacktrace

Usage

API

The library provides a SentenceEmbedding class with init and encode suspend functions that initialize the model and generate the sentence embedding respectively.

The init function takes two mandatory arguments, modelBytes and tokenizerBytes.

import com.ml.shubham0204.sentence_embeddings.SentenceEmbedding

val sentenceEmbedding = SentenceEmbedding()

// Download the model and store it in the app's internal storage
// (OR) copy the model from the assets folder (see the app module in the repo)
val modelFile = File(filesDir, "model.onnx")
val tokenizerFile = File(filesDir, "tokenizer.json")
val tokenizerBytes = tokenizerFile.readBytes()

CoroutineScope(Dispatchers.IO).launch {
    sentenceEmbedding.init(
        modelFilepath = modelFile.absolutePath,
        tokenizerBytes = tokenizerBytes,
        useTokenTypeIds = false,
        outputTensorName = "sentence_embedding",
        useFP16 = false,
        useXNNPack = false
    )
}

Once the init functions completes its execution, we can call the encode function to transform the given sentence to an embedding,

CoroutineScope(Dispatchers.IO).launch {
    val embedding: FloatArray = sentenceEmbedding.encode("Delhi has a population 32 million")
    println("Embedding: $embedding")
    println("Embedding size: ${embedding.size}")
}

Compute Cosine Similarity

The embeddings are vectors whose relative similarity can be computed by measuring the cosine of the angle between the vectors, also termed as cosine similarity,

Tip

Here's an excellent blog to under cosine similarity

private fun cosineDistance(
    x1: FloatArray,
    x2: FloatArray
): Float {
    var mag1 = 0.0f
    var mag2 = 0.0f
    var product = 0.0f
    for (i in x1.indices) {
        mag1 += x1[i].pow(2)
        mag2 += x2[i].pow(2)
        product += x1[i] * x2[i]
    }
    mag1 = sqrt(mag1)
    mag2 = sqrt(mag2)
    return product / (mag1 * mag2)
}

CoroutineScope(Dispatchers.IO).launch {
    val e1: FloatArray = sentenceEmbedding.encode("Delhi has a population 32 million")
    val e2: FloatArray = sentenceEmbedding.encode("What is the population of Delhi?")
    val e3: FloatArray =
        sentenceEmbedding.encode("Cities with a population greater than 4 million are termed as metro cities")

    val d12 = cosineDistance(e1, e2)
    val d13 = cosineDistance(e1, e3)
    println("Similarity between e1 and e2: $d12")
    println("Similarity between e1 and e3: $d13")
}

Adding New Models

We demonstrate how the snowflake-arctic-embed-s model can be added to the sample application present in the app module.

Download the model.onnx and tokenizer.json files from the HF snowflake-arctic-embed-s repository.
Create a new sub-directory in app/src/main/assets named snowflake-arctic-embed-s, the copy the two files to the sub-directory.
In Config.kt, add a new entry in the Models enum and a new branch in getModelConfig corresponding to the new model entry added in the enum,

enum class Model {
    ALL_MINILM_L6_V2,
    BGE_SMALL_EN_V1_5,
    SNOWFLAKE_ARCTIC_EMBED_S // Add the new entry
}

fun getModelConfig(model: Model): ModelConfig {
    return when (model) {
        Model.ALL_MINILM_L6_V2 -> ModelConfig(
            modelName = "all-minilm-l6-v2",
            modelAssetsFilepath = "all-minilm-l6-v2/model.onnx",
            tokenizerAssetsFilepath = "all-minilm-l6-v2/tokenizer.json",
            useTokenTypeIds = false,
            outputTensorName = "sentence_embedding"
        )
        Model.BGE_SMALL_EN_V1_5 -> ModelConfig(
            modelName = "bge-small-en-v1.5",
            modelAssetsFilepath = "bge-small-en-v1_5/model.onnx",
            tokenizerAssetsFilepath = "bge-small-en-v1_5/tokenizer.json",
            useTokenTypeIds = true,
            outputTensorName = "last_hidden_state"
        )
        // Add a new branch for the model
        Model.SNOWFLAKE_ARCTIC_EMBED_S -> ModelConfig(
            modelName = "snowflake-arctic-embed-s",
            modelAssetsFilepath = "snowflake-arctic-embed-s/model.onnx",
            tokenizerAssetsFilepath = "snowflake-arctic-embed-s/tokenizer.json",
            useTokenTypeIds = true,
            outputTensorName = "last_hidden_state"
        )
    }
}

To determine the values for useTokenTypeIds and outputTensorName, open the model with Netron or load the model in Python with onnxruntime. We need to check the names of the input and output tensors.

With Netron, check if token_type_ids is the name of an input tensor. Accordingly, set the value of useTokenTypeIds while creating an instance of ModelConfig. For outputTensorName, choose the name of the output tensor which provides the embedding. For the snowflake-arctic-embed-s model, the name of that output tensor is last_hidden_state.

The same information can be printed to the console with following Python snippet using the onnxruntime package,

import onnxruntime as ort

session = ort.InferenceSession("model.onnx" )

print("Inputs: ")
print( [ t.shape for t in session.get_inputs() ] )
print( [ t.type for t in session.get_inputs() ] )
print( [ t.name for t in session.get_inputs() ] )

print("Outputs: ")
print( [ t.shape for t in session.get_outputs() ] )
print( [ t.type for t in session.get_outputs() ] )
print( [ t.name for t in session.get_outputs() ] )

Run the app on the test-device

Name		Name	Last commit message	Last commit date
Latest commit History 29 Commits
.github		.github
app-model2vec		app-model2vec
app		app
gradle		gradle
model2vec		model2vec
resources		resources
rs-hf-tokenizer		rs-hf-tokenizer
rs-model2vec		rs-model2vec
sentence_embeddings		sentence_embeddings
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
build.gradle.kts		build.gradle.kts
download_models.bat		download_models.bat
download_models.sh		download_models.sh
gradle.properties		gradle.properties
gradlew		gradlew
gradlew.bat		gradlew.bat
settings.gradle.kts		settings.gradle.kts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Repository files navigation

Sentence Embeddings in Android

Updates

2025-06

2025-03

2024-08

Supported Models

Installation

Maven Artifacts

Using the AAR from the Releases directly

Building the Project

Usage

API

Compute Cosine Similarity

Adding New Models

About

Uh oh!

Releases 4

Sponsor this project

Uh oh!

Uh oh!

Contributors 2

Languages

Uh oh!

License

shubham0204/Sentence-Embeddings-Android

Folders and files

Latest commit

History

Repository files navigation

Sentence Embeddings in Android

Updates

2025-06

2025-03

2024-08

Supported Models

Installation

Maven Artifacts

Using the AAR from the Releases directly

Building the Project

Usage

API

Compute Cosine Similarity

Adding New Models

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 4

Sponsor this project

Uh oh!

Uh oh!

Contributors 2

Languages