wild-edge/wildedge-android: Android SDK for WildEdge

On-device ML inference monitoring for Android. Tracks latency, confidence, drift, and hardware metrics without ever sending raw inputs.

Pre-release: API is unstable until v1.0.

Quick start

Add your DSN to AndroidManifest.xml:

<application ...>
    <meta-data
        android:name="dev.wildedge.dsn"
        android:value="@string/wildedge_dsn" />
</application>

Then wrap your TFLite interpreter:

val wildEdge = WildEdge.getInstance()
val interpreter = wildEdge.decorate(
    Interpreter(modelFile, Interpreter.Options()), modelFile
)

interpreter.run(inputBuffer, outputBuffer)
interpreter.close()

For other frameworks, see Integrations below.

No DSN? The client runs in noop mode: all calls work, events are discarded. Safe to ship in all build variants.

Install

Not yet published; watch Releases for the first published version. Once it is available, add the dependency:

dependencies {
    implementation("dev.wildedge:wildedge-android:0.2.0")
}

With a version catalog (gradle/libs.versions.toml):

[versions]
wildedge = "0.2.0"

[libraries]
wildedge = { group = "dev.wildedge", name = "wildedge-android", version.ref = "wildedge" }

dependencies {
    implementation(libs.wildedge)
}

Setup

Option A: manifest

WildEdge initializes itself before Application.onCreate() runs. Add to AndroidManifest.xml:

<application ...>
    <meta-data
        android:name="dev.wildedge.dsn"
        android:value="@string/wildedge_dsn" />
</application>

<!-- keep out of source control -->
<resources>
    <string name="wildedge_dsn">https://<pubkey>@ingest.wildedge.dev/<project-id></string>
</resources>

val wildEdge = WildEdge.getInstance()

Option B: manual init

val wildEdge: WildEdgeClient = WildEdge.init(applicationContext) {
    dsn = "https://<pubkey>@ingest.wildedge.dev/<project-id>" // or WILDEDGE_DSN env var
}

init() sets the shared instance, so WildEdge.getInstance() works after this call.
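
Both setup options take a DSN of the form https://<pubkey>@ingest.wildedge.dev/<project-id>. As a rough illustration of how such a string breaks down, the sketch below splits it with java.net.URI. The names ParsedDsn and parseDsn are hypothetical and not part of the SDK; how WildEdge actually validates a DSN is not documented here.

```kotlin
import java.net.URI

// Hypothetical types for illustration only; not part of the WildEdge API.
data class ParsedDsn(val publicKey: String, val host: String, val projectId: String)

// Returns null when the string is not a well-formed DSN of the documented shape.
fun parseDsn(dsn: String): ParsedDsn? {
    // URI() throws on illegal characters (spaces, unreplaced <placeholders>).
    val uri = runCatching { URI(dsn) }.getOrNull() ?: return null
    val key = uri.userInfo?.takeIf { it.isNotBlank() } ?: return null
    val host = uri.host ?: return null
    val projectId = uri.path?.trim('/')?.takeIf { it.isNotEmpty() } ?: return null
    return ParsedDsn(key, host, projectId)
}

fun main() {
    println(parseDsn("https://abc123@ingest.wildedge.dev/42"))
}
```

A check like this could run in a debug build to catch a DSN that still contains the template placeholders before it reaches the client.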

Integrations

TFLite

modelId and quantization are inferred from the filename:

val modelFile = File(modelPath) // e.g. "yolo_v8_int8.tflite"
val interpreter = wildEdge.decorate(
    Interpreter(modelFile, Interpreter.Options()), modelFile, modelVersion = "8.0"
)
// modelId = "yolo_v8_int8", quantization = "int8"

interpreter.run(inputBuffer, outputBuffer)
interpreter.close()

Explicit override:

val interpreter = wildEdge.decorate(
    Interpreter(modelFile, Interpreter.Options()),
    modelId = "yolo-v8", modelVersion = "8.0", quantization = "int8"
)
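
When no explicit modelId or quantization is passed, both are inferred from the filename. The SDK's actual rules are not spelled out here, so the following is only a sketch of one plausible approach: use the filename stem as the ID and look for a known quantization token. The function name, the token list, and the lack of normalization are all assumptions.

```kotlin
// Hypothetical helper for illustration; NOT the SDK's implementation.
data class InferredModelInfo(val modelId: String, val quantization: String?)

fun inferFromFilename(fileName: String): InferredModelInfo {
    // "yolo_v8_int8.tflite" -> "yolo_v8_int8"
    val stem = fileName.substringBeforeLast('.')
    // Assumed token list; the real SDK may recognize others or normalize them.
    val known = setOf("int8", "int4", "fp16", "f16", "fp32", "f32")
    val token = stem.split('_', '-').lastOrNull { it.lowercase() in known }
    return InferredModelInfo(modelId = stem, quantization = token?.lowercase())
}

fun main() {
    println(inferFromFilename("yolo_v8_int8.tflite"))
}
```

If a filename carries no recognizable token, passing modelId and quantization explicitly, as in the override above, avoids relying on any inference.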

ONNX Runtime

val modelFile = File(modelPath) // e.g. "face_detector_fp16.onnx"
val session = wildEdge.decorate(
    env.createSession(modelFile.absolutePath, OrtSession.SessionOptions()),
    modelFile
)
// modelId = "face_detector_fp16", quantization = "f16"

val result = session.run(inputs)
session.close()

MLKit

val handle = wildEdge.registerMlKitModel("face-detector", modelVersion = "16.1")

faceDetector.process(image).trackWith(handle) { faces ->
    DetectionOutputMeta(numPredictions = faces.size)
}

LiteRT LLM (litertlm)

val engineConfig = EngineConfig(modelPath = modelPath)
val engine = wildEdge.decorate(Engine(engineConfig), engineConfig)
val conversation = engine.createConversation()

val listener = resultListener.trackWith(engine.handle, WildEdge.analyzeText(userInput))
conversation.sendMessageAsync(contents, listener)
engine.close()

Captures total duration, time to first token, tokens/sec, and estimated tokens in/out. Works identically for AICore.
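
To make the captured metrics concrete, here is a minimal sketch of how time to first token and tokens/sec can be derived from per-token timestamps. StreamMetrics and StreamSummary are hypothetical names, and measuring throughput over the whole call (rather than only the decode phase) is an assumption; the SDK may define these metrics differently.

```kotlin
// Hypothetical metrics accumulator; these names are not part of the WildEdge API.
data class StreamSummary(
    val totalMs: Double,
    val timeToFirstTokenMs: Double?,
    val tokensPerSec: Double,
)

class StreamMetrics(private val startNanos: Long) {
    private var firstTokenNanos: Long? = null
    private var tokenCount = 0

    // Call once per streamed token with a monotonic timestamp (e.g. System.nanoTime()).
    fun onToken(nowNanos: Long) {
        if (firstTokenNanos == null) firstTokenNanos = nowNanos
        tokenCount++
    }

    fun finish(endNanos: Long): StreamSummary {
        val totalMs = (endNanos - startNanos) / 1e6
        val ttftMs = firstTokenNanos?.let { (it - startNanos) / 1e6 }
        // Assumption: throughput is measured over the whole call, not just decoding.
        val tokensPerSec = if (totalMs > 0) tokenCount / (totalMs / 1000.0) else 0.0
        return StreamSummary(totalMs, ttftMs, tokensPerSec)
    }
}

fun main() {
    val metrics = StreamMetrics(startNanos = 0)
    metrics.onToken(100_000_000)
    metrics.onToken(500_000_000)
    println(metrics.finish(endNanos = 1_000_000_000))
}
```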

Play Services TFLite

val interpreter = wildEdge.decorate(
    InterpreterApi.create(modelFile, InterpreterApi.Options()),
    modelFile
)

interpreter.run(inputBuffer, outputBuffer)
interpreter.close()

Google AI (Gemini)

val gemini = wildEdge.decorate(
    GenerativeModel(modelName = "gemini-2.0-flash", apiKey = "<key>"),
    modelId = "gemini-2.0-flash",
    modelFamily = "gemini",
)

// Streaming — tracking fires when the flow completes
gemini.generateContentStream(prompt, inputMeta = WildEdge.analyzeText(prompt))
    .collect { response -> append(response.text.orEmpty()) }

// Unary
val response = gemini.generateContent(prompt, inputMeta = WildEdge.analyzeText(prompt))

// Untracked operations go through .model directly
val chat = gemini.model.startChat(history)
val tokens = gemini.model.countTokens(prompt)

Remote models

val handle = wildEdge.registerModel("gpt-4o-mini", ModelInfo(
    modelName = "GPT-4o mini",
    modelVersion = "2024-07-18",
    modelSource = "api",
    modelFormat = "remote",
    inputModality = InputModality.Text,
    outputModality = OutputModality.Generation,
))

val response = handle.trackSuspendInference {
    callRemoteApi(prompt)
}

To include token counts from the response:

val response = handle.trackSuspendInference(
    outputMetaExtractor = { r ->
        GenerationOutputMeta(
            tokensIn = r.usage.promptTokens,
            tokensOut = r.usage.completionTokens,
        ).toMap()
    },
) {
    callRemoteApi(prompt)
}

Manual tracking

val handle = wildEdge.registerModel("my-model", ModelInfo(
    modelName = "MobileNet",
    modelVersion = "v3",
    modelSource = "local",
    modelFormat = "custom",
    inputModality = InputModality.Image,
    outputModality = OutputModality.Detection,
))

handle.trackLoad(durationMs = loadMs, accelerator = Accelerator.CPU, coldStart = true)
val output = handle.trackInference { model.run(input) }
handle.trackUnload()
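
The block form of trackInference times the wrapped call and returns its result. The sketch below shows the general shape of such a wrapper so the control flow is easier to reason about. SketchHandle and InferenceEvent are hypothetical stand-ins, and recording a failed call with success = false is an assumption rather than documented SDK behavior.

```kotlin
// Hypothetical stand-ins for illustration; not the SDK's classes.
data class InferenceEvent(
    val modelId: String,
    val durationMs: Long,
    val success: Boolean,
    val outputMeta: Map<String, Any?>,
)

class SketchHandle(private val modelId: String) {
    val events = mutableListOf<InferenceEvent>()

    // Runs the block, records how long it took, and passes the result through.
    fun <T> trackInference(
        outputMetaExtractor: (T) -> Map<String, Any?> = { emptyMap() },
        block: () -> T,
    ): T {
        val start = System.nanoTime()
        try {
            val result = block()
            events += InferenceEvent(modelId, elapsedMs(start), true, outputMetaExtractor(result))
            return result
        } catch (e: Throwable) {
            // Assumption: failures are recorded too, then rethrown unchanged.
            events += InferenceEvent(modelId, elapsedMs(start), false, emptyMap())
            throw e
        }
    }

    private fun elapsedMs(startNanos: Long) = (System.nanoTime() - startNanos) / 1_000_000
}

fun main() {
    val handle = SketchHandle("my-model")
    val output = handle.trackInference { listOf(0.9f, 0.4f) }
    println("output=$output events=${handle.events.size}")
}
```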

Feedback

handle.trackFeedback(FeedbackType.ThumbsUp)
handle.trackFeedback(FeedbackType.Custom("hallucination"))

trackFeedback links to the most recent inference on the handle. Pass relatedInferenceId to link to an earlier one:

val inferenceId = handle.trackInference(durationMs = ms)
handle.trackFeedback(FeedbackType.Edited, relatedInferenceId = inferenceId, editDistance = 5)
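
The editDistance value has to come from the app. How the server interprets it is not specified here; assuming a character-level Levenshtein distance between the model output and the user's edited text is acceptable, it could be computed like this (the function name is illustrative):

```kotlin
// Character-level Levenshtein distance using two rolling rows.
// Assumption: this is a suitable metric for editDistance; the SDK does not mandate one here.
fun levenshtein(a: String, b: String): Int {
    var prev = IntArray(b.length + 1) { it }
    for (i in 1..a.length) {
        val curr = IntArray(b.length + 1)
        curr[0] = i
        for (j in 1..b.length) {
            val cost = if (a[i - 1] == b[j - 1]) 0 else 1
            curr[j] = minOf(
                prev[j] + 1,        // deletion
                curr[j - 1] + 1,    // insertion
                prev[j - 1] + cost, // substitution or match
            )
        }
        prev = curr
    }
    return prev[b.length]
}

fun main() {
    println(levenshtein("kitten", "sitting"))
}
```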

Value	Meaning
`FeedbackType.ThumbsUp`	User approved the result
`FeedbackType.ThumbsDown`	User rejected the result
`FeedbackType.Accepted`	User acted on the result without editing
`FeedbackType.Edited`	User accepted but modified the result
`FeedbackType.Rejected`	User dismissed or ignored the result
`FeedbackType.Custom(value)`	Domain-specific signal (e.g. `"hallucination"`, `"safety_flag"`)

Tracing

Group related inferences so the server can reconstruct the full pipeline:

wildEdge.trace("user-query") { trace ->
    val embedding = trace.span("embed") { embedHandle.trackInference { embedModel.run(input) } }
    trace.span("classify") { classifyHandle.trackInference { classifyModel.run(embedding) } }
}

trace {} creates a root span and emits a span event when the block returns.
span {} creates a child span linked via parent_span_id.
trackInference() inside a trace or span block picks up trace_id and parent_span_id automatically.
Explicit traceId/parentSpanId arguments on trackInference() take precedence.
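
The rules above amount to an ambient context: a trace sets a root span, a span nests under whatever is active, and an inference reads the active IDs unless explicit ones are supplied. The sketch below models that with a ThreadLocal. All names are hypothetical, and whether the SDK uses a ThreadLocal (which would not follow work onto other threads) is an assumption.

```kotlin
import java.util.UUID

// Hypothetical types for illustration; not the SDK's classes.
data class SpanContext(val traceId: String, val spanId: String, val parentSpanId: String?)

object TraceContext {
    private val current = ThreadLocal<SpanContext?>()

    fun active(): SpanContext? = current.get()

    // Makes ctx the active span for the duration of block, then restores the previous one.
    fun <T> withSpan(ctx: SpanContext, block: () -> T): T {
        val previous = current.get()
        current.set(ctx)
        try {
            return block()
        } finally {
            current.set(previous)
        }
    }
}

fun newId(): String = UUID.randomUUID().toString()

fun <T> trace(name: String, block: () -> T): T =
    TraceContext.withSpan(SpanContext(newId(), newId(), parentSpanId = null), block)

fun <T> span(name: String, block: () -> T): T {
    val parent = checkNotNull(TraceContext.active()) { "span() must run inside trace()" }
    return TraceContext.withSpan(SpanContext(parent.traceId, newId(), parent.spanId), block)
}

// Stand-in for trackInference: explicit arguments win over the ambient context.
fun inferenceIds(traceId: String? = null, parentSpanId: String? = null): Pair<String?, String?> {
    val ctx = TraceContext.active()
    return (traceId ?: ctx?.traceId) to (parentSpanId ?: ctx?.spanId)
}

fun main() {
    trace("user-query") {
        span("embed") { println(inferenceIds()) }
    }
}
```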

Output metadata

handle.trackInference(
    durationMs = ms,
    outputMeta = DetectionOutputMeta(
        numPredictions = result.size,
        avgConfidence = result.map { it.score }.average().toFloat(),
    ).toMap(),
)

Available types: DetectionOutputMeta, GenerationOutputMeta, EmbeddingOutputMeta.
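
One detail worth guarding in the snippet above: Kotlin's average() returns NaN for an empty collection, so a frame with no detections would report a NaN confidence. A small helper can make that case explicit. Detection and avgConfidenceOrNull are illustrative names, and whether DetectionOutputMeta accepts a null avgConfidence is an assumption to verify against the SDK.

```kotlin
// Illustrative result type; substitute your model's own detection class.
data class Detection(val label: String, val score: Float)

// Returns null instead of NaN when there is nothing to average.
fun avgConfidenceOrNull(detections: List<Detection>): Float? =
    if (detections.isEmpty()) null
    else detections.map { it.score }.average().toFloat()

fun main() {
    println(avgConfidenceOrNull(emptyList()))
    println(avgConfidenceOrNull(listOf(Detection("cat", 0.5f), Detection("dog", 1.0f))))
}
```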

Configuration

Parameter	Default	Description
`dsn`	-	`https://<pubkey>@ingest.wildedge.dev/<project-id>` (or `WILDEDGE_DSN`)
`appVersion`	auto-detected	App version string attached to every batch
`batchSize`	`10`	Events per HTTP request
`maxQueueSize`	`200`	Max in-memory events; oldest dropped on overflow
`flushIntervalMs`	`60_000`	How often the consumer wakes to send
`maxEventAgeMs`	`900_000`	Events older than this go to the dead-letter store
`samplingIntervalMs`	`30_000`	Hardware polling interval; `null` to disable
`lowConfidenceThreshold`	`0.5`	Threshold for the sampling envelope
`debug`	`false`	Verbose logcat output (or `WILDEDGE_DEBUG=true`)
`strict`	`false`	Throw on queue overflow instead of dropping
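
The interaction between maxQueueSize, batchSize, and strict can be pictured with a small bounded queue. This is a sketch of the documented behavior (drop the oldest event on overflow, or throw when strict is set), not the SDK's queue; the class name and exception type are assumptions.

```kotlin
// Hypothetical queue for illustration; not the SDK's implementation.
class BoundedEventQueue<T>(
    private val maxQueueSize: Int,
    private val strict: Boolean = false,
) {
    private val events = ArrayDeque<T>()

    val size: Int get() = events.size

    fun add(event: T) {
        if (events.size >= maxQueueSize) {
            // strict = true: surface the overflow instead of silently dropping.
            if (strict) throw IllegalStateException("event queue overflow")
            events.removeFirst() // default: drop the oldest event
        }
        events.addLast(event)
    }

    // Removes and returns up to batchSize events, oldest first.
    fun nextBatch(batchSize: Int): List<T> {
        val batch = ArrayList<T>()
        while (batch.size < batchSize && events.isNotEmpty()) {
            batch.add(events.removeFirst())
        }
        return batch
    }
}

fun main() {
    val queue = BoundedEventQueue<Int>(maxQueueSize = 2)
    listOf(1, 2, 3).forEach(queue::add)
    println(queue.nextBatch(batchSize = 10))
}
```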

Testing

Declare your field against WildEdgeClient and inject WildEdgeClient.noop() in tests:

class InferenceService(private val wildEdge: WildEdgeClient) {
    private val handle = wildEdge.registerModel("my-model", ...)

    fun run(input: ByteArray): Result = handle.trackInference { model.run(input) }
}

// In tests:
val service = InferenceService(WildEdgeClient.noop())

The noop client runs trace and span blocks normally but discards all events. No background threads, no DSN required.

Lifecycle

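Call close() before your process exits to drain remaining events. Note that Android only invokes Application.onTerminate() in emulated process environments, never on production devices, so treat this hook as best effort: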

override fun onTerminate() {
    wildEdge.close()
    super.onTerminate()
}

close() is main-thread safe. On the main thread it flushes asynchronously; off the main thread it blocks until the flush finishes or the flush timeout (default 5 seconds) elapses.
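
The off-main-thread behavior is a bounded blocking wait. A minimal JVM-only sketch of that pattern is shown below; flushWithTimeout is a hypothetical helper, and it does not model the main-thread branch, which would need Android's Looper to detect.

```kotlin
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import java.util.concurrent.TimeoutException

// Hypothetical helper: runs flush on a worker thread and waits at most timeoutMs.
// Returns true if the flush finished in time, false if the wait timed out.
fun flushWithTimeout(timeoutMs: Long, flush: () -> Unit): Boolean {
    val executor = Executors.newSingleThreadExecutor()
    return try {
        executor.submit(Runnable { flush() }).get(timeoutMs, TimeUnit.MILLISECONDS)
        true
    } catch (e: TimeoutException) {
        false
    } finally {
        executor.shutdownNow() // interrupts a flush that is still running
    }
}

fun main() {
    val drained = flushWithTimeout(timeoutMs = 5_000) { /* send remaining events */ }
    println("drained=$drained")
}
```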

Diagnostics

Inspect the SDK's internal state at any point:

val d = wildEdge.diagnostics

Log.d("wildedge", "pending events: ${wildEdge.pendingCount}")
Log.d("wildedge", "queue heap:     ${d.eventQueueSizeBytes} bytes")
Log.d("wildedge", "queue JSON:     ${d.eventQueueJsonBytes} bytes")

Field	Description
`pendingCount`	Number of events queued and not yet delivered
`eventQueueSizeBytes`	Estimated JVM heap bytes consumed by queued events (ART object-graph walk)
`eventQueueJsonBytes`	Total size of queued events serialized as UTF-8 JSON — matches the wire payload size

Both size fields reflect the current snapshot; call diagnostics again to get updated values. The noop client always returns 0 for both.
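
The JSON figure is a UTF-8 byte count, which can exceed the character count when events contain non-ASCII text. The sketch below shows the distinction; jsonWireBytes is an illustrative name, and summing per-event sizes without any batch envelope is an assumption about how the total is formed.

```kotlin
// Sums the UTF-8 encoded size of already-serialized events.
// Assumption: no array brackets or separators are counted.
fun jsonWireBytes(serializedEvents: List<String>): Int =
    serializedEvents.sumOf { it.toByteArray(Charsets.UTF_8).size }

fun main() {
    val event = "{\"label\":\"caf\u00e9\"}"
    println("chars=${event.length} bytes=${jsonWireBytes(listOf(event))}")
}
```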

AI-assisted integration

Paste the prompt below into your coding agent (Claude Code, Cursor, Copilot, etc.) to wire up WildEdge across your codebase automatically.

Integrate the WildEdge Android SDK (dev.wildedge:wildedge-android) into this project.

1. Search the codebase for all ML inference code: TFLite Interpreter, ONNX OrtSession,
   Play Services InterpreterApi, LiteRT Engine/Conversation, MLKit Task calls, and any
   direct HTTP calls to remote LLM APIs (OpenAI, Gemini, etc.).

2. For each one found, wrap it with the right WildEdge integration:
   - TFLite: val interpreter = wildEdge.decorate(Interpreter(modelFile, ...), modelFile, modelVersion = "...")
   - ONNX: val session = wildEdge.decorate(env.createSession(...), modelFile, modelVersion = "...")
   - LiteRT: val engine = wildEdge.decorate(Engine(config), config, modelVersion = "...")
   - MLKit: wildEdge.registerMlKitModel(...) and Task.trackWith(handle)
   - Google AI (Gemini): val gemini = wildEdge.decorate(GenerativeModel(modelName = "...", apiKey = "..."), modelId = "gemini-2.0-flash")
     Use gemini.generateContentStream(prompt, inputMeta = WildEdge.analyzeText(prompt)) for streaming,
     gemini.generateContent(prompt, inputMeta = WildEdge.analyzeText(prompt)) for unary.
     Untracked operations (startChat, countTokens) go through gemini.model directly.
   - Remote LLM: wildEdge.registerModel("id", ModelInfo(
         inputModality = InputModality.Text, outputModality = OutputModality.Generation, ...))
     Use handle.trackSuspendInference { } for suspend calls. Pass outputMetaExtractor to
     capture token counts from the response:
       outputMetaExtractor = { r ->
           GenerationOutputMeta(tokensIn = r.usage.promptTokens, tokensOut = r.usage.completionTokens).toMap()
       }
   For decorator integrations, assign the result to the same variable name as the original
   so call sites don't change. Only pass modelVersion if you find a real version string
   in the model filename, asset path, or an existing version constant.
   For streaming LLM output (Flow<String>) use flow.trackWith(handle).

3. Set up WildEdge (pick one):
   Option A (zero code): add to AndroidManifest.xml inside <application>:
      <meta-data android:name="dev.wildedge.dsn" android:value="YOUR_DSN" />
   Then call WildEdge.getInstance() wherever inference code lives.
   Option B (manual): call WildEdge.init() in Application.onCreate():
      val wildEdge: WildEdgeClient = WildEdge.init(this) {
          dsn = "YOUR_DSN"   // get yours at wildedge.dev
      }
   Either way, inject WildEdgeClient.noop() in tests instead of the real client.

4. If multiple models run in sequence for a single user request (embed then classify,
   prefill then decode), wrap the pipeline in wildEdge.trace("name") { } so events
   are correlated on the dashboard.

5. Call wildEdge.close() in Application.onTerminate() or the appropriate lifecycle hook.

6. Pass only metadata as inputMeta (WildEdge.analyzeText / analyzeImage).
   WildEdge never transmits raw inputs.

Samples

Sample	What it shows
image-classification	TFLite image classifier with inference tracking and feedback
local-llm	On-device LLM chat using LiteRT with token metrics
local-llm-agent	LLM agent with tool calling, session spans, and per-turn tracing
cloud-llm	Streaming travel itinerary generator using Google AI (Gemini) with TTFT and token tracking

To run a sample:

1. Connect a device or start an emulator.

2. Copy and fill in your config:

cp local.properties.example local.properties
# set sdk.dir and optionally add your DSN

3. Install:

./gradlew :samples:image-classification:installDebug
./gradlew :samples:local-llm:installDebug
./gradlew :samples:local-llm-agent:installDebug
./gradlew :samples:cloud-llm:installDebug

cloud-llm also requires a Google AI API key (free at https://aistudio.google.com):

google.ai.api.key=AIza...

Without a DSN the samples run in noop mode.

Development

Requirements: JDK 17+, Android SDK with compileSdk 35, Gradle 9.4+ (the wrapper downloads it automatically).

# Unit tests (JVM, no emulator needed)
./gradlew :wildedge:testDebugUnitTest

# Single test class
./gradlew :wildedge:testDebugUnitTest --tests "dev.wildedge.sdk.EventQueueTest"

# Lint
./gradlew :wildedge:lint

# Detekt
./gradlew detekt

# Build AAR
./gradlew :wildedge:assembleRelease
# Output: wildedge/build/outputs/aar/wildedge-release.aar

# Publish to local Maven (for local app integration testing)
./gradlew :wildedge:publishToMavenLocal

Local Maven usage:

repositories { mavenLocal() }
dependencies { implementation("dev.wildedge:wildedge-android:0.2.0") }

Runtime requirements: minSdk 24 (Android 7.0), no required transitive dependencies. TFLite / ONNX Runtime are compileOnly; bring your own version.
