I would suggest not including big files as assets because of the following issues (off the top of my head):
Build and deploy time increases because the asset files are copied from the dev machine to the target device (e.g. your phone).
You will run into memory issues during debug sessions, which requires increasing the Java memory limit, for example from the default 2048 MB to 4096 MB in Android projects. That means editing gradle.properties, which contains the line org.gradle.jvmargs=-Xmx2048M -Dkotlin.daemon.jvm.options\="-Xmx2048M", and raising the -Xmx value there. Note that this only goes so far: don't expect to be able to add a yi-34b GGUF, for example. I think the Android build process will just crash no matter how much you increase that JVM argument.
Your app size will increase, e.g. a model of 2 GB means an increase of 2 GB to your app.
App stores have a maximum file size limit. I commented on this before in another issue but forgot the details. I guess it's somewhere between 500 MB and 1.5 GB, depending on whether it's the Play Store or the App Store. So you're already limited by these restrictions, even if there weren't any technical difficulties on your dev machine.
Instead, I suggest you look at two other approaches:
BYOM (just made this up): Bring Your Own Model. That is, like my example app, where users use the file picker to select their own GGUF file, which is then copied to the app's cache folder. You can verify this by looking at the app's size on Android before and after selecting a GGUF. Spoiler alert: my example app doesn't delete the previously selected GGUF, so if you used my app with 5 different GGUFs of 2 GB each, the app size will be [default app size] + 10 GB (5 × 2 GB).
Like how games work, simply let your app download the asset (your GGUF) while the app is running.
Pros: The app size stays small, thus users will be more likely to download/try your app.
Cons: Users can't use the app right away the very first time, because your app needs to download the GGUF file first. Background downloads are probably limited, especially on iOS, so you might need to ask your users to keep the app open while [amount] GB is being downloaded. Not a good experience, but there's no other solution I can think of. Games can get away with downloading deltas / small chunks of the game, but for LLM inference you need the whole file before you can interact with it.
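Both approaches end with a GGUF file sitting in app storage, so the file handling can be sketched independently of the UI framework. The snippet below is a rough, language-neutral illustration in Python, not code from the example app: `import_model`, `download_model`, the `.part` suffix and the `opener` parameter are all names made up for this sketch, and the resume logic assumes the server supports HTTP Range requests.

```python
import shutil
import urllib.request
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def import_model(picked_file: Path, cache_dir: Path) -> Path:
    """BYOM: copy the user-selected GGUF into the cache and drop older ones."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / picked_file.name
    for old in cache_dir.glob("*.gguf"):
        if old != target:
            old.unlink()  # without this, every selected model stays on disk
    shutil.copyfile(picked_file, target)
    return target


def download_model(url: str, cache_dir: Path, filename: str,
                   opener=urllib.request.urlopen) -> Path:
    """Download a GGUF at runtime, resuming from a partial file if one exists."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / filename
    partial = cache_dir / (filename + ".part")  # hypothetical naming scheme
    offset = partial.stat().st_size if partial.exists() else 0

    request = urllib.request.Request(url)
    if offset:
        # Ask the server for the remaining bytes only.
        request.add_header("Range", f"bytes={offset}-")

    with opener(request) as response:
        # 206 means the server honoured the Range header; otherwise start over.
        resumed = offset > 0 and getattr(response, "status", 200) == 206
        with open(partial, "ab" if resumed else "wb") as out:
            while chunk := response.read(CHUNK_SIZE):
                out.write(chunk)

    # Reached only if the transfer finished without raising an exception.
    partial.replace(target)
    return target
```

Writing to a `.part` file first means an interrupted download never leaves a half-written GGUF under its final name, and a later call can pick up where the previous one stopped instead of fetching all [amount] GB again. A real app would still want to verify the final size or a checksum before loading the model.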
I hope this answer helps you in some way. Otherwise, shoot me another question.
Hi,
Say I put my GGUF in assets and added it to the YAML.
Can, or will, it be possible to use it with a call like talkAsync, without dumping the file into the app's files dir?