[Android-sample] Kotlin android sample app #3
Merged
- Introduced a comprehensive README for the Local LLM implementation on Android, detailing features, architecture, and setup instructions.
- Added model management capabilities, including downloading, viewing, and deleting models.
- Implemented a unified LLM manager to handle multiple frameworks (MediaPipe and ONNX Runtime).
- Created a chat interface with real-time streaming responses and conversation history.
- Enhanced the app's architecture with a clear separation of concerns, following SOLID principles.
- Included performance optimizations and troubleshooting guidelines for a better user experience.
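A minimal sketch of what such a unified manager might look like, assuming a simple `LLMService` interface; the enum values, method names, and signatures here are illustrative, not the PR's actual code:

```kotlin
// Illustrative sketch only: the real UnifiedLLMManager in this PR registers
// MediaPipe and ONNX Runtime services, but its exact API may differ.
enum class LLMFramework { MEDIAPIPE, ONNX_RUNTIME }

interface LLMService {
    suspend fun initialize(modelPath: String): Boolean
    suspend fun generate(prompt: String): String
}

class UnifiedLLMManager {
    private val services = mutableMapOf<LLMFramework, LLMService>()

    // Each framework-specific service registers itself once at startup.
    fun register(framework: LLMFramework, service: LLMService) {
        services[framework] = service
    }

    // Callers pick a framework; the manager delegates to the matching service.
    suspend fun generate(framework: LLMFramework, prompt: String): String {
        val service = services[framework]
            ?: error("No service registered for $framework")
        return service.generate(prompt)
    }
}
```

Keeping framework-specific code behind one interface is what lets later commits in this PR add TFLite, llama.cpp, ExecuTorch, and MLC-LLM services without touching callers.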
- Added Detekt plugin to the Android project for enhanced static code analysis.
- Configured Detekt with a custom configuration file and baseline for issue tracking.
- Updated dependencies to use the latest version of ONNX Runtime.
- Removed unnecessary formatting section from the Detekt configuration to streamline the setup.
- Ensured all Detekt reports (HTML, XML, TXT, SARIF) are generated for better visibility of code quality issues.
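A `build.gradle.kts` fragment showing roughly how this Detekt setup could look; the plugin version and file paths are assumptions, not values taken from the PR:

```kotlin
// Sketch of a Detekt configuration with a custom config file, a baseline,
// and all four report formats enabled. Version and paths are illustrative.
plugins {
    id("io.gitlab.arturbosch.detekt") version "1.23.6"
}

detekt {
    config.setFrom(files("$projectDir/config/detekt/detekt.yml"))
    baseline = file("$projectDir/config/detekt/baseline.xml")
}

tasks.withType<io.gitlab.arturbosch.detekt.Detekt>().configureEach {
    reports {
        html.required.set(true)
        xml.required.set(true)
        txt.required.set(true)
        sarif.required.set(true)
    }
}
```

The baseline file suppresses pre-existing findings so CI only fails on newly introduced issues.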
- Changed the chat icon from Email to Chat in the navigation bar for better representation.
- Added the necessary import for the dp unit to ensure proper layout handling.

This update enhances the user interface by providing a more intuitive icon for the chat feature.
… and Performance

- Simplified the initialization process in MediaPipeService by removing unnecessary options and ensuring compatibility with the latest API changes.
- Updated the generateStream method to use a non-streaming API for response generation, reflecting changes in the MediaPipe streaming API.
- Cleaned up unused code in ONNXRuntimeService by removing the redundant CPU fallback and tensor cleanup operations.
- Enhanced the ChatViewModel to specify the full package path for ModelInfo, improving clarity and avoiding potential conflicts.
- Updated icon representations in the UI for better consistency and user experience.

These changes enhance code readability and maintainability, and align with SOLID principles while ensuring the application remains scalable.
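One plausible shape for a `generateStream` that wraps a non-streaming call while keeping a streaming signature for callers; the chunking strategy and names are assumptions for illustration only:

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow

// Hypothetical sketch: when the underlying streaming API is unavailable,
// make one blocking generation call and re-emit the result in chunks so
// the UI's Flow-based collector keeps working unchanged.
class StreamAdapter(private val generateBlocking: suspend (String) -> String) {

    fun generateStream(prompt: String, chunkSize: Int = 16): Flow<String> = flow {
        val full = generateBlocking(prompt)   // single non-streaming call
        full.chunked(chunkSize).forEach { emit(it) }
    }
}
```

This keeps the ViewModel and UI agnostic to whether the backend truly streams or not.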
- Replaced the folder icon with a list icon in the navigation bar of MainActivity to better represent models.
- Updated the leading icon in ModelCard from Folder to AccountBox to enhance visual clarity.
- Changed the download icon in ModelCard from GetApp to Add, improving the user experience.

These changes contribute to a more intuitive user interface while maintaining adherence to SOLID principles.
- Added support for Gemini Nano in ModelRepository, allowing for on-device inference if the device meets compatibility requirements.
- Introduced GeminiNanoService for managing the Gemini Nano model, including initialization, generation, and compatibility checks.
- Updated the getAvailableModels and getDownloadedModels methods to include Gemini Nano, ensuring seamless integration with existing model management.
- Registered the Gemini Nano service in UnifiedLLMManager for improved framework recommendations based on device capabilities.
- Cleaned up unused code and ensured adherence to SOLID principles for maintainability and scalability.

These changes enhance the application's capabilities while maintaining clarity and performance.
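A compatibility check of the kind described above might reduce to a pure function over device properties. The device list, API level, and RAM threshold below are assumptions for the sketch, not the PR's actual criteria:

```kotlin
// Illustrative gate for on-device Gemini Nano. In an app this would read
// Build.MODEL, Build.VERSION.SDK_INT, and ActivityManager.MemoryInfo;
// taking them as parameters keeps the logic testable.
fun isGeminiNanoCompatible(deviceModel: String, sdkInt: Int, ramMb: Long): Boolean =
    sdkInt >= 34 &&
        ramMb >= 8_000 &&
        setOf("Pixel 8 Pro", "Pixel 9")   // assumed supported devices
            .any { deviceModel.contains(it, ignoreCase = true) }
```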
- Introduced TFLiteService for on-device LLM inference, supporting GPU and NNAPI delegates for improved performance.
- Updated ModelRepository to include new models, MobileNet GPT-2 and DistilBERT QA, enhancing the variety of available models.
- Registered TFLiteService in UnifiedLLMManager for seamless integration with existing frameworks.
- Cleaned up unused code and ensured adherence to SOLID principles for maintainability and scalability.

These changes significantly enhance the application's capabilities while maintaining clarity and performance.
- Added LlamaCppService for GGUF model inference, supporting various quantization formats.
- Introduced LlamaTokenizer to leverage native tokenization capabilities, ensuring compatibility with GGUF models.
- Enhanced ModelRepository to include new models such as TinyLlama and Phi-2, expanding the model offerings.
- Implemented multiple tokenizer classes (BPE, SentencePiece, WordPiece, Simple) for diverse tokenization strategies.
- Registered LlamaCppService in UnifiedLLMManager for seamless integration with existing frameworks.
- Cleaned up unused code and ensured adherence to SOLID principles for maintainability and scalability.

These changes significantly enhance the application's capabilities in handling Llama models while maintaining clarity and performance.
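To make the tokenizer family concrete, here is a minimal sketch of the simplest strategy in that list, a whitespace tokenizer with a fixed vocabulary and an unknown-token fallback. The class name and API mirror the commit's naming but are assumptions; the BPE, SentencePiece, and WordPiece variants are substantially more involved:

```kotlin
// "Simple" tokenizer sketch: lowercase, split on whitespace, map each
// token to a vocabulary id, fall back to an <unk> id for unknown words.
class SimpleTokenizer(vocab: List<String>, private val unkId: Int = 0) {
    private val tokenToId: Map<String, Int> =
        vocab.withIndex().associate { (i, token) -> token to i }

    fun encode(text: String): List<Int> =
        text.lowercase()
            .split(Regex("\\s+"))
            .filter { it.isNotEmpty() }
            .map { tokenToId[it] ?: unkId }
}
```

Sharing a common `encode` shape across tokenizer classes is what lets a service pick a strategy per model format without changing its inference path.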
…gement

- Added file integrity verification using a SHA-256 hash in ModelRepository to ensure downloaded models are valid.
- Introduced methods for calculating the SHA-256 hash and checking available memory, improving model loading reliability.
- Updated ModelInfo to include the SHA-256 hash and required RAM for better model management.
- Enhanced the UI to display download verification progress, providing users with real-time feedback during model downloads.
- Cleaned up unused code and ensured adherence to SOLID principles for maintainability and scalability.

These changes significantly improve the robustness of model handling and the user experience in the application.
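The integrity check described above amounts to streaming the downloaded file through a SHA-256 digest and comparing it to the expected hash stored in ModelInfo. A sketch with `java.security.MessageDigest` (function names are illustrative):

```kotlin
import java.io.File
import java.security.MessageDigest

// Stream the file through SHA-256 in 8 KiB chunks so large model files
// are hashed without loading them fully into memory.
fun sha256Of(file: File): String {
    val digest = MessageDigest.getInstance("SHA-256")
    file.inputStream().use { input ->
        val buffer = ByteArray(8 * 1024)
        while (true) {
            val read = input.read(buffer)
            if (read < 0) break
            digest.update(buffer, 0, read)
        }
    }
    return digest.digest().joinToString("") { "%02x".format(it) }
}

// Compare case-insensitively against the hash recorded in ModelInfo.
fun verifyModel(file: File, expectedSha256: String): Boolean =
    sha256Of(file).equals(expectedSha256, ignoreCase = true)
```

A corrupt or truncated download fails the comparison and can be deleted and re-fetched instead of crashing the model loader later.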
- Introduced ExecuTorchService for running optimized Llama 2 and other models on edge devices, enhancing model inference capabilities.
- Updated ModelRepository to include new ExecuTorch models, expanding the range of available options for users.
- Enhanced UnifiedLLMManager to register ExecuTorchService, ensuring seamless integration with existing frameworks.
- Added a HardwareDetector utility class for detecting device capabilities and optimal backends, improving model loading efficiency.

These changes significantly enhance the application's capabilities for edge device inference while maintaining clarity and performance.
- Introduced an MLC-LLM service for hardware-accelerated language model inference, leveraging Apache TVM for edge devices.
- Added JNI integration with native methods for engine creation, chat completion, and streaming responses.
- Updated ModelRepository to include new MLC-LLM models, expanding the range of available options for users.
- Enhanced UnifiedLLMManager to register the MLC-LLM service, ensuring seamless integration with existing frameworks.
- Implemented device detection and model initialization logic in MLCLLMService, improving model loading efficiency.

These changes significantly enhance the application's capabilities for edge device inference while maintaining clarity and performance.
- Introduced Room database integration with ConversationDatabase, enabling efficient storage and retrieval of conversation and message data.
- Added ConversationEntity and MessageEntity data classes to represent database entities, enhancing data management capabilities.
- Implemented ConversationDao and MessageDao interfaces for database operations, ensuring a clean separation of concerns.
- Updated the LLMService interface to return GenerationResult instead of String, improving the handling of generation outcomes.
- Registered new AI Core and picoLLM services in UnifiedLLMManager, expanding the range of supported frameworks for text generation.

These changes significantly enhance the application's data management and AI capabilities while maintaining clarity and performance.
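A condensed sketch of the Room layer this commit describes; the column names, queries, and exact fields are assumptions, and the PR's actual entities may carry more metadata:

```kotlin
import androidx.room.Dao
import androidx.room.Entity
import androidx.room.Insert
import androidx.room.OnConflictStrategy
import androidx.room.PrimaryKey
import androidx.room.Query

// Illustrative Room entities for conversations and their messages.
@Entity(tableName = "conversations")
data class ConversationEntity(
    @PrimaryKey val id: String,
    val title: String,
    val createdAt: Long
)

@Entity(tableName = "messages")
data class MessageEntity(
    @PrimaryKey val id: String,
    val conversationId: String,
    val role: String,       // "user" or "assistant"
    val content: String,
    val timestamp: Long
)

// DAO keeps SQL in one place, separate from repository and UI code.
@Dao
interface ConversationDao {
    @Query("SELECT * FROM conversations ORDER BY createdAt DESC")
    suspend fun getAll(): List<ConversationEntity>

    @Insert(onConflict = OnConflictStrategy.REPLACE)
    suspend fun upsert(conversation: ConversationEntity)

    @Query("DELETE FROM conversations WHERE id = :id")
    suspend fun delete(id: String)
}
```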
- Introduced Conversation and Message data models to encapsulate conversation and message details, enhancing data structure clarity.
- Implemented ConversationRepository to manage conversation and message data, providing methods for CRUD operations and context window management.
- Added encryption capabilities for conversation data through the EncryptionManager, ensuring secure storage and retrieval.
- Developed the EnhancedChatScreen UI component for improved user interaction, featuring message display, input handling, and model selection.
- Created ChatViewModel to manage UI state and interactions, facilitating communication between the UI and the repository.
- Introduced PreferencesRepository for managing encrypted application settings, ensuring secure storage of user preferences.
- Added SettingsViewModel to handle UI state and interactions, facilitating communication between the UI and the repository.
- Developed a SettingsScreen with tabbed sections for the various settings categories, enhancing user experience and organization.
- Implemented UI components for settings, including sliders, switches, and dropdowns, to allow intuitive adjustment of application parameters.
- Enhanced error handling and loading states in the settings UI for improved user feedback.
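One common way to back such an encrypted settings store on Android is `EncryptedSharedPreferences` from `androidx.security.crypto`; whether this PR uses it is an assumption, and the file name and preference keys below are illustrative:

```kotlin
import android.content.Context
import androidx.security.crypto.EncryptedSharedPreferences
import androidx.security.crypto.MasterKey

// Sketch of an encrypted preferences store of the kind PreferencesRepository
// might wrap. Values are encrypted at rest with an AndroidKeystore-backed key.
class PreferencesRepository(context: Context) {
    private val prefs = EncryptedSharedPreferences.create(
        context,
        "llm_settings",                                 // assumed file name
        MasterKey.Builder(context)
            .setKeyScheme(MasterKey.KeyScheme.AES256_GCM)
            .build(),
        EncryptedSharedPreferences.PrefKeyEncryptionScheme.AES256_SIV,
        EncryptedSharedPreferences.PrefValueEncryptionScheme.AES256_GCM
    )

    var temperature: Float
        get() = prefs.getFloat("temperature", 0.7f)
        set(value) = prefs.edit().putFloat("temperature", value).apply()
}
```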
- Updated build.gradle.kts to include Hilt for dependency injection and improved build types for debug, release, and benchmark configurations.
- Refactored the LLMService interface to return GenerationResult instead of String, enhancing the handling of generation outcomes across the various implementations.
- Improved error handling in the LLM service implementations (ExecuTorch, GeminiNano, MLCLLM, ONNXRuntime, TFLite) to return GenerationResult on failure, ensuring a consistent response structure.
- Added comprehensive ProGuard rules for the various frameworks to optimize code obfuscation and ensure necessary classes are retained.
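A plausible shape for the `GenerationResult` that the services now return instead of a raw `String`; the exact fields are assumptions, but a sealed class like this is the idiomatic Kotlin way to force callers to handle both outcomes:

```kotlin
// Illustrative result type: success carries the text plus basic metrics,
// failure carries a message instead of throwing across the service boundary.
sealed class GenerationResult {
    data class Success(
        val text: String,
        val tokensGenerated: Int,
        val latencyMs: Long
    ) : GenerationResult()

    data class Error(
        val message: String,
        val cause: Throwable? = null
    ) : GenerationResult()
}

// Callers branch exhaustively; the compiler flags any unhandled case.
fun render(result: GenerationResult): String = when (result) {
    is GenerationResult.Success -> result.text
    is GenerationResult.Error -> "Generation failed: ${result.message}"
}
```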
- Introduced dynamic feature modules for ExecuTorch and Llama.cpp, enabling on-demand installation and optimized performance for specific frameworks.
- Developed ExecuTorchFeatureModule and LlamaCppFeatureModule to provide the necessary services and requirements for their respective frameworks.
- Enhanced the model management UI with detailed components, including EnhancedModelCard, ModelComparisonView, and ModelSearchAndFilter, improving user experience and interaction.
- Implemented advanced model comparison and filtering capabilities, allowing users to easily navigate and select models based on performance metrics and requirements.