@zavocc zavocc commented Sep 26, 2025

JakeyBot releases are now numbered; this version is 1.0.

This changelog is draft 2, written on 09/26/2025 and edited on 09/29/2025 at 11:56 AM UTC+8.

This major update to JakeyBot brings significant changes to the user experience and the internals, along with bug fixes.

For Discord Bot Users

New features and models

  • Support for multi-file uploads
  • New models across AI features, including Grok 4 Fast, Nano Banana, Imagen 4, and Seedream 4, hosted on Fal.ai and OpenRouter
  • gradio_client is replaced with fal-client for most generative media tasks
  • Avatar remix command is now powered by Nano Banana
  • The /summarize command can now be steered with natural-language instructions through the steer argument, and its model can now be chosen from options such as Gemini 2.5 Flash, Grok 4 Fast, and GPT-5 Mini
  • Discord text commands can now use the model that is set as default, such as Gemini 2.5 Flash, Grok 4, or GPT-5 Mini. This includes:
    • /avatar show describe:true
    • Explain this message
    • Rephrase this message
    • Suggest a response
    • /polls create
  • /feature command is now /agent command.
  • The Web Search agent can now browse the web using the url_browse tool. In addition to snippets from search results, the model can now use the content of URLs as context, which helps with tasks like page summarization and nuanced query understanding.
  • Audio Tools is updated: the audio_generator tool is now text_to_speech, powered by the ElevenLabs v3 TTS model. Podcast generation (podcastgen) is now more expressive, faster, and has background music. A music generation tool, music_generator, is also added, powered by Stable Audio 2.5. All are powered by Fal.AI (requires a Fal.AI account and sufficient credits)
  • Gemini 2.5 Flash models are updated (September 25 update), and existing users who still use Gemini 2.0 Flash will be automatically upgraded to the 2.5 Flash Lite Thinking model.

Removed features

* - indicates the feature is temporarily removed

  • Chat variable /model:model-name is removed
  • Grok 3, replaced by Grok 4 Fast
  • YouTube watcher as a Tool
  • /models list is removed; check the options available in /models set instead
  • OpenRouter as the openrouter model in /models set. The /openrouter command still works, but the model itself is not available*
  • Claude and DeepSeek models*
  • Audio editor*
  • PDF and Text files as attachments for OpenAI models*
  • Image, PDF, Video, and Audio as context in Explain this message feature*
  • youtube_search has moved into the Web Search agent under the tool name youtube_video_search. Please switch to the Web Search agent if you encounter chat errors

For Bot Developers and Admins

  • Significantly refactored large parts of the code, squashed most bugs, and updated structures and imports
  • New models project module: all AI-related code, such as the models list, SDKs, chat history, and other utility functions, now lives in the models directory instead of the core and aimodels directories
  • Updated the models.yaml syntax to be more explicit, cleaner, and less error-prone. Instead of relying on the provider::model syntax, models are now configured declaratively:
- model_alias: openai::gpt-5
  model_human_name: OpenAI GPT-5
  model_description: OpenAI's latest reasoning model
  sdk: openai
  model_id: gpt-5
  has_reasoning: true
  enable_tools: true
  enable_files: true
  enable_threads: true
  thread_name: openai
  enable_system_instructions: true
  client_name: openai_client

Rather than guessing from provider::model-name alone, models and their capabilities are now declared explicitly. This reduces errors caused by the inference SDK code implicitly guessing which parameters apply to which model. It also means that models sharing a provider and SDK can each have distinct capabilities without a new provider handler being written just for that model.

For example, Gemini and Gemma models use the same google-genai SDK, but because of their differences we had to write a separate provider handler for each: aimodels/gemini/infer.py for Gemini models under gemini::gemini-1.5-pro and aimodels/google/infer.py for Gemma under gemma::gemma-3-27b. The two had different imports but the same SDK code, which resulted in duplicated (non-DRY), unmaintainable code and added technical debt.

Now, model capabilities can be set at a more granular level.

Because of this change, models that use the same SDK can now share the same inference code without a new handler being written. xAI, OpenRouter, and OpenAI models can use the shared code in models.providers.openai, and Gemini and Gemma models the code in models.providers.google.
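
To illustrate how declared capabilities can replace guessing from the alias, here is a minimal sketch. The field names mirror the example models.yaml entry above. The `ModelProps` dataclass, `parse_model_entry`, `build_request_options`, and the option names they return are hypothetical and are not taken from the JakeyBot code base (which, according to the notes further down, validates with pydantic).

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass(frozen=True)
class ModelProps:
    """One models.yaml entry; field names mirror the example entry above."""
    model_alias: str
    model_human_name: str
    model_description: str
    sdk: str
    model_id: str
    client_name: str
    thread_name: str
    has_reasoning: bool = False
    enable_tools: bool = False
    enable_files: bool = False
    enable_threads: bool = False
    enable_system_instructions: bool = False


def parse_model_entry(entry: dict[str, Any]) -> ModelProps:
    """Reject unknown keys early so typos in the YAML fail loudly."""
    known = {f.name for f in fields(ModelProps)}
    unknown = set(entry) - known
    if unknown:
        raise ValueError(f"Unknown model keys: {sorted(unknown)}")
    return ModelProps(**entry)


def build_request_options(
    props: ModelProps, wants_tools: bool, has_attachments: bool
) -> dict[str, Any]:
    """Hypothetical helper: derive request options from declared capabilities."""
    if has_attachments and not props.enable_files:
        raise ValueError(f"{props.model_human_name} does not accept file attachments")
    return {
        "model": props.model_id,
        "use_tools": wants_tools and props.enable_tools,
        "use_reasoning": props.has_reasoning,
        "keep_history": props.enable_threads,
    }


# The same entry as in the YAML example, written as a plain dict.
entry = {
    "model_alias": "openai::gpt-5",
    "model_human_name": "OpenAI GPT-5",
    "model_description": "OpenAI's latest reasoning model",
    "sdk": "openai",
    "model_id": "gpt-5",
    "has_reasoning": True,
    "enable_tools": True,
    "enable_files": True,
    "enable_threads": True,
    "thread_name": "openai",
    "enable_system_instructions": True,
    "client_name": "openai_client",
}

props = parse_model_entry(entry)
options = build_request_options(props, wants_tools=True, has_attachments=False)
```

In this sketch the shared inference code never inspects the alias string; two models with the same `sdk` value can differ only in their flags.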

  • The core directory now contains code for subclasses and the database
  • Renamed core.ai.history to core.database
  • Restructured the tools directory with new utility functions and structural changes. The tool schema now uses manifest.yaml instead of an inheritable manifest.py, and tools will not be registered without a manifest.yaml. The new tools structure is tools/apis/ToolName/{manifest.yaml,tool.py}
  • New task-specific inference code under models.tasks.{media, text}.provider for media and text completions without the full chat experience, using Fal.ai, OpenAI, and Google models. This reduces repetitive code. It includes run_audio and run_image for media tasks and completion for text, all with the same kwargs-based signatures to keep things flexible.
  • Decoupled JakeyBot from being a purely Gemini-only AI bot, for instance by renaming cogs.ai.gemini to cogs.ai.tasks and reorganizing the cogs.
  • Cleaned up core.database by removing redundant utility methods. There are now only two methods for fetching and setting data, get_key and set_key, which read and write key values scoped to the Discord user's snowflake ID
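
The two-method database surface from the last item above can be sketched as follows. This is an in-memory stand-in, not the MongoDB-backed implementation: the class name `InMemoryUserStore`, the parameter names, the `default` argument, and the example key and alias values are assumptions made for illustration only.

```python
from typing import Any


class InMemoryUserStore:
    """Stand-in for the MongoDB-backed class; only the two-method surface matters."""

    def __init__(self) -> None:
        # One document per Discord user snowflake ID.
        self._documents: dict[int, dict[str, Any]] = {}

    def get_key(self, user_id: int, key: str, default: Any = None) -> Any:
        """Read a value scoped to one user, falling back to a default."""
        return self._documents.get(user_id, {}).get(key, default)

    def set_key(self, user_id: int, key: str, value: Any) -> None:
        """Create the user's document if needed, then set the value."""
        self._documents.setdefault(user_id, {})[key] = value


store = InMemoryUserStore()
store.set_key(1234, "default_model", "openai::gpt-5")
store.set_key(5678, "default_model", "example::other-model")
```

Because every read and write goes through the same pair of methods, callers such as the chat commands do not need dedicated helpers per setting.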

Removed features

* - indicates the feature is temporarily removed

  • LiteLLM SDK for models outside of the Google and OpenAI SDKs*

  • SHARED_CHAT_HISTORY

did
- add multi image uploads

- added validation tools using pydantic

- Introduced a new declarative models.yaml. Rather than making guesses, it is configured by model capabilities, such as support for file inputs, history, reasoning, and tool use. It is more explicit

- Made inference code more maintainable: similar providers now use the same compatible shared SDK (e.g. Groq and OpenAI both use the OpenAI SDK) together with the declarative models.

- Tool use code is now separate, which makes the completions code much cleaner

- Completions code (send_message) returns the chat thread rather than a dict

- Simplified interstitials

What is removed while under reconstruction
- List models

- "/model" chat variable

- Chat history (WIP, deciding whether the load and save history code should be executed in generative_chat.py and pass the history to send_message), which renders the /chat:ephemeral variable useless

- Attachment metadata (will put on BaseChat instead, todo: add extra_metadata parameter to upload_files method and add as text part)

- /chat:info chat variable - syntax change needed

- Google, Anthropic, most models. Only OpenAI is available for now, more SDKs to come

TO BE IMPLEMENTED:
- Robust and simplified history.py, to purely use get_key, set_key and clear_history... which will be streamlined with new refactored load_history and save_history as well as sweeping chat history and changing tools

- Other missing features like PDF and audio/video uploads
What works now
- More models testing different configs including threads, tools, and reasoning modes and mismatches

- File metadata

- Streamlined chat history loading and saving now uses the generic get_key and set_key in models/utils.yaml

- Use model props in "/model set" from chat.py rather than messy parsing

- Loading and saving chat history now calls utils.save_history and utils.load_history directly from generative_chat.py
  This also means we can manipulate the history directly, for example to decide whether a conversation should be saved

- Added checks from model_props in generative_chat.py
  This includes checking whether file attachments or chat threads are supported

- Removed the legacy load_history and save_history from history.py to streamline tasks. The get and set methods dedicated to tools may soon follow, replaced by the generic get and set key methods

========== ROADMAPS:

- Simplify history management. For instance, in chat.py, the clear, model set, and set tools commands will now call the generic set and get database methods from models/utils, centralizing database instances in generative_chat

- Rename history.py to database.py?

- We probably need to reduce reliance on Google Gemini and update the defaults

- In generative_chat.py, simplify error handling: most errors should be either generic (logged on the server side) or CustomErrorMessage

This means we might want to remove HistoryDatabase and ModelAPIKeyUnset, as well as other exceptions, and replace them with a generic "Something went wrong while trying to generate response, please try again later or switch to other models" while detailed logs remain on the server side
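
The error-handling idea above can be sketched as a single mapping function. `CustomErrorMessage` is named in these notes, but its definition here, the `user_facing_message` helper, and the logger name are assumptions made for this sketch.

```python
import logging

logger = logging.getLogger("jakey.sketch")  # hypothetical logger name

GENERIC_ERROR = (
    "Something went wrong while trying to generate response, "
    "please try again later or switch to other models"
)


class CustomErrorMessage(Exception):
    """Errors whose message is safe to show to the user as-is (assumed definition)."""


def user_facing_message(error: Exception) -> str:
    """Return the text to send to Discord; details stay in the server logs."""
    if isinstance(error, CustomErrorMessage):
        return str(error)
    # Anything else is logged with its traceback and hidden behind a generic reply.
    logger.error("Unhandled error while generating a response", exc_info=error)
    return GENERIC_ERROR
```

With this shape, dedicated exception classes are only needed when the user should see a specific message; everything else collapses into one generic path.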
1. Simplify how database is queried

Every MongoDB database operation can now only be done with three methods: set_key, get_key, and clear_history. This applies to all methods in chat.py

2. Consolidate error handling in chat.py, since the errors are redundant and repetitive

====== TODO:

Rename history.py to database.py, changing the class name from History to DatabaseConnection or DBConn (naming is WIP) and clear_history to clear_user_data
…nrouter/anthropic

Also WIP: we need a utility function in "/models" to normalize this syntax
- Move the reasoning normalization as a utility function to return relevant params

- Update indicator of file upload status
- Add support for custom openai instances that may have custom baseurl clients
- Add Kimi K2
…ing the same SDK, rename "provider" to "sdk", while renaming "provider" to "thread_name" in utils.py history tools
Change core.ai.core to models.core
Change core.ai.history to core.history
Update object instance reference names

TODO: tool use revamp, fetch_default_model revamp, default key in models.yaml to use the first and only defaults
- Use manifest.yaml
- Add properties to determine if it's an MCP or local tool
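
As a sketch of the manifest rule stated earlier in this changelog (tools without a manifest.yaml are not registered), the snippet below scans a tools/apis-style directory and keeps only the tool folders that ship a manifest. The function name `discover_tools`, the example folder names, and the manifest contents are hypothetical; the real loader is not shown in these notes.

```python
import tempfile
from pathlib import Path


def discover_tools(apis_dir: Path) -> list[str]:
    """Return the names of tool directories that contain a manifest.yaml."""
    return sorted(
        child.name
        for child in apis_dir.iterdir()
        if child.is_dir() and (child / "manifest.yaml").is_file()
    )


# Build a throwaway layout: one complete tool and one without a manifest.
with tempfile.TemporaryDirectory() as tmp:
    apis = Path(tmp) / "tools" / "apis"
    (apis / "ImageGen").mkdir(parents=True)
    (apis / "ImageGen" / "manifest.yaml").write_text("tool_name: image_generator\n")
    (apis / "ImageGen" / "tool.py").write_text("# tool code\n")
    (apis / "Draft").mkdir()
    (apis / "Draft" / "tool.py").write_text("# no manifest yet\n")
    found = discover_tools(apis)
```

Here only the folder with a manifest is discovered; the folder that has a tool.py alone is skipped.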
Upgrade image models to Imagen 4, Nano Banana, and Seedream 4
Update imagegen tool and dependencies
1. fetch_default_model from HelperFunctions is retired; it is now part of the models.core utility functions and is called from there

2. Make models.core purely utility functions and update function names and imports

3. Update database.py to use the new get_default_chat_model method from models.core instead of fetch_default_model

Also update _ensure_document with default constant values: since we only have getters and setters, those params are unnecessary

4. Update validation.py for models.yaml to include a default key. models.yaml now requires at least one default model, or else get_default_chat_model will fail
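
A minimal sketch of steps 3 and 4, assuming the default key is a boolean on each models.yaml entry. Only the names get_default_chat_model and the default key come from these notes; the `validate_models` function, the signatures, the error types, and the second example entry are assumptions.

```python
from typing import Any


def validate_models(models: list[dict[str, Any]]) -> None:
    """Fail at load time if no entry is flagged as default."""
    if not any(entry.get("default") for entry in models):
        raise ValueError("models.yaml needs at least one model with default: true")


def get_default_chat_model(models: list[dict[str, Any]]) -> str:
    """Return the alias of the first entry flagged as default."""
    for entry in models:
        if entry.get("default"):
            return entry["model_alias"]
    raise LookupError("No default model configured")


# Entries as they might look after models.yaml is parsed into dicts.
models = [
    {"model_alias": "example::other-model"},
    {"model_alias": "openai::gpt-5", "default": True},
]

validate_models(models)
default_alias = get_default_chat_model(models)
```

Validating up front means a missing default surfaces when the configuration is loaded rather than on a user's first chat message.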

---------------------------------------

TODO:

Exclude the /aimodels folder from development workflows (e.g. searching for references)

In commands.yaml, remove Gemini-specific commands, since many references need a significant rewrite to use the latest ones, as well as the new text task type, for which we already use models.task.text (Google for now)

ai.gemini.fun, ai.gemini.message_actions, ai.gemini.oneoff, ai.gemini.summarize: WIP to support default model choice, just like models.yaml for chat
- Merge youtube search to Internet Search (formerly known as WebSearch)

- Add Grok 4 Fast as an OpenRouter model in models.yaml

- Update initbot.py to startup.py with SubClassBotPlugServices class
Next steps: Test the other cogs
Update the avatar command to set the image URL in embeds instead
Benefits
- Adds models on the fly without a Discord bot restart; it only pulls and reads from models.yaml

TODO:
- Add features like a user-ID-based allowlist that filters the list to show users only the models available to them; users who are not on the allowlist see only public models. TO WORK ON: add an `allowed_user_ids` array to the ModelProps pydantic class and add a check to get_chat_models_autocomplete

- Replace generators like get_tools_list_generator and get_remix_styles_generator with autocomplete, though tools may remain a generator since the bot needs to be restarted to properly register new modules
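
The allowlist TODO above could look roughly like this. The `allowed_user_ids` field name comes from the note; the `ModelChoice` dataclass, the `visible_models` function, the convention that an empty allowlist means "public", and the example aliases and IDs are assumptions. The real check is planned for get_chat_models_autocomplete with a pydantic ModelProps class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    """Trimmed-down stand-in for a model entry with an optional allowlist."""
    model_alias: str
    allowed_user_ids: tuple[int, ...] = ()  # empty tuple: model is public


def visible_models(models: list[ModelChoice], user_id: int) -> list[str]:
    """Aliases the given user may pick: public models plus allowlisted ones."""
    return [
        model.model_alias
        for model in models
        if not model.allowed_user_ids or user_id in model.allowed_user_ids
    ]


catalog = [
    ModelChoice("openai::gpt-5"),
    ModelChoice("example::private-model", allowed_user_ids=(1234,)),
]

for_member = visible_models(catalog, user_id=1234)
for_guest = visible_models(catalog, user_id=5678)
```

A user on the allowlist sees both entries, while any other user sees only the public one.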