From e6f468a95e97d944ee748571c14296487b79311b Mon Sep 17 00:00:00 2001 From: Harsha S Date: Wed, 3 Apr 2024 08:22:18 +0530 Subject: [PATCH] Multiline docstrings fix (#2130) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * DOC FIX - Formatted docstrings for retrieve_user_proxy_agent.py and added a single-line summary for the RetrieveUserProxyAgent class. * DOC FIX - Formatted docstrings for the initiate_chats function in autogen/agentchat/chat.py. * Add vision capability (#2025) * Add vision capability * Configurate: description_prompt * Print warning instead of raising issues for type * Skip vision capability test if dependencies not installed * Append "vision" to agent's system message when enabled VisionCapability * GPT-4V notebook update with ConversableAgent * Clean GPT-4V notebook * Add vision capability test to workflow * Lint import * Update system message for vision capability * Add a `custom_caption_func` to VisionCapability * Add custom function example for vision capability * Skip test Vision capability custom func * GPT-4V notebook metadata to website * Remove redundant files * The custom caption function takes more inputs now * Add a more complex example of custom caption func * Remove trailing space --------- Co-authored-by: Chi Wang * Native tool call support for Mistral AI API and topic notebook. (#2135) * Support for Mistral AI API and topic notebook. 
* formatting * formatting * New conversational chess notebook using nested chats and tool use (#2137) * add chess notebook * update * update * Update notebook with figure * Add example link * redirect * Clean up example format * address gagan's comments * update references * fix links * add webarena in samples (#2114) * add webarena in samples/tools * Update samples/tools/webarena/README.md Co-authored-by: gagb * Update samples/tools/webarena/README.md Co-authored-by: gagb * Update samples/tools/webarena/README.md Co-authored-by: gagb * update installation instructions * black formatting * Update README.md --------- Co-authored-by: gagb Co-authored-by: Eric Zhu * context to kwargs (#2064) * context to kwargs * add tag * add test * text to kwargs --------- Co-authored-by: Eric Zhu Co-authored-by: Chi Wang * Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /website (#2131) Bumps [webpack-dev-middleware](https://github.com/webpack/webpack-dev-middleware) from 5.3.3 to 5.3.4. - [Release notes](https://github.com/webpack/webpack-dev-middleware/releases) - [Changelog](https://github.com/webpack/webpack-dev-middleware/blob/v5.3.4/CHANGELOG.md) - [Commits](https://github.com/webpack/webpack-dev-middleware/compare/v5.3.3...v5.3.4) --- updated-dependencies: - dependency-name: webpack-dev-middleware dependency-type: indirect ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Eric Zhu * Parse Any HTML-esh Style Tags (#2046) * tried implementing my own regex * improves tests * finally works * removes prints * fixed test * adds start and end * delete unused imports * refactored to use new tool * significantly improved algo * tag content -> tag attr * fix tests + adds new field * return full match * return remove start and end * update docstrings * update docstrings * update docstrings --------- Co-authored-by: Beibin Li Co-authored-by: Chi Wang * Integrate AgentOptimizer (#1767) * draft agent optimizer * refactor * remove * change openai config interface * notebook * update blog * add test * clean up * redir * update * update interface * change model name * move to contrib * Update autogen/agentchat/contrib/agent_optimizer.py Co-authored-by: Jack Gerrits --------- Co-authored-by: “skzhang1” <“shaokunzhang529@gmail.com”> Co-authored-by: Beibin Li Co-authored-by: Jieyu Zhang Co-authored-by: Jack Gerrits * Introducing IOStream protocol and adding support for websockets (#1551) * Introducing IOStream * bug fixing * polishing * refactoring * refactoring * refactoring * wip: async tests * websockets added * wip * merge with main * notebook added * FastAPI example added * wip * merge * getter/setter to iostream added * website/blog/2024-03-03-AutoGen-Update/img/dalle_gpt4v.png: convert to Git LFS * website/blog/2024-03-03-AutoGen-Update/img/gaia.png: convert to Git LFS * website/blog/2024-03-03-AutoGen-Update/img/teach.png: convert to Git LFS * add SSL support * wip * wip * exception handling added to on_connect() * refactoring: default iostream is being set in a context manager * test fix * polishing * polishing * polishing * fixed bug with new thread * polishing * a bit of refactoring and docs added * notebook added to docs * type checking added to CI * CI fix * CI fix * CI fix * polishing * obsolete todo comment 
removed * fixed precommit error --------- Co-authored-by: Eric Zhu * [CAP] [Feature] Get list of actors from directory service. (#2073) * Search directory for list of actors using regex '.*' gets all actors * docs changes * pre-commit fixes * Use ActorInfo from protobuf * pre-commit * Added zmq tests to work on removing sleeps * minor refactor of zmq tests * 1) Change DirSvr to user Broker. 2) Add req-router to broker 3) In ActorConnector use handshake and req/resp to remove sleep * 1) Change DirSvr to user Broker. 2) Add req-router to broker 3) In ActorConnector use handshake and req/resp to remove sleep * move socket creation to thread with recv * move socket creation to thread with recv * Better logging for DirectorySvc * better logging for directory svc * Use logging config * Start removing sleeps * pre-commit * Cleanup monitor socket * Mark cache as a protocol and update type hints to reflect (#2168) * Mark cache as a protocl and update type hints to reflect * int * undo init change modified: autogen/agentchat/chat.py * fix(): fix word spelling errors (#2171) * Implement User Defined Functions for Local CLI Executor (#2102) * Implement user defined functions feature for local cli exec, add docs * add tests, update docs * fixes * fix test * add pandas test dep * install test * provide template as func * formatting * undo change * address comments * add test deps * formatting * test only in 1 env * formatting * remove test for local only --------- Co-authored-by: Eric Zhu * simplify getting-started; update news (#2175) * simplify getting-started; update news * bug fix * update (#2178) Co-authored-by: AnonymousRepoSub <“shaokunzhang529@outlook.com” > * Fix formatting of admonitions in udf docs (#2188) * Fix iostream on new thread (#2181) * fixed get_stream in new thread by introducing a global default * fixed get_stream in new thread by introducing a global default --------- Co-authored-by: Chi Wang * Add link for rendering notebooks docs on website (#2191) * 
Transform Messages Capability (#1923) * wip * Adds docstrings * fixed spellings * wip * fixed errors * better class names * adds tests * added tests to workflow * improved token counting * improved notebook * improved token counting in test * improved docstrings * fix inconsistencies * changed by mistake * fixed docstring * fixed details * improves tests + adds openai contrib test * fix spelling oai contrib test * clearer docstrings * remove repeated docstr * improved notebook * adds metadata to notebook * Improve outline and description (#2125) * better dir structure * clip max tokens to allowed tokens * more accurate comments/docstrs * add deperecation warning * fix front matter desc * add deperecation warning notebook * undo local notebook settings changes * format notebook * format workflow --------- Co-authored-by: gagb * Bump express from 4.18.2 to 4.19.2 in /website (#2157) Bumps [express](https://github.com/expressjs/express) from 4.18.2 to 4.19.2. - [Release notes](https://github.com/expressjs/express/releases) - [Changelog](https://github.com/expressjs/express/blob/master/History.md) - [Commits](https://github.com/expressjs/express/compare/4.18.2...4.19.2) --- updated-dependencies: - dependency-name: express dependency-type: indirect ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add clarity analytics (#2201) * Docstring formatting fix: Standardized docstrings to adhere to the Google style guide, ensuring consistency and clarity; also fixed the broken link in autogen/agentchat/chat.py * Docstring fix: Reformatted docstrings to adhere to the Google style guide, ensuring consistency and clarity. 
For agentchat/contrib/retrieve_user_proxy_agent.py file * Fixed Pre-Commit Error, Trailing spaces on agentchat/chat.py * Fixed Pre-Commit Error, Trailing spaces on agentchat/chat.py --------- Signed-off-by: dependabot[bot] Co-authored-by: Li Jiang Co-authored-by: Beibin Li Co-authored-by: Chi Wang Co-authored-by: Eric Zhu Co-authored-by: olgavrou Co-authored-by: gagb Co-authored-by: Qingyun Wu Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Wael Karkoub Co-authored-by: Shaokun Zhang Co-authored-by: “skzhang1” <“shaokunzhang529@gmail.com”> Co-authored-by: Jieyu Zhang Co-authored-by: Jack Gerrits Co-authored-by: Davor Runje Co-authored-by: Rajan Co-authored-by: calm <1191465097@qq.com> Co-authored-by: AnonymousRepoSub <“shaokunzhang529@outlook.com” > --- autogen/agentchat/chat.py | 51 ++++--- .../contrib/retrieve_user_proxy_agent.py | 135 ++++++++++++------ 2 files changed, 119 insertions(+), 67 deletions(-) diff --git a/autogen/agentchat/chat.py b/autogen/agentchat/chat.py index bd56cf2f5793..347f58bfc3b8 100644 --- a/autogen/agentchat/chat.py +++ b/autogen/agentchat/chat.py @@ -26,7 +26,9 @@ class ChatResult: summary: str = None """A summary obtained from the chat.""" cost: tuple = None # (dict, dict) - (total_cost, actual_cost_with_cache) - """The cost of the chat. a tuple of (total_cost, total_actual_cost), where total_cost is a dictionary of cost information, and total_actual_cost is a dictionary of information on the actual incurred cost with cache.""" + """The cost of the chat. 
a tuple of (total_cost, total_actual_cost), where total_cost is a + dictionary of cost information, and total_actual_cost is a dictionary of information on + the actual incurred cost with cache.""" human_input: List[str] = None """A list of human input solicited during the chat.""" @@ -141,25 +143,32 @@ def __post_carryover_processing(chat_info: Dict[str, Any]) -> None: def initiate_chats(chat_queue: List[Dict[str, Any]]) -> List[ChatResult]: """Initiate a list of chats. - Args: - chat_queue (List[Dict]): a list of dictionaries containing the information about the chats. - - Each dictionary should contain the input arguments for [`ConversableAgent.initiate_chat`](/docs/reference/agentchat/conversable_agent#initiate_chat). For example: - - "sender": the sender agent. - - "recipient": the recipient agent. - - "clear_history" (bool): whether to clear the chat history with the agent. Default is True. - - "silent" (bool or None): (Experimental) whether to print the messages in this conversation. Default is False. - - "cache" (AbstractCache or None): the cache client to use for this conversation. Default is None. - - "max_turns" (int or None): maximum number of turns for the chat. If None, the chat will continue until a termination condition is met. Default is None. - - "summary_method" (str or callable): a string or callable specifying the method to get a summary from the chat. Default is DEFAULT_summary_method, i.e., "last_msg". - - "summary_args" (dict): a dictionary of arguments to be passed to the summary_method. Default is {}. - - "message" (str, callable or None): if None, input() will be called to get the initial message. - - **context: additional context information to be passed to the chat. - - "carryover": It can be used to specify the carryover information to be passed to this chat. - If provided, we will combine this carryover with the "message" content when generating the initial chat - message in `generate_init_message`. 
- + chat_queue (List[Dict]): A list of dictionaries containing the information about the chats. + + Each dictionary should contain the input arguments for + [`ConversableAgent.initiate_chat`](/docs/reference/agentchat/conversable_agent#initiate_chat). + For example: + - `"sender"` - the sender agent. + - `"recipient"` - the recipient agent. + - `"clear_history"` (bool) - whether to clear the chat history with the agent. + Default is True. + - `"silent"` (bool or None) - (Experimental) whether to print the messages in this + conversation. Default is False. + - `"cache"` (Cache or None) - the cache client to use for this conversation. + Default is None. + - `"max_turns"` (int or None) - maximum number of turns for the chat. If None, the chat + will continue until a termination condition is met. Default is None. + - `"summary_method"` (str or callable) - a string or callable specifying the method to get + a summary from the chat. Default is DEFAULT_summary_method, i.e., "last_msg". + - `"summary_args"` (dict) - a dictionary of arguments to be passed to the summary_method. + Default is {}. + - `"message"` (str, callable or None) - if None, input() will be called to get the + initial message. + - `**context` - additional context information to be passed to the chat. + - `"carryover"` - it can be used to specify the carryover information to be passed + to this chat. If provided, we will combine this carryover with the "message" content when + generating the initial chat message in `generate_init_message`. Returns: (list): a list of ChatResult objects corresponding to the finished chats in the chat_queue. """ @@ -228,11 +237,11 @@ async def a_initiate_chats(chat_queue: List[Dict[str, Any]]) -> Dict[int, ChatRe """(async) Initiate a list of chats. args: - Please refer to `initiate_chats`. + - Please refer to `initiate_chats`. returns: - (Dict): a dict of ChatId: ChatResult corresponding to the finished chats in the chat_queue. 
+ - (Dict): a dict of ChatId: ChatResult corresponding to the finished chats in the chat_queue. """ consolidate_chat_info(chat_queue) _validate_recipients(chat_queue) diff --git a/autogen/agentchat/contrib/retrieve_user_proxy_agent.py b/autogen/agentchat/contrib/retrieve_user_proxy_agent.py index f252f60e5ec1..1d029a5192c9 100644 --- a/autogen/agentchat/contrib/retrieve_user_proxy_agent.py +++ b/autogen/agentchat/contrib/retrieve_user_proxy_agent.py @@ -62,6 +62,10 @@ class RetrieveUserProxyAgent(UserProxyAgent): + """(In preview) The Retrieval-Augmented User Proxy retrieves document chunks based on the embedding + similarity, and sends them along with the question to the Retrieval-Augmented Assistant. + """ + def __init__( self, name="RetrieveChatAgent", # default set to RetrieveChatAgent @@ -73,67 +77,106 @@ def __init__( r""" Args: name (str): name of the agent. + human_input_mode (str): whether to ask for human inputs every time a message is received. Possible values are "ALWAYS", "TERMINATE", "NEVER". 1. When "ALWAYS", the agent prompts for human input every time a message is received. Under this mode, the conversation stops when the human input is "exit", or when is_termination_msg is True and there is no human input. - 2. When "TERMINATE", the agent only prompts for human input only when a termination message is received or - the number of auto reply reaches the max_consecutive_auto_reply. - 3. When "NEVER", the agent will never prompt for human input. Under this mode, the conversation stops - when the number of auto reply reaches the max_consecutive_auto_reply or when is_termination_msg is True. + 2. When "TERMINATE", the agent prompts for human input only when a termination + message is received or the number of auto reply reaches + the max_consecutive_auto_reply. + 3. When "NEVER", the agent will never prompt for human input. 
Under this mode, the + conversation stops when the number of auto reply reaches the + max_consecutive_auto_reply or when is_termination_msg is True. + is_termination_msg (function): a function that takes a message in the form of a dictionary and returns a boolean value indicating if this received message is a termination message. The dict can contain the following keys: "content", "role", "name", "function_call". + retrieve_config (dict or None): config for the retrieve agent. - To use default config, set to None. Otherwise, set to a dictionary with the following keys: - - task (Optional, str): the task of the retrieve chat. Possible values are "code", "qa" and "default". System - prompt will be different for different tasks. The default value is `default`, which supports both code and qa. - - client (Optional, chromadb.Client): the chromadb client. If key not provided, a default client `chromadb.Client()` - will be used. If you want to use other vector db, extend this class and override the `retrieve_docs` function. - - docs_path (Optional, Union[str, List[str]]): the path to the docs directory. It can also be the path to a single file, - the url to a single file or a list of directories, files and urls. Default is None, which works only if the collection is already created. - - extra_docs (Optional, bool): when true, allows adding documents with unique IDs without overwriting existing ones; when false, it replaces existing documents using default IDs, risking collection overwrite., - when set to true it enables the system to assign unique IDs starting from "length+i" for new document chunks, preventing the replacement of existing documents and facilitating the addition of more content to the collection.. - By default, "extra_docs" is set to false, starting document IDs from zero. This poses a risk as new documents might overwrite existing ones, potentially causing unintended loss or alteration of data in the collection. 
- - collection_name (Optional, str): the name of the collection. + + To use default config, set to None. Otherwise, set to a dictionary with the + following keys: + - `task` (Optional, str) - the task of the retrieve chat. Possible values are + "code", "qa" and "default". System prompt will be different for different tasks. + The default value is `default`, which supports both code and qa. + - `client` (Optional, chromadb.Client) - the chromadb client. If key not provided, a + default client `chromadb.Client()` will be used. If you want to use other + vector db, extend this class and override the `retrieve_docs` function. + - `docs_path` (Optional, Union[str, List[str]]) - the path to the docs directory. It + can also be the path to a single file, the url to a single file or a list + of directories, files and urls. Default is None, which works only if the + collection is already created. + - `extra_docs` (Optional, bool) - when true, allows adding documents with unique IDs + without overwriting existing ones; when false, it replaces existing documents + using default IDs, risking a collection overwrite. When set to true, it enables + the system to assign unique IDs starting from "length+i" for new document + chunks, preventing the replacement of existing documents and facilitating the + addition of more content to the collection. + By default, "extra_docs" is set to false, starting document IDs from zero. + This poses a risk as new documents might overwrite existing ones, potentially + causing unintended loss or alteration of data in the collection. + - `collection_name` (Optional, str) - the name of the collection. If key not provided, a default name `autogen-docs` will be used. - - model (Optional, str): the model to use for the retrieve chat. + - `model` (Optional, str) - the model to use for the retrieve chat. If key not provided, a default model `gpt-4` will be used. - - chunk_token_size (Optional, int): the chunk token size for the retrieve chat. 
+ - `chunk_token_size` (Optional, int) - the chunk token size for the retrieve chat. If key not provided, a default size `max_tokens * 0.4` will be used. - - context_max_tokens (Optional, int): the context max token size for the retrieve chat. + - `context_max_tokens` (Optional, int) - the context max token size for the + retrieve chat. If key not provided, a default size `max_tokens * 0.8` will be used. - - chunk_mode (Optional, str): the chunk mode for the retrieve chat. Possible values are - "multi_lines" and "one_line". If key not provided, a default mode `multi_lines` will be used. - - must_break_at_empty_line (Optional, bool): chunk will only break at empty line if True. Default is True. + - `chunk_mode` (Optional, str) - the chunk mode for the retrieve chat. Possible values + are "multi_lines" and "one_line". If key not provided, a default mode + `multi_lines` will be used. + - `must_break_at_empty_line` (Optional, bool) - chunk will only break at empty line + if True. Default is True. If chunk_mode is "one_line", this parameter will be ignored. - - embedding_model (Optional, str): the embedding model to use for the retrieve chat. - If key not provided, a default model `all-MiniLM-L6-v2` will be used. All available models - can be found at `https://www.sbert.net/docs/pretrained_models.html`. The default model is a - fast model. If you want to use a high performance model, `all-mpnet-base-v2` is recommended. - - embedding_function (Optional, Callable): the embedding function for creating the vector db. Default is None, - SentenceTransformer with the given `embedding_model` will be used. If you want to use OpenAI, Cohere, HuggingFace or - other embedding functions, you can pass it here, follow the examples in `https://docs.trychroma.com/embeddings`. - - customized_prompt (Optional, str): the customized prompt for the retrieve chat. Default is None. - - customized_answer_prefix (Optional, str): the customized answer prefix for the retrieve chat. Default is "". 
- If not "" and the customized_answer_prefix is not in the answer, `Update Context` will be triggered. - - update_context (Optional, bool): if False, will not apply `Update Context` for interactive retrieval. Default is True. - - get_or_create (Optional, bool): if True, will create/return a collection for the retrieve chat. This is the same as that used in chromadb. - Default is False. Will raise ValueError if the collection already exists and get_or_create is False. Will be set to True if docs_path is None. - - custom_token_count_function (Optional, Callable): a custom function to count the number of tokens in a string. - The function should take (text:str, model:str) as input and return the token_count(int). the retrieve_config["model"] will be passed in the function. - Default is autogen.token_count_utils.count_token that uses tiktoken, which may not be accurate for non-OpenAI models. - - custom_text_split_function (Optional, Callable): a custom function to split a string into a list of strings. - Default is None, will use the default function in `autogen.retrieve_utils.split_text_to_chunks`. - - custom_text_types (Optional, List[str]): a list of file types to be processed. Default is `autogen.retrieve_utils.TEXT_FORMATS`. - This only applies to files under the directories in `docs_path`. Explicitly included files and urls will be chunked regardless of their types. - - recursive (Optional, bool): whether to search documents recursively in the docs_path. Default is True. + - `embedding_model` (Optional, str) - the embedding model to use for the retrieve chat. + If key not provided, a default model `all-MiniLM-L6-v2` will be used. All available + models can be found at `https://www.sbert.net/docs/pretrained_models.html`. + The default model is a fast model. If you want to use a high performance model, + `all-mpnet-base-v2` is recommended. + - `embedding_function` (Optional, Callable) - the embedding function for creating the + vector db. 
Default is None, SentenceTransformer with the given `embedding_model` + will be used. If you want to use OpenAI, Cohere, HuggingFace or other embedding + functions, you can pass it here, + follow the examples in `https://docs.trychroma.com/embeddings`. + - `customized_prompt` (Optional, str) - the customized prompt for the retrieve chat. + Default is None. + - `customized_answer_prefix` (Optional, str) - the customized answer prefix for the + retrieve chat. Default is "". + If not "" and the customized_answer_prefix is not in the answer, + `Update Context` will be triggered. + - `update_context` (Optional, bool) - if False, will not apply `Update Context` for + interactive retrieval. Default is True. + - `get_or_create` (Optional, bool) - if True, will create/return a collection for the + retrieve chat. This is the same as that used in chromadb. + Default is False. Will raise ValueError if the collection already exists and + get_or_create is False. Will be set to True if docs_path is None. + - `custom_token_count_function` (Optional, Callable) - a custom function to count the + number of tokens in a string. + The function should take (text:str, model:str) as input and return the + token_count(int). the retrieve_config["model"] will be passed in the function. + Default is autogen.token_count_utils.count_token that uses tiktoken, which may + not be accurate for non-OpenAI models. + - `custom_text_split_function` (Optional, Callable) - a custom function to split a + string into a list of strings. + Default is None, will use the default function in + `autogen.retrieve_utils.split_text_to_chunks`. + - `custom_text_types` (Optional, List[str]) - a list of file types to be processed. + Default is `autogen.retrieve_utils.TEXT_FORMATS`. + This only applies to files under the directories in `docs_path`. Explicitly + included files and urls will be chunked regardless of their types. + - `recursive` (Optional, bool) - whether to search documents recursively in the + docs_path. 
Default is True. + `**kwargs` (dict): other kwargs in [UserProxyAgent](../user_proxy_agent#__init__). Example: - Example of overriding retrieve_docs - If you have set up a customized vector db, and it's not compatible with chromadb, you can easily plug in it with below code. + Example of overriding retrieve_docs - If you have set up a customized vector db, and it's + not compatible with chromadb, you can easily plug it in with the code below. ```python class MyRetrieveUserProxyAgent(RetrieveUserProxyAgent): def query_vector_db( @@ -416,9 +459,9 @@ def message_generator(sender, recipient, context): sender (Agent): the sender agent. It should be the instance of RetrieveUserProxyAgent. recipient (Agent): the recipient agent. Usually it's the assistant agent. context (dict): the context for the message generation. It should contain the following keys: - - problem (str): the problem to be solved. - - n_results (int): the number of results to be retrieved. Default is 20. - - search_string (str): only docs that contain an exact match of this string will be retrieved. Default is "". + - `problem` (str) - the problem to be solved. + - `n_results` (int) - the number of results to be retrieved. Default is 20. + - `search_string` (str) - only docs that contain an exact match of this string will be retrieved. Default is "". Returns: str: the generated message ready to be sent to the recipient agent. """
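For reviewers: the `initiate_chats` docstring reformatted above documents the keys each `chat_queue` entry may carry. A minimal sketch of such a queue follows; the agent arguments here are plain placeholder strings (in real use they would be `ConversableAgent` instances and the queue would be passed to `autogen.initiate_chats`), and `build_chat_queue` is a hypothetical helper, not part of the patched API.

```python
# Sketch of the chat_queue structure documented in the patched docstring.
# "sender" / "recipient" hold placeholder strings here; real usage passes
# ConversableAgent instances (assumes autogen is installed).

def build_chat_queue(sender, recipient, problem):
    """Assemble a two-step chat queue using the documented keys."""
    return [
        {
            "sender": sender,
            "recipient": recipient,
            "message": problem,            # if None, input() would be called
            "clear_history": True,         # documented default
            "silent": False,               # documented default
            "max_turns": 2,                # stop after two turns
            "summary_method": "last_msg",  # documented default summary method
            "summary_args": {},            # documented default
        },
        {
            "sender": sender,
            "recipient": recipient,
            "message": "Summarize the findings.",
            # carryover is combined with "message" in generate_init_message
            "carryover": "Use the previous chat's summary as context.",
        },
    ]

queue = build_chat_queue("agent_a", "agent_b", "What is 2 + 2?")
# Each entry holds the kwargs for ConversableAgent.initiate_chat;
# autogen.initiate_chats(queue) would then run them in order and
# return one ChatResult per entry.
```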