Post release update (microsoft#985)
* news update

* doc update

* avoid KeyError

* bump version to 1.2.1

* handle empty responses

* typo

* eval function
sonichi authored Apr 10, 2023
1 parent a701cd8 commit c780d79
Showing 7 changed files with 17 additions and 8 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -14,7 +14,7 @@
 <br>
 </p>
 
-:fire: OpenAI GPT-3 models support in v1.1.3. ChatGPT and GPT-4 support will be added in v1.2.0.
+:fire: v1.2.0 is released with support for ChatGPT and GPT-4.
 
 :fire: A [lab forum](https://github.com/microsoft/FLAML/tree/tutorial-aaai23/tutorial) on FLAML at AAAI 2023.
 
10 changes: 9 additions & 1 deletion flaml/autogen/math_utils.py
@@ -290,8 +290,16 @@ def eval_math_responses(responses, solution=None, **args):
     Returns:
         dict: The success metrics.
     """
-    success_list = []
     n = len(responses)
+    if not n:
+        return {
+            "expected_success": 0,
+            "success": False,
+            "success_vote": 0,
+            "voted_answer": None,
+            "votes": 0,
+        }
+    success_list = []
     if solution is not None:
         for i in range(n):
             response = responses[i]
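For a quick check of the new guard, a minimal usage sketch (mirroring the test added in test/openai/test_completion.py below):

```python
from flaml.autogen.math_utils import eval_math_responses

# With the guard above, an empty response list yields zeroed metrics
# instead of failing on the empty input.
metrics = eval_math_responses([], None)
print(metrics)
# -> {'expected_success': 0, 'success': False, 'success_vote': 0,
#     'voted_answer': None, 'votes': 0}
```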
2 changes: 1 addition & 1 deletion flaml/autogen/oai/completion.py
@@ -843,7 +843,7 @@ def extract_text(cls, response: dict) -> List[str]:
         choices = response["choices"]
         if "text" in choices[0]:
             return [choice["text"] for choice in choices]
-        return [choice["message"]["content"] for choice in choices]
+        return [choice["message"].get("content", "") for choice in choices]
 
 
 class ChatCompletion(Completion):
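For illustration, a minimal sketch of the behavior change (the response dict below is a hypothetical stand-in for an OpenAI API payload):

```python
# Hypothetical chat-completion payload; the second choice's message has no
# "content" key, as can happen with empty responses.
response = {
    "choices": [
        {"message": {"role": "assistant", "content": "42"}},
        {"message": {"role": "assistant"}},
    ]
}
# The old indexing choice["message"]["content"] raised KeyError on the second
# choice; .get("content", "") now returns an empty string instead.
texts = [choice["message"].get("content", "") for choice in response["choices"]]
assert texts == ["42", ""]
```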
2 changes: 1 addition & 1 deletion flaml/version.py
@@ -1 +1 @@
-__version__ = "1.2.0"
+__version__ = "1.2.1"
1 change: 1 addition & 0 deletions test/openai/test_completion.py
@@ -216,6 +216,7 @@ def my_average(results):
     print("tuned config", config)
     result = oai.ChatCompletion.test(test_data_sample, config)
     print("result from tuned config:", result)
+    print("empty responses", eval_math_responses([], None))
 
 
 if __name__ == "__main__":
4 changes: 2 additions & 2 deletions website/docs/Examples/AutoGen-OpenAI.md
@@ -56,7 +56,7 @@ test_data = [
 ]
 ```
 
-### Defining the metric
+### Define the metric
 
 Before starting tuning, you need to define the metric for the optimization. For each code generation task, we can use the model to generate multiple candidate responses, and then select one from them. If the final selected response can pass a unit test, we consider the task as successfully solved. Then we can define the average success rate on a collection of tasks as the optimization metric.
 
@@ -69,7 +69,7 @@ eval_with_generated_assertions = partial(eval_function_completions, assertions=g

 This function will first generate assertion statements for each problem. Then, it uses the assertions to select the generated responses.
 
-### Tuning Hyperparameters for OpenAI
+### Tune the hyperparameters
 
 The tuning will be performed under the specified optimization budgets.
 
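As a sketch of how this metric plugs into tuning (assuming a `tune_data` list prepared as in the data section; the budget values are illustrative placeholders, not recommendations):

```python
from flaml import oai

# Sketch: pass the evaluation function to the tuner along with the budgets.
config, analysis = oai.Completion.tune(
    data=tune_data,                            # tuning instances (assumed prepared earlier)
    metric="success",                          # metric key returned by the eval function
    mode="max",                                # maximize the success rate
    eval_func=eval_with_generated_assertions,  # the metric defined above
    inference_budget=0.05,                     # max average $ per instance
    optimization_budget=3,                     # max total $ for the tuning run
    num_samples=-1,                            # let the budget cap the number of trials
)
```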
4 changes: 2 additions & 2 deletions website/docs/Use-Cases/Auto-Generation.md
@@ -44,13 +44,13 @@ Collect a diverse set of instances. They can be stored in an iterable of dicts.
 The evaluation function should take a list of responses, and other keyword arguments corresponding to the keys in each validation data instance as input, and output a dict of metrics. For example,
 
 ```python
-def success_metrics(responses: List[str], problem: str, solution: str) -> Dict:
+def eval_math_responses(responses: List[str], solution: str, **args) -> Dict:
     # select a response from the list of responses
     # check whether the answer is correct
     return {"success": True or False}
 ```
 
-`flaml.autogen` offers some example evaluation functions for common tasks such as code generation and math problem solving.
+[`flaml.autogen.code_utils`](../reference/autogen/code_utils) and [`flaml.autogen.math_utils`](../reference/autogen/math_utils) offer some example evaluation functions for code generation and math problem solving.
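For instance, a hedged sketch of a complete evaluation function following this signature (the majority-vote logic is illustrative only, not FLAML's implementation):

```python
from collections import Counter
from typing import Dict, List

def success_metrics(responses: List[str], solution: str, **args) -> Dict:
    # Illustrative: majority-vote across the candidate responses, then
    # compare the voted answer with the reference solution.
    votes = Counter(r.strip() for r in responses if r.strip())
    voted_answer = votes.most_common(1)[0][0] if votes else None
    return {"success": voted_answer == solution.strip()}
```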

### Metric to optimize

