azure-sdk
Collaborator

Sync eng/common directory with azure-sdk-tools for PR Azure/azure-sdk-tools#10027. See eng/common workflow.

Add the prefix that marks the resource groups (RGs) we create in our TME
tenant, so they can be flagged as potentially using local auth and
violating our safe secret standards.
azure-sdk requested a review from a team as a code owner March 11, 2025 21:22
azure-sdk requested a review from weshaggard March 11, 2025 21:23
azure-sdk added the EngSys and Central-EngSys labels Mar 11, 2025
azure-sdk merged commit f3fcfb4 into main Mar 13, 2025
25 checks passed
azure-sdk deleted the sync-eng/common-AddRGPrefixForSSS-10027 branch March 13, 2025 18:00
allenkim0129 added a commit to allenkim0129/azure-sdk-for-python that referenced this pull request Mar 19, 2025
commit 23163bc
Author: Paul Van Eck <paulvaneck@microsoft.com>
Date:   Tue Mar 18 13:33:26 2025 -0700

    [Core] Fix traceparent header in DistributedTracingPolicy (Azure#40074)

    When we get the trace context headers, we should first ensure that we are
    inside the HTTP span's context to ensure that the traceparent header
    contains the correct span ID.

    Signed-off-by: Paul Van Eck <paulvaneck@microsoft.com>

    * Fix mypy

    Signed-off-by: Paul Van Eck <paulvaneck@microsoft.com>

    ---------

    Signed-off-by: Paul Van Eck <paulvaneck@microsoft.com>

commit 1541021
Author: Liudmila Molkova <limolkova@microsoft.com>
Date:   Tue Mar 18 11:10:47 2025 -0700

    azure-core-tracing-otel: update suppression and fix context management (Azure#39994)

commit 8d3838d
Author: Peter Wu <162184229+weirongw23-msft@users.noreply.github.com>
Date:   Tue Mar 18 08:40:41 2025 -0400

    [Storage] `next-pylint` April 25 (Azure#39839)

commit c975ade
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Mon Mar 17 22:40:35 2025 -0700

    [AutoRelease] t2-containerservice-2025-03-18-22063(can only be merged by SDK owner) (Azure#40111)

    * code and test

    * update testcases

    ---------

    Co-authored-by: azure-sdk <PythonSdkPipelines>
    Co-authored-by: ChenxiJiang333 <v-chenjiang@microsoft.com>

commit d626696
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Mon Mar 17 22:03:54 2025 -0700

    Increment package version after release of azure-cosmos (Azure#40107)

commit a258145
Author: Ralph <18542438+ralph-msft@users.noreply.github.com>
Date:   Mon Mar 17 21:56:04 2025 -0700

    Port Prompty code

    A port of the Prompty code from the Promptflow repo. The focus was on expediency rather than elegance. The core logic of the code is similar to the original except for the following changes:
    - Added type annotations
    - Removed support for the now legacy OpenAI completions API
    - Removed support for functions and tools. The former relied on an insecure implementation using eval. Since none of the evaluators use this feature right now, these were cut
    - Reworked the way images were handled to simplify (now handled in one pass, and no more surprise calls out to the internet to unnecessarily load image bytes)
    - Minor obvious tweaks to the code to improve readability, and trim unnecessary code paths

commit cff97a2
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Mon Mar 17 19:44:45 2025 -0700

    [AutoRelease] t2-databox-2025-03-08-30175(can only be merged by SDK owner) (Azure#39993)

    * code and test

    * update testcases

    * update testcases

    ---------

    Co-authored-by: azure-sdk <PythonSdkPipelines>
    Co-authored-by: ChenxiJiang333 <v-chenjiang@microsoft.com>

commit 0dc14f1
Author: Paul Van Eck <paulvaneck@microsoft.com>
Date:   Mon Mar 17 17:58:06 2025 -0700

    [Monitor Exporter] Update Azure SDK messaging span util (Azure#40059)

    The utility method that gets the target source for Azure SDK messaging
    spans will now also check for `server.address`. A newer version of the
    core OTel plugin will automatically convert `net.peer.name` to
    `server.address`. This ensures that this case can be handled.

    Signed-off-by: Paul Van Eck <paulvaneck@microsoft.com>
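
The attribute fallback described in the Monitor Exporter commit above might look roughly like the following sketch. The function name `get_messaging_target` and the lookup order (preferring `server.address`) are assumptions for illustration; the exporter's real utility may be named differently and may combine other attributes.

```python
from typing import Mapping, Optional


def get_messaging_target(attributes: Mapping[str, object]) -> Optional[str]:
    """Return the peer address recorded on a messaging span, if any.

    Hypothetical helper: checks the newer `server.address` key first,
    then falls back to the legacy `net.peer.name` key.
    """
    for key in ("server.address", "net.peer.name"):
        value = attributes.get(key)
        if value:
            return str(value)
    return None


# Spans from an older plugin version carry only the legacy key.
legacy = get_messaging_target({"net.peer.name": "example.servicebus.windows.net"})
# Spans from a newer plugin version carry the converted key.
converted = get_messaging_target({"server.address": "example.servicebus.windows.net"})
```

Both calls resolve to the same target, so spans emitted before and after the attribute conversion are handled the same way.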

commit 572a8e3
Author: v-pivamshi <155710746+v-pivamshi@users.noreply.github.com>
Date:   Tue Mar 18 04:18:48 2025 +0530

    added call connection id for recording and live and unit tests code. (Azure#38988)

    * added call connection id for recording and live and unit tests code.

    * made changes and run the live tests.

    * removed empty spaces.

    * resolved the conflicts and recorded the live test.

    * recorded live test.

commit fd1b592
Author: Tomas Varon <70857381+tvaron3@users.noreply.github.com>
Date:   Mon Mar 17 15:12:36 2025 -0700

    Release 4.10.0b2 (Azure#40100)

    * background call for get database account call

    * only call get database account in health check for different endpoints

    * Change health check logic in sync

    * Revert removing timing

    * update changelog

    * fix tests

    * use asyncio create_task

    * fix tests

    * fix pylint

    * Renamed variables and added effective preferred locations

    * update changelog

    * Add test for effective preferred regions

    * Add test for effective preferred regions

    * sync test for preferred regions

    * Renaming and add test for health check

    * fix tests

    * fix tests and add more health check tests

    * fix tests

    * add tests

    * fix tests

    * Move to breaking change

    * fix cspell and tests

    * fix tests

    * fix tests

    * revert preferred locations

    * timeout mark unavailable

    * moving health check to background

    * only finish health check after 2 successes

    * fix tests

    * add tests

    * fix and add timeout tests for marking endpoints unavailable

    * Reacting to comments

    * Reacting to comments

    * Add cleanup of background task when cosmos client closes

    * React to comments

    * Remove consecutive failures code

    * updated changelog

    * Fix test

    * Fix tests that weren't awaiting properly

    * fix tests

    * remove marking unavailable, health check four regions, health check primary and alternate

    * fix type hints

    * fix startup scenario

    * fix updating cache

    * fix operation type check unavailable

    * add logic to check by endpoint and not by regional routing context

    * fix tests

    * fix tests

    * fix test

    * fix tests

    * fix tests

    * debug multimaster test

    * cleanup

    * mark global endpoint available

    * fix tests and only use endpoint from gateway for multimaster

    * retry multi region writes same as reads

    * react to comments

    * multi write fix

    * cleanup / pylint

    * added some comments

    * dont use session token for writes

    * add tests for session token changes, update change log, adds comments to tests

    * fix pylint

    * fix test

    * revert session changes

    * release changes

    ---------

    Co-authored-by: Tomas Varon <tomasvaron@Tomass-MacBook-Pro.local>

commit 9936faa
Author: Waqas Javed <7674577+w-javed@users.noreply.github.com>
Date:   Mon Mar 17 14:37:01 2025 -0700

    Rename to ungrounded attributes (Azure#40078)

    * rename to personal attributes

    * uploading asset with renamed new tests

    * rename to ungroundedness

    * few changes

    * fix

    * fix

commit 9f6bf3b
Author: Tomas Varon <70857381+tvaron3@users.noreply.github.com>
Date:   Mon Mar 17 14:28:19 2025 -0700

    Health Check Improvements (Azure#39647)

    * background call for get database account call

    * only call get database account in health check for different endpoints

    * Change health check logic in sync

    * Revert removing timing

    * update changelog

    * fix tests

    * use asyncio create_task

    * fix tests

    * fix pylint

    * Renamed variables and added effective preferred locations

    * update changelog

    * Add test for effective preferred regions

    * Add test for effective preferred regions

    * sync test for preferred regions

    * Renaming and add test for health check

    * fix tests

    * fix tests and add more health check tests

    * fix tests

    * add tests

    * fix tests

    * Move to breaking change

    * fix cspell and tests

    * fix tests

    * fix tests

    * revert preferred locations

    * timeout mark unavailable

    * moving health check to background

    * only finish health check after 2 successes

    * fix tests

    * add tests

    * fix and add timeout tests for marking endpoints unavailable

    * Reacting to comments

    * Reacting to comments

    * Add cleanup of background task when cosmos client closes

    * React to comments

    * Remove consecutive failures code

    * updated changelog

    * Fix test

    * Fix tests that weren't awaiting properly

    * fix tests

    * remove marking unavailable, health check four regions, health check primary and alternate

    * fix type hints

    * fix startup scenario

    * fix updating cache

    * fix operation type check unavailable

    * add logic to check by endpoint and not by regional routing context

    * fix tests

    * fix tests

    * fix test

    * fix tests

    * fix tests

    * debug multimaster test

    * cleanup

    * mark global endpoint available

    * fix tests and only use endpoint from gateway for multimaster

    * retry multi region writes same as reads

    * react to comments

    * multi write fix

    * cleanup / pylint

    * added some comments

    * dont use session token for writes

    * add tests for session token changes, update change log, adds comments to tests

    * fix pylint

    * fix test

    * revert session changes

    * react to comments

    * React to comments

    ---------

    Co-authored-by: Tomas Varon <tomasvaron@Tomass-MacBook-Pro.local>

commit 9ad050f
Author: Liudmila Molkova <limolkova@microsoft.com>
Date:   Mon Mar 17 12:45:14 2025 -0700

    Don't ascii escape unicode chars in prompts and completions (Azure#40003)

    * don't ascii encode unicode chars in prompts and completions

commit e9cb15e
Author: Peter Wu <162184229+weirongw23-msft@users.noreply.github.com>
Date:   Mon Mar 17 13:14:25 2025 -0400

    [Storage] [Typing] [File Datalake] `azure-storage-file-datalake` (Azure#39691)

commit e9e626f
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Fri Mar 14 17:38:27 2025 -0700

    Update GitHubEventProcessor version to 1.0.0-dev.20250314.4 (Azure#40083)

    Co-authored-by: Juan Ospina <juanospina77752@gmail.com>

commit d8aa537
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Fri Mar 14 12:06:13 2025 -0700

    Use unified pipeline convention for rust (Azure#40071)

    Co-authored-by: Patrick Hallisey <pahallis@microsoft.com>

commit b6e506d
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Fri Mar 14 11:44:04 2025 -0700

    Gracefully handle when we return no matrix results (Azure#40079)

    Co-authored-by: Scott Beddall <scbedd@microsoft.com>

commit 582b642
Author: slister1001 <103153180+slister1001@users.noreply.github.com>
Date:   Fri Mar 14 11:11:07 2025 -0700

    Adding ECI to SafetyEvaluation (Azure#39915)

    * Adding ECI to SafetyEvaluation

    * fix typos

commit ff36cdb
Author: Krista Pratico <krpratic@microsoft.com>
Date:   Fri Mar 14 09:50:59 2025 -0700

    skip failing models.list tests (Azure#40073)

    * skip failing models.list tests

    * fix test fails

commit 1f7312f
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Thu Mar 13 16:53:09 2025 -0700

    Fix group prefix conditional in test resource removal (Azure#40067)

    Co-authored-by: Ben Broderick Phillips <bebroder@microsoft.com>

commit 6b747ae
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Thu Mar 13 15:59:50 2025 -0700

    Ensure that direct batches have an underscore before the batchnumber (Azure#40060)

    Co-authored-by: Scott Beddall <scbedd@microsoft.com>

commit e816b75
Author: catalinaperalta <catalinaperaltah@hotmail.com>
Date:   Thu Mar 13 13:04:54 2025 -0700

    changelog date (Azure#40066)

    Co-authored-by: catalinaperalta <caperal@microsoft.com>

commit fa14626
Author: Paul Van Eck <paulvaneck@microsoft.com>
Date:   Thu Mar 13 12:58:40 2025 -0700

    [Cosmos] Rename `test` directory (Azure#40030)

    For better compatibility with CI tooling and consistency with other Azure
    SDK packages, let's rename `test` to `tests`.

    Signed-off-by: Paul Van Eck <paulvaneck@microsoft.com>

commit 42e0a23
Author: whisper6284 <61473382+whisper6284@users.noreply.github.com>
Date:   Thu Mar 13 12:56:24 2025 -0700

    Update CODEOWNERS for Azure Communication Services Phone Numbers (Azure#40058)

    Co-authored-by: Adrian Tang <adtang@microsoft.com>

commit 3b39fbb
Author: Waqas Javed <7674577+w-javed@users.noreply.github.com>
Date:   Thu Mar 13 11:59:15 2025 -0700

    Adding change log and New test for Isa Eval - updates (Azure#40051)

    * first commit

    * added text

    * updating assets

    * fix cspell

    * fix cspell

    * test fix

    * test fix

    * refereshed assets

    * refereshed assets

    * asset update

    * asset update

    * change to details

    * change to details

    * assets

    * new assets

    * new assets

    * new assets

    * new assets

    * asset

    * adding isa

    * test added

    * revert operation

    * Fix

    * Fix & asset

    * Fix & asset

    * Fix & asset

    * remove singleton

    * remove singleton

    * fix

    * one more test

    * adding one more test for ISA

    * fix

    * adding change log

    * typo

    * typo

commit f3fcfb4
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Thu Mar 13 11:00:03 2025 -0700

    Add resource prefix for safe secret standard alerts (Azure#40028)

    Add the prefix to identify RGs that we are creating in our TME
    tenant to identify them as potentially using local auth and violating
    our safe secret standards.

    Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>

commit c33e4f7
Author: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>
Date:   Thu Mar 13 12:48:04 2025 +0530

    For ComputeInstance and AmlCompute update disableLocalAuth property based on ssh_public_access (Azure#39934)

    * add disableLocalAuth for computeInstance

    * fix disableLocalAuthAuth issue for amlCompute

    * update compute instance

    * update recordings

    * temp changes

    * Revert "temp changes"

    This reverts commit 64e3c38.

    * update recordings

    * fix tests

commit 9c5f623
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Wed Mar 12 20:02:34 2025 -0700

    [AutoRelease] t2-keyvault-2025-02-25-62029(can only be merged by SDK owner) (Azure#39856)

    * code and test

    * Update version to 11.0.0 and changelog

    * update

    * update

    * update

    * update testcases

    * Fix release date in CHANGELOG.md

    * Update release date in CHANGELOG.md

    ---------

    Co-authored-by: azure-sdk <PythonSdkPipelines>
    Co-authored-by: ChenxiJiang333 <119990644+ChenxiJiang333@users.noreply.github.com>
    Co-authored-by: Yuchao Yan <yuchaoyan@microsoft.com>
    Co-authored-by: ChenxiJiang333 <v-chenjiang@microsoft.com>

commit d762abb
Author: Scott Beddall <45376673+scbedd@users.noreply.github.com>
Date:   Wed Mar 12 16:38:08 2025 -0700

    remove windows style paths on returned AdditionalValidationPackages paths in get_package_properties.py (Azure#40053)

commit 0a2ae72
Author: Scott Beddall <45376673+scbedd@users.noreply.github.com>
Date:   Wed Mar 12 14:07:22 2025 -0700

    Use Root .coveragerc when publishing coverage reports (Azure#40035)

    * use the .coveragerc file across the board
    * adjust the coverage threshold for azure-ai-ml

commit 26f7999
Author: Paul Van Eck <paulvaneck@microsoft.com>
Date:   Wed Mar 12 13:38:23 2025 -0700

    [Core] OpenTelemetryTracer updates (Azure#40024)

    * [Core] OpenTelemetryTracer updates

    - Add `end_on_exit` keyword argument to `start_as_current_span`
    - Add `use_span` class method to activate a span in the current tracing
      context.

    Signed-off-by: Paul Van Eck <paulvaneck@microsoft.com>

    * Update docstrings

    Signed-off-by: Paul Van Eck <paulvaneck@microsoft.com>

    ---------

    Signed-off-by: Paul Van Eck <paulvaneck@microsoft.com>

commit a2ceed3
Author: M-Hietala <78813398+M-Hietala@users.noreply.github.com>
Date:   Wed Mar 12 14:24:04 2025 -0500

    adding agents instrumentation to configure_azure_monitor (Azure#40043)

    * adding agents instrumentation to configure_azure_monitor

    * removing 2 from package names that was left from testing

    * adding sample and updating changelog

    * changing to array of instrumentors

    ---------

    Co-authored-by: Marko Hietala <markhiet@microsoft.com>

commit e94f2de
Author: Krista Pratico <krpratic@microsoft.com>
Date:   Wed Mar 12 12:11:00 2025 -0700

    update tests - vector_stores out of beta (Azure#40046)

commit dc84a1b
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Wed Mar 12 12:10:24 2025 -0700

    Fix issue with excludepaths in get-prpkgproperties (Azure#40050)

    Co-authored-by: Scott Beddall <scbedd@microsoft.com>

commit 9613853
Author: Waqas Javed <7674577+w-javed@users.noreply.github.com>
Date:   Wed Mar 12 10:45:07 2025 -0700

    Code Vulnerability & ISA Evaluators (Azure#39882)

    * first commit

    * added text

    * updating assets

    * fix cspell

    * fix cspell

    * test fix

    * test fix

    * refereshed assets

    * refereshed assets

    * asset update

    * asset update

    * change to details

    * change to details

    * assets

    * new assets

    * new assets

    * new assets

    * new assets

    * asset

    * adding isa

    * test added

    * revert operation

    * Fix

    * Fix & asset

    * Fix & asset

    * Fix & asset

    * remove singleton

    * remove singleton

    * fix

commit 516977e
Author: swathipil <76007337+swathipil@users.noreply.github.com>
Date:   Wed Mar 12 10:23:40 2025 -0700

    [ServiceBus] fix changelog (Azure#40044)

commit 9e5dd05
Author: Liudmila Molkova <limolkova@microsoft.com>
Date:   Tue Mar 11 22:37:40 2025 -0700

    suppress generic spans for manually instrumented operations (Azure#40010)

commit 7e70af1
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Tue Mar 11 18:05:58 2025 -0700

    use the pull request head sha for APIView PR runs (Azure#40032)

    Co-authored-by: Chidozie Ononiwu <chononiw@microsoft.com>

commit a893b79
Author: Ralph <18542438+ralph-msft@users.noreply.github.com>
Date:   Tue Mar 11 16:06:18 2025 -0700

    Tox black code reformatting

    Fix "tox -e black" formatting issues

commit 45cab16
Author: Leighton Chen <lechen@microsoft.com>
Date:   Tue Mar 11 13:45:16 2025 -0800

    Support synthetic source in exporter (Azure#40004)

commit 3064d49
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Tue Mar 11 14:23:04 2025 -0700

    Increment package version after release of azure-identity (Azure#40027)

commit 7e8420e
Author: Peter Wu <162184229+weirongw23-msft@users.noreply.github.com>
Date:   Tue Mar 11 16:28:11 2025 -0400

    Increment versions after STG 97 GA (Azure#40026)

commit 6eafa7a
Author: Peter Wu <162184229+weirongw23-msft@users.noreply.github.com>
Date:   Tue Mar 11 15:57:22 2025 -0400

    [Storage] [Typing] [Blob Changefeed] `azure-storage-blob-changefeed` (Azure#39731) (Azure#40002)

commit 864fa17
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Tue Mar 11 12:06:54 2025 -0700

    Sync eng/common directory with azure-sdk-tools for PR 9993 (Azure#40021)

    * optimize save-package-properties for large diffs

    ---------

    Co-authored-by: Scott Beddall <scbedd@microsoft.com>

commit 105d698
Author: Paul Van Eck <paulvaneck@microsoft.com>
Date:   Tue Mar 11 11:06:03 2025 -0700

    [Identity] Prep release (Azure#40009)

    Signed-off-by: Paul Van Eck <paulvaneck@microsoft.com>

commit 204b671
Author: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Date:   Tue Mar 11 08:07:41 2025 -0700

    Increment package version after release of azure-search-documents (Azure#40017)

commit f8c5bfd
Author: Yuchao Yan <yuchaoyan@microsoft.com>
Date:   Tue Mar 11 16:08:42 2025 +0800

    update version (Azure#40014)
singankit pushed a commit that referenced this pull request Mar 21, 2025
…0146)

* For ComputeInstance and AmlCompute update disableLocalAuth property based on ssh_public_access (#39934)

* add disableLocalAuth for computeInstance

* fix disableLocalAuthAuth issue for amlCompute

* update compute instance

* update recordings

* temp changes

* Revert "temp changes"

This reverts commit 64e3c38.

* update recordings

* fix tests

* Add resource prefix for safe secret standard alerts (#40028)

Add the prefix to identify RGs that we are creating in our TME
tenant to identify them as potentially using local auth and violating
our safe secret standards.

Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>

* Add examples to task_adherence prompt. Add Task Adherence sample notebook

* Undo changes to New-TestResources.ps1

* Add sample .env file

---------

Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>
Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>
singankit pushed a commit that referenced this pull request Mar 24, 2025
singankit added a commit that referenced this pull request Mar 25, 2025
* Tool Call Accuracy Evaluator (#40068)

* Tool Call Accuracy Evaluator

* Review comments

* Updating score key and output structure

* Tool Call Accuracy Evaluator

* Review comments

* Updating score key and output structure

* Updating prompt

* Renaming parameter

* Converter from AI Service threads/runs to evaluator-compatible schema (#40047)

* WIP AIAgentConverter

* Added the v1 of the converter

* Updated the AIAgentConverter with different output schemas.

* ruff format

* Update the top schema to have: query, response, tool_definitions

* "agentic" is not a recognized word, change the wording.

* System message always comes first in query with multiple runs.

* Add support for getting inputs from local files with run_ids.

* Export AIAgentConverter through azure.ai.evaluation, local read updates

* Use from ._models import

* Ruff format again.

* For ComputeInstance and AmlCompute update disableLocalAuth property based on ssh_public_access (#39934)

* add disableLocalAuth for computeInstance

* fix disableLocalAuthAuth issue for amlCompute

* update compute instance

* update recordings

* temp changes

* Revert "temp changes"

This reverts commit 64e3c38.

* update recordings

* fix tests

* Simplify the API by rolling up the static methods and hiding internals.

* Lock the ._converters._ai_services behind an import error.

* Print to install azure-ai-projects if we can't import AIAgentConverter

* By default, include all previous runs' tool calls and results.

* Don't crash if there is no content in historical thread messages.

* Parallelize the calls to get step_details for each run_id.

* Addressing PR comments.

* Use a single underscore to hide internal static members.

---------

Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>

* Adding intent_resolution_evaluator to prp/agent_evaluators branch (#40065)

* Add intent resolution evaluator

* updated intent_resolution evaluator logic

* Remove spurious print statements

* Address reviewers feedback

* add threshold key, update result to pass/fail rather than True/False

* Add example + remove repeated fields

* Harden check_score_is_valid function

* Add Task Adherence and Completeness  (#40098)

* Agentic Evaluator - Response Completeness

* Added Change Log for Response Completeness Agentic Evaluator

* Task Adherence Agentic Evaluator

* Add Task Adherence Evaluator to changelog

* fixing contracts for Completeness and Task Adherence Evaluators

* Enhancing Contract for Task Adherence and Response Completeness Agentic Evaluator

* update completeness implementation.

* update the completeness evaluator response to include threshold comparison.

* updating the implementation for completeness.

* updating the type for completeness score.

* updating the parsing logic for llm output of completeness.

* updating the response dict for completeness.

* Adding Task adherence

* Adding Task Adherence evaluator with samples

* Delete old files

* updating the exception for completeness evaluator.

* Changing docstring

* Adding changelog

* Use _result_key

* Add admonition

---------

Co-authored-by: Shiprajain01 <shiprajain01@microsoft.com>
Co-authored-by: ShipraJain01 <103409614+ShipraJain01@users.noreply.github.com>
Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>

* Adding bug bash sample and instructions (#40125)

* Adding bug bash sample and instructions

* Updating instructions

* Update instructions.md

* Adding instructions and evaluator to agent evaluation sample

* add bug bash sample notebook for response completeness evaluator. (#40139)

* add bug bash sample notebook for response completeness evaluator.

* update the notebook for completeness.

---------

Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>

* Sample specific for tool call accuracy evaluator (#40135)

* Update instructions.md

* Add IntentResolution evaluator bug bash notebook (#40144)

* Add intent resolution evaluator

* updated intent_resolution evaluator logic

* Remove spurious print statements

* Address reviewers feedback

* add threshold key, update result to pass/fail rather than True/False

* Add example + remove repeated fields

* Harden check_score_is_valid function

* Sample notebook to demo intent_resolution evaluator

* Add synthetic data and section on how to test data from disk

* Update instructions.md

* Update _tool_call_accuracy.py

* Improve task adherence prompt and add sample notebook for bugbash (#40146)

* For ComputeInstance and AmlCompute update disableLocalAuth property based on ssh_public_access (#39934)

* add disableLocalAuth for computeInstance

* fix disableLocalAuthAuth issue for amlCompute

* update compute instance

* update recordings

* temp changes

* Revert "temp changes"

This reverts commit 64e3c38.

* update recordings

* fix tests

* Add resource prefix for safe secret standard alerts (#40028)

Add the prefix to identify RGs that we are creating in our TME
tenant to identify them as potentially using local auth and violating
our safe secret standards.

Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>

* Add examples to task_adherence prompt. Add Task Adherence sample notebook

* Undo changes to New-TestResources.ps1

* Add sample .env file

---------

Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>
Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>

* [AIAgentConverter] Added support for converting entire threads. (#40178)

* Implemented prepare_evaluation_data

* Add support for retrieving multiple threads into the same file.

* Parallelize thread preparing across threads.

* Set the maximum number of workers in thread pools to 10.

* Users/singankit/tool call accuracy evaluator tests (#40190)

* Raising error when tool call not found

* Adding unit tests for tool call accuracy evaluator

* Updating sample

* update output of converter for tool calls

* add built-ins

* handle file search

* remove extra files

* revert

* revert

---------

Co-authored-by: Ankit Singhal <30610298+singankit@users.noreply.github.com>
Co-authored-by: Sandy <16922860+thecsw@users.noreply.github.com>
Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>
Co-authored-by: Jose Santos <jcsantos@microsoft.com>
Co-authored-by: ghyadav <103428325+ghyadav@users.noreply.github.com>
Co-authored-by: Shiprajain01 <shiprajain01@microsoft.com>
Co-authored-by: ShipraJain01 <103409614+ShipraJain01@users.noreply.github.com>
Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>
Co-authored-by: Ankit Singhal <anksing@microsoft.com>
Co-authored-by: Chandra Sekhar Gupta <38103118+guptha23@users.noreply.github.com>
Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>
Co-authored-by: spon <stevenpon@microsoft.com>
singankit added a commit that referenced this pull request Mar 25, 2025
* Tool Call Accuracy Evaluator (#40068)

* Tool Call Accuracy Evaluator

* Review comments

* Updating score key and output structure

* Tool Call Accuracy Evaluator

* Review comments

* Updating score key and output structure

* Updating prompt

* Renaming parameter

* Converter from AI Service threads/runs to evaluator-compatible schema (#40047)

* WIP AIAgentConverter

* Added the v1 of the converter

* Updated the AIAgentConverter with different output schemas.

* ruff format

* Update the top schema to have: query, response, tool_definitions

* "agentic" is not a recognized word, change the wording.

* System message always comes first in query with multiple runs.

* Add support for getting inputs from local files with run_ids.

* Export AIAgentConverter through azure.ai.evaluation, local read updates

* Use from ._models import

* Ruff format again.

* For ComputeInstance and AmlCompute update disableLocalAuth property based on ssh_public_access (#39934)

* add disableLocalAuth for computeInstance

* fix disableLocalAuthAuth issue for amlCompute

* update compute instance

* update recordings

* temp changes

* Revert "temp changes"

This reverts commit 64e3c38.

* update recordings

* fix tests

* Simplify the API by rolling up the static methods and hiding internals.

* Lock the ._converters._ai_services behind an import error.

* Print to install azure-ai-projects if we can't import AIAgentConverter

* By default, include all previous runs' tool calls and results.

* Don't crash if there is no content in historical thread messages.

* Parallelize the calls to get step_details for each run_id.

* Addressing PR comments.

* Use a single underscore to hide internal static members.

---------

Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>

* Adding intent_resolution_evaluator to prp/agent_evaluators branch (#40065)

* Add intent resolution evaluator

* updated intent_resolution evaluator logic

* Remove spurious print statements

* Address reviewers feedback

* add threshold key, update result to pass/fail rather than True/False

* Add example + remove repeated fields

* Harden check_score_is_valid function

* Add Task Adherence and Completeness  (#40098)

* Agentic Evaluator - Response Completeness

* Added Change Log for Response Completeness Agentic Evaluator

* Task Adherence Agentic Evaluator

* Add Task Adherence Evaluator to changelog

* fixing contracts for Completeness and Task Adherence Evaluators

* Enhancing Contract for Task Adherence and Response Completeness Agentic Evaluator

* update completeness implementation.

* update the completeness evaluator response to include threshold comparison.

* updating the implementation for completeness.

* updating the type for completeness score.

* updating the parsing logic for llm output of completeness.

* updating the response dict for completeness.

* Adding Task adherence

* Adding Task Adherence evaluator with samples

* Delete old files

* updating the exception for completeness evaluator.

* Changing docstring

* Adding changelog

* Use _result_key

* Add admonition

---------

Co-authored-by: Shiprajain01 <shiprajain01@microsoft.com>
Co-authored-by: ShipraJain01 <103409614+ShipraJain01@users.noreply.github.com>
Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>

* Adding bug bash sample and instructions (#40125)

* Adding bug bash sample and instructions

* Updating instructions

* Update instructions.md

* Adding instructions and evaluator to agent evaluation sample

* add bug bash sample notebook for response completeness evaluator. (#40139)

* add bug bash sample notebook for response completeness evaluator.

* update the notebook for completeness.

---------

Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>

* Sample specific for tool call accuracy evaluator (#40135)

* Update instructions.md

* Add IntentResolution evaluator bug bash notebook (#40144)

* Add intent resolution evaluator

* updated intent_resolution evaluator logic

* Remove spurious print statements

* Address reviewers feedback

* add threshold key, update result to pass/fail rather than True/False

* Add example + remove repeated fields

* Harden check_score_is_valid function

* Sample notebook to demo intent_resolution evaluator

* Add synthetic data and section on how to test data from disk

* Update instructions.md

* Update _tool_call_accuracy.py

* Improve task adherence prompt and add sample notebook for bugbash (#40146)

* For ComputeInstance and AmlCompute update disableLocalAuth property based on ssh_public_access (#39934)

* add disableLocalAuth for computeInstance

* fix disableLocalAuthAuth issue for amlCompute

* update compute instance

* update recordings

* temp changes

* Revert "temp changes"

This reverts commit 64e3c38.

* update recordings

* fix tests

* Add resource prefix for safe secret standard alerts (#40028)

Add the prefix to identify RGs that we are creating in our TME
tenant to identify them as potentially using local auth and violating
our safe secret standards.

Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>

* Add examples to task_adherence prompt. Add Task Adherence sample notebook

* Undo changes to New-TestResources.ps1

* Add sample .env file

---------

Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>
Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>

* [AIAgentConverter] Added support for converting entire threads. (#40178)

* Implemented prepare_evaluation_data

* Add support for retrieving multiple threads into the same file.

* Parallelize thread preparing across threads.

* Set the maximum number of workers in thread pools to 10.

* Users/singankit/tool call accuracy evaluator tests (#40190)

* Raising error when tool call not found

* Adding unit tests for tool call accuracy evaluator

* Updating sample

* update output of converter for tool calls

* add built-ins

* handle file search

* remove extra files

* revert

* revert

* fix built-in tool parsing bug

* remove local debug

* Formatted and updated the converter to avoid built-in tool crashes.

* Added an experimental decorator to AIAgentConverter

* Update import path for experimental decorator

---------

Co-authored-by: Ankit Singhal <30610298+singankit@users.noreply.github.com>
Co-authored-by: Sandy <16922860+thecsw@users.noreply.github.com>
Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>
Co-authored-by: Jose Santos <jcsantos@microsoft.com>
Co-authored-by: ghyadav <103428325+ghyadav@users.noreply.github.com>
Co-authored-by: Shiprajain01 <shiprajain01@microsoft.com>
Co-authored-by: ShipraJain01 <103409614+ShipraJain01@users.noreply.github.com>
Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>
Co-authored-by: Ankit Singhal <anksing@microsoft.com>
Co-authored-by: Chandra Sekhar Gupta <38103118+guptha23@users.noreply.github.com>
Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>
Co-authored-by: spon <stevenpon@microsoft.com>
Co-authored-by: Sandy Urazayev <surazayev@microsoft.com>
singankit added a commit that referenced this pull request Mar 25, 2025
* Tool Call Accuracy Evaluator (#40068)

* Tool Call Accuracy Evaluator

* Review comments

* Updating score key and output structure

* Tool Call Accuracy Evaluator

* Review comments

* Updating score key and output structure

* Updating prompt

* Renaming parameter

* Converter from AI Service threads/runs to evaluator-compatible schema (#40047)

* WIP AIAgentConverter

* Added the v1 of the converter

* Updated the AIAgentConverter with different output schemas.

* ruff format

* Update the top schema to have: query, response, tool_definitions

* "agentic" is not a recognized word, change the wording.

* System message always comes first in query with multiple runs.

* Add support for getting inputs from local files with run_ids.

* Export AIAgentConverter through azure.ai.evaluation, local read updates

* Use from ._models import

* Ruff format again.

* For ComputeInstance and AmlCompute update disableLocalAuth property based on ssh_public_access (#39934)

* add disableLocalAuth for computeInstance

* fix disableLocalAuthAuth issue for amlCompute

* update compute instance

* update recordings

* temp changes

* Revert "temp changes"

This reverts commit 64e3c38.

* update recordings

* fix tests

* Simplify the API by rolling up the static methods and hiding internals.

* Lock the ._converters._ai_services behind an import error.

* Print to install azure-ai-projects if we can't import AIAgentConverter

* By default, include all previous runs' tool calls and results.

* Don't crash if there is no content in historical thread messages.

* Parallelize the calls to get step_details for each run_id.

* Addressing PR comments.

* Use a single underscore to hide internal static members.

---------

Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>

* Adding intent_resolution_evaluator to prp/agent_evaluators branch (#40065)

* Add intent resolution evaluator

* updated intent_resolution evaluator logic

* Remove spurious print statements

* Address reviewers feedback

* add threshold key, update result to pass/fail rather than True/False

* Add example + remove repeated fields

* Harden check_score_is_valid function

* Add Task Adherence and Completeness  (#40098)

* Agentic Evaluator - Response Completeness

* Added Change Log for Response Completeness Agentic Evaluator

* Task Adherence Agentic Evaluator

* Add Task Adherence Evaluator to changelog

* fixing contracts for Completeness and Task Adherence Evaluators

* Enhancing Contract for Task Adherence and Response Completeness Agentic Evaluator

* update completeness implementation.

* update the completeness evaluator response to include threshold comparison.

* updating the implementation for completeness.

* updating the type for completeness score.

* updating the parsing logic for llm output of completeness.

* updating the response dict for completeness.

* Adding Task adherence

* Adding Task Adherence evaluator with samples

* Delete old files

* updating the exception for completeness evaluator.

* Changing docstring

* Adding changelog

* Use _result_key

* Add admonition

---------

Co-authored-by: Shiprajain01 <shiprajain01@microsoft.com>
Co-authored-by: ShipraJain01 <103409614+ShipraJain01@users.noreply.github.com>
Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>

* Adding bug bash sample and instructions (#40125)

* Adding bug bash sample and instructions

* Updating instructions

* Update instructions.md

* Adding instructions and evaluator to agent evaluation sample

* Raising error when tool call not found

* add bug bash sample notebook for response completeness evaluator. (#40139)

* add bug bash sample notebook for response completeness evaluator.

* update the notebook for completeness.

---------

Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>

* Sample specific for tool call accuracy evaluator (#40135)

* Update instructions.md

* Add IntentResolution evaluator bug bash notebook (#40144)

* Add intent resolution evaluator

* updated intent_resolution evaluator logic

* Remove spurious print statements

* Address reviewers feedback

* add threshold key, update result to pass/fail rather than True/False

* Add example + remove repeated fields

* Harden check_score_is_valid function

* Sample notebook to demo intent_resolution evaluator

* Add synthetic data and section on how to test data from disk

* Update instructions.md

* Improve task adherence prompt and add sample notebook for bugbash (#40146)

* For ComputeInstance and AmlCompute update disableLocalAuth property based on ssh_public_access (#39934)

* add disableLocalAuth for computeInstance

* fix disableLocalAuthAuth issue for amlCompute

* update compute instance

* update recordings

* temp changes

* Revert "temp changes"

This reverts commit 64e3c38.

* update recordings

* fix tests

* Add resource prefix for safe secret standard alerts (#40028)

Add the prefix to identify RGs that we are creating in our TME
tenant to identify them as potentially using local auth and violating
our safe secret standards.

Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>

* Add examples to task_adherence prompt. Add Task Adherence sample notebook

* Undo changes to New-TestResources.ps1

* Add sample .env file

---------

Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>
Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>

* [AIAgentConverter] Added support for converting entire threads. (#40178)

* Implemented prepare_evaluation_data

* Add support for retrieving multiple threads into the same file.

* Parallelize thread preparing across threads.

* Set the maximum number of workers in thread pools to 10.

* Users/singankit/tool call accuracy evaluator tests (#40190)

* Raising error when tool call not found

* Adding unit tests for tool call accuracy evaluator

* Updating sample

* Fixing doc strings and moving sample to a different agent sample folder

* Fixing bug with raising exception

* Fixing rebase issue

* Removing cell outputs to fix spell check errors

* Fixing failing tests

* Fixing failing test

* Spon/update evals converter (#40204)

* Tool Call Accuracy Evaluator (#40068)

* Tool Call Accuracy Evaluator

* Review comments

* Updating score key and output structure

* Tool Call Accuracy Evaluator

* Review comments

* Updating score key and output structure

* Updating prompt

* Renaming parameter

* Converter from AI Service threads/runs to evaluator-compatible schema (#40047)

* WIP AIAgentConverter

* Added the v1 of the converter

* Updated the AIAgentConverter with different output schemas.

* ruff format

* Update the top schema to have: query, response, tool_definitions

* "agentic" is not a recognized word, change the wording.

* System message always comes first in query with multiple runs.

* Add support for getting inputs from local files with run_ids.

* Export AIAgentConverter through azure.ai.evaluation, local read updates

* Use from ._models import

* Ruff format again.

* For ComputeInstance and AmlCompute update disableLocalAuth property based on ssh_public_access (#39934)

* add disableLocalAuth for computeInstance

* fix disableLocalAuthAuth issue for amlCompute

* update compute instance

* update recordings

* temp changes

* Revert "temp changes"

This reverts commit 64e3c38.

* update recordings

* fix tests

* Simplify the API by rolling up the static methods and hiding internals.

* Lock the ._converters._ai_services behind an import error.

* Print to install azure-ai-projects if we can't import AIAgentConverter

* By default, include all previous runs' tool calls and results.

* Don't crash if there is no content in historical thread messages.

* Parallelize the calls to get step_details for each run_id.

* Addressing PR comments.

* Use a single underscore to hide internal static members.

---------

Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>

* Adding intent_resolution_evaluator to prp/agent_evaluators branch (#40065)

* Add intent resolution evaluator

* updated intent_resolution evaluator logic

* Remove spurious print statements

* Address reviewers feedback

* add threshold key, update result to pass/fail rather than True/False

* Add example + remove repeated fields

* Harden check_score_is_valid function

* Add Task Adherence and Completeness  (#40098)

* Agentic Evaluator - Response Completeness

* Added Change Log for Response Completeness Agentic Evaluator

* Task Adherence Agentic Evaluator

* Add Task Adherence Evaluator to changelog

* fixing contracts for Completeness and Task Adherence Evaluators

* Enhancing Contract for Task Adherence and Response Completeness Agentic Evaluator

* update completeness implementation.

* update the completeness evaluator response to include threshold comparison.

* updating the implementation for completeness.

* updating the type for completeness score.

* updating the parsing logic for llm output of completeness.

* updating the response dict for completeness.

* Adding Task adherence

* Adding Task Adherence evaluator with samples

* Delete old files

* updating the exception for completeness evaluator.

* Changing docstring

* Adding changelog

* Use _result_key

* Add admonition

---------

Co-authored-by: Shiprajain01 <shiprajain01@microsoft.com>
Co-authored-by: ShipraJain01 <103409614+ShipraJain01@users.noreply.github.com>
Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>

* Adding bug bash sample and instructions (#40125)

* Adding bug bash sample and instructions

* Updating instructions

* Update instructions.md

* Adding instructions and evaluator to agent evaluation sample

* add bug bash sample notebook for response completeness evaluator. (#40139)

* add bug bash sample notebook for response completeness evaluator.

* update the notebook for completeness.

---------

Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>

* Sample specific for tool call accuracy evaluator (#40135)

* Update instructions.md

* Add IntentResolution evaluator bug bash notebook (#40144)

* Add intent resolution evaluator

* updated intent_resolution evaluator logic

* Remove spurious print statements

* Address reviewers feedback

* add threshold key, update result to pass/fail rather than True/False

* Add example + remove repeated fields

* Harden check_score_is_valid function

* Sample notebook to demo intent_resolution evaluator

* Add synthetic data and section on how to test data from disk

* Update instructions.md

* Update _tool_call_accuracy.py

* Improve task adherence prompt and add sample notebook for bugbash (#40146)

* For ComputeInstance and AmlCompute update disableLocalAuth property based on ssh_public_access (#39934)

* add disableLocalAuth for computeInstance

* fix disableLocalAuthAuth issue for amlCompute

* update compute instance

* update recordings

* temp changes

* Revert "temp changes"

This reverts commit 64e3c38.

* update recordings

* fix tests

* Add resource prefix for safe secret standard alerts (#40028)

Add the prefix to identify RGs that we are creating in our TME
tenant to identify them as potentially using local auth and violating
our safe secret standards.

Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>

* Add examples to task_adherence prompt. Add Task Adherence sample notebook

* Undo changes to New-TestResources.ps1

* Add sample .env file

---------

Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>
Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>

* [AIAgentConverter] Added support for converting entire threads. (#40178)

* Implemented prepare_evaluation_data

* Add support for retrieving multiple threads into the same file.

* Parallelize thread preparing across threads.

* Set the maximum number of workers in thread pools to 10.

* Users/singankit/tool call accuracy evaluator tests (#40190)

* Raising error when tool call not found

* Adding unit tests for tool call accuracy evaluator

* Updating sample

* update output of converter for tool calls

* add built-ins

* handle file search

* remove extra files

* revert

* revert

---------

Co-authored-by: Ankit Singhal <30610298+singankit@users.noreply.github.com>
Co-authored-by: Sandy <16922860+thecsw@users.noreply.github.com>
Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>
Co-authored-by: Jose Santos <jcsantos@microsoft.com>
Co-authored-by: ghyadav <103428325+ghyadav@users.noreply.github.com>
Co-authored-by: Shiprajain01 <shiprajain01@microsoft.com>
Co-authored-by: ShipraJain01 <103409614+ShipraJain01@users.noreply.github.com>
Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>
Co-authored-by: Ankit Singhal <anksing@microsoft.com>
Co-authored-by: Chandra Sekhar Gupta <38103118+guptha23@users.noreply.github.com>
Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>
Co-authored-by: spon <stevenpon@microsoft.com>

* Updating tool call accuracy to use update tool call schema

* Update response completeness evaluator based on schema (#40214)

* add seed parameter for deterministic results and update completeness return type based on schema.

* add unit tests for response completeness.

* update completeness key to response completeness.

* fixing the unit test for updated completeness key.

* updating completeness to responsecompleteness evaluator.

* clearing output in response completeness sample notebook.

* clearing output in response completeness sample notebook.

---------

Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>

* Spon/update evals converter (#40215)

* Tool Call Accuracy Evaluator (#40068)

* Tool Call Accuracy Evaluator

* Review comments

* Updating score key and output structure

* Tool Call Accuracy Evaluator

* Review comments

* Updating score key and output structure

* Updating prompt

* Renaming parameter

* Converter from AI Service threads/runs to evaluator-compatible schema (#40047)

* WIP AIAgentConverter

* Added the v1 of the converter

* Updated the AIAgentConverter with different output schemas.

* ruff format

* Update the top schema to have: query, response, tool_definitions

* "agentic" is not a recognized word, change the wording.

* System message always comes first in query with multiple runs.

* Add support for getting inputs from local files with run_ids.

* Export AIAgentConverter through azure.ai.evaluation, local read updates

* Use from ._models import

* Ruff format again.

* For ComputeInstance and AmlCompute update disableLocalAuth property based on ssh_public_access (#39934)

* add disableLocalAuth for computeInstance

* fix disableLocalAuthAuth issue for amlCompute

* update compute instance

* update recordings

* temp changes

* Revert "temp changes"

This reverts commit 64e3c38.

* update recordings

* fix tests

* Simplify the API by rolling up the static methods and hiding internals.

* Lock the ._converters._ai_services behind an import error.

* Print to install azure-ai-projects if we can't import AIAgentConverter

* By default, include all previous runs' tool calls and results.

* Don't crash if there is no content in historical thread messages.

* Parallelize the calls to get step_details for each run_id.

* Addressing PR comments.

* Use a single underscore to hide internal static members.

---------

Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>

* Adding intent_resolution_evaluator to prp/agent_evaluators branch (#40065)

* Add intent resolution evaluator

* updated intent_resolution evaluator logic

* Remove spurious print statements

* Address reviewers feedback

* add threshold key, update result to pass/fail rather than True/False

* Add example + remove repeated fields

* Harden check_score_is_valid function

* Add Task Adherence and Completeness  (#40098)

* Agentic Evaluator - Response Completeness

* Added Change Log for Response Completeness Agentic Evaluator

* Task Adherence Agentic Evaluator

* Add Task Adherence Evaluator to changelog

* fixing contracts for Completeness and Task Adherence Evaluators

* Enhancing Contract for Task Adherence and Response Completeness Agentic Evaluator

* update completeness implementation.

* update the completeness evaluator response to include threshold comparison.

* updating the implementation for completeness.

* updating the type for completeness score.

* updating the parsing logic for llm output of completeness.

* updating the response dict for completeness.

* Adding Task adherence

* Adding Task Adherence evaluator with samples

* Delete old files

* updating the exception for completeness evaluator.

* Changing docstring

* Adding changelog

* Use _result_key

* Add admonition

---------

Co-authored-by: Shiprajain01 <shiprajain01@microsoft.com>
Co-authored-by: ShipraJain01 <103409614+ShipraJain01@users.noreply.github.com>
Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>

* Adding bug bash sample and instructions (#40125)

* Adding bug bash sample and instructions

* Updating instructions

* Update instructions.md

* Adding instructions and evaluator to agent evaluation sample

* add bug bash sample notebook for response completeness evaluator. (#40139)

* add bug bash sample notebook for response completeness evaluator.

* update the notebook for completeness.

---------

Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>

* Sample specific for tool call accuracy evaluator (#40135)

* Update instructions.md

* Add IntentResolution evaluator bug bash notebook (#40144)

* Add intent resolution evaluator

* updated intent_resolution evaluator logic

* Remove spurious print statements

* Address reviewers feedback

* add threshold key, update result to pass/fail rather than True/False

* Add example + remove repeated fields

* Harden check_score_is_valid function

* Sample notebook to demo intent_resolution evaluator

* Add synthetic data and section on how to test data from disk

* Update instructions.md

* Update _tool_call_accuracy.py

* Improve task adherence prompt and add sample notebook for bugbash (#40146)

* For ComputeInstance and AmlCompute update disableLocalAuth property based on ssh_public_access (#39934)

* add disableLocalAuth for computeInstance

* fix disableLocalAuthAuth issue for amlCompute

* update compute instance

* update recordings

* temp changes

* Revert "temp changes"

This reverts commit 64e3c38.

* update recordings

* fix tests

* Add resource prefix for safe secret standard alerts (#40028)

Add the prefix to identify RGs that we are creating in our TME
tenant to identify them as potentially using local auth and violating
our safe secret standards.

Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>

* Add examples to task_adherence prompt. Add Task Adherence sample notebook

* Undo changes to New-TestResources.ps1

* Add sample .env file

---------

Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>
Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>

* [AIAgentConverter] Added support for converting entire threads. (#40178)

* Implemented prepare_evaluation_data

* Add support for retrieving multiple threads into the same file.

* Parallelize thread preparing across threads.

* Set the maximum number of workers in thread pools to 10.

* Users/singankit/tool call accuracy evaluator tests (#40190)

* Raising error when tool call not found

* Adding unit tests for tool call accuracy evaluator

* Updating sample

* update output of converter for tool calls

* add built-ins

* handle file search

* remove extra files

* revert

* revert

* fix built-in tool parsing bug

* remove local debug

* Formatted and updated the converter to avoid built-in tool crashes.

* Added an experimental decorator to AIAgentConverter

* Update import path for experimental decorator

---------

Co-authored-by: Ankit Singhal <30610298+singankit@users.noreply.github.com>
Co-authored-by: Sandy <16922860+thecsw@users.noreply.github.com>
Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>
Co-authored-by: Jose Santos <jcsantos@microsoft.com>
Co-authored-by: ghyadav <103428325+ghyadav@users.noreply.github.com>
Co-authored-by: Shiprajain01 <shiprajain01@microsoft.com>
Co-authored-by: ShipraJain01 <103409614+ShipraJain01@users.noreply.github.com>
Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>
Co-authored-by: Ankit Singhal <anksing@microsoft.com>
Co-authored-by: Chandra Sekhar Gupta <38103118+guptha23@users.noreply.github.com>
Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>
Co-authored-by: spon <stevenpon@microsoft.com>
Co-authored-by: Sandy Urazayev <surazayev@microsoft.com>

* Adding instructions file

* Updating default scores

* Fix test case for non-agent tool call

---------

Co-authored-by: Sandy <16922860+thecsw@users.noreply.github.com>
Co-authored-by: Prashant Dhote <168401122+pdhotems@users.noreply.github.com>
Co-authored-by: Jose Santos <jcsantos@microsoft.com>
Co-authored-by: ghyadav <103428325+ghyadav@users.noreply.github.com>
Co-authored-by: Shiprajain01 <shiprajain01@microsoft.com>
Co-authored-by: ShipraJain01 <103409614+ShipraJain01@users.noreply.github.com>
Co-authored-by: Chandra Sekhar Gupta Aravapalli <caravapalli@microsoft.com>
Co-authored-by: Chandra Sekhar Gupta <38103118+guptha23@users.noreply.github.com>
Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com>
Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com>
Co-authored-by: stevepon <steven@ponshop.net>
Co-authored-by: spon <stevenpon@microsoft.com>
Co-authored-by: Sandy Urazayev <surazayev@microsoft.com>
Labels
Central-EngSys This issue is owned by the Engineering System team. EngSys This issue is impacting the engineering system.
2 participants