[Docs] Add/improve error code documentation of thrown `GenkitError`s + retry handling/strategy

**Is your report related to a problem? Please describe.**

I'm trying to determine **_which errors should be automatically retried_**.

My processing flow is quite complicated, so:
1. ✅ I want to **retry locally wherever possible**, so that earlier steps are not rerun when they don't need to be. However,
2. ❌ for errors where a retry makes no sense (`validation error`, `unauthorized`, `invalid parameter`), I want to **skip retries**. Otherwise the flow could rerun previous steps unnecessarily and waste resources.
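To make the goal concrete, here is a minimal sketch of the local retry wrapper I have in mind. Everything in it is my own assumption rather than documented Genkit behaviour: the names `isRetryable` and `retryLocally` are hypothetical, I assume thrown errors carry a string `status` field, and the list of non-retryable statuses (including `INVALID_ARGUMENT`) is a guess that the requested documentation would confirm or correct.

```typescript
// Hypothetical sketch: none of these names come from the genkit package.

// Assumption: thrown errors expose a string `status` field.
interface StatusError extends Error {
  status?: string;
}

// My guess at statuses for which a retry is pointless (not documented behaviour).
const NON_RETRYABLE = new Set([
  "INVALID_ARGUMENT",
  "UNAUTHENTICATED",
  "PERMISSION_DENIED",
]);

// Retry by default; skip only when the status is on the non-retryable list.
function isRetryable(err: unknown): boolean {
  const status = (err as StatusError | null | undefined)?.status;
  return typeof status !== "string" || !NON_RETRYABLE.has(status);
}

// Reruns only the failing step, with exponential backoff between attempts,
// so earlier steps of the flow are never repeated.
async function retryLocally<T>(
  step: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await step();
    } catch (err) {
      lastError = err;
      // Give up immediately on non-retryable errors or on the last attempt.
      if (!isRetryable(err) || attempt === maxAttempts) break;
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

The weak point is `NON_RETRYABLE`: without documented status semantics, I can only guess which entries belong there.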

_______________________________________

**Is your report related to a suggestion/improvement? Please describe.**

💡 Yes — I'd like to kindly ask the _devs_ to **improve the error documentation**, with clear and structured explanations. I think this is quite a common scenario when implementing more complex workflows. 

I'll break down my suggestion for the 2 major error types involved: `GenerationResponseError` and `ValidationError`.

_______________________________________

**Additional context**

_______________________________________

***⚠️ `GenerationResponseError`***

**Please explain the `status` codes in detail.** 

Some of them seem straightforward:
- `"UNAUTHENTICATED"` - probably _`401 Unauthorized`_
- `"PERMISSION_DENIED"` - probably _`403 Forbidden`_
- `"RESOURCE_EXHAUSTED"` - probably _`429 Too Many Requests`_

🔁 Are these 1:1 mappings to HTTP status codes? Or are these statuses used in other cases?

Other codes are more ambiguous:
- `"OUT_OF_RANGE"` - is it for 
    - _`400 Bad Request`_ (an incorrect parameter), or 
    - _`416 Range Not Satisfiable`_ (a bad offset when downloading a resource), or
    - some other specific case?

- `"INTERNAL"` - is this for 
    - _`500 Internal Server Error`_ on the LLM backend, or
    - a local/internal failure in the `genkit` module?

💭 Overall: More clarity would help with programmatic error handling and retry logic.
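For illustration, this is the kind of lookup table I would like to be able to write from the docs. It is only a sketch built on my guesses: the status names look like the canonical gRPC codes, and the HTTP numbers below follow the conventional gRPC-to-HTTP mapping, but whether Genkit actually follows that mapping is exactly what I'm asking. The `handling` column and the names `STATUS_GUESSES` and `handlingFor` are hypothetical.

```typescript
// Hypothetical sketch: a guessed classification, not documented Genkit behaviour.
type Handling = "retry" | "abort" | "unclear";

interface StatusGuess {
  http: number; // HTTP code under the conventional gRPC-to-HTTP mapping (assumed to apply)
  handling: Handling; // my guess at the right reaction
}

const STATUS_GUESSES = new Map<string, StatusGuess>([
  ["UNAUTHENTICATED", { http: 401, handling: "abort" }],
  ["PERMISSION_DENIED", { http: 403, handling: "abort" }],
  ["RESOURCE_EXHAUSTED", { http: 429, handling: "retry" }],
  // Treated as a bad-parameter error here; the 416 reading would change nothing for retries.
  ["OUT_OF_RANGE", { http: 400, handling: "abort" }],
  // Backend failure (retry) or local genkit failure (abort)? I can't tell.
  ["INTERNAL", { http: 500, handling: "unclear" }],
]);

// Anything not in the table is "unclear" until it is documented.
function handlingFor(status: string): Handling {
  return STATUS_GUESSES.get(status)?.handling ?? "unclear";
}
```

Every `"unclear"` result is a case where my code currently has to guess.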

_______________________________________

***🧪 `ValidationError`***

It would be important to distinguish whether a validation failure happened against an input schema or an output schema (both for prompts and flows):

- 🟩 **Output validation error** (e.g. malformed LLM response) 
    ➤ This is likely a random erroneous output from the LLM → **should** be retried.
- 🟥 **Input validation error** (e.g. invalid prompt or parameters) 
    ➤ This usually signals a **bug** → should **not** be retried, but should crash early.
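As a workaround sketch, the distinction above could be made by validating input and output in separate steps and tagging the error with the phase that failed. This is not how Genkit reports validation failures today, as far as I can tell. The `phase` field and the names `makeValidationFailure`, `runValidated`, and `shouldRetryValidation` are all hypothetical, and the plain type-guard validators stand in for real schemas.

```typescript
// Hypothetical sketch: tags validation failures with the phase that failed.
type Phase = "input" | "output";
type Validator<T> = (value: unknown) => value is T;

// Builds an Error carrying a machine-readable `phase` field (assumed shape).
function makeValidationFailure(phase: Phase, message: string): Error & { phase: Phase } {
  return Object.assign(new Error(message), { phase });
}

// Validates the input before the step runs and the output after it returns.
async function runValidated<I, O>(
  rawInput: unknown,
  isInput: Validator<I>,
  isOutput: Validator<O>,
  step: (input: I) => Promise<unknown>,
): Promise<O> {
  if (!isInput(rawInput)) {
    // Likely a bug in the caller: fail fast, no retry.
    throw makeValidationFailure("input", "input failed schema check");
  }
  const rawOutput = await step(rawInput);
  if (!isOutput(rawOutput)) {
    // Likely a one-off malformed model response: worth retrying.
    throw makeValidationFailure("output", "output failed schema check");
  }
  return rawOutput;
}

// Only output-phase validation failures are retried.
function shouldRetryValidation(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { phase?: unknown }).phase === "output"
  );
}
```

If `ValidationError` exposed something like this `phase` field itself, the wrapper would not be needed.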

_______________________________________

**🧵 Final Thought**

Yes, I know that the error objects contain human-readable details and messages. But that doesn't help with **programmatic decision-making** — especially because the more specific info (`detail`) has no specified structure.

📌 I think **structured, documented guidance** on what different errors mean and how to handle them (retry, abort, etc.) would massively improve DX and stability.

Thanks for the help. 🙏
