[FEATURE] Refactor Error Handling: Replace anyhow with structured thiserror to support Host Exception Tunneling #1377

@georgeh0

Context & Motivation

Currently, our Core library relies on anyhow for error handling. While convenient, this causes a major issue when interfacing with the Python runtime (PyO3).

The Problem:
When a user-defined Python callback raises an exception (e.g., ValueError), it passes through our Rust Core. Because anyhow type-erases errors into a generic dynamic error object, the original Python exception is effectively lost. When the error surfaces back to the top-level Python code, the user receives a generic RuntimeError containing a stringified message, losing the original exception type and traceback.

The Goal:
We need to refactor the error handling to use thiserror with a type-erased "Host Error" variant. This will allow us to:

  1. "Tunnel" Host Exceptions: Preserve the exact PyErr (or future JsValue) object through the Rust Core as a trait object and re-raise it at the boundary.
  2. Avoid Generic Pollution: Use dynamic dispatch (Box<dyn HostError>) so core functions do not require generic type parameters.
  3. Structured Errors: Distinguish between "Client/Validation" errors (which should look like API errors) and "Internal" errors.
  4. Support Backtraces & Context: Automatically capture stack traces and support anyhow-style context wrapping.

The Proposed Design

We will replace anyhow::Result<T> with cocoindex_utils::Result<T>.

1. The Host Error Trait

In cocoindex_utils, we define a trait that any host language exception (Python PyErr, JS Error) must implement to pass through our core.

use std::any::Any;
use std::fmt::{Debug, Display};

// This trait allows us to store the error, print it, and downcast it later.
pub trait HostError: Debug + Display + Send + Sync + 'static {
    // Required for downcasting back to the concrete type (e.g. PyErr)
    fn as_any(&self) -> &dyn Any;
}
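As a rough illustration of the trait in use, the sketch below erases a host exception into a Box<dyn HostError> and recovers it by downcasting. FakePyErr, erase, and recover are hypothetical names used only here; FakePyErr stands in for PyErr so the example compiles without PyO3.

```rust
use std::any::Any;
use std::fmt::{self, Debug, Display};

// The trait from the proposal, repeated so this sketch is self-contained.
pub trait HostError: Debug + Display + Send + Sync + 'static {
    fn as_any(&self) -> &dyn Any;
}

// Hypothetical stand-in for a host exception such as PyErr.
#[derive(Debug)]
pub struct FakePyErr {
    pub exc_type: String,
    pub message: String,
}

impl Display for FakePyErr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.exc_type, self.message)
    }
}

impl HostError for FakePyErr {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Erase the concrete type, as the core would when storing the error.
pub fn erase(e: impl HostError) -> Box<dyn HostError> {
    Box::new(e)
}

// Recover the concrete type at the boundary; None means some other host type.
pub fn recover(e: &dyn HostError) -> Option<&FakePyErr> {
    e.as_any().downcast_ref::<FakePyErr>()
}
```

The core only ever sees Box<dyn HostError>, so none of its signatures need a generic parameter for the host error type.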

2. The Core Error Enum

The Error enum uses the HostError trait object for the tunnel.

use thiserror::Error;
use std::backtrace::Backtrace;

#[derive(Error, Debug)]
pub enum Error {
    // 1. Context Wrapper (replaces anyhow::Context)
    // Allows us to wrap errors with strings but still "drill down" to the cause later.
    #[error("{msg}")]
    Context {
        msg: String,
        #[source]
        source: Box<Error>,
    },

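    // 2. The Host Tunnel (Type Erased)
    // Holds the native PyErr (or JsValue) via the trait object.
    // `transparent` is not usable here: it needs a field that implements
    // std::error::Error, which Box<dyn HostError> does not.
    #[error("{0}")]
    HostLang(Box<dyn HostError>),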

    // 3. Client / API Errors
    // For validation issues.
    #[error("Invalid Request: {msg}")]
    Client {
        msg: String,
        // Captured manually via helper constructor. thiserror's docs note that
        // `Backtrace` fields currently require a nightly compiler.
        backtrace: Backtrace,
    },

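    // 4. Internal / Implementation Errors
    // Catch-all for IO, DB, Serde errors.
    // `transparent` accepts exactly one field, so the message is formatted
    // from `source` explicitly to leave room for the backtrace.
    #[error("{source}")]
    Internal {
        source: Box<dyn std::error::Error + Send + Sync>,
        backtrace: Backtrace,
    },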
}

// Type alias
pub type Result<T> = std::result::Result<T, Error>;

3. Helper Traits & Constructors

To maintain developer ergonomics, we need specific helpers:

  • Constructors:

    impl Error {
        pub fn host(e: impl HostError) -> Self {
            Self::HostLang(Box::new(e))
        }
    
        pub fn client(msg: impl Into<String>) -> Self {
            Self::Client {
                msg: msg.into(),
                backtrace: Backtrace::capture(),
            }
        }
    }
  • IntoInternal Trait: Allows converting generic errors (IO, Serde) to Error::Internal using method syntax.

    // Usage: std::fs::read(path).internal()?
    pub trait IntoInternal<T> {
        fn internal(self) -> Result<T>;
    }
  • ContextExt Trait: Replicates anyhow's .context() and .with_context() behavior for our new Result type.
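One possible shape for the two helper traits is sketched below. To compile with the standard library alone, it uses a reduced Error enum (only Context and Internal, no backtrace) with hand-written Display and Error impls in place of the thiserror derive. The parse_port function is a hypothetical caller added for illustration.

```rust
use std::error::Error as StdError;
use std::fmt;

// Reduced version of the proposed enum, just enough for the helpers.
#[derive(Debug)]
pub enum Error {
    Context { msg: String, source: Box<Error> },
    Internal { source: Box<dyn StdError + Send + Sync> },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Context { msg, .. } => write!(f, "{msg}"),
            Error::Internal { source } => write!(f, "{source}"),
        }
    }
}

impl StdError for Error {
    fn source(&self) -> Option<&(dyn StdError + 'static)> {
        let cause: &(dyn StdError + 'static) = match self {
            Error::Context { source, .. } => &**source,
            Error::Internal { source } => &**source,
        };
        Some(cause)
    }
}

pub type Result<T> = std::result::Result<T, Error>;

pub trait IntoInternal<T> {
    fn internal(self) -> Result<T>;
}

// Any std-compatible error can be boxed into Error::Internal.
impl<T, E> IntoInternal<T> for std::result::Result<T, E>
where
    E: StdError + Send + Sync + 'static,
{
    fn internal(self) -> Result<T> {
        self.map_err(|e| Error::Internal { source: Box::new(e) })
    }
}

pub trait ContextExt<T> {
    fn context(self, msg: impl Into<String>) -> Result<T>;
    fn with_context<F, S>(self, f: F) -> Result<T>
    where
        F: FnOnce() -> S,
        S: Into<String>;
}

// Context wraps an existing Error, keeping it reachable as the source.
impl<T> ContextExt<T> for Result<T> {
    fn context(self, msg: impl Into<String>) -> Result<T> {
        self.map_err(|e| Error::Context { msg: msg.into(), source: Box::new(e) })
    }

    fn with_context<F, S>(self, f: F) -> Result<T>
    where
        F: FnOnce() -> S,
        S: Into<String>,
    {
        // The closure only runs on the error path, as with anyhow.
        self.map_err(|e| Error::Context { msg: f().into(), source: Box::new(e) })
    }
}

// Hypothetical caller: a std error becomes Internal, then gains context.
pub fn parse_port(raw: &str) -> Result<u16> {
    raw.trim()
        .parse::<u16>()
        .internal()
        .with_context(|| format!("invalid port {raw:?}"))
}
```

Because Error::Context boxes another Error rather than a plain string, the boundary code can still walk down to a HostLang variant after any number of context layers.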

4. Macros

We should provide macros to reduce boilerplate for common error patterns:

  • client_error!("msg", ...) builds an Error::client(...) value, and client_bail!("msg", ...) returns early with Err(Error::client(...)). These replace the existing api_error! and api_bail! macros.
  • internal_error!("msg", ...) builds an Error::Internal { ... } value from a string error boxed as dyn Error, and internal_bail!("msg", ...) returns early with that value wrapped in Err.
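The macros could be written with macro_rules! roughly as follows. This is a sketch under assumptions: the Error enum is reduced to two variants without backtraces, Error::internal_msg is a hypothetical constructor, and check_batch_size is an invented caller. A real crate would likely export the macros and refer to the type through a $crate:: path.

```rust
use std::error::Error as StdError;
use std::fmt;

// Reduced version of the proposed enum (no backtraces) for illustration.
#[derive(Debug)]
pub enum Error {
    Client { msg: String },
    Internal { source: Box<dyn StdError + Send + Sync> },
}

impl Error {
    pub fn client(msg: impl Into<String>) -> Self {
        Error::Client { msg: msg.into() }
    }

    // Hypothetical helper: boxes a plain string as the internal error source.
    pub fn internal_msg(msg: impl Into<String>) -> Self {
        let text: String = msg.into();
        Error::Internal { source: text.into() }
    }
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Client { msg } => write!(f, "Invalid Request: {msg}"),
            Error::Internal { source } => write!(f, "{source}"),
        }
    }
}

// The *_error! macros build a value; the *_bail! macros return it as Err.
macro_rules! client_error {
    ($($arg:tt)*) => { Error::client(format!($($arg)*)) };
}

macro_rules! client_bail {
    ($($arg:tt)*) => { return Err(client_error!($($arg)*)) };
}

macro_rules! internal_error {
    ($($arg:tt)*) => { Error::internal_msg(format!($($arg)*)) };
}

macro_rules! internal_bail {
    ($($arg:tt)*) => { return Err(internal_error!($($arg)*)) };
}

// Hypothetical caller showing both bail macros.
pub fn check_batch_size(n: usize) -> Result<usize, Error> {
    if n == 0 {
        client_bail!("batch size must be positive, got {}", n);
    }
    if n > 1024 {
        internal_bail!("batch size {} exceeds buffer capacity", n);
    }
    Ok(n)
}
```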

Python Bindings Implementation

In the PyO3 binding crate, we must implement the bridge between PyErr and HostError.

  1. Define a Wrapper:

    #[derive(Debug)]
    struct PyErrWrapper(PyErr);
    
    // Implement Display to delegate to PyErr...
    
    impl cocoindex_utils::HostError for PyErrWrapper {
        fn as_any(&self) -> &dyn std::any::Any { self }
    }
  2. Implement recursive unwrapping:
    When implementing From<cocoindex_utils::Error> for PyErr:

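    • Loop through Error::Context layers, following source each time.
    • If Error::HostLang(boxed) is found, call boxed.as_any().downcast_ref::<PyErrWrapper>().
    • If the downcast succeeds, return the inner PyErr. Because as_any only yields a shared reference, this means cloning it (PyErr::clone_ref, which needs the GIL).
    • Otherwise, fall back to converting the error to PyValueError (Client) or PyRuntimeError (Internal).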
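The unwrapping steps above can be sketched without PyO3 as follows. FakePyErr stands in for PyErrWrapper, and the Raised enum is a hypothetical model of what the boundary would raise; neither is part of the proposal. The Error enum is reduced to three variants, and the real conversion would live in a From impl rather than a free function.

```rust
use std::any::Any;
use std::fmt::{self, Debug, Display};

pub trait HostError: Debug + Display + Send + Sync + 'static {
    fn as_any(&self) -> &dyn Any;
}

// Reduced version of the proposed enum.
#[derive(Debug)]
pub enum Error {
    Context { msg: String, source: Box<Error> },
    HostLang(Box<dyn HostError>),
    Client { msg: String },
}

// Hypothetical stand-in for PyErrWrapper.
#[derive(Debug, Clone, PartialEq)]
pub struct FakePyErr {
    pub exc_type: String,
}

impl Display for FakePyErr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.exc_type)
    }
}

impl HostError for FakePyErr {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Hypothetical model of what the binding layer would raise in Python.
#[derive(Debug, PartialEq)]
pub enum Raised {
    Original(FakePyErr),
    ValueError(String),
    RuntimeError(String),
}

pub fn to_host(err: &Error) -> Raised {
    // Peel off every Context layer to reach the root cause.
    let mut current = err;
    while let Error::Context { source, .. } = current {
        current = &**source;
    }
    match current {
        // Tunnel hit: re-raise the original host exception if it is ours.
        Error::HostLang(boxed) => match boxed.as_any().downcast_ref::<FakePyErr>() {
            Some(inner) => Raised::Original(inner.clone()),
            None => Raised::RuntimeError(boxed.to_string()),
        },
        Error::Client { msg } => Raised::ValueError(msg.clone()),
        // Not reached: the loop above strips all Context layers.
        Error::Context { msg, .. } => Raised::RuntimeError(msg.clone()),
    }
}
```

The stand-in derives Clone, which the real PyErr does not. In the actual binding the equivalent step would presumably be PyErr::clone_ref under the GIL, or the trait could offer an owned downcast instead.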

Tasks

  • Define HostError Trait: Create the trait with as_any.
  • Define cocoindex_utils::Error: Create the enum using Box<dyn HostError>.
  • Define cocoindex_utils::Result<T>: Create the type alias.
  • Implement Helpers:
    • Error::host(e) constructor.
    • Error::client(msg) constructor.
    • IntoInternal for std::result::Result.
    • ContextExt for cocoindex_utils::Result.
  • Add Macros: Implement client_error!/client_bail! and internal_error!/internal_bail!.
  • Refactor Core Signatures: Update functions in Core to return cocoindex_utils::Result<T>.
  • Update Python Bindings:
    • Create struct PyErrWrapper(PyErr) implementing HostError.
    • Implement From<Error> for PyErr with recursive unwrapping and downcasting logic.
  • Tests:
    • Context Population: Verify that context added via .context("msg") is preserved in the error chain (e.g. when printing Debug or Display in Rust).
    • Host Tunneling: Add a test case where a Python callback raises a specific custom exception, and verify the library returns that exact custom exception class.
    • Backtraces: Verify backtraces are captured for internal/client errors.
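One detail matters for the backtrace test: Backtrace::capture() only records frames when RUST_BACKTRACE or RUST_LIB_BACKTRACE enables it, so a test either sets that variable or uses Backtrace::force_capture(). A minimal sketch follows, where ClientError, new_forced, and was_captured are hypothetical names standing in for the Client variant and its helpers.

```rust
use std::backtrace::{Backtrace, BacktraceStatus};

// Hypothetical stand-in for the Client variant of the proposed enum.
#[derive(Debug)]
pub struct ClientError {
    pub msg: String,
    pub backtrace: Backtrace,
}

impl ClientError {
    // Mirrors Error::client from the proposal: capture depends on the
    // RUST_BACKTRACE / RUST_LIB_BACKTRACE environment variables.
    pub fn new(msg: impl Into<String>) -> Self {
        Self { msg: msg.into(), backtrace: Backtrace::capture() }
    }

    // Hypothetical constructor that captures regardless of the environment,
    // which makes the assertion in a test deterministic.
    pub fn new_forced(msg: impl Into<String>) -> Self {
        Self { msg: msg.into(), backtrace: Backtrace::force_capture() }
    }
}

// True when frames were actually recorded.
pub fn was_captured(bt: &Backtrace) -> bool {
    matches!(bt.status(), BacktraceStatus::Captured)
}
```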
