demo: an easy to use catalog loader #1372

Open: wants to merge 1 commit into base: main
38 changes: 9 additions & 29 deletions Cargo.lock

21 changes: 19 additions & 2 deletions crates/catalog/rest/src/catalog.rs
@@ -19,13 +19,14 @@

use std::collections::HashMap;
use std::str::FromStr;
use std::sync::Arc;

use async_trait::async_trait;
use iceberg::io::FileIO;
use iceberg::table::Table;
use iceberg::{
Catalog, Error, ErrorKind, Namespace, NamespaceIdent, Result, TableCommit, TableCreation,
TableIdent,
Catalog, CatalogLoader, Error, ErrorKind, Namespace, NamespaceIdent, Result, TableCommit,
TableCreation, TableIdent,
};
use itertools::Itertools;
use reqwest::header::{
@@ -320,6 +321,22 @@ impl RestCatalog {
}
}

#[async_trait]
impl CatalogLoader for RestCatalog {
async fn load(mut properties: HashMap<String, String>) -> Result<Arc<dyn Catalog>> {
let uri = properties
.remove("uri")
.ok_or_else(|| Error::new(ErrorKind::DataInvalid, "Missing required property `uri`"))?;

let config = RestCatalogConfig::builder()
.uri(uri)
.props(properties)
.build();

Ok(Arc::new(RestCatalog::new(config)))
}
}

/// All requests and expected responses are derived from the REST catalog API spec:
/// https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml
#[async_trait]
15 changes: 14 additions & 1 deletion crates/iceberg/src/catalog/mod.rs
@@ -21,6 +21,7 @@ use std::collections::HashMap;
use std::fmt::{Debug, Display};
use std::mem::take;
use std::ops::Deref;
use std::sync::Arc;

use _serde::deserialize_snapshot;
use async_trait::async_trait;
@@ -36,7 +37,19 @@ use crate::spec::{
use crate::table::Table;
use crate::{Error, ErrorKind, Result};

/// The CatalogLoader trait is used to load a catalog from a given name and properties.
#[async_trait]
pub trait CatalogLoader: Debug + Send + Sync {
/// Load a catalog from the given name and properties.
async fn load(properties: HashMap<String, String>) -> Result<Arc<dyn Catalog>>;
Contributor

Suggested change
async fn load(properties: HashMap<String, String>) -> Result<Arc<dyn Catalog>>;
async fn load(name: String, properties: HashMap<String, String>) -> Result<Arc<dyn Catalog>>;

Contributor

There are two problems with this approach:

  1. This is not easy to use when we know the concrete type of the catalog, for example when the user just wants to create a RestCatalog: it forces the user to downcast. Access to the concrete type is useful when the catalog has some extra functionality.
  2. This is not easy to use when the catalog builder has an advanced builder method, see https://github.com/apache/iceberg-rust/pull/1231/files#r2106848332

I think I can simplify the methods in my original proposal to look like yours, while keeping everything else the same. WDYT?
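
To make the first point concrete, here is a minimal, dependency-free sketch of what a caller faces when the loader only returns `Arc<dyn Catalog>`. Everything in it is a stand-in and an assumption rather than the real iceberg API: the `Catalog` trait is reduced to a hypothetical `as_any` hook, `RestCatalog::endpoint` is an invented REST-only method, and `load` is synchronous with a `String` error so the block compiles with the standard library alone.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for `iceberg::Catalog`. The `as_any` hook is hypothetical and
/// exists only so this sketch can downcast.
trait Catalog: Send + Sync {
    fn as_any(&self) -> &dyn Any;
}

/// Stand-in for the REST catalog, reduced to the one field the sketch needs.
struct RestCatalog {
    uri: String,
}

impl RestCatalog {
    /// Hypothetical REST-only method that is not part of the `Catalog` trait.
    fn endpoint(&self) -> &str {
        &self.uri
    }
}

impl Catalog for RestCatalog {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Type-erased loader shaped like `CatalogLoader::load` in this diff
/// (synchronous, with a `String` error, to stay dependency-free).
fn load(mut properties: HashMap<String, String>) -> Result<Arc<dyn Catalog>, String> {
    let uri = properties
        .remove("uri")
        .ok_or_else(|| "Missing required property `uri`".to_string())?;
    let catalog: Arc<dyn Catalog> = Arc::new(RestCatalog { uri });
    Ok(catalog)
}

/// Reaching the REST-only method through the loader requires a downcast,
/// which can fail at runtime if the loaded catalog is some other type.
fn endpoint_via_loader(uri: &str) -> Option<String> {
    let properties = HashMap::from([("uri".to_string(), uri.to_string())]);
    let catalog = load(properties).ok()?;
    let endpoint = catalog
        .as_any()
        .downcast_ref::<RestCatalog>()
        .map(|rest| rest.endpoint().to_string());
    endpoint
}
```

The caller already knows it asked for a REST catalog, yet it still has to handle the `None` branch of the downcast. A typed constructor would return `RestCatalog` directly and avoid that step.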

Member Author

Makes sense to me!
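
One possible reading of the direction agreed on in this thread is a typed builder that returns the concrete catalog, with the type-erased loader layered on top of it. The sketch below is an assumption, not the original proposal: the shape of `CatalogBuilder`, the `with_timeout_secs` option, and the `load_dyn` helper are all hypothetical, and the code is synchronous with `String` errors so it needs only the standard library.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for `iceberg::Catalog`, reduced to a single method.
trait Catalog: Send + Sync {
    fn name(&self) -> &str;
}

/// Hypothetical typed builder: `load` returns the concrete catalog type,
/// so callers who know the type never need to downcast.
trait CatalogBuilder: Default {
    type C: Catalog + 'static;
    fn load(self, name: String, props: HashMap<String, String>) -> Result<Self::C, String>;
}

struct RestCatalog {
    name: String,
    uri: String,
    timeout_secs: Option<u64>,
}

impl Catalog for RestCatalog {
    fn name(&self) -> &str {
        &self.name
    }
}

#[derive(Default)]
struct RestCatalogBuilder {
    timeout_secs: Option<u64>,
}

impl RestCatalogBuilder {
    /// Hypothetical advanced option that only the typed path can reach.
    fn with_timeout_secs(mut self, secs: u64) -> Self {
        self.timeout_secs = Some(secs);
        self
    }
}

impl CatalogBuilder for RestCatalogBuilder {
    type C = RestCatalog;

    fn load(self, name: String, mut props: HashMap<String, String>) -> Result<RestCatalog, String> {
        let uri = props
            .remove("uri")
            .ok_or_else(|| "Missing required property `uri`".to_string())?;
        Ok(RestCatalog { name, uri, timeout_secs: self.timeout_secs })
    }
}

/// Type-erased entry point built on any typed builder.
fn load_dyn<B: CatalogBuilder>(
    name: &str,
    props: HashMap<String, String>,
) -> Result<Arc<dyn Catalog>, String> {
    let catalog = B::default().load(name.to_string(), props)?;
    let erased: Arc<dyn Catalog> = Arc::new(catalog);
    Ok(erased)
}
```

Under these assumptions, code that knows it wants a REST catalog calls `RestCatalogBuilder::default().with_timeout_secs(30).load(...)` and gets a `RestCatalog` back, while generic code calls `load_dyn::<RestCatalogBuilder>(...)` and works with `Arc<dyn Catalog>`.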

}

/// The catalog API for Iceberg Rust.
///
/// Users will have two ways to construct a catalog:
///
/// - Use `CatalogLoader` to load a catalog from a name and properties.
/// - Use the `CatalogBuilder` provided by the catalog implementer to build a catalog in a strongly typed way.
#[async_trait]
pub trait Catalog: Debug + Sync + Send {
/// List namespaces inside the catalog.
@@ -2091,7 +2104,7 @@ mod tests {
{
"action": "remove-schemas",
"schema-ids": [1, 2]
}
}
"#,
TableUpdate::RemoveSchemas {
schema_ids: vec![1, 2],
5 changes: 2 additions & 3 deletions crates/iceberg/src/lib.rs
@@ -61,10 +61,9 @@ mod error;
pub use error::{Error, ErrorKind, Result};

mod catalog;

pub use catalog::{
Catalog, Namespace, NamespaceIdent, TableCommit, TableCreation, TableIdent, TableRequirement,
TableUpdate, ViewCreation,
Catalog, CatalogLoader, Namespace, NamespaceIdent, TableCommit, TableCreation, TableIdent,
TableRequirement, TableUpdate, ViewCreation,
};

pub mod table;