Split DataProvider into ResourceProvider and DynProvider #1554

Merged
merged 66 commits into from
Feb 3, 2022
238a44c
Initial ResourceMarker and ResourceProvider definitions
sffc Jan 20, 2022
0c7f4f2
Checkpoint
sffc Jan 20, 2022
b60b909
Remove redundant module annotation
sffc Jan 24, 2022
a9e1b8d
Big refactor around new traits
sffc Jan 27, 2022
10dbe6f
cargo check --all-targets --all-features in core crate
sffc Jan 27, 2022
600cfbf
icu_provider_blob building
sffc Jan 27, 2022
d2ce258
icu_provider_fs building
sffc Jan 27, 2022
3b0b5c9
Delete blanket impls
sffc Jan 27, 2022
e591f9a
Add DataProvider::cast method
sffc Jan 28, 2022
aaa8132
icu_properties via DynProvider
sffc Jan 28, 2022
9a19c8c
Fixing up some components
sffc Jan 28, 2022
9098d3e
Merge remote-tracking branch 'upstream/main' into resourcemarker
sffc Jan 28, 2022
c1af44d
More migrations
sffc Jan 28, 2022
70b6821
More migrations
sffc Jan 28, 2022
86b1e91
ResourceProvider in plurals
sffc Jan 28, 2022
f7412eb
Split plural rules constructors
sffc Jan 28, 2022
f330a8a
Migrate call sites to type-specific PluralRules constructors
sffc Jan 28, 2022
579179b
fmt
sffc Jan 28, 2022
beaca9c
Add basic tests to icu_provider_macros
sffc Jan 29, 2022
4174efd
Add basic ResourceMarker to macro
sffc Jan 29, 2022
e1ecce8
Improve test
sffc Jan 29, 2022
54a6a94
Add multi-key support
sffc Jan 29, 2022
7077b75
Make docs more consistent
sffc Jan 29, 2022
d72d419
Add docs example
sffc Jan 29, 2022
f858b38
Add namespace to ZeroCopyFrom in yoke derive
sffc Jan 29, 2022
1e6b178
Use new data_struct attribute across components
sffc Jan 29, 2022
d97f077
Clean up imports
sffc Jan 29, 2022
c885e29
uprops building
sffc Jan 29, 2022
1418c81
Migrating cldr
sffc Jan 29, 2022
c0e3444
ICU4X cargo check --all-features
sffc Jan 29, 2022
91e0f5e
Fixing some tests
sffc Jan 29, 2022
f36c233
ICU4X cargo check --all-targets --all-features
sffc Jan 29, 2022
5a3374f
Merge remote-tracking branch 'upstream/main' into resourcemarker
sffc Jan 29, 2022
8a40399
fmt
sffc Jan 29, 2022
edd600e
cargo quick
sffc Jan 29, 2022
fda78d1
Work around Rust bug rust-lang/rust#93470
sffc Jan 31, 2022
87fb440
Merge remote-tracking branch 'upstream/main' into resourcemarker
sffc Feb 1, 2022
c0262c1
Fix resource_path_to_string
sffc Feb 1, 2022
c112f5b
Fixing docs tests in icu_provider
sffc Feb 1, 2022
2f34269
Fix failing bench test
sffc Feb 1, 2022
a08a744
More docs tests fixes
sffc Feb 1, 2022
bb1479a
Another docs test down
sffc Feb 1, 2022
4010e7a
Re-write icu_provider crate-level docs
sffc Feb 1, 2022
7150659
Doc link fixes
sffc Feb 1, 2022
285cccf
diplomat regen
sffc Feb 2, 2022
7217ea7
A few more docs updates + ForkByKey for DynProvider
sffc Feb 2, 2022
0d93723
icu_provider links fixed
sffc Feb 2, 2022
0f010c0
Fixing remaining docs links
sffc Feb 2, 2022
68b45cd
list_formatter fixups
sffc Feb 2, 2022
e04ba20
clippy & tidy
sffc Feb 2, 2022
6e4bac1
Docs for DynProvider and ResourceProvider
sffc Feb 2, 2022
33498b9
Fix plural rules FFI test
sffc Feb 2, 2022
bdce762
Fix FFI test take 2
sffc Feb 2, 2022
860e21f
Merge remote-tracking branch 'upstream/main' into resourcemarker
sffc Feb 2, 2022
9772515
Fix build
sffc Feb 2, 2022
842215a
Remove unneeded type info
sffc Feb 2, 2022
127aaf0
Fix another compile error
sffc Feb 2, 2022
099ab34
try_langid -> get_langid
sffc Feb 2, 2022
b2c8f7d
Update data_struct macro to require explicit marker symbol path
sffc Feb 2, 2022
8a47ffa
Migrate call sites to new macro
sffc Feb 2, 2022
d2902af
Make PluralRulesV1Marker public
sffc Feb 2, 2022
14fefcc
Remove ResourcePath and DataRequestOld
sffc Feb 2, 2022
51cf962
Remove unused file
sffc Feb 2, 2022
8f992bf
Update provider/macros/src/lib.rs
sffc Feb 2, 2022
1217eed
Merge branch 'main' into resourcemarker
sffc Feb 2, 2022
16120c6
fmt
sffc Feb 3, 2022
109 changes: 63 additions & 46 deletions provider/core/README.md
@@ -2,91 +2,103 @@

`icu_provider` is one of the [`ICU4X`] components.

It defines traits and structs for transmitting data through the ICU4X locale data pipeline.
The primary trait is [`DataProvider`]. It has one method, which transforms a [`Request`] into
a [`Response`]:
`icu_provider` defines traits and structs for transmitting data through the ICU4X locale
data pipeline. The primary trait is [`ResourceProvider`]. It is parameterized by a
[`ResourceMarker`], which contains the data type and a [`ResourceKey`]. It has one method,
[`ResourceProvider::load_resource`], which transforms a [`DataRequest`]
into a [`DataResponse`].

```rust
fn load_payload(&self, req: &DataRequest) -> Result<DataResponse<'data>, DataError>
```
- [`ResourceKey`] is a fixed identifier for the data type, such as `"plurals/cardinal@1"`.
- [`DataRequest`] contains additional annotations to choose a specific variant of the key,
such as a locale.
- [`DataResponse`] contains the data if the request was successful.
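
The relationship between a marker, its key, and a provider parameterized by that marker can be sketched in plain Rust. This is a minimal illustration only: `Marker`, `Request`, `Provider`, `CardinalMarker`, and `StaticPlurals` are hypothetical stand-ins invented for this sketch, not the `icu_provider` traits, and the real signatures differ.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a resource marker: it ties a data type to a fixed key.
pub trait Marker {
    type Data;
    const KEY: &'static str;
}

/// Hypothetical request: only carries the locale that selects a variant of the key.
pub struct Request<'a> {
    pub locale: &'a str,
}

/// Hypothetical provider trait, parameterized by a marker and exposing one method.
pub trait Provider<M: Marker> {
    fn load(&self, req: &Request<'_>) -> Result<M::Data, String>;
}

/// Marker for made-up plural category data.
pub struct CardinalMarker;

impl Marker for CardinalMarker {
    type Data = Vec<&'static str>;
    const KEY: &'static str = "plurals/cardinal@1";
}

/// A provider backed by a static table (assumed data, for illustration only).
pub struct StaticPlurals {
    by_locale: HashMap<&'static str, Vec<&'static str>>,
}

impl StaticPlurals {
    pub fn new() -> Self {
        let mut by_locale = HashMap::new();
        by_locale.insert("en", vec!["one", "other"]);
        by_locale.insert("ja", vec!["other"]);
        Self { by_locale }
    }
}

impl Provider<CardinalMarker> for StaticPlurals {
    fn load(&self, req: &Request<'_>) -> Result<Vec<&'static str>, String> {
        // The key comes from the marker type, so the request only needs the locale.
        self.by_locale.get(req.locale).cloned().ok_or_else(|| {
            format!("no data for {} in locale {}", CardinalMarker::KEY, req.locale)
        })
    }
}
```

Because the key lives on the marker type, a caller picks the data type and the key together at compile time, and the request value only has to describe the variant.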

A [`Request`] contains a [`ResourceKey`] (a fixed identifier such as "plurals/cardinal@1") and
[`ResourceOptions`] (a language identifier and optional variant, e.g. "fr") being requested.
The Response contains the data payload corresponding to the Request.
In addition, there are three other traits which are widely implemented:

A [`Response`] contains a [`DataPayload`] along with other metadata.
- [`AnyProvider`] returns data as `dyn Any` trait objects.
- [`BufferProvider`] returns data as `[u8]` buffers.
- [`DynProvider`] returns structured data but is not specific to a key.

The most common types required for ICU4X [`DataProvider`] are included via the prelude:
The most common types required for this crate are included via the prelude:

```rust
use icu_provider::prelude::*;
```

### Concrete Implementations of Data Providers
### Types of Data Providers

All data providers can fit into one of two classes.

1. Type 1: Those whose data originates as structured Rust objects
issue: I'm not a huge fan of "type 1" and "type 2" naming, can we just call these any providers and buffered providers?

2. Type 2: Those whose data originates as unstructured `[u8]` buffers

#### Type 1 Providers

Type 1 providers generally implement [`AnyProvider`], which returns structured data cast into
`dyn Any` trait objects. Users can call [`as_downcasting()`] to get an object implementing
[`ResourceProvider`] by downcasting the trait objects.

Examples of Type 1 providers:

- [`CldrJsonDataProvider`] reads structured data from CLDR JSON source files and returns
structured Rust objects.
- [`AnyPayloadProvider`] wraps a specific data struct and returns it.
- The upcoming `crabbake` provider, which reads structured data from Rust source files.
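
The downcasting step described for Type 1 providers can be illustrated with the standard library's `Any` alone. The sketch below is an assumption-laden model: `ErasedProvider`, `HelloSource`, `HelloV1`, `load_typed`, and the key `"demo/hello@1"` are hypothetical names, not the `AnyProvider` or `as_downcasting()` API.

```rust
use std::any::Any;

/// Hypothetical data struct.
#[derive(Debug, Clone, PartialEq)]
pub struct HelloV1 {
    pub message: String,
}

/// Hypothetical type-erased provider: hands out data as `dyn Any` trait objects.
pub trait ErasedProvider {
    fn load_any(&self, key: &str) -> Result<Box<dyn Any>, String>;
}

/// A source that knows a single, made-up key.
pub struct HelloSource;

impl ErasedProvider for HelloSource {
    fn load_any(&self, key: &str) -> Result<Box<dyn Any>, String> {
        match key {
            "demo/hello@1" => Ok(Box::new(HelloV1 {
                message: "Hello World".to_string(),
            })),
            other => Err(format!("missing key: {}", other)),
        }
    }
}

/// Recover a concrete type from the erased provider by downcasting.
/// A mismatch between the requested type and the stored type is an error.
pub fn load_typed<T: Any>(provider: &dyn ErasedProvider, key: &str) -> Result<T, String> {
    provider
        .load_any(key)?
        .downcast::<T>()
        .map(|boxed| *boxed)
        .map_err(|_| format!("type mismatch for key: {}", key))
}
```

The erased trait is object-safe and independent of any data struct, which is what lets one provider serve many types; the cost is that a type mismatch only surfaces at run time.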

#### Type 2 Providers

Type 2 providers generally implement [`BufferProvider`], which returns unstructured data
typically represented as [`serde`]-serialized buffers. Users can call [`as_deserializing()`]
to get an object implementing [`ResourceProvider`] by invoking Serde Deserialize.

Examples of Type 2 providers:

Any object implementing [`DataProvider`] can be used to supply ICU4X with locale data. ICU4X ships
with some pre-built data providers:
- [`FsDataProvider`] reads individual buffers from the filesystem.
- [`BlobDataProvider`] reads buffers from a large in-memory blob.
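
The buffer-then-deserialize flow of Type 2 providers can be modeled as follows. To stay dependency-free, this sketch replaces Serde with a hand-written `FromBytes` trait; that trait, along with `ByteProvider`, `InMemoryBlob`, `Deserializing`, and `Greeting`, is hypothetical and does not reflect the `BufferProvider` or `as_deserializing()` signatures.

```rust
use std::collections::HashMap;

/// Hypothetical buffer provider: returns raw bytes for a (key, locale) pair.
pub trait ByteProvider {
    fn load_bytes(&self, key: &str, locale: &str) -> Result<&[u8], String>;
}

/// Buffers held in memory, keyed by (key, locale).
#[derive(Default)]
pub struct InMemoryBlob {
    entries: HashMap<(String, String), Vec<u8>>,
}

impl InMemoryBlob {
    pub fn insert(&mut self, key: &str, locale: &str, bytes: &[u8]) {
        self.entries
            .insert((key.to_string(), locale.to_string()), bytes.to_vec());
    }
}

impl ByteProvider for InMemoryBlob {
    fn load_bytes(&self, key: &str, locale: &str) -> Result<&[u8], String> {
        self.entries
            .get(&(key.to_string(), locale.to_string()))
            .map(|bytes| bytes.as_slice())
            .ok_or_else(|| format!("missing {} for locale {}", key, locale))
    }
}

/// Stand-in for Serde's `Deserialize` (an assumption made to avoid dependencies).
pub trait FromBytes: Sized {
    fn from_bytes(bytes: &[u8]) -> Result<Self, String>;
}

#[derive(Debug, PartialEq)]
pub struct Greeting {
    pub text: String,
}

impl FromBytes for Greeting {
    fn from_bytes(bytes: &[u8]) -> Result<Self, String> {
        std::str::from_utf8(bytes)
            .map(|s| Greeting { text: s.to_string() })
            .map_err(|e| e.to_string())
    }
}

/// Wrapper that turns a buffer provider into a typed one by decoding on load.
pub struct Deserializing<'a, P: ByteProvider>(pub &'a P);

impl<'a, P: ByteProvider> Deserializing<'a, P> {
    pub fn load<T: FromBytes>(&self, key: &str, locale: &str) -> Result<T, String> {
        T::from_bytes(self.0.load_bytes(key, locale)?)
    }
}
```

The underlying provider never needs to know the data struct type; only the wrapper does, at the point where the bytes are decoded.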

- [`CldrJsonDataProvider`](../icu_provider_cldr/transform/struct.CldrJsonDataProvider.html) reads structured
data directly from CLDR source files.
- [`FsDataProvider`](../icu_provider_fs/struct.FsDataProvider.html) reads structured data from the
filesystem. It can also write out that filesystem structure. More efficient than CldrJsonDataProvider.
#### Special-Purpose Providers

This crate also contains some concrete implementations for testing purposes:

- [`InvariantDataProvider`] returns fixed data that does not vary by locale.
- [`AnyPayloadProvider`] wraps a particular instance of a struct and returns it.
- [`HelloWorldProvider`] returns "hello world" strings in several languages.

### Combinatorial Providers

ICU4X offers several built-in modules to combine providers in interesting ways:

- Use the [`fork`] module to marshal data requests between multiple possible providers.
- Use the [`either`] module to choose between multiple provider types at runtime.
- Use the [`filter`] module to programmatically reject certain data requests.
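
As a rough picture of what such combinators do, the sketch below composes providers behind one shared trait. Everything here is hypothetical (`StringProvider`, `Fixed`, `Fork`, `Either`, `Filter`), and the behavior is simplified: this `Fork` falls through to the second provider on any error, which is an assumption of the sketch rather than a description of the [`fork`] module.

```rust
/// Hypothetical provider trait used only for this sketch.
pub trait StringProvider {
    fn load(&self, key: &str, locale: &str) -> Result<String, String>;
}

/// Serves one fixed value for one key, regardless of locale.
pub struct Fixed {
    pub key: &'static str,
    pub value: &'static str,
}

impl StringProvider for Fixed {
    fn load(&self, key: &str, _locale: &str) -> Result<String, String> {
        if key == self.key {
            Ok(self.value.to_string())
        } else {
            Err(format!("missing key: {}", key))
        }
    }
}

/// Fork: try the first provider, then fall back to the second on error.
pub struct Fork<A, B>(pub A, pub B);

impl<A: StringProvider, B: StringProvider> StringProvider for Fork<A, B> {
    fn load(&self, key: &str, locale: &str) -> Result<String, String> {
        self.0.load(key, locale).or_else(|_| self.1.load(key, locale))
    }
}

/// Either: pick one of two provider types at run time.
pub enum Either<A, B> {
    Left(A),
    Right(B),
}

impl<A: StringProvider, B: StringProvider> StringProvider for Either<A, B> {
    fn load(&self, key: &str, locale: &str) -> Result<String, String> {
        match self {
            Either::Left(a) => a.load(key, locale),
            Either::Right(b) => b.load(key, locale),
        }
    }
}

/// Filter: reject requests whose locale fails the predicate.
pub struct Filter<P, F>(pub P, pub F);

impl<P: StringProvider, F: Fn(&str) -> bool> StringProvider for Filter<P, F> {
    fn load(&self, key: &str, locale: &str) -> Result<String, String> {
        if (self.1)(locale) {
            self.0.load(key, locale)
        } else {
            Err(format!("locale {} rejected by filter", locale))
        }
    }
}
```

Since every combinator implements the same trait as the providers it wraps, they nest freely, for example a `Filter` around a `Fork`.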

### Types and Lifetimes

Types compatible with [`Yokeable`] can be passed through the data provider, so long as they are
associated with a marker type implementing [`DataMarker`].

Most [`DataProvider`] traits take one lifetime argument: `'data`. This lifetime allows data
structs to borrow zero-copy data. In practice, it also represents the lifetime of data that
the Cart of the Yoke of the DataPayload borrows; for more information on carts and yokes,
see [`yoke`].
Data structs should generally have one lifetime argument: `'data`. This lifetime allows data
structs to borrow zero-copy data.
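
To show what a `'data` lifetime buys, here is a small sketch using `std::borrow::Cow`. The struct `GreetingV1` and the function `parse_borrowed` are hypothetical and exist only for this illustration; real data structs in this crate rely on [`yoke`] machinery that is not reproduced here.

```rust
use std::borrow::Cow;

/// Hypothetical data struct with a single `'data` lifetime.
/// `Cow` lets the field either borrow from a buffer or own its contents.
#[derive(Debug, PartialEq)]
pub struct GreetingV1<'data> {
    pub message: Cow<'data, str>,
}

/// Build the struct by borrowing from `buffer`; no bytes are copied.
pub fn parse_borrowed<'data>(
    buffer: &'data [u8],
) -> Result<GreetingV1<'data>, std::str::Utf8Error> {
    let text = std::str::from_utf8(buffer)?;
    Ok(GreetingV1 {
        message: Cow::Borrowed(text),
    })
}

impl<'data> GreetingV1<'data> {
    /// Detach from the buffer by copying, yielding a `'static` value.
    pub fn into_owned(self) -> GreetingV1<'static> {
        GreetingV1 {
            message: Cow::Owned(self.message.into_owned()),
        }
    }
}
```

The borrowed form points straight into the source buffer, so it cannot outlive it; calling `into_owned` trades one allocation for independence from the buffer.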

### Additional Traits

#### `IterableProvider`

Data providers can implement [`IterableProvider`], allowing iteration over all [`ResourceOptions`]
instances supported for a certain key in the data provider.
Data providers can implement [`IterableProvider`], allowing iteration over all
[`ResourceOptions`] instances supported for a certain key in the data provider.

For more information, see the [`iter`] module.
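
The idea of iterating over supported options for a key can be sketched like this. The trait `Iterable` and the struct `Table` are hypothetical, locales are modeled as plain strings, and the real trait in the [`iter`] module has a different signature.

```rust
use std::collections::BTreeMap;

/// Hypothetical iteration trait: lists the locales a provider supports for a key.
pub trait Iterable {
    fn supported_locales<'a>(&'a self, key: &str) -> Box<dyn Iterator<Item = &'a str> + 'a>;
}

/// Nested table: key -> locale -> payload.
#[derive(Default)]
pub struct Table {
    data: BTreeMap<String, BTreeMap<String, String>>,
}

impl Table {
    pub fn insert(&mut self, key: &str, locale: &str, payload: &str) {
        self.data
            .entry(key.to_string())
            .or_default()
            .insert(locale.to_string(), payload.to_string());
    }
}

impl Iterable for Table {
    fn supported_locales<'a>(&'a self, key: &str) -> Box<dyn Iterator<Item = &'a str> + 'a> {
        match self.data.get(key) {
            // `BTreeMap` keys iterate in sorted order, so results are deterministic.
            Some(locales) => Box::new(locales.keys().map(|locale| locale.as_str())),
            // An unknown key simply yields nothing.
            None => Box::new(std::iter::empty()),
        }
    }
}
```

An exporter could use such an iterator to enumerate every variant of a key and request each one in turn.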

#### `BufferProvider`

The trait [`BufferProvider`] represents a data provider that produces buffers (`[u8]`), which
are typically deserialized later via Serde. This allows for a Serde-enabled provider
to be saved as a trait object without being specific to a data struct type.

#### `AnyProvider`

The trait [`AnyProvider`] removes the type argument from [`DataProvider`] and requires
that all data structs be convertible to the [`Any`](core::any::Any) type. This enables the
processing of data without the caller knowing the underlying data struct.

Since [`AnyProvider`] is not specific to a single type, it can be useful for caches or
other bulk data operations.

#### `DataProvider<SerializeMarker>`

*Enabled with the "serialize" feature*

Data providers capable of returning opaque `erased_serde::Serialize` trait objects can be used as
input to a data exporter, such as when writing data to the filesystem.

This trait is normally implemented using the [`impl_dyn_provider!`] macro.
Data providers capable of returning opaque `erased_serde::Serialize` trait objects can be used
as input to a data exporter, such as when writing data to the filesystem.

This trait is normally implemented using the [`impl_dyn_provider!`] macro.

[`ICU4X`]: ../icu/index.html
[`DataProvider`]: data_provider::DataProvider
[`Request`]: data_provider::DataRequest
[`Response`]: data_provider::DataResponse
[`ResourceKey`]: resource::ResourceKey
[`ResourceOptions`]: resource::ResourceOptions
[`IterableProvider`]: iter::IterableProvider
@@ -96,6 +108,11 @@ This trait is normally implemented using the [`impl_dyn_provider!`] macro.
[`AnyProvider`]: any::AnyProvider
[`Yokeable`]: yoke::Yokeable
[`impl_dyn_provider!`]: impl_dyn_provider
[`as_downcasting()`]: AsDowncastingAnyProvider::as_downcasting
[`as_deserializing()`]: AsDeserializingBufferProvider::as_deserializing
[`CldrJsonDataProvider`]: ../icu_provider_cldr/struct.CldrJsonDataProvider.html
[`FsDataProvider`]: ../icu_provider_fs/struct.FsDataProvider.html
[`BlobDataProvider`]: ../icu_provider_blob/struct.BlobDataProvider.html

## More Information

125 changes: 71 additions & 54 deletions provider/core/src/lib.rs
@@ -4,91 +4,103 @@

//! `icu_provider` is one of the [`ICU4X`] components.
//!
//! It defines traits and structs for transmitting data through the ICU4X locale data pipeline.
//! The primary trait is [`DataProvider`]. It has one method, which transforms a [`Request`] into
//! a [`Response`]:
//!
//! ```ignore
//! fn load_payload(&self, req: &DataRequest) -> Result<DataResponse<'data>, DataError>
//! ```
//!
//! A [`Request`] contains a [`ResourceKey`] (a fixed identifier such as "plurals/cardinal@1") and
//! [`ResourceOptions`] (a language identifier and optional variant, e.g. "fr") being requested.
//! The Response contains the data payload corresponding to the Request.
//!
//! A [`Response`] contains a [`DataPayload`] along with other metadata.
//!
//! The most common types required for ICU4X [`DataProvider`] are included via the prelude:
//! `icu_provider` defines traits and structs for transmitting data through the ICU4X locale
//! data pipeline. The primary trait is [`ResourceProvider`]. It is parameterized by a
//! [`ResourceMarker`], which contains the data type and a [`ResourceKey`]. It has one method,
//! [`ResourceProvider::load_resource`], which transforms a [`DataRequest`]
//! into a [`DataResponse`].
//!
//! - [`ResourceKey`] is a fixed identifier for the data type, such as `"plurals/cardinal@1"`.
//! - [`DataRequest`] contains additional annotations to choose a specific variant of the key,
//! such as a locale.
//! - [`DataResponse`] contains the data if the request was successful.
//!
//! In addition, there are three other traits which are widely implemented:
//!
//! - [`AnyProvider`] returns data as `dyn Any` trait objects.
//! - [`BufferProvider`] returns data as `[u8]` buffers.
//! - [`DynProvider`] returns structured data but is not specific to a key.
//!
//! The most common types required for this crate are included via the prelude:
//!
//! ```
//! use icu_provider::prelude::*;
//! ```
//!
//! ## Concrete Implementations of Data Providers
//!
//! Any object implementing [`DataProvider`] can be used to supply ICU4X with locale data. ICU4X ships
//! with some pre-built data providers:
//!
//! - [`CldrJsonDataProvider`](../icu_provider_cldr/transform/struct.CldrJsonDataProvider.html) reads structured
//! data directly from CLDR source files.
//! - [`FsDataProvider`](../icu_provider_fs/struct.FsDataProvider.html) reads structured data from the
//! filesystem. It can also write out that filesystem structure. More efficient than CldrJsonDataProvider.
//!
//! ## Types of Data Providers
//!
//! All data providers can fit into one of two classes.
//!
//! 1. Type 1: Those whose data originates as structured Rust objects
//! 2. Type 2: Those whose data originates as unstructured `[u8]` buffers
//!
//! ### Type 1 Providers
//!
//! Type 1 providers generally implement [`AnyProvider`], which returns structured data cast into
//! `dyn Any` trait objects. Users can call [`as_downcasting()`] to get an object implementing
//! [`ResourceProvider`] by downcasting the trait objects.
//!
//! Examples of Type 1 providers:
//!
//! - [`CldrJsonDataProvider`] reads structured data from CLDR JSON source files and returns
//! structured Rust objects.
//! - [`AnyPayloadProvider`] wraps a specific data struct and returns it.
//! - The upcoming `crabbake` provider, which reads structured data from Rust source files.
//!
//! ### Type 2 Providers
//!
//! Type 2 providers generally implement [`BufferProvider`], which returns unstructured data
//! typically represented as [`serde`]-serialized buffers. Users can call [`as_deserializing()`]
//! to get an object implementing [`ResourceProvider`] by invoking Serde Deserialize.
//!
//! Examples of Type 2 providers:
//!
//! - [`FsDataProvider`] reads individual buffers from the filesystem.
//! - [`BlobDataProvider`] reads buffers from a large in-memory blob.
//!
//! ### Special-Purpose Providers
//!
//! This crate also contains some concrete implementations for testing purposes:
//!
//! - [`InvariantDataProvider`] returns fixed data that does not vary by locale.
//! - [`AnyPayloadProvider`] wraps a particular instance of a struct and returns it.
//! - [`HelloWorldProvider`] returns "hello world" strings in several languages.
//!
//! ## Combinatorial Providers
//!
//! ICU4X offers several built-in modules to combine providers in interesting ways:
//!
//! - Use the [`fork`] module to marshal data requests between multiple possible providers.
//! - Use the [`either`] module to choose between multiple provider types at runtime.
//! - Use the [`filter`] module to programmatically reject certain data requests.
//!
//! ## Types and Lifetimes
//!
//! Types compatible with [`Yokeable`] can be passed through the data provider, so long as they are
//! associated with a marker type implementing [`DataMarker`].
//!
//! Most [`DataProvider`] traits take one lifetime argument: `'data`. This lifetime allows data
//! structs to borrow zero-copy data. In practice, it also represents the lifetime of data that
//! the Cart of the Yoke of the DataPayload borrows; for more information on carts and yokes,
//! see [`yoke`].
//!
//! Data structs should generally have one lifetime argument: `'data`. This lifetime allows data
//! structs to borrow zero-copy data.
//!
//! ## Additional Traits
//!
//! ### `IterableProvider`
//!
//! Data providers can implement [`IterableProvider`], allowing iteration over all [`ResourceOptions`]
//! instances supported for a certain key in the data provider.
//! Data providers can implement [`IterableProvider`], allowing iteration over all
//! [`ResourceOptions`] instances supported for a certain key in the data provider.
//!
//! For more information, see the [`iter`] module.
//!
//! ### `BufferProvider`
//!
//! The trait [`BufferProvider`] represents a data provider that produces buffers (`[u8]`), which
//! are typically deserialized later via Serde. This allows for a Serde-enabled provider
//! to be saved as a trait object without being specific to a data struct type.
//!
//! ### `AnyProvider`
//!
//! The trait [`AnyProvider`] removes the type argument from [`DataProvider`] and requires
//! that all data structs be convertible to the [`Any`](core::any::Any) type. This enables the
//! processing of data without the caller knowing the underlying data struct.
//!
//! Since [`AnyProvider`] is not specific to a single type, it can be useful for caches or
//! other bulk data operations.
//!
//! ### `DataProvider<SerializeMarker>`
//!
//! *Enabled with the "serialize" feature*
//!
//! Data providers capable of returning opaque `erased_serde::Serialize` trait objects can be used as
//! input to a data exporter, such as when writing data to the filesystem.
//!
//! This trait is normally implemented using the [`impl_dyn_provider!`] macro.
//! Data providers capable of returning opaque `erased_serde::Serialize` trait objects can be used
//! as input to a data exporter, such as when writing data to the filesystem.
//!
//! This trait is normally implemented using the [`impl_dyn_provider!`] macro.
//!
//! [`ICU4X`]: ../icu/index.html
//! [`DataProvider`]: data_provider::DataProvider
//! [`Request`]: data_provider::DataRequest
//! [`Response`]: data_provider::DataResponse
//! [`ResourceKey`]: resource::ResourceKey
//! [`ResourceOptions`]: resource::ResourceOptions
//! [`IterableProvider`]: iter::IterableProvider
@@ -98,6 +110,11 @@
//! [`AnyProvider`]: any::AnyProvider
//! [`Yokeable`]: yoke::Yokeable
//! [`impl_dyn_provider!`]: impl_dyn_provider
//! [`as_downcasting()`]: AsDowncastingAnyProvider::as_downcasting
//! [`as_deserializing()`]: AsDeserializingBufferProvider::as_deserializing
//! [`CldrJsonDataProvider`]: ../icu_provider_cldr/struct.CldrJsonDataProvider.html
//! [`FsDataProvider`]: ../icu_provider_fs/struct.FsDataProvider.html
//! [`BlobDataProvider`]: ../icu_provider_blob/struct.BlobDataProvider.html

#![cfg_attr(not(any(test, feature = "std")), no_std)]
