Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement naming convention for "data provider" #1576

Closed
3 of 4 tasks
sffc opened this issue Feb 2, 2022 · 11 comments · Fixed by #2222
Closed
3 of 4 tasks

Implement naming convention for "data provider" #1576

sffc opened this issue Feb 2, 2022 · 11 comments · Fixed by #2222
Assignees
Labels
A-design Area: Architecture or design C-data-infra Component: provider, datagen, fallback, adapters help wanted Issue needs an assignee S-medium Size: Less than a week (larger bug fix or enhancement) T-techdebt Type: ICU4X code health and tech debt

Comments

@sffc
Copy link
Member

sffc commented Feb 2, 2022

In #1554 I am removing the DataProvider type and making it ResourceProvider and DynProvider.

The prelude types are now:

    pub use crate::any::AnyMarker;
    pub use crate::any::AnyPayload;
    pub use crate::any::AnyProvider;
    pub use crate::any::AnyResponse;
    pub use crate::buf::BufferMarker;
    pub use crate::buf::BufferProvider;
    pub use crate::data_provider::DataPayload;
    pub use crate::data_provider::DataRequest;
    pub use crate::data_provider::DataResponse;
    pub use crate::data_provider::DataResponseMetadata;
    pub use crate::data_provider::DynProvider;
    pub use crate::data_provider::ResourceProvider;
    pub use crate::error::DataError;
    pub use crate::error::DataErrorKind;
    pub use crate::marker::DataMarker;
    pub use crate::marker::ResourceMarker;
    pub use crate::resource::ResourceKey;
    pub use crate::resource::ResourceOptions;

It is weird that some of them are "data" and some are "resource". I started using the word "resource" to refer to specific pieces of data. The word "data" is vague, overused, and is weird when it comes to singular and plural. However, I still think it is okay to talk about "data pipeline" and "data provider" as a high-level concept.

We should also think about the naming for DynProvider. I chose this name because the main use case for DynProvider moving forward are things like DynProvider<SerializeMarker>, where you want a pluggable marker representing a trait object.

Seeking input from:

@sffc sffc added C-data-infra Component: provider, datagen, fallback, adapters A-design Area: Architecture or design needs-approval One or more stakeholders need to approve proposal labels Feb 2, 2022
@sffc
Copy link
Member Author

sffc commented Feb 3, 2022

Also address this comment from @Manishearth:

issue: I'm not a huge fan of "type 1" and "type 2" naming, can we just call these any providers and buffered providers?

Originally posted by @Manishearth in #1554 (comment)

@zbraniecki
Copy link
Member

I don't have a strong opinion here. I see data as a very vague term, but also an appropriate one for DataProvider. And since DataProvider is a broad, generic library that fetches, well, data, I think using such a broad term in this context is justified.

The term resource for me is a structured piece of data collected into a semantically distinct group. A good mental model example for a resource is a file, but it's just an abstraction.

In result my intuition is to continue using Resource to say fetch a group of data items that are related (for example based on key/category etc.). and Data as all of data we want.

@Manishearth
Copy link
Member

I think moving to "resource" provider is nice. I don't know what to do about DynProvider, I don't like having dyn in its name when it does not in its definition provide dyns, but it is supposed to largely be fore type erased stuff.

@robertbastian
Copy link
Member

robertbastian commented Apr 8, 2022

  1. The name I have the biggest issue with is icu_provider::marker::DataMarker. I think it should be DynMarker to give us the symmetry DynProvider<M: DynMarker> <-> ResourceProvider<M: ResourceMarker>

  2. Most crates seem to call their error types just Error. I understand that our style guide prescribes such repetitive naming but I don't agree with its reasoning.

  3. Same for icu_provider::data_provider::{DataPayload,DataRequest,DataResponse,DataResponseMetadata, I'd prefer dropping the Data prefix, but I don't think this is confusing as is, because these types are shared between the DynProvider and ResourceProviderAPI.

  4. Now that we don't have a DataProvider type anymore, I would not be opposed to renaming Resource** to Data*. I think that would make it easier to understand the whole system, DataProvider is what we're talking about when we say "data provider", and DynProvider is a tool that can be used to implement a DataProvider, but will rarely be exposed to a client (they will see FsDataProvider, BlobDataProvider, StaticDataProvider, etc.)

@robertbastian
Copy link
Member

  1. If we decide against (4), I think we should rename FsDataProvider, etc. to FsResourceProvider, as clients should interact with them through the ResourceProvider trait.

@robertbastian
Copy link
Member

(with (4) we'd have to rename load_payload and load_resource as well, we could do something like

DynProvider::load_dyn(ResourceKey, DataRequest) -> Result<DataResponse<M>, DataError>

DataProvider::load_data(DataRequest) -> Result<DataResponse<M>, DataError>)

@Manishearth
Copy link
Member

  1. I think it should be DynMarker to give us the symmetry DynProvider<M: DynMarker> <-> ResourceProvider<M: ResourceMarker>

Honestly this is kinda why I disagree with the naming DynProvider, the dyn provider does not just provide dyns, and neither does DataMarker.

@robertbastian
Copy link
Member

robertbastian commented Apr 8, 2022

How about RawProvider? We usually use it to provide raw things (Any, [u8]) that have to be processed before they can nicely be returned through ResourceProvider

@sffc sffc added discuss Discuss at a future ICU4X-SC meeting and removed needs-approval One or more stakeholders need to approve proposal labels Apr 9, 2022
@sffc
Copy link
Member Author

sffc commented Apr 28, 2022

Discussion:

Old names:

    pub use crate::data_provider::DataPayload;
    pub use crate::data_provider::DataRequest;
    pub use crate::data_provider::DataResponse;
    pub use crate::data_provider::DataResponseMetadata;
    pub use crate::data_provider::DynProvider;
    pub use crate::data_provider::ResourceProvider;
    pub use crate::error::DataError;
    pub use crate::error::DataErrorKind;
    pub use crate::marker::DataMarker;
    pub use crate::marker::ResourceMarker;
    pub use crate::resource::ResourceKey;
    pub use crate::resource::ResourceOptions;

New names:

    pub use crate::data_provider::DataPayload;
    pub use crate::data_provider::DataRequest;
    pub use crate::data_provider::DataResponse;
    pub use crate::data_provider::DataResponseMetadata;
    pub use crate::data_provider::DynamicDataProvider;
    pub use crate::data_provider::DataProvider;
    pub use crate::error::DataError;
    pub use crate::error::DataErrorKind;
    pub use crate::marker::DataMarker;
    pub use crate::marker::KeyedDataMarker;
    pub use crate::resource::DataKey;
    pub use crate::resource::DataOptions;

@sffc sffc added T-techdebt Type: ICU4X code health and tech debt S-medium Size: Less than a week (larger bug fix or enhancement) and removed discuss Discuss at a future ICU4X-SC meeting labels Apr 28, 2022
@sffc sffc added this to the ICU4X 1.0 milestone Apr 28, 2022
@sffc sffc added the help wanted Issue needs an assignee label Apr 28, 2022
@sffc sffc changed the title Discuss naming convention for "data provider" Implement naming convention for "data provider" May 25, 2022
@robertbastian robertbastian self-assigned this Jul 18, 2022
@robertbastian
Copy link
Member

Our load methods are now

AnyProvider::load_any
BufferProvider::load_buffer
DynamicDataProvider::load_payload
DataProvider::load_resource

The last one is obviously not great anymore, and I'm also not a fan of load_payload, because all of these methods return some kind of payload (DataPayload or AnyPayload).
I proposed load_data above but we didn't discuss it in the meeting; we could also use load. Maybe DataProvider::load and DynamicDataProvider::load_data? I can't see how we would put the concept of compile-time keys into the names, which is the only difference between those, but we need different names to avoid fully qualifying them.

@sffc
Copy link
Member Author

sffc commented Jul 19, 2022

DataProvider::load and DynamicDataProvider::load_data works for me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-design Area: Architecture or design C-data-infra Component: provider, datagen, fallback, adapters help wanted Issue needs an assignee S-medium Size: Less than a week (larger bug fix or enhancement) T-techdebt Type: ICU4X code health and tech debt
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants