Commit

bug(SDK): fix the ingestion process for list in huggingface datasets (#…
tianweidut authored Nov 22, 2023
1 parent 314934e commit 98823b4
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions client/starwhale/integrations/huggingface/dataset.py
@@ -58,8 +58,9 @@ def _transform_to_starwhale(data: t.Any, feature: t.Any) -> t.Any:
         # TODO: graceful handle classLabel, store it into Starwhale.ClassLabel type
         return data
     elif isinstance(feature, list):
-        # list supports mixed type, but Starwhale only supports same type
-        return [_transform_to_starwhale(d, feature[i]) for i, d in enumerate(data)]
+        # Huggingface list feature should be provided with a single sub-feature as an example of the feature type hosted in this list.
+        # ref: https://huggingface.co/docs/datasets/package_reference/main_classes#datasets.Features
+        return [_transform_to_starwhale(d, feature[0]) for d in data]
     elif isinstance(feature, hf_datasets.Sequence):
         inner_feature = feature.feature
         if isinstance(inner_feature, dict):
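
A minimal sketch of why the list branch changed, assuming a list feature holds exactly one sub-feature, as the comment in the diff states. It does not import `datasets` or Starwhale: `feature = [int]` stands in for a real Hugging Face list feature, and `_leaf`, `transform_old`, and `transform_new` are hypothetical names used only for illustration.

```python
from typing import Any, Callable, List


def _leaf(value: Any, feature: Callable[[Any], Any]) -> Any:
    # Hypothetical stand-in for the recursive _transform_to_starwhale call:
    # here a "feature" is just a callable that converts one value.
    return feature(value)


def transform_old(data: List[Any], feature: List[Any]) -> List[Any]:
    # Pre-fix behaviour: index the feature list by element position.
    # This only works when len(feature) >= len(data).
    return [_leaf(d, feature[i]) for i, d in enumerate(data)]


def transform_new(data: List[Any], feature: List[Any]) -> List[Any]:
    # Post-fix behaviour: the single sub-feature describes every element.
    return [_leaf(d, feature[0]) for d in data]


# Assumed shape: a list feature with one sub-feature, and a row with several items.
feature = [int]
data = ["1", "2", "3"]

new_result = transform_new(data, feature)

try:
    transform_old(data, feature)
    old_error = None
except IndexError as exc:
    # feature[1] does not exist, so the old code fails on the second element.
    old_error = exc
```

Under these assumptions the old indexing raises `IndexError` as soon as a row has more than one element, while indexing with `feature[0]` applies the same sub-feature to every element.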