Add Sintel Dataset to the dataset prototype API #4895
Status: Open
krshrimali wants to merge 43 commits into pytorch:main from krshrimali:dataset/sintel
Commits (43)
eff55d3 - WIP: Sintel Dataset (krshrimali)
61831dc - Failing to read streamwrapper object in Python (krshrimali)
12b5915 - KeyZipper updates (krshrimali)
d2dcba9 - Merge remote-tracking branch 'upstream/main' into dataset/sintel (krshrimali)
1b690ac - seek of closed file error for now (krshrimali)
6f371c7 - Working... (krshrimali)
081c70f - Rearranging functions (krshrimali)
ad74f96 - Merge remote-tracking branch 'upstream/main' into dataset/sintel (krshrimali)
6c106e7 - Fix mypy failures, minor edits (krshrimali)
32cc661 - Apply suggestions from code review (krshrimali)
7f27e3f - Address reviews... (krshrimali)
28def28 - Merge branch 'main' into dataset/sintel (krshrimali)
cdcb914 - Update torchvision/prototype/datasets/_builtin/sintel.py (krshrimali)
b58c14b - Add support for 'both' as pass_name (krshrimali)
1d7a36e - Merge branch 'dataset/sintel' of github.com:krshrimali/vision into da… (krshrimali)
52ba6da - Keep imports in the same block (krshrimali)
e515fbb - Convert re.search output to bool (krshrimali)
7892eb6 - Merge branch 'main' into dataset/sintel (krshrimali)
ee3c78f - Address reviews, cleanup, one more todo left... (krshrimali)
79c65fb - Merge branch 'dataset/sintel' of github.com:krshrimali/vision into da… (krshrimali)
08cd984 - Merge branch 'main' into dataset/sintel (krshrimali)
591633a - little endian format for data (flow file) (krshrimali)
98872fd - Merge branch 'dataset/sintel' of github.com:krshrimali/vision into da… (krshrimali)
7ccca53 - Merge branch 'main' into dataset/sintel (krshrimali)
8f84b51 - As per review, use frombuffer consistently (krshrimali)
709263c - Merge branch 'dataset/sintel' of github.com:krshrimali/vision into da… (krshrimali)
6b40366 - Only filter pass name, and not png, include flow filter there (krshrimali)
34e8de3 - Rename the func (krshrimali)
cb904c5 - Add label (scene dir), needs review (krshrimali)
0e13b3f - Merge branch 'main' into dataset/sintel (krshrimali)
10bdc4b - Add test for sintel dataset (krshrimali)
7b4265f - Merge branch 'dataset/sintel' of github.com:krshrimali/vision into da… (krshrimali)
d34ebe6 - Merge branch 'main' into dataset/sintel (krshrimali)
54618c6 - Remove comment (krshrimali)
6c04d5f - Temporary fix + test class fixes (krshrimali)
84c4e88 - Revert temp fix (krshrimali)
ebf7e4a - Merge branch 'main' into dataset/sintel (pmeier)
c0b254c - use common read_flo instead of custom implementation (pmeier)
e9fa656 - remove more obsolete code (pmeier)
3724869 - [DEBUG] check if tests also run on Python 3.9 (pmeier)
69194e1 - Revert "[DEBUG] check if tests also run on Python 3.9" (pmeier)
b4cce90 - store bytes to avoid reading twice from file handle (pmeier)
527d1fa - Merge branch 'main' into dataset/sintel (pmeier)
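
Several of the commits above ("little endian format for data (flow file)", "use common read_flo instead of custom implementation", "As per review, use frombuffer consistently") concern how the .flo optical-flow files are parsed. For context, here is a minimal sketch of the Middlebury .flo layout those commits refer to; the actual read_flo helper in torchvision.prototype.datasets.utils._internal may differ in its details, and read_flo_sketch is a hypothetical name used purely for illustration.

import numpy as np
import torch


def read_flo_sketch(file) -> torch.Tensor:
    # Middlebury .flo layout (all values little-endian):
    #   float32 magic number 202021.25, int32 width, int32 height,
    #   then width * height * 2 float32 values (u and v interleaved per pixel).
    magic = np.frombuffer(file.read(4), dtype="<f4")
    if magic.size != 1 or magic[0] != 202021.25:
        raise ValueError("not a valid .flo file")
    width, height = (int(v) for v in np.frombuffer(file.read(8), dtype="<i4"))
    data = np.frombuffer(file.read(4 * 2 * width * height), dtype="<f4")
    # Return the flow as a (2, H, W) tensor: channel 0 is u (horizontal), channel 1 is v (vertical).
    return torch.from_numpy(data.reshape(height, width, 2).transpose(2, 0, 1).copy())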
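The new file below builds the Sintel dataset as a torchdata datapipes graph: the downloaded archive is filtered by split and pass name; for the train split the entries are then demultiplexed into .flo flow files and .png frames, consecutive frames of the same scene are paired, and IterKeyZipper joins each flow file to its frame pair by a (scene, frame index) key. A small sketch of that key matching, using hypothetical archive paths purely for illustration:

import pathlib
import re

_FILE_NAME_PATTERN = re.compile(r"(frame|image)_(?P<idx>\d+)[.](flo|png)")


def key(path_str: str):
    # The join key used below: (scene directory name, frame index parsed from the file name).
    path = pathlib.Path(path_str)
    return path.parent.name, int(_FILE_NAME_PATTERN.match(path.name).group("idx"))


print(key("training/flow/alley_1/frame_0001.flo"))   # ('alley_1', 1)
print(key("training/clean/alley_1/frame_0001.png"))  # ('alley_1', 1), matches the flow file above
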
torchvision/prototype/datasets/_builtin/sintel.py
@@ -0,0 +1,146 @@
import io
import pathlib
import re
from functools import partial
from typing import Any, Callable, Dict, List, Optional, Tuple, BinaryIO

import torch
from torchdata.datapipes.iter import (
    IterDataPipe,
    Demultiplexer,
    Mapper,
    Shuffler,
    Filter,
    IterKeyZipper,
    ZipArchiveReader,
)
from torchvision.prototype.datasets.utils import (
    Dataset,
    DatasetConfig,
    DatasetInfo,
    HttpResource,
    OnlineResource,
    DatasetType,
)
from torchvision.prototype.datasets.utils._internal import INFINITE_BUFFER_SIZE, read_flo, InScenePairer, path_accessor


class SINTEL(Dataset):

    _FILE_NAME_PATTERN = re.compile(r"(frame|image)_(?P<idx>\d+)[.](flo|png)")

    def _make_info(self) -> DatasetInfo:
        return DatasetInfo(
            "sintel",
            type=DatasetType.IMAGE,
            homepage="http://sintel.is.tue.mpg.de/",
            valid_options=dict(
                split=("train", "test"),
                pass_name=("clean", "final", "both"),
            ),
        )

    def resources(self, config: DatasetConfig) -> List[OnlineResource]:
        archive = HttpResource(
            "http://files.is.tue.mpg.de/sintel/MPI-Sintel-complete.zip",
            sha256="bdc80abbe6ae13f96f6aa02e04d98a251c017c025408066a00204cd2c7104c5f",
        )
        return [archive]

    def _filter_split(self, data: Tuple[str, Any], *, split: str) -> bool:
        path = pathlib.Path(data[0])
        # The dataset contains the folder "training", while the allowed options for `split` are
        # "train" and "test". We therefore don't check for equality here ("train" != "training")
        # and instead check whether `split` is contained in the folder name.
        return split in path.parents[2].name

    def _filter_pass_name_and_flow(self, data: Tuple[str, Any], *, pass_name: str) -> bool:
        path = pathlib.Path(data[0])
        if pass_name == "both":
            matched = path.parents[1].name in ["clean", "final", "flow"]
        else:
            matched = path.parents[1].name in [pass_name, "flow"]
        return matched

    def _classify_archive(self, data: Tuple[str, Any], *, pass_name: str) -> Optional[int]:
        path = pathlib.Path(data[0])
        suffix = path.suffix
        if suffix == ".flo":
            return 0
        elif suffix == ".png":
            return 1
        else:
            return None

    def _flows_key(self, data: Tuple[str, Any]) -> Tuple[str, int]:
        path = pathlib.Path(data[0])
        category = path.parent.name
        idx = int(self._FILE_NAME_PATTERN.match(path.name).group("idx"))  # type: ignore[union-attr]
        return category, idx

    def _add_fake_flow_data(self, data: Tuple[str, Any]) -> Tuple[Tuple[None, None], Tuple[str, Any]]:
        return ((None, None), data)

    def _images_key(self, data: Tuple[Tuple[str, Any], Tuple[str, Any]]) -> Tuple[str, int]:
        return self._flows_key(data[0])

    def _collate_and_decode_sample(
        self,
        data: Tuple[Tuple[Optional[str], Optional[BinaryIO]], Tuple[Tuple[str, BinaryIO], Tuple[str, BinaryIO]]],
        *,
        decoder: Optional[Callable[[BinaryIO], torch.Tensor]],
    ) -> Dict[str, Any]:
        flow_data, images_data = data
        flow_path, flow_buffer = flow_data
        image1_data, image2_data = images_data
        image1_path, image1_buffer = image1_data
        image2_path, image2_buffer = image2_data

        return dict(
            image1=decoder(image1_buffer) if decoder else image1_buffer,
            image1_path=image1_path,
            image2=decoder(image2_buffer) if decoder else image2_buffer,
            image2_path=image2_path,
            flow=read_flo(flow_buffer) if flow_buffer else None,
            flow_path=flow_path,
            scene=pathlib.Path(image1_path).parent.name,
        )

    def _make_datapipe(
        self,
        resource_dps: List[IterDataPipe],
        *,
        config: DatasetConfig,
        decoder: Optional[Callable[[io.IOBase], torch.Tensor]],
    ) -> IterDataPipe[Dict[str, Any]]:
        dp = resource_dps[0]
        archive_dp = ZipArchiveReader(dp)

        curr_split = Filter(archive_dp, self._filter_split, fn_kwargs=dict(split=config.split))

        filtered_curr_split = Filter(
            curr_split, self._filter_pass_name_and_flow, fn_kwargs=dict(pass_name=config.pass_name)
        )
        if config.split == "train":
            # Split the stream into flow files (.flo) and frame images (.png).
            flo_dp, pass_images_dp = Demultiplexer(
                filtered_curr_split,
                2,
                partial(self._classify_archive, pass_name=config.pass_name),
                drop_none=True,
                buffer_size=INFINITE_BUFFER_SIZE,
            )
            flo_dp = Shuffler(flo_dp, buffer_size=INFINITE_BUFFER_SIZE)
            # Pair consecutive frames that belong to the same scene.
            pass_images_dp: IterDataPipe[Tuple[Tuple[str, Any], Tuple[str, Any]]] = InScenePairer(
                pass_images_dp, scene_fn=path_accessor("parent", "name")
            )
            # Join each flow file with its frame pair via the (scene, frame index) key.
            zipped_dp = IterKeyZipper(
                flo_dp,
                pass_images_dp,
                key_fn=self._flows_key,
                ref_key_fn=self._images_key,
            )
        else:
            # The test split ships without ground truth flow, so attach (None, None) placeholders.
            pass_images_dp = Shuffler(filtered_curr_split, buffer_size=INFINITE_BUFFER_SIZE)
            pass_images_dp = InScenePairer(pass_images_dp, scene_fn=path_accessor("parent", "name"))
            zipped_dp = Mapper(pass_images_dp, self._add_fake_flow_data)

        return Mapper(zipped_dp, self._collate_and_decode_sample, fn_kwargs=dict(decoder=decoder))
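
Each sample yielded by this pipeline is the dict assembled in _collate_and_decode_sample: the two frame buffers (or decoded tensors), their paths, the flow (None for the test split), the flow path, and the scene name. A rough usage sketch through the prototype datasets API, assuming the builtin is registered under the name "sintel" and that the load entry point and keyword names match the prototype API at the time of this PR:

from torchvision.prototype import datasets

# Hypothetical usage sketch; the prototype API is experimental and may have changed since.
dataset = datasets.load("sintel", split="train", pass_name="clean")

for sample in dataset:
    print(sample["scene"], sample["image1_path"], sample["image2_path"])
    print("flow available:", sample["flow"] is not None)
    break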