【Hackathon 5th No.24】Add SubsetRandomSampler #6285

Merged
merged 2 commits into from
Nov 14, 2023
1 change: 1 addition & 0 deletions docs/api/paddle/io/Overview_cn.rst
@@ -65,6 +65,7 @@ The paddle.io directory contains the PaddlePaddle framework's dataset definition and data loading APIs
" :ref:`SequenceSampler <cn_api_paddle_io_SequenceSampler>` ", "Sequential sampler interface"
" :ref:`RandomSampler <cn_api_paddle_io_RandomSampler>` ", "Random sampler interface"
" :ref:`WeightedRandomSampler <cn_api_paddle_io_WeightedRandomSampler>` ", "Weighted random sampler interface"
" :ref:`SubsetRandomSampler <cn_api_paddle_io_SubsetRandomSampler>` ", "Subset random sampler interface"

.. _about_batch_sampler:

25 changes: 25 additions & 0 deletions docs/api/paddle/io/SubsetRandomSampler_cn.rst
@@ -0,0 +1,25 @@
.. _cn_api_paddle_io_SubsetRandomSampler:

SubsetRandomSampler
-------------------------------

.. py:class:: paddle.io.SubsetRandomSampler(indices, generator=None)

Randomly samples elements from the given list of indices, without replacement.

Parameters

Collaborator:

The parameter descriptions here are inconsistent with the English docs; which one takes precedence? PaddlePaddle/Paddle#57726

Contributor Author:

Fixed; now in sync with the code.

::::::::::

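- **indices** (tuple|list) - The sequence of indices of the subset within the original dataset; must be a list or tuple.
- **generator** (Generator, optional) - The generator used to sample ``data_source``. Defaults to None, meaning no generator is used.

Returns
:::::::::
SubsetRandomSampler, a sampler that randomly samples indices from the given subset.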



Code examples
:::::::::::::

COPY-FROM: paddle.io.SubsetRandomSampler
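
The sampling behaviour described above (each index of the subset drawn exactly once per pass, in random order) can be illustrated with a minimal standard-library sketch. This is not Paddle's implementation: the class name `SubsetRandomSamplerSketch` is hypothetical, and the `seed` argument is an assumption used here only as a stand-in for the `generator` parameter.

```python
import random


class SubsetRandomSamplerSketch:
    """Hypothetical stand-in for illustration; not Paddle's implementation."""

    def __init__(self, indices, seed=None):
        if len(indices) == 0:
            raise ValueError("indices must not be empty")
        self.indices = list(indices)
        # `seed` is an illustrative assumption standing in for `generator`.
        self._rng = random.Random(seed)

    def __iter__(self):
        # Shuffle a copy so every index is yielded exactly once per pass,
        # which is what sampling without replacement means here.
        shuffled = self.indices[:]
        self._rng.shuffle(shuffled)
        return iter(shuffled)

    def __len__(self):
        return len(self.indices)


sampler = SubsetRandomSamplerSketch([1, 3, 5, 7, 9], seed=0)
epoch = list(sampler)  # some permutation of the five indices
```

Iterating the sampler a second time yields another permutation of the same indices, so the subset is fully covered in each pass.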
@@ -0,0 +1,22 @@
## [Parameters fully consistent] torch.utils.data.SubsetRandomSampler

### [torch.utils.data.SubsetRandomSampler](https://pytorch.org/docs/stable/data.html#torch.utils.data.SubsetRandomSampler)

```
torch.utils.data.SubsetRandomSampler(indices, generator=None)
```

### [paddle.io.SubsetRandomSampler](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/api/paddle/io/SubsetRandomSampler_cn.html#paddle.io.SubsetRandomSampler)

```
paddle.io.SubsetRandomSampler(indices, generator=None)
```

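The two have identical parameters, as detailed below:

### Parameter mapping

| PyTorch     | PaddlePaddle | Notes                                                                 |
| ----------- | ------------ | --------------------------------------------------------------------- |
| indices     | indices      | The sequence of indices of the subset within the original dataset; must be a list or tuple. |
| generator   | -            | Paddle has no such parameter; it generally has little effect on training results and can simply be removed. |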
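
As a rough illustration of the note on `generator`, the standard-library sketch below shows that a seed changes only the order in which a subset is visited, not which indices are visited. The helper `permute_subset` and its `seed` argument are hypothetical and belong to neither PyTorch nor Paddle.

```python
import random


def permute_subset(indices, seed=None):
    """Return a random permutation of `indices` (hypothetical helper)."""
    rng = random.Random(seed)  # seed=None falls back to system entropy
    return rng.sample(list(indices), k=len(indices))


subset = [0, 2, 4, 6, 8]
seeded = permute_subset(subset, seed=123)
unseeded = permute_subset(subset)

# Both passes cover exactly the same indices; only the order may differ.
same_coverage = sorted(seeded) == sorted(unseeded) == subset
```

A fixed seed makes the order reproducible across runs, which matters mainly for debugging or exact-reproduction experiments.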