
[Typing] Fix type annotation errors in dynamic_decode and in the examples #67295


Merged
merged 4 commits into from
Aug 13, 2024


megemini
Contributor

@megemini megemini commented Aug 10, 2024

PR Category

User Experience

PR Types

Bug fixes

Description

Fixes all the errors detected in #65397.

Also fixes the annotation errors in dynamic_decode.

Related PRs: #65008, #65397

@SigureMo


paddle-bot bot commented Aug 10, 2024

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@megemini megemini changed the title [Typing] Fix type annotation errors in the examples [Typing all] Fix type annotation errors in the examples Aug 10, 2024
@megemini megemini changed the title [Typing all] Fix type annotation errors in the examples [Typing] Fix type annotation errors in the examples Aug 10, 2024
@paddle-bot paddle-bot bot added the contributor External developers label Aug 10, 2024
@megemini megemini changed the title [Typing] Fix type annotation errors in the examples [Typing] Fix type annotation errors in dynamic_decode and in the examples Aug 11, 2024
@@ -61,7 +61,7 @@
 from paddle import Tensor
 from paddle._typing import PlaceLike
 from paddle._typing.device_like import _Place
-from paddle.base.dataset import DatasetBase
+from paddle.distributed.fleet.dataset.dataset import DatasetBase
Member


Why this DatasetBase? Judging by the example code, shouldn't it be the one under base?

Contributor Author


See this example:

import paddle
paddle.enable_static()
dataset = paddle.distributed.InMemoryDataset()
slots = ["slot1", "slot2", "slot3", "slot4"]
slots_vars = []
for slot in slots:
    var = paddle.static.data(
        name=slot, shape=[None, 1], dtype="int64", lod_level=1)
    slots_vars.append(var)
dataset.init(
    batch_size=1,
    thread_num=2,
    input_type=1,
    pipe_command="cat",
    use_var=slots_vars)
filelist = ["a.txt", "b.txt"]
dataset.set_filelist(filelist)
dataset.load_into_memory()
dataset.global_shuffle()
exe = paddle.static.Executor(paddle.CPUPlace())
startup_program = paddle.static.Program()
main_program = paddle.static.Program()
exe.run(startup_program)
exe.train_from_dataset(main_program, dataset)
dataset.release_memory()

The one under base,

class InMemoryDataset(DatasetBase):
    """
    InMemoryDataset, it will load data into memory
    and shuffle data before training.
    This class should be created by DatasetFactory

    Example:
        dataset = paddle.base.DatasetFactory().create_dataset("InMemoryDataset")
    """

    @deprecated(since="2.0.0", update_to="paddle.distributed.InMemoryDataset")
    def __init__(self):
        """Init."""
        super().__init__()
        self.proto_desc.name = "MultiSlotInMemoryDataFeed"
        self.fleet_send_batch_size = None
        self.is_user_set_queue_num = False
        self.queue_num = None
        self.parse_ins_id = False
        self.parse_content = False
        self.parse_logkey = False
        self.merge_by_sid = True
        self.enable_pv_merge = False
        self.merge_by_lineid = False
        self.fleet_send_sleep_seconds = None
        self.trainer_num = -1
        self.pass_id = 0

has already been deprecated. Or can the dataset argument of train_from_dataset accept both Dataset types?

Member


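Judging from the executor's example code, paddle.base.dataset.Dataset is definitely supported, so let's annotate both. Give one an alias with `as` at import time to avoid the name clash.

What an absurd design. If they keep doing this, we'll just annotate these as Any.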
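
For readers following along, the aliasing suggested above could look roughly like the sketch below. It is not the code that was merged: the alias names `BaseDatasetBase` and `FleetDatasetBase` and the function `describe_dataset` are hypothetical, and only the two module paths come from the diff in this PR. The imports sit under `TYPE_CHECKING`, so the snippet runs even without Paddle installed.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by type checkers. Both classes are named DatasetBase,
    # so each gets a (hypothetical) alias to avoid shadowing the other.
    from paddle.base.dataset import DatasetBase as BaseDatasetBase
    from paddle.distributed.fleet.dataset.dataset import (
        DatasetBase as FleetDatasetBase,
    )


def describe_dataset(
    dataset: BaseDatasetBase | FleetDatasetBase | None = None,
) -> str:
    """Hypothetical stand-in for a method that accepts either dataset family."""
    # The annotation is never evaluated at runtime thanks to the
    # `from __future__ import annotations` import above.
    return type(dataset).__name__
```

Annotating the union keeps both the deprecated base dataset and the fleet dataset acceptable to a type checker, which is the alternative to falling back to `Any`.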

Member

@SigureMo SigureMo left a comment


LGTMeow 🐾

@luotao1 luotao1 merged commit 4eb9d6e into PaddlePaddle:develop Aug 13, 2024
30 checks passed
Jeff114514 pushed a commit to Jeff114514/Paddle that referenced this pull request Aug 14, 2024
* [Fix] typing

* [Fix] typing

* [Fix] typing

* [Fix] DatasetBase from base and fleet

3 participants