@pekopoke

No description provided.

from magic_html import GeneralExtractor

# Initialize the extractor
extractor = GeneralExtractor()

Aren't these examples missing the extractor initialization step?

The examples here mainly demonstrate how to use the newly defined extractor, not the extraction method of the original installed package.

from typing import Dict, Any, Optional
from .base import BaseExtractor, ExtractionResult
from .factory import extractor
from magic_html import GeneralExtractor

Let's put together a requirement.txt file; it looks like there are a lot of dependencies to install.

Also add unit test cases for each type of extractor under the tests directory.

@e06084 e06084 merged commit 07b095c into opendatalab:main Aug 4, 2025
1 check passed