feat: added logs to wikiteam crawler! #17
Conversation
Walkthrough
The pull request updates the WikiteamCrawler's crawl method with logging statements that record the source API URL, the destination dump path, and the parameters passed to DumpGenerator before dump generation begins.
Sequence Diagram(s)
```mermaid
sequenceDiagram
    participant C as Caller
    participant W as WikiteamCrawler
    participant L as Logger
    participant D as DumpGenerator
    C->>W: crawl(api_url, dump_path)
    W->>L: Log "Crawling mediawiki dump with api_url and dump_path"
    W->>L: Log "Parameters for DumpGenerator"
    W->>D: Initialize DumpGenerator with logged parameters
    D-->>W: (Return dump data)
```
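To make the diagram concrete, here is a minimal sketch of the flow it describes, assuming the `crawl` signature from the diagram and the two `logging.info` calls quoted in the review below; the `params` list and the `wikiteam3` import path are illustrative assumptions, not the repository's actual code:

```python
import logging

# Assumed import path; wikiteam3's packaging may differ.
from wikiteam3.dumpgenerator import DumpGenerator


class WikiteamCrawler:
    """Sketch of the crawler from the diagram, not the repository's actual class."""

    def crawl(self, api_url: str, dump_path: str) -> None:
        # Hypothetical CLI-style argument list; the real params come from the repo's config.
        params = ["--api", api_url, "--xml", "--path", dump_path]

        logging.info(f"Crawling mediawiki dump from {api_url} to {dump_path}")
        logging.info(f"Parameters: {params}")

        # DumpGenerator's __init__ parses the parameters and runs the dump workflow.
        DumpGenerator(params)
```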
Actionable comments posted: 1
🧹 Nitpick comments (1)
hivemind_etl/mediawiki/wikiteam_crawler.py (1)
62-64: Consider adding success/completion logging
While you've added useful logging before the crawl operation starts, it would be beneficial to also add a log statement after the DumpGenerator completes its work to indicate success or completion.
```diff
 logging.info(f"Crawling mediawiki dump from {api_url} to {dump_path}")
 logging.info(f"Parameters: {params}")
 # Directly call the DumpGenerator static __init__ method which will parse these parameters,
 # execute the dump generation process, and run through the rest of the workflow.
 DumpGenerator(params)
+logging.info(f"Successfully completed crawling mediawiki dump from {api_url}")
```
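If failures should be as visible as successes, one option beyond the suggested one-line addition is to wrap the call so both outcomes are logged. This is a sketch reusing the names from the diff above (`api_url`, `params`, `DumpGenerator`), not part of the reviewer's suggestion:

```python
import logging

logging.info(f"Crawling mediawiki dump from {api_url} to {dump_path}")
logging.info(f"Parameters: {params}")
try:
    # DumpGenerator's __init__ runs the full dump-generation workflow.
    DumpGenerator(params)
except Exception:
    # logging.exception records the message together with the traceback.
    logging.exception(f"Failed crawling mediawiki dump from {api_url}")
    raise
else:
    logging.info(f"Successfully completed crawling mediawiki dump from {api_url}")
```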
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
hivemind_etl/mediawiki/wikiteam_crawler.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: ci / test / Test
- GitHub Check: ci / lint / Lint
🔇 Additional comments (1)
hivemind_etl/mediawiki/wikiteam_crawler.py (1)
62-64: Good addition of logging statements!
These logs provide valuable context about the crawling operation, including the source URL, destination path, and parameters being used. This will be helpful for debugging and monitoring.
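One practical note: `logging.info` on the root logger only produces output if a handler at INFO level is configured somewhere in the application. A minimal setup (an assumption about how the service might configure logging, not something shown in this PR) looks like:

```python
import logging

# Emit INFO-level records with a timestamped format so the crawl logs are visible.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
```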
Summary by CodeRabbit
- Added logging to the wikiteam crawler to record the crawl's source URL, destination path, and DumpGenerator parameters.