
Conversation

Owner

@LuisJG8 LuisJG8 commented Jan 25, 2026

No description provided.

@LuisJG8 LuisJG8 merged commit ed524a8 into MyProjects Jan 25, 2026
1 check failed
@gemini-code-assist
Contributor

Summary of Changes

Hello @LuisJG8, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the task distribution mechanism within worker.py to enhance the efficiency and idempotency of data retrieval. It introduces parallel processing for the get_github_data task by leveraging Celery's group primitive, allowing a larger data fetching operation to be split into smaller, concurrently executable chunks. The title "From json to pqt" is not directly reflected in the provided patch, which focuses solely on Celery task management.

Highlights

  • Celery Task Distribution: The worker.py file now utilizes Celery's group primitive to distribute get_github_data tasks, enabling parallel processing.
  • Function Renaming and Refactoring: The get_more_data function has been renamed to distribute_tasks and now orchestrates the creation and execution of multiple get_github_data sub-tasks.
  • Improved Data Retrieval Efficiency: Ten get_github_data tasks are now launched concurrently, each processing a chunk of 500 items, significantly improving the efficiency of data retrieval (see the sketch after this list).
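
For context, here is a minimal sketch of the pattern the highlights describe, assuming a standard Celery app; the broker URL, constant names, and task body are placeholders, not the repository's actual code:

from celery import Celery, group

# Placeholder broker URL; the real project may use a different backend.
app = Celery("worker", broker="redis://localhost:6379/0")

CHUNK_SIZE = 500    # items handled by each sub-task (per the summary)
TOTAL_ITEMS = 5000  # ten chunks of 500

@app.task(bind=True)
def get_github_data(self, start_in_repo_num: int = 0):
    # Placeholder body: the real task fetches one chunk of GitHub data
    # starting at start_in_repo_num.
    return start_in_repo_num

@app.task
def distribute_tasks():
    # Fan out ten get_github_data sub-tasks, one per chunk, using
    # Celery's group primitive so they run in parallel.
    group(
        get_github_data.s(start)
        for start in range(0, TOTAL_ITEMS, CHUNK_SIZE)
    ).apply_async()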


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new Celery task distribute_tasks to parallelize fetching data from GitHub by splitting it into a group of sub-tasks. This is a good approach for improving performance and idempotency.

However, I've found a critical issue in the implementation that will cause the tasks to fail at runtime. I've also noted a medium-severity issue regarding the use of magic numbers.

Key points for improvement:

  • Critical Bug: The get_github_data task is called with incorrect arguments, passing an integer where a Github object is expected. This needs to be fixed.
  • Magic Numbers: The range for distributing tasks uses hardcoded numbers, which should be extracted into constants for better readability and maintainability.
  • Incomplete Change: The client.py file, which seems to be the entry point for starting the Celery job, has not been updated to call the new distribute_tasks function. It still calls the old get_github_data task directly, bypassing the new parallelization logic (a sketch of the fix follows below).
  • Misleading Title: The PR title 'From json to pqt' does not seem to reflect the changes made, which are focused on refactoring task distribution rather than changing data formats.

Please see my detailed comments in the code.
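
For illustration, a minimal sketch of what that client-side fix could look like, assuming client.py only needs to dispatch the new task; the module layout is an assumption based on the comment above:

# client.py -- hypothetical sketch; assumes worker.py exposes distribute_tasks
from worker import distribute_tasks

if __name__ == "__main__":
    # Dispatch the fan-out task instead of calling get_github_data directly,
    # so the group-based parallelization in worker.py is actually exercised.
    distribute_tasks.delay()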


worker.py (review context):

jobs = [
    get_github_data.s(start, 500)

critical

There's a critical issue in how the get_github_data task is being called. The task's signature is def get_github_data(self, start_in_repo_num: int = 0, github_instance: Github = gh). When you call get_github_data.s(start, 500), Celery passes start as start_in_repo_num and the integer 500 as github_instance. The github_instance parameter expects a Github object, not an integer. This will cause an AttributeError at runtime when the task tries to use this parameter (e.g., github_instance.get_repos(...)).

To fix this, you should remove the second argument. If 500 was intended as a limit, the get_github_data task signature and implementation would need to be updated to handle it.

Suggested change:

-get_github_data.s(start, 500)
+get_github_data.s(start)
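
For illustration, a minimal sketch of that second option: a hypothetical limit parameter declared before github_instance, so a positional call like get_github_data.s(start, 500) binds 500 to the limit rather than to the client object. The PyGithub import and anonymous client here are placeholders:

from celery import Celery
from github import Github  # PyGithub

app = Celery("worker")  # broker configuration omitted in this sketch
gh = Github()           # placeholder client; real code would authenticate

@app.task(bind=True)
def get_github_data(self, start_in_repo_num: int = 0, limit: int = 500,
                    github_instance: Github = gh):
    # 'limit' (hypothetical) now occupies the second positional slot, so
    # get_github_data.s(start, 500) binds 500 here instead of overwriting
    # github_instance with an integer.
    ...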

worker.py (review context):

jobs = [
    get_github_data.s(start, 500)
    for start in range(0, 5000, 500)

medium

The values 5000 and 500 are used here as magic numbers. For better readability and maintainability, consider defining them as named constants at the module level. For example:

TOTAL_REPOS_TO_PROCESS = 5000
CHUNK_SIZE = 500
# ...
for start in range(0, TOTAL_REPOS_TO_PROCESS, CHUNK_SIZE):

This makes the code's intent clearer and easier to modify in the future.

