
Conversation

@MihirJagtap (Contributor)

No description provided.

@gemini-code-assist (Contributor)

Summary of Changes

Hello @MihirJagtap, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the Hunyuan-GameCraft model into the fastvideo framework, marking a significant step towards interactive game video generation. The core changes involve integrating a specialized transformer model capable of incorporating camera control and action sequences directly into the video synthesis process. The update provides a comprehensive inference pipeline, complete with distributed processing capabilities, FP8 optimization for enhanced performance, and a user-friendly Gradio interface, making it a powerful tool for creating dynamic and controllable video content in game environments.

Highlights

  • New Model Integration: The Hunyuan-GameCraft model has been fully integrated into the fastvideo framework, including its specific configuration (HunyuanGameCraftConfig) and core transformer implementation (HunyuanGameCraftTransformer3DModel).
  • Interactive Game Video Generation: A novel CameraNet module has been introduced to encode camera parameters and integrate them into the video generation process, enabling dynamic and interactive control over viewpoints and movements within generated game videos.
  • Comprehensive Inference Pipeline: A complete inference pipeline is now available, featuring a custom diffusion pipeline (HunyuanVideoGamePipeline), a dedicated sampler (HunyuanVideoSampler), and support for distributed processing, FP8 optimization, and a Gradio UI for interactive user experience.
  • Modular and Efficient Architecture: The pull request adds numerous new modules for activation layers, attention mechanisms (including Rotary Position Embeddings and Flash Attention integration), embedding layers, MLP layers, normalization layers, and utilities for distributed parallel states, ensuring a robust, extensible, and high-performance design.
  • Causal 3D VAE for Video Processing: A AutoencoderKLCausal3D model, along with specialized 3D causal UNet blocks, has been integrated to efficiently encode and decode video latents, supporting temporal and spatial tiling for memory optimization and consistent video generation.


@gemini-code-assist bot left a comment

Code Review

This pull request ports the Hunyuan-GameCraft model to the fastvideo framework. It introduces a significant amount of new code, including model definitions, configurations, inference pipelines, and a Gradio-based user interface. The changes are extensive and cover the full stack from model architecture to a user-facing demo.

My review has identified several issues of varying severity. There is a critical issue in the API server that will cause it to be unresponsive due to a long sleep call. The Gradio application has a high-severity issue related to its use of global state, which will lead to incorrect behavior in multi-session scenarios. There are also a few high and medium severity issues in the model and data loading code that should be addressed to ensure correctness and robustness. Overall, the porting effort is substantial, but these key issues need to be resolved before merging.

with open(trigger_tmp, 'w') as f:
    json.dump(trigger_data, f, indent=2, ensure_ascii=False)
os.replace(trigger_tmp, TRIGGER_FILE)  # Atomically replace/create
time.sleep(100)

critical

This time.sleep(100) call will block the API endpoint for 100 seconds, which will cause any client requests to time out and render the API server unresponsive. This appears to be leftover debugging code and should be removed.
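A minimal suggested change, assuming the sleep serves no other purpose:

Suggested change
  os.replace(trigger_tmp, TRIGGER_FILE)  # Atomically replace/create
- time.sleep(100)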

Comment on lines 531 to 532
latents = x0 * t + x1 * (1 - t)
latents = torch.randn_like(x1)

high

The latents variable calculated from linear interpolation on line 531 is immediately overwritten on line 532. This makes the interpolation logic ineffective and appears to be a bug. The TODO: correct comment on line 524 also suggests this section may be incomplete or incorrect. Please review this logic to ensure the intended latent initialization is performed.

Suggested change
  latents = x0 * t + x1 * (1 - t)
- latents = torch.randn_like(x1)
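For reference, a minimal runnable sketch of the intended flow-matching-style initialization, assuming x0 is the clean latent and x1 is Gaussian noise (conventions vary between implementations, and the tensor shape below is illustrative only):

import torch

t = 0.8                              # flow time in [0, 1]
x0 = torch.randn(1, 16, 9, 32, 32)   # stand-in for encoded video latents
x1 = torch.randn_like(x0)            # Gaussian noise sample
latents = x0 * t + x1 * (1 - t)      # interpolate; must not be overwritten afterwards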

Comment on lines 44 to 47
previous_video_path = None
previous_first_frame = None
all_video_paths = []
current_input_type = None # record input type: "image" or "video"

high

The application uses global variables (previous_video_path, previous_first_frame, all_video_paths, current_input_type) to manage state between user interactions. This is problematic in a web application as it creates shared state across all user sessions, leading to race conditions and incorrect behavior. Please refactor this to use gr.State for managing session-specific state. All variables that need to be preserved between interactions for a single user session should be passed as inputs and outputs to your Gradio functions via gr.State components.

For example:

# In the UI definition
...
video_state = gr.State(value={"previous_video_path": None, "all_video_paths": []})
...

# In the click handler
generate_btn.click(
    fn=generate_custom_video,
    inputs=[..., video_state],
    outputs=[..., video_state]
)

# In the function definition
def generate_custom_video(..., video_state_dict):
    # read from video_state_dict
    previous_video_path = video_state_dict["previous_video_path"]
    ...
    # update state
    video_state_dict["previous_video_path"] = new_video_path
    ...
    return ..., video_state_dict
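Note that gr.State keeps a separate copy of its value for each browser session, so concurrent users no longer interfere with one another; the dictionary returned from the handler becomes that session's state for the next interaction.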

})

# Reverse mapping for saving checkpoints: custom -> hf
reverse_param_names_mapping: dict = field(default_factory=lambda: {})

medium

The reverse_param_names_mapping is initialized as an empty dictionary. The comment indicates it's for saving checkpoints. If saving checkpoints is intended to be functional with this PR, this dictionary should be populated. If it's a work-in-progress, consider adding a TODO comment to make it clear that this functionality is not yet implemented.
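If the forward mapping exists, the reverse can be derived mechanically instead of maintained by hand. A minimal sketch, assuming the forward hf -> custom mapping is one-to-one and exposed as a dict (param_names_mapping is a hypothetical name here):

# Hypothetical forward (hf -> custom) mapping; real keys come from the config.
param_names_mapping = {
    "transformer_blocks.0.attn.to_q.weight": "blocks.0.attention.wq.weight",
}

# Derive the reverse (custom -> hf) mapping used when saving checkpoints.
reverse_param_names_mapping = {v: k for k, v in param_names_mapping.items()}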

Comment on lines 200 to 202
except:
    # Fallback: PIL for special formats
    face_image_masked = Image.open(image_path).convert('RGB')

medium

Using a bare except: clause is generally discouraged as it can catch unexpected exceptions like SystemExit or KeyboardInterrupt, making the program harder to debug or terminate. It's better to catch a more specific exception, such as Exception.

Suggested change
- except:
+ except Exception:
      # Fallback: PIL for special formats
      face_image_masked = Image.open(image_path).convert('RGB')
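If the primary read path is known, the handler can be narrower still. A hypothetical sketch (the try body is not visible in this diff, so the OpenCV-based primary path is an assumption):

import cv2
from PIL import Image

try:
    bgr = cv2.imread(image_path)
    if bgr is None:  # cv2.imread signals failure by returning None, not raising
        raise ValueError(f"cv2 could not decode {image_path}")
    face_image_masked = Image.fromarray(cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB))
except (ValueError, cv2.error):
    # Fallback: PIL for special formats
    face_image_masked = Image.open(image_path).convert('RGB')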

if CPU_OFFLOAD: torch.cuda.empty_cache()

camera_condition = cam_latents
assert camera_condition is not None, print("plucker_embedding is not provided")

medium

The assert statement uses a print() call for its message. When the assertion fails, Python evaluates print(...), which writes the text to stdout and returns None, so the AssertionError is raised with no message attached. The message should be passed directly as the second argument to assert.

Suggested change
- assert camera_condition is not None, print("plucker_embedding is not provided")
+ assert camera_condition is not None, "plucker_embedding is not provided"
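A quick runnable demonstration of the difference:

camera_condition = None

try:
    assert camera_condition is not None, print("plucker_embedding is not provided")
except AssertionError as e:
    print(repr(e))  # AssertionError(None) -- the message went to stdout instead

try:
    assert camera_condition is not None, "plucker_embedding is not provided"
except AssertionError as e:
    print(repr(e))  # AssertionError('plucker_embedding is not provided')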
