Closed
Labels
bug: Something isn't working
Description
Summary
I ran into the error below while trying to run the Krea model (the krea_realtime_video pipeline). When it occurs, the output video is extremely slow.
Screen.Recording.2025-11-24.at.12.55.47.AM.mov
Error:
2025-11-24T06:50:03.707696092Z 2025-11-24 06:50:03,706 - scope.server.frame_processor - ERROR - Error processing chunk: LoweringException: AttributeError: 'ShapeAsConstantBuffer' object has no attribute 'dtype'
2025-11-24T06:50:03.707700975Z target: flex_attention
2025-11-24T06:50:03.707705647Z args[0]: TensorBox(StorageBox(
2025-11-24T06:50:03.707712082Z ComputedBuffer(name='buf5', layout=FlexibleLayout('cuda:0', torch.bfloat16, size=[1, 40, 128*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128], stride=[655360*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 16384*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.bfloat16, inner_fn=<function BaseView.make_loader.<locals>.loader at 0x759fcdee3d00>, ranges=[1, 40, 128*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128]))
2025-11-24T06:50:03.707717092Z ))
2025-11-24T06:50:03.707721767Z args[1]: TensorBox(StorageBox(
2025-11-24T06:50:03.707726512Z ComputedBuffer(name='buf6', layout=FlexibleLayout('cuda:0', torch.bfloat16, size=[1, 40, 128*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128], stride=[655360*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 16384*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.bfloat16, inner_fn=<function BaseView.make_loader.<locals>.loader at 0x759e583e4700>, ranges=[1, 40, 128*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128]))
2025-11-24T06:50:03.707731309Z ))
2025-11-24T06:50:03.707735926Z args[2]: TensorBox(StorageBox(
2025-11-24T06:50:03.707741585Z ComputedBuffer(name='buf7', layout=FlexibleLayout('cuda:0', torch.bfloat16, size=[1, 40, 128*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128], stride=[655360*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 16384*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.bfloat16, inner_fn=<function BaseView.make_loader.<locals>.loader at 0x759e583e55a0>, ranges=[1, 40, 128*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128]))
2025-11-24T06:50:03.707751018Z ))
2025-11-24T06:50:03.707756166Z args[3]: Subgraph(name='sdpa_score0', graph_module=<lambda>(), graph=None)
2025-11-24T06:50:03.707760923Z args[4]: (1, 1, TensorBox(StorageBox(
2025-11-24T06:50:03.707765665Z ComputedBuffer(name='buf8', layout=FlexibleLayout('cuda:0', torch.int32, size=[1, 1, 1], stride=[1, 1, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.int32, inner_fn=<function _full.<locals>.inner_fn at 0x759e583e7370>, ranges=[1, 1, 1]))
2025-11-24T06:50:03.707770496Z )), TensorBox(StorageBox(
2025-11-24T06:50:03.707775157Z ComputedBuffer(name='buf9', layout=FlexibleLayout('cuda:0', torch.int32, size=[1, 1, 1, 1], stride=[1, 1, 1, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.int32, inner_fn=<function _full.<locals>.inner_fn at 0x759e583e5870>, ranges=[1, 1, 1, 1]))
2025-11-24T06:50:03.707779978Z )), None, None, TensorBox(StorageBox(
2025-11-24T06:50:03.707784569Z ComputedBuffer(name='buf10', layout=FlexibleLayout('cuda:0', torch.int32, size=[1, 1, 1], stride=[1, 1, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.int32, inner_fn=<function make_pointwise.<locals>.inner.<locals>.inner_fn at 0x759fcdee27a0>, ranges=[1, 1, 1]))
2025-11-24T06:50:03.707789306Z )), TensorBox(StorageBox(
2025-11-24T06:50:03.707794306Z ComputedBuffer(name='buf11', layout=FlexibleLayout('cuda:0', torch.int32, size=[1, 1, 1, 1], stride=[1, 1, 1, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.int32, inner_fn=<function make_pointwise.<locals>.inner.<locals>.inner_fn at 0x759fcdee0040>, ranges=[1, 1, 1, 1]))
2025-11-24T06:50:03.707799113Z )), None, None, 1073741824, 1073741824, Subgraph(name='sdpa_mask0', graph_module=<lambda>(), graph=None))
2025-11-24T06:50:03.707803803Z args[5]: 0.08838834764831843
2025-11-24T06:50:03.707808346Z args[6]: {'PRESCALE_QK': False, 'ROWS_GUARANTEED_SAFE': False, 'BLOCKS_ARE_CONTIGUOUS': False, 'WRITE_DQ': True, 'OUTPUT_LOGSUMEXP': False}
2025-11-24T06:50:03.707813072Z args[7]: (TensorBox(StorageBox(
2025-11-24T06:50:03.707817609Z ComputedBuffer(name='buf4', layout=FlexibleLayout('cuda:0', torch.int32, size=[], stride=[]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.int32, inner_fn=<function _full.<locals>.inner_fn at 0x759fcdee00d0>, ranges=[]))
2025-11-24T06:50:03.707823124Z )), -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160))
2025-11-24T06:50:03.707827952Z args[8]: ()
2025-11-24T06:50:03.707837419Z Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
2025-11-24T06:50:03.707842070Z Traceback (most recent call last):
2025-11-24T06:50:03.707846587Z File "/app/src/scope/server/frame_processor.py", line 285, in process_chunk
2025-11-24T06:50:03.707851420Z output = pipeline(**call_params)
2025-11-24T06:50:03.707855943Z File "/app/src/scope/core/pipelines/krea_realtime_video/pipeline.py", line 161, in __call__
2025-11-24T06:50:03.707860427Z return self._generate(**kwargs)
2025-11-24T06:50:03.707864911Z File "/app/src/scope/core/pipelines/krea_realtime_video/pipeline.py", line 184, in _generate
2025-11-24T06:50:03.707869728Z _, self.state = self.blocks(self.components, self.state)
2025-11-24T06:50:03.707874294Z File "/app/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
2025-11-24T06:50:03.707879064Z return func(*args, **kwargs)
2025-11-24T06:50:03.707883578Z File "/app/.venv/lib/python3.10/site-packages/diffusers/modular_pipelines/modular_pipeline.py", line 917, in __call__
2025-11-24T06:50:03.707892360Z pipeline, state = block(pipeline, state)
2025-11-24T06:50:03.707897124Z File "/app/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
2025-11-24T06:50:03.707901849Z return func(*args, **kwargs)
2025-11-24T06:50:03.707906391Z File "/app/src/scope/core/pipelines/wan2_1/blocks/denoise.py", line 163, in __call__
2025-11-24T06:50:03.707911255Z _, denoised_pred = components.generator(
2025-11-24T06:50:03.707916001Z File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
2025-11-24T06:50:03.707920708Z return self._call_impl(*args, **kwargs)
2025-11-24T06:50:03.707925501Z File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
2025-11-24T06:50:03.707930512Z return forward_call(*args, **kwargs)
2025-11-24T06:50:03.707935055Z File "/app/src/scope/core/pipelines/wan2_1/components/generator.py", line 218, in forward
2025-11-24T06:50:03.707939599Z flow_pred = self._call_model(
2025-11-24T06:50:03.707944145Z File "/app/src/scope/core/pipelines/wan2_1/components/generator.py", line 189, in _call_model
2025-11-24T06:50:03.707948701Z return self.model(*args, **accepted)
2025-11-24T06:50:03.707956570Z File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
2025-11-24T06:50:03.707961365Z return self._call_impl(*args, **kwargs)
2025-11-24T06:50:03.707967096Z File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
2025-11-24T06:50:03.707971909Z return forward_call(*args, **kwargs)
2025-11-24T06:50:03.707976625Z File "/app/src/scope/core/pipelines/krea_realtime_video/modules/causal_model.py", line 1439, in forward
2025-11-24T06:50:03.707981452Z result = self._forward_inference(*args, **kwargs)
2025-11-24T06:50:03.707985972Z File "/app/src/scope/core/pipelines/krea_realtime_video/modules/causal_model.py", line 1237, in _forward_inference
2025-11-24T06:50:03.707990802Z x = block(x, **kwargs)
2025-11-24T06:50:03.707998069Z File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1771, in _wrapped_call_impl
2025-11-24T06:50:03.708003007Z return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
2025-11-24T06:50:03.708007715Z File "/app/.venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 749, in compile_wrapper
2025-11-24T06:50:03.708012430Z raise e.remove_dynamo_frames() from None # see TORCHDYNAMO_VERBOSE=1
2025-11-24T06:50:03.708017030Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 923, in _compile_fx_inner
2025-11-24T06:50:03.708021697Z raise InductorError(e, currentframe()).with_traceback(
2025-11-24T06:50:03.708026216Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 907, in _compile_fx_inner
2025-11-24T06:50:03.708030870Z mb_compiled_graph = fx_codegen_and_compile(
2025-11-24T06:50:03.708035341Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1578, in fx_codegen_and_compile
2025-11-24T06:50:03.708039971Z return scheme.codegen_and_compile(gm, example_inputs, inputs_to_check, graph_kwargs)
2025-11-24T06:50:03.708044561Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1377, in codegen_and_compile
2025-11-24T06:50:03.708049121Z graph.run(*example_inputs)
2025-11-24T06:50:03.708053743Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 921, in run
2025-11-24T06:50:03.708058230Z return super().run(*args)
2025-11-24T06:50:03.708062807Z File "/app/.venv/lib/python3.10/site-packages/torch/fx/interpreter.py", line 173, in run
2025-11-24T06:50:03.708067381Z self.env[node] = self.run_node(node)
2025-11-24T06:50:03.708071958Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1599, in run_node
2025-11-24T06:50:03.708076591Z result = super().run_node(n)
2025-11-24T06:50:03.708081457Z File "/app/.venv/lib/python3.10/site-packages/torch/fx/interpreter.py", line 242, in run_node
2025-11-24T06:50:03.708090198Z return getattr(self, n.op)(n.target, args, kwargs)
2025-11-24T06:50:03.708095022Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1268, in call_function
2025-11-24T06:50:03.708100471Z raise LoweringException(e, target, args, kwargs).with_traceback(
2025-11-24T06:50:03.708105084Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1258, in call_function
2025-11-24T06:50:03.708109898Z out = lowerings[target](*args, **kwargs) # type: ignore[index]
2025-11-24T06:50:03.708114418Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/lowering.py", line 446, in wrapped
2025-11-24T06:50:03.708118951Z out = decomp_fn(*args, **kwargs)
2025-11-24T06:50:03.708123492Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/kernel/flex_attention.py", line 1413, in flex_attention
2025-11-24T06:50:03.708128374Z score_mod_other_buffers = maybe_realize(score_mod_other_buffers)
2025-11-24T06:50:03.708132929Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/kernel/flex_attention.py", line 128, in maybe_realize
2025-11-24T06:50:03.708140979Z return tree_map(
2025-11-24T06:50:03.708145699Z File "/app/.venv/lib/python3.10/site-packages/torch/utils/_pytree.py", line 1380, in tree_map
2025-11-24T06:50:03.708150328Z return treespec.unflatten(map(func, *flat_args))
2025-11-24T06:50:03.708154818Z File "/app/.venv/lib/python3.10/site-packages/torch/utils/_pytree.py", line 1197, in unflatten
2025-11-24T06:50:03.708163032Z leaves = list(leaves)
2025-11-24T06:50:03.708167638Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/kernel/flex_attention.py", line 130, in <lambda>
2025-11-24T06:50:03.708172396Z realize_inputs(x)
2025-11-24T06:50:03.708176895Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/select_algorithm.py", line 3146, in realize_inputs
2025-11-24T06:50:03.708181516Z return ir.ExternKernel.require_stride1(ir.ExternKernel.realize_input(args[0]))
2025-11-24T06:50:03.708186038Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/ir.py", line 5496, in require_stride1
2025-11-24T06:50:03.708190758Z return cls.copy_input(x)
2025-11-24T06:50:03.708195325Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/ir.py", line 5257, in copy_input
2025-11-24T06:50:03.708199895Z dtype=x.get_dtype(),
2025-11-24T06:50:03.708204444Z File "/app/.venv/lib/python3.10/site-packages/torch/_inductor/ir.py", line 581, in get_dtype
2025-11-24T06:50:03.708209033Z return self.dtype
2025-11-24T06:50:03.708213446Z torch._inductor.exc.InductorError: LoweringException: AttributeError: 'ShapeAsConstantBuffer' object has no attribute 'dtype'
2025-11-24T06:50:03.708218056Z target: flex_attention
2025-11-24T06:50:03.708222597Z args[0]: TensorBox(StorageBox(
2025-11-24T06:50:03.708228820Z ComputedBuffer(name='buf5', layout=FlexibleLayout('cuda:0', torch.bfloat16, size=[1, 40, 128*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128], stride=[655360*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 16384*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.bfloat16, inner_fn=<function BaseView.make_loader.<locals>.loader at 0x759fcdee3d00>, ranges=[1, 40, 128*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128]))
2025-11-24T06:50:03.708233696Z ))
2025-11-24T06:50:03.708238884Z args[1]: TensorBox(StorageBox(
2025-11-24T06:50:03.708243740Z ComputedBuffer(name='buf6', layout=FlexibleLayout('cuda:0', torch.bfloat16, size=[1, 40, 128*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128], stride=[655360*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 16384*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.bfloat16, inner_fn=<function BaseView.make_loader.<locals>.loader at 0x759e583e4700>, ranges=[1, 40, 128*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128]))
2025-11-24T06:50:03.708252266Z ))
2025-11-24T06:50:03.708257130Z args[2]: TensorBox(StorageBox(
2025-11-24T06:50:03.708261839Z ComputedBuffer(name='buf7', layout=FlexibleLayout('cuda:0', torch.bfloat16, size=[1, 40, 128*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128], stride=[655360*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 16384*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.bfloat16, inner_fn=<function BaseView.make_loader.<locals>.loader at 0x759e583e55a0>, ranges=[1, 40, 128*CeilToInt(IntTrueDiv(Max(2160, -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160) + 2160), 128)), 128]))
2025-11-24T06:50:03.708266630Z ))
2025-11-24T06:50:03.708272041Z args[3]: Subgraph(name='sdpa_score0', graph_module=<lambda>(), graph=None)
2025-11-24T06:50:03.708276775Z args[4]: (1, 1, TensorBox(StorageBox(
2025-11-24T06:50:03.708281401Z ComputedBuffer(name='buf8', layout=FlexibleLayout('cuda:0', torch.int32, size=[1, 1, 1], stride=[1, 1, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.int32, inner_fn=<function _full.<locals>.inner_fn at 0x759e583e7370>, ranges=[1, 1, 1]))
2025-11-24T06:50:03.708286491Z )), TensorBox(StorageBox(
2025-11-24T06:50:03.708291125Z ComputedBuffer(name='buf9', layout=FlexibleLayout('cuda:0', torch.int32, size=[1, 1, 1, 1], stride=[1, 1, 1, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.int32, inner_fn=<function _full.<locals>.inner_fn at 0x759e583e5870>, ranges=[1, 1, 1, 1]))
2025-11-24T06:50:03.708298615Z )), None, None, TensorBox(StorageBox(
2025-11-24T06:50:03.708303191Z ComputedBuffer(name='buf10', layout=FlexibleLayout('cuda:0', torch.int32, size=[1, 1, 1], stride=[1, 1, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.int32, inner_fn=<function make_pointwise.<locals>.inner.<locals>.inner_fn at 0x759fcdee27a0>, ranges=[1, 1, 1]))
2025-11-24T06:50:03.708307901Z )), TensorBox(StorageBox(
2025-11-24T06:50:03.708312785Z ComputedBuffer(name='buf11', layout=FlexibleLayout('cuda:0', torch.int32, size=[1, 1, 1, 1], stride=[1, 1, 1, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.int32, inner_fn=<function make_pointwise.<locals>.inner.<locals>.inner_fn at 0x759fcdee0040>, ranges=[1, 1, 1, 1]))
2025-11-24T06:50:03.708317572Z )), None, None, 1073741824, 1073741824, Subgraph(name='sdpa_mask0', graph_module=<lambda>(), graph=None))
2025-11-24T06:50:03.708322267Z args[5]: 0.08838834764831843
2025-11-24T06:50:03.708327606Z args[6]: {'PRESCALE_QK': False, 'ROWS_GUARANTEED_SAFE': False, 'BLOCKS_ARE_CONTIGUOUS': False, 'WRITE_DQ': True, 'OUTPUT_LOGSUMEXP': False}
2025-11-24T06:50:03.708332315Z args[7]: (TensorBox(StorageBox(
2025-11-24T06:50:03.708336856Z ComputedBuffer(name='buf4', layout=FlexibleLayout('cuda:0', torch.int32, size=[], stride=[]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.int32, inner_fn=<function _full.<locals>.inner_fn at 0x759fcdee00d0>, ranges=[]))
2025-11-24T06:50:03.708341506Z )), -s27 + s30 + s47 - Max(0, -s27 + s30 + s47 - 2160))
2025-11-24T06:50:03.708346132Z args[8]: ()
2025-11-24T06:50:03.708355995Z Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
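The last log line asks for TORCHDYNAMO_VERBOSE=1 and TORCH_LOGS="+dynamo" when reporting this to PyTorch. Below is a minimal sketch of one way to set them from Python before reproducing. The helper name enable_compile_debug is hypothetical and is not part of Scope or PyTorch; only the two variable names and values come from the log. It assumes the variables should be in place before torch is imported, which is the conservative ordering.

```python
import os

# Variable names and values are taken from the hint at the end of the log.
DEBUG_ENV = {
    "TORCHDYNAMO_VERBOSE": "1",
    "TORCH_LOGS": "+dynamo",
}


def enable_compile_debug(env=os.environ):
    """Hypothetical helper: set the debug variables and return their previous values."""
    previous = {name: env.get(name) for name in DEBUG_ENV}
    env.update(DEBUG_ENV)
    return previous


if __name__ == "__main__":
    enable_compile_debug()
    # Import torch and start the pipeline only after this point, so the
    # settings are already present when torch reads its configuration.
```

Exporting the same two variables in the shell or the container environment before starting the server should have the same effect. The returned mapping makes it possible to restore the earlier values once the verbose trace has been captured.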
Platform
Ubuntu
Nvidia GPU
H100
Scope Version
uv Version
No response
node Version
No response