Error while running `video_understanding` demo #200

Hello!

While trying to run the `video_understanding` demo, I got this error:
```
Please input the video path:(Waiting for input. Your input can only be text or image path each time, you can press Enter once to input multiple times. Press Enter twice to finish the entire
 input.):
>>>/root/test.mov
>>>what is inside?
>>>
INFO | 2025-01-23 15:12:21,107 | /root/github/OmAgent/examples/video_understanding/run_cli.py:79 | <module> | User input lines: ['/root/test.mov', 'what is inside?']
INFO | 2025-01-23 15:12:21,457 | /root/github/OmAgent/omagent-core/src/omagent_core/clients/devices/app/input.py:79 | read_input | Received message: {'payload': '{"agent_id": "57921ba3-b050-4138-bac7-0cfd4b885553", "messages": [{"role": "user", "content": [{"type": "image_url", "data": "/root/test.mov"}, {"type": "text", "data": "what is inside?"}]}], "kwargs": {}}'}
  Detected: 0 | Progress:   0%|                                                                                                                                  | 0/582 [00:00<?, ?frames/s]2025-01-23 15:12:21,555 pyscenedetect INFO     Detecting scenes...
  Detected: 0 | Progress: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊| 581/582 [00:01<00:00, 567.87frames/s]2025-01-23 15:12:22,714 [3113034] omagent_core.engine.worker.base ERROR    Error executing task VideoPreprocessor with id de26c1ab-748a-40c6-a641-c4147b1c6b6c.  error = Traceback (most recent call last):
  File "/root/github/OmAgent/omagent-core/src/omagent_core/engine/worker/base.py", line 141, in execute
    task_output = self._run(**task_input)
                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/github/OmAgent/examples/video_understanding/agent/video_preprocessor/video_preprocess.py", line 182, in _run
    video = VideoScenes.load(
            ^^^^^^^^^^^^^^^^^
  File "/root/github/OmAgent/examples/video_understanding/agent/misc/scene.py", line 98, in load
    audio = AudioSegment.from_file(video_path)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/omagent/lib64/python3.11/site-packages/pydub/audio_segment.py", line 728, in from_file
    info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/omagent/lib64/python3.11/site-packages/pydub/utils.py", line 279, in mediainfo_json
    info = json.loads(output)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 1 (char 2)

```

I am using local models. For STT and embedding, I followed the example from #197.

Version: `v0.2.2`
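
The traceback ends inside pydub's `mediainfo_json`, where the output of the external prober is passed to `json.loads`. The reported position (line 2, column 1, char 2) is what `json.loads` gives for input that stops right after an opening `{` and a newline, so my guess is that the prober printed an incomplete document for this file. That is an inference and I have not confirmed it. Below is a minimal sketch for checking it outside OmAgent. The helper names `explain_probe_output` and `run_prober` are hypothetical, and the `ffprobe` flags are my assumption of roughly what pydub passes; they may differ between pydub versions.

```python
import json
import shutil
import subprocess


def explain_probe_output(output: str) -> str:
    """Summarize whether a prober's stdout parses as JSON."""
    try:
        info = json.loads(output)
    except json.JSONDecodeError as exc:
        # Report where parsing stopped and show the raw text that was received.
        return f"invalid JSON at char {exc.pos}: {exc.msg}; raw stdout: {output!r}"
    streams = info.get("streams", []) if isinstance(info, dict) else []
    return f"valid JSON with {len(streams)} stream(s)"


def run_prober(path: str, prober: str = "ffprobe") -> str:
    """Run the prober on `path` and return a report that includes stderr."""
    if shutil.which(prober) is None:
        return f"{prober} not found on PATH"
    # Assumption: these flags approximate the command pydub builds internally.
    cmd = [prober, "-of", "json", "-v", "info",
           "-show_format", "-show_streams", path]
    result = subprocess.run(cmd, capture_output=True, text=True, errors="ignore")
    return (
        f"exit code: {result.returncode}\n"
        f"{explain_probe_output(result.stdout)}\n"
        f"stderr:\n{result.stderr}"
    )
```

Running `print(run_prober("/root/test.mov"))` in the same virtual environment should show whether the prober itself fails on this file, and its stderr may say why.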

If you need more details, please let me know.
