-
Notifications
You must be signed in to change notification settings - Fork 0
refactor: unify audio pipeline with FFmpeg-based decoder and encoder #12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add FFmpegDecoder supporting all common audio formats (MP3, FLAC, WAV, OGG, AAC) - Update StreamingReader to use FFmpeg with fallback to pure Go decoders - Implement seeking capability for efficient two-pass processing - Support both planar and interleaved audio formats with automatic stereo-to-mono downmixing This eliminates double-decode overhead and addresses end-of-encode delay by following the FIFO design approach.
- Add a new SharedAudioBuffer to provide thread-safe buffer for audio samples - Implement independent read positions for FFT analysis and AAC encoder - Support both blocking and non-blocking read operations - Include buffer management features like compaction and reset - Add comprehensive test coverage for the shared buffer - Fix formatting in ffmpeg_decoder.go sample slice declarations
Resolve conflict between Git submodule and Go's vendoring system. Placing the submodule in vendor/ triggered Go's vendoring mode, requiring -mod=mod flags everywhere and causing "inconsistent vendoring" errors when vendor/modules.txt got out of sync. Changes: - Move submodule to third_party/ffmpeg-statigo - Update go.mod replace directive - Remove all -mod=mod flags from justfile and CI workflows - Update ffmpeg-statigo documentation to recommend third_party/ The third_party/ directory is the conventional location for external C/CGO libraries in Go projects, keeping vendor/ free for go mod vendor.
- Add support for pre-decoded audio samples with WriteAudioSamples() - Implement new audio encoder initialization for sample-based input - Add audio FIFO buffering for proper frame handling - Include FlushAudioEncoder method to handle remaining samples - Add SampleRate configuration parameter for new mode - Maintain backward compatibility with file-based audio input
- Replace PTS-based audio processing with direct sample-based approach - Add WriteAudioSamples method to send audio samples alongside video frames - Simplify audio flush process with dedicated FlushAudioEncoder method - Use audio file's native sample rate rather than hardcoded value This change improves synchronization between audio and video by handling audio samples at the same time as their corresponding video frames.
- Remove pure Go decoders (WAV, MP3, FLAC) in favor of unified FFmpeg decoder - Update architecture documentation to reflect the unified audio pipeline - Remove audioFlushTime performance metric (no longer needed with single decoder) - Clean up dependencies in go.mod This simplifies the codebase while providing broader format support.
- Removes minimum display time and completion delays from Pass1Model - Tracks and displays thumbnail generation time in the summary - Improves speed calculation in Pass2 to show accurate realtime multiplier - Renames "Initialization" to "Other" in time breakdown for clarity - Simplifies video duration display format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6 issues found across 19 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="internal/audio/ffmpeg_decoder.go">
<violation number="1" location="internal/audio/ffmpeg_decoder.go:185">
P2: Inconsistent error handling: `ret` from `AVCodecSendPacket` is assigned but never checked. Other FFmpeg calls in this file check both `err != nil` and `ret < 0` for complete error handling.</violation>
</file>
<file name="internal/audio/shared_buffer.go">
<violation number="1" location="internal/audio/shared_buffer.go:70">
P2: Documentation says 'Returns io.EOF' but the code returns `ErrBufferClosed`. This inconsistency can cause bugs when callers check for `io.EOF` to detect end-of-stream. Either update the comments to reference `ErrBufferClosed`, or change the error to use `io.EOF` (which would require importing the `io` package).</violation>
<violation number="2" location="internal/audio/shared_buffer.go:221">
P1: `Reset()` doesn't call `cond.Broadcast()` after changing state. If a consumer is blocked in `ReadForEncoder`'s `cond.Wait()`, it will remain stuck indefinitely after Reset() is called, potentially causing a deadlock. Add `b.cond.Broadcast()` to wake up blocked consumers so they can re-evaluate the new state.</violation>
</file>
<file name="internal/encoder/encoder.go">
<violation number="1" location="internal/encoder/encoder.go:477">
P1: Error handling logic silently swallows real encoder errors. The condition `err != nil || errors.Is(err, ...)` breaks on ANY error. The intended logic should only break on EAgain/EOF and return errors for real failures.</violation>
<violation number="2" location="internal/encoder/encoder.go:568">
P2: Return value of `AVInterleavedWriteFrame` is ignored. Unlike `WriteAudioSamples()`, this doesn't check for write errors, which could silently produce corrupted output.</violation>
</file>
<file name="cmd/jivefire/main.go">
<violation number="1" location="cmd/jivefire/main.go:498">
P2: Repeated allocation in hot loop. Consider pre-allocating `audioSamplesBuffer` outside the frame loop and reusing it with slicing (`audioSamples := audioSamplesBuffer[:len(newSamples)]`), consistent with how `barHeights` and other buffers are pre-allocated at lines 273-280.</violation>
</file>
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Ask questions if you need clarification on any suggestion
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| b.samples = b.samples[:0] | ||
| b.fftReadPos = 0 | ||
| b.encoderReadPos = 0 | ||
| b.closed = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P1: Reset() doesn't call cond.Broadcast() after changing state. If a consumer is blocked in ReadForEncoder's cond.Wait(), it will remain stuck indefinitely after Reset() is called, potentially causing a deadlock. Add b.cond.Broadcast() to wake up blocked consumers so they can re-evaluate the new state.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/audio/shared_buffer.go, line 221:
<comment>`Reset()` doesn't call `cond.Broadcast()` after changing state. If a consumer is blocked in `ReadForEncoder`'s `cond.Wait()`, it will remain stuck indefinitely after Reset() is called, potentially causing a deadlock. Add `b.cond.Broadcast()` to wake up blocked consumers so they can re-evaluate the new state.</comment>
<file context>
@@ -0,0 +1,246 @@
+ b.samples = b.samples[:0]
+ b.fftReadPos = 0
+ b.encoderReadPos = 0
+ b.closed = false
+}
+
</file context>
| b.closed = false | |
| b.closed = false | |
| b.cond.Broadcast() // Wake up any blocked consumers to re-evaluate state |
| if err != nil || errors.Is(err, ffmpeg.EAgain) || errors.Is(err, ffmpeg.AVErrorEOF) { | ||
| ffmpeg.AVPacketFree(&encodedPkt) | ||
| break | ||
| } | ||
| if ret < 0 { | ||
| ffmpeg.AVPacketFree(&encodedPkt) | ||
| return fmt.Errorf("failed to receive audio packet from encoder: %d", ret) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P1: Error handling logic silently swallows real encoder errors. The condition err != nil || errors.Is(err, ...) breaks on ANY error. The intended logic should only break on EAgain/EOF and return errors for real failures.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/encoder/encoder.go, line 477:
<comment>Error handling logic silently swallows real encoder errors. The condition `err != nil || errors.Is(err, ...)` breaks on ANY error. The intended logic should only break on EAgain/EOF and return errors for real failures.</comment>
<file context>
@@ -540,6 +418,160 @@ func (e *Encoder) WriteFrame(rgbData []byte) error {
+ for {
+ encodedPkt := ffmpeg.AVPacketAlloc()
+ ret, err = ffmpeg.AVCodecReceivePacket(e.audioCodec, encodedPkt)
+ if err != nil || errors.Is(err, ffmpeg.EAgain) || errors.Is(err, ffmpeg.AVErrorEOF) {
+ ffmpeg.AVPacketFree(&encodedPkt)
+ break
</file context>
| if err != nil || errors.Is(err, ffmpeg.EAgain) || errors.Is(err, ffmpeg.AVErrorEOF) { | |
| ffmpeg.AVPacketFree(&encodedPkt) | |
| break | |
| } | |
| if ret < 0 { | |
| ffmpeg.AVPacketFree(&encodedPkt) | |
| return fmt.Errorf("failed to receive audio packet from encoder: %d", ret) | |
| } | |
| if errors.Is(err, ffmpeg.EAgain) || errors.Is(err, ffmpeg.AVErrorEOF) { | |
| ffmpeg.AVPacketFree(&encodedPkt) | |
| break | |
| } | |
| if err != nil || ret < 0 { | |
| ffmpeg.AVPacketFree(&encodedPkt) | |
| return fmt.Errorf("failed to receive audio packet from encoder: %w (ret=%d)", err, ret) | |
| } |
| ret, err = ffmpeg.AVCodecSendPacket(d.codecCtx, d.packet) | ||
| ffmpeg.AVPacketUnref(d.packet) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to send packet to decoder: %w", err) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Inconsistent error handling: ret from AVCodecSendPacket is assigned but never checked. Other FFmpeg calls in this file check both err != nil and ret < 0 for complete error handling.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/audio/ffmpeg_decoder.go, line 185:
<comment>Inconsistent error handling: `ret` from `AVCodecSendPacket` is assigned but never checked. Other FFmpeg calls in this file check both `err != nil` and `ret < 0` for complete error handling.</comment>
<file context>
@@ -0,0 +1,465 @@
+ }
+
+ // Send packet to decoder
+ ret, err = ffmpeg.AVCodecSendPacket(d.codecCtx, d.packet)
+ ffmpeg.AVPacketUnref(d.packet)
+ if err != nil {
</file context>
| ret, err = ffmpeg.AVCodecSendPacket(d.codecCtx, d.packet) | |
| ffmpeg.AVPacketUnref(d.packet) | |
| if err != nil { | |
| return nil, fmt.Errorf("failed to send packet to decoder: %w", err) | |
| } | |
| ret, err = ffmpeg.AVCodecSendPacket(d.codecCtx, d.packet) | |
| ffmpeg.AVPacketUnref(d.packet) | |
| if err != nil { | |
| return nil, fmt.Errorf("failed to send packet to decoder: %w", err) | |
| } | |
| if ret < 0 { | |
| return nil, fmt.Errorf("failed to send packet to decoder: error code %d", ret) | |
| } |
| // ReadForFFT reads samples for FFT analysis. | ||
| // Returns up to numSamples, or fewer if not enough available. | ||
| // Non-blocking: returns immediately with available samples (may be empty). | ||
| // Returns io.EOF when buffer is closed and all samples have been read. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Documentation says 'Returns io.EOF' but the code returns ErrBufferClosed. This inconsistency can cause bugs when callers check for io.EOF to detect end-of-stream. Either update the comments to reference ErrBufferClosed, or change the error to use io.EOF (which would require importing the io package).
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/audio/shared_buffer.go, line 70:
<comment>Documentation says 'Returns io.EOF' but the code returns `ErrBufferClosed`. This inconsistency can cause bugs when callers check for `io.EOF` to detect end-of-stream. Either update the comments to reference `ErrBufferClosed`, or change the error to use `io.EOF` (which would require importing the `io` package).</comment>
<file context>
@@ -0,0 +1,246 @@
+// ReadForFFT reads samples for FFT analysis.
+// Returns up to numSamples, or fewer if not enough available.
+// Non-blocking: returns immediately with available samples (may be empty).
+// Returns io.EOF when buffer is closed and all samples have been read.
+func (b *SharedAudioBuffer) ReadForFFT(numSamples int) ([]float64, error) {
+ b.mu.Lock()
</file context>
| // Returns io.EOF when buffer is closed and all samples have been read. | |
| // Returns ErrBufferClosed when buffer is closed and all samples have been read. |
| encodedPkt.SetStreamIndex(e.audioStream.Index()) | ||
| ffmpeg.AVPacketRescaleTs(encodedPkt, e.audioCodec.TimeBase(), e.audioStream.TimeBase()) | ||
|
|
||
| ffmpeg.AVInterleavedWriteFrame(e.formatCtx, encodedPkt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Return value of AVInterleavedWriteFrame is ignored. Unlike WriteAudioSamples(), this doesn't check for write errors, which could silently produce corrupted output.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/encoder/encoder.go, line 568:
<comment>Return value of `AVInterleavedWriteFrame` is ignored. Unlike `WriteAudioSamples()`, this doesn't check for write errors, which could silently produce corrupted output.</comment>
<file context>
@@ -540,6 +418,160 @@ func (e *Encoder) WriteFrame(rgbData []byte) error {
+ encodedPkt.SetStreamIndex(e.audioStream.Index())
+ ffmpeg.AVPacketRescaleTs(encodedPkt, e.audioCodec.TimeBase(), e.audioStream.TimeBase())
+
+ ffmpeg.AVInterleavedWriteFrame(e.formatCtx, encodedPkt)
+ ffmpeg.AVPacketFree(&encodedPkt)
+ }
</file context>
|
|
||
| // Write audio samples for this frame to encoder | ||
| // Convert float64 samples to float32 for AAC encoder | ||
| audioSamples := make([]float32, len(newSamples)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Repeated allocation in hot loop. Consider pre-allocating audioSamplesBuffer outside the frame loop and reusing it with slicing (audioSamples := audioSamplesBuffer[:len(newSamples)]), consistent with how barHeights and other buffers are pre-allocated at lines 273-280.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At cmd/jivefire/main.go, line 498:
<comment>Repeated allocation in hot loop. Consider pre-allocating `audioSamplesBuffer` outside the frame loop and reusing it with slicing (`audioSamples := audioSamplesBuffer[:len(newSamples)]`), consistent with how `barHeights` and other buffers are pre-allocated at lines 273-280.</comment>
<file context>
@@ -488,6 +493,18 @@ func generateVideo(inputFile string, outputFile string, channels int, noPreview
+ // Write audio samples for this frame to encoder
+ // Convert float64 samples to float32 for AAC encoder
+ audioSamples := make([]float32, len(newSamples))
+ for i, s := range newSamples {
+ audioSamples[i] = float32(s)
</file context>
Summary
Replaces multiple format-specific audio decoders with a unified FFmpeg-based audio pipeline, improving format support and encoding efficiency. This change eliminates the separate audio flush phase and reduces encoding latency.
Changes
vendor/tothird_party/