Skip to content

Conversation

@flexiondotorg
Copy link
Contributor

Summary

Replaces multiple format-specific audio decoders with a unified FFmpeg-based audio pipeline, improving format support and encoding efficiency. This change eliminates the separate audio flush phase and reduces encoding latency.

Changes

  • Replace pure Go audio decoders (WAV/MP3/FLAC) with a single FFmpeg-based decoder
  • Add direct sample-based audio encoding API for improved interleaving
  • Move ffmpeg-statigo submodule from vendor/ to third_party/
  • Implement shared audio buffer for parallel FFT analysis and encoding
  • Remove "catch-up" audio encoding delay by sharing decoded samples
  • Improve progress reporting and performance metrics
  • Add thumbnail generation timing to performance stats

- Add FFmpegDecoder supporting all common audio formats (MP3, FLAC, WAV, OGG, AAC)
- Update StreamingReader to use FFmpeg with fallback to pure Go decoders
- Implement seeking capability for efficient two-pass processing
- Support both planar and interleaved audio formats with automatic stereo-to-mono downmixing

This eliminates double-decode overhead and addresses end-of-encode delay by following the FIFO design approach.
- Add a new SharedAudioBuffer to provide thread-safe buffer for audio samples
- Implement independent read positions for FFT analysis and AAC encoder
- Support both blocking and non-blocking read operations
- Include buffer management features like compaction and reset
- Add comprehensive test coverage for the shared buffer
- Fix formatting in ffmpeg_decoder.go sample slice declarations
Resolve conflict between Git submodule and Go's vendoring system.
Placing the submodule in vendor/ triggered Go's vendoring mode,
requiring -mod=mod flags everywhere and causing "inconsistent
vendoring" errors when vendor/modules.txt got out of sync.

Changes:
- Move submodule to third_party/ffmpeg-statigo
- Update go.mod replace directive
- Remove all -mod=mod flags from justfile and CI workflows
- Update ffmpeg-statigo documentation to recommend third_party/

The third_party/ directory is the conventional location for external
C/CGO libraries in Go projects, keeping vendor/ free for go mod vendor.
- Add support for pre-decoded audio samples with WriteAudioSamples()
- Implement new audio encoder initialization for sample-based input
- Add audio FIFO buffering for proper frame handling
- Include FlushAudioEncoder method to handle remaining samples
- Add SampleRate configuration parameter for new mode
- Maintain backward compatibility with file-based audio input
- Replace PTS-based audio processing with direct sample-based approach
- Add WriteAudioSamples method to send audio samples alongside video frames
- Simplify audio flush process with dedicated FlushAudioEncoder method
- Use audio file's native sample rate rather than hardcoded value

This change improves synchronization between audio and video by handling
audio samples at the same time as their corresponding video frames.
- Remove pure Go decoders (WAV, MP3, FLAC) in favor of unified FFmpeg decoder
- Update architecture documentation to reflect the unified audio pipeline
- Remove audioFlushTime performance metric (no longer needed with single decoder)
- Clean up dependencies in go.mod

This simplifies the codebase while providing broader format support.
- Removes minimum display time and completion delays from Pass1Model
- Tracks and displays thumbnail generation time in the summary
- Improves speed calculation in Pass2 to show accurate realtime multiplier
- Renames "Initialization" to "Other" in time breakdown for clarity
- Simplifies video duration display format
@flexiondotorg flexiondotorg merged commit 9b3ec37 into main Nov 27, 2025
0 of 4 checks passed
@flexiondotorg flexiondotorg deleted the fifo branch November 27, 2025 18:26
Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 issues found across 19 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="internal/audio/ffmpeg_decoder.go">

<violation number="1" location="internal/audio/ffmpeg_decoder.go:185">
P2: Inconsistent error handling: `ret` from `AVCodecSendPacket` is assigned but never checked. Other FFmpeg calls in this file check both `err != nil` and `ret < 0` for complete error handling.</violation>
</file>

<file name="internal/audio/shared_buffer.go">

<violation number="1" location="internal/audio/shared_buffer.go:70">
P2: Documentation says 'Returns io.EOF' but the code returns `ErrBufferClosed`. This inconsistency can cause bugs when callers check for `io.EOF` to detect end-of-stream. Either update the comments to reference `ErrBufferClosed`, or change the error to use `io.EOF` (which would require importing the `io` package).</violation>

<violation number="2" location="internal/audio/shared_buffer.go:221">
P1: `Reset()` doesn't call `cond.Broadcast()` after changing state. If a consumer is blocked in `ReadForEncoder`'s `cond.Wait()`, it will remain stuck indefinitely after Reset() is called, potentially causing a deadlock. Add `b.cond.Broadcast()` to wake up blocked consumers so they can re-evaluate the new state.</violation>
</file>

<file name="internal/encoder/encoder.go">

<violation number="1" location="internal/encoder/encoder.go:477">
P1: Error handling logic silently swallows real encoder errors. The condition `err != nil || errors.Is(err, ...)` breaks on ANY error. The intended logic should only break on EAgain/EOF and return errors for real failures.</violation>

<violation number="2" location="internal/encoder/encoder.go:568">
P2: Return value of `AVInterleavedWriteFrame` is ignored. Unlike `WriteAudioSamples()`, this doesn't check for write errors, which could silently produce corrupted output.</violation>
</file>

<file name="cmd/jivefire/main.go">

<violation number="1" location="cmd/jivefire/main.go:498">
P2: Repeated allocation in hot loop. Consider pre-allocating `audioSamplesBuffer` outside the frame loop and reusing it with slicing (`audioSamples := audioSamplesBuffer[:len(newSamples)]`), consistent with how `barHeights` and other buffers are pre-allocated at lines 273-280.</violation>
</file>

Since this is your first cubic review, here's how it works:

  • cubic automatically reviews your code and comments on bugs and improvements
  • Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
  • Ask questions if you need clarification on any suggestion

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

b.samples = b.samples[:0]
b.fftReadPos = 0
b.encoderReadPos = 0
b.closed = false
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Reset() doesn't call cond.Broadcast() after changing state. If a consumer is blocked in ReadForEncoder's cond.Wait(), it will remain stuck indefinitely after Reset() is called, potentially causing a deadlock. Add b.cond.Broadcast() to wake up blocked consumers so they can re-evaluate the new state.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/audio/shared_buffer.go, line 221:

<comment>`Reset()` doesn't call `cond.Broadcast()` after changing state. If a consumer is blocked in `ReadForEncoder`'s `cond.Wait()`, it will remain stuck indefinitely after Reset() is called, potentially causing a deadlock. Add `b.cond.Broadcast()` to wake up blocked consumers so they can re-evaluate the new state.</comment>

<file context>
@@ -0,0 +1,246 @@
+	b.samples = b.samples[:0]
+	b.fftReadPos = 0
+	b.encoderReadPos = 0
+	b.closed = false
+}
+
</file context>
Suggested change
b.closed = false
b.closed = false
b.cond.Broadcast() // Wake up any blocked consumers to re-evaluate state

Comment on lines +477 to +484
if err != nil || errors.Is(err, ffmpeg.EAgain) || errors.Is(err, ffmpeg.AVErrorEOF) {
ffmpeg.AVPacketFree(&encodedPkt)
break
}
if ret < 0 {
ffmpeg.AVPacketFree(&encodedPkt)
return fmt.Errorf("failed to receive audio packet from encoder: %d", ret)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Error handling logic silently swallows real encoder errors. The condition err != nil || errors.Is(err, ...) breaks on ANY error. The intended logic should only break on EAgain/EOF and return errors for real failures.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/encoder/encoder.go, line 477:

<comment>Error handling logic silently swallows real encoder errors. The condition `err != nil || errors.Is(err, ...)` breaks on ANY error. The intended logic should only break on EAgain/EOF and return errors for real failures.</comment>

<file context>
@@ -540,6 +418,160 @@ func (e *Encoder) WriteFrame(rgbData []byte) error {
+		for {
+			encodedPkt := ffmpeg.AVPacketAlloc()
+			ret, err = ffmpeg.AVCodecReceivePacket(e.audioCodec, encodedPkt)
+			if err != nil || errors.Is(err, ffmpeg.EAgain) || errors.Is(err, ffmpeg.AVErrorEOF) {
+				ffmpeg.AVPacketFree(&encodedPkt)
+				break
</file context>
Suggested change
if err != nil || errors.Is(err, ffmpeg.EAgain) || errors.Is(err, ffmpeg.AVErrorEOF) {
ffmpeg.AVPacketFree(&encodedPkt)
break
}
if ret < 0 {
ffmpeg.AVPacketFree(&encodedPkt)
return fmt.Errorf("failed to receive audio packet from encoder: %d", ret)
}
if errors.Is(err, ffmpeg.EAgain) || errors.Is(err, ffmpeg.AVErrorEOF) {
ffmpeg.AVPacketFree(&encodedPkt)
break
}
if err != nil || ret < 0 {
ffmpeg.AVPacketFree(&encodedPkt)
return fmt.Errorf("failed to receive audio packet from encoder: %w (ret=%d)", err, ret)
}

Comment on lines +185 to +189
ret, err = ffmpeg.AVCodecSendPacket(d.codecCtx, d.packet)
ffmpeg.AVPacketUnref(d.packet)
if err != nil {
return nil, fmt.Errorf("failed to send packet to decoder: %w", err)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Inconsistent error handling: ret from AVCodecSendPacket is assigned but never checked. Other FFmpeg calls in this file check both err != nil and ret < 0 for complete error handling.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/audio/ffmpeg_decoder.go, line 185:

<comment>Inconsistent error handling: `ret` from `AVCodecSendPacket` is assigned but never checked. Other FFmpeg calls in this file check both `err != nil` and `ret < 0` for complete error handling.</comment>

<file context>
@@ -0,0 +1,465 @@
+		}
+
+		// Send packet to decoder
+		ret, err = ffmpeg.AVCodecSendPacket(d.codecCtx, d.packet)
+		ffmpeg.AVPacketUnref(d.packet)
+		if err != nil {
</file context>
Suggested change
ret, err = ffmpeg.AVCodecSendPacket(d.codecCtx, d.packet)
ffmpeg.AVPacketUnref(d.packet)
if err != nil {
return nil, fmt.Errorf("failed to send packet to decoder: %w", err)
}
ret, err = ffmpeg.AVCodecSendPacket(d.codecCtx, d.packet)
ffmpeg.AVPacketUnref(d.packet)
if err != nil {
return nil, fmt.Errorf("failed to send packet to decoder: %w", err)
}
if ret < 0 {
return nil, fmt.Errorf("failed to send packet to decoder: error code %d", ret)
}

// ReadForFFT reads samples for FFT analysis.
// Returns up to numSamples, or fewer if not enough available.
// Non-blocking: returns immediately with available samples (may be empty).
// Returns io.EOF when buffer is closed and all samples have been read.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Documentation says 'Returns io.EOF' but the code returns ErrBufferClosed. This inconsistency can cause bugs when callers check for io.EOF to detect end-of-stream. Either update the comments to reference ErrBufferClosed, or change the error to use io.EOF (which would require importing the io package).

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/audio/shared_buffer.go, line 70:

<comment>Documentation says 'Returns io.EOF' but the code returns `ErrBufferClosed`. This inconsistency can cause bugs when callers check for `io.EOF` to detect end-of-stream. Either update the comments to reference `ErrBufferClosed`, or change the error to use `io.EOF` (which would require importing the `io` package).</comment>

<file context>
@@ -0,0 +1,246 @@
+// ReadForFFT reads samples for FFT analysis.
+// Returns up to numSamples, or fewer if not enough available.
+// Non-blocking: returns immediately with available samples (may be empty).
+// Returns io.EOF when buffer is closed and all samples have been read.
+func (b *SharedAudioBuffer) ReadForFFT(numSamples int) ([]float64, error) {
+	b.mu.Lock()
</file context>
Suggested change
// Returns io.EOF when buffer is closed and all samples have been read.
// Returns ErrBufferClosed when buffer is closed and all samples have been read.

encodedPkt.SetStreamIndex(e.audioStream.Index())
ffmpeg.AVPacketRescaleTs(encodedPkt, e.audioCodec.TimeBase(), e.audioStream.TimeBase())

ffmpeg.AVInterleavedWriteFrame(e.formatCtx, encodedPkt)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Return value of AVInterleavedWriteFrame is ignored. Unlike WriteAudioSamples(), this doesn't check for write errors, which could silently produce corrupted output.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/encoder/encoder.go, line 568:

<comment>Return value of `AVInterleavedWriteFrame` is ignored. Unlike `WriteAudioSamples()`, this doesn't check for write errors, which could silently produce corrupted output.</comment>

<file context>
@@ -540,6 +418,160 @@ func (e *Encoder) WriteFrame(rgbData []byte) error {
+		encodedPkt.SetStreamIndex(e.audioStream.Index())
+		ffmpeg.AVPacketRescaleTs(encodedPkt, e.audioCodec.TimeBase(), e.audioStream.TimeBase())
+
+		ffmpeg.AVInterleavedWriteFrame(e.formatCtx, encodedPkt)
+		ffmpeg.AVPacketFree(&encodedPkt)
+	}
</file context>


// Write audio samples for this frame to encoder
// Convert float64 samples to float32 for AAC encoder
audioSamples := make([]float32, len(newSamples))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Repeated allocation in hot loop. Consider pre-allocating audioSamplesBuffer outside the frame loop and reusing it with slicing (audioSamples := audioSamplesBuffer[:len(newSamples)]), consistent with how barHeights and other buffers are pre-allocated at lines 273-280.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At cmd/jivefire/main.go, line 498:

<comment>Repeated allocation in hot loop. Consider pre-allocating `audioSamplesBuffer` outside the frame loop and reusing it with slicing (`audioSamples := audioSamplesBuffer[:len(newSamples)]`), consistent with how `barHeights` and other buffers are pre-allocated at lines 273-280.</comment>

<file context>
@@ -488,6 +493,18 @@ func generateVideo(inputFile string, outputFile string, channels int, noPreview
 
+			// Write audio samples for this frame to encoder
+			// Convert float64 samples to float32 for AAC encoder
+			audioSamples := make([]float32, len(newSamples))
+			for i, s := range newSamples {
+				audioSamples[i] = float32(s)
</file context>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant