-
-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement IFrameEncoder / IFrameDecoder using libvorbis #758
Comments
Hi, I am interested in solving this issue. |
Welcome, thank you! |
Let me know if you'll need help/hints with the build system. Also, if libogg dependency can be disabled, I suppose we should do it, because all we need is vorbis encoder/decoder, we don't need to work with ogg files. |
Hi @gavv , I tried without the libogg, but it fails when installing vorbis. According to the README, "You'll need libogg (distributed separately) to compile this library.", which is why I tried to install it before libvorbis. I managed to install vorbis with the build script after installing libogg. But I'm having trouble in installing the libogg with the script, since I used "make install" on the terminal. |
Oh, I see. I assumed that libogg should support using libvorbis as codec, but instead, libvorbis supports using libogg for muxing. For rfc5215, we'll need to disable ogg muxing, which seems to be feasible. libvorbis encodes packets into a struct from libogg ( When For example, here is what is "installed" for json-c:
These OK, so what we need to do if we need to build libogg and then pass it as a dependency to libvorbis? Here is an example of similar situation: roc-toolkit/3rdparty/SConscript Lines 267 to 285 in b102bb7
Here, ltdl and json-c are dependencies of pulseaudio. We first build dependencies, and then pass them via When we call format_flags(), it iterates all those dependencies (if they're present) and add their include and lib directories to the flags. Summary:
|
Feel free to ping me if you'll have troubles with those scripts, I know it's not the funniest part :) |
Hi @gavv , I am working on the handling of Vorbis stream headers. When encoding with vorbis, we should have 3 headers: the identification, comments and codebook headers. And have a few doubts on how to deal with this: Should we have a dedicated frame for headers or should we split the header across multiple frames together with the data? Or do you have another idea? My idea would be to create a function How should we handle cases where the header size exceeds the frame size? Could split the header through multiple frame cause any problems? Does the I would need to adjust all the unit tests, as well as the packetizer uses the |
Hi, sorry for delay! I was busy relocating to another country. Here are my thoughts, after reading your comments and checking out RFC and libvorbis. Ogg streamFirst of all, in your current PR you're using ogg stream and serializing ogg pages. As mentioned above, we shouldn't do it, as specified in rfc5215. We must only send vorbis packets, and don't add ogg encapsulation. In my understanding, we'll need to use Configuration headersRegarding the 3 headers, rfc5215 states the following:
In other words there are 3 strict requirements:
So I see the algorithm like this: Sender:
Receiver:
In addition, if received packet is in-band configuration header update (section 3.1.1), we should parse it and apply configuration to encoder. I'm not 100% sure how to do it, but likely vorbis_synthesis_idheader() and vorbis_synthesis_restart() may help here.
The approach above is very close to your idea with get_headers_frame()/initialize_headers(), but instead of byte buffers, we need to work with SDP (since we need codec-agnostic solution). We can pass sdp::SessionDescription to IFrameEncoder, so that it can add its codec-specific configuration to the SDP. Then we can pass sdp::SessionDescription to IFrameDecoder, and it can read configuration from it. Same as with your approach, this should be done before encoding/decoding first packet.
Libvorbis flowAs far as I can tell from docs & sources, libvorbis works this way:
This design places us before a few challenges which I missed before. (BTW changes that I describe below will be useful not just for vorbis, but for many other codecs like Opus). Please let me know if I'm missing something. Bellow I assume that this is true. Variable packet sizeencoded_byte_count() was intended to return exact size of encoded payload (i.e. everything after RTP header, including Vorbis headers and compressed audio). It works fine for PCM, but apparently it doesn't fit codecs like Vorbis with variable-size packets. I guess we should rework encoder and composer interfaces. With libvorbis, we can't know packet size before we actually encode it. Currently the algorithm is:
We could change it to something like this:
To make it work, we'll need a few changes:
I suppose we'll also need similar changes in IFrameDecoder. Multiple packets per frameSince libvorbis is allowed to produce multiple packets for one block, we should reflect it in IFrameEncoder interface as well. Current approach is:
We could use the following approach:
Padding packetsCurrently audio::Packetizer produces packets which all have exactly same size (defines by This is done so because FECFRAME (the protocol which we use for loss repair) requires that all packets within FEC blocks should have same size. This requirements conflicts with vorbis (and similar codecs) where we don't control packet size. Unfortunately I don't see a clean solution here, so I suggest the following work around:
We'll need to test this approach to see how stable the packet size would be in practice and how often would it cause receiver to skip loss repair. At the first glance it seems to be a rare event. Also let me know if you have better ideas! Affected parts of codeIFrameEncoder is used only from Packetizer, and IFrameDecoder from Depacketizer. IFrameEncoder and IFrameDecoder currently are implemented only for PCM. IComposer is used more widely, and have a few implementations (rtp::Composer, fec::Composer, rtcp::Composer). Though, changes in its interface and implementations are much smaller. See also this page: https://roc-streaming.org/toolkit/docs/internals/packets_frames.html |
We have two interfaces IFrameEncoder and IFrameDecoder that are used to encode audio samples into packet payload and decode it back. Currently they're implemented using PcmEncoder and PcmDecoder (for uncompressed PCM).
Now we need to add two more implementations: VorbisEncoder and VorbisDecoder, that will use libvorbis library to do the job.
We also need to add new encoder & decoder to the list of tested codecs in test_frame_encoder_decoder.cpp.
The text was updated successfully, but these errors were encountered: