Last call on FLAC specification #244

ktmf01 · 2022-09-28T19:11:17Z

ktmf01
Sep 28, 2022

Sorry for bothering you this way, I really hope you don't mind.

To implement FLAC decoding in dr_flac, you might have used the FLAC format specification on the FLAC website. If so, you probably needed to dive into some actual code (libFLAC or ffmpeg) to understand the finer points. An IETF working group has tried to improve this document, trying to include those finer points, and is on the verge of turning it in for final checks and publication as an RFC. A so-called working group last call has been issued, which lasts 2 weeks.

As you've had first-hand experience implementing FLAC from scratch, you probably know best what implementation details needed extra attention. It would be very valuable to have your feedback on the FLAC format for this document.

I hope you can provide some insights.

The latest draft can be found here https://datatracker.ietf.org/doc/draft-ietf-cellar-flac/

mackron · 2022-09-28T23:40:29Z

mackron
Sep 28, 2022
Maintainer

Thanks! I'm more than happy to have a look at that. This is something that's needed for FLAC.

Indeed I did use the FLAC format specification you linked to, and generally it was a a good reference, but certainly there were a few details that lacked explanation. For the initial implementation of dr_flac, I think from memory I looked at another library for only one thing in particular and that was the coefficient constants for the SUBFRAME_FIXED decoding path. Just browsing the document quickly it looks like you've covered that nicely in section 10.2.5.

One thing that screwed me over was the requirement to shift by an extra bit for side channels. It was such a small and unintuitive detail and it's never mentioned in the original specification. I think from memory it took something like 4 days of frustration to figure out what was going on there. It looks like you've got that covered as well in section 5.2.

I think there were also details about the rice decoding and prediction calculation which took a bit of time to figure out in which case a more detailed explanation would have been handy.

Other than that, I think most other issues were resolved with feedback from the community over time, just like that thing we talked about a while ago about the logic for choosing the 32- vs 64-bit path for the prediction phase (if you haven't talked about that in this new specification, I feel like that would be useful to mention in an appendix).

In any case, I'll have a proper look at that new specification when I get a chance and give you my feedback here. Certainly by looking at it quickly just now it looks significantly better than the FLAC format specification at explaining the details. That said, I think the FLAC format specification is still a valuable resource because it does provide a good high level overview of the structure of the format and is really easy to read.

1 reply

ktmf01 Sep 29, 2022
Author

Yes, I intend to keep the FLAC specification at it has been on the website. I've already added notes at the very top with links to the IETF spec and several mappings specs. Perhaps I will add links to each section of the 'old' FLAC spec to corresponding section in the new one. The tables are indeed a nice way to quickly gloss over certain aspects, and I suspect the anchors in the format document are deeplinked at quite a few places, so it would be nice not to break those.

mackron · 2022-10-02T12:51:27Z

mackron
Oct 2, 2022
Maintainer

So I've gone ahead and written a simple FLAC decoder from this spec. I haven't yet implemented all features, most notably CRC checks, metadata and Ogg encapsulation, but it seems to decode the very few files I've tried. It just skips over metadata. You can find it here if you're curious (see the header for usage and comments): https://github.com/mackron/ref-flac

Overall the process was very nice. Much easier than the FLAC format specification document. In particular I really liked the explanation of the side channel stuff and the requirement for the extra bit. Probably took about 2-3 days worth of work. I tried to build it straight from the spec without using any outside sources. I just pulled the bit streaming system from dr_flac in addition to some simple utility functions. I didn't look at dr_flac at all for any implementation details of the actual FLAC stuff.

I only have a few comments, most of which I think is fairly minor.

The main thing I didn't like was the explanation for residual decoding. The section for decoding residual says this:

Decoding the coded residual thus involves selecting the right coding
method, finding the number of partitions, reading unary and binary
parts of each codeword one-by-one and keeping track of when a new
partition starts and thus when a new Rice parameter needs to be read.

In my opinion that is too vague. The way the numbers are encoded is mentioned in the paragraph above, and I think the expectation is for the reader just do the reverse for the decoding stage, but I think it's unnecessarily hard to wrap your head around. I think an explicit explanation on how to do the rice and zigzag decoding would be nice. This is the comment I wrote in my code for my own reference (not suggesting you use that exact language, just the kind of explanation that I think would be more useful): https://github.com/mackron/ref-flac/blob/a7836f456680b539a5f9a229f2c96fe43a937286/rflac.c#L1356-L1373. In the end I ended up having to look up a reference to refresh my memory on Rice coding.

Another thing I noticed concerned the fixed subframe type. Section 10.2.5 says "first the residual has to be decoded, after which for each sample the prediction can be added", but really the first thing that has to be decoded are the warmup samples. I understand that the intention of that statement is to explain the overall concept, but I find it just slightly confusing and misleading. The warmup samples are indeed mentioned in a paragraph below, and I did know this from prior experience, but it's not quite as explicit as I think it could be. A table like in section 10.2.6 might be useful. Something that looks like this:

+==========+===============================================+
| Data     | Description                                   |
+==========+===============================================+
| s(n)     | Unencoded warm-up samples (n = frame's bits-  |
|          | per-sample * lpc order).                      |
+----------+-----------------------------------------------+
| Coded    | Encoded residual                              |
| residual |                                               |
+----------+-----------------------------------------------+

The same kind of table exists for the LPC path, and in my opinion I think it'd make it just a bit clearer in the fixed section as well.

One other quick and minor thing is that I had to look up a reference for UTF-8 encoding and unary numbers to refresh my memory. This is only a very minor thing because it's easy to find references online, but I would have liked it to just be right there in the spec so I could just implement it then and there. Also, since the UTF-8 encoding used in FLAC isn't technically standard UTF-8 because of how it's extended to 7 bytes, being explicit with that could be a bit safer in my opinion.

But overall this was a really good reference! There's still a few things I might review later. I haven't yet done any CRC stuff so I might implement that at some point to get a better feel for how well it's explained in the spec. I also haven't properly reviewed the metadata stuff yet. When I get to that stuff I'll post a comment here.

6 replies

mackron Oct 2, 2022
Maintainer

I've been considering the idea of a push style API in dr_flac for a while to make it easier to support broadcasting scenarios better like internet radio so I thought it'd be a good opportunity to play around with some ideas.

I also saw just now that you do indeed mention the format of unary coded numbers in section 6 so disregard my earlier comment about that. I just missed it.

ktmf01 Oct 3, 2022
Author

I think an explicit explanation on how to do the rice and zigzag decoding would be nice. This is the comment I wrote in my code for my own reference (not suggesting you use that exact language, just the kind of explanation that I think would be more useful): https://github.com/mackron/ref-flac/blob/a7836f456680b539a5f9a229f2c96fe43a937286/rflac.c#L1356-L1373. In the end I ended up having to look up a reference to refresh my memory on Rice coding.

There is an explicit explanation in appendix D, the examples. At first I had an example in that section, but it seemed too extensive and still too vague to keep it there. Perhaps I should add a reference there to that appendix?

mackron Oct 4, 2022
Maintainer

I did indeed see that example. Now that I've figured it all out again, the example is very obvious, but back when it wasn't fresh in my mind I found it harder than it could have been to wrap my head around. Don't get me wrong, the spec has all the information you need, I just thought for the decoding case it was a bit of an unclear explanation compared to how well everything else was explained up until that point. I just brought it up because it sort of stuck out to me and was the thing I probably had the most trouble with.

I don't think a reference to the appendix would hurt.

ktmf01 Oct 4, 2022
Author

I've tried to improve on the Rice coding section. Would you mind taking a look?

ktmf01/flac-specification@master...vague-residual

mackron Oct 4, 2022
Maintainer

Yep, that's really clear. I like that explanation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Last call on FLAC specification #244

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 2 comments 7 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

Last call on FLAC specification #244

ktmf01 Sep 28, 2022

Replies: 2 comments · 7 replies

mackron Sep 28, 2022 Maintainer

ktmf01 Sep 29, 2022 Author

mackron Oct 2, 2022 Maintainer

mackron Oct 2, 2022 Maintainer

ktmf01 Oct 3, 2022 Author

mackron Oct 4, 2022 Maintainer

ktmf01 Oct 4, 2022 Author

mackron Oct 4, 2022 Maintainer

ktmf01
Sep 28, 2022

Replies: 2 comments 7 replies

mackron
Sep 28, 2022
Maintainer

ktmf01 Sep 29, 2022
Author

mackron
Oct 2, 2022
Maintainer

mackron Oct 2, 2022
Maintainer

ktmf01 Oct 3, 2022
Author

mackron Oct 4, 2022
Maintainer

ktmf01 Oct 4, 2022
Author

mackron Oct 4, 2022
Maintainer