Code size #456

Dirbaio · 2021-04-21T22:00:01Z

I have put together a code size testbench to measure code size. It has 2 test cases, spam logs lots of stuff (based on the qemu test) and smoltcp runs smoltcp's loopback example from latest git master (which has defmt support).

spam  |  smoltcp |  what
===================================
18428 |  60680   |  main branch
  624 |  49780   |  no defmt

spam has 156 log calls and is 18428 bytes (.text size). Commenting the run_test() call in main gives a .text size of 944 bytes. Therefore, defmt's code size is:

Base size (RTT stuff, etc): 944 - 624 = 320 bytes
Log call size: (18428-944)/155 = 112.8 bytes/call

112.8 bytes/call sounds about right. As seen here, a simple info!("defmt test {=u32}", x); is already 74 bytes of code, and the spam code does a mix of simple and complex log calls.

And 112.8 bytes avg, or 74 bytes to print a u32 is A LOT.

For comparison, let's check good old C printf:

void foo(int x) {
    printf("defmt test %d", x);
}

compiled with arm-none-eabi-gcc -c printf.c -Os -mthumb:

   2:	0001      	movs	r1, r0
   4:	4802      	ldr	r0, [pc, #8]	; (10 <foo+0x10>)
   6:	f7ff fffe 	bl	0 <printf>

This is just 8 bytes of code, or 8+14 = 22 bytes if we count the string.

Defmt is 3.5 times larger than printf when counting the string. And this is a generous comparison: arguably the apples-to-apples comparison is not counting the string, since the string pointer is equivalent to defmt's interned string IDs. In which case defmt is 10 times larger.

I'm aware defmt does much more than printf. It's to be expected a log call printing complex nested structs and enums generates bigger code. However, simple log calls (with no arguments or just a handful of integers) are the vast majority in real-world projects, and code size for these really hurts.

The text was updated successfully, but these errors were encountered:

Urhengulas · 2021-04-26T15:27:33Z

@Dirbaio: Thank you for putting in the effort to create this code size benchmark!

Do we know which parts of the code contribute the most bloat?

Dirbaio · 2021-04-26T17:07:18Z

I've been experimenting with this. Code in my codesize4 branch, testbench updated.

Replace Write trait with a write fn in Logger. #258 is an easy win.
Removing the bool and needs_tag optimizations is a big win since it allows the compiler to optimize the Formatter to essentially nothing, which avoids all the work of stack-allocating and initializing it at every log call.
Removing leb/zigzag encodings has almost no effect, perhaps unsurprisingly because the compiler wasn't inlining them anyway.

spam         |  smoltcp      |  what
code   tx'd  |  code   tx'd  |
===============================================
18428  2806  |  60680  1098  |  main branch
15460  2806  |  58232  1098  |  remove dyn Write (PR #258)
15764  2806  |  57904  1098  |  remove bool return in acquire
13076  2817  |  57800  1098  |  remove bool compression
13060  3181  |  57728  1458  |  remove leb64/leb128/zigzag. Make tags always 16bit
11828  3729  |  57872  1468  |  remove needs_tag optimization
11428  3729  |  57240  1468  |  make Formatter zero-sized

This is at the cost of increasing the wire size, but:

The needs_tag increase can probably be gained back with this without increasing code size.
The leb128/zigzag increase can be gained back with a "zero-removing" compression such as this. This will probably give even smaller wire size than before, since uXX/iXX isn't being leb-encoded and they usually are small too.

Something I want to try next is to "pack" together multile write calls. For example when printing 4 u32s, instead of doing 4 write calls, put them in a 16-byte buffer then do a single 16-byte write call. I have a WIP version of this but it doesn't improve codesize that much. I'm still investing why.

Dirbaio · 2021-04-26T18:55:25Z

Updated table with size measurements from my company's firmware:

spam         |  smoltcp      | company's fw |  what
code   tx'd  |  code   tx'd  | code         |
==============================================================
18428  2806  |  60680  1098  | 136336       |  main branch
15460  2806  |  58232  1098  | 127152       |  remove dyn Write (PR #258)
15764  2806  |  57904  1098  | 125832       |  remove bool return in acquire
13076  2817  |  57800  1098  | 120096       |  remove bool compression
13060  3181  |  57728  1458  | 120024       |  remove leb64/leb128/zigzag. Make tags always 16bit
11828  3729  |  57872  1468  | 118280       |  remove needs_tag optimization
11428  3729  |  57240  1468  | 117064       |  make Formatter zero-sized
11428  3729  |  57312  1468  | 117168       |  Remove methods from Formatter
  624        |  49780        | 98716        |  no defmt

2x smaller! :)

Dirbaio · 2021-04-26T22:07:31Z

I've done a quick test with rzCOBS compression+framing. Results on spam test:

   With rzCOBS: 2290 bytes, 367308 cycles
Without rzCOBS: 3729 bytes, 337152 cycles

Wire size is better than upstream defmt, while keeping all the code size gains. Code size increase is only ~200 bytes.

This is without needs_tag. Wire size could be even better if this is implemented.

And it provides framing, so probe-run is able to recover from a corrupt stream!

Dirbaio · 2021-05-02T23:11:55Z

@Urhengulas @jonas-schievink what are your thoughts on this? Is this something you would want to address?

#258 is an easy start, and I'm willing to clean/rework my other patches to PR them if you confirm this is something you'd like to review and merge.

jonas-schievink · 2021-05-02T23:15:41Z

Personally I think this is a good change, inclusing plugging in rzCOBS, but we don't currently plan to make breaking changes to defmt, which complicates things.

Dirbaio · 2021-05-02T23:21:10Z

In what timeframe would you consider OK making the next breaking release?

Urhengulas · 2021-05-05T13:02:43Z

In what timeframe would you consider OK making the next breaking release?

We currently want to avoid it, since it will split the ecosystem. We are discussing how to make breaking releases more feasible, mainly looking at #426.

Another option we discussed is maintaining a next branch for some time. This branch would include all the changes from main and breaking changes, which enables us to start the effort, but delay the breaking change for longer. What would you think about this workaround?

Dirbaio · 2021-05-06T17:14:14Z

Using defmt from git was extremely painful in the pre-0.1 times.

It forces users to have to git-clone, patch and build probe-run.
For git builds defmt requires firmware and decoder versions to match exactly (Separate "crate version" from "wire format version" #287), so this means constantly rebuilding probe-run every time you upgrade defmt on your firmware.
You can't depend on git crates for crates published on crates.io. This means using this next branch in library crates (eg smoltcp, embassy) is unfeasible. Published crates would be stuck on 0.2, which in turns prevents end users from using next in their firmware if they depend on any such published crate. To get around this users would have to patch every single crate on their tree that depends on defmt 0.2!

japaric · 2021-05-10T10:35:50Z

Wire size is better than upstream defmt, while keeping all the code size gains.

@Dirbaio do you have data on how much wire size is reduced for any of the 3 examples you posted in the last few tables?

MathiasKoch · 2021-05-10T11:40:26Z

Wire size is better than upstream defmt, while keeping all the code size gains.

@Dirbaio do you have data on how much wire size is reduced for any of the 3 examples you posted in the last few tables?

I think that would be the tx'd column?

And the

With rzCOBS: 2290 bytes
Without rzCOBS: 3729 bytes

jonas-schievink · 2021-08-05T11:56:30Z

Hi @Dirbaio! Now that #492 has been fully implemented, how are we doing in terms of code size? I've seen #525, but that doesn't seem necessary to consider this issue fixed.

Dirbaio · 2021-08-05T17:03:28Z

Yeah, #525 should reduce the code size further but in its current state it didn't. In my original experiments branch it did reduce it quite a lot, so I still have to check what's up with that.

Feel free to close this issue if you want. There'll always be some code size reduction possible, this'd stay open forever otherwise :P

jonas-schievink · 2021-08-05T17:26:24Z

Okay, closing then!

Dirbaio mentioned this issue May 6, 2021

Decode breaks when not calling write! #457

Closed

Urhengulas added breaking change fix / feature / improvement involves a breaking change and needs to wait until next minor version and removed good first issue Good for newcomers labels May 7, 2021

Dirbaio mentioned this issue May 27, 2021

RFC: Traits and wire format 2.0 #492

Closed

Urhengulas added this to the v0.3.0 milestone Jun 18, 2021

japaric assigned jonas-schievink Jul 14, 2021

jonas-schievink closed this as completed Aug 5, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Code size #456

Code size #456

Dirbaio commented Apr 21, 2021

Urhengulas commented Apr 26, 2021

Dirbaio commented Apr 26, 2021 •

edited

Loading

Dirbaio commented Apr 26, 2021

Dirbaio commented Apr 26, 2021 •

edited

Loading

Dirbaio commented May 2, 2021

jonas-schievink commented May 2, 2021

Dirbaio commented May 2, 2021

Urhengulas commented May 5, 2021

Dirbaio commented May 6, 2021 •

edited

Loading

japaric commented May 10, 2021

MathiasKoch commented May 10, 2021 •

edited

Loading

jonas-schievink commented Aug 5, 2021

Dirbaio commented Aug 5, 2021

jonas-schievink commented Aug 5, 2021

Code size #456

Code size #456

Comments

Dirbaio commented Apr 21, 2021

Urhengulas commented Apr 26, 2021

Dirbaio commented Apr 26, 2021 • edited Loading

Dirbaio commented Apr 26, 2021

Dirbaio commented Apr 26, 2021 • edited Loading

Dirbaio commented May 2, 2021

jonas-schievink commented May 2, 2021

Dirbaio commented May 2, 2021

Urhengulas commented May 5, 2021

Dirbaio commented May 6, 2021 • edited Loading

japaric commented May 10, 2021

MathiasKoch commented May 10, 2021 • edited Loading

jonas-schievink commented Aug 5, 2021

Dirbaio commented Aug 5, 2021

jonas-schievink commented Aug 5, 2021

Dirbaio commented Apr 26, 2021 •

edited

Loading

Dirbaio commented Apr 26, 2021 •

edited

Loading

Dirbaio commented May 6, 2021 •

edited

Loading

MathiasKoch commented May 10, 2021 •

edited

Loading