-
-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
"Lite" version? #50
Comments
Bob, I would love to do so, but I have to be realistic about my lack of time availability given the large set of other tasks on my plate. So I will certainly keep this issue alive, but it's going to be back burner for now while I devote time to work on Etherkit products. If/when the opportunity arises to do this, I'll certainly give it a try. Keep in mind that the main reason that the code from this library compiles to the size that it does is because of the 64-bit integer math that is required to give a decent level of precision in setting the frequency. Cutting it down to 32-bit math, for example, will hurt the precision in setting the frequency a fair bit. Thanks for filing the issue. |
Remove the 3 instances of new/delete and this will make your light version for free. You're on a 8-bit micro, not a Xeon server... |
Also each buffer uses 8 bytes, not 20. declaring a: uint8_t params[8] should be enough. Your bad code was copied all over the web, see here: https://github.com/NT7S/Si5351/blob/master/si5351.c Also you can use fixed values instead of i++ code will be even better. |
So you've made those changes and confirmed that it substantially improves the code? I'd like to see the data on how much things are improved and if they are I'll happily accept a PR.
The issue here is the large amount of 64-bit math done to in order to get an accurate output frequency, which translates into more code in flash storage, not the RAM that is temporarily allocated within that method.
A link to an old version of the code on my personal account is "all over the web"? Drop the hyperbole. |
okay, now my feet hurts. hyperbole apart, dynamic allocation probably triggers the linking of a set of functions to implement malloc and free, which probably also has a significant flash memory cost. |
Hi All, Removing the one new in si5351 only makes about 8 bytes difference... sometimes, depending on circumstances, and I looked in Wire - it doesn't use new, and I didn't see it anywhere else. The things that REALLY make a difference are:
Modified files attached, plus a diff so you can see what I actually changed. |
One more thing - the linker won't include code that's never called, so simply not calling things you don't need will result in the library being only as big as it needs to be - writing a cut down version may not yield the gains you think it will, unless you write out complexities such as not being able to use more clocks than you need (eg removing support for CLK6/7 which requires extra code). |
My initial remarks were about RAM savings. some buffers allocated dynamically are clearly unneededly big, and could be allocated once and for all as a global/class variable. |
I ended up chasing the obvious savings available from optional support for CLK6/7.. and while I was at it found other uses of new... no idea why ctrl+F didn't find them for me before... anyway.... I also added the ability to disable support for clocks 6 and 7 with a #define - for a gain of 2200+ bytes! I've yet to actually test the code (and I can't fully test the extra clocks) but I'll post it when I've had a chance to load it into a chip and make sure it's not a total disaster :) |
Are you sure about that? It seems to me that GNU LD is linking whole files, at least that is the case on my x86 linux pc; and there is no reason to have a different behaviour for avr. That's the reason why projects that want to include only the required function actually split their code into multiple files, each containing a single function. Stack buffers still use RAM, but temporarily, and this limits the possible levels of recursion, which can grow unexpectedly fast in modular code. Sure, a static RAM buffer uses the same mem at all times, but you have better chances to avoid stack overflow (in general). IIRC, Last time I read the code 8 bytes were sufficient for storing registers, why did you chose 10 bytes? - note: thats only 8 bytes, so stack allocation is completely fine in that case. |
@ChocolateFrogsNuts Thanks, I appreciate your analysis. I'll dig into it soon, as I'm currently knee-deep in a few other things but I'd like to see your optimizations. |
seems to work ok, so here's the updated code. |
@f4grx it definitely only links code that is referenced on my x86 linux pc with IDE 1.8.4 here. Adding in a single line of code that references something new in an already linked library can add large amounts of size. It definitely also works that way in the sketch - #ifdef-ing out a function that isn't used anywhere makes no difference to the compiled size. Limiting the possible levels of recursion is going to happen no matter where the buffer is allocated - the only difference is whether that limiting is happening to other code when the function in question is not running. The only time a static buffer will reduce the chances of a stack overflow are if one of the functions using a buffer calls another that uses a buffer, which doesn't happen here. A better way to manage that risk is to simply declare the buffer within a code block that's closed by the time the call is made (if possible) - that way the local variable is removed from the stack before the call is made. 10 bytes was a round number - it leaves a little margin for error if someone updates the code and miscounts :-) 8 bytes is all that's really needed. |
Ok, I don't understand why this one makes such a big difference - I expected a few bytes, but almost 500???
Making similar changes elsewhere only made a few bytes difference as expected. Maybe it's the result of an optimiser bug.... or maybe due to the position in the parameter list the optimiser can't see how to do it? I also got a few bytes by merging the common code from set_pll() and set_vcxo() into a write_pll_parameters() function. That one's more about improving code management - the code does exactly the same thing. I did test replacing i++ with constants in that code.. for a grand saving of 0 bytes... I prefer the i++ version. It also ensures the params buffer only exists when it's actually needed, at the expense of 1 call frame on the stack. Given we've cut the buffer size by 10 I'm not too concerned about the extra 7 bytes for the call. I also noticed that set_pll() updates plla_freq or pllb_freq as appropriate, but set_vcxo does not after performing the pll parameter write. Is this a bug? If so I'll apply the fix. |
I discovered a good answer to selecting the "Lite" version for compiling with or without CLK67 support from within the IDE!
and a bunch of other stuff you will have to look up but it's basically a copy of the appropriate section of the boards.txt for your arduino. I made one for the nano that works boards.txt That's it - in the IDE you will have a new board to choose from which will use the 'Lite' version of SI5351. So if you select "Arduino Nano" you get the full version of SI5351, and if you select "My Custom Board" it automatically rebuilds with the lite version. |
Would it still be useful to have a version that uses 32bit integers for frequency calculations (instead of int64)? That would cut down the size substantially on 8bit cpus, and would still allow operation up to about 40MHz, I think? |
I'm a noob to Arduino but the need for frequency control using the Si5351 has lit a fire under me! So first, thank you for your efforts! I really like the v2 updated library which I find much more intuitive to use.
My only request would be to create a downsized library where the combined libraries needed to control the Si5351 (e.g. "Si5151" and "Wire") could run on the ATTiny85-20. This is probably a tall order since it's only got about 6K left after the bootloader, but it would be very cool to be able to shrink the size of a frequency source to that of an old-style crystal using this 8-pin uC. Especially now that the last custom crystal maker has gone out of business.
If it were possible to slim down the library for this purpose, I'd be content with just the basic setup commands and the new set_freq() method - just the bare minimum. A simple sketch that uses just "Wire" takes a little over 2K of memory, which would leave 4K at most for "Si5351 Lite" and the user's sketch. For comparison, a fairly simple sketch using the current libraries runs about 12K.
Thanks and 73,
Bob W9RAN
The text was updated successfully, but these errors were encountered: