Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CMake support (Zephyr Module) #177

Closed
yashi opened this issue Oct 5, 2021 · 22 comments
Closed

Add CMake support (Zephyr Module) #177

yashi opened this issue Oct 5, 2021 · 22 comments

Comments

@yashi
Copy link
Collaborator

yashi commented Oct 5, 2021

Hi @keith-packard,

I've read this pull-request and I'm very much interested in integrating picolibc as a Zephyr module. In order to proceed I'd like to use this issue to gather requirements for building and integrating piocolibc as a Zephyr module.

To kick this off, let me ask a few questions

  • You said in the PR you are happy to integrate CMake files into picolibc. Does it mean picolibc hosts CMakeLists.txt files all over the source tree?

    • CMake is similar to Meson, meaning that there usually a CMakeLists.txt in most of the directories.
    • Unlike Meson, it needs a template file for generating picolibc.h. So it adds yet another file.
    • Unlike Meson, it doesn't need a option file. All options are declared in CMakeLists.txt.
    • I have prepared and attached a minimum change, which builds strnlen()-only picolibc with CMake. Hoping that the diff gives you an idea.
  • Do you want CMake to be as general as Meson, or be specific to Zephyr? I think there are pros and cons for each direction. I'm happy with either way.

    • CMake only for Zephyr (this is simple, can skip unnecessary features, such as multilib or semi hosting, in CMake)
    • CMake as a second build system (this makes it more complex, but aligned with Meson, and someone from non Zephyr community might find it useful)

The following is the minimum change to build strnlen()-only picolibc with CMake, without Zephyr module glue code.

diff --git a/CMakeLists.txt b/CMakeLists.txt
new file mode 100644
index 000000000..df0c84408
--- /dev/null
+++ b/CMakeLists.txt
@@ -0,0 +1,19 @@
+project(PICOLIBC)
+cmake_minimum_required(VERSION 3.20)
+
+add_library(picolibc)
+#target_compile_features(picolibc PRIVATE c_std_18)
+target_compile_options(picolibc PRIVATE -Wall -Wextra)
+
+set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS Debug Release MinSizeRel)
+if(NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)
+  message(STATUS "No build type selected, default to MinSizeRel")
+  set(CMAKE_BUILD_TYPE MinSizeRel CACHE STRING "Build Type" FORCE)
+endif()
+
+target_include_directories(picolibc PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
+
+#add_subdirectory(dummyhost)
+add_subdirectory(newlib)
+
+configure_file(picolibc.h.in picolibc.h)
diff --git a/newlib/CMakeLists.txt b/newlib/CMakeLists.txt
new file mode 100644
index 000000000..8dd39ad35
--- /dev/null
+++ b/newlib/CMakeLists.txt
@@ -0,0 +1 @@
+add_subdirectory(libc)
diff --git a/newlib/libc/CMakeLists.txt b/newlib/libc/CMakeLists.txt
new file mode 100644
index 000000000..a58143cd8
--- /dev/null
+++ b/newlib/libc/CMakeLists.txt
@@ -0,0 +1,2 @@
+target_include_directories(picolibc PUBLIC include)
+add_subdirectory(string)
diff --git a/newlib/libc/string/CMakeLists.txt b/newlib/libc/string/CMakeLists.txt
new file mode 100644
index 000000000..fb0539b66
--- /dev/null
+++ b/newlib/libc/string/CMakeLists.txt
@@ -0,0 +1 @@
+target_sources(picolibc PRIVATE strnlen.c)
diff --git a/picolibc.h.in b/picolibc.h.in
new file mode 100644
index 000000000..e69de29bb
@keith-packard
Copy link
Collaborator

Hi @keith-packard,

I've read this pull-request and I'm very much interested in integrating picolibc as a Zephyr module. In order to proceed I'd like to use this issue to gather requirements for building and integrating piocolibc as a Zephyr module.

Awesome! I looked into this briefly, but I really don't have time to figure this out on my own.

To kick this off, let me ask a few questions

* You said in the PR you are _happy to integrate CMake files into picolibc_.  Does it mean picolibc hosts `CMakeLists.txt` files all over the source tree?

It seems like it's either that, or figure out how to add meson to Zephyr, which doesn't seem like a great plan as adding another build dependency just makes things harder for Zephyr users.

  * CMake is similar to Meson, meaning that there usually a `CMakeLists.txt` in most of the directories.
  * Unlike Meson, it needs a template file for generating `picolibc.h`.  So it adds yet another file.
  * Unlike Meson, it doesn't need a option file.  All options are declared in `CMakeLists.txt`.
  * I have prepared and attached a minimum change, which builds `strnlen()`-only picolibc with CMake. Hoping that the diff gives you an idea.

I'm a little familiar with CMake, so these all seem reasonable.

* Do you want CMake to be as general as Meson, or be specific to Zephyr?  I think there are pros and cons for each direction.  I'm happy with either way.

Hrm. I don't plan on switching the upstream build system, so I'm not sure it would be useful to make it completely general. In particular, there are a lot of optional bits of picolibc that a Zephyr build would probably never use, so a Zephyr-specific setup would be simpler. What I would like to see is the ability to integrate CMake into the CI stuff running on github so that we ensure those build bits don't stop working.

  * CMake only for Zephyr (this is simple, can skip unnecessary features, such as multilib or semi hosting, in CMake)

Yup, as a module, we won't want multilib. We may want to keep the semihosting bits so that the build can be tested in the picolibc CI setup.

  * CMake as a second build system (this makes it more complex, but aligned with Meson, and someone from non Zephyr community might find it useful)

Let's leave that problem to a future change when someone else comes in with a requirement for CMake like this. I suspect the Zephyr setup will be a good start for anyone else wanting to embed picolibc into their own CMake-based system.

The following is the minimum change to build strnlen()-only picolibc with CMake, without Zephyr module glue code.

Looks good. I'd be happy to review a PR containing a stand-along CMake build for a non-multilib configuration with the default picolibc options that passes the picolibc tests. Happy to help if you get stuck anywhere; there are probably some weird corners in picolibc that you'll get caught by.

@keith-packard
Copy link
Collaborator

Oh, this leaves one question unanswered -- what to do about libstdc++. If we make picolibc a zephyr module, how can we also support libstdc++, which must be built against libc?

@yashi
Copy link
Collaborator Author

yashi commented Oct 5, 2021

Thanks. I'll proceed with CMakeLists.txt specific to Zephyr (for now).

Regarding GitHub Actions, would you want to

  • Build a plain picolibc with CMake (with semihosting?), or
  • Build picolibc along with Zephyr?

I think it's easier for you to just use plain picolibc without Zephyr. But "picolibc built by CMake" doesn't test much; just some compiler flags and other bits. So I'm assuming that we have to test against Zephyr.

For libstdc++, I honestly don't know. I've never used C++ with Zephyr and not planning to ATM. But I'll dig into it. If you have relevant issues I should know, please let me know.

@keith-packard
Copy link
Collaborator

If libstdc++ is required, it's a big uplift -- you need to build the whole gcc package twice. That's kinda why having picolibc in crosstool-ng is useful; that toolchain is all set to build picolibc, newlib and newlib-nano along with three versions of libstdc++. Of course, that comes at the cost of doing a huge toolchain build as crosstool-ng does the full multilib build of all three as well.

Maybe the full solution is to add picolibc as a module with cmake build rules but also add picolibc to the zephyr toolchain so that projects using libstdc++ can get it from there instead of using the embedded version? As the picolibc support in Zephyr doesn't care where the library is built, that should be a separable project from this one, and can wait until someone needs it.

@stephanosio
Copy link
Contributor

stephanosio commented Mar 21, 2022

It has been recently brought up that we need a middle ground between the Zephyr minimal libc and the newlib included in the Zephyr SDK toolchains because the Zephyr minimal libc is lacking too many features, and the newlib is too heavy for most applications.

picolibc seems to be a good candidate for that since it is almost as light as the Zephyr minimal libc for some configurations despite being a fully featured C library -- we could expand the Zephyr minimal libc to provide such features, but why reinvent the wheel and take more maintenance burden ourselves when we can reuse an existing component?

In that sense, importing picolibc as a Zephyr module with the goal of ultimately replacing the Zephyr minimal libc would be very useful.

For that, we (Zephyr Project) need to consider and implement the following:

  1. picolibc needs to be a Zephyr module that can be integrated to the existing Zephyr ecosystem via west. We could fork the picolibc and add the necessary files to make it a Zephyr module.
  2. picolibc build system needs to be integrated to the existing Zephyr build system, which is written in CMake. This can be done by adding the necessary CMake scripts on the Zephyr side so as to keep the modifications to the fork minimal.
  3. picolibc needs to be configurable at build-time via Kconfig so that it can be configured to enable only the required features and minimise the memory footprint. This can be done by adding the necessary Kconfig files on the Zephyr side (similar to how it is done for the CMake scripts).
  4. picolibc needs to be able to support all currently supported Zephyr targets.
  5. (long term goal) picolibc should be able to be built for non-GNU toolchains (e.g. Clang, Arm Compiler, IAR, ...) -- this is not an absolute requirement for now.

@stephanosio
Copy link
Contributor

With #177 (comment) in mind, the GNU libstdc++ integration with picolibc would not be so essential since the picolibc is replacing the minimal libc, which is intended to be used on the systems with fairly limited resources. If one needs C++ (i.e. memory footprint is not a major concern), one can use the newlib.

When picolibc is used, the C++ support will be limited to what is currently available in the Zephyr C++ subsystem (basically, only new/delete support). We will, however, likely expand the Zephyr subsystem to include something like a "minimal "C++ library," (e.g. forking LLVM libc++ as a Zephyr module) in which case the picolibc can be used alongside it.

@stephanosio
Copy link
Contributor

cc @carlescufi @nashif @cfriedt

@cfriedt
Copy link

cfriedt commented Mar 21, 2022

I'm actively working on supporting C++ although primarily looking at newlib right now (with gthr-posix / pthreads as a stepping stone)

@keith-packard
Copy link
Collaborator

Crosstool-ng has all of the toolchain support for libstdc++ integration with all three external C libraries, including picolibc, so at least that part of C++ integration support is ready. I managed to get the necessary patches for picolibc support integrated into gcc so that libstdc++ would work with it. I haven't looked at llvm's libc++ at all.

@keith-packard
Copy link
Collaborator

1. picolibc needs to be a Zephyr module that can be integrated to the existing Zephyr ecosystem via west. We could fork the picolibc and add the necessary files to make it a Zephyr module.

Is that to support building with a toolchain other than the Zephyr SDK?

2. picolibc build system needs to be integrated to the existing Zephyr build system, which is written in CMake. [This can be done by adding the necessary CMake scripts on the Zephyr side](https://docs.zephyrproject.org/latest/guides/modules.html#module-integration-files-in-zephyr) so as to keep the modifications to the fork minimal.

Happy to host cmake scripts inside the picolibc repository; I suspect you all aren't terribly interested in dealing with yet-another-build-system.

4. picolibc needs to be able to [support all currently supported Zephyr targets](https://github.com/zephyrproject-rtos/zephyr/tree/main/arch).

As picolibc inherits architecture support from newlib, most of the Zephyr targets already have the necessary code, but don't have any build infrastructure. That's mostly blocked on adequate testing support; without some way to verify the architecture works, I'm hesitant to advertise support.

5. (long term goal) picolibc should be able to be built for non-GNU toolchains (e.g. Clang, Arm Compiler, IAR, ...) -- this is not an absolute requirement for now.

Picolibc has clang support, and is continuously tested with that compiler. It also has support for the verified Comp Cert compiler. Should be easy to add support for other reasonably standards-conformant compilers now.

@rdiez
Copy link

rdiez commented Mar 21, 2022

If one needs C++ (i.e. memory footprint is not a major concern), one can use the newlib.

There is this persistent prejudice that support for C++, including exceptions, is expensive. This is a quote from my embedded project:

https://github.com/rdiez/JtagDue

Contrary to popular belief, C++ exception handling does not need many resources. I have been generating dynamic error messages in readable English using C++ exceptions on microcontrollers with as little as 16 KiB SRAM for years, with 'plenty' of memory to spare.

You do need to patch GCC though in order to disable the C++ exception emergency buffer, or you will lose 2 KiB RAM, or maybe more, depending on your target. See the toolchain builder makefile for more information.

@carlescufi
Copy link

Is that to support building with a toolchain other than the Zephyr SDK?

Yes, that's right. GNU Arm Embedded for example, or commercial toolchains like Arm Compiler 6 in the future.

@keith-packard
Copy link
Collaborator

Yes, that's right. GNU Arm Embedded for example, or commercial toolchains like Arm Compiler 6 in the future.

Picolibc happens to provide binaries for the GNU Arm Embedded toolchain , but I certainly get the point about not depending on external toolchain library support, especially if you think you'd want to control the library configuration in more detail. I'd be interested in understanding precisely what kinds of configuration options you're interested in here -- the 'big ticket' options in picolibc revolve around locale support, and I'm not sure Zephyr is heading in that direction?

@keith-packard
Copy link
Collaborator

There is this persistent prejudice that support for C++, including exceptions, is expensive.

Hrm. In my experiments, adding exception support required linking in tables full of information needed to perform stack unwinding. If there's some way to avoid all of that, that would be awesome.

@carlescufi
Copy link

Picolibc happens to provide binaries for the GNU Arm Embedded toolchain ,

Oh, good to know, thank you! Still leaves the problem of distribution: Zephyr users are typically not expected to download additional binaries to get their application running, even with 3rd-party toolchains.

especially if you think you'd want to control the library configuration in more detail. I'd be interested in understanding precisely what kinds of configuration options you're interested in here -- the 'big ticket' options in picolibc revolve around locale support, and I'm not sure Zephyr is heading in that direction?

I think @stephanosio can definitely fill in the gaps here.

@rdiez
Copy link

rdiez commented Mar 21, 2022

Hrm. In my experiments, adding exception support required linking
in tables full of information needed to perform stack unwinding.
If there's some way to avoid all of that, that would be awesome.

There is an initial penalty to pay for the code that handles stack unwinding, but I believe that the code is rather small, and it needs mainly ROM/Flash memory, not RAM/SRAM. This is assuming that your embedded software is already using malloc().

As far as I understand it, the exception unwinding tables tend to be similar, and the linker usually collapses duplicates, so the resulting table size is actually rather small. Again, these tables only need Flash/ROM memory.

If your C++ functions do not use exceptions, you can mark them with throw(), so that they do not generate any such overhead. Or enable LTO, which will optimise everything away.

If you handle errors the traditional way, with "if ( failed ) then..." statements, then both C and C++ normally end up needing more Flash/ROM space than the C++ exception way. GCC uses a technique called "zero cost exception handling" which replaces such "if ( failed )" code checks with those exception unwinding tables. That is the reason why the resulting code is usually smaller, and even faster in the normal case (no errors/exceptions).

@keith-packard
Copy link
Collaborator

There is an initial penalty to pay for the code that handles stack unwinding, but I believe that the code is rather small, and it needs mainly ROM/Flash memory, not RAM/SRAM. This is assuming that your embedded software is already using malloc().

As far as I understand it, the exception unwinding tables tend to be similar, and the linker usually collapses duplicates, so the resulting table size is actually rather small. Again, these tables only need Flash/ROM memory.

I just performed a simple experiment to see what these numbers were like using basic 'hello-world' C program compiled with the C++ compiler. You're quite correct about the exception tables; they're not big at all, with a total of 232 bytes of ROM out of a 5kB total. Adding an explicit catch/throw invocation adds about 5kB of ROM usage, increasing the exception tables to 644 bytes.

If you handle errors the traditional way, with "if ( failed ) then..." statements, then both C and C++ normally end up needing more Flash/ROM space than the C++ exception way. GCC uses a technique called "zero cost exception handling" which replaces such "if ( failed )" code checks with those exception unwinding tables. That is the reason why the resulting code is usually smaller, and even faster in the normal case (no errors/exceptions).

I'd say the "usual" C error handling plan is to ignore them :-). But, you've made a pretty strong case that replacing a pile of correct C exception handling with stack unwinding could credibly reduce the overall ROM space needed while also improving performance. That's pretty cool, and certainly more reasonable than using the standard C API for stack unwinding (setjmp/longjmp).

The thing to avoid with C++ is iostream -- replacing printf("hello, world\n"); with std::cout << "Hello, world!\n" adds 90kB of ROM usage (including an additional 7.5kB of exception tables) and 3.5kB of RAM usage.

@stephanosio
Copy link
Contributor

I'd be interested in understanding precisely what kinds of configuration options you're interested in here -- the 'big ticket' options in picolibc revolve around locale support, and I'm not sure Zephyr is heading in that direction?

@keith-packard The following is the kind of configurations we have for the Zephyr minimal libc at the moment:
https://github.com/zephyrproject-rtos/zephyr/blob/a4d4384447710e9885c0248385db0d5a5d197d50/lib/libc/Kconfig#L122-L180

These configurations mainly allow the users to control the availability of (the sets of) the C standard library functions, sub-features of these functions and internal buffer sizes so that users can fine-tune these configurations to minimise the memory footprint as needed.

Ideally, the Zephyr users should be able to configure the following aspects of the picolibc at build time (via Kconfig):

  • Availability of the C standard library functions
    • with very high level of granularity (i.e. not just at stdio, stdlib, string, etc. level, but down to individual functions if necessary)
    • the purpose of this is to minimise both build time and memory footprint; we certainly do not want to build the entire picolibc while building a Zephyr application as that would unreasonably increase the build time.
  • Availability of the sub-features of the C standard library functions
    • e.g. for printf, optional long long support, floating point support, double support, ...
  • Implementation backends
    • e.g. different implementation strategies for qsort, bsearch, ...
    • e.g. different backends and randomness sources for rand (with the option to use the Zephyr random subsystem)
    • e.g. different optimisation levels and algorithms for memset, memcpy, ...
  • Internal buffer sizes
    • e.g. internal buffer sizes for vfprintf, vfscanf, ...
  • Thread-safety and reentrancy
    • ability to build without (or with absolute minimum) thread-safety and reentrancy provisions
    • granularity of the struct _reent
  • Compilation options
    • optimisation level (O1, O2, ...)
    • optimisation strategy (optimise for size, optimise for speed, ...)
    • will be automatically taken care of by the Zephyr build system and the relevant Zephyr-side configurations if we properly integrate it

@keith-packard
Copy link
Collaborator

These all seem reasonable; let me describe what work would be needed to support them.

* Availability of the C standard library functions
  
  * with very high level of granularity (i.e. not just at `stdio`, `stdlib`, `string`, etc. level, but down to individual functions if necessary)
  * the purpose of this is to minimise both build time and memory footprint; we certainly do not want to build the entire picolibc while building a Zephyr application as that would unreasonably increase the build time.

Picolibc builds in about 10 seconds in the standard configuration; the configuration process takes more time than the compile. However, large parts of the library (like the locale support) can be disabled. I'm not sure we really need to provide support down to the individual function level; that would make the configuration quite complex as there are internal cross-dependencies in the library on various functions. I suspect the biggest divide will be whether to provide math functions or not; that would save a few seconds of build time and there aren't as many cross dependencies (although stdio does use some math functions for long double support).

* Availability of the sub-features of the C standard library functions
  
  * e.g. for printf, optional long long support, floating point support, double support, ...

Yup, already available in the underlying library. We can lift those configuration options up to the Zephyr configuration.

* Implementation backends
  
  * e.g. different implementation strategies for `qsort`, `bsearch`, ...
  * e.g. different backends and randomness sources for `rand` (with the option to use the Zephyr random subsystem)
  * e.g. different optimisation levels and algorithms for `memset`, `memcpy`, ...

There are only two implementations of the various mem* and str* functions, one selected when optimizing for space and the other when optimizing for speed. Those are automatically selected when you choose the overall optimization level for the library. Fixing the current random number generator would be a good project -- I haven't touched it at all, other than to switch the non-reentrant versions to use a TLS variable for state. I'm not sure what you would do with qsort and bsearch; I guess replacing quicksort with a less data-dependent sort function might avoid pathological performance issues? I'd be happy to add new versions if you've got them.

  * e.g. internal buffer sizes for `vfprintf`, `vfscanf`, ...

Those functions have no internal buffers. There are functions like ecvt, gcvt and fcvt; the non-reentrant versions of those do use TLS buffers. ecvt and gcvt have small ones based on the precision required to accurately re-generate the provided value. On the other hand, fcvt has an enormous static buffer as it may produce a value with leading or trailing zeros up to the maximum exponent. The newlib version of fcvt uses malloc for this case, because the struct _reent would otherwise be much larger. Using TLS variables, picolibc only burdens applications that want to use this function, so I decided it would be safer (and certainly much easier) to simply statically allocate the required space.

* Thread-safety and reentrancy
  
  * ability to build without (or with absolute minimum) thread-safety and reentrancy provisions

Most of the 'hard' re-entrancy issues revolve around stdio, and the core stdio in picolibc is lock-less, using atomic operations for the only stateful operations (which involve handling ungetc). For buffered stdio, picolibc uses the same locking macros as newlib, which allow the user to disable locking or to provide their own locking functions. If configured for locking, picolibc does provide weak symbols, so there is a small amount of overhead which can be avoided by configuring the library without them.

  * granularity of the [`struct _reent`](https://github.com/picolibc/picolibc/blob/89007348ecf8efe9b913b05b01a64de7a0952dc1/newlib/libc/include/sys/reent.h#L282)

In the normal configuration, picolibc doesn't use struct _reent at all -- that's only used if you're using the legacy newlib stdio paths. I haven't removed that older code yet as it's still useful in validating the test suite. Fixing that code to use TLS instead of struct _reent is not high on my list as I don't expect anyone to use it in a production environment.

* Compilation options
  
  * optimisation level (O1, O2, ...)
  * optimisation strategy (optimise for size, optimise for speed, ...)
  * will be automatically taken care of by the Zephyr build system and the relevant Zephyr-side configurations if we properly integrate it

Yup, that's all easy to pass through.

@keith-packard
Copy link
Collaborator

I've got a preliminary version of this code ready now; the cmake support is on the https://github.com/picolibc/picolibc/tree/cmake branch, then there's some zephyr support added on the https://github.com/picolibc/picolibc/tree/zephyr branch. This passes nearly all of the twister tests against the arm architecture.

@mithro
Copy link

mithro commented Apr 13, 2022

FYI - @kgugala

@keith-packard
Copy link
Collaborator

Ok, I think the picolibc piece of this puzzle is complete; zephyr is now building against the 'zephyr' branch in the picolibc repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants