Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LTO-compatible ISRs #66392

Merged
merged 14 commits into from
Feb 2, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 11 additions & 1 deletion CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -213,6 +213,11 @@ endif()
# Apply the final optimization flag(s)
zephyr_compile_options(${OPTIMIZATION_FLAG})

if(CONFIG_LTO)
add_compile_options($<TARGET_PROPERTY:compiler,optimization_lto>)
add_link_options($<TARGET_PROPERTY:linker,lto_arguments>)
endif()

# @Intent: Obtain compiler specific flags related to C++ that are not influenced by kconfig
zephyr_compile_options($<$<COMPILE_LANGUAGE:CXX>:$<TARGET_PROPERTY:compiler-cpp,required>>)

Expand Down Expand Up @@ -805,6 +810,10 @@ target_include_directories(${OFFSETS_LIB} PRIVATE
kernel/include
${ARCH_DIR}/${ARCH}/include
)

# Make sure that LTO will never be enabled when compiling offsets.c
set_source_files_properties(${OFFSETS_C_PATH} PROPERTIES COMPILE_OPTIONS $<TARGET_PROPERTY:compiler,prohibit_lto>)
Copy link
Collaborator

@marc-hb marc-hb Feb 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks old compilers:

xt-xcc ERROR parsing -fno-lto:  unknown flag

@tejlmand how can we test this flag first, any good example to follow?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tentative fix submitted in #69058


target_link_libraries(${OFFSETS_LIB} zephyr_interface)
add_dependencies(zephyr_interface
${SYSCALL_LIST_H_TARGET}
Expand Down Expand Up @@ -1228,10 +1237,11 @@ if(CONFIG_GEN_ISR_TABLES)
# isr_tables.c is generated from ${ZEPHYR_LINK_STAGE_EXECUTABLE} by
# gen_isr_tables.py
add_custom_command(
OUTPUT isr_tables.c
OUTPUT isr_tables.c isr_tables_vt.ld isr_tables_swi.ld
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how will outputting *.ld files work with non-ld supporting linkers ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would export empty file with comment only that it is intentionally empty. Leaving it here is just to simplify the code later - just to be able to include it.

COMMAND ${PYTHON_EXECUTABLE}
${ZEPHYR_BASE}/scripts/build/gen_isr_tables.py
--output-source isr_tables.c
--linker-output-files isr_tables_vt.ld isr_tables_swi.ld
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be depending on linker in use ?

I tried a couple of samples, but all produced isr_tables_vt.ld and isr_tables_swi.ld contains /* Empty */.
Do you have any examples of samples which actually produces something in those files ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any examples of samples which actually produces something in those files ?

answering my own question, in case others are wondering the same.

It appears that in order to make use of this new feature, one must set: CONFIG_ISR_TABLES_LOCAL_DECLARATION=y.

--kernel $<TARGET_FILE:${ZEPHYR_LINK_STAGE_EXECUTABLE}>
--intlist-section .intList
--intlist-section intList
Expand Down
7 changes: 7 additions & 0 deletions Kconfig.zephyr
Original file line number Diff line number Diff line change
Expand Up @@ -428,6 +428,13 @@ config NO_OPTIMIZATIONS
default stack sizes in order to avoid stack overflows.
endchoice

config LTO
bool "Link Time Optimization [EXPERIMENTAL]"
depends on !(GEN_ISR_TABLES || GEN_IRQ_VECTOR_TABLE) || ISR_TABLES_LOCAL_DECLARATION
select EXPERIMENTAL
help
This option enables Link Time Optimization.

config COMPILER_WARNINGS_AS_ERRORS
bool "Treat warnings as errors"
help
Expand Down
23 changes: 23 additions & 0 deletions arch/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -391,6 +391,29 @@ config NOCACHE_MEMORY

menu "Interrupt Configuration"

config ISR_TABLES_LOCAL_DECLARATION_SUPPORTED
bool
default y
# Userspace is currently not supported
depends on !USERSPACE
# List of currently supported architectures
depends on ARM || ARM64
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we have some more deps related to toolchain / linker ?
if $(ZEPHYR_TOOLCHAIN_VARIANT) = "<list>"

or
if $(ZEPHYR_TOOLCHAIN_VARIANT) != "<list>"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will add zephyr and gnuarmemb dependencies. I was not able to compile the code with llvm to test compatibility.

# List of currently supported toolchains
depends on "$(ZEPHYR_TOOLCHAIN_VARIANT)" = "zephyr" || "$(ZEPHYR_TOOLCHAIN_VARIANT)" = "gnuarmemb"

config ISR_TABLES_LOCAL_DECLARATION
bool "ISR tables created locally and placed by linker [EXPERIMENTAL]"
depends on ISR_TABLES_LOCAL_DECLARATION_SUPPORTED
select EXPERIMENTAL
help
Enable new scheme of interrupt tables generation.
This is totally different generator that would create tables entries locally
where the IRQ_CONNECT macro is called and then use the linker script to position it
in the right place in memory.
The most important advantage of such approach is that the generated interrupt tables
are LTO compatible.
The drawback is that the support on the architecture port is required.

config DYNAMIC_INTERRUPTS
bool "Installation of IRQs at runtime"
help
Expand Down
8 changes: 8 additions & 0 deletions arch/arm/core/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -35,3 +35,11 @@ else()
zephyr_linker_sources(ROM_START SORT_KEY 0x0vectors vector_table.ld)
zephyr_linker_sources(ROM_START SORT_KEY 0x1vectors cortex_m/vector_table_pad.ld)
endif()

if(CONFIG_GEN_SW_ISR_TABLE)
if(CONFIG_DYNAMIC_INTERRUPTS)
zephyr_linker_sources(RWDATA swi_tables.ld)
else()
zephyr_linker_sources(RODATA swi_tables.ld)
endif()
endif()
8 changes: 8 additions & 0 deletions arch/arm/core/swi_tables.ld
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
/*
* Copyright (c) 2023 Nordic Semiconductor ASA
*
* SPDX-License-Identifier: Apache-2.0
*/
#if LINKER_ZEPHYR_FINAL
INCLUDE zephyr/isr_tables_swi.ld
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing wrong, as it appears to me that isr_tables_swi.ld is a clean ld snippet, meaning it doesn't need to be run through the pre-processor, but I would still like to know the rationale to use INCLUDE here (and let the linker do the inclusion) and then the pattern used elsewhere, which is include() and included by the preprocessor.
One advantage of doing it with pre-processor is that we end with a complete and self-contained final linker.cmd (and friends) file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to use include with preprocessor, the problem was that the file does not exist in a moment while the linker script is created. It will be created in build phase while it looks like the linker script is created during configuration phase of cmake.

#endif
4 changes: 4 additions & 0 deletions arch/arm/core/vector_table.ld
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,10 @@ _vector_start = .;
KEEP(*(.exc_vector_table))
KEEP(*(".exc_vector_table.*"))

#if LINKER_ZEPHYR_FINAL && CONFIG_ISR_TABLES_LOCAL_DECLARATION
INCLUDE zephyr/isr_tables_vt.ld
#else
KEEP(*(.vectors))
#endif

_vector_end = .;
8 changes: 8 additions & 0 deletions arch/arm64/core/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -55,3 +55,11 @@ if(CMAKE_C_COMPILER_ID STREQUAL "GNU")
endif()

add_subdirectory_ifdef(CONFIG_XEN xen)

if(CONFIG_GEN_SW_ISR_TABLE)
if(CONFIG_DYNAMIC_INTERRUPTS)
zephyr_linker_sources(RWDATA swi_tables.ld)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should sort key be used here and below ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the localization of the swi_tables in memory changes anything. Or maybe I do not get some nuanses?

else()
zephyr_linker_sources(RODATA swi_tables.ld)
endif()
endif()
8 changes: 8 additions & 0 deletions arch/arm64/core/swi_tables.ld
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
/*
* Copyright (c) 2023 Nordic Semiconductor ASA
*
* SPDX-License-Identifier: Apache-2.0
*/
#if LINKER_ZEPHYR_FINAL
INCLUDE zephyr/isr_tables_swi.ld
#endif
5 changes: 5 additions & 0 deletions arch/common/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,11 @@ zephyr_linker_sources_ifdef(CONFIG_GEN_ISR_TABLES
${ZEPHYR_BASE}/include/zephyr/linker/intlist.ld
)

zephyr_linker_sources_ifdef(CONFIG_ISR_TABLES_LOCAL_DECLARATION
SECTIONS
${ZEPHYR_BASE}/include/zephyr/linker/isr-local-drop-unused.ld
)

zephyr_linker_sources_ifdef(CONFIG_GEN_IRQ_VECTOR_TABLE
ROM_START
SORT_KEY 0x0vectors
Expand Down
14 changes: 13 additions & 1 deletion arch/common/isr_tables.c
Original file line number Diff line number Diff line change
Expand Up @@ -15,15 +15,27 @@
struct int_list_header {
uint32_t table_size;
uint32_t offset;
#if IS_ENABLED(CONFIG_ISR_TABLES_LOCAL_DECLARATION)
uint32_t swi_table_entry_size;
uint32_t shared_isr_table_entry_size;
uint32_t shared_isr_client_num_offset;
#endif /* IS_ENABLED(CONFIG_ISR_TABLES_LOCAL_DECLARATION) */
};

/* These values are not included in the resulting binary, but instead form the
* header of the initList section, which is used by gen_isr_tables.py to create
* the vector and sw isr tables,
*/
Z_GENERIC_SECTION(.irq_info) struct int_list_header _iheader = {
Z_GENERIC_SECTION(.irq_info) __used struct int_list_header _iheader = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hopefully the compiler won't optimize out a global variable.
The __used cannot be seen by the linker, so for linking it should have no effect.

So when exactly have you seen this var being optimized out, and did this change actually fix that ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had an issue while enabling LTO during tests done by @LuDuda. The header was optimized out. This fix removes the issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds more like an LTO toolchain bug then? As described, that's a gcc attribute that only controls whether or not the compiler will emit assembly for the object when it can prove it's unused (which it can't by definition here, since it's not static). Not necessarily worth a -1, but would be good to dig on this and try to get closer to a root cause.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that using LTO can cause some surprising problems. In a matter of fact, with LTO, linker does some jobs that are more compiler related. Now it is linker job to prove that some functions, structures and others are not referenced. Note also, that we are adding -fffunction-sections and -fdata-sections - maybe this would help to remove this variable if unused. It surprises me a little as it is placed in predefined section anyway. But it is the effect we had and this was the fastest and simplest solution.

.table_size = IRQ_TABLE_SIZE,
.offset = CONFIG_GEN_IRQ_START_VECTOR,
#if IS_ENABLED(CONFIG_ISR_TABLES_LOCAL_DECLARATION)
.swi_table_entry_size = sizeof(struct _isr_table_entry),
#if IS_ENABLED(CONFIG_SHARED_INTERRUPTS)
.shared_isr_table_entry_size = sizeof(struct z_shared_isr_table_entry),
.shared_isr_client_num_offset = offsetof(struct z_shared_isr_table_entry, client_num),
#endif /* IS_ENABLED(CONFIG_SHARED_INTERRUPTS) */
#endif /* IS_ENABLED(CONFIG_ISR_TABLES_LOCAL_DECLARATION) */
};

/* These are placeholder tables. They will be replaced by the real tables
Expand Down
12 changes: 6 additions & 6 deletions arch/common/shared_irq.c
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ void z_shared_isr(const void *data)
{
size_t i;
const struct z_shared_isr_table_entry *entry;
const struct z_shared_isr_client *client;
const struct _isr_table_entry *client;

entry = data;

Expand All @@ -42,7 +42,7 @@ void z_isr_install(unsigned int irq, void (*routine)(const void *),
{
struct z_shared_isr_table_entry *shared_entry;
struct _isr_table_entry *entry;
struct z_shared_isr_client *client;
struct _isr_table_entry *client;
unsigned int table_idx;
int i;
k_spinlock_key_t key;
Expand Down Expand Up @@ -103,10 +103,10 @@ void z_isr_install(unsigned int irq, void (*routine)(const void *),
k_spin_unlock(&lock, key);
}

static void swap_client_data(struct z_shared_isr_client *a,
struct z_shared_isr_client *b)
static void swap_client_data(struct _isr_table_entry *a,
struct _isr_table_entry *b)
{
struct z_shared_isr_client tmp;
struct _isr_table_entry tmp;

tmp.arg = a->arg;
tmp.isr = a->isr;
Expand Down Expand Up @@ -162,7 +162,7 @@ int z_isr_uninstall(unsigned int irq,
{
struct z_shared_isr_table_entry *shared_entry;
struct _isr_table_entry *entry;
struct z_shared_isr_client *client;
struct _isr_table_entry *client;
unsigned int table_idx;
size_t i;
k_spinlock_key_t key;
Expand Down
9 changes: 7 additions & 2 deletions cmake/bintools/gnu/target.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,13 @@
find_program(CMAKE_OBJCOPY ${CROSS_COMPILE}objcopy PATHS ${TOOLCHAIN_HOME} NO_DEFAULT_PATH)
find_program(CMAKE_OBJDUMP ${CROSS_COMPILE}objdump PATHS ${TOOLCHAIN_HOME} NO_DEFAULT_PATH)
find_program(CMAKE_AS ${CROSS_COMPILE}as PATHS ${TOOLCHAIN_HOME} NO_DEFAULT_PATH)
find_program(CMAKE_AR ${CROSS_COMPILE}ar PATHS ${TOOLCHAIN_HOME} NO_DEFAULT_PATH)
find_program(CMAKE_RANLIB ${CROSS_COMPILE}ranlib PATHS ${TOOLCHAIN_HOME} NO_DEFAULT_PATH)
if(CONFIG_LTO)
find_program(CMAKE_AR ${CROSS_COMPILE}gcc-ar PATHS ${TOOLCHAIN_HOME} NO_DEFAULT_PATH)
find_program(CMAKE_RANLIB ${CROSS_COMPILE}gcc-ranlib PATHS ${TOOLCHAIN_HOME} NO_DEFAULT_PATH)
else()
find_program(CMAKE_AR ${CROSS_COMPILE}ar PATHS ${TOOLCHAIN_HOME} NO_DEFAULT_PATH)
find_program(CMAKE_RANLIB ${CROSS_COMPILE}ranlib PATHS ${TOOLCHAIN_HOME} NO_DEFAULT_PATH)
endif()
find_program(CMAKE_READELF ${CROSS_COMPILE}readelf PATHS ${TOOLCHAIN_HOME} NO_DEFAULT_PATH)
find_program(CMAKE_NM ${CROSS_COMPILE}nm PATHS ${TOOLCHAIN_HOME} NO_DEFAULT_PATH)
find_program(CMAKE_STRIP ${CROSS_COMPILE}strip PATHS ${TOOLCHAIN_HOME} NO_DEFAULT_PATH)
Expand Down
3 changes: 3 additions & 0 deletions cmake/compiler/gcc/compiler_flags.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,9 @@ endif()
set_compiler_property(PROPERTY optimization_speed -O2)
set_compiler_property(PROPERTY optimization_size -Os)

set_compiler_property(PROPERTY optimization_lto -flto)
set_compiler_property(PROPERTY prohibit_lto -fno-lto)

#######################################################
# This section covers flags related to warning levels #
#######################################################
Expand Down
2 changes: 2 additions & 0 deletions cmake/linker/ld/linker_flags.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,8 @@ endif()

set_property(TARGET linker PROPERTY partial_linking "-r")

set_property(TARGET linker PROPERTY lto_arguments -flto -fno-ipa-sra -ffunction-sections -fdata-sections)

# Some linker flags might not be purely ld specific, but a combination of
# linker and compiler, such as:
# --coverage for clang
Expand Down
Loading
Loading