-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RT1020 OTA: removed unnecessary compiler optimisation #2952
Conversation
e0db39b
to
eb8a4c0
Compare
@@ -212,8 +212,11 @@ MG_IRAM static int flexspi_nor_get_config( | |||
.ipcmdSerialClkFreq = 1, | |||
.blockSize = 64 * 1024, | |||
.isUniformBlockSize = false}; | |||
|
|||
*config = default_config; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, struct copy usually ends up in stdlib calls...
src/ota_imxrt.c
Outdated
MG_ARM_DISABLE_IRQ(); | ||
ret = (flexspi_nor_get_config(&config) == 0); | ||
if (ret) ret = flash_erase(&config, addr); | ||
MG_ARM_ENABLE_IRQ(); | ||
if (!s_flash_irq_disabled) MG_ARM_ENABLE_IRQ(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this one disabled through #if 0
is intended to be a standalone call to just erase flash, it is not meant to be called in the OTA process itself, just if we reinstall our data load/save schema in the future.
As it works by itself, it needs to reenable IRQs when exiting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTA code only calls flash_erase()
, not this function
src/ota_imxrt.c
Outdated
struct mg_flexspi_nor_config *config) { | ||
struct mg_flexspi_nor_config default_config = { | ||
struct mg_flexspi_nor_config *config) { | ||
static struct mg_flexspi_nor_config default_config = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does static
make a difference ? If it is dynamic, the compiler should generate a bunch of code to fill the struct in the stack, then memory released. If it is static, the compiler will probably init this function at startup and memory be used forever.
Does our nice compiler use stdlib calls to fill this struct in the stack ?
Then I think our copy doesn't make much sense, we should probably declare this as module static (not function static) and name it and/or change the code to access this data. (With a nice comment remembering us that this must reside in RAM)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do not use static
, then the structure will be allocated on the stack, the compiler uses memset to zero it, and then it initializes it with the corresponding values. If we use 'static', memset is no longer used, the variable is probably initialized as global and the code just loads its address.
Here is the generated code when static is not used. Note that it reserves space for the variable on stack and then clears it with memset and then proceeds to fill it with data.
00000e1c <flexspi_nor_get_config>:
e1c: b510 push {r4, lr}
e1e: f5ad 7d02 sub.w sp, sp, #520 ; 0x208
e22: f44f 7200 mov.w r2, #512 ; 0x200
e26: 2100 movs r1, #0
e28: 4604 mov r4, r0
e2a: ab02 add r3, sp, #8
e2c: 4618 mov r0, r3
e2e: f000 f9b3 bl 1198 <__memset_veneer>
e32: 4924 ldr r1, [pc, #144] ; (ec4 <flexspi_nor_get_config+0xa8>)
e34: 4a24 ldr r2, [pc, #144] ; (ec8 <flexspi_nor_get_config+0xac>)
..........
When static is used, the code is this:
0000101c <flexspi_nor_get_config>:
101c: b082 sub sp, #8
101e: 2300 movs r3, #0
1020: 4a08 ldr r2, [pc, #32] ; (1044 <flexspi_nor_get_config+0x28>)
1022: 9301 str r3, [sp, #4]
1024: 9b01 ldr r3, [sp, #4]
1026: f5b3 7f00 cmp.w r3, #512 ; 0x200
102a: d302 bcc.n 1032 <flexspi_nor_get_config+0x16>
102c: 2000 movs r0, #0
102e: b002 add sp, #8
1030: 4770 bx lr
1032: 9901 ldr r1, [sp, #4]
1034: 9b01 ldr r3, [sp, #4]
1036: 5c51 ldrb r1, [r2, r1]
1038: 54c1 strb r1, [r0, r3]
103a: 9b01 ldr r3, [sp, #4]
103c: 3301 adds r3, #1
103e: 9301 str r3, [sp, #4]
1040: e7f0 b.n 1024 <flexspi_nor_get_config+0x8>
1042: bf00 nop
1044: 00000008 .word 0x00000008
Regarding the copy,
*config = default_config;
always results in the compiler using memcpy, like you said.
So I am using a byte-by-byte copy in order to avoid the compiler inventing another optimization with 'memcpy'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eb8a4c0
to
5cb1115
Compare
mongoose.c
Outdated
@@ -5702,7 +5712,7 @@ MG_IRAM static bool flash_erase(struct mg_flexspi_nor_config *config, | |||
#if 0 | |||
// standalone erase call | |||
MG_IRAM static bool mg_imxrt_erase(void *addr) { | |||
struct mg_flexspi_nor_config config; | |||
struct mg_flexspi_nor_config *config; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see write
mongoose.c
Outdated
MG_IRAM static bool mg_imxrt_write(void *addr, const void *buf, size_t len) { | ||
struct mg_flexspi_nor_config config; | ||
struct mg_flexspi_nor_config *config; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works for RT1020, but will fail for RT1060, because there we need to have storage for the ROM function to fill, that's why I declared the struct and a pointer to it.
The struct is allocated in the stack;
for the RT1020, it is not used and freed at the end of the function call
for the RT1060, it is filled by the ROM call and freed at the end of the function call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you are right. My bad. Let me change that.
5cb1115
to
b6296e7
Compare
The compiler decided to use memset/memcpy in flexspi_nor_get_config function, which will lead to a crash during the firmware swap when we erase the sector in which stdlib resides.
I rewrote this function to make sure this optimization does not happen.
Removed sector allocation from single_bank_swap.
Added MG_IRAM to flash_wait.