Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subsys: settings: nvs: provisioning of settings images #79130

Closed

Conversation

ghost
Copy link

@ghost ghost commented Sep 27, 2024

The main purpose of this PR is to prepare provisioning of independently built settings flash images. It allows users to provision arbitrary per-device settings independently of the application binary, e.g. at the end of the production line or during an OTA update.

The PR provides an integration test environment to simulate separately built images for the NVS settings subsystem on the QEMU x68 board. This will allow users to experiment with independently provisioned settings partitions on QEMU. They can either use a provided sample partition or extract their own binary partition image from a user-provided test device:

  • It adds a reserved memory region to the QEMU x86 board and places fixed simulated flash partitions inside that region mapped to their own linker section. The linker section is added to the x86 MMU pagetable (via gen_mmu.py) to support access under memory protection.
  • It enables the flash simulator to access flash at a well-defined pre-configured addres in the reserved memory region. This will allow us to simulate pre-provisioned flash with QEMU (or any other application using the flash simulator). This feature is enabled through an independent Kconfig variable that also enforces that the memory region will be statically reserved.
  • Finally the PR adds an appropriate overlay, sample flash partition image, code and additional instructions to the settings sample.

In a later PR I'll contribute a python generator script which allows to generate arbitrary NVS settings images from a YAML configuration file. I'm splitting the contribution into multiple parts for easier reviewing across different subsystems.

This PR depends on #79084 and #79086 to be merged first. Update: done.

@Laczen
Copy link
Collaborator

Laczen commented Sep 28, 2024

@fgrandel it is nice to be able to use qemu with a flash simulator and data coming from a file. Would it also be possible to write the any changes back to the file ?

@ghost
Copy link
Author

ghost commented Sep 29, 2024

@Laczen

It is nice to be able to use qemu with a flash simulator and data coming from a file. Would it also be possible to write the any changes back to the file ?

Just to understand you correctly: You mean dumping the memory back, like Flash/Memory -> Intel Hex -> YAML?

Yes, sure: You know the memory address and size (i.e. the storage partition limits), you can easily dump it into HEX/BIN format using common tools (from QEMU and from real devices). Converting HEX/BIN back to YAML should then be easy based on the tooling I'll contribute. I estimate that the inverse path could be added in <2d of work.

Side note: While YAML->C/NVS requires a schema (~binding), you don't even need a schema to convert back to YAML, as YAML is less typed and the structural information is preserved in the config paths (except for some isomorphic config path mapping to reduce memory footprint, which might be added in the future).

I'm working on the final touches for the YAML->Intel Hex->Device path tooling for memory & flash. Expect that to be contributed in the next 1 or 2 weeks. As soon as it is halfway in shape I'll provide a draft PR for you to preview.

@ghost ghost force-pushed the feat/pre-provisioned-settings branch 3 times, most recently from fc6c624 to 868eb2a Compare October 4, 2024 19:25
The flash simulator tests were still assuming that the base memory
offset of the flash buffer is zero, see #79082.

As we'll introduce non-zero base addresses in the next change of this
change set, we need to fix it now.

Signed-off-by: Florian Grandel <fgrandel@code-for-humans.de>
The LINKER_DT_SECTIONS macro is being added to the x86/intel64 platform
to support memory attributes.

Signed-off-by: Florian Grandel <fgrandel@code-for-humans.de>
The flash simulator node is renamed for compatibility with other boards.

This is required to access simulated flash under a single name in DTS
(e.g. for samples or tests).

Signed-off-by: Florian Grandel <fgrandel@code-for-humans.de>
This change adds a reserved mapped memory region to the QEMU x86 board
and places fixed simulated flash partitions inside that region mapped to
its own linker section in the final ELF file. The section is then added
to the pagetable to support access even if memory protection is active.

The change places the simulated flash at a well-defined memory address
rather than in the common .bss section.

The rationale behind this approach is, that then the contents of the
simulated flash can be pre-provisioned via a separate raw binary or
Intel hex image to qemu (or any other application using simulated flash)
independently from the kernel image.

This will be used in a later PR to exemplify pre-provisioned settings
partitions with the NVS settings backend.

Signed-off-by: Florian Grandel <fgrandel@code-for-humans.de>
Enables the flash simulator to simulate pre-provisioned flash with QEMU
(or any other application using the flash simulator).

Signed-off-by: Florian Grandel <fgrandel@code-for-humans.de>
Makes all private functions static and fixes a minor typo.

Signed-off-by: Florian Grandel <fgrandel@code-for-humans.de>
Values may be compressed in the settings storage to reduce memory
footprint. Therefore the read length may be shorter than the target
variable's size. We need to mask the result to zero bits that may remain
from earlier values assigned to the internal variables.

We rename the variable containing the length of data read from "rc" to
"bytes_read" to make this intent clear as we believe that samples should
be as self-documenting as possible.

Signed-off-by: Florian Grandel <fgrandel@code-for-humans.de>
Adds an overlay, code and documentation to the sample that allows user
to test an independently provisioned settings partition on QEMU.

Also provides instructions how to achieve the same on real devices.

Signed-off-by: Florian Grandel <fgrandel@code-for-humans.de>
@ghost ghost force-pushed the feat/pre-provisioned-settings branch from 868eb2a to 701617d Compare October 6, 2024 17:25
@cfriedt
Copy link
Member

cfriedt commented Oct 12, 2024

Maybe it's just me, but it looks like there are also bug fixes in this PR. E.g. in the flash simulator or sample. If that's the case, would it make sense to factor those out? Just on mobile atm, so not diving too deep, but commits 1, 2, 3, 4, 6.

That might help to clarify the provisioning changes, which also look fine to me.

@ghost
Copy link
Author

ghost commented Oct 14, 2024

would it make sense to factor those out

Maybe yes, but it's extra effort just to work around a block that in my current understanding is nowhere close to conforming to Zephyr review policies or the code of conduct. Therefore I'm reluctant to invest further w/o clarification.

I think I'd rather run this PR through the chain of escalation as it will either give me a precedence to point to, to fight such behavior in the future or teach me that I misunderstood how this community wants to work - which both would be valuable outcomes IMO.

@kartben
Copy link
Collaborator

kartben commented Oct 14, 2024

@kartben Are you the right person to be assigned to this PR? If not: who should be assigned?

probably not and... I don't know :) I'm guessing @Laczen

@kartben kartben removed their assignment Oct 14, 2024
@ghost ghost assigned Laczen Oct 14, 2024
@ghost
Copy link
Author

ghost commented Oct 14, 2024

probably not and... I don't know :) I'm guessing @Laczen

Tentatively assigned to @Laczen. Release team: feel free to re-assign if this was the wrong guess.

@nashif
Copy link
Member

nashif commented Oct 14, 2024

@nordicjm

And you're deciding to use a system that the creator deemed years ago was end of life, no way.

which system is that?

@nordicjm
Copy link
Collaborator

@nordicjm

And you're deciding to use a system that the creator deemed years ago was end of life, no way.

which system is that?

NVS

@nashif
Copy link
Member

nashif commented Oct 15, 2024

@nordicjm

And you're deciding to use a system that the creator deemed years ago was end of life, no way.

which system is that?

NVS

@Laczen is that so? Where is this documented or communicated?

@Laczen
Copy link
Collaborator

Laczen commented Oct 15, 2024

@nordicjm

And you're deciding to use a system that the creator deemed years ago was end of life, no way.

which system is that?

NVS

@Laczen is that so? Where is this documented or communicated?

@nashif, this is my view on NVS yes.

NVS was created many years ago when zephyr was in a different state than today and was in need of a simple storage solution. In the design several factors were not included that have since popped up several times. The most important issues are:

  1. It is directly using the flash interface, which has limited the use to flash,
  2. It is using flash in a peculiar way (writing from start and end of a sector), that cannot be easily applied to other flash types, e.g. nand flash,
  3. Because of the peculiar way of using flash it is difficult to create it offline,
  4. It is inefficient for devices with high write-block-sizes,
    To circumvent these issues I have made multiple proposals, and I persevere in storage: Storage area #79665.

@ghost
Copy link
Author

ghost commented Oct 15, 2024

@nordicjm @nashif Please note: This PR is in no way tied to NVS or any other specific settings system. It contains generic preconditions to load any binary image to a well-defined reserved memory partition in QEMU memory. This could be littlefs, an emulated EEPROM (which can be configured to use the flash simulator, too, IIUC) and of course also NVS - which is documented as being the recommended fallback for FS-less settings images in Zephyr.

The only reason I mentioned the settings generator PoC (which isn't conceptually tied to NVS either), was to give some perspective to this PR. The reason why I worked on NVS first is exactly because it had the highest feasibility risk, not because it is the most important technology for provisioning. Other generators will be comparatively trivial if we get this to work with NVS. IMO a PoC's main purpose is just that: reduce risk and open up the largest option space possible.

Trying to prohibit a community member from doing a PoC in their free time for whatever topic and blocking an unrelated contribution due to that is so completely inacceptable in a FOSS context that it shouldn't survive for longer than a few hours before some "community leader" steps in. And I'm grateful that @Laczen made it clear that he didn't agree. But well... the outcome is the only thing that counts - and this PR is still blocked.

It's important to note, though, that I'm perceiving this as are rare counter-example in the Zephyr community. I'm not saying that we have a general problem at all, much to the contrary. It happened in other areas, too, but I only can remember two or three interactions overall that went toxic in such a way.

I'm just hoping that now a maintainer or other "community leader" will take action (as I'd do in my maintenance area) and step in. A PR being blocked for three weeks because of this is probably not ideal.

@nashif
Copy link
Member

nashif commented Oct 15, 2024

@Laczen

@nashif, this is my view on NVS yes.

NVS was created many years ago when zephyr was in a different state than today and was in need of a simple storage solution. In the design several factors were not included that have since popped up several times. The most important issues are:

  1. It is directly using the flash interface, which has limited the use to flash,
  2. It is using flash in a peculiar way (writing from start and end of a sector), that cannot be easily applied to other flash types, e.g. nand flash,
  3. Because of the peculiar way of using flash it is difficult to create it offline,
  4. It is inefficient for devices with high write-block-sizes,
    To circumvent these issues I have made multiple proposals, and I persevere in storage: Storage area #79665.

Ok, EOL here however means it does not get any new features, it is still supported in zephyr with the above constraints...
If EOL means it is unsupported and unmaintained, we need to note this somewhere and plan for deprecation and replacement, either through what you are proposing in #79665 or some other means.

@nashif
Copy link
Member

nashif commented Oct 15, 2024

@nordicjm

And you're deciding to use a system that the creator deemed years ago was end of life, no way.

which system is that?

NVS

NVS AFAIK is not deprecated or planned to be dropped, it is just limited in features and not getting any new ones. But IMO it is there to be used, in fact, the new ZMS doc recommend using it, see https://builds.zephyrproject.io/zephyr/pr/77930/docs/services/storage/zms/zms.html#zms-and-other-storage-systems-in-zephyr.

@nordicjm
Copy link
Collaborator

NVS is not usable for a provisioning system when it doesn't work on a bunch of devices that zephyr supports, it might work on a random board but it cannot be used on e.g. some stm32 or nxp boards, so what use is that for a provisioning system

@ghost
Copy link
Author

ghost commented Oct 15, 2024

NVS is not usable for a provisioning system when it doesn't work on a bunch of devices that zephyr supports, it might work on a random board but it cannot be used on e.g. some stm32 or nxp boards, so what use is that for a provisioning system

I'd be grateful if we kept this discussion in a different PR or issue. It is off-topic IMO and unrelated to this PR as I've pointed out. But I'm happy that there seems to be consensus re the (at least partial) relevance of the NVS settings subsystem for my upcoming PRs. :-)

That being said I'm not aware of any Zephyr policy that says that all Zephyr code needs to work on all devices. I could point to lots of examples where this isn't the case - bluetooth and OpenThread probably being the most prominent ones. As far as I know, a single supported hardware is enough to demonstrate the usefulness of a contribution (e.g. when introducing new driver subsystems).

Thanks @nashif for taking the initiative!

@de-nordic
Copy link
Collaborator

From the comment #79130 (comment)

   To circumvent these issues I have made multiple proposals, and I persevere in [storage: Storage area #79665](https://github.com/zephyrproject-rtos/zephyr/pull/79665).

First of all above statement is not entirely true, because I do not remember rejecting any NVS modifications that would added features provided by @Laczen. I do though remember that ideas by other contributors have been rejected by @lacze, even though they have not been changing core functionality and were rather optional and could not be used at all when not needed by user. Those contributors have been offering taking responsibility for what they have been provided but have been refused to do so.

Under evaluation that did not pass has only been various attempts to Flash Map replacement which have been rejected or died due to technical matters; due to strictly refusal or lack of cooperation by @Laczen manifesting itself by rejection of requests for minimal changes in PRs where core functionality was not altered, which by the paragraph above represents double standards. You have presented @Laczen dismissive attitude towards maintainers and other contributors technical knowledge and/or experience in fields they represent and dismissive attitude to technical discussion that has been attempt to provide explanation on the decisions.

All of above is on public record, available for community review, in form of PR comments, discord chats and video recordings of various Zephyr meetings.

If the 79665 is storage solution as a FS or record storage I am willing to give it impartial technical review, if it is once again bundle of rejected PRs , and by the size I judge it is, then @Laczen you once again are trying to undermine #71270 then this will be rejected on the basis that you are bundling unrelated things together.

@ghost ghost closed this by deleting the head repository Oct 25, 2024
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Todo
Development

Successfully merging this pull request may close these issues.

8 participants