[RFC] Integrated Boot Configuration System (build, provisioning, runtime) #76902

Closed
ghost opened this issue Aug 9, 2024 · 12 comments
Labels
Architecture Review (Discussion in the Architecture WG required); area: Devicetree; area: IEEE 802.15.4; area: Kconfig; area: Networking; Enhancement (Changes/Updates/Additions to existing features); RFC (Request For Comments: want input from the community); treewide 🧹

ghost commented Aug 9, 2024

TL;DR: Please see #76903, which exemplifies most of the concepts laid out in this RFC in a PR that is much easier to review than this detailed RFC. This continues, in more detail, the discussion started in #68127.

Introduction

Currently, we have no satisfactorily integrated solution to configure in-memory software component instances (as opposed to software features) at build and provisioning time.

Existing approaches like Zephyr's current flavor of Devicetree (DT), Kconfig or the settings subsystem provide partial solutions but, as generic software component instance configuration systems, they are lacking in overall structure, flexibility, scope or scalability.

Examples:

Apart from these pending functional requirements, we have created a rather artificial and complicated (from a user perspective) distinction of the relative domains of applicability of Kconfig and DT due to an insufficiently precise ontological hardware/software divide and structural deficiencies of Kconfig.

The Kconfig/DT distinction SHALL be made more precise, practically useful and enforceable and the resulting software component instance configuration SHALL be represented in a more maintainable and scalable unified format with currently required Zephyr-specific Kconfig/DT "quirks" removed.

Problem description

Note: See motivating use cases for the following requirements in "Exemplary Use Cases" below.

A. This RFC addresses the following specific problems:

  1. The proposed solution SHALL support easy and intuitive (i.e. user-centric) configuration of multi-instance software components.
  2. Single and multi-instance software component configuration SHALL be united based on criteria of usability and maintainability rather than structural or "philosophical" constraints of DT vs. Kconfig.
  3. The hard-to-enforce, imprecise and confusing ontological hardware/software criterion SHALL be replaced by an easy-to-enforce, user-centric software feature (Kconfig) vs. software component instance (CT) distinction. This SHALL improve the overall configuration design on objective engineering measures of encapsulation, maintainability and data model normalization, ultimately leading to improved usability and developer experience.
  4. A unified and normalized abstract conceptual configuration data model SHALL be defined and decoupled from its sources and targets of serialization. Users' build and provisioning time requirements as well as requirements for different representations like YAML, protobuf, Thrift, the settings subsystem, secure keystores, etc. SHALL be based on the same self-documenting, intuitive and maintainable abstract data model w/o unnecessary incompatibilities.
  5. Configuration artifacts (partial serialization snippets) SHALL be grouped based on precise and easy-to-enforce user-centric criteria of modularization like the information hiding principle, the principle of least surprise, deployment context (e.g. security requirements), rate-of-change, deployment time (e.g. build time vs. provisioning time), etc. eventually leading to a more maintainable and more intuitive configuration system.
  6. Eventually all Zephyr-specific "quirks", extensions and exceptions SHALL be removed from both Kconfig and DT: Kconfig SHALL be re-focussed on its original purpose of feature selection rather than in-memory software component instance configuration. DT SHALL be re-focussed on its original purpose of inter-OS portability (namely compatibility with Linux). Both changes are motivated not so much by technical or ontological reasons as by the aim of recovering their full usefulness in letting users coming from Linux transfer existing knowledge w/o compromises.

Of course such a transition requires time and effort. Therefore this RFC proposes a gradual migration path from the current state to the target state, maintaining long-term backwards compatibility at every step w/o introducing further inconsistencies. No one SHALL be distracted from their actual goals or be required to invest extra effort in the migration of configuration unless they choose to do so to satisfy their own needs and requirements.

Proposed Change

This RFC proposes an abstract conceptual data model, serialized - in a first step - to a backwards compatible, semantic extension of the DT format. This DT superset is called "configtree" (CT) in the following.

The solution will be exemplified for network interface settings but will be extensible to all subsystems and applications as laid out in the problem description.

The proposed architecture allows for later addition of alternative source or target serializations, e.g. settings subsystem key/value pairs, property, protobuf IDL or Thrift files, integration with externally managed databases or secure key stores, JSON or YAML files provided locally or retrieved from a network location.

Note: See System Device Tree's simplified YAML serialization of DT (and CT) as one option to represent CT as YAML.

CT is proposed as a first serialization format for pragmatic reasons of usability, simplicity, initial effort and long-term maintainability. It will be shown that it is entirely capable of representing the proposed abstract data model in a way that we find rather intuitive. It satisfies all technical, logical and business requirements of a serialization source and intermediate unified format within the proposed overall configuration approach.

The proposed migration path consists of the following steps (not necessarily in this order):

  • Introducing, documenting and exemplifying CT for the network subsystem. This includes migrating single- and multi-instance network subsystem specific Kconfig software component configuration parameters (NET_CONFIG_*) to CT and deprecating them in Kconfig. This exemplifies an improved "feature selection" vs. "configuration" Kconfig/CT divide and properly encapsulates CT network defaults and bindings in the file system tree as close as possible to actual usage sites. For backwards compatibility, deprecated Kconfig parameters will be regarded as just another source serialization format and Linux DTSpec compliant DT will be recoverable from CT for network devices w/o any Zephyr-specific exceptions or extensions.
  • Extending the CT approach to drivers and other subsystems as needed including deprecation of Kconfig software instance configuration parameters.
  • Adding support for the settings subsystem as a (partial) configuration target. This requires additional tooling to split the unified intermediate configuration space among targets and translate hierarchical properties to a flat settings list.
  • Adding support for secure configuration storage sources and targets. This requires additional tooling to merge different configuration sources into the unified intermediate configuration space.
  • Adding additional source serialization formats or transports on an as-needed basis.
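
The translation of hierarchical properties to a flat settings list mentioned in the third step could look roughly like the following. This is only a minimal sketch, not the proposed tooling: the function name `flatten_to_settings`, the example node and property names, and the use of nested Python dicts as a stand-in for the intermediate configuration space are all assumptions made for illustration. The only thing taken from the settings subsystem is its convention of "/"-separated key names.

```python
def flatten_to_settings(node, prefix=""):
    """Translate a nested property tree into flat "a/b/c" -> value pairs."""
    flat = {}
    for name, value in node.items():
        key = f"{prefix}/{name}" if prefix else name
        if isinstance(value, dict):
            # A sub-node: recurse and keep its path as the key prefix.
            flat.update(flatten_to_settings(value, key))
        else:
            # A leaf property becomes one settings key/value pair.
            flat[key] = value
    return flat


# Hypothetical network interface subtree (all names are illustrative only).
config = {
    "net": {
        "iface0": {
            "ipv6": {"address": "2001:db8::1", "prefix-len": 64},
            "ieee802154": {"pan-id": 0xABCD, "channel": 26},
        }
    }
}

settings = flatten_to_settings(config)
for key in sorted(settings):
    print(key, "=", settings[key])
```

A real implementation would additionally have to decide which subtrees go to the settings target at all and which remain build-time constants.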

Note: Splitting and merging CT could be achieved with the Lopper tool from the System Device Tree project. It allows manipulating DT (and CT) files based on a syntax similar to XPath.

Detailed RFC

This RFC specifies an improved overall hardware, software feature and software component configuration for Zephyr as existing configuration approaches are lacking:

  • Kconfig currently mixes software feature selection (include/exclude subsystems, drivers, feature switches, etc.) with singleton software component instance configuration. Kconfig was not conceived as software component instance configuration space and therefore conceptually lacks the ability to configure collections of structured software object instances.
  • Devicetree covers configuration for driver instances (peripheral-to-driver mapping, clock frequencies, interrupt lines, driver subsystem configuration, etc.) including Zephyr-specific zephyr,... and <vendor>,... extensions. It was designed to represent hardware independently of any specific operating system. Its current tree structure and usage rules in Zephyr do not represent a normalized graph of distributed configuration object instances and break encapsulation rules.
  • The settings subsystem addresses boot-time key/value configuration but it cannot efficiently handle a large build-time graph of structured instances of configuration objects in the context of a low-power/low-resource RTLS as its name space is rather limited and its currently available backends require settings to consume non-volatile memory. OTOH, the settings subsystem goes beyond build-time configuration by allowing for provisioning time or runtime configuration in persistent device storage. The settings property structure can however not be integrated with the configuration property structure at present which requires redundant code and makes it harder to maintain a consistent and self-validating configuration data model.

A few solutions for specific application/subsystem configuration problems exist:

  • The standard DT /chosen node (DTSpec v0.4, section 3.6) allows referring to other DT nodes to configure global switches related or referring to hardware/driver configuration. In Zephyr these are mostly used to configure samples, basic OS features or choose hardware for specific use cases (e.g. the console target or the settings partition). This approach only allows setting <phandle>s or aliases and therefore does not scale.
  • The custom DT /zephyr,user node allows application developers to define simple key/value pairs. It is conceived as an ad-hoc configuration mechanism, though, that does not scale to the required structures.
  • There are still a few Kconfig (e.g. CONFIG_SOMETHING_0/1/2/...) "hacks" that work around Kconfig's lack of object instance support. This approach does not scale and it can only be applied to fixed multiplicities.

None of the existing approaches scales to the levels required in Zephyr today. In the absence of a proper configuration system they tend to be (ab)used for properties that should better be represented in a well-defined application/subsystem configuration framework. This RFC tries to lay out the requirements of such a system as well as proposes a specific implementation and migration approach.

Exemplary Use Cases

The following use cases illustrate and motivate detailed requirements.

Note: These use cases don't necessarily cover all features of the proposed configuration approach. If some requirement is neither self-evident nor covered by a corresponding use case, please comment and let me know.

Scalable, Resource-Optimized Build and Provisioning Time Boot Configuration

As an embedded application developer I want to configure immutable boot defaults across all enabled subsystems consistently at build-time w/o incurring avoidable resource usage (e.g. CPU cycles, RAM or ROM). I want only such boot configuration to consume non-volatile memory that needs to be injected at provisioning time and/or changed at runtime. I also want to scale effortlessly from a single instance to a multi instance software component configuration or promote build-time to a provisioning-time configuration or vice-versa w/o having to migrate properties between independent configuration approaches (e.g. from Kconfig to DT to the Settings Subsystem and back).

Extensible and Re-Usable Configuration of Samples

As a maintainer or contributor I want to create driver- or subsystem-specific samples that can as effortlessly as possible be combined and extended by embedded application developers into fully-functional customized solutions. The sample build boot configuration should therefore use the same format and tools required for single instance and multi-instance build-time or provisioning-time software component boot configuration as a scaled custom application.

Build Time Injection of Boot Configuration

As a large-scale application developer I want to be able to define large amounts of build time configuration variants externally to Zephyr. I want to use my own custom configuration format (e.g. Thrift or protobuf), possibly editable and sourced dynamically from a database or network location independently from Zephyr and application code repositories.

Provisioning Time (e.g. End-of-Line) Boot Configuration

As a production engineer I want to be able to provision device specific settings as fast as possible to target devices w/o having to re-compile the device's firmware, e.g. as a separate settings image via JTAG or a SPI flash tool to an EEPROM, flash partition or dedicated flash storage. To develop or debug end-of-line configuration, as a firmware application developer, I want to be able to simulate end-of-line configurations at build time w/o having to use complex production-specific tooling or migrate configuration properties between separate configuration approaches.

Runtime Boot Configuration

As an end user of a device, I want to be able to change provisioned boot-time defaults of my device persistently at runtime (e.g. to configure custom network details if Zephyr is powering a typical home router device). As an application firmware developer, I do not want to incur extra effort to provide provisioning and runtime configuration through separate configuration approaches.

Declare initialization and reverse dependencies between software component instances

As a maintainer or contributor, I want to declare default initialization dependencies and sequences of related software component instances ("services"). As a firmware application developer, I want to be able to override default initialization dependencies and sequences. As a maintainer, contributor or firmware application developer, I want to specify and configure arbitrary lifetime hooks in addition to the default initialization callback. These hooks should respect the inversion-of-control principle w/o the software component instance having to "know" (i.e. depend on) the caller.

Supply security material from secure sources to secure targets

As a production engineer I want to be able to inject confidential security material directly from a secure key vault to a secure embedded storage at the end of my production line.

Detailed Requirements

This section describes detailed requirements in addition to the main functional requirements A.1 through A.6.

B. Scope:

  1. The configuration design SHALL enable injection of persisted boot-time configuration at build time, provisioning time and runtime.
  2. The configuration subsystem SHALL support and be useful to all Zephyr hardware, subsystems, samples, libraries, modules and applications including out-of-tree software components and applications.

C. Source and Target Serializations:

  1. Configuration SHALL be (partially) serializable to and from every source or target representation capable of holding sufficiently typed hierarchically organized key/value pairs. This includes - but is not limited to - the following specific formats: CT/CT bindings as specified in this RFC, YAML/YAML Schema, Thrift/Thrift Schema, protobuf/protobuf IDL, the settings subsystem, JSON/JSON Schema including remote and local sources and targets.
  2. All source and target representations SHALL be completely decoupled by a merged, fully normalized, human-readable canonical intermediate serialization artifact. This RFC proposes CT encoding as intermediate representation for the time being. Should it turn out at a later point in time that this format is lacking, CT SHALL be downgraded to a source or target serialization and the infrastructure SHALL be switched to whatever improved intermediate representation is considered more suitable at that time.
  3. The default source serialization SHALL enable a hierarchy of configuration layers that MAY override each other in the same way as is currently possible in the Kconfig and DT implementations.
  4. Any source serialization SHALL allow for easy-to-read internal or external pointers (references) between any two represented entities including - but not limited to - all driver, subsystem and application software component instances, etc.
  5. Any source serialization SHALL be accompanied by schema definition files that determine at least a fully C-typed target representation including C primitive types, types from <stdint.h>, structs and pointers. Additional target type systems as used in Rust or C++ SHOULD additionally be supported, at least in principle.
  6. Zephyr's default target serialization SHALL be representable and usable in C code without additional runtime resource usage (notably CPU cycles, RAM, NVM), e.g. as macros.
  7. Users SHOULD eventually be able to develop and contribute their own standard-based or custom source or target representations while being able to re-use the intermediate infrastructure.
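
The layering required by C.3 (and the merging of snippets required by D.1 below) can be pictured as a recursive merge in which later layers win. The following is a hedged sketch under stated assumptions: nested dicts stand in for the intermediate representation, and the layer names, node names and the helper `merge_layers` are hypothetical. It is not how the DT or Kconfig tooling is implemented.

```python
import copy


def _merge_into(target, overlay):
    """Merge overlay into target in place; overlay values take precedence."""
    for name, value in overlay.items():
        if isinstance(value, dict) and isinstance(target.get(name), dict):
            # Both sides are nodes: merge their properties recursively.
            _merge_into(target[name], value)
        else:
            # New node or property, or an override of an existing property.
            target[name] = copy.deepcopy(value)


def merge_layers(*layers):
    """Merge configuration layers; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        _merge_into(merged, layer)
    return merged


# Hypothetical layers: subsystem defaults, board overlay, application overlay.
subsystem_defaults = {"iface0": {"status": "disabled", "ipv6": {"hop-limit": 64}}}
board_overlay = {"iface0": {"status": "okay"}}
app_overlay = {"iface0": {"ipv6": {"hop-limit": 32, "address": "2001:db8::1"}}}

merged = merge_layers(subsystem_defaults, board_overlay, app_overlay)
print(merged)
```

Because values are deep-copied on assignment, the input layers stay untouched and can be re-used for other build variants.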

D. Maintainability:

  1. Zephyr's default source serialization SHALL and non-default serializations SHOULD be divisible in arbitrary configuration snippets to be merged at build time. Notably the overall data model represented by the configuration SHALL be fully decoupled from its division into arbitrary deployment artifacts (files and folders).
  2. Zephyr's default source serialization SHALL be self-validating and self-consistent. By current modeling standards, this can best be achieved by normalizing the conceptual data model while keeping the physical (i.e. serialized) data model as close as possible to the conceptual data model.
  3. If several proposed configuration approaches fulfil all functional requirements, then we SHALL prefer the one that re-uses most of the existing in-tree infrastructure and out-of-tree community investment and requires the least initial and long-term maintenance effort (total cost of ownership).

E. Documentation:

  1. Zephyr's default source and target serializations as well as related tools and processes SHALL NOT require considerable new syntactical or conceptual learning effort by users in addition to the current configuration approach.
  2. Zephyr's default source and target serializations, as well as related tools and processes SHALL be well documented.
  3. Schema definition files SHALL allow for machine-readable inline documentation of all entities and properties. Any complete schema definition SHALL thereby provide a full specification and documentation of the underlying abstract conceptual data model.
  4. Schema description and model metadata SHALL be accessible by Zephyr's automated documentation system and be included in the documentation build.

F. Machine-readable metadata describing configuration data and schemas:

  1. It SHALL be possible to attach machine-readable documentary or technical model-metadata to both configuration files and schema files, e.g. to consistently document hardware-specific driver capabilities.
  2. Metadata schemas SHALL be enforceable by the same means as the model data itself.

G. CT-specific requirements:

  1. A fully compliant DT SHALL be recoverable at any moment from the CT w/o manual interaction.

Note: CT as default source and intermediate serialization format together with existing DT macros, tooling and corresponding documentation satisfy almost all of these requirements out-of-the-box with minimal initial implementation effort.

Note: Initialization dependency properties MAY be modeled as just another kind of composable binding schema that MAY be applied to certain CT nodes according to CT normalization rules. Nodes representing initializable software component instances declare initialization dependencies to other initializable software component instances via hierarchy or <phandle>. All default initializations may be accumulated in a single file or distributed over subsystems according to CT encapsulation rules.

Proposed change (Detailed)

Configtree (CT) Specification

CT is a natural semantic superset of DTSpec (and the upcoming System DT). CT SHALL use the same syntax as DTSpec without hardware specific properties in non-device/non-hardware nodes. Allowed standard properties in non-device/non-hardware nodes are "status" and "compatible". CT MAY introduce additional <prop-encoded-array> if required. Currently no such requirement is known, though.

CT SHALL be backed by a well-defined Zephyr-specific abstract conceptual configuration data model (the "Zephyr configuration space") that includes existing DT entities and attributes as well as CT-specific extensions. The abstract data model SHOULD be documented in the Zephyr user documentation using adequate textual graphing techniques (e.g. based on mermaid) for easy review. Alternatively the model MAY be generated automatically from improved binding sources that not only specify properties but also relations.

CT introduces additional nodes and properties into the device tree (called "configtree" for CT) that structurally relate 1-to-n or n-to-m to existing driver or hardware nodes. Software component instance related configuration properties SHALL be introduced into existing DT nodes if they structurally relate 1-to-1 (bijectively) to existing nodes.

Nodes that structurally relate 1-to-n to existing nodes SHALL be n-side subnodes of the 1-side node. A collection of 1-to-1 related properties inside the same node MAY be grouped in their own subnode for improved encapsulation (e.g. for separate subsystems or larger semantically related properties), similarly to the structures currently generated in Kconfig. It SHALL at all times be clearly specified, though, how nodes map to the abstract unified configuration space in order to prove normalization of the CT model representation.

Nodes that structurally relate n-to-m to device/peripheral-related nodes require an additional top-level sub-space to be introduced. The structure of CT-specific top-level subspaces SHALL follow the file structure of the drivers or subsystems that require the additional node. DT or CT SHALL NOT introduce additional top-level nodes based on other custom encapsulation criteria. Existing non-standard top-level nodes other than those explicitly defined in DT or CT SHALL be regarded as "modeling bugs" and corresponding issues SHALL be opened to document and fix them.

References between n-to-m related nodes and nodes inside DT or CT-specific DT extensions SHALL be made explicitly using a DTSpec <phandle>. References using alternative custom primary keys (e.g. driver or interface names) or logic ("the first matching interface") SHALL NOT be used. 1-to-1 references SHALL NOT be allowed as they obviously breach CT normalization requirements.

CT SHALL use the same Zephyr-specific .yaml binding files and macro targets as DT. CT-specific macro targets MAY be added. They are prefixed with "CT_" unless they can also be applied to DT.

CT introduces strict encapsulation rules. Files representing CT (including DT) and corresponding binding files SHALL be modularized into files according to the following rules:

  • Subsystem or driver specific default properties and binding files that are exclusively used in Zephyr (namely unavailable in Linux) SHALL be placed inside Zephyr's directory tree based on the "least visibility" encapsulation rule, i.e. as deep as possible in the directory tree and as close to in-tree usage sites.
  • Alternatively we MAY consider having non-Linux default properties and bindings side-by-side with corresponding public header files of drivers or subsystems that use them if we consider them to be part of the public user API.
  • Vendor, architecture, hardware or application specific configuration default property and binding files SHALL be placed as close as possible to their respective usage sites, too e.g. inside vendor-, architecture-, hardware- or application-specific directories.
  • Default properties that are semantically defined and used by Linux and their corresponding binding files SHALL be placed in a shared top-level folder structure separate from their usage sites and from all other default property and binding files. This enables us to automatically recover a fully Linux-compatible DT from CT at any time.
  • File names SHALL be chosen to place default property and binding files as close to related source files inside a folder when ordered alphanumerically.
  • No files SHALL be placed based on imprecisely or subjectively defined and hard-to-enforce ontologies like "hardware" vs. "software" properties.
  • Existing deviations from these encapsulation rules SHALL be documented as issues when found and fixed accordingly over time.

The Zephyr Configuration Space

The following diagram proposes an initial abstract conceptual data model of the Zephyr Configuration Space. See #76903 which demos the model in CT serialization.

[Figure: zephyr-config (diagram of the abstract conceptual data model)]

Also see https://drive.google.com/file/d/1sQuen1Y0bAIS5PX_kKRmSTNA_g4gd-gT/view?usp=sharing (requires the Google Drive draw.io plugin) for a possibly updated version.

This model SHALL be updated based on rules of normalization whenever additional entities need to be added. Property documentation MAY be added to illustrate normalization. Any serialization SHALL be validated against this conceptual data model and SHALL be rejected if not matched. Binding files SHALL document the (collection of) entities to which they can be applied. These binding file restrictions SHALL be verified during build.

Additional/Improved Binding File Semantics

Composition over Inheritance

Currently binding files only allow for inheritance of types. Composition of types (mix-ins) cannot be defined. This makes it unnecessarily hard (and sometimes impossible) to properly design a well encapsulated design hierarchy.

The following example shows current binding file design practice in Zephyr:

Example:

compatible: "adi,ad559x-adc"

include: adc-controller.yaml

This binds a specific device to a driver-specific software programming model in practice. We justified this in the past by asserting that "adc-controller.yaml" would be exclusively determined by an objectively correct hardware-only ontology of an abstract ADC hardware model sufficient for all imaginable driver implementations.

This promise was of course rarely kept in practice. Driver internals regularly leak into supposedly hardware only "base types" which breaks encapsulation. This is not surprising as DT properties are largely determined by their usage inside Zephyr, not by an independent commonly accepted shared industry standard outside Zephyr except for properties introduced by Linux for which it may be argued that they represent a de-facto standard.

In practice our inheritance tree forces a client programming model onto the peripheral which is what DT was originally conceived for but may not always be compatible with Zephyr's claim to be "vendor agnostic" and "customizable". OTOH Zephyr has no requirement to be OS agnostic. So adding Zephyr-specific additions to CT is not a problem as they can be easily ignored by custom or vendor-specific driver or subsystem implementations. But if doing so, they need to be composable as not to pollute the inheritance hierarchy and they need to be properly encapsulated.

From a data model perspective, drivers are related (=chosen) to a combined hardware instance + application key, i.e. the correct combined abstract normalized "key" to such a compositional configuration class would be the (app-id, peripheral-id) tuple. This means that in the above example there should be some zephyr,adc-controller compatible mixed into the peripheral's node as default driver programming model which could be overridden on application level. The zephyr,adc-controller compatible then matches a zephyr,adc-controller.yaml file which is placed near the corresponding adc.h or adc driver folder where all the drivers reside that follow its programming model. The application would provide a partial DTS fragment that overrides or extends the driver's node "compatible" with its own custom driver client programming model if required, possibly in a directory hierarchy that again closely couples with the driver hierarchy. By introducing a proper naming convention, such rules could be verified automatically during build with resolution-based error messages.

The composition of bindings for typing should not be confused with the actual driver selection at build time. Driver selection follows the logic described in DTSpec: From left to right, drivers matching one of the compatible strings are being located in the build (as configured by Kconfig and cmake). If none or more than one matches per peripheral, a warning or error message is generated and the build possibly stops. This means that the app-id part of the conceptual driver implementation key above will be provided by app-specific Kconfig feature-inclusion mechanisms while the peripheral-id part will be specified by the first matching compatible of the corresponding CT node.
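
One possible reading of this selection logic is sketched below. It is an interpretation, not the actual build implementation: compatible strings are tried from left to right, the first one claimed by exactly one enabled driver wins, and both "no match" and "more than one match" are reported as errors. The driver names (`adc_ad559x`, `my_adc`), the helper `select_driver` and the representation of the Kconfig/cmake-enabled drivers as a dict are assumptions for illustration.

```python
class DriverSelectionError(Exception):
    """Raised when no driver or more than one driver matches a node."""


def select_driver(compatibles, enabled_drivers):
    """Pick the driver for a node.

    compatibles: the node's compatible strings, most specific first.
    enabled_drivers: driver name -> set of compatible strings it handles
                     (only drivers included in the build).
    Returns (driver name, matching compatible string).
    """
    for compatible in compatibles:
        candidates = sorted(
            name
            for name, handled in enabled_drivers.items()
            if compatible in handled
        )
        if len(candidates) > 1:
            raise DriverSelectionError(
                f"ambiguous drivers for '{compatible}': {', '.join(candidates)}"
            )
        if candidates:
            return candidates[0], compatible
    raise DriverSelectionError(
        f"no enabled driver matches any of: {', '.join(compatibles)}"
    )


node_compatibles = ["adi,ad559x-adc", "zephyr,adc-controller"]
in_tree = {"adc_ad559x": {"adi,ad559x-adc"}}
selected = select_driver(node_compatibles, in_tree)
print(selected)
```

If an application disables the in-tree driver and enables its own implementation of the generic programming model, the same node resolves to that driver through the second compatible string.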

This rationale results in the following additional requirements for the Zephyr binding system:

  • The Zephyr bindings system SHALL be able to match more than one binding file per node, up to one per string in the compatible string list. This allows us to apply the "composition over inheritance" heuristic to the CT type system.
  • Nodes SHALL be validated against all matching binding files in the build and the corresponding target serialization(s) be generated. This allows us to compose arbitrary "hardware programming models" (used to match drivers based on hardware as defined in DTSpec) with Zephyr or application specific "client programming models" (used to match alternative driver implementations for the same hardware). This was not required in DTSpec which is explicitly conceived as being "client agnostic" but makes sense for Zephyr of course.
  • Custom drivers MAY extend the Zephyr driver programming model by inheriting from the corresponding Zephyr default driver subsystem binding files. This SHOULD however respect rules of encapsulation as defined elsewhere and therefore the existing hard-coded inheritance hierarchy SHALL be migrated to a compositional model over time on an as-needed basis.
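
The first two requirements can be illustrated with a small validation sketch in which a node is checked against every binding that matches one of its compatible strings, and the set of allowed properties is the union over all matching bindings. This is only an assumed model of the behaviour, not the existing bindings tooling: bindings are reduced to dicts of property specifications, `validate_node` is a hypothetical helper, and the property name `hw-gain` is a placeholder.

```python
def validate_node(node, bindings):
    """Validate a node against all bindings matching its compatible strings.

    node: property name -> value, including a "compatible" list.
    bindings: compatible string -> {property name: {"required": bool}}.
    Returns a list of human-readable error strings (empty if valid).
    """
    errors = []
    allowed = {"compatible", "status"}
    for compatible in node["compatible"]:
        binding = bindings.get(compatible)
        if binding is None:
            continue  # no binding file for this compatible string
        # Composition: each matching binding contributes its properties.
        allowed.update(binding)
        for prop, spec in binding.items():
            if spec.get("required") and prop not in node:
                errors.append(f"{compatible}: missing required property '{prop}'")
    for prop in node:
        if prop not in allowed:
            errors.append(f"unknown property '{prop}'")
    return errors


# Placeholder bindings: one hardware model, one client programming model.
bindings = {
    "adi,ad559x-adc": {"hw-gain": {"required": True}},
    "zephyr,adc-controller": {"#io-channel-cells": {"required": True}},
}

good = {
    "compatible": ["adi,ad559x-adc", "zephyr,adc-controller"],
    "hw-gain": 2,
    "#io-channel-cells": 1,
}
bad = {
    "compatible": ["adi,ad559x-adc", "zephyr,adc-controller"],
    "hw-gain": 2,
    "bogus": 1,
}

print(validate_node(good, bindings))
print(validate_node(bad, bindings))
```

Neither binding needs to inherit from the other, which is the point of composing them through the compatible list.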

Nomenclature and Directory Structure

Currently we require binding files to reside in separate top-level directories. This places binding files far from corresponding default DT source files and from usage sites and thereby breaks the above formulated CT encapsulation requirements.

This RFC therefore proposes an alternative naming schema <[vendor,]programming-model>.binding.y[a]ml. Files following this nomenclature MAY be placed anywhere in Zephyr's directory tree. They SHALL be placed as closely to their usage sites as possible, see the CT encapsulation rules above. The vendor part is optional when it is clear from context. Namely inside the Zephyr source tree the zephyr, prefix SHALL be left out, to ensure that files can be placed near to other similarly named files based on the programming model (i.e. API).
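
A build tool could recognize such files and derive the compatible string they describe roughly as follows. This is a sketch under assumptions: it accepts both the .yaml and .yml extensions, treats a missing vendor part as the zephyr vendor (which only holds inside the Zephyr tree), and the helper name `compatible_from_filename` is hypothetical.

```python
import re

# <[vendor,]programming-model>.binding.yaml or .binding.yml
BINDING_FILE_RE = re.compile(
    r"^(?:(?P<vendor>[^,/]+),)?(?P<model>[^,/]+)\.binding\.ya?ml$"
)


def compatible_from_filename(filename, default_vendor="zephyr"):
    """Return the compatible string for a binding file name, or None."""
    match = BINDING_FILE_RE.match(filename)
    if match is None:
        return None  # not a binding file under this naming schema
    vendor = match.group("vendor") or default_vendor
    return f"{vendor},{match.group('model')}"


for name in (
    "adc-controller.binding.yaml",
    "adi,ad559x-adc.binding.yml",
    "adc-controller.yaml",
):
    print(name, "->", compatible_from_filename(name))
```

Out-of-tree modules would pass their own default vendor or always spell out the vendor part.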

Additional Restrictions placed on Tree Structures

Similarly to JSON Schema and YAML Schema, we SHOULD be able to not only validate properties based on compatible strings but to also restrict node names for certain bindings (e.g. channel in ADC or iface, ipv6, etc. for well-defined network configuration nodes).

We SHOULD be able to restrict subnodes to certain parents, e.g. iface SHALL have to be explicitly whitelisted as allowable child node by network peripherals or channel as child node of ADC peripherals possibly including multiplicity in both cases. Therefore the iface and channel bindings SHALL be marked as "whitelist-only" and corresponding network driver nodes will have to include them explicitly in their "subnode-whitelist".

Similarly it SHOULD be possible to place restrictions on allowable parent nodes, e.g. to let ipv6 or ieee802154 declare that they SHALL only be subnodes of iface. This time the encapsulation requirements point in the opposite direction, so it suffices to add a "parent-whitelist" property to such bindings, possibly including allowed multiplicity ranges.
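The two whitelist directions can be sketched as a small validation routine. This is a minimal sketch under stated assumptions, not an existing Zephyr API: the keys "whitelist-only", "subnode-whitelist" and "parent-whitelist" follow the names used above, while the binding table, the node names "net-device" and "adc", the multiplicity ranges and the function name are hypothetical.

```python
# Hypothetical binding metadata. Multiplicity is given as (min, max).
BINDINGS = {
    "net-device": {"subnode-whitelist": {"iface": (1, 1)}},
    "adc": {"subnode-whitelist": {"channel": (0, 8)}},
    "iface": {"whitelist-only": True},
    "channel": {"whitelist-only": True},
    "ipv6": {"parent-whitelist": {"iface"}},
}


def check_node(name, children):
    """Return a list of structural errors for one node and its direct children."""
    errors = []
    allowed = BINDINGS.get(name, {}).get("subnode-whitelist", {})
    counts = {}
    for child in children:
        counts[child] = counts.get(child, 0) + 1
        child_spec = BINDINGS.get(child, {})
        # Parent-side rule: "whitelist-only" children must be listed by the parent.
        if child_spec.get("whitelist-only") and child not in allowed:
            errors.append(f"{child} is not whitelisted as a child of {name}")
        # Child-side rule: the child restricts which parents it accepts.
        parents = child_spec.get("parent-whitelist")
        if parents is not None and name not in parents:
            errors.append(f"{child} may not be a child of {name}")
    # Multiplicity check for every whitelisted subnode type.
    for child, (low, high) in allowed.items():
        if not low <= counts.get(child, 0) <= high:
            errors.append(f"{name} needs {low}..{high} {child} subnodes")
    return errors


print(check_node("net-device", ["iface"]))
print(check_node("adc", ["iface"]))
print(check_node("net-device", ["ipv6", "iface"]))
```

The parent-side check keeps knowledge about permitted children in the parent binding, while the child-side check keeps knowledge about permitted parents in the child binding, mirroring the opposite encapsulation requirements described above.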

Configtree vs. Devicetree vs. Kconfig

Single instance vs. multi instance software component configuration

All software component instance configuration properties SHALL be deprecated in Kconfig and migrated to CT under the above rule sets.

Kconfig SHALL be exclusively responsible to select features, while all software component instance configuration SHALL be reserved to CT (including DT).

To make this more precise, the following rules SHALL apply:

  • Features (Kconfig) are represented as code while instance configuration will at some time be represented in memory (CT):
    • Use Kconfig to include or exclude code or enable or disable coded logic in the build.
    • Use CT to configure runtime software parameters, be they singletons (e.g. subsystem-level parameters or singleton drivers) or multiple instances (e.g. multiple instances of the same driver, multiple protocol instances, etc.).
  • Contributors can roughly use the following heuristic: Will the switch that I'm requiring mainly configure the content of flashed .text sections (Kconfig) or the content of boot-time stack/heap memory structures as initially represented by .data/.bss sections (CT)?

Kconfig SHOULD thereby be re-focused on its original intent to describe, compose and configure software features (in terms of included source code or logic) and software feature dependencies. This is not so much required as an end in itself (Kconfig "conformance") but has the following practical advantages:

  • Users coming from Linux will recognize more familiar Kconfig patterns.
  • Kconfig configuration will be more concise and focused.
  • Newly introduced configuration variables will no longer fall in the "initially single instance" trap that has often hindered maintainability and maintainable evolution of drivers from single to multi-instance as it led to considerable redundant development effort (see the "old" and "new" USB driver approach or L1/2 and L3+ network subsystem configuration). All software components should be conceived as intrinsically instantiable in the future.
  • Most importantly: The artificial and arbitrary distinction between single and multi instance software component configuration due to purely technical restrictions will be replaced with the more intuitive and precise feature vs. memory distinction for improved usability and a more level learning curve.

Configuring runtime software components, be it "in-memory" or as global runtime parameters, SHOULD be migrated to CT, especially such parameters that strictly belong to one of the CT abstract modeling concepts by normalization rules.

Backwards compatibility to deprecated Kconfig MAY be maintained as long as required as laid out in the requirements section.

Hardware vs. Software Configuration

The hardware vs. software distinction SHALL be dropped in favor of the following, more precise and easier-to-enforce rules:

  • Properties defined and used by Linux DT SHALL be part of the DT tree as specified above for CT.
  • Software component instance configuration properties defined and used exclusively by Zephyr, vendors or users SHALL be placed in CT and distributed across the file system based on normalization and encapsulation rules as specified above for CT.
  • Software features SHALL be selected via Kconfig switches.

Dependencies

Direct dependencies exist on Kconfig, DT and the settings subsystem. Indirect dependencies exist on all configurable drivers or subsystems.

Concerns and Unresolved Questions

This section answers questions and evaluates concerns brought forward while discussing the aptitude of DT as a configuration source.

Concerns are responded to based on Zephyr-specific requirements and pragmatic engineering approaches, namely the concepts of data model normalization (similarly to 3NF for relational data models) and encapsulation/modularization.

Work-in-progress - please comment, I'll collect all concerns and questions here.

Is DT syntax capable of addressing all our software configuration requirements?

Yes. DT is just a tree of nodes with key/property values and references (phandles) that can easily be mapped via bindings to any primitive C type, <stdint.h> type, struct and pointer. Any normalized data model can obviously be mapped to DT. This should be good enough under all reasonable circumstances and is theoretically very well founded. We have a semantic modeling challenge before us, not a syntax or serialization challenge.

Also compare the DTSpec archeology section below.

People don't like DT or cannot understand DT, DT is awkward.

As we will not dispose of DT to use YAML everywhere, no matter how bad DT is, everyone who uses Zephyr has to know and work with it anyway. From a usability pov it doesn't matter what serialization we choose as long as we choose a single one, fix the quirks and document it well.

On a Linux box you have to deal with many different config files, too.

"Because Linux does it" is not a requirement or an engineering argument as such. We have no Zephyr-specific requirement that forces us to use many distinct config formats. There are good usability arguments that favor an integrated approach. Note that this RFC favors distribution of configuration over many files (see the encapsulation/modularization argument), just not many distinct semantics and syntax variants.

We should probably start solving domain-specific problems.

We have an obvious requirement, which we should not ignore, to design something that can be extended to other subsystems, integrated with the settings subsystem, used for provisioning and serialized to other formats like protobuf IDL or Thrift. Above all we have to be able to serialize to any syntax based on some abstract conceptual data model.

A YAML-based solution is easier to understand and maintain.

As laid out in the "Alternatives" section, a YAML-based solution is going to be a huge maintenance and documentation nightmare. We have to re-invent every wheel that has been invented for DT: type binding, inline documentation, integration with the doc system, mappings to macros, overlay mechanisms, naming patterns, etc. Just matching and syncing with the existing DT macrobatics will be a huge effort initially and over time. The problem we're facing is not syntax but semantics and the surrounding infrastructure and tooling.

It is easy to distinguish between HW and SW properties, that's how we should separate configuration.

It is not. This is a perceptual bias: We tend to confuse our internal models and heuristics with what is out there in the world. The reality is: We fight over each and every addition to DT because some say "it's SW" others say "it's HW". If not even we are able to precisely define the line between SW and HW, how will our users? If we have to explain to our users that what they find intuitive is wrong then we are wrong.

Devicetree was derived from the [...] Open Firmware project.

Nope. See DTSpec, section 1.2:

The text of this document was derived from ePAPR.

But it's not entirely wrong either as ePAPR itself was derived from the Open Firmware spec (aka IEEE 1275-1994).

DTSpec was designed to describe hardware only.

It is important to insist on ePAPR rather than Open Firmware as the main DTSpec predecessor because only ePAPR removed user configuration from DT and restricted its applicability to hardware. This was due to its changed focus on backing Power ISA boot firmware, a focus that DTSpec then re-generalized.

IEEE 1275-1994 specified allowable contents of the Device Tree in section 3.2 as:

The device tree [...] describes [hardware and] user configuration choices [...among other things unrelated to our discussion...]. [...]

Section 3.3.1 adds:

The list of configuration variables varies from system to system.

IEEE 1275-1994 had an /options root node specifically reserved to store such non-volatile user configuration which received a default at build time and could be updated at provisioning or runtime by the end user. So exactly the use case I'm envisioning for DT.

Note: U-Boot uses DT for user configuration, too. They seem to have used the IEEE 1275-1994 /options node first but now introduced a custom /config node. Of course they are a bootloader, so they need less user config than an application development platform like Zephyr.

Saying that DTSpec was designed to describe "hardware" is therefore at least misleading. DTSpec was designed to back OS-independent bootloaders, see DTSpec, section 1.1:

The Devicetree Specification provides a complete boot program to client program interface definition.

In other words: DT is a simple HAL but a HAL is of course as much influenced by its client as by the abstracted hardware itself. And Zephyr is not a bootloader nor is Linux. So the "abuse" (or as I'd say "pragmatic re-interpretation") started when focusing DTSpec on describing OS specific device abstractions to become vendor- and architecture independent which reversed the original intent of DTSpec, ie. abstracting OS differences away.

This shows that the simplified conventional wisdom "DT is for hardware only" has never been as "pure" as one might have thought and there is no need to protect its "purity" either. Such an argument proves nothing and should be replaced by requirements analysis: Being OS-independent was their requirement but it was never ours which explains why we never truly enforced it (e.g. in the build infrastructure) except for improved knowledge transfer from Linux (see below). Our main requirement always was vendor-agnosticism.

DTSpec is careful to introduce HW specifics in a separate section after laying out a general hierarchical key/value store with generic typing. We can trivially keep all HW specific parts out of nodes that don't need it by extracting status and compatible into a separate generic node.yaml binding file which will replace base.yaml for those nodes.

In the end it doesn't even matter that much anyway. Our discussion re software/hardware is mostly academic: Structurally (i.e. by normalization criteria) the large majority of our subsystem config requirements map to existing device tree structures naturally (1-to-1 or 1-to-n). The remaining m-to-n related nodes can be isolated into top-level namespaces as inspired by IEEE 1275-1994 and referred to from inside the actual device-specific tree, see https://github.com/fgrandel/zephyr/blob/rfc/76902-systree-config/samples/net/sockets/echo/app.overlay as an example.

Wherever it made sense, we've tried to be compliant with Linux devicetree bindings.

This rule continues to be applied and even fortified by this RFC as laid out in the CT specification section. Not to ensure OS independence of Zephyr's DT (which never was a sensible requirement) but because it helps people who know Linux. They will find it easier to learn Zephyr which again is a real requirement of ours. Still we have deviated far enough from Linux (for good reasons) that it can hardly be argued that we're still "compatible" in any sensible way. That's why the above CT specification re-establishes and distinguishes much more precisely between Linux-compatible and Zephyr-specific DT parts.

The HW/SW split is "cleaner" or at some time in the past was "cleaner" than mixing up hardware and software properties in the same DT nodes.

Our use of DT has broken basic data modeling practices from day one, namely normalization and encapsulation. Both are precisely defined design rules:

  • Normalization defines with mathematical precision that HW and SW properties SHALL be kept in the same abstract entity if they both functionally depend on it (ie. 1-to-1) to ensure model integrity and validity.
  • Physical deployment artifacts OTOH SHALL be split up along rules of encapsulation and modularization, an argument as precise as a search/find operation over our code base.

Our DTS and bindings are mostly kept far apart from usage sites instead. We have invented Zephyr-specific (but vendor agnostic) "hardware properties" that neither exist in datasheets nor in Linux and put them where the hardware lives based on imprecise ontological assumptions of what is "hardware". This is wrong: By DDD rules and Conway's law we should know that any context-agnostic ontology is doomed to fail. And by the encapsulation argument we should place Zephyr-driver-specific DT snippets near the drivers that use them exclusively while keeping shared concerns at as central a place as required but still as local as possible.
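The "search/find" argument can be made concrete: once the files that use a binding are known, the deepest directory shared by all of them is the "least visible" place for that binding. The following is a minimal sketch under the assumption that usage sites are given as POSIX-style paths relative to the repository root; the function name and example paths are hypothetical.

```python
import posixpath


def least_visible_dir(usage_sites):
    """Return the deepest directory that contains every file using a binding.

    An empty string means the only shared location is the repository root.
    """
    dirs = [posixpath.dirname(path) for path in usage_sites]
    return posixpath.commonpath(dirs)


# A binding used by a single driver stays next to that driver.
print(least_visible_dir(["drivers/adc/adc_a.c", "drivers/adc/vendor/adc_b.c"]))

# A binding shared across subsystems moves up to a more central place.
print(least_visible_dir(["drivers/ieee802154/radio.c", "subsys/net/l2/mac.c"]))
```

In practice the list of usage sites would come from a search over the code base for the binding's compatible string, which this sketch does not implement.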

Further de-normalized and de-modularized configuration will inevitably lead to more modeling inconsistencies and less readability/maintainability in practice as the model is not self-validating and consistency cannot be automatically enforced with sensible effort (examples of which abound in our own partially de-normalized DT variant today).

We have to distinguish between the global conceptual data model and its local physical representation instead. YAML doesn't determine a data model. But the model is much more relevant to usability and maintainability than the syntax. This shows how far our discussion has strayed from the real problem so far. Zephyr is an application development platform, as such application architecture concepts are to be applied.

While DT has been promoted as a great solution to many problems, to me, it has several drawbacks on the way it is implemented in Zephyr.

This is true. It is due to Zephyr-specific architectural and implementation deficiencies (many of which have been laid out in this RFC) that our use of DT feels awkward. Not due to its syntax. This can be fixed.

If I had to start Zephyr again, I'd probably stay away from DT.

Maybe, but that's not an option in practice.

If we start diverging from [DT], we either define our own spec, or it'll just organically grow into a mess.

True. This is why CT is specified much more precisely than our current use of DT while acknowledging additional practical requirements that had not been systematically covered by DT so far.

DT and DT bindings have come a long way. Let's focus our resources on making DT more intuitive by fixing a few "quirks" rather than starting from zero because this will immediately benefit us doubly: on the hardware and on the software modeling side. As soon as the cracks in the YAML approach inevitably appear, everyone will wish that we had not opened another Pandora's box.

Alternatives

An alternative, separate YAML-based approach has been considered and rejected in this RFC for the following reasons:

  • It would further fragment and complicate Zephyr's boot-time configuration system which already is considered rather complex and hard-to-learn by users today.
  • Even if we manage to come up with a good definition of what goes where in a combined Kconfig, DT, YAML approach, the community will inevitably misunderstand and misuse it because no one will read, let alone understand, such an artificial definition. This would cause additional review effort for maintainers and collaborators.
  • The Kconfig/DT divide regularly causes confusion for newcomers. A YAML based approach would not contribute to making this distinction more intuitive and natural.
  • A separate YAML-based approach would have to duplicate many of the existing DT structures, processes and tools which would cause a multiple of the initial development and long-term maintenance effort required for the above proposed CT approach.
  • It would be very hard to keep separate YAML-based tooling and macros in sync with DT tooling and macros for users to transfer knowledge between the two configuration subsystems.
  • These are big future problems already solved in a way in CT that everyone in the community understands well. Building on top of the DT infrastructure and knowledge is a huge advantage and gives us a considerable head start wrt existing tooling.
  • The obvious community pressure to add more and more software component instance configuration to DT shows that it is an intuitive target where everyone expects application config, too. We should channel this energy productively rather than risking organic uncontrolled growth of DT based on misleading ontologies and assumptions.

The settings subsystem was considered as an exclusive configuration target but was then conceived as an optional part of this more general RFC because it would be lacking as a general configuration subsystem as laid out in the "Detailed RFC" section.

Thrift and protobuf were proposed as exclusive configuration sources but were then conceived as an optional part of this more general RFC because the community could hardly be expected to converge on a single binary source. Apart from that, all arguments listed under the YAML approach apply to these source serializations as well.

Kconfig-based approaches are not adequate due to Kconfig's structural limitations as laid out in the "Detailed RFC" section.

@ghost ghost added Enhancement Changes/Updates/Additions to existing features area: Networking area: Devicetree RFC Request For Comments: want input from the community area: Kconfig area: IEEE 802.15.4 treewide 🧹 Architecture Review Discussion in the Architecture WG required labels Aug 9, 2024
@ghost ghost self-assigned this Aug 9, 2024
@cfriedt
Member

cfriedt commented Aug 10, 2024

Just a comment in relation to "Detailed RFC" point 5. zephyr,chosen nodes are also incapable of being valued as anything other than phandles / aliases. So, for example, multiple uart log instances use a "pseudo-device" because there is no comparably simple way to specify multiple logging uarts.

@cfriedt
Member

cfriedt commented Aug 10, 2024

W.r.t. Boot Configuration, there is currently a PR that adds a multiboot / efi "kernel command line" (CMDLINE in Linux). I had some concerns about it being limited to just a couple of architectures.

In that ballpark though, it would be helpful to specify non-security-sensitive parameters that can vary at build, provision, and init-time (aka runtime) - e.g. logging uart(s), framebuffer details, runtime-queryable software parameters, and so on. It's not ideal to have to parse a string, but could be encoded a number of different ways that would make it parseable (and maybe even directly addressable).

I think, at least at one point, the Linux kernel command line had been added to Devicetree.

Some Zephyr community members might prefer more structured formats as well (e.g. protobuf, thrift)

@ghost
Author

ghost commented Aug 12, 2024

In that ballpark though, it would be helpful to specify non-security-sensitive parameters that can vary at build, provision, and init-time (aka runtime)

Absolutely. Such an input at runtime could be considered just another source that is layered on top of the others with high priority. That's why it is so much more important to agree on a common data model rather than discussing serializations.
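To illustrate what "layered on top of the others with high priority" could mean on a shared data model, here is a minimal sketch that deep-merges configuration trees from several sources. The source names, keys and values are made up, and the merge policy (later layers win, subtrees are merged key by key) is an assumption rather than something specified in the RFC.

```python
def merge_layers(*layers):
    """Deep-merge configuration trees; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Both sides are subtrees: merge them key by key.
                merged[key] = merge_layers(merged[key], value)
            elif isinstance(value, dict):
                # New subtree: copy it so the input layer is not aliased.
                merged[key] = merge_layers(value)
            else:
                merged[key] = value
    return merged


# Hypothetical layers, ordered from lowest to highest priority.
build_time = {"log": {"uart": "uart0", "level": 3}}
provisioning = {"log": {"level": 4}}
runtime = {"log": {"uart": "uart1"}}

print(merge_layers(build_time, provisioning, runtime))
```

The merge operates on the abstract tree only, so it is independent of whether a layer was originally serialized as DT source, YAML, protobuf or a parsed command line.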

@ghost ghost moved this to Review in RFC Backlog Aug 13, 2024
@ghost ghost changed the title [RFC][Enh. Req.] Proposal for an integrated build-time, provisioning-time and runtime approach to boot configuration in Zephyr. [RFC] Integrated Boot Configuration System (build, provisioning, runtime) Aug 13, 2024
@ghost ghost removed their assignment Aug 18, 2024
@ghost ghost closed this as not planned Aug 18, 2024
@github-project-automation github-project-automation bot moved this from Todo to Done in Architecture Review Aug 18, 2024
@github-project-automation github-project-automation bot moved this from Review to Done in RFC Backlog Aug 18, 2024
@benediktibk
Collaborator

benediktibk commented Aug 19, 2024

Sad to see this closed. I really think @fgrandel has some good points here about what Zephyr should be and actually is (application development platform vs pure RTOS) and how the use cases which derive from this design decision can be solved best.

I am also convinced that the strict separation between hardware and software configuration is not beneficial, as the software is written for this specific hardware, although vendor agnostic, and will therefore reference it and use it. Even more so if we consider the recurring discussion of whether something is software or hardware, which shows that this separation is not as clear as one would think at first.

@ghost
Author

ghost commented Aug 19, 2024

@benediktibk Thanks for your support. Maybe s/o else can continue the work on this PR then? I seem to have started it off on the wrong foot, so I'm somewhat burnt wrt this topic: the discussion of this RFC has been too much hassle for me with too little result. I'm sure this proposal is far from optimal but the feedback given so far has not been constructive either.

Anyone who likes to take over, feel free to assign yourself.

WRT CT (DT syntax only) as a first serialization format: Please note that I chose it for purely pragmatic reasons to save the community the hassle of having to re-invent the wheel. It's a stepping stone on the migration path, otherwise mostly irrelevant to the proposed architecture.

I never understood how discussions of syntax could be so much more relevant to the arch WG than semantics. This honestly does not fit my own understanding of what matters in architecture. But this seems to be a minority opinion in the arch WG, so I won't insist.

@decsny
Member

decsny commented Aug 23, 2024

Wasn't this discussed in the arch WG this week? why is there not a summary post? @carlescufi I am missing context for why the issue is closed now, I missed the meeting

@decsny
Member

decsny commented Aug 23, 2024

6. DT SHALL be re-focussed on its original purpose of inter-OS portability (namely compatibility with Linux) [...] not so much for technical or ontological reasons but to recover their full usefulness in letting users coming from Linux transfer existing knowledge w/o compromises

I am curious, do you really think there is not a technical benefit at least from a maintenance aspect for both users and tree developers, to keep the DT focused only on hardware? I also am skeptical of the "philosophical" statements about it which always seem dogmatic... but it still seems like maybe it would be good to have a more stable devicetree definition for hardware that doesn't change as easily as the whims of software changes could. Maybe I'm misunderstanding what you said here though, and you were only saying the inter-OS compatibility doesn't really have a technical reason? In which case, maybe I agree about that, it seems like a highly suspect aspiration that is also thrown around a lot as a hand wavy justification for the zephyr DT "philosophy". (it seems like the whole point of this RFC is to say, to separate hardware and software configuration, only this sentence confused me about what you see as the benefits of having DT)

@ghost
Author

ghost commented Aug 23, 2024

@decsny

I am curious, do you really think there is not a technical benefit at least from a maintenance aspect for both users and tree developers, to keep the DT focused only on hardware?

Your questions have been addressed in the RFC. I'll cite it for your convenience where appropriate.

Like several others in the Zephyr community, you seem to take it for granted that the distinction between hardware and software properties is precise and objective. However, our own experience and decades of research have consistently shown that ontological definitions are subjective and based on "group think" (ie. self-stabilizing systemic discourse) and therefore truly "hand wavy". Unfortunately humans tend to ignore this fact, see e.g. Kahneman for a number of self-tests from Nobel Prize-winning research that everyone can reproduce at home.

From this RFC:

This is a well-known perceptual bias: We tend to confuse our internal models and heuristics with what is out there in the world. The reality is: We fight over each and every addition to DT because some say "it's SW" others say "it's HW". If not even we are able to precisely define the line between SW and HW, how will our users? If we have to explain to our users that what they find intuitive is wrong then we are wrong.

This means that we probably have as many hw/sw distinctions in practice as we have maintainers. ;-) Due to confirmation bias we tend to describe distinctions made by others as "errors" or even "deviations from the pristine beauty of DT".

The only axiomatic definition I heard so far is "every property 1-to-1 (i.e. functionally bijective) to a driver instance is DT". But then why do we fight over network configuration as it is almost exclusively 1-to-1 to driver instances with mathematical precision? And what about all those existing DT properties that are not 1-to-1 to driver instances (eg cpus, ram, flash, interrupts, etc.)? (Note: I don't back this approach as it breaks encapsulation, deviates from established Linux practice and doesn't scale.)

The RFC calls this out:

Our use of DT has broken basic data modeling practices from day one, namely normalization and encapsulation [...]:

  • Normalization defines with mathematical precision that HW and SW properties SHALL be kept in the same abstract entity [note: not the same file, though!] if they both functionally depend on it (ie. 1-to-1) to ensure model integrity and validity.
  • Physical deployment artifacts [aka: files] OTOH [note: not so the abstract model entities!] SHALL be split up along rules of encapsulation and modularization, an argument as precise as a search/find operation over our code base.

Our DTS and bindings are mostly kept far apart from usage sites instead. We have invented Zephyr-specific (but vendor agnostic) "hardware properties" that neither exist in datasheets nor in Linux and put them where the hardware lives based on imprecise ontological assumptions of what is "hardware".

You and I agree that maintainability means that files that change at different frequencies should be kept apart. Our current dogmatic use of DT breaks this rule due to coupling of almost constant (and mostly Linux-compatible) board/soc related properties with driver- and subsystem-related properties.

Further down in my RFC I'm therefore introducing well-established axiomatic concepts and practices that are anything but "hand wavy" to replace the software/hardware distinction and reduce DT to the de facto objectively Linux-compatible part. All the rest is by definition abstractly modeled CT (which can be expressed in any syntax including, but not limited to, YAML and DT syntax). Even this basic fact has been ignored by almost all commenters. From the RFC:

  • Subsystem or driver specific default properties and binding files that are exclusively used in Zephyr (namely unavailable in Linux) SHALL be placed inside Zephyr's directory tree based on the "least visibility" encapsulation rule, i.e. as deep as possible in the directory tree and as close to in-tree usage sites.
  • Default properties that are semantically defined and used by Linux and their corresponding binding files SHALL be placed in a shared top-level folder structure separate from their usage sites and from all other default property and binding files. This enables us to automatically recover a fully Linux-compatible DT from CT at any time.
  • No files SHALL be placed based on imprecisely or subjectively defined and hard-to-enforce ontologies like "hardware" vs. "software" properties.

Note: Just do a search in our code base and that of Linux to find the "least visible" file system folder precisely.

I am missing context for why the issue is closed now

"Conventional wisdom" (as I call the kind of biased assumptions that we all unavoidably share as human beings) can be overcome by argument but this is hard especially if held by trusted entities like the Linux community, the DTSpec, TSC members or an initial majority of the arch WG that is unaware of the arguments because they have not read the RFC but unfortunately still opinionate - and do so highly emotionally as we've seen.

@nashif, @carlescufi As you seem to be back from holiday:

As others have pointed out: So far, what has mattered in discussing this RFC is not rational argument but what the majority of devs subjectively believes to be true (even consciously and proudly so without giving "rational" arguments).

My RFC goes beyond what can be digested in five minutes by arch WG members. I already can see eyes rolling because I'm writing another long argument that cannot be subsumed in 140 characters. But the many misunderstandings show that this unfortunately seems to be required. I hope we all still read books, datasheets and specs for a reason. Not so RFCs and comments it seems?

I believe that we need a different architecture WG decision procedure and debating culture if we want to address problems like this systematically in a more productive way:

  • Opinions must be given referring to well-defined functional requirements or established design criteria,
  • personal opinions must refer to shared community interest and
  • opinions must show that the person speaking is fully aware of the RFC to be decided upon and responds to arguments given rather than consciously ignoring them.

Such behaviors should be actively encouraged by whoever currently chairs the discussion or stumbles upon them in RFCs tagged with "Architecture Review", and contributions not respecting these basic rules should be disregarded in decisions. This is what our CoC requests anyway and is required to level the playing field for the OP.

IMO it is not the task of the OP alone to defend herself against non-productive behavior. The burden should be shared by all community members, especially maintainers and of course TSC members. RFCs and PRs systematically land in arch review because they have caused conflict that needs to be actively moderated.

IMO a structured architecture decision process also requires establishing a shared list of requirements (the problem space) before accepting statements about implementation details (the solution space). Except for a few examples this RFC has been exclusively discussed in the solution space right away w/o giving me the time and the benefit of the doubt to calmly lay out my findings. Worse: Even after the problem space was laid out in written form, it was largely ignored and it seemed to be considered the OP's task to deal with that. Carles, I did recognize your attempts to ensure that we define requirements first and I really appreciated this.

Thinking and reading before commenting takes time but it spares people like me a lot of trouble as our only way to consent is argument and debate plus it spares the community having to work around lack of decision or deadlock for years as we obviously do in the config area where "irrational" opinions seem to be allowed to block "rational" arguments.

If casual arch WG participants do not have time to consider complex arguments routinely then we need a dedicated arch WG team to support OPs that contributes bandwidth to prepare discussions, establishes internal consensus about architecture and design principles and enforces them when challenged by participants unaware of the problem space.

Should I open an RFC so we discuss this in more detail in the next arch WG? ;-)

@henrikbrixandersen

I also request that we do not lightly accuse someone of breaching the CoC as has happened to me in this context. It should be acknowledged that to me as a non-native speaker it is hard to distinguish between an argument that is "nil" and an argument that is "simply not true" (your own words). I'd say both dismiss a previously given argument equally which IMO is an integral part of ongoing debate and should be acceptable. I also request that TSC members calmly refer to the CoC in private to give people unaware of the emotions they caused a chance to apologize or rectify misunderstandings. I also believe that the CoC was nowhere breached, it is especially hard to accept that I was accused of bringing in religion to the discussion after citing a well-known metaphor by R. Feynman (which I even immediately replaced with s/th less easy to misinterpret once pointed to it). I firmly believe that this could even be considered abuse of the CoC as clearly religion is meant differently in that context.

Such moderated behavior is of course desirable by all community members including myself but I find it much more forgivable if those who have less visibility and responsibility in the community now and then become emotional, this is so human. The only thing I'd have expected was a prompt clarification and correction once the facts were on the table. Unfortunately this never happened and IMO largely contributed to damaging the discussion and leading it in the wrong direction.

Additionally statements of the kind I described as non-productive above were repeatedly endorsed after I had given criteria for what IMO is broadly accepted as "good debating culture" to re-focus the discussion on argument and community requirement rather than personal taste.

I consider this an example of really outstandingly bad behavior by a trusted community leader. I did mention so in private before calling it out here in public but unfortunately to no avail - much to the contrary. Plus I was encouraged to place my critique in public which I hereby do.

I hope this sufficiently explains why under such circumstances I considered it a waste of time to continue supporting this RFC w/o leading the meta-discussion first.

@decsny
Member

decsny commented Aug 23, 2024

As many others in the Zephyr community, you seem to take it for granted that the distinction between hardware and software properties is precise and objective.

Actually, I would say as the only active person on the DT collaborator list for the last couple months, it's probably more apparent to me than most people that the gray area is an issue. From being on the (recently removed, by me) DT binding maintainer area, I saw many things being added to DT that aroused this type of argument over HW/SW distinction.

I am still fairly green to the area, but what I have been trying to do by asking around to some active community members is to try to characterize the community wisdom (maybe what you called "group think") about what is HW vs SW as well as what benefit we are deriving from the automatically accepted belief that we need to separate configuration of them into separate schemas and languages. Because I am loath to disregard the well established conventions but also skeptical about what I keep hearing repeated all the time without much consideration, especially given the frequency of disagreements as you have pointed out. This is why I was trying to clarify your opinion about it in my question above, since you have clearly put so much systematic thought into this, I thought it would be valuable to understand your perspective.

So far, the most convincing thing I have heard in the past is what I mentioned, about frequency and reason for changes to the configurations. I don't know if you would call it axiomatic, but it seems like to me that a good rule of thumb might be that you should not have to change DT unless your physical hardware configuration has changed. Of course this is not how zephyr DT works right now.

6. Eventually all Zephyr-specific "quirks", extensions and exceptions SHALL be removed from both, Kconfig and DT: Kconfig SHALL be re-focussed on its original purpose of feature selection rather than in-memory software component instance configuration.

I am not sure if I agree or not about what you're describing as the original purpose of Kconfig. Feature selection is a big part yes, but there is clearly built into the language support for configs that are strings, ints, hex, etc, do you really think these types are semantically meant just for feature selection?

@ghost
Author

ghost commented Aug 23, 2024

Actually, I would say as the only active person on the DT collaborator list for the last couple months, it's probably more apparent to me than most people that the gray area is an issue.

I didn't know this, I misunderstood you there. Sorry for pointing out what you already knew. It's reassuring, though, that we made the same observation.

I thought it would be valuable to understand your perspective.

Same on my side. I'm very much interested in yours! I can see that you bring in constructive argument.

So far, the most convincing thing I have heard in the past is what I mentioned, about frequency and reason for changes to the configurations.

Yes, I agree. The argument is good, but we don't really follow it because we mostly don't add new properties because the hardware changed. Most of our new properties are added at the same rate as new driver features. So they are actually driver properties from an "axiomatic" rate-of-change perspective although they of course describe some kind of hardware/driver interface. But this doesn't matter for encapsulation purposes as these properties are used nowhere other than in the drivers. As soon as they actually are used elsewhere we need to move them of course, but only then.

I don't know if you would call it axiomatic, but it seems like to me that a good rule of thumb. [...] Of course this is not how zephyr DT works right now.

Yes, we again agree. See above.

To make it clearer what I mean by "axiomatic" as opposed to "ontological":

  • To me something is axiomatic if it doesn't require us to make assumptions about the world outside abstract software models. Software constructs refer to other software constructs, just as in mathematics formulae are derived from other formulae by purely syntactical operations independently of any "interpretation". Such derivations can be made intersubjectively precise; their correlation to external objects depends on context, though. That's why we attach properties to "hardware" based on our drivers' needs. We don't model size, weight, atoms or PCBs although in other contexts such properties legitimately describe hardware, too.

  • To me something is ontological if it refers to something metaphysical, something that is true independently of context, something that is imagined to be "real" outside of our discursively agreed "software formulae". But as humans, we don't have direct access to the "real", we need intermediate modeling. As neuroscientist Anil Seth likes to say: Consciousness is a controlled hallucination. This is good enough so that we don't usually spill our coffee beside our mouth. But it's not good enough to keep it stable independent of context and individual perception. Plato made an early attempt to define "eternal" ontologies more than 2000 years ago, but over time such ontologies never seemed to match or got extremely complex (like failed approaches to define the enterprise data model or the semantic web). Unfortunately the basic idea returns in each generation because of the biases I mentioned.

  6. Eventually all Zephyr-specific "quirks", extensions and exceptions SHALL be removed from both, Kconfig and DT: Kconfig SHALL be re-focussed on its original purpose of feature selection rather than in-memory software component instance configuration.

I am not sure if I agree or not about what you're describing as the original purpose of Kconfig. Feature selection is a big part yes, but there is clearly built into the language support for configs that are strings, ints, hex, etc, do you really think these types are semantically meant just for feature selection?

You're right, I formulated this too sloppily. As Linux doesn't have another build-time kernel configuration mechanism they partly use it for configuration as we do. The separation between feature selection and software instance configuration is my own "invention" and would be specific to Zephyr to overcome the artificial distinction between "single instance" and "multi instance" configuration. Thanks for pointing that out.
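
To illustrate the distinction I have in mind, here is a minimal sketch in plain C. All names (`DEMO_FEATURE_LOGGING`, `struct demo_radio_cfg`, `demo_radios`, the values) are hypothetical and made up for this comment; they are neither existing Zephyr symbols nor the API proposed in the RFC. Feature selection is one build-wide switch, whereas software instance configuration is one record per in-memory instance, so "single instance" is just the case of a table with one entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Feature selection: a single build-wide switch, in the style of a Kconfig
 * bool. DEMO_FEATURE_LOGGING is a hypothetical name, not a Zephyr symbol. */
#define DEMO_FEATURE_LOGGING 1

/* Instance configuration: one record per in-memory component instance.
 * The fields are illustrative assumptions only. */
struct demo_radio_cfg {
	const char *name;
	uint16_t pan_id;
	uint8_t channel;
};

/* Two instances of the same software component, configured independently. */
static const struct demo_radio_cfg demo_radios[] = {
	{ .name = "radio0", .pan_id = 0xabcd, .channel = 11 },
	{ .name = "radio1", .pan_id = 0x1234, .channel = 26 },
};

#define DEMO_RADIO_COUNT (sizeof(demo_radios) / sizeof(demo_radios[0]))

/* Returns the configuration of instance idx, or NULL if it does not exist. */
const struct demo_radio_cfg *demo_radio_cfg_get(size_t idx)
{
	return idx < DEMO_RADIO_COUNT ? &demo_radios[idx] : NULL;
}

/* Reports whether the build-wide feature was selected. */
int demo_logging_enabled(void)
{
#if DEMO_FEATURE_LOGGING
	return 1;
#else
	return 0;
#endif
}
```

With this shape, code that consumes the configuration only ever asks for "instance idx" and does not care whether the table has one entry or many.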

@tmon-nordic
Contributor

I am a bit out of context regarding the reactions to this RFC: did all this happen during arch WG meeting(s)? Is the recording and/or transcript available somewhere?

On the actual RFC I do agree with it. Unfortunately I am not skilled enough to implement it myself nor to even imagine what the implementation would look like.

I do know that there are situations where DT is the current way-to-go because it provides the necessary technical implementation (UAC2 with its extreme macrobatics is actually about translating "user-readable syntax describing UAC2 device" to "a bunch of blobs for external host use and a bunch of easy-to-use lookup arrays for class implementation"). The alternative would be coming up with a subsystem (or even worse - a problem) specific custom language with its own tooling.
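
To make the "one description, several generated artifacts" idea concrete, here is a rough sketch in plain C. It uses an X-macro list instead of devicetree macros, and every name and value in it (`DEMO_ENTITIES`, the ids and type codes, the 3-byte record layout) is a made-up assumption for illustration, not the actual UAC2 bindings or descriptor format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, human-readable description of three audio entities.
 * X(id, type) entries are illustrative; they are not the real UAC2 bindings. */
#define DEMO_ENTITIES(X) \
	X(1, 0x02) /* input terminal  */ \
	X(2, 0x06) /* feature unit    */ \
	X(3, 0x03) /* output terminal */

/* Expansion 1: a flat byte blob of (length, type, id) records, standing in
 * for what an external host would read. */
#define DEMO_AS_BLOB(id, type) 3, (type), (id),
static const uint8_t demo_blob[] = { DEMO_ENTITIES(DEMO_AS_BLOB) };

/* Expansion 2: a lookup array indexed by entity id, standing in for what
 * the class implementation would use. Index 0 is unused and stays 0. */
#define DEMO_AS_LOOKUP(id, type) [(id)] = (type),
static const uint8_t demo_type_by_id[] = { DEMO_ENTITIES(DEMO_AS_LOOKUP) };

/* Size of the generated blob in bytes. */
size_t demo_blob_len(void)
{
	return sizeof(demo_blob);
}

/* Type code of entity id, or 0 if the id is unknown. */
uint8_t demo_entity_type(uint8_t id)
{
	return id < sizeof(demo_type_by_id) ? demo_type_by_id[id] : 0;
}
```

Both artifacts come from the same list, so they cannot drift apart; the price is exactly the kind of macro expansion that is hard to read and debug.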

@ghost
Author

ghost commented Sep 5, 2024

I am a bit out of context regarding the reactions to this RFC: did all this happen during arch WG meeting(s)? Is the recording and/or transcript available somewhere?

@tmon-nordic

Thanks for coming back to this RFC, the discussion continued here: #77638

This issue was closed.
Projects
Status: Done

4 participants