systemd System and Service Manager
CHANGES WITH 254 in spe:
Announcements of Future Feature Removals and Incompatible Changes:
* We intend to remove cgroup v1 support from a systemd release after the
end of 2023. If you run services that make explicit use of cgroup v1
features (i.e. the "legacy hierarchy" with separate hierarchies for
each controller), please implement compatibility with cgroup v2 (i.e.
the "unified hierarchy") sooner rather than later. Most of Linux
userspace has been ported over already.
* The next release (v255) will remove support for split-usr (/usr/
mounted separately during late boot, instead of being mounted by the
initrd before switching to the rootfs) and unmerged-usr (parallel
directories /bin/ and /usr/bin/, /lib/ and /usr/lib/, …). For more
details, see:
https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html
* EnvironmentFile= now treats the line following a comment line that
ends with a backslash escape as a non-comment line. For details, see:
https://github.com/systemd/systemd/issues/27975
* Support for System V service scripts is now deprecated and will be
removed in a future release. Please make sure to update your software
*now* to include a native systemd unit file instead of a legacy
System V script to retain compatibility with future systemd releases.
* Behaviour of the per-user service manager units has changed w.r.t.
sandboxing options, so that they work without having to manually
enable PrivateUsers= as well, which is not required for system units.
To make this work, we will implicitly enable user namespaces
(PrivateUsers=yes) when a sandboxing option is enabled in a user unit.
The drawback is that system users will no longer be visible (and
appear as 'nobody') to the user unit when a sandboxing option is
enabled. By definition a sandboxed user unit should run with reduced
privileges, so impact should be small. This will remove a great source
of confusion that has been reported by users over the years, due to
how these options require an extra setting to be manually enabled when
used in the per-user service manager, as opposed to the system
service manager. For more details, see:
https://lists.freedesktop.org/archives/systemd-devel/2022-December/048682.html
Security Relevant Changes:
* pam_systemd will now by default pass the CAP_WAKE_ALARM ambient
process capability to invoked session processes of regular users on
local seats (as well as to systemd --user), unless configured
otherwise via data from JSON user records, or via the PAM module's
parameter list. This is useful in order to allow desktop tools such as
GNOME's Alarm Clock application to set a timer for
CLOCK_REALTIME_ALARM that wakes up the system when it elapses. A
per-user service unit file may thus use AmbientCapabilities= to pass
the capability to invoked processes. Note that this capability is
relatively narrow in focus (in particular compared to other process
capabilities such as CAP_SYS_ADMIN) and we already — by default —
permit more impactful operations such as system suspend to local
users.
Service Manager:
* "Startup" memory settings are now supported. Previously IO and CPU
settings were already supported via StartupCPUWeight= and similar;
this adds the same logic for the various per-unit memory settings
(StartupMemoryMax= and related).
* The service manager gained support for enqueuing POSIX signals to
services that carry an additional integer value, exposing the
sigqueue() system call. This is accessible via new D-Bus calls
QueueSignalUnit() (and related), as well as in systemctl via the new
--kill-value= parameter.
* systemctl gained a new "list-paths" verb, which shows all currently
active .path units, similar to how "systemctl list-timers" shows
active timers, and "systemctl list-sockets" shows active sockets.
* If MemoryDenyWriteExecute= is enabled for a service and the kernel
supports the new PR_SET_MDWE prctl() call it is used in preference
over seccomp() based system call filtering to achieve the same effect.
* systemctl gained a new --when= switch which is honoured by the various
forms of shutdown (i.e. reboot, kexec, poweroff, halt) and allows
scheduling these operations by time, similar in fashion to how this
has been supported by SysV shutdown.
* A new set of kernel command line options is now understood:
systemd.tty.term.<name>=, systemd.tty.rows.<name>=,
systemd.tty.columns.<name>= allow configuring the TTY type and
dimensions for the tty specified via <name>. When systemd invokes a
service on a tty (via TTYName=) it will look for these and configure
the TTY accordingly. This is particularly useful in VM environments,
to propagate host terminal settings into the appropriate TTYs of the
guest.
* A new RootEphemeral= setting is now understood in service units. It
takes a boolean argument. If enabled for services that use RootImage=
or RootDirectory= an ephemeral copy of the disk image or directory
tree is made when the service is started. It is removed automatically
when the service is stopped. That ephemeral copy is made using
btrfs/xfs reflinks or btrfs snapshots, if available.
* The service activation logic gained new settings RestartSteps= and
RestartMaxDelaySec= which allow exponentially growing restart
intervals for Restart=.
* PID 1 will now automatically load the virtio_console kernel module
during early initialization if running in a suitable VM. This is done
so that early-boot logging can be written to the console if available.
* Similarly, virtio-vsock support is loaded early too in suitable VM
environments. Since PID 1 sends sd_notify() notifications via
AF_VSOCK to the VMM these days (if requested), loading this early is
beneficial.
* A new verb "fdstore" has been added to systemd-analyze to show the
current contents of the file descriptor store of a unit. This is
backed by a new D-Bus call DumpUnitFileDescriptorStore() provided by
the service manager.
* The service manager will now set a new $FDSTORE environment variable
when invoking processes for services that have the file descriptor
store enabled.
* A new service option FileDescriptorStorePreserve= has been added that
allows tuning the life-cycle of the per-service file descriptor
store. If set to "yes" the entries in the fd store are retained even
after the service is fully stopped.
* The "systemctl clean" command may now be used to clear the fdstore of
a service.
* Unit *.preset files gained a new directive "ignore", in addition to
the existing "enable" and "disable". As the name suggests it leaves
units defined like this in their status quo, i.e. neither enables nor
disables them.
* Service units gained a new setting DelegateSubgroup=. It takes the
name of a sub-cgroup to place any processes the service manager forks
off in. Previously, the service manager would place all service
processes directly in the top-level cgroup it creates for them, no
matter what. This usually meant that services with delegation enabled
would first have to move themselves down some level in order to not
conflict with the "no processes in inner cgroups" rule of
cgroupv2. With this option it is now possible to configure the name
of a subgroup to place all processes forked off by PID 1 in directly.
* The service manager will now look for .upholds/ directories, similar
to the existing support for .wants/ and .requires/ directories, and
uses contained symlinked units for creating Upholds=
dependencies. The [Install] section of unit files gained support for
a new UpheldBy= directive to generate such symlinks automatically
when a unit is enabled.
* The service manager now supports a new kernel command line option
systemd.default_device_timeout_sec=, which may be used to override
the default timeout for .device units.
* A new "soft-reboot" mechanism has been added to the service
manager. A "soft reboot" is similar to a regular reboot, except that
it affects userspace only: the service manager shuts down the running
services and other units, then optionally switches into a new root
file system (mounted to /run/nextroot/), and then passes control to a
systemd instance in the new file system which then starts the system
up again. The kernel is not rebooted and neither is hardware,
firmware or boot loader. It is a fast, lightweight mechanism to
quickly reset or update userspace, without the latency that a full
system reset involves. Moreover, open file descriptors may be passed
across the soft reboot into the new system where they will be passed
back to the originating services. This allows pinning resources
across the reboot, thus minimizing grey-out time further. Moreover,
it is possible to allow specific crucial services to survive the
reboot process, if they run off a separate root file system (i.e. use
RootDirectory= or RootImage=, or are portable services). This new
reboot mechanism is accessible via the new "systemctl soft-reboot"
command.
* A new service setting MemoryKSM= has been added, which may be used to
enable kernel same-page merging individually for services.
* A new service setting ImportCredentials= has been added that augments
LoadCredential= and LoadCredentialEncrypted= and searches for
credentials to import from the system, and supports globbing.
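The RestartSteps= and RestartMaxDelaySec= entry above describes restart intervals that grow from the configured base delay up to a maximum. The following is a minimal Python sketch of such a schedule. It assumes a geometric interpolation between the two values over the given number of steps; the exact formula used by the service manager is not stated in this file and may differ. The function name and parameters are hypothetical.

```python
def restart_delays(restart_sec, max_delay_sec, steps, restarts):
    """Return the assumed delay (in seconds) before each of the first
    `restarts` automatic restarts.

    Assumption: the delay grows geometrically from restart_sec to
    max_delay_sec over `steps` steps and then stays at the maximum.
    """
    # Without steps or a larger maximum, the delay stays constant.
    if restart_sec <= 0 or steps <= 0 or max_delay_sec <= restart_sec:
        return [restart_sec] * restarts

    ratio = max_delay_sec / restart_sec
    delays = []
    for n in range(restarts):
        if n >= steps:
            # All steps consumed: stay at the maximum delay.
            delays.append(float(max_delay_sec))
        else:
            # Geometric interpolation between base and maximum.
            delays.append(restart_sec * ratio ** (n / steps))
    return delays


if __name__ == "__main__":
    print(restart_delays(1, 16, 4, 6))
```

Under this assumption, a base delay of 1s, a maximum of 16s and 4 steps give delays of roughly 1, 2, 4, 8, 16 and 16 seconds.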
Journal:
* The sd-journal API learnt a new call sd_journal_get_seqnum() for
retrieving the current log record's sequence number and sequence
number ID, which allows applications to order records the same way as
the journal does internally already. The sequence number is now also
exported in the JSON and "export" output of the journal.
* journalctl gained a new switch --truncate-newline. If specified
multi-line log records will be truncated at the first newline,
i.e. only the first line of each log message is shown.
systemd-repart:
* systemd-repart's drop-in files gained a new ExcludeFiles= option which
may be used to exclude certain files from the effect of CopyFiles=,
which allows populating newly created partitions automatically.
* systemd-repart's Verity support now implements the Minimize= setting
to minimize the size of the resulting partition.
* systemd-repart gained a new --offline= switch, which may be used to
control whether images shall be built "online" or "offline",
i.e. whether to make use of kernel facilities such as loopback block
devices and DM or not.
* If systemd-repart is told to populate a newly created ESP or XBOOTLDR
partition with some files it will now default to VFAT rather than
ext4, unless specified otherwise.
* systemd-repart gained a new --architecture= switch. If specified, the
per-architecture GPT partition types (i.e. the root and /usr/
partitions) configured in the partition drop-in files are
automatically adjusted to match the specified CPU architecture, in
order to simplify cross-architecture DDI building.
systemd-boot, systemd-stub, ukify, bootctl, kernel-install:
* bootctl gained a new switch --print-root-device (or short: -R) that
prints the main block device the root file system is backed by. It's
useful for invocations such as "cfdisk $(bootctl -R)" to quickly have
a look at the partition table of the running OS.
* systemd-stub will now look for the SMBIOS Type 11 field
"io.systemd.stub.kernel-cmdline-extra" and append its value to the
kernel command line it invokes. This is useful for VMMs such as qemu
to pass additional kernel command lines into the system even when
booting via full UEFI. It's measured into TPM PCR 12.
* The KERNEL_INSTALL_LAYOUT= setting for kernel-install gained a new
value "auto". If used, a kernel will be automatically analyzed, and if
it qualifies as a UKI it will be installed as if the setting was set
to "uki", otherwise via "bls".
* systemd-stub can now optionally load UEFI PE "add-on" images that may
contain additional kernel command line information. These "add-ons"
superficially look like a regular UEFI executable, and are expected
to be signed via SecureBoot/shim. However, they do not actually
contain code, but instead a subset of the PE sections that UKIs
support. They are supposed to provide a way to extend UKIs with
additional resources in a secure and authenticated way. Currently,
only the .cmdline PE section may be used in add-ons, in which case
any specified string is appended to the command line embedded into
the UKI itself. A new 'addon<EFI-ARCH>.efi.stub' is now provided that
can be used to trivially create addons, via 'ukify' or 'objcopy'. In
the future we expect other sections to be made extensible like this as
well.
* ukify has been updated to allow building these UEFI PE "add-on"
images, using the new 'addon<EFI-ARCH>.efi.stub'.
* ukify gained a new "genkey" verb for generating a set of key pairs
to sign UKIs and their PCR data with.
* The kernel-install script has been rewritten in C, and reuses much of
the infrastructure of existing tools such as bootctl. It also gained
--esp-path= and --boot-path= options to override the path to the ESP,
and the $BOOT partition. Options --make-entry-directory= and
--entry-token= have been added as well, similar to bootctl's options
of the same name.
* A new kernel-install plugin 60-ukify has been added which will
combine kernel/initrd locally into a UKI and sign it with a local
key. This may be used to switch to UKI mode even on systems where a
local kernel or initrd shall be supported. (Typically UKIs are built
and signed on OS vendor systems.)
* The ukify tool now supports "pesign" in addition to the pre-existing
"sbsign" for signing UKIs.
* systemd-measure and systemd-stub now look for a new .uname PE section
that should encode the kernel's "uname -r" string.
* systemd-measure may now calculate expected PCR hashes for a UKI
"offline", i.e. requires no access to a TPM (neither physical nor
software emulated).
Memory Pressure & Control:
* The sd-event API gained new calls sd_event_add_memory_pressure(),
sd_event_source_set_memory_pressure_type(),
sd_event_source_set_memory_pressure_period() for creating and
configuring an event source that is called whenever the OS signals
memory pressure. Another call sd_event_trim_memory() is provided that
compacts the process' memory use by releasing allocated but unused
malloc() memory back to the kernel. Services can also provide their
own custom callback to do memory trimming. This should improve system
behaviour under memory pressure, as Linux traditionally provided no
mechanism to return process memory back to the kernel if the kernel
was under pressure to acquire some. This makes use of the kernel's PSI
interface. Most long-running services that systemd contains have been
hooked up with this, and in particular systems with low memory should
benefit from this.
* Service units learnt the new MemoryPressureWatch= and
MemoryPressureThresholdSec= settings for configuring the PSI memory
pressure logic individually. If these options are used the
$MEMORY_PRESSURE_WATCH and $MEMORY_PRESSURE_WRITE environment
variables will be set for the invoked processes to inform them about
the requested memory pressure behaviour. (This is used by the
aforementioned sd-event API additions, if set.)
* systemd-analyze gained a new "malloc" verb that shows the output
generated by glibc's malloc_info() on services that support it. Right
now, only the service manager has been updated accordingly.
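A service that does not use sd-event can still pick up the $MEMORY_PRESSURE_WATCH and $MEMORY_PRESSURE_WRITE variables mentioned above. The Python sketch below only reads and decodes them. It assumes, based on the MEMORY_PRESSURE document linked further down rather than on this file, that the write value is Base64 encoded and is meant to be written to the watched file after opening it. The function name is hypothetical.

```python
import base64
import os


def memory_pressure_config(environ=os.environ):
    """Return (watch_path, payload) from the environment, or None.

    Assumption: MEMORY_PRESSURE_WRITE holds Base64 encoded bytes that
    should be written to the file named by MEMORY_PRESSURE_WATCH.
    """
    path = environ.get("MEMORY_PRESSURE_WATCH")
    if not path:
        # The service manager did not request memory pressure handling.
        return None
    encoded = environ.get("MEMORY_PRESSURE_WRITE")
    # The write variable is treated as optional in this sketch.
    payload = base64.b64decode(encoded) if encoded else None
    return path, payload
```

A caller would then open the returned path, write the payload if there is one, and poll the descriptor for pressure events; that part is left out here because it depends on the kernel's PSI interface being available.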
User & Session Management:
* The sd-login API gained a new call sd_session_get_username() for
returning the name of the user who owns a specific login session. It also
gained a new call sd_session_get_start_time() for retrieving the time
the login session started. A new call sd_session_get_leader() has
been added to return the PID of the "leader" process of a session. A
new call sd_uid_get_login_time() returns the time since when the
specified user has most recently been continuously logged in with at
least one session.
* JSON user records gained a new set of fields capabilityAmbientSet and
capabilityBoundingSet which contain a list of POSIX capabilities to
set for the logged in users in the ambient and bounding sets,
respectively. homectl gained the ability to configure these two sets
for users via --capability-bounding-set=/--capability-ambient-set=.
* pam_systemd learnt two new module options
default-capability-bounding-set= + default-capability-ambient-set= to
configure the default bounding and ambient sets for users as they log in,
if the JSON user record doesn't specify this explicitly (see
above). The built-in default for the ambient set now contains the
CAP_WAKE_ALARM, thus allowing regular users who may log in locally to
resume from a system suspend via a timer. (see above)
* The Session D-Bus objects of systemd-logind gained a new SetTTY() method
call for updating the TTY of a session after it has been allocated
already. This is useful for SSH sessions which are typically
allocated first, and for which a TTY is added later.
* The sd-daemon API gained a new call sd_pid_notifyf_with_fds() which
combines the various other sd_pid_notify() flavours into one: takes a
format string, an overriding PID, and a set of file descriptors to
send along. It also gained a new call sd_pid_notify_barrier() which
is equivalent to sd_notify_barrier() but allows specification of the
originating PID.
* "loginctl list-users" and "loginctl list-sessions" will now show the
state of each logged in user/session in their tabular output. It will
also show the current idle state of sessions.
DDIs:
* systemd-dissect will now show the intended CPU architecture of an
inspected DDI.
* systemd-dissect will now install itself as mount helper for the "ddi"
pseudo-file system type. This means you may now mount DDIs directly
via /bin/mount or /etc/fstab, making full use of embedded Verity
information and all other DDI features. Example: mount -t ddi
myimage.raw /some/where
* The systemd-dissect tool gained the new switches --attach/--detach for
attaching a DDI to a loopback block device without mounting it. It
will automatically derive the right sector size from the image and set
up Verity and similar, but not mount the file systems in it.
* When systemd-gpt-auto-generator or the DDI mounting logic mount an ESP
or XBOOTLDR partition the MS_NOSYMFOLLOW mount option is now
implied. Given that these file systems are typically untrusted
territory this should make mounting them automatically have less of a
security impact.
* All tools that parse DDIs (such as systemd-nspawn, systemd-dissect,
systemd-tmpfiles, …) now understand a new switch --image-policy= which
takes a string encoding image dissection policy. With this mechanism
automatic discovery and use of specific partition types and the
cryptographic requirements on the partitions (Verity, LUKS, …) can be
restricted, permitting better control of the exposed attack surfaces
when mounting disk images. systemd-gpt-auto-generator will honour such
an image policy too, configurable via the systemd.image_policy= kernel
command line option. Unit files gained the RootImagePolicy=,
MountImagePolicy= and ExtensionImagePolicy= to configure the same for
disk images a service runs off.
* systemd-analyze gained a new verb "image-policy" for validating and
parsing image policy strings.
* systemd-dissect gained support for a new --validate switch for
superficially validating DDI structure, and checking whether a
specific image policy allows the DDI.
Network Management:
* networkd's GENEVE support has gained a new .netdev option
InheritInnerProtocol=.
Device Management:
* udevadm gained the new "verify" verb for validating udev rules files
offline.
* udev will now create symlinks to loopback block devices in the
/dev/loop/by-ref/ directory that are based on the .lo_file_name string
field selected during allocation. The systemd-dissect tool and the
util-linux losetup command now support a complementary new switch
--loop-ref= for selecting the string. This means a loopback block
device may now be allocated under a caller chosen reference and can
subsequently be referenced by that without first having to look up the
block device name the caller ended up with.
* udev also creates symlinks to loopback block devices in the
/dev/loop/by-ref/ directory based on the .st_dev/st_ino fields of the
inode attached to the loopback block device. This means that attaching
a file to a loopback device will implicitly make a handle available to
be found via that file's inode information.
* udev gained a new tool "iocost" that can be used to configure QoS IO
cost data based on hwdb information onto suitable block devices. Also
see https://github.com/iocost-benchmark/iocost-benchmarks.
TPM2 Support + Disk Encryption & Authentication:
* systemd-cryptenroll/systemd-cryptsetup will now install a TPM2 SRK
("Storage Root Key") as a first step in the TPM2, and then use that
for binding FDE to, if TPM2 support is used. This matches
recommendations of TCG (see
https://trustedcomputinggroup.org/wp-content/uploads/TCG-TPM-v2.0-Provisioning-Guidance-Published-v1r1.pdf)
* systemd-cryptenroll and other tools that take TPM2 PCR parameters now
understand textual identifiers for these PCRs.
* systemd-veritysetup + /etc/veritytab gained support for a series of
new options: hash-offset=, superblock=, format=, data-block-size=,
hash-block-size=, data-blocks=, salt=, uuid=, hash=, fec-device=,
fec-offset=, fec-roots= to configure various aspects of a Verity
volume.
* systemd-cryptsetup + /etc/crypttab gained support for a new
veracrypt-pim= option for setting the Personal Iteration Multiplier
of veracrypt volumes.
* systemd-integritysetup + /etc/integritytab gained support for a new
mode= setting for controlling the dm-integrity mode (journal, bitmap,
direct) for the volume.
* systemd-analyze gained a new verb "pcrs" that shows the known TPM PCR
registers, their symbolic names and current values.
systemd-tmpfiles:
* The ACL support in tmpfiles.d/ has been updated: if an uppercase "X"
access right is specified this is equivalent to "x" but only if the
inode in question already has the executable bit set for at least
some user/group. Otherwise the "x" bit will be turned off.
* tmpfiles.d/'s C line type now understands a new modifier "+": a line
with C+ will result in a "merge" copy, i.e. all files of the source
tree are copied into the target tree, even if that tree already
exists, resulting in a combined tree of files already present in the
target tree and those copied in.
* systemd-tmpfiles gained a new --graceful switch. If specified lines
with unknown users/groups will silently be skipped.
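The uppercase "X" rule for tmpfiles.d/ ACLs described above can be illustrated with a small Python sketch. It models only what the entry states, namely that "X" acts as "x" when the inode already has an executable bit set for some user or group, and is dropped otherwise. Any other cases (such as directories) are not modelled, and the function names are hypothetical.

```python
import stat


def conditional_execute(mode: int) -> str:
    """Return 'x' if the inode mode already has any executable bit set,
    otherwise an empty string (the execute right is not granted)."""
    any_exec = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    return "x" if mode & any_exec else ""


def expand_acl_perms(perms: str, mode: int) -> str:
    """Expand an ACL permission string such as 'rwX' for a given inode mode."""
    return perms.replace("X", conditional_execute(mode))


if __name__ == "__main__":
    print(expand_acl_perms("rwX", 0o755))
    print(expand_acl_perms("rwX", 0o644))
```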
systemd-notify:
* systemd-notify gained two new options --fd= and --fdname= for sending
arbitrary file descriptors to the service manager (while specifying an
explicit name for it).
* systemd-notify gained a new --exec switch, which makes it execute the
specified command line after sending the requested messages. This is
useful for sending out READY=1 first, and then continuing invocation
without changing process ID, so that the tool can be nicely used
within an ExecStart= line of a unit file that uses Type=notify.
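At the protocol level, passing a named file descriptor to the service manager amounts to sending a notification datagram with the descriptor attached as SCM_RIGHTS ancillary data. The Python sketch below assumes the FDSTORE=1 and FDNAME= fields of the sd_notify() protocol. It takes an already connected AF_UNIX datagram socket instead of resolving $NOTIFY_SOCKET itself, and the function names are hypothetical.

```python
import array
import socket


def fdstore_message(name: str) -> bytes:
    """Build the notification payload for storing one named descriptor."""
    return f"FDSTORE=1\nFDNAME={name}".encode()


def send_fd_to_store(sock: socket.socket, fd: int, name: str) -> None:
    """Send `fd` along with the FDSTORE message over a connected
    AF_UNIX datagram socket, using SCM_RIGHTS ancillary data."""
    ancillary = [
        (socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd])),
    ]
    sock.sendmsg([fdstore_message(name)], ancillary)
```

In a real service the socket would be connected to the address found in $NOTIFY_SOCKET; a socket pair is enough to try the sketch locally.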
sd-event + sd-bus APIs:
* The sd-event API gained a new call sd_event_source_leave_ratelimit()
which may be used to explicitly end a rate-limit state an event
source might be in, resetting all rate limiting counters.
* When the sd-bus library is used to make connections to AF_UNIX D-Bus
sockets, it will now encode the "description" one can set via
sd_bus_set_description into the source socket address. It will also
look for this information when accepting a connection. This is useful
to track individual D-Bus connections on a D-Bus broker for debug
purposes.
systemd-resolved:
* systemd-resolved gained a new resolved.conf setting
StaleRetentionSec= which may be used to retain cached DNS records
even after their nominal TTL, and use them in case upstream DNS
servers cannot be reached. This should make name resolution more
resilient in case of network problems.
* resolvectl gained a new verb "show-cache" for showing current cache
contents of systemd-resolved.
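The retention behaviour described for systemd-resolved above can be pictured with a toy cache. The Python sketch below is only a model of the idea (serve an expired record for a limited time while upstream servers are unreachable) and is not how systemd-resolved is implemented. The class and parameter names are hypothetical.

```python
class StaleCache:
    """Toy DNS cache that may serve expired records for a limited time."""

    def __init__(self, retention_sec: float):
        self.retention_sec = retention_sec
        self.entries = {}  # name -> (address, expiry timestamp)

    def put(self, name: str, address: str, ttl: float, now: float) -> None:
        self.entries[name] = (address, now + ttl)

    def lookup(self, name: str, now: float, upstream_reachable: bool):
        entry = self.entries.get(name)
        if entry is None:
            return None
        address, expiry = entry
        if now < expiry:
            return address  # still within its nominal TTL
        if not upstream_reachable and now < expiry + self.retention_sec:
            return address  # expired, but retained as a fallback
        return None  # expired and either refreshable or too old
```

For example, a record stored with a TTL of 30 seconds and a retention of 60 seconds is served normally for 30 seconds, and for another 60 seconds only if the upstream servers cannot be reached.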
Other:
* The default keymap to apply may now be chosen at build-time via the
new default-keymap meson option.
* Most of systemd's long-running services now have a generic handler
for the SIGRTMIN+18 signal, which executes various operations
depending on the sigqueue() parameter sent along. For example, values
0x100…0x107 allow changing the maximum log level of such
services. 0x200…0x203 allow changing the log target of such
services. 0x300 makes the services trim their memory similar to the
automatic PSI triggered action, see above. 0x301 makes the services
output their malloc_info() data to the logs.
* machinectl gained new "edit" and "cat" verbs for editing .nspawn
files, inspired by systemctl's verbs of the same name which edit unit
files. Similarly, networkctl gained the same verbs for editing
.network, .netdev, .link files.
* A new syscall filter group "@sandbox" has been added that contains
system calls used for sandboxing, such as those for seccomp and
Landlock.
* New documentation has been added:
https://systemd.io/COREDUMP
https://systemd.io/MEMORY_PRESSURE
* systemd-firstboot gained a new --reset option. If specified the
settings in /etc/ it normally initializes are reset instead.
* systemd-sysext is now a multi-call binary and also installed under the
systemd-confext alias name (via a symlink). When invoked that way it
will operate on /etc/ instead of /usr/ + /opt/. It thus becomes a
powerful, atomic, secure configuration management mechanism of sorts, that
locally can merge configuration from multiple confext configuration
images into a single immutable tree.
* The --network-macvlan=, --network-ipvlan=, --network-interface=
switches of systemd-nspawn may now optionally take the intended
network interface inside the container.
* All our programs will now send an sd_notify() message with their exit
status in the EXIT_STATUS= field when exiting, using the usual
protocol, including PID 1. This is useful for VMMs and container
managers to collect an exit status from a system as it shuts down, as
set via "systemctl exit …". This is particularly useful in test cases
and similar, as invocations via a VM can now nicely propagate an exit
status to the host, similar to local processes.
* systemd-run gained a new switch --expand-environment=no to disable
server-side environment variable expansion in specified command
lines.
* The systemd-system-update-generator has been updated to also look for
the special flag file /etc/system-update in addition to the existing
support for /system-update to decide whether to enter system update
mode.
* The /dev/hugepages/ file system is now mounted with nosuid + nodev
mount options by default.
* systemd-fstab-generator now understands two new kernel command line
options systemd.mount-extra= and systemd.swap-extra= which may be
used to configure additional mounts or swaps via the kernel command
line, in a format similar to /etc/fstab lines.
* systemd-sysupdate's sysupdate.d/ drop-ins gained a new setting
PathRelativeTo=, which can be set to "esp", "xbootldr", "boot", in
which case the Path= setting is taken relative to the ESP or XBOOTLDR
partitions, rather than the system's root directory /. The relevant
directories are automatically discovered.
* The systemd-ac-power tool gained a new switch --low, which reports
whether the battery charge is considered "low", similar to how the
s2h suspend logic checks this state to decide whether to enter system
suspend or hibernation.
* The /etc/os-release file now has two new optional fields VENDOR_NAME=
and VENDOR_URL= carrying information about the vendor of the OS.
* When the system hibernates information about the used device and
offset is now written to a non-volatile EFI variable. On next boot
the system will attempt to resume from the location indicated in this
EFI variable. This should make hibernation a lot more robust, and
require no manual configuration of the resume location.
* The $XDG_STATE_HOME environment variable (added in more recent
versions of the XDG basedir specification) is now honoured to
implement the StateDirectory= setting in user services.
* A new component "systemd-battery-check" has been added. It may run
during early boot (usually in the initrd), and checks the battery
charge level of the system. In case the charge level is very low the
user is notified (graphically via Plymouth – if available – as well
as in text form on the console), and the system is turned off after a
10s delay.
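As a rough illustration of the $XDG_STATE_HOME item above, the following
Python sketch resolves where a StateDirectory= name might land for a user
service. It assumes the XDG basedir fallback of ~/.local/state when the
variable is unset or empty. The helper name user_state_directory() is
hypothetical and not part of systemd.

```python
import os
from pathlib import Path


def user_state_directory(name, env=None):
    """Return the path a user unit's StateDirectory=<name> would map to.

    Illustrative only: this mirrors the XDG basedir lookup, not systemd's code.
    """
    env = os.environ if env is None else env
    base = env.get("XDG_STATE_HOME")
    if not base:
        # XDG basedir fallback when the variable is unset or empty.
        home = env.get("HOME") or str(Path.home())
        base = os.path.join(home, ".local", "state")
    return os.path.join(base, name)
```

Passing an explicit env mapping keeps the lookup easy to exercise without
touching the real environment.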
CHANGES WITH 253:
Announcements of Future Feature Removals and Incompatible Changes:
* We intend to remove cgroup v1 support from a systemd release after the
end of 2023. If you run services that make explicit use of cgroup v1
features (i.e. the "legacy hierarchy" with separate hierarchies for
each controller), please implement compatibility with cgroup v2 (i.e.
the "unified hierarchy") sooner rather than later. Most of Linux
userspace has been ported over already.
* We intend to remove support for split-usr (/usr mounted separately
during boot) and unmerged-usr (parallel directories /bin and
/usr/bin, /lib and /usr/lib, etc). This will happen in the second
half of 2023, in the first release that falls into that time window.
For more details, see:
https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html
* We intend to change behaviour w.r.t. units of the per-user service
manager and sandboxing options, so that they work without having to
manually enable PrivateUsers= as well, which is not required for
system units. To make this work, we will implicitly enable user
namespaces (PrivateUsers=yes) when a sandboxing option is enabled in a
user unit. The drawback is that system users will no longer be visible
(and appear as 'nobody') to the user unit when a sandboxing option is
enabled. By definition a sandboxed user unit should run with reduced
privileges, so impact should be small. This will remove a great source
of confusion that has been reported by users over the years, due to
how these options require an extra setting to be manually enabled when
used in the per-user service manager, as opposed to the system
service manager. We plan to enable this change in the next release
later this year. For more details, see:
https://lists.freedesktop.org/archives/systemd-devel/2022-December/048682.html
Deprecations and incompatible changes:
* systemctl will now warn when invoked without /proc/ mounted
(e.g. when invoked after chroot() into a directory tree without the
API mount points like /proc/ being set up). Operation in such an
environment is not fully supported.
* The return value of 'systemctl is-active|is-enabled|is-failed' for
unknown units is changed: previously 1 or 3 were returned, but now 4
(EXIT_PROGRAM_OR_SERVICES_STATUS_UNKNOWN) is used as documented.
* The 'udevadm hwdb' subcommand is deprecated and will emit a warning.
systemd-hwdb (added in 2014) should be used instead.
* 'bootctl --json' now outputs a single JSON array, instead of a stream
of newline-separated JSON objects.
* Udev rules in 60-evdev.rules have been changed to load hwdb
properties for all modalias patterns. Previously only the first
matching pattern was used. This could change what properties are
assigned if the user has more and less specific patterns that could
match the same device, but it is expected that the change will have
no effect for most users.
* systemd-networkd-wait-online now exits successfully when all interfaces
are ready or unmanaged. Previously, if neither '--any' nor
'--interface=' options were used, at least one interface had to be in
configured state. This change allows the case where systemd-networkd
is enabled, but no interfaces are configured, to be handled
gracefully. It may occur in particular when a different network
manager is also enabled and used.
* Some compatibility helpers were dropped: EmergencyAction= in the user
manager, as well as measuring kernel command line into PCR 8 in
systemd-stub, along with the -Defi-tpm-pcr-compat compile-time
option.
* The '-Dupdate-helper-user-timeout=' build-time option has been
renamed to '-Dupdate-helper-user-timeout-sec=', and now takes an
integer as parameter instead of a string.
* The DDI image dissection logic (which backs RootImage= in service
unit files, the --image= switch in various tools such as
systemd-nspawn, as well as systemd-dissect) will now only mount file
systems of types btrfs, ext4, xfs, erofs, squashfs, vfat. This list
can be overridden via the $SYSTEMD_DISSECT_FILE_SYSTEMS environment
variable. These file systems are fairly well supported and maintained
in current kernels, while others are usually more niche, exotic or
legacy and thus typically do not receive the same level of security
support and fixes.
* The default per-link multicast DNS mode has been changed from "no" to
"yes". Since the default global multicast DNS mode is already "yes"
(unless changed via the build option), multicast DNS is now enabled on
all links by default. You can disable multicast DNS on all links by
setting MulticastDNS=no in resolved.conf, or on a single interface by
calling "resolvectl mdns INTERFACE no".
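For scripts that branch on the exit status of 'systemctl is-active', the
changed value for unknown units noted above can be handled as in the
following Python sketch. The helper name classify_is_active() and its
labels are hypothetical. Only the values 0 and 4 are given a specific
meaning here, and every other status is lumped together.

```python
# Value named in the entry above for units systemd does not know about.
EXIT_PROGRAM_OR_SERVICES_STATUS_UNKNOWN = 4


def classify_is_active(returncode):
    """Map an exit status of 'systemctl is-active UNIT' to a coarse label."""
    if returncode == 0:
        return "active"
    if returncode == EXIT_PROGRAM_OR_SERVICES_STATUS_UNKNOWN:
        return "unknown-unit"
    # Any other non-zero status is treated here simply as "not active".
    return "not-active"
```

A caller might pass in the returncode attribute of
subprocess.run(["systemctl", "is-active", unit]).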
New components:
* A new tool 'ukify' to build, measure, and sign Unified Kernel Images
(UKIs) has been added. This replaces functionality provided by
'dracut --uefi' and extends it with automatic calculation of PE file
offsets, insertion of signed PCR policies generated by
systemd-measure, support for initrd concatenation, signing of the
embedded Linux image and the combined image with sbsign, and
heuristics to autodetect the kernel uname and verify the splash
image.
Changes in systemd and units:
* A new service type Type=notify-reload is defined. When such a unit is
reloaded a UNIX process signal (typically SIGHUP) is sent to the main
service process. The manager will then wait until it receives a
"RELOADING=1" followed by a "READY=1" notification from the unit as
response (via sd_notify()). Otherwise, this type is the same as
Type=notify. A new setting ReloadSignal= may be used to change the
signal to send from the default of SIGHUP.
user@.service, systemd-networkd.service, systemd-udevd.service, and
systemd-logind have been updated to this type.
* Initrd environments which are not on a pure memory file system (e.g.
overlayfs combination as opposed to tmpfs) are now supported. With
this change, during the initrd → host transition ("switch root")
systemd will erase all files of the initrd only when the initrd is
backed by a memory file system such as tmpfs.
* A new per-unit MemoryZSwapMax= option has been added to configure the
memory.zswap.max cgroup property (the maximum amount of zswap
used).
* A new LogFilterPatterns= option has been added for units. It may be
used to specify accept/deny regular expressions for log messages
generated by the unit, that shall be enforced by systemd-journald.
Rejected messages are neither stored in the journal nor forwarded.
This option may be used to suppress noisy or uninteresting messages
from units.
* The manager has a new
org.freedesktop.systemd1.Manager.GetUnitByPIDFD() D-Bus method to
query process ownership via a PIDFD, which is more resilient against
PID recycling issues.
* Scope units now support OOMPolicy=. Login session scopes default to
OOMPolicy=continue, allowing login scopes to survive the OOM killer
terminating some processes in the scope.
* systemd-fstab-generator now supports the x-systemd.makefs option for
/sysroot/ (in the initrd).
* The maximum rate at which daemon reloads are executed can now be
limited with the new ReloadLimitIntervalSec=/ReloadLimitBurst=
options. (Or the equivalent on the kernel command line:
systemd.reload_limit_interval_sec=/systemd.reload_limit_burst=). In
addition, systemd now logs the originating unit and PID when a reload
request is received over D-Bus.
* When enabling a swap device systemd will now reinitialize the device
when the page size of the swap space does not match the page size of
the running kernel. Note that this requires the 'swapon' utility to
provide the '--fixpgsz' option, as implemented by util-linux, and it
is not supported by busybox at the time of writing.
* systemd now executes generator programs in a mount namespace
"sandbox" with most of the file system read-only and write access
restricted to the output directories, and with a temporary /tmp/
mount provided. This provides a safeguard against programming errors
in the generators, but also fixes here-docs in shells, which
previously didn't work in early boot when /tmp/ wasn't available
yet. (This feature has no security implications, because the code is
still privileged and can trivially exit the sandbox.)
* The system manager will now parse a new "vmm.notify_socket"
system credential, which may be supplied to a VM via SMBIOS. If
found, the manager will send a "READY=1" notification on the
specified socket after boot is complete. This allows readiness
notification to be sent from a VM guest to the VM host over a VSOCK
socket.
* The sample PAM configuration file for systemd-user@.service now
includes a call to pam_namespace. This puts children of user@.service
in the expected namespace. (Many distributions replace their file
with something custom, so this change has limited effect.)
* A new environment variable $SYSTEMD_DEFAULT_MOUNT_RATE_LIMIT_BURST
can be used to override the mount units burst rate limit for
parsing '/proc/self/mountinfo', which was introduced in v249.
Defaults to 5.
* Drop-ins for init.scope changing control group resource limits are
now applied, while they were previously ignored.
* New build-time configuration options '-Ddefault-timeout-sec=' and
'-Ddefault-user-timeout-sec=' have been added, to let distributions
choose the default timeout for starting/stopping/aborting system and
user units respectively.
* Service units gained a new setting OpenFile= which may be used to
open arbitrary files in the file system (or connect to arbitrary
AF_UNIX sockets in the file system), and pass the open file
descriptor to the invoked process via the usual file descriptor
passing protocol. This is useful to give unprivileged services access
to select files which have restrictive access modes that would
normally not allow this. It's also useful in case RootDirectory= or
RootImage= is used to allow access to files from the host environment
(which is after all not visible from the service if these two options
are used.)
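A service that receives descriptors through OpenFile= could look them up
roughly as in the Python sketch below. It assumes that the "usual file
descriptor passing protocol" is the $LISTEN_PID / $LISTEN_FDS /
$LISTEN_FDNAMES convention documented for sd_listen_fds(3), with passed
descriptors starting at 3. The helper name passed_fds() and the example
names in the usage note are hypothetical.

```python
import os

SD_LISTEN_FDS_START = 3  # first passed descriptor in the sd_listen_fds() convention


def passed_fds(env=None, pid=None):
    """Return a list of (name, fd) pairs for descriptors passed to this process."""
    env = os.environ if env is None else env
    pid = os.getpid() if pid is None else pid
    # The variables only apply to the process they were set for.
    if env.get("LISTEN_PID") != str(pid):
        return []
    count = int(env.get("LISTEN_FDS", "0"))
    raw_names = env.get("LISTEN_FDNAMES", "")
    names = raw_names.split(":") if raw_names else []
    pairs = []
    for i in range(count):
        # Descriptors without a name are reported as "unknown" in this sketch.
        name = names[i] if i < len(names) else "unknown"
        pairs.append((name, SD_LISTEN_FDS_START + i))
    return pairs
```

With LISTEN_FDS=2 and LISTEN_FDNAMES=secret.key:control.sock set for the
current process, the function would return the two names paired with
descriptors 3 and 4.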
Changes in udev:
* The new net naming scheme "v253" has been introduced. In the new
scheme, ID_NET_NAME_PATH is also set for USB devices not connected via
a PCI bus. This extends the coverage of predictable interface names
in some embedded systems.
The "amba" bus path is now included in ID_NET_NAME_PATH, resulting in
a more informative path on some embedded systems.
* Partition block devices will now also get symlinks in
/dev/disk/by-diskseq/<seq>-part<n>, which may be used to reference
block device nodes via the kernel's "diskseq" value. Previously those
symlinks were only created for the main block device.
* A new operator '-=' is supported for SYMLINK variables. This allows
symlinks to be unconfigured even if an earlier rule added them.
* 'udevadm trigger --settle' now also works for network devices
that are being renamed.
Changes in sd-boot, bootctl, and the Boot Loader Specification:
* systemd-boot now passes its random seed directly to the kernel's RNG
via the LINUX_EFI_RANDOM_SEED_TABLE_GUID configuration table, which
means the RNG gets seeded very early in boot before userspace has
started.
* systemd-boot will pass a disk-backed random seed – even when secure
boot is enabled – if it can additionally get a random seed from EFI
itself (via EFI's RNG protocol), or a prior seed in
LINUX_EFI_RANDOM_SEED_TABLE_GUID from a preceding bootloader.
* systemd-boot-system-token.service was renamed to
systemd-boot-random-seed.service and extended to always save a random
seed to ESP on every boot when a compatible boot loader is used. This
allows a refreshed random seed to be used in the boot loader.
* systemd-boot handles various seed inputs using a domain- and
field-separated hashing scheme.
* systemd-boot's 'random-seed-mode' option has been removed. A system
token is now always required to be present for random seeds to be
used.
* systemd-boot now supports being loaded from other locations than the
ESP, for example for direct kernel boot under QEMU or when embedded
into the firmware.
* systemd-boot now parses SMBIOS information to detect
virtualization. This information is used to skip some warnings which
are not useful in a VM and to conditionalize other aspects of
behaviour.
* systemd-boot now supports a new 'if-safe' mode that will perform UEFI
Secure Boot automated certificate enrollment from the ESP only if it
is considered 'safe' to do so. At the moment 'safe' means running in
a virtual machine.
* systemd-stub now processes random seeds in the same way as
systemd-boot already does, in case a unified kernel image is being
used from a different bootloader than systemd-boot, or without any
boot loader at all.
* bootctl will now generate a system token on all EFI systems, even
virtualized ones, and is activated in the case that the system token
is missing from either sd-boot or sd-stub booted systems.
* bootctl now implements two new verbs: 'kernel-identify' prints the
type of a kernel image file, and 'kernel-inspect' provides
information about the embedded command line and kernel version of
UKIs.
* bootctl now honours $KERNEL_INSTALL_CONF_ROOT with the same meaning
as for kernel-install.
* The JSON output of "bootctl list" will now contain two more fields:
isDefault and isSelected are boolean fields set to true on the
default and currently booted boot menu entries.
* bootctl gained a new verb "unlink" for removing a boot loader entry
type #1 file from disk in a safe and robust way.
* bootctl also gained a new verb "cleanup" that automatically removes
all files from the ESP's and XBOOTLDR's "entry-token" directory that
are no longer referenced by any installed Type #1 boot loader
specification entry. This is particularly useful in environments where
a large number of entries reference the same or partly the same
resources (for example, for snapshot-based setups).
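The isDefault and isSelected fields of the "bootctl list" JSON output
mentioned above can be consumed as in the following Python sketch. It
assumes the output is a single JSON array of entry objects. The "id" key
and the sample entry names in the usage note are assumptions made for
illustration, and summarize_entries() is a hypothetical helper.

```python
import json


def summarize_entries(bootctl_json):
    """Return (default_ids, selected_ids) from a JSON array of boot entries."""
    entries = json.loads(bootctl_json)
    if not isinstance(entries, list):
        raise ValueError("expected a single JSON array of entries")
    # "id" is an assumed field name used only to label entries in this sketch.
    default_ids = [e.get("id") for e in entries if e.get("isDefault")]
    selected_ids = [e.get("id") for e in entries if e.get("isSelected")]
    return default_ids, selected_ids
```

Given an array with entries "a.conf" (isDefault true) and "b.conf"
(isSelected true), the function returns (["a.conf"], ["b.conf"]).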
Changes in kernel-install:
* A new "installation layout" can be configured as layout=uki. With
this setting, a Boot Loader Specification Type#1 entry will not be
created. Instead, a new kernel-install plugin 90-uki-copy.install
will copy any .efi files from the staging area into the boot
partition. A plugin to generate the UKI .efi file must be provided
separately.
Changes in systemctl:
* 'systemctl reboot' has dropped support for accepting a positional
argument as the argument to the reboot(2) syscall. Please use the
--reboot-argument= option instead.
* 'systemctl disable' will now warn when called on units without
install information. A new --no-warn option has been added that
silences this warning.
* New option '--drop-in=' can be used to tell 'systemctl edit' the name
of the drop-in to edit. (Previously, 'override.conf' was always
used.)
* 'systemctl list-dependencies' now respects --type= and --state=.
* 'systemctl kexec' now supports XEN VMM environments.
* 'systemctl edit' will now tell the invoked editor to jump into the
first line with actual unit file data, skipping over synthesized
comments.
Changes in systemd-networkd and related tools:
* The [DHCPv4] section in .network files gained a new SocketPriority=
setting that assigns the Linux socket priority used by the DHCPv4 raw
socket. This may be used in conjunction with the
EgressQOSMaps= setting in the [VLAN] section of .netdev files to send
the desired ethernet 802.1Q frame priority for DHCPv4 initial
packets. This cannot be achieved with netfilter mangle tables because
of the raw socket bypass.
* The [DHCPv4] and [IPv6AcceptRA] sections in .network files gained a
new QuickAck= boolean setting that enables the TCP quick ACK mode for
the routes configured by the acquired DHCPv4 lease or received router
advertisements (RAs).
* The RouteMetric= option (for DHCPv4, DHCPv6, and IPv6 advertised
routes) now accepts three values, for high, medium, and low preference
of the router (which can be set with the RouterPreference= setting).
* systemd-networkd-wait-online now supports matching via alternative
interface names.
* The [DHCPv6] section in .network files gained a new SendRelease=
setting which enables the DHCPv6 client to send a release message when
it stops. This is the analog of the [DHCPv4] SendRelease= setting.
It is enabled by default.
* If the Address= setting in the [Network] or [Address] section of a
.network file is specified without a prefix length, systemd-networkd
now assumes /32 for IPv4 or /128 for IPv6 addresses.
* networkctl shows network and link file dropins in status output.
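The prefix-length default described above for Address= can be mirrored in
a few lines of Python, for instance when validating .network files in a
test suite. This is only a sketch: the helper name with_default_prefix()
is hypothetical, and it relies on ipaddress.ip_interface() happening to
apply the same host-prefix default.

```python
import ipaddress


def with_default_prefix(address):
    """Return ADDRESS/PREFIX, filling in /32 (IPv4) or /128 (IPv6) if omitted."""
    # ip_interface() assumes a full-length prefix when none is given.
    return str(ipaddress.ip_interface(address))
```

For example, "192.0.2.10" becomes "192.0.2.10/32", while an address that
already carries a prefix such as "192.0.2.10/24" is returned unchanged.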
Changes in systemd-dissect:
* systemd-dissect gained a new option --list, to print the paths of
all files and directories in a DDI.
* systemd-dissect gained a new option --mtree, to generate a file
manifest compatible with BSD mtree(5) of a DDI.
* systemd-dissect gained a new option --with, to execute a command with
the specified DDI temporarily mounted and used as working
directory. This is for example useful to convert a DDI to "tar"
simply by running it within a "systemd-dissect --with" invocation.
* systemd-dissect gained a new option --discover, to search for
Discoverable Disk Images (DDIs) in well-known directories of the
system. This will list machine, portable service and system extension
disk images.
* systemd-dissect now understands 2nd stage initrd images stored as a
Discoverable Disk Image (DDI).
* systemd-dissect will now display the main UUID of GPT DDIs (i.e. the
disk UUID stored in the GPT header) among the other data it can show.
* systemd-dissect gained a new --in-memory switch to operate on an
in-memory copy of the specified DDI file. This is useful to access a
DDI with write access without persisting any changes. It's also
useful for accessing a DDI without keeping the originating file
system busy.
* The DDI dissection logic will now automatically detect the intended
sector size of disk images stored in files, based on the GPT
partition table arrangement. Loopback block devices for such DDIs
will then be configured automatically for the right sector size. This
is useful to make dealing with modern 4K sector size DDIs fully
automatic. The systemd-dissect tool will now show the detected sector
size among the other DDI information in its output.
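One way sector size detection from the GPT arrangement could work is
sketched below in Python. It is not systemd's implementation. It assumes
only that the primary GPT header sits in LBA 1 (so its byte offset equals
the sector size) and begins with the "EFI PART" signature. The function
name and the candidate list are choices made for this sketch, and a real
implementation would validate further header fields.

```python
GPT_SIGNATURE = b"EFI PART"


def probe_sector_size(image, candidates=(512, 1024, 2048, 4096)):
    """Guess the sector size of a GPT disk image from where its header sits.

    Returns the single matching candidate, or None if no candidate or more
    than one candidate has the signature at its LBA 1 offset.
    """
    matches = [
        size
        for size in candidates
        if image[size:size + len(GPT_SIGNATURE)] == GPT_SIGNATURE
    ]
    return matches[0] if len(matches) == 1 else None
```

The image argument is the leading bytes of the disk image file, at least
4096 + 8 bytes long for the largest candidate to be checked.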
Changes in systemd-repart:
* systemd-repart gained new options --include-partitions= and
--exclude-partitions= to filter operation on partitions by type UUID.
This allows systemd-repart to be used to build images in which the
type of one partition is set based on the contents of another