Replies: 3 comments
-
With an admittedly limited knowledge of the full implications, on the surface of it I would guess that following the path of the other rootless OCI runtimes would make sense. While it's less consistent with SingularityCE's existing behaviour, it's more consistent with the rest of the OCI space. In the event that someone wants the traditional CE behaviour with an OCI image, could they simply run the OCI image in non-OCI mode with 4.0? If so, that would seemingly cover both bases?
-
Thanks @tri-adam
I think I'm leaning towards this, slightly, too.
Yes, the non-OCI runtime certainly isn't disappearing in 4.0, so the old behavior will still be easily available.
-
As this has now been implemented (following the OCI runtime approach) in a series of PRs, I'll close the thread. Thanks again.
-
SingularityCE, using the existing native runtime, has the following default capabilities handling:
User - no capabilities in the container
User + --fakeroot - full capabilities for the fakeroot user inside the user namespace
Root - depends on singularity.conf
This differs quite a bit from OCI rootless runtimes:
podman, rootless userns as user in the container - specific set of bounding caps
podman, rootless userns as root in the container - specific set of current/bounding caps
podman, rootful - specific set of current/bounding caps
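When comparing runtimes like this, it can help to look at the capability sets a process actually ends up with inside the container. The following is a minimal sketch, not anything taken from SingularityCE or podman: it decodes the hexadecimal CapInh/CapPrm/CapEff/CapBnd/CapAmb masks that Linux exposes in /proc/<pid>/status into capability names. The helper names `decode_caps` and `read_cap_sets` are made up for illustration, and the bit-to-name table assumes the numbering in the kernel's capability.h up to CAP_CHECKPOINT_RESTORE.

```python
# Capability names indexed by bit number (assumed to follow
# linux/capability.h, bits 0..40).
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID", "CAP_SETPCAP",
    "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
    "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
    "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
    "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
    "CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
    "CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN", "CAP_SYSLOG", "CAP_WAKE_ALARM",
    "CAP_BLOCK_SUSPEND", "CAP_AUDIT_READ", "CAP_PERFMON", "CAP_BPF",
    "CAP_CHECKPOINT_RESTORE",
]

CAP_FIELDS = ("CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb")


def decode_caps(mask_hex):
    """Turn a hex mask such as '00000000800405fb' into a list of names.

    Bits beyond the end of CAP_NAMES are ignored.
    """
    mask = int(mask_hex, 16)
    return [name for bit, name in enumerate(CAP_NAMES) if (mask >> bit) & 1]


def read_cap_sets(status_path="/proc/self/status"):
    """Read and decode the capability lines of a /proc status file (Linux only)."""
    sets = {}
    with open(status_path) as fh:
        for line in fh:
            key, _, value = line.partition(":")
            if key in CAP_FIELDS:
                sets[key] = decode_caps(value.strip())
    return sets


if __name__ == "__main__":
    # Run inside each container to compare what the runtimes hand out.
    for field, names in read_cap_sets().items():
        print(field, ",".join(names) or "(empty)")
```

Running the script inside a native-mode container, a --fakeroot container and a rootless podman container, and comparing the printed CapEff and CapBnd lines, gives a quick view of the differences listed above.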
The question, then, is what behaviour makes sense for SingularityCE's new --oci mode?
At present (#1587) no capability sets are wired up. The current, bounding, and ambient sets are always empty in --oci mode.
I believe there are a number of options:
1. Following SingularityCE native behaviour
--fakeroot container: all capabilities are available in the user namespace, vs a limited set of caps.
Root: singularity.conf default root capabilities.
.The consistency gained from following SingularityCE's existing behaviour would be beneficial in a world where we can expect mixed use of the native and oci runtimes for some time.
The main argument against is that rootful containers are more dangerous, and we have to maintain a clear expectation that a rootful container in SingularityCE's --oci mode is not intended to give any kind of security boundary. I don't think we have the resources to focus on making --oci mode safe in rootful operation anyway. There are many other aspects of --oci mode which would have to be addressed, including application of seccomp filtering by default, full review of mount and image handling, etc.
--fakeroot containers run as a user would be more powerful than when following OCI caps. There is an argument that the large set of fakeroot caps here opens up additional risk, albeit limited to what the user can do anyway in a host shell. It's fairly equivalent to running e.g. rootless podman with --privileged.
2. Follow podman/OCI style in --oci mode only
The set of capabilities granted in --fakeroot mode is reduced.
This option would provide consistency with OCI rootless runtimes. It is likely that some workflows in a user namespace may be prevented, mostly container-in-container flows where --privileged is required with rootless Docker/podman etc. We'd have to provide a way to open things up. Note that we have a --keep-privs flag with the native mode at the moment, but it's currently effective for certain rootful circumstances only.
In rootful --oci containers, operations would be more restricted. However, as above, rootful containers would not be 'safe' without a huge amount of additional work that I don't believe we can accomplish for 4.0.
3. Do something in between for --oci mode only
I'm not sure this would make sense. Opening up the --oci mode behaviour a little bit vs podman could ease some workflows, but we are then not consistent with the Singularity native mode, or stock OCI rootless runtimes.
4. Change native mode behaviour, --oci based on podman/OCI rootless runtimes
We could implement capabilities in --oci mode following the podman / rootless OCI patterns, and then move native mode closer to the same behaviour. This would bring internal consistency for 4.0, but make it inconsistent with prior releases.
I don't think we want to do this. In particular, any container-in-container workflow using --fakeroot in the outer layer could be affected.
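For reference, the podman / rootless OCI pattern mentioned in options 2 and 4 could be expressed roughly as follows. This is only a sketch under stated assumptions, not SingularityCE code: `oci_capabilities` is a hypothetical helper, `DEFAULT_CAPS` is an assumed default set modelled on the list recent podman releases ship in containers.conf, and leaving the inheritable and ambient sets empty is likewise an assumption. The output uses the process.capabilities field names from the OCI runtime spec.

```python
import json

# Assumed default set (hypothetical for this sketch); the set a real
# runtime applies is configurable and may differ between versions.
DEFAULT_CAPS = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_FOWNER", "CAP_FSETID", "CAP_KILL",
    "CAP_NET_BIND_SERVICE", "CAP_SETFCAP", "CAP_SETGID", "CAP_SETPCAP",
    "CAP_SETUID", "CAP_SYS_CHROOT",
]


def oci_capabilities(container_uid, default_caps=DEFAULT_CAPS):
    """Build a process.capabilities object for an OCI runtime config.

    Root in the container (uid 0) gets the default set as both current
    (effective/permitted) and bounding caps. Any other user only gets
    the bounding set, with nothing effective or permitted.
    """
    caps = sorted(default_caps)
    active = caps if container_uid == 0 else []
    return {
        "bounding": list(caps),
        "effective": list(active),
        "permitted": list(active),
        "inheritable": [],  # assumption: left empty
        "ambient": [],      # assumption: left empty
    }


if __name__ == "__main__":
    # Fragment of a config.json for a non-root user in the container.
    fragment = {"process": {"capabilities": oci_capabilities(1000)}}
    print(json.dumps(fragment, indent=2))
```

With this shape, a way to "open things up" for container-in-container flows would amount to passing a larger `default_caps` list, which keeps the policy in one place.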