-
Notifications
You must be signed in to change notification settings - Fork 582
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
snap: minimal read-only squashfs library to read squashfs images #11170
Conversation
… squashfs image. currently can read metadata blocks, decompress xz, and almost list files from directory
…flags in the superblock
Hi, thanks for this, it looks nice. I have a couple of questions which hopefully can help answer some of your questions :-)
|
Hi Ian, thanks for your reply!
|
I have only briefly looked at the code. The only 2 instances we actually look inside is early checks during install, and later when unpacking the kernel contents. Perhaps it makes sense to support reading squahfs to handle untrusted input, but there's always a question of dependencies for the compression which must be present in Fedora and Debian (the only 2 distros that require non-vendored dependencies atm). There's also https://github.com/diskfs/go-diskfs which I think can handle squahfs and was used by lxd. However, perhaps the solution lies elsewhere. Right now we'll run unsquashfs directly from snapd so it runs as root. What if we put it in a confined sandbox and communicate over fds? We could in theory invoke it through systemd-run, lock down access to everything, set up seccomp filtering and maybe even slap an apparmor profile on top. I feel this could be easier to achieve and more flexible in the long run. |
@bboozzoo @anonymouse64 Sorry for the confusion here. So far we have not needed this and we are very careful with squashfs and only (generally) touch it if it's validated. However with the new authorization delegation work that Samuele is driving this changes and we will need to also look into squashfs that we are not fully sure about yet (Samuele has the details). This is why this work on having a implementation to look into squashfs in a safe language was started. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I have not done a thorough review, but have some scattered comments. Also, I think you need to follow the usual conventions for commits (one line summary, empty line, then more detailed explanation), although I understand that maybe you will rebase/refactor this.
I also worry a bit about maintenance and performance of https://github.com/ulikunitz/xz - the maintainer explicitly says that the lib is under development. I understand we want to avoid C libraries in general, but maybe it is better in this case to use a heavily used, well maintained and probably heavily scrutinized by security researchers (as noted by @valentindavid) library like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this. This is great, I have a bunch of bike-shed/comments/suggestions inline for you consideration. I did not review everyting in detail but instead read over it to and just wrote down what I noticed (a bit stream of conscious like). Hope it's still helpful :)
// reading from it immediately. | ||
err = sfs.loadRootDirectory() | ||
if err != nil { | ||
sfs.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need error handling for "Close()" ? I.e. does it return an error, should it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes definitely, currently it just closes the stream which I'm afraid will coincide with the deferred close of the stream? Would that be an issue? But you are certainly correct here that I need to check the error
return nil, err | ||
} | ||
} | ||
return nil, fmt.Errorf("squashfs: %s not found", path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should have something like a custom error type, e.g. squashfs.NotExistsError for this case but that can be a followup of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really good point! I'll keep this open because this would be really nice to have.
Restructured code, and cleaned up a lot in names. Isolated functionality like the superblock parsing. Handle missing errors in the code. Handle uncompressed inodes in superblock flags
Moved some consts around where they belong instead of having them in a file where they werent used. Also updated naming on some of the squashfs structs.
…port Removed usage of the XZ go library, we now interface directly against liblzma, also implemented support for lzo backend by using liblzo.
…ation Based on review feedback, also restructured the code that parses inodes and added error handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all your changes, this is looking very nice. Please see some additional comments.
From a quick look at the code (in the Long story short, this appears to be called directly from the snapd executable. But then I'm very concerned by these changes (not by the code itself, which is fine, but the overall decision of doing this in this way): it seems that we are going to unpack the squashfs right from within the snapd process, and this can be a source of unpredictable issues. And of a few predictable ones:
Summing up, if what we are going to do is to replace the invocation of a potentially unsafe unsquashfs as a separate process with a potentially safer implementation (potentially, because having to call into a C library also brings its issues) implemented in the snapd process, this is a total no-go for me. I find @bboozzoo's idea of a stricter confinement for unsquashfs the most beneficial, and if we want to improve things further, reimplementing the tool in golang is certainly a good idea. But we should continue to execute it as a separate process, and I would be even more confident if the code was stored in a separate repository, because then it would be easier to make sure we are not accidentally embedding it into our process. All the above is just my 2 cents, as I don't have a decision role here. |
Change Inode and CompressionType to enums, consolidate code in lzma_backend.go so dublicate code is removed. Return nil instead of empty object
Thanks for your input Mardy, definitely relevant and you're raising a lot of potentional issues! I would like to (try) counter some of your points, and maybe come up with a few points of why switching to golang anyway, even if the compression happens in C would be beneficial for security: You are completly right that running the tool in a seperate process gives us the benefit of probably being immune to any memory issues there may exist from decompresing the image, or any vulnerabilites that causes memory issues/crashing (which doesn't crash the pc aswell). But in my eyes the benefit ends with that. When it comes to security issues, being vulernable to crashing is the least of the evils there is when it comes to malicious actors (RCE, privilege-escapes etc etc). We're not immune to anything else when we are using the tool in a seperate process, and we have to remember that unsquashfs is development utility, and not a tool meant for production usage. If the worst you can experience is a crash from a bad squashfs image, this is fixable. But a RCE or privilege escape? That can be fatal for your data or IP. By moving most of the code and functionality to golang (all except the compression), we provide better protection against both memory corruption, crashes and other traditional weaknesses in C, and yes, we will be vulernable in the compression libraries as we interface with C, but the only thing we lose here was the isolation unsquashfs provided, and we then trade that isolation for better security in all other aspects, we also gain better control of the code and how we decide to use this squashfs library. This can be for better or worse, but we have the potentional for a more secure way of handling squshfs images. |
Hmm there seems to be some misunderstanding here. The suggestion I made was to put a process in a sandbox. This could partially be achieved by invoking systemd-run with the right set of parameters + maybe some aa-exec/selinux wrapper on top. Ideally, the unsquahfs process runs in a read only /, with a minimal etc, no /home, no network, private tmp, etc. Exact details of how to achieve this are TBD, so not necessarily less work than squashfs in Go, but we won't know until we do some research. FWIW, it'd be great to also pass the squashfs package through go-fuzz before we land it in the tree.
We use unsquashfs when checking the snaps during installation. |
Apologies for the drive-by review, but from a security point of view, I agree with @bboozzoo that the best architecture here would be to perform the unsquashing in a sandboxed process - this could be achieved by a combination of seccomp and apparmor confinement (where available), and in a separate mount + network namespace etc. Additionally then having the unsquashing be implemented in Golang rather than C via unsquashfs would add additional benefits but I think it would be better to approach it from the system-level sandboxing angle first to provide some protection against the current implementation and then the Golang approach can be investigated. |
Thanks for all the insightful comments about the security. We can probably combine the approaches, i.e. fork and run our (go) code in a sandbox if needed. Fwiw, I really like this work as it has the potential to replace code like https://github.com/snapcore/snapd/blob/master/snap/squashfs/stat.go#L283 which always irked me a bit. Or avoid surprises like #10567. |
Thanks a lot for working on this and the excellent code. We need to come back to this a bit later because the immediate need for the authority delgation got worked-around in a different way. I think this will come up again and we will need it for various other reasons but for now I will close it as "precious" that we need to come back to. Hope that is okay. |
This is very early work for a library in snapd that supports reading squashfs images. This was done to be able to avoid using the unsquashfs command line tool. This is to provide a safer way of reading metadata from snaps that come from unstrusted sources instead of using unsquashfs which runs in root.
The goal was a minimal read-only library that should just help us read the meta/snap.yaml initially, but we can add whatever support we need. The goal is to support XZ and LZO initially. XZ/LZMA is now supported using liblzma, and lzo is supported through liblzo.
I just wanted to create this draft PR as soon as possible to get feedback on the implementation, and whether this is the correct way to go. I know the code is still early and tests are missing, and still figuring out where the code should be placed, but any feedback is welcome, and I definitely want to learn so correct anything please!
Again, please correct me if I've put the code the wrong place, or any golang issues. Still learning golang!