DNS/HTTP Node Attestor #4788
Comments
My current thinking is to start with the x509pop plugin, copy it to a plugin named 'http', then modify it as follows: For the server plugin, change its Attest function, removing the x509 cert validation bits. Then change the challenge to generate a 'token' as per https://datatracker.ietf.org/doc/html/rfc8555#section-8.3. The token is returned to the agent. The agent would then start a webserver on port 80 (default) or any configured port. (If the port is != 80, something else needs to proxy on the host from 80 -> the chosen port.) The agent would share out just "/.well-known/acme-challenge/$token" as per the ACME RFC. The content would be the token. Once the webserver is started, the agent would respond to the server that it's ready, along with its proposed DNS name. |
So the concern I have with this flow is: Some assumptions:
The ideal scenario is:
Now imagine this scenario:
At this point the bad actor has successfully hijacked the issued identity. Note: I may be making some wrong assumptions here, that may make this not really a possible attack. Let me think a bit more about how this would work securely. |
So, I think this can be made a bit safer, without a ton of changes to the flow, by using a self-signed mTLS identity on the client side. The server would need to be configured to trust any and all client certificates when establishing an mTLS session for the attestation endpoint. With that in place, this DNS/HTTP auth plugin can be designed so that, for the entire lifetime of the challenge, the challenge is scoped to that specific client certificate. E.g. a client presenting a different certificate cannot interject itself into another challenge-response flow. |
Ah. I see... I'll think some more on this too. Thanks. :) |
Looking at the plugin code, the plugin will only respond to the request from the same client tcp stream (?) so not sure a bad actor can man in the middle that process. If they can, it looks like there is a piece in the acme protocol meant to handle that: The initial client request is done with a jwk pair, and the client is expected to put its public fingerprint at the http token url as well. If we did the same, it would also close the loop I think? |
Yes ACME gets around this by using the ACME account. I didn't know if you wanted to build an ACME account model here.
I think as long as the bad actor isn't a layer 7 proxy that you connected to to talk to the server it should be fine. Note that the layer 7 proxy would have to be the one terminating TLS too, so it'd come down to do you trust the TLS certificate of the server at that point. |
Ah, gotcha. Not sure we need to adopt all of ACME, but there may be some advantage to reusing the bits of their protocol that work, to solve all the same problems? I could go either way though. |
Honestly, I think if this is scoped to a single TCP connection, and new TCP connections would have to fully restart the flow, you'd solve the majority of my concerns with this. The only other stipulation is that the server MUST be protected by TLS for this to work properly. |
I think that is currently true with SPIRE's current plugin model? Is there anyone who can double check that assumption?
Just to double check, you're referring here to the server plugin hosted out of SPIRE, which is TLS protected? The temp webserver for the handshaking can be HTTP only? |
If so then I think the initial proposal wouldn't create any concerns.
Yes & Yes |
Thank you @kfox1111 for bringing this up and thank you @aaomidi for your feedback. @kfox1111 If you think that a DNS/HTTP node attestor is the best option, and in the absence of other proposals, it would be great to make progress on scoping the work that needs to be done for the proposed solution, including some more details about the implementation, configuration and the mechanics of the attestation.
I'm sure there are other things to figure out also, but finding answers to those items will help a lot to have this scoped. Thanks again @kfox1111 for bringing this up to our attention! |
Yes it is correct. SPIRE server/agent node attestation is a bi-directional gRPC stream. It remains open until node attestation is complete. In the case we're discussing, the server will initiate the challenge check all while the agent is blocked on it, and the server will unblock after success. So the whole process is covered by a single stream lifetime. Thanks @amartinezfayo for the guidance, I agree answers to those points will help to move the issue out of unscoped. Considering my above comment, as far as the flows go, a starting point can be: I'm sure it will change as we get answers to e.g. multiple hosts, configuration (dns server config?) etc. One nice thing about this attestation type is it's repeatable. |
Will work on these things, but initial thoughts inline:
Will work on this. Some potential details are discussed above. I'm thinking of sticking to port 80 from server -> agent for the reasons described in the ACME http-01 documentation. (Short answer: it is one of the most firewall-friendly protocols/ports. Random ports can cause problems for some orgs. Low ports can have extra security too.)
I think this would be not allowed. Each node that wants to attest needs to have its own dns entry, and the selector returned is that dns name, so uniquely identifies the node. acme http-01 assumes this as well I believe.
I think this is transparent for HTTP. The DNS entry the server looks up is just a bit more trustworthy. If there were a pure DNS attestor like the ACME dns-01 challenge, then it would help that, I think. But for the scope of this plugin, I'm thinking of limiting it to HTTP attestation, utilizing DNS just for hostname lookups? So akin to ACME http-01 only.
spiffe://$trustDomain/spire/agent/http/hostname/foo.example.org
http:hostname:foo.example.org |
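The two formats above can be sketched in Go as follows. The helper names `agentID` and `selector` are hypothetical, and the plugin name `http` is taken from the examples above (the thread later leaves the final name open).

```go
package main

import "fmt"

// pluginName follows the examples in the thread; the final plugin
// name was still under discussion.
const pluginName = "http"

// agentID builds the agent SPIFFE ID in the proposed format.
func agentID(trustDomain, hostname string) string {
	return fmt.Sprintf("spiffe://%s/spire/agent/%s/hostname/%s", trustDomain, pluginName, hostname)
}

// selector builds the selector string in the proposed format.
func selector(hostname string) string {
	return fmt.Sprintf("%s:hostname:%s", pluginName, hostname)
}

func main() {
	fmt.Println(agentID("example.org", "foo.example.org"))
	// spiffe://example.org/spire/agent/http/hostname/foo.example.org
	fmt.Println(selector("foo.example.org"))
	// http:hostname:foo.example.org
}
```

Because each attesting node needs its own DNS entry, the hostname alone is enough to make both values unique per node.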
I think the main thing left is finalizing the details of the communication flows? I was thinking about @evan2645's suggestion of random ports again, and could see an advantage to that when having multiple spire-agents on the same node (used to attest to different SPIRE servers). I can also see some benefits to restricting the port back to port 80 for easier internet traversal. So maybe it should be configurable on both sides? The agent would allow specifying the port to use and pass it to the server, and the server could force-override the port to always be port 80 should the server be intended to traverse the internet? Maybe even defaulting to 80 unless the user overrides? In that case, the config might be:
|
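The port negotiation floated in the previous comment could be sketched in Go roughly as follows. This is only an illustration of the rule as described (default to 80, let the agent propose a port, let the server force its own); the function name `resolvePort` and its parameters are hypothetical and do not correspond to any real SPIRE configuration option.

```go
package main

import "fmt"

// defaultPort is used when neither side configures anything.
const defaultPort = 80

// resolvePort picks the port the server will connect back to.
// agentPort is the port proposed by the agent (0 means unset).
// forcedPort is the server-side override (0 means no override).
func resolvePort(agentPort, forcedPort int) (int, error) {
	port := defaultPort
	if agentPort != 0 {
		port = agentPort
	}
	if forcedPort != 0 {
		// The server-side override always wins.
		port = forcedPort
	}
	if port < 1 || port > 65535 {
		return 0, fmt.Errorf("invalid port %d", port)
	}
	return port, nil
}

func main() {
	for _, c := range [][2]int{{0, 0}, {8080, 0}, {8080, 80}} {
		port, err := resolvePort(c[0], c[1])
		fmt.Printf("agent=%d forced=%d -> port=%d err=%v\n", c[0], c[1], port, err)
	}
}
```

If the server forces port 80 while the agent listens elsewhere, something on the host still has to proxy 80 to the agent's port, as noted at the top of the thread.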
Started to work up the documentation around this. #4909 And scaffolded a bit based on the x509pop plugins. |
hmm.... should the plugin be named 'http' or 'httppop'? |
The PR has reached the level of a workable prototype. It seems to attest, and when I set agent_ttl to something very small, it seems to reattest OK too. It has very little error checking and no testing at the moment. Once we work through all the details, those things can be added. |
On bare metal nodes without TPMs, it would be very nice if HTTP/DNS, as ACME uses for initial attestation, could be used for bootstrapping rather than needing to ssh in (and accept an untrusted key) and use a join token. It wouldn't need to be ACME itself, but something that functions similarly.