Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature request: support struct xdp_sock *bpf_skc_to_xdp_sock(void *sk) #293

Open
thediveo opened this issue Feb 21, 2023 · 13 comments
Open

Comments

@thediveo
Copy link

thediveo commented Feb 21, 2023

Please forgive me, if this is the correct place to ask for support to dynamically cast a struct sock * into a struct xdp_sock *. will be thankful for guiding me to the correct place then. I'm unclear if my request is fine here or instead should get upstream to the kernel lists (which one)?

At this time, the bpf helpers support dynamically casting sk pointers to, say, a tcp_sock pointer (struct tcp_sock *bpf_skc_to_tcp_sock(void *sk)). As far as I can tell from looking at the newest kernel 6.1 headers as well as the libxdp bfp-helpers.h there doesn't seem to be support for dynamically casting to an XDP socket pointer.

Would it be possible to support this type of cast that becomes useful in, for instance, diagnosing XDP sockets?

@tohojo
Copy link
Member

tohojo commented Feb 21, 2023

This is maybe something @magnus-karlsson can answer...

@magnus-karlsson
Copy link
Member

You are correct that there is no such support. Would you mind sending me a pointer to an example or two of how this is used for other socket types?

@thediveo
Copy link
Author

This code snippet here hopefully serves as an example, it is from Cilium's ebpf Go module: https://github.com/cilium/ebpf/blob/master/examples/tcprtt/tcprtt.c.

SEC("fentry/tcp_close")
int BPF_PROG(tcp_close, struct sock *sk) {
	if (sk->__sk_common.skc_family != AF_INET) {
		return 0;
	}

	// The input struct sock is actually a tcp_sock, so we can type-cast
	struct tcp_sock *ts = bpf_skc_to_tcp_sock(sk);
	if (!ts) {
		return 0;
	}
	// ...
}

Just to be sure before I'm asking here I tried a "normal" pointer cast and of course the verifier rejects it upon trying to load and attach my probe to some XSK-related kernel function:

struct sock {
    struct sock_common __sk_common;
} __attribute__((preserve_access_index));

struct socket {
    struct sock *sk; // sic! a pointer
} __attribute__((preserve_access_index));

struct net_device {
    int ifindex;
} __attribute__((preserve_access_index));

struct xdp_sock {
    struct sock sk;
    struct net_device *dev;
    u16 queue_id;
} __attribute__((preserve_access_index));

struct sock *sk = sock->sk;
if (sk->__sk_common.skc_family != AF_XDP) {
    return 0;
}
struct xdp_sock *xs = (struct xdp_sock *)sk;
struct net_device *dev = xs->dev;
load program: permission denied: access beyond struct sock at off 776 size 8

It is kind of ironic that the cute comic strip about how to not have to wait as an application developer for things to first land in kernel and also sometime later to get delivered is now poking fun of me.

@magnus-karlsson
Copy link
Member

Thanks. For the example from Cilium, it makes sense that there is an AF_INET socket associated with a TCP close action. What kind of actions/fentries/eBPF hook points would you like to attach an eBPF program and then examine the xsk sock?

@thediveo
Copy link
Author

I would like to recover XSK information for diagnosis that currently is not exposed to user space; especially the netdevs and queues XSKs are bound to. For this, I could hook into one of the unfortunately unimplemented XSK stub methods. I'm sure there will be more use cases in the future.

@magnus-karlsson
Copy link
Member

How about adding this to the "ss" command? There are already some things added to it, but more could be added if there are things missing.

@thediveo
Copy link
Author

While "show sockets" (what a horrible abbreviation) is okay, I need to be able to access the functionality I need from several of my own tools, which are container and namespace-aware.

There's my open source lxkns discovery engine that does some heavy lifting when it comes to all kind of namespaces and different container engines. While I'm probably what could be called an "old hand" from the days of the first betas of Turbo C I nowadays prefer more productive language ecosystems, trading in some bits for better code base maintenance, et cetera.

Not sure if you mentioning "show sockets" was an idea in this direction: an option independent of "show sockets" would be to get a /proc/net/xdp somewhat along the lines of tcp and company.

@magnus-karlsson
Copy link
Member

You can access the information that ss gets through netlink without having to use ss. The functionality can be found in net/xdp/xsk_diag.c. Though there might be info missing that you need. Could you please try it out and see what is missing and if you can use it? Would be nice if we could use something that already exists instead of adding something new. Though a dynamic language like eBPF to access socket info would be really nice :-).

@thediveo
Copy link
Author

Now that's looking like an attractive rabbit hole: let's see if I can come up with a PR to vishvananda/netlink so we get this upstream anyway. I wasn't aware of the XSK netlink diagnosis functionality, so I need now to poke Michael Kerrisk to update his famous https://man7.org/linux/man-pages/man7/sock_diag.7.html page.

@thediveo
Copy link
Author

thediveo commented Feb 22, 2023

I've made progress and have some code that adds the XDP socket diagnose message handling to my fork of https://github.com/vishvananda/netlink: this looks very promising to be the data I was looking for. Needs more testing but hopefully gets accepted upstream.

@magnus-karlsson
Copy link
Member

Good question, but unfortunately I do not know. Maybe you have already solved the problem?

@thediveo
Copy link
Author

I've sent a PR upstream to the golang netlink module. The xdp_diag.c code doesn't seem to be implementing filtering (on index number and/or cookie) or am I mistaken here?

@magnus-karlsson
Copy link
Member

Correct. The xdp_diag.c code is just a function of a single provided AF_XDP socket. Given a socket, it just fills in information about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants