killsnoop is a tool that monitors the system for kill signals and logs the process that received each signal. It is useful for finding out which process is being killed and by whom. It is similar to bcc/killsnoop but reports more detailed information.
- Display the signal source (who sent the signal) and target (who received the signal) in real time
- Process cmdline included. [1]
- Process tree included. [1]
Log output is in logfmt, so you can easily parse it with log parsers. For example, if you are using LogQL:
- find all SIGTERM:
{app="killsnoop"} | logfmt --strict | signal = "15"
- find who killed PID 1234:
{app="killsnoop"} | logfmt --strict | target_pid = "1234"
Here is some sample output (newlines added for readability):
level=INFO
msg="snooped signal"
signal=15 # <-- What signal?
signal.string=terminated
source.pid=1646437 # <-- Who sent this signal?
source.cmdline=[runc] # <-- Who sent this signal?
source.comm=runc # <-- Who sent this signal?
source.parent.pid=1645247
target.pid=1645331 # <-- Who received this signal?
target.cmdline="[python /main.py --port=8080]" # <-- Who received this signal? (this field may not exist)
target.comm=python # <-- Who received this signal? (this field may not exist)
target.parent.pid=1645238
target.parent.cmdline="[/usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 4813c0ed0a96d843b694a980527bb9b628ccbe9052237ce4260274e4a95ac25d -address /run/containerd/containerd.sock]"
target.parent.comm=containerd-shim
... (deeper process tree omitted)
If we run yes and then kill it with pkill yes,
yes >/dev/null &
sleep 5
pkill yes
the output will be:
level=INFO
msg="snooped signal"
signal=15 # <-- What signal?
signal.string=terminated
source.pid=1657512 # <-- Who sent this signal?
source.cmdline=[pkill] # <-- Who sent this signal?
source.comm=pkill # <-- Who sent this signal?
source.parent.pid=1657386
source.parent.cmdline=[-zsh]
source.parent.comm=zsh
... (deeper process tree omitted)
target.pid=1657487 # <-- Who received this signal?
target.cmdline=[yes] # <-- Who received this signal? (this field may not exist)
target.comm=yes # <-- Who received this signal? (this field may not exist)
target.parent.pid=1657386
target.parent.cmdline=[-zsh]
target.parent.comm=zsh
... (deeper process tree omitted)
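Because each event is a single logfmt line, it can also be consumed without a log pipeline. The following is a minimal sketch of that idea in Go, using only the standard library. The parseLogfmt and summarize helpers are hypothetical names introduced here for illustration and are not part of killsnoop. The parser is simplified (bare and double-quoted values only), and it assumes the field names shown in the sample output above. Fields marked "may not exist" are treated as optional.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseLogfmt splits one logfmt line into key/value pairs.
// It handles bare values (key=value) and double-quoted values (key="a b").
// Simplified for illustration; not a complete logfmt implementation.
func parseLogfmt(line string) map[string]string {
	fields := make(map[string]string)
	i := 0
	for i < len(line) {
		if line[i] == ' ' {
			i++
			continue
		}
		// Read the key up to '=' or a space.
		start := i
		for i < len(line) && line[i] != '=' && line[i] != ' ' {
			i++
		}
		key := line[start:i]
		if i >= len(line) || line[i] == ' ' {
			// Key without a value.
			if key != "" {
				fields[key] = ""
			}
			continue
		}
		i++ // skip '='
		var value string
		if i < len(line) && line[i] == '"' {
			// Quoted value: scan to the closing quote, honoring backslash escapes.
			end := i + 1
			for end < len(line) && line[end] != '"' {
				if line[end] == '\\' && end+1 < len(line) {
					end++
				}
				end++
			}
			if end < len(line) {
				end++ // include the closing quote
			}
			raw := line[i:end]
			if unquoted, err := strconv.Unquote(raw); err == nil {
				value = unquoted
			} else {
				value = strings.Trim(raw, `"`)
			}
			i = end
		} else {
			start = i
			for i < len(line) && line[i] != ' ' {
				i++
			}
			value = line[start:i]
		}
		if key != "" {
			fields[key] = value
		}
	}
	return fields
}

// summarize renders a one-line description of a snooped signal.
// target.comm is optional because it is collected in user space.
func summarize(f map[string]string) string {
	target := f["target.pid"]
	if comm, ok := f["target.comm"]; ok {
		target += " (" + comm + ")"
	}
	return fmt.Sprintf("signal %s from pid %s (%s) to pid %s",
		f["signal"], f["source.pid"], f["source.comm"], target)
}

func main() {
	line := `level=INFO msg="snooped signal" signal=15 signal.string=terminated ` +
		`source.pid=1657512 source.cmdline=[pkill] source.comm=pkill ` +
		`target.pid=1657487 target.cmdline=[yes] target.comm=yes`
	fmt.Println(summarize(parseLogfmt(line)))
}
```

For production use, an established logfmt library is likely a better choice than a hand-written parser; the sketch only shows that the format is simple enough to process directly.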
Note
Ideally, the build machine should run the same kernel (same version, same kconfig) as the target machine, because this program relies on certain kernel structs (task_struct, mm_struct, and so on), which are prone to change between kernels. However, if it is working as expected, you can ignore this.
Aside from Go, install these dependencies (to build the BPF binary):
# On Debian/Ubuntu:
apt-get install gcc-multilib clang llvm libelf-dev libbpf-dev
# On Alpine:
apk add clang-dev llvm-dev libbpf-dev linux-headers musl-dev
Build:
make clean
make
Note
The default build ignores signal 0. To include signal 0, build with an empty CFLAGS: CFLAGS= make
Necessary volumes and permissions must be given:
docker run --rm -it \
-v /proc:/host/proc:ro \
-v /sys/fs/bpf:/sys/fs/bpf:rw \
-v /sys/kernel/tracing:/sys/kernel/tracing:rw \
--cap-add BPF \
--cap-add PERFMON \
--cap-add SYS_ADMIN \
charlie0129/killsnoop:v0.2.1-debian-12-kernel-6.1 --root /host
- /proc: finding process tree and detailed info
- /sys/fs/bpf: bpf maps
- /sys/kernel/tracing: tracepoint sys_kill
- CAP_BPF: employ privileged BPF operations
- CAP_PERFMON: load tracing programs
- CAP_SYS_ADMIN: iterate system wide loaded programs, maps, links, BTFs
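When the container fails to start, a quick way to rule out a permissions problem is to check the effective capability set from inside it. The following is a minimal sketch, not something killsnoop ships: parseCapEff and missingCaps are hypothetical helpers that read the CapEff bitmask from /proc/self/status and test the three capabilities listed above. The bit numbers (CAP_SYS_ADMIN = 21, CAP_PERFMON = 38, CAP_BPF = 39) come from linux/capability.h; CAP_BPF and CAP_PERFMON exist only on Linux 5.8 and later.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Capability bit numbers from <linux/capability.h>.
const (
	capSysAdmin = 21
	capPerfmon  = 38
	capBPF      = 39
)

// parseCapEff extracts the effective capability mask from the contents of
// /proc/<pid>/status, where the "CapEff:" line holds a hexadecimal bitmask.
func parseCapEff(status string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "CapEff:") {
			hex := strings.TrimSpace(strings.TrimPrefix(line, "CapEff:"))
			return strconv.ParseUint(hex, 16, 64)
		}
	}
	return 0, fmt.Errorf("no CapEff line found")
}

// missingCaps lists which of the three required capabilities are absent.
func missingCaps(mask uint64) []string {
	required := []struct {
		name string
		bit  uint
	}{
		{"CAP_BPF", capBPF},
		{"CAP_PERFMON", capPerfmon},
		{"CAP_SYS_ADMIN", capSysAdmin},
	}
	var missing []string
	for _, c := range required {
		if mask&(uint64(1)<<c.bit) == 0 {
			missing = append(missing, c.name)
		}
	}
	return missing
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read /proc/self/status:", err)
		os.Exit(1)
	}
	mask, err := parseCapEff(string(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if missing := missingCaps(mask); len(missing) > 0 {
		fmt.Println("missing capabilities:", strings.Join(missing, ", "))
		return
	}
	fmt.Println("all required capabilities are present")
}
```

Run inside a container started without the --cap-add flags, this should report the capabilities as missing; with the flags from the docker command above it should report none.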
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: killsnoop
  namespace: default
spec:
  selector:
    matchLabels:
      name: killsnoop
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 100%
  template:
    metadata:
      labels:
        name: killsnoop
    spec:
      volumes:
        - name: bpf-maps
          hostPath:
            path: /sys/fs/bpf
            type: DirectoryOrCreate
        - name: proc
          hostPath:
            path: /proc
            type: Directory
        - name: tracing
          hostPath:
            path: /sys/kernel/tracing
            type: DirectoryOrCreate
      securityContext: {}
      restartPolicy: Always
      containers:
        - name: killsnoop
          image: charlie0129/killsnoop:v0.2.1-debian-12-kernel-6.1
          securityContext:
            # Required for killsnoop to work (eBPF and tracepoints)
            capabilities:
              add:
                - BPF
                - SYS_ADMIN
                - PERFMON
              drop:
                - ALL
          resources:
            requests:
              cpu: 50m
              memory: 256Mi
            limits:
              cpu: 2000m
              memory: 1Gi
          command:
            - "/killsnoop"
            - "--root=/host"
            - "--log-level=info"
            - "--exclude-signals=17" # SIGCHLD
            - "--exclude-signals=18" # SIGCONT
            - "--exclude-signals=23" # SIGURG
            - "--exclude-signals=28" # SIGWINCH
          volumeMounts:
            - name: tracing
              mountPath: /sys/kernel/tracing
              readOnly: false
            - name: proc
              mountPath: /host/proc
              readOnly: true
            - name: bpf-maps
              mountPath: /sys/fs/bpf
              readOnly: false
Footnotes
[1] While basic information (source pid, ppid, comm, cmdline and target pid) is collected in kernel space (which is more reliable), detailed process information (target cmdline, process tree, and so on) is collected in user space periodically, so ephemeral processes can be missed. As a result, this information is sometimes omitted from the log.