The documentation for all hosts lives in here. The corresponding NixOS configuration is in ./hosts.
- Install nix (the recommended multi-user installation is not NixOS, only a package manager).
- Enable flake support in nix. This effectively adds the following flags to all your nix <flags> develop-like commands: --extra-experimental-features nix-command --extra-experimental-features flakes
- Clone the doctor-cluster-config repo, cd into it and run: nix develop. This opens a shell with additional packages available, such as inv --list, sops and age.
- To generate a new admin key, run (requires age):
mkdir -p ~/.config/sops/age/
age-keygen -o ~/.config/sops/age/keys.txt
Provide the generated public key to an existing admin and wait for them to re-encrypt all secrets in this repo with it. After pulling the re-encrypted secrets, you can read them with sops secrets.yml.
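As a rough sketch of the setup steps above, the two helpers below make flake support persistent and print the public part of the generated age key. The function names enable_flakes and age_public_key are hypothetical, the default path ~/.config/nix/nix.conf is an assumption about where your per-user nix configuration lives, and the second helper assumes the key file still contains the "# public key:" comment line that age-keygen writes.

```shell
# Hypothetical helper: persistently enable flakes for the current user.
# Appends the setting only if no experimental-features line exists yet.
# The default directory ~/.config/nix is an assumption.
enable_flakes() {
  conf_dir="${1:-$HOME/.config/nix}"
  mkdir -p "$conf_dir"
  if ! grep -qs '^experimental-features' "$conf_dir/nix.conf"; then
    echo 'experimental-features = nix-command flakes' >> "$conf_dir/nix.conf"
  fi
}

# Hypothetical helper: print the public key recorded in an age key file.
# Relies on the "# public key: ..." comment written by age-keygen.
age_public_key() {
  key_file="${1:-$HOME/.config/sops/age/keys.txt}"
  sed -n 's/^# public key: //p' "$key_file"
}
```

Calling enable_flakes without arguments edits the per-user configuration, and age_public_key without arguments reads the key file created above. Only the printed public key needs to be shared; the key file itself stays on your machine.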
Choose a deployment target:
$ inv -l
Available tasks:
  cleanup-gcroots
  deploy                   Deploy to servers
  deploy-host              Deploy to a single host, i.e. inv deploy-host --host 192.168.1.2
  deploy-local             Deploy NixOS configuration on the same machine. The NixOS configuration is
  format-disks             Format disks with zfs, i.e.: inv format-disks --hosts new-hostname --disk /dev/nvme0n1
  generate-root-password   Generate password hashes for users i.e. for root in ./hosts/$HOSTNAME.yml
  generate-ssh-cert        Generate ssh cert for host, i.e. inv generate-ssh-cert bill 131.159.102.1
  install-nixos            install nixos, i.e.: inv install-nixos --hosts new-hostname --flakeattr
  ipmi-powercycle
  ipmi-serial
  mount-disks              Mount disks from the installer, i.e.: inv mount-disks --hosts new-hostname --disk /dev/nvme0n1
  print-age-key            Print age key for sops, inv print-age-key --hosts "host1,host2"
  print-tinc-key
  reboot                   Reboot hosts. example usage: fab --hosts clara.r,donna.r reboot
  update-docs              Regenerate docs for all servers
  update-lldp-info         Regenerate lldp info for all servers
  update-sops-files        Update all sops yaml and json files according to .sops.yaml rules
Run!
$ inv deploy
Add chair members to ./modules/users.nix and students to ./modules/students.nix.
For chair members, use a uid in the 1000-2000 range. For new students, use a uid in the 2000-3000 range. Check that the uid is unique across both files and falls within the matching range to avoid conflicts.
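The uniqueness check can be scripted. The sketch below is only an illustration: the function name find_duplicate_uids is hypothetical, and it assumes that uids are written in the nix files as assignments of the form uid = 1234; which may not match the actual layout of the two modules.

```shell
# Hypothetical helper: print every uid that occurs more than once
# across the given files. Assumes assignments look like "uid = 1234;".
find_duplicate_uids() {
  grep -hoE 'uid *= *[0-9]+' "$@" \
    | grep -oE '[0-9]+' \
    | sort -n \
    | uniq -d
}
```

Run from the repository root as find_duplicate_uids ./modules/users.nix ./modules/students.nix, empty output means no uid is used twice.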
For installing new servers, see Add servers.
We use flakes to manage nixpkgs versions. To upgrade use:
$ nix flake update
Then commit flake.lock.
To install home-manager for a user simply run:
$ nix-shell '<home-manager>' -A install
This will initialize home-manager and generate a file similar to the one in home/.config/nixpkgs/home.nix.
You can use this to enable support for VS Code Server in NixOS.
An example of a home.nix configured for VS Code support is shown in home/.config/nixpkgs/home.nix.
On our TUM rack machines we have IPMI support.
Generally, you can find the IPMI web interface at https://$HOST-mgmt.dse.in.tum.de/ (e.g. https://bill-mgmt.dse.in.tum.de) once the device has been installed in the rack. These addresses are only available through the management network, so you must use the RBG VPN for il1 to access them.
You can also retrieve the IP addresses assigned to the IPMI/BMC firmware by running:
ipmitool lan print
on the machine. On another host (e.g. your laptop), you can run the following command to get a serial console:
$ ipmitool -I lanplus -H <ipmi-ip-address> -U ADMIN -P "$(sops -d --extract '["ipmi-passwords"]' secrets.yml)" sol activate
The following will reboot the machine:
$ ipmitool -I lanplus -H <ipmi-ip-address> -U ADMIN -P "$(sops -d --extract '["ipmi-passwords"]' secrets.yml)" power cycle
The IPMI password here is encrypted with sops. To decrypt it on your machine, your age/pgp fingerprint must be added to .sops.yaml in this repository, and one of the existing users must re-encrypt secrets.yml with your key.
After activating the serial console, press enter to get a login prompt. The root password for all machines is also stored in secrets.yml.
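The two ipmitool invocations above differ only in their final subcommand, so a small wrapper can cut down on typing. This is a sketch, not part of the repository: the function name ipmi is hypothetical, and it assumes you run it from the directory that contains secrets.yml and that your key can already decrypt it.

```shell
# Hypothetical helper: run an ipmitool subcommand against one BMC.
# Usage: ipmi <ipmi-ip-address> <ipmitool subcommand...>
ipmi() {
  ipmi_host="$1"
  shift
  # Decrypt the IPMI password with sops, exactly as in the commands above.
  # Abort early if decryption fails instead of sending an empty password.
  ipmi_password="$(sops -d --extract '["ipmi-passwords"]' secrets.yml)" || return 1
  ipmitool -I lanplus -H "$ipmi_host" -U ADMIN -P "$ipmi_password" "$@"
}
```

With this defined, ipmi <ipmi-ip-address> sol activate opens the serial console and ipmi <ipmi-ip-address> power cycle reboots the machine.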
Hosts are monitored here: https://grafana.thalheim.io/d/Y3JuredMz/monitoring?orgId=1
All machines are built by GitLab CI on a self-hosted runner. GitLab also eventually propagates the build status to the GitHub repository. The resulting builds are uploaded to https://tum-dse.cachix.org, from where machines can download them while upgrading.