Production cluster - architecture



I think it’s a good time (and place) to continue the conversation started by @pierreozoux last year.

In a nutshell, he was proposing to set up a shared production cluster and asking a few questions:

  1. Bare metal vs. Cloud?
  2. Load Balancer with Floating IPs on bare metal?
  3. Persistent Layer (Rook? Ceph?)
  4. FDE for deployed nodes?


Regarding 4. (FDE), I found this nice tutorial (in French) on setting up full disk encryption on a cloud instance running Alpine Linux. The initramfs includes OpenSSH, which is configured to run /sbin/ over SSH. Users with the right SSH key can then enter the LUKS passphrase, and the system can boot.


I’m also experimenting with Clevis/Tang, for auto-decrypting during an unattended reboot.


I fail to understand how Clevis/Tang would keep a server VM safe. The VM has no access to the TPM on the physical host, and keeping the key on the server to automatically decrypt a disk seems to defeat the purpose of encrypting the disk in the first place. Am I wrong?


Well, the key is actually on another server (the one that runs Tang), which “offers” the decryption key during boot of the encrypted machine. Their README file has a more detailed explanation of how this works.
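To make that a bit more concrete, here is a toy sketch of the kind of blinded key exchange that, as far as I understand the README, Tang is built on (a Diffie-Hellman variant they call McCallum-Relyea). Everything in it is my own illustration: the function names are made up, and the small modular-arithmetic group stands in for the elliptic curves the real thing uses. The point is only that the encrypted machine stores a public value, not the disk key, and can rebuild the key only with help from the Tang server.

```python
import secrets

# Toy parameters (assumption for illustration only, NOT secure):
# a Mersenne prime modulus and a small base instead of an elliptic curve.
P = 2**127 - 1
G = 3


def keypair():
    """Return a random private exponent and the matching public value."""
    priv = secrets.randbelow(P - 3) + 2
    return priv, pow(G, priv, P)


def provision(server_pub):
    """Run once on the client when the disk is bound to the Tang server.

    Returns the disk key (used, then discarded) and the client public
    value, which is the only thing kept on the encrypted machine.
    """
    c, client_pub = keypair()
    key = pow(server_pub, c, P)
    return key, client_pub


def blind(client_pub):
    """At boot: hide the stored public value behind a fresh ephemeral key."""
    e, e_pub = keypair()
    return e, (client_pub * e_pub) % P


def tang_reply(server_priv, blinded):
    """What the Tang server does: one exponentiation with its private key."""
    return pow(blinded, server_priv, P)


def unblind(reply, e, server_pub):
    """At boot: strip the ephemeral part from the reply to get the disk key."""
    mask = pow(server_pub, e, P)
    return (reply * pow(mask, -1, P)) % P


if __name__ == "__main__":
    s, S = keypair()              # lives on the Tang server
    key, C = provision(S)         # client keeps C and S, forgets key
    e, X = blind(C)               # client sends X to Tang
    recovered = unblind(tang_reply(s, X), e, S)
    print(recovered == key)
```

So a stolen disk image alone is not enough: without reaching the Tang server, the stored value cannot be turned back into the key. The server in turn only ever sees the blinded value, never the key itself.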

It’s not bulletproof, of course. I can see threat models that don’t fit into this, where it would make more sense to SSH into the machine and type a passphrase. But in some cases it might be useful.


Any news about this?