
WireGuard mesh

Step-by-step guide to the self-hosted WireGuard tunnel between the control plane and every worker — keys, addresses, ports, troubleshooting.

This page explains the WireGuard mesh that carries every byte of control traffic between the WiseHosting control plane and its workers. If you are setting up a new worker, fixing a broken tunnel, or just trying to understand why ports look the way they do, start here.

Who needs to read this?

  • Operators standing up a new worker — follow Setting it up end to end.
  • On-callers debugging a worker that fell off the network — jump to Troubleshooting.
  • Curious newcomers — start with Why WireGuard? for the high-level picture.

Why WireGuard?

We previously used Tailscale to reach workers privately. WireGuard replaces it because:

  1. No third-party identity provider. A Tailscale outage or account problem could lock out the entire fleet. WireGuard keys live in /etc/wireguard/ on each host — fully self-contained.
  2. Stable, owned IPs. Every worker gets a deterministic 10.50.0.x address that we control, with no risk of churn from a coordination server.
  3. Minimal moving parts. A [Peer] block per worker, a wg syncconf to reload, and systemctl enable wg-quick@wg0 to make it persist. No daemons beyond the kernel module.
  4. Encrypted end-to-end. Curve25519 key exchange + ChaCha20-Poly1305 in the kernel. Once the tunnel is up, plain HTTP across it is fine — the kernel encrypts every packet before it leaves the NIC.

The tradeoff: you have to manually copy a public key between the two hosts the first time you bring a worker up. That's it. Once the peer block is on the CP, the tunnel comes back up automatically across reboots and network blips.

Topology at a glance

  • Subnet: 10.50.0.0/24. Private RFC-1918 space, no overlap with wisehosting-build (10.89.0.0/16) or Podman default (10.88.0.0/16).
  • Control plane address: 10.50.0.1. Always the first address; workers and the proxy always dial this.
  • Worker N address: 10.50.0.{N+1}. --worker-id 1 → 10.50.0.2, --worker-id 2 → 10.50.0.3, …
  • Proxy server address: 10.50.0.30. Fixed address for the dedicated proxy VPS; the proxy Traefik reaches worker containers via these WG IPs.
  • UDP port: 51821. We deliberately skip 51820 so a wg-easy instance can co-exist on a host.
  • Listen address: 0.0.0.0. The CP must be reachable on its public IPv4 to receive worker and proxy handshakes.
  • Worker → CP control URL: http://10.50.0.1:8081. Plain HTTP; encryption is handled by the kernel WireGuard layer, not TLS.
  • Proxy → CP config URL: http://10.50.0.1:8081. Same tunnel; the proxy polls /v1/traefik/proxy-config every 5 s.
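The worker-address rule in the table is just arithmetic; a one-line sketch (illustrative, not taken from the setup script):

```shell
# --worker-id N maps to 10.50.0.{N+1}; the CP keeps 10.50.0.1 for itself.
worker_id=2
worker_addr="10.50.0.$((worker_id + 1))"
echo "$worker_addr"   # → 10.50.0.3
```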

Public exposure stays small

The only ports the CP exposes publicly are the dashboard (:8081 via Cloudflare) and :51821/udp for WG handshakes. The proxy server exposes :80 and :443 for end-user app traffic. Workers expose no public ports — all container traffic reaches them via the proxy over WireGuard. Everything else is firewalled or simply not bound.

Key concepts (newcomers, start here)

WireGuard is intentionally tiny — there are really only four ideas to keep in mind.

  • Peers, not clients/servers. Both ends are equal. Each side has a private key and knows the other side's public key.
  • AllowedIPs — the list of remote addresses you route into and accept out of this tunnel. On the worker we set AllowedIPs = 10.50.0.1/32: only the CP. On the CP we set AllowedIPs = 10.50.0.<worker>/32 per worker. Each side restricts the other, so nothing else can ride the tunnel.
  • Endpoint — the public IP:port you handshake with. Workers know the CP's public IPv4. The CP doesn't need to know the worker's public IP; the kernel learns it from the first handshake packet.
  • PersistentKeepalive — every 25 seconds, the worker sends a keepalive so home-NAT / cloud-NAT mappings stay open. Without it, a worker behind NAT becomes unreachable from the CP after a few minutes of silence.

What the script does

scripts/wireguard-setup.sh is the only setup file you need. It has two modes:

# On the control plane:
./scripts/wireguard-setup.sh control

# On each worker:
./scripts/wireguard-setup.sh worker \
    --cp-pubkey   '<paste from CP /etc/wireguard/cp_public.key>' \
    --cp-public-ip '<CP public IPv4>' \
    --worker-id    2

Both modes:

  • Install wireguard-tools if missing (with a dpkg -i fallback if apt is in a half-broken state — common after a partial Podman upgrade).
  • Create /etc/wireguard/ with mode 0700.
  • Generate a Curve25519 keypair into cp_private.key/cp_public.key (CP) or worker_private.key/worker_public.key (worker), if one doesn't already exist.
  • Write /etc/wireguard/wg0.conf with mode 0600.
  • Open UDP 51821 in iptables (idempotent — uses -C to check first).
  • systemctl enable --now wg-quick@wg0.

Re-running the script is safe: existing keys are preserved, and the iptables rule check prevents duplicates.
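The firewall step's idempotency pattern can be sketched like this (ensure_wg_port is an illustrative name, not a function from wireguard-setup.sh):

```shell
# `iptables -C` exits non-zero when the rule is absent, so `-A` appends it
# exactly once and re-runs are no-ops. Run as root on the real host.
ensure_wg_port() {
    iptables -C INPUT -p udp --dport 51821 -j ACCEPT 2>/dev/null ||
        iptables -A INPUT -p udp --dport 51821 -j ACCEPT
}
```

Calling it twice leaves a single ACCEPT rule for UDP 51821.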

Control-plane mode

Generates wg0.conf like:

[Interface]
Address = 10.50.0.1/24
ListenPort = 51821
PrivateKey = <CP private key>

# Add a [Peer] block per worker (re-run this script with role=worker
# on each worker, then paste the worker's public key + endpoint here).

Note: the [Peer] blocks are empty initially — you append them as workers come online (see next).

Worker mode

The script needs three flags:

  • --cp-pubkey: paste the contents of /etc/wireguard/cp_public.key from the CP.
  • --cp-public-ip: the CP's public IPv4 (the script prints it via curl ipify.org after control mode finishes).
  • --worker-id: a small integer unique to this worker. The address becomes 10.50.0.{ID + 1}.

It produces:

[Interface]
Address = 10.50.0.{ID+1}/24
ListenPort = 51821
PrivateKey = <worker private key>

[Peer]
# control plane
PublicKey = <CP public key>
AllowedIPs = 10.50.0.1/32
Endpoint = <CP public IPv4>:51821
PersistentKeepalive = 25

At this point the worker's tunnel is half-finished: wg0 is up and configured to talk to the CP, but the CP has no [Peer] block for this worker yet, so the worker can't reach 10.50.0.1 until you add one.

Setting it up

The whole flow takes maybe two minutes per worker:

  1. CP side, once: sudo ./scripts/wireguard-setup.sh control. Save the output — it prints both the CP public key and the public IPv4. Workers will need both.
  2. Worker side, once per worker:
    sudo ./scripts/wireguard-setup.sh worker \
        --cp-pubkey   "<paste CP public key>" \
        --cp-public-ip "<paste CP public IPv4>" \
        --worker-id    2
    It prints the worker's own public key. Save that.
  3. CP side again — paste the worker's public key as a peer:
    sudo tee -a /etc/wireguard/wg0.conf <<EOF
    
    [Peer]
    # worker-2
    PublicKey = <worker public key>
    AllowedIPs = 10.50.0.3/32
    EOF
    Then reload without dropping existing peers:
    sudo wg syncconf wg0 <(wg-quick strip wg0)
    wg syncconf is the magic incantation: it diffs the running config against the file and applies only the delta. Other workers stay connected.
  4. Worker side, point the agent at the tunnel. Edit /etc/wisehosting/config.yaml:
    api_server:
      url: "http://10.50.0.1:8081"
    Then sudo systemctl restart wisehosting-worker.
  5. Verify:
    # On the worker:
    ping -c 1 10.50.0.1
    curl -fsS http://10.50.0.1:8081/healthz
    
    # On the CP:
    sudo wg show           # latest handshake should be < 30 s old
    journalctl -u wisehosting -f | grep "hub: worker"
    # → "hub: worker <id> (<name>) connected via WSS"
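Step 3 is easy to script; a hedged sketch (peer_block is a hypothetical helper, not part of wireguard-setup.sh) that generates the [Peer] stanza from a worker id:

```shell
# Generate the [Peer] stanza for a worker; append the output to the CP's
# /etc/wireguard/wg0.conf, then reload with:
#   wg syncconf wg0 <(wg-quick strip wg0)
peer_block() {
    local id="$1" pubkey="$2"
    printf '\n[Peer]\n# worker-%s\nPublicKey = %s\nAllowedIPs = 10.50.0.%s/32\n' \
        "$id" "$pubkey" "$((id + 1))"
}
peer_block 2 '<worker public key>'
```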

Why port 8081, not 8080?

The control plane HTTP server listens on :8081. The worker's Traefik listens on :8080 for end-user app traffic. They're different services on different hosts; the worker config's api_server.url points at the control plane, hence :8081.

Why plain HTTP across the tunnel?

The validator in internal/worker/agent.go deliberately permits http:// for loopback and RFC-1918 addresses, including 10.50.0.0/24:

// Allows http:// only when the host is loopback or an RFC-1918 address
// (the WireGuard tunnel encrypts those end-to-end). Public hosts must use https.

The reasoning: WireGuard already encrypts and authenticates every packet with ChaCha20-Poly1305, so adding TLS on top would re-encrypt the same bytes for no extra security and add a CA-management burden (self-signed cert pinning on every worker, expiry rotations, …). Public addresses still require https://, so a misconfigured worker that points at https://hosting.example.com and falls back to public DNS doesn't accidentally downgrade.
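The real validator is Go code in internal/worker/agent.go; this shell approximation only illustrates the address classes involved (loopback, 10/8, 172.16/12, 192.168/16):

```shell
# Return success when plain http:// is acceptable for the given host:
# loopback or RFC-1918. Anything else must use https://.
allow_plain_http() {
    case "$1" in
        127.*|10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[0-1].*) return 0 ;;
        *) return 1 ;;
    esac
}
allow_plain_http 10.50.0.1   && echo "10.50.0.1: http ok"
allow_plain_http 203.0.113.7 || echo "203.0.113.7: https required"
```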

Reloading after config changes

  • Add or remove peers: edit /etc/wireguard/wg0.conf, then sudo wg syncconf wg0 <(wg-quick strip wg0)
  • Full restart (briefly drops all peers): sudo systemctl restart wg-quick@wg0
  • Bring the tunnel down: sudo wg-quick down wg0
  • Bring the tunnel up: sudo wg-quick up wg0
  • Persist across reboots: sudo systemctl enable wg-quick@wg0

wg syncconf is preferred for live changes because it doesn't tear down the kernel interface — every other worker on the mesh keeps its handshake.

Troubleshooting

Tunnel is up locally but ping 10.50.0.1 fails

Run sudo wg show on the worker:

  • No handshake at all → the CP never received the first packet. Check that:
    • The CP's iptables -L INPUT actually has the udp dpt:51821 ACCEPT rule.
    • The hosting provider's firewall (DigitalOcean / Hetzner / GCP / …) allows inbound UDP 51821.
    • The --cp-public-ip you passed to the worker is correct. A wrong endpoint just times out silently.
  • Handshake older than 3 minutes → keepalives stopped. Confirm PersistentKeepalive = 25 is in the worker's [Peer] block. Some providers drop UDP NAT mappings aggressively.
  • Latest handshake < 30 s ago, ping still fails → the CP hasn't been told about this worker. Did you append the worker's [Peer] block on the CP and run wg syncconf?
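To check handshake ages across many peers at once, a helper like this can parse wg show wg0 latest-handshakes, which prints one "<pubkey> <epoch-seconds>" line per peer (stale_peers is a hypothetical helper, not shipped with the script):

```shell
# Flag peers whose last handshake is missing (epoch 0) or older than 180 s.
stale_peers() {
    local now="$1" pubkey ts
    while read -r pubkey ts; do
        if [ "$ts" -eq 0 ]; then
            echo "$pubkey: no handshake yet"
        elif [ $((now - ts)) -gt 180 ]; then
            echo "$pubkey: stale ($((now - ts)) s ago)"
        fi
    done
}
# On the CP:  wg show wg0 latest-handshakes | stale_peers "$(date +%s)"
```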

Permission denied reading /etc/wireguard/

The directory is 0700 and files are 0600 — that's intentional, only root reads them. Use sudo.

apt fails halfway through install_wg_tools

The script falls back to:

apt download wireguard-tools
dpkg -i wireguard-tools_*.deb

This handles the case where unrelated podman / buildah packages have left apt in a broken state (a common occurrence when the alvistack OBS repo goes missing). If even that fails, install wireguard-tools manually from your distro mirror.

Worker config still points at the old Tailscale address

If you're migrating from the previous Tailscale design, update /etc/wisehosting/config.yaml:

api_server:
  url: "http://10.50.0.1:8081"   # was 100.x.y.z under Tailscale

Then systemctl restart wisehosting-worker. The agent re-registers, fetches a new JWT, and reconnects WSS.

systemctl status wg-quick@wg0 says dead

sudo journalctl -u wg-quick@wg0 -n 50

Common causes:

  • Syntax error in wg0.conf (a stray space, a missing =).
  • Two [Interface] blocks (don't run the script twice with different role flags on the same host).
  • Kernel module missing (modprobe wireguard should succeed; on very old kernels you need wireguard-dkms).

Key rotation

Compromised key, periodic rotation, or rebuilding a host?

  1. On the affected host, rename the old keypair:
    # On the CP:
    sudo mv /etc/wireguard/cp_private.key{,.bak}
    sudo mv /etc/wireguard/cp_public.key{,.bak}
    # On a worker, rename worker_private.key and worker_public.key the same way.
  2. Re-run the matching role of the script. It will generate a fresh keypair and rewrite wg0.conf.
  3. Update the other side's [Peer] block with the new public key.
  4. sudo wg syncconf wg0 <(wg-quick strip wg0) on the other side.
  5. Delete the .bak files.

The control-plane API key (the one in /etc/wisehosting/config.yaml under worker.api_key) is separate from the WireGuard key and rotates independently.

Where this fits in the rest of the system

  • The control plane binds :8081 for everything: dashboard, OAuth, REST API, worker control endpoints (/v1/workers/*), and the two Traefik HTTP-provider endpoints (/v1/traefik/config legacy, /v1/traefik/proxy-config for the proxy server). Cloudflare WAF rules drop public hits to the worker and Traefik paths at the edge, but defence-in-depth means workers and the proxy only dial those paths via the tunnel.
  • The worker's WSS connection (/v1/workers/ws) goes to :8081 over the tunnel. See Worker & WSS reference for the connection lifecycle.
  • The proxy Traefik polls /v1/traefik/proxy-config every 5 s over the tunnel. See Proxy server setup for full details.
  • The proxy server also uses WireGuard to reach workers — container ports are bound on the worker's WG IP (10.50.0.x) and the proxy forwards HTTP directly to them. No public port on the worker is required.

Reference: full files

CP /etc/wireguard/wg0.conf (after two workers and the proxy server are joined):

[Interface]
Address = 10.50.0.1/24
ListenPort = 51821
PrivateKey = <CP private key>

[Peer]
# worker-1
PublicKey = <worker-1 public key>
AllowedIPs = 10.50.0.2/32

[Peer]
# worker-2
PublicKey = <worker-2 public key>
AllowedIPs = 10.50.0.3/32

[Peer]
# proxy server (192.99.14.173)
PublicKey = <proxy public key>
AllowedIPs = 10.50.0.30/32

Worker-1 /etc/wireguard/wg0.conf:

[Interface]
Address = 10.50.0.2/24
ListenPort = 51821
PrivateKey = <worker-1 private key>

[Peer]
# control plane
PublicKey = <CP public key>
AllowedIPs = 10.50.0.1/32
Endpoint = <CP public IPv4>:51821
PersistentKeepalive = 25

Proxy server /etc/wireguard/wg0.conf:

[Interface]
Address = 10.50.0.30/24
ListenPort = 51821
PrivateKey = <proxy private key>

[Peer]
# control plane
PublicKey = <CP public key>
AllowedIPs = 10.50.0.0/24
Endpoint = <CP public IPv4>:51821
PersistentKeepalive = 25

Proxy AllowedIPs is broader

The proxy peer uses AllowedIPs = 10.50.0.0/24 (the whole mesh) rather than just 10.50.0.1/32. This lets Traefik on the proxy route directly to any worker container IP in the mesh without an additional route. Workers keep their narrower AllowedIPs = 10.50.0.1/32 since they only need to reach the control plane.
