Admin subsystem
How the wisehosting-admin sibling reads live operational state — the loopback internal-API, admin RPC over WSS, and the admin_accounts identity surface.
WiseHosting splits operator tooling into a separate process called wisehosting-admin rather than mixing it into the user-facing dashboard. The split is for security:
- Different identity surface: admin login uses bcrypt-hashed credentials in admin_accounts, not OAuth into users. A compromised user account never grants admin access.
- Different network exposure: admin lives on a private interface; the user-facing API never has admin endpoints to begin with.
- Different audit trail — admin actions are logged separately from user actions.
The admin process talks to the control plane over a localhost-only HTTP listener (the internal-api). For live worker introspection, the control plane forwards admin RPCs to workers over the WSS hub; each worker accepts only a hard-coded whitelist of read-only podman commands.
Three things to keep in mind
- The internal-api listens only on 127.0.0.1, so it cannot be reached from outside the host.
- Authentication is a single bearer token shared with the admin process via systemd LoadCredential.
- The worker-side admin command set is a closed whitelist of 5 read-only podman queries. Workers never accept arbitrary shell.
Layout
control-plane host
├── wisehosting                  # main control plane (this repo)
│   ├── public   :8081           # dashboard / OAuth / REST / WSS hub
│   └── loopback :9091           # internal-api ← admin reads live state from here
└── wisehosting-admin            # sibling process (different repo)
    ├── public :443              # admin login (bcrypt + optional TOTP)
    └── reads localhost:9091 with bearer token
each worker
└── wisehosting-worker           # receives TypeAdminQuery envelopes over WSS
                                 # runs whitelisted podman commands; replies with TypeAdminQueryReply
The loopback internal-api
internal/api/internal_api.go exposes a small HTTP API the admin sibling uses to read state that doesn't live in Postgres — live stats, log buffers, the WSS hub's connection map.
| Method | Path | What it returns |
|---|---|---|
| GET | /internal-api/dashboard-stats | Apps online/total, deployments today, error counts, plus 24h-ago deltas for trend arrows. |
| GET | /internal-api/dashboard-extras | Recent deployments, activity feed, node breakdown, alerts, system status, active-alert count, packaged together to avoid a fan-out from the admin UI. |
| GET | /internal-api/dashboard-deployments-timeseries | Per-bucket deploy success/failure counts for the dashboard chart. |
| GET | /internal-api/workers/live | Current worker connection state from the WSS hub: which workers are connected, last heartbeat, busy vs idle. |
| GET | /internal-api/apps/:id/live-stats | CPU%, memory bytes, network MB/s for one app, pulled from the in-process stats cache. |
| GET | /internal-api/apps/:id/logs/tail | Last N lines from the per-app internal/logbus ring buffer. |
| GET | /internal-api/apps/recent-logs | Tail of the in-process log bus across every app; used for the admin "Logs" page. |
| GET | /internal-api/workers/:id/podman/:cmd | Sends a TypeAdminQuery envelope to that worker over WSS and waits for the reply. :cmd must be one of ps, ps_all, network_ls, volume_ls, image_ls. |
Authentication
All requests must carry Authorization: Bearer <token>. The middleware uses crypto/subtle.ConstantTimeCompare so timing oracles can't leak the token a byte at a time:
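// requireToken rejects any request whose Authorization header does not match
// the configured bearer token exactly.
func (s *InternalServer) requireToken(c *gin.Context) {
	h := c.GetHeader("Authorization")
	// Constant-time comparison: the time taken does not depend on how many leading bytes match.
	if subtle.ConstantTimeCompare([]byte(h), []byte("Bearer "+s.token)) != 1 {
		c.AbortWithStatus(http.StatusUnauthorized)
		return
	}
}
Configuration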
In config.yaml:
internal_api:
  bind: "127.0.0.1:9091"   # default; override only to bind to a different loopback alias
  token: ""                # leave empty here; load via systemd
The systemd unit ships:
LoadCredential=internal_api_token:/etc/credstore/wisehosting/internal_api_token
config.go's loadCredential("internal_api_token") overlay then overrides whatever is in YAML. An empty token disables the listener entirely: main.go logs "internal-api disabled — set internal_api.token (or LoadCredential=internal_api_token)" and skips internalSrv.Run(...).
Helpers
pctDelta(cur, prev int64) float64 returns the percentage change for trend arrows in dashboard-stats. Returns 0 when prev <= 0 to avoid NaN and infinite-growth arrows on cold-start.
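A minimal sketch of such a helper is shown here. The signature and the prev <= 0 guard come from the description above; the exact formula (change relative to prev, times 100) is an assumption about how the percentage is computed.

```go
package main

import "fmt"

// pctDelta returns the percentage change from prev to cur.
// It returns 0 when prev <= 0: dividing by zero in float64 would yield
// NaN (0/0) or +Inf (n/0), which renders as a meaningless trend arrow.
func pctDelta(cur, prev int64) float64 {
	if prev <= 0 {
		return 0
	}
	return float64(cur-prev) / float64(prev) * 100
}

func main() {
	fmt.Println(pctDelta(150, 100)) // 50
	fmt.Println(pctDelta(75, 100))  // -25
	fmt.Println(pctDelta(5, 0))     // 0 (cold start: no baseline yet)
}
```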
Admin RPC over WSS — TypeAdminQuery
When the admin sibling needs to peek at a worker's running containers, the control plane forwards the request to the right worker over the existing WSS hub. Two new envelope types carry it:
| Type constant | Direction | Payload |
|---|---|---|
| TypeAdminQuery = "admin_query" | CP → worker | AdminQueryPayload { QueryID, Command string } |
| TypeAdminQueryReply = "admin_query_reply" | worker → CP | AdminQueryReplyPayload { QueryID, Command, Output, Err string } |
Both share the standard wsproto envelope: HMAC-signed with sha256(api_key), sequence-numbered, timestamp-bounded. See Worker & WSS for the envelope details.
CP side — (*Hub) WorkerQuery
func (h *Hub) WorkerQuery(ctx context.Context, workerID int, command string) (*wsproto.AdminQueryReplyPayload, error)
WorkerQuery generates a unique QueryID (q-{unixnano}-{outSeq}), registers a buffered channel under h.pendingQueries[queryID], sends the envelope, then waits for a reply on the channel or for ctx to cancel. Delivery to the reply channel is a non-blocking send: if no goroutine is waiting, the reply is dropped, so a stray reply for a query that already timed out cannot deadlock the dispatcher.
(*Hub) IsConnected(workerID int) bool lets the internal API short-circuit and return a 503 if the worker is offline, instead of hanging until ctx expires.
Worker side — runAdminQuery
internal/worker/agent.go recognises TypeAdminQuery and looks the command up in a map:
var adminQueryCommands = map[string][]string{
	"ps":         {"podman", "ps", "--format", "json"},
	"ps_all":     {"podman", "ps", "-a", "--format", "json"},
	"network_ls": {"podman", "network", "ls", "--format", "json"},
	"volume_ls":  {"podman", "volume", "ls", "--format", "json"},
	"image_ls":   {"podman", "image", "ls", "--format", "json"},
}
If the command isn't in the map, the worker replies with Err: "unknown command". Otherwise it runs the mapped argv through exec.CommandContext with a 10-second timeout and returns the combined output.
Why this is a closed whitelist
The worker process runs with privileged capabilities (podman requires them). If the worker accepted arbitrary commands, a compromised control plane would be game-over for the whole fleet. The whitelist guarantees only read-only inspection queries can ever cross the wire — destructive operations (rm, stop, kill) are deliberately absent.
The worker returns Output even on partial failure (non-zero exit) so operators can see permission errors during debugging — only Err distinguishes a transport-level problem (timeout, unknown command) from podman saying no.
Admin identity (admin_accounts)
Migration 0008_admin_accounts.up.sql creates the table and drops the earlier admin_role_grants/admin_sessions shape (which had been linked to users.id). Admin accounts are now a fully independent identity surface.
CREATE TABLE admin_accounts (
id SERIAL PRIMARY KEY,
username TEXT NOT NULL UNIQUE,
password_hash TEXT NOT NULL, -- bcrypt $2a$
totp_secret TEXT, -- nullable
totp_enabled BOOLEAN NOT NULL DEFAULT FALSE,
is_active BOOLEAN NOT NULL DEFAULT TRUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
last_login_at TIMESTAMPTZ
);
CREATE TABLE admin_sessions (
id BIGSERIAL PRIMARY KEY,
admin_id INT NOT NULL REFERENCES admin_accounts(id) ON DELETE CASCADE,
token_hash TEXT NOT NULL UNIQUE, -- sha256 hex of the cookie
ip TEXT NOT NULL DEFAULT '',
user_agent TEXT NOT NULL DEFAULT '',
issued_at TIMESTAMPTZ NOT NULL DEFAULT now(),
last_seen TIMESTAMPTZ NOT NULL DEFAULT now(),
expires_at TIMESTAMPTZ NOT NULL,
revoked_at TIMESTAMPTZ
);
CREATE INDEX idx_admin_sessions_admin ON admin_sessions(admin_id);
CREATE INDEX idx_admin_sessions_expires ON admin_sessions(expires_at);
Notes:
- Password hashing is bcrypt with the $2a$ prefix. The admin process owns hashing and verification; the control plane simply stores what it's given.
- Sessions are idle 60 m / absolute 8 h by default (admin-side policy). The cookie value is a random token; the DB stores its sha256 hex so a DB leak doesn't leak active sessions.
- revoked_at lets the admin process revoke a session out-of-band without deleting the audit trail.
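The cookie/token_hash relationship from the notes above can be illustrated as follows. This is a sketch under assumptions: newSessionToken and hashToken are hypothetical names, and the 32-byte token size and base64url encoding are choices made here, not documented properties of the admin process.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// hashToken returns the sha256 hex digest of a cookie value, i.e. the form
// stored in admin_sessions.token_hash.
func hashToken(cookie string) string {
	sum := sha256.Sum256([]byte(cookie))
	return hex.EncodeToString(sum[:])
}

// newSessionToken returns a random cookie value plus the digest to store.
// Only the digest goes to the database; the cookie goes to the browser.
func newSessionToken() (cookie, tokenHash string, err error) {
	buf := make([]byte, 32) // 256 bits of randomness (size is an assumption)
	if _, err = rand.Read(buf); err != nil {
		return "", "", err
	}
	cookie = base64.RawURLEncoding.EncodeToString(buf)
	return cookie, hashToken(cookie), nil
}

func main() {
	cookie, tokenHash, err := newSessionToken()
	if err != nil {
		fmt.Println("rand error:", err)
		return
	}
	// On each request the server hashes the presented cookie and looks the
	// digest up; the raw cookie is never persisted.
	fmt.Println(hashToken(cookie) == tokenHash) // true
	fmt.Println(len(tokenHash))                 // 64 hex characters
}
```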
Roles (admin_role_grants)
The earlier migration 0006_admin_roles.up.sql introduced role grants. That original migration linked user_id → users.id; the rewrite (0008) keeps the table conceptually but points user_id at admin_accounts.id (the column name was kept generic on purpose).
CREATE TABLE admin_role_grants (
user_id BIGINT NOT NULL, -- now: admin_accounts.id
role TEXT NOT NULL, -- super_admin | support | billing | read_only
granted_by BIGINT NOT NULL, -- admin who issued the grant
granted_at TIMESTAMPTZ NOT NULL DEFAULT now(),
expires_at TIMESTAMPTZ,
reason TEXT NOT NULL DEFAULT '',
PRIMARY KEY (user_id, role)
);
The role column is typed TEXT, not an enum, on purpose: the role list is closed and code-driven (handlers say "I require support"). A DB-side enum would couple migrations to handler code; with TEXT you add a new role by deploying new code, no migration required.
expires_at (nullable) supports time-bounded grants for break-glass sessions: e.g. a one-week super_admin grant for an on-call rotation that auto-expires without a manual revoke.
Migration narrative
| Migration | Purpose | Notes |
|---|---|---|
| 0006_admin_roles | Introduce admin_role_grants linked to users | First pass: re-used the users row for admin identity. |
| 0007_admin_sessions | Add admin_sessions linked to users | Same approach: cookie sessions for admin actions. |
| 0008_admin_accounts | Drop both and recreate with a separate admin_accounts table | Decision was that admin identity should be fully decoupled from users: different lifecycle, different password hashing, different recovery story. |
The down.sql for 0008 reverts to 0007's shape so a rollback is possible during the switchover; once 0008 is the production schema, the 0007 shape is unreachable.
Putting it together
When an operator opens the admin "Worker X containers" page:
- Browser → wisehosting-admin :443 with admin session cookie.
- wisehosting-admin calls GET http://127.0.0.1:9091/internal-api/workers/123/podman/ps with the bearer token.
- internal-api checks the bearer, looks up worker 123 in the hub, returns 503 if offline.
- Hub.WorkerQuery picks a fresh QueryID, sends TypeAdminQuery{QueryID, "ps"} over WSS, blocks on a channel.
- Worker runs podman ps --format json, sends TypeAdminQueryReply{QueryID, Output, Err} back.
- Hub resolves the channel; the JSON output streams back through internal-api to the admin sibling, which renders it.
Total trust stack: bearer token (admin → CP), WSS HMAC + JWT (CP → worker), and the podman whitelist (worker policy). Each layer is independent, so compromising one doesn't grant the next.