Admin subsystem

How the wisehosting-admin sibling reads live operational state — the loopback internal-API, admin RPC over WSS, and the admin_accounts identity surface.

WiseHosting splits operator tooling into a separate process called wisehosting-admin rather than mixing it into the user-facing dashboard. The split is for security:

Different identity surface — admin login uses bcrypt-hashed credentials in admin_accounts, not OAuth into users. A compromised user account never grants admin access.
Different network exposure — admin lives on a private interface; the user-facing API never has admin endpoints to begin with.
Different audit trail — admin actions are logged separately from user actions.

The admin process talks to the control plane over a localhost-only HTTP listener (the internal-api). For live worker introspection, the control plane forwards admin RPCs over the WSS hub to the target worker, which executes only a hard-coded whitelist of read-only podman commands.

Three things to keep in mind

The internal-api binds only to 127.0.0.1, so it is not reachable from other hosts unless something on the control-plane host proxies to it.
Authentication is a single bearer token shared with the admin process via systemd LoadCredential.
The worker-side admin command set is a closed whitelist of 5 read-only podman queries. Workers never accept arbitrary shell.

Layout

control-plane host
├── wisehosting           # main control plane (this repo)
│   ├── public  :8081     # dashboard / OAuth / REST / WSS hub
│   └── loopback :9091    # internal-api  ← admin reads live state from here
└── wisehosting-admin     # sibling process (different repo)
    ├── public :443       # admin login (bcrypt + optional TOTP)
    └── reads localhost:9091 with bearer token

each worker
└── wisehosting-worker    # receives TypeAdminQuery envelopes over WSS
                          # runs whitelisted podman commands; replies with TypeAdminQueryReply

The loopback `internal-api`

internal/api/internal_api.go exposes a small HTTP API the admin sibling uses to read state that doesn't live in Postgres — live stats, log buffers, the WSS hub's connection map.

Method	Path	What it returns
`GET`	`/internal-api/dashboard-stats`	Apps online/total, deployments today, error counts, plus 24h-ago deltas for trend arrows.
`GET`	`/internal-api/dashboard-extras`	Recent deployments, activity feed, node breakdown, alerts, system status, active-alert count — packaged together to avoid a fan-out from the admin UI.
`GET`	`/internal-api/dashboard-deployments-timeseries`	Per-bucket deploy success/failure counts for the dashboard chart.
`GET`	`/internal-api/workers/live`	Current worker connection state from the WSS hub: which workers are connected, last heartbeat, busy vs idle.
`GET`	`/internal-api/apps/:id/live-stats`	CPU%, memory bytes, network MB/s for one app, pulled from the in-process stats cache.
`GET`	`/internal-api/apps/:id/logs/tail`	Last N lines from the per-app `internal/logbus` ring buffer.
`GET`	`/internal-api/apps/recent-logs`	Tail of the in-process log bus across every app — used for the admin "Logs" page.
`GET`	`/internal-api/workers/:id/podman/:cmd`	Sends a `TypeAdminQuery` envelope to that worker over WSS, waits for the reply. `:cmd` must be one of `ps`, `ps_all`, `network_ls`, `volume_ls`, `image_ls`.

Authentication

All requests must carry Authorization: Bearer <token>. The middleware uses crypto/subtle.ConstantTimeCompare so timing oracles can't leak the token a byte at a time:

func (s *InternalServer) requireToken(c *gin.Context) {
    h := c.GetHeader("Authorization")
    if subtle.ConstantTimeCompare([]byte(h), []byte("Bearer "+s.token)) != 1 {
        c.AbortWithStatus(http.StatusUnauthorized)
        return
    }
}
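On the calling side, the admin sibling only needs to attach the header. The following is a minimal sketch using the standard library; `newInternalRequest` and the example token are hypothetical and are not taken from the sibling's actual client code.

```go
package main

import (
	"fmt"
	"net/http"
)

// newInternalRequest builds a GET request against the loopback internal-api
// and attaches the bearer token. Hypothetical helper, for illustration only.
func newInternalRequest(baseURL, path, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+path, nil)
	if err != nil {
		return nil, err
	}
	// The middleware compares the whole header value, so the "Bearer "
	// prefix has to match exactly.
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	req, err := newInternalRequest("http://127.0.0.1:9091", "/internal-api/workers/live", "example-token")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```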

Configuration

In config.yaml:

internal_api:
  bind: "127.0.0.1:9091"   # default; override only to bind to a different loopback alias
  token: ""                 # leave empty here — load via systemd

The systemd unit ships:

LoadCredential=internal_api_token:/etc/credstore/wisehosting/internal_api_token

…and config.go's loadCredential("internal_api_token") overlay overrides whatever's in YAML. An empty token disables the listener entirely — main.go logs "internal-api disabled — set internal_api.token (or LoadCredential=internal_api_token)" and skips internalSrv.Run(...).

Helpers

pctDelta(cur, prev int64) float64 returns the percentage change for trend arrows in dashboard-stats. Returns 0 when prev <= 0 to avoid NaN and infinite-growth arrows on cold-start.
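A minimal sketch of such a helper, assuming the conventional percentage-change formula `(cur - prev) / prev * 100`; only the signature and the `prev <= 0` guard come from the description above.

```go
package main

import "fmt"

// pctDelta returns the percentage change from prev to cur.
// It returns 0 when prev <= 0, so a cold start never produces NaN or Inf.
func pctDelta(cur, prev int64) float64 {
	if prev <= 0 {
		return 0
	}
	return float64(cur-prev) * 100 / float64(prev)
}

func main() {
	fmt.Println(pctDelta(150, 100)) // growth
	fmt.Println(pctDelta(80, 100))  // decline
	fmt.Println(pctDelta(5, 0))     // cold start: guarded
}
```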

Admin RPC over WSS — `TypeAdminQuery`

When the admin sibling needs to peek at a worker's running containers, the control plane forwards the request to the right worker over the existing WSS hub. Two new envelope types carry it:

Type constant	Direction	Payload
`TypeAdminQuery = "admin_query"`	CP → worker	`AdminQueryPayload { QueryID, Command string }`
`TypeAdminQueryReply = "admin_query_reply"`	worker → CP	`AdminQueryReplyPayload { QueryID, Command, Output, Err string }`

Both share the standard wsproto envelope: HMAC-signed with sha256(api_key), sequence-numbered, timestamp-bounded. See Worker & WSS for the envelope details.
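To make the signing idea concrete, here is a rough sketch of a payload plus an HMAC-SHA256 keyed with `sha256(api_key)`. The JSON tags, the `sign`/`verify` helpers, and the choice to sign the marshalled payload bytes are all assumptions for illustration; the real wsproto envelope also carries sequence numbers and timestamps, which this sketch leaves out.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

const (
	TypeAdminQuery      = "admin_query"
	TypeAdminQueryReply = "admin_query_reply"
)

// AdminQueryPayload mirrors the fields in the table; the JSON tags are assumed.
type AdminQueryPayload struct {
	QueryID string `json:"query_id"`
	Command string `json:"command"`
}

// mac computes HMAC-SHA256 over body, keyed with sha256(apiKey).
func mac(apiKey string, body []byte) []byte {
	key := sha256.Sum256([]byte(apiKey))
	m := hmac.New(sha256.New, key[:])
	m.Write(body)
	return m.Sum(nil)
}

// sign returns the hex-encoded signature for body.
func sign(apiKey string, body []byte) string {
	return hex.EncodeToString(mac(apiKey, body))
}

// verify recomputes the signature and compares it in constant time.
func verify(apiKey string, body []byte, sig string) bool {
	want, err := hex.DecodeString(sig)
	if err != nil {
		return false
	}
	return hmac.Equal(mac(apiKey, body), want)
}

func main() {
	body, err := json.Marshal(AdminQueryPayload{QueryID: "q-1", Command: "ps"})
	if err != nil {
		panic(err)
	}
	sig := sign("worker-api-key", body)
	fmt.Println(string(body))
	fmt.Println(verify("worker-api-key", body, sig), verify("other-key", body, sig))
}
```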

CP side — `(*Hub) WorkerQuery`

func (h *Hub) WorkerQuery(ctx context.Context, workerID int, command string) (*wsproto.AdminQueryReplyPayload, error)

Generates a unique QueryID (q-{unixnano}-{outSeq}), registers a buffered channel in h.pendingQueries[queryID], sends the envelope, then waits for a reply on the channel or for ctx to cancel. Delivery of a reply is non-blocking: if no query is still waiting on that QueryID, the reply is dropped, so a stray reply for a query that already timed out cannot deadlock the dispatcher.

(*Hub) IsConnected(workerID int) bool lets the internal API short-circuit and return a 503 if the worker is offline, instead of hanging until ctx expires.

Worker side — `runAdminQuery`

internal/worker/agent.go recognises TypeAdminQuery and looks the command up in a map:

var adminQueryCommands = map[string][]string{
    "ps":         {"podman", "ps", "--format", "json"},
    "ps_all":     {"podman", "ps", "-a", "--format", "json"},
    "network_ls": {"podman", "network", "ls", "--format", "json"},
    "volume_ls":  {"podman", "volume", "ls", "--format", "json"},
    "image_ls":   {"podman", "image", "ls", "--format", "json"},
}

If the command isn't in the map, the worker replies with Err: "unknown command". Otherwise it runs the mapped argv via exec.CommandContext with a 10-second timeout and returns the combined output.

Why this is a closed whitelist

The worker process runs with privileged capabilities (podman requires them). If the worker accepted arbitrary commands, a compromised control plane would be game-over for the whole fleet. The whitelist guarantees only read-only inspection queries can ever cross the wire — destructive operations (rm, stop, kill) are deliberately absent.

The worker returns Output even when podman exits non-zero, so operators can see permission errors during debugging. Err is what distinguishes a transport-level problem (timeout, unknown command) from podman itself saying no.

Admin identity (`admin_accounts`)

Migration 0008_admin_accounts.up.sql creates the table and drops the earlier admin_role_grants/admin_sessions shape (which had been linked to users.id). Admin accounts are now a fully independent identity surface.

CREATE TABLE admin_accounts (
    id              SERIAL      PRIMARY KEY,
    username        TEXT        NOT NULL UNIQUE,
    password_hash   TEXT        NOT NULL,                  -- bcrypt $2a$
    totp_secret     TEXT,                                  -- nullable
    totp_enabled    BOOLEAN     NOT NULL DEFAULT FALSE,
    is_active       BOOLEAN     NOT NULL DEFAULT TRUE,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    last_login_at   TIMESTAMPTZ
);

CREATE TABLE admin_sessions (
    id              BIGSERIAL   PRIMARY KEY,
    admin_id        INT         NOT NULL REFERENCES admin_accounts(id) ON DELETE CASCADE,
    token_hash      TEXT        NOT NULL UNIQUE,           -- sha256 hex of the cookie
    ip              TEXT        NOT NULL DEFAULT '',
    user_agent      TEXT        NOT NULL DEFAULT '',
    issued_at       TIMESTAMPTZ NOT NULL DEFAULT now(),
    last_seen       TIMESTAMPTZ NOT NULL DEFAULT now(),
    expires_at      TIMESTAMPTZ NOT NULL,
    revoked_at      TIMESTAMPTZ
);

CREATE INDEX idx_admin_sessions_admin ON admin_sessions(admin_id);
CREATE INDEX idx_admin_sessions_expires ON admin_sessions(expires_at);

Notes:

Password hashing is bcrypt with the $2a$ prefix — the admin process owns hashing and verification, the control plane simply stores what it's given.
Sessions are idle 60 m / absolute 8 h by default (admin-side policy). The cookie value is a random token; the DB stores its sha256 hex so a DB leak doesn't leak active sessions.
revoked_at lets the admin process revoke a session out-of-band without deleting the audit trail.

Roles (`admin_role_grants`)

The earlier migration 0006_admin_roles.up.sql introduced role grants. That original migration linked user_id → users.id; the rewrite (0008) keeps the table conceptually but points user_id at admin_accounts.id (the column name was kept generic on purpose).

CREATE TABLE admin_role_grants (
    user_id     BIGINT      NOT NULL,                       -- now: admin_accounts.id
    role        TEXT        NOT NULL,                       -- super_admin | support | billing | read_only
    granted_by  BIGINT      NOT NULL,                       -- admin who issued the grant
    granted_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    expires_at  TIMESTAMPTZ,
    reason      TEXT        NOT NULL DEFAULT '',
    PRIMARY KEY (user_id, role)
);

The role column is typed TEXT, not an enum, on purpose — the role list is closed and code-driven (handlers say "I require support"). A DB-side enum would couple migrations to handler code; with TEXT you add a new role by deploying new code, no migration required.

expires_at (nullable) supports time-bounded grants for break-glass sessions: e.g. a one-week super_admin grant for an on-call rotation that auto-expires without a manual revoke.

Migration narrative

Migration	Purpose	Notes
`0006_admin_roles`	Introduce `admin_role_grants` linked to `users`	First pass: re-used the `users` row for admin identity.
`0007_admin_sessions`	Add `admin_sessions` linked to `users`	Same approach: cookie sessions for admin actions.
`0008_admin_accounts`	Drop both and recreate with a separate `admin_accounts` table	Decision was that admin identity should be fully decoupled from `users` — different lifecycle, different password hashing, different recovery story.

The down.sql for 0008 reverts to 0007's shape so a rollback is possible during the switchover; once 0008 is the production schema, the 0007 shape is unreachable.

Putting it together

When an operator opens the admin "Worker X containers" page:

Browser → wisehosting-admin :443 with admin session cookie.
wisehosting-admin calls GET http://127.0.0.1:9091/internal-api/workers/123/podman/ps with the bearer token.
internal-api checks the bearer, looks up worker 123 in the hub, returns 503 if offline.
Hub.WorkerQuery picks a fresh QueryID, sends TypeAdminQuery{QueryID, "ps"} over WSS, blocks on a channel.
Worker runs podman ps --format json, sends TypeAdminQueryReply{QueryID, Output, Err} back.
Hub resolves the channel; the JSON output streams back through internal-api to the admin sibling, which renders it.
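Steps 3 to 6 can be sketched as a single handler. This version uses net/http instead of gin so that it is self-contained, and several parts are assumptions: the `workerQuerier` interface returns a plain string rather than `*wsproto.AdminQueryReplyPayload`, the 502 on query failure is a guess, and `fakeHub` and `call` exist only to exercise the handler in memory.

```go
package main

import (
	"context"
	"crypto/subtle"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
	"strings"
)

// workerQuerier is a simplified stand-in for the hub.
type workerQuerier interface {
	IsConnected(workerID int) bool
	WorkerQuery(ctx context.Context, workerID int, command string) (string, error)
}

// podmanHandler serves /internal-api/workers/{id}/podman/{cmd}.
func podmanHandler(hub workerQuerier, token string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		got := r.Header.Get("Authorization")
		if subtle.ConstantTimeCompare([]byte(got), []byte("Bearer "+token)) != 1 {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		parts := strings.Split(strings.Trim(r.URL.Path, "/"), "/")
		if len(parts) != 5 || parts[1] != "workers" || parts[3] != "podman" {
			w.WriteHeader(http.StatusNotFound)
			return
		}
		id, err := strconv.Atoi(parts[2])
		if err != nil {
			w.WriteHeader(http.StatusBadRequest)
			return
		}
		if !hub.IsConnected(id) { // short-circuit instead of waiting on ctx
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		out, err := hub.WorkerQuery(r.Context(), id, parts[4])
		if err != nil {
			w.WriteHeader(http.StatusBadGateway) // assumed status code
			return
		}
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, out)
	}
}

// fakeHub simulates a hub whose workers are all online or all offline.
type fakeHub struct{ online bool }

func (f fakeHub) IsConnected(int) bool { return f.online }
func (f fakeHub) WorkerQuery(_ context.Context, _ int, cmd string) (string, error) {
	return `[{"cmd":"` + cmd + `"}]`, nil
}

// call issues one in-memory request and returns the status and body.
func call(h http.Handler, token string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/internal-api/workers/123/podman/ps", nil)
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := call(podmanHandler(fakeHub{online: true}, "secret"), "secret")
	fmt.Println(code, body)
}
```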

Total trust stack: bearer token (admin → CP), WSS HMAC + JWT (CP → worker), and the podman whitelist (worker policy). The layers are independent: compromising one does not by itself grant the next.
