WiseHosting
Reference

Admin subsystem

How the wisehosting-admin sibling reads live operational state — the loopback internal-API, admin RPC over WSS, and the admin_accounts identity surface.

WiseHosting splits operator tooling into a separate process called wisehosting-admin rather than mixing it into the user-facing dashboard. The split is for security:

  • Different identity surface — admin login uses bcrypt-hashed credentials in admin_accounts, not OAuth into users. A compromised user account never grants admin access.
  • Different network exposure — admin lives on a private interface; the user-facing API never has admin endpoints to begin with.
  • Different audit trail — admin actions are logged separately from user actions.

The admin process talks to the control plane over a localhost-only HTTP listener (the internal-api). For live worker introspection, the control plane forwards admin RPCs over the WSS hub to the target worker, which runs only commands from a hard-coded whitelist of read-only podman queries.

Three things to keep in mind

  1. The internal-api listens only on 127.0.0.1, so other hosts cannot connect to it directly. Any local process still can, which is why the bearer token matters.
  2. Authentication is a single bearer token shared with the admin process via systemd LoadCredential.
  3. The worker-side admin command set is a closed whitelist of 5 read-only podman queries. Workers never accept arbitrary shell.

Layout

control-plane host
├── wisehosting           # main control plane (this repo)
│   ├── public  :8081     # dashboard / OAuth / REST / WSS hub
│   └── loopback :9091    # internal-api  ← admin reads live state from here
└── wisehosting-admin     # sibling process (different repo)
    ├── public :443       # admin login (bcrypt + optional TOTP)
    └── reads localhost:9091 with bearer token

each worker
└── wisehosting-worker    # receives TypeAdminQuery envelopes over WSS
                          # runs whitelisted podman commands; replies with TypeAdminQueryReply

The loopback internal-api

internal/api/internal_api.go exposes a small HTTP API the admin sibling uses to read state that doesn't live in Postgres — live stats, log buffers, the WSS hub's connection map.

Method | Path | What it returns
GET | /internal-api/dashboard-stats | Apps online/total, deployments today, error counts, plus 24h-ago deltas for trend arrows.
GET | /internal-api/dashboard-extras | Recent deployments, activity feed, node breakdown, alerts, system status, active-alert count; packaged together to avoid a fan-out from the admin UI.
GET | /internal-api/dashboard-deployments-timeseries | Per-bucket deploy success/failure counts for the dashboard chart.
GET | /internal-api/workers/live | Current worker connection state from the WSS hub: which workers are connected, last heartbeat, busy vs idle.
GET | /internal-api/apps/:id/live-stats | CPU%, memory bytes, network MB/s for one app, pulled from the in-process stats cache.
GET | /internal-api/apps/:id/logs/tail | Last N lines from the per-app internal/logbus ring buffer.
GET | /internal-api/apps/recent-logs | Tail of the in-process log bus across every app; used for the admin "Logs" page.
GET | /internal-api/workers/:id/podman/:cmd | Sends a TypeAdminQuery envelope to that worker over WSS and waits for the reply. :cmd must be one of ps, ps_all, network_ls, volume_ls, image_ls.

Authentication

All requests must carry Authorization: Bearer <token>. The middleware uses crypto/subtle.ConstantTimeCompare so timing oracles can't leak the token a byte at a time:

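func (s *InternalServer) requireToken(c *gin.Context) {
    h := c.GetHeader("Authorization")
    // Compare the whole header value in constant time; 1 means equal.
    if subtle.ConstantTimeCompare([]byte(h), []byte("Bearer "+s.token)) != 1 {
        c.AbortWithStatus(http.StatusUnauthorized)
        return
    }
    // On a match, returning lets gin continue to the next handler.
}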
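On the calling side, the header has to match that comparison byte for byte. A minimal sketch of how a client might build such a request, assuming a hypothetical helper named newInternalRequest (not taken from the wisehosting-admin source) and a placeholder token:

```go
package main

import (
	"fmt"
	"net/http"
)

// newInternalRequest is a hypothetical helper: it builds a GET request for
// the loopback internal-api and attaches the bearer token.
func newInternalRequest(base, path, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, base+path, nil)
	if err != nil {
		return nil, err
	}
	// The server compares the full header value against "Bearer " + token,
	// so the scheme, the single space, and the token must all match exactly.
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	// "example-token" is a placeholder, not a real credential.
	req, err := newInternalRequest("http://127.0.0.1:9091", "/internal-api/workers/live", "example-token")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(req.Header.Get("Authorization"))
}
```

The request is only constructed here, not sent; passing it to an http.Client would be the next step.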

Configuration

In config.yaml:

internal_api:
  bind: "127.0.0.1:9091"   # default; override only to bind to a different loopback alias
  token: ""                 # leave empty here — load via systemd

The systemd unit ships:

LoadCredential=internal_api_token:/etc/credstore/wisehosting/internal_api_token

…and config.go's loadCredential("internal_api_token") overlay overrides whatever's in YAML. An empty token disables the listener entirely: main.go logs "internal-api disabled — set internal_api.token (or LoadCredential=internal_api_token)" and skips internalSrv.Run(...).
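The overlay can be pictured roughly as follows. This is a sketch, not the project's config.go: readCredential and overlayToken are hypothetical names, and the rule that the YAML value survives when no credential is present is an assumption. What it does rely on is that systemd exposes LoadCredential= files under the directory named by $CREDENTIALS_DIRECTORY.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// readCredential (hypothetical) reads a systemd credential by name.
// It returns "" when $CREDENTIALS_DIRECTORY is unset or the file is missing.
func readCredential(name string) string {
	dir := os.Getenv("CREDENTIALS_DIRECTORY")
	if dir == "" {
		return ""
	}
	b, err := os.ReadFile(filepath.Join(dir, name))
	if err != nil {
		return ""
	}
	// Credential files often end with a newline; strip surrounding whitespace.
	return strings.TrimSpace(string(b))
}

// overlayToken (hypothetical) prefers the credential over the YAML value.
// Assumption: an absent credential leaves the YAML value untouched.
func overlayToken(yamlToken, credToken string) string {
	if credToken != "" {
		return credToken
	}
	return yamlToken
}

func main() {
	token := overlayToken("", readCredential("internal_api_token"))
	if token == "" {
		fmt.Println("internal-api disabled")
		return
	}
	fmt.Println("internal-api enabled")
}
```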

Helpers

pctDelta(cur, prev int64) float64 returns the percentage change for trend arrows in dashboard-stats. Returns 0 when prev <= 0 to avoid NaN and infinite-growth arrows on cold-start.
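A minimal sketch consistent with that description; the body is an assumption, and only the signature and the prev <= 0 rule come from the text above:

```go
package main

import "fmt"

// pctDelta returns the percentage change from prev to cur.
// It returns 0 when prev <= 0, so a cold start (no earlier data) does not
// produce NaN or an infinite-growth arrow.
func pctDelta(cur, prev int64) float64 {
	if prev <= 0 {
		return 0
	}
	// Multiply before dividing to keep simple cases exact in floating point.
	return float64(cur-prev) * 100 / float64(prev)
}

func main() {
	fmt.Println(pctDelta(150, 100)) // 50
	fmt.Println(pctDelta(80, 100))  // -20
	fmt.Println(pctDelta(5, 0))     // 0
}
```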

Admin RPC over WSS — TypeAdminQuery

When the admin sibling needs to peek at a worker's running containers, the control plane forwards the request to the right worker over the existing WSS hub. Two new envelope types carry it:

Type constant | Direction | Payload
TypeAdminQuery = "admin_query" | CP → worker | AdminQueryPayload { QueryID, Command string }
TypeAdminQueryReply = "admin_query_reply" | worker → CP | AdminQueryReplyPayload { QueryID, Command, Output, Err string }

Both share the standard wsproto envelope: HMAC-signed with sha256(api_key), sequence-numbered, timestamp-bounded. See Worker & WSS for the envelope details.

CP side — (*Hub) WorkerQuery

func (h *Hub) WorkerQuery(ctx context.Context, workerID int, command string) (*wsproto.AdminQueryReplyPayload, error)

Generates a unique QueryID (q-{unixnano}-{outSeq}), registers a buffered channel in h.pendingQueries[queryID], sends the envelope, then waits for a reply on the channel or for ctx to be cancelled. Delivery onto the reply channel is non-blocking: if no goroutine is waiting, the reply is dropped, so a stray reply for a query that already timed out cannot deadlock the dispatcher.

(*Hub) IsConnected(workerID int) bool lets the internal API short-circuit and return a 503 if the worker is offline, instead of hanging until ctx expires.
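The pending-query pattern described for WorkerQuery can be sketched in isolation. The types and names below (hub, reply, query, deliver, roundTrip) are simplified stand-ins, not the real Hub; the WSS send is replaced by a callback so the example runs on its own.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// reply is a stand-in for the admin query reply payload.
type reply struct {
	QueryID string
	Output  string
	Err     string
}

// hub is a stand-in holding only the pending-query map.
type hub struct {
	mu      sync.Mutex
	pending map[string]chan reply
}

func newHub() *hub { return &hub{pending: make(map[string]chan reply)} }

// query registers a waiter, triggers the send, then waits for a reply or ctx.
func (h *hub) query(ctx context.Context, queryID string, send func()) (reply, error) {
	ch := make(chan reply, 1) // buffered: the dispatcher never blocks on us
	h.mu.Lock()
	h.pending[queryID] = ch
	h.mu.Unlock()
	defer func() {
		h.mu.Lock()
		delete(h.pending, queryID)
		h.mu.Unlock()
	}()

	send()
	select {
	case r := <-ch:
		return r, nil
	case <-ctx.Done():
		return reply{}, ctx.Err()
	}
}

// deliver hands a reply to its waiter without blocking.
// It reports false when the reply was dropped (no waiter, or buffer full).
func (h *hub) deliver(r reply) bool {
	h.mu.Lock()
	ch, ok := h.pending[r.QueryID]
	h.mu.Unlock()
	if !ok {
		return false
	}
	select {
	case ch <- r:
		return true
	default:
		return false
	}
}

// roundTrip simulates a worker that answers from another goroutine.
func roundTrip(h *hub, queryID, output string) (reply, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return h.query(ctx, queryID, func() {
		go h.deliver(reply{QueryID: queryID, Output: output})
	})
}

func main() {
	h := newHub()
	r, err := roundTrip(h, "q-1", "[]")
	fmt.Println(r.Output, err)
	// The waiter has been removed, so a late reply is dropped, not queued.
	fmt.Println(h.deliver(reply{QueryID: "q-1"}))
}
```

The buffered channel of size one plus the select/default in deliver is what keeps a late or duplicate reply from stalling the goroutine that reads the socket.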

Worker side — runAdminQuery

internal/worker/agent.go recognises TypeAdminQuery and looks the command up in a map:

var adminQueryCommands = map[string][]string{
    "ps":         {"podman", "ps", "--format", "json"},
    "ps_all":     {"podman", "ps", "-a", "--format", "json"},
    "network_ls": {"podman", "network", "ls", "--format", "json"},
    "volume_ls":  {"podman", "volume", "ls", "--format", "json"},
    "image_ls":   {"podman", "image", "ls", "--format", "json"},
}

If the command isn't in the map, the worker replies with Err: "unknown command". Otherwise it runs the mapped argv via exec.CommandContext with a 10-second timeout and returns the combined output (stdout and stderr).
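A sketch of how that lookup-then-exec path might look. The function names and the exact mapping of failures onto Err are assumptions; in particular, leaving Err empty on a non-zero podman exit, and the "timeout" string, are guesses at the behaviour rather than the worker's actual code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// Same table as in the text above; the map key is the only caller input.
var adminQueryCommands = map[string][]string{
	"ps":         {"podman", "ps", "--format", "json"},
	"ps_all":     {"podman", "ps", "-a", "--format", "json"},
	"network_ls": {"podman", "network", "ls", "--format", "json"},
	"volume_ls":  {"podman", "volume", "ls", "--format", "json"},
	"image_ls":   {"podman", "image", "ls", "--format", "json"},
}

// lookupAdminCommand (hypothetical) resolves a command name to a fixed argv.
func lookupAdminCommand(command string) ([]string, bool) {
	argv, ok := adminQueryCommands[command]
	return argv, ok
}

// runAdminQuery (sketch) returns the combined output and an error string.
func runAdminQuery(parent context.Context, command string) (output, errText string) {
	argv, ok := lookupAdminCommand(command)
	if !ok {
		return "", "unknown command"
	}
	ctx, cancel := context.WithTimeout(parent, 10*time.Second)
	defer cancel()

	// No shell is involved: argv comes verbatim from the table.
	out, err := exec.CommandContext(ctx, argv[0], argv[1:]...).CombinedOutput()
	var exitErr *exec.ExitError
	switch {
	case err == nil:
		return string(out), ""
	case errors.Is(ctx.Err(), context.DeadlineExceeded):
		return string(out), "timeout"
	case errors.As(err, &exitErr):
		// Assumption: podman ran and refused; its message is in the output.
		return string(out), ""
	default:
		// For example, the podman binary is not installed.
		return string(out), err.Error()
	}
}

func main() {
	out, errText := runAdminQuery(context.Background(), "rm")
	fmt.Printf("output=%q err=%q\n", out, errText)
}
```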

Why this is a closed whitelist

The worker process runs with privileged capabilities (podman requires them). If the worker accepted arbitrary commands, a compromised control plane would be game-over for the whole fleet. The whitelist guarantees only read-only inspection queries can ever cross the wire — destructive operations (rm, stop, kill) are deliberately absent.

The worker returns Output even on partial failure (non-zero exit) so operators can see permission errors during debugging — only Err distinguishes a transport-level problem (timeout, unknown command) from podman saying no.

Admin identity (admin_accounts)

Migration 0008_admin_accounts.up.sql creates the table and drops the earlier admin_role_grants/admin_sessions shape (which had been linked to users.id). Admin accounts are now a fully independent identity surface.

CREATE TABLE admin_accounts (
    id              SERIAL      PRIMARY KEY,
    username        TEXT        NOT NULL UNIQUE,
    password_hash   TEXT        NOT NULL,                  -- bcrypt $2a$
    totp_secret     TEXT,                                  -- nullable
    totp_enabled    BOOLEAN     NOT NULL DEFAULT FALSE,
    is_active       BOOLEAN     NOT NULL DEFAULT TRUE,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    last_login_at   TIMESTAMPTZ
);

CREATE TABLE admin_sessions (
    id              BIGSERIAL   PRIMARY KEY,
    admin_id        INT         NOT NULL REFERENCES admin_accounts(id) ON DELETE CASCADE,
    token_hash      TEXT        NOT NULL UNIQUE,           -- sha256 hex of the cookie
    ip              TEXT        NOT NULL DEFAULT '',
    user_agent      TEXT        NOT NULL DEFAULT '',
    issued_at       TIMESTAMPTZ NOT NULL DEFAULT now(),
    last_seen       TIMESTAMPTZ NOT NULL DEFAULT now(),
    expires_at      TIMESTAMPTZ NOT NULL,
    revoked_at      TIMESTAMPTZ
);

CREATE INDEX idx_admin_sessions_admin ON admin_sessions(admin_id);
CREATE INDEX idx_admin_sessions_expires ON admin_sessions(expires_at);

Notes:

  • Password hashing is bcrypt with the $2a$ prefix — the admin process owns hashing and verification, the control plane simply stores what it's given.
  • Sessions default to a 60-minute idle timeout and an 8-hour absolute lifetime (admin-side policy). The cookie value is a random token; the DB stores only its sha256 hex digest, so a DB leak does not expose usable session cookies.
  • revoked_at lets the admin process revoke a session out-of-band without deleting the audit trail.
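The token_hash scheme from the notes above can be illustrated with the standard library. This is a sketch under assumptions: the 32-byte token length and the helper names newSessionToken and hashToken are illustrative, not taken from the admin process.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// newSessionToken (hypothetical) returns a random cookie value.
// Assumption: 32 random bytes, hex-encoded to 64 characters.
func newSessionToken() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// hashToken returns the sha256 hex digest that would be stored in
// admin_sessions.token_hash; the raw token only ever lives in the cookie.
func hashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

func main() {
	token, err := newSessionToken()
	if err != nil {
		panic(err)
	}
	// Both the token and its digest are 64 hex characters long.
	fmt.Println(len(token), len(hashToken(token)))
}
```

On each request the server would hash the presented cookie and look the digest up by the UNIQUE token_hash column, so the raw token never needs to be stored.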

Roles (admin_role_grants)

An earlier migration, 0006_admin_roles.up.sql, introduced role grants. It originally linked user_id → users.id; the rewrite (0008) keeps the table conceptually but points user_id at admin_accounts.id (the column name was kept generic on purpose).

CREATE TABLE admin_role_grants (
    user_id     BIGINT      NOT NULL,                       -- now: admin_accounts.id
    role        TEXT        NOT NULL,                       -- super_admin | support | billing | read_only
    granted_by  BIGINT      NOT NULL,                       -- admin who issued the grant
    granted_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    expires_at  TIMESTAMPTZ,
    reason      TEXT        NOT NULL DEFAULT '',
    PRIMARY KEY (user_id, role)
);

The role column is typed TEXT, not an enum, on purpose — the role list is closed and code-driven (handlers say "I require support"). A DB-side enum would couple migrations to handler code; with TEXT you add a new role by deploying new code, no migration required.

expires_at (nullable) supports time-bounded grants for break-glass sessions: e.g. a one-week super_admin grant for an on-call rotation that auto-expires without a manual revoke.
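A handler-side check for time-bounded grants might look like the sketch below. The grant struct and the hasRole and at helpers are hypothetical; they only mirror the role and expires_at columns, with a nil ExpiresAt standing in for SQL NULL.

```go
package main

import (
	"fmt"
	"time"
)

// grant mirrors the two columns relevant to an access check.
type grant struct {
	Role      string
	ExpiresAt *time.Time // nil means the grant never expires
}

// hasRole reports whether any grant for role is still valid at now.
func hasRole(grants []grant, role string, now time.Time) bool {
	for _, g := range grants {
		if g.Role != role {
			continue
		}
		if g.ExpiresAt == nil || now.Before(*g.ExpiresAt) {
			return true
		}
	}
	return false
}

// at parses an RFC 3339 timestamp and panics on malformed input (demo helper).
func at(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	// A one-week super_admin grant next to a permanent support grant.
	exp := at("2024-01-08T00:00:00Z")
	grants := []grant{
		{Role: "support"},
		{Role: "super_admin", ExpiresAt: &exp},
	}
	fmt.Println(hasRole(grants, "super_admin", at("2024-01-03T00:00:00Z"))) // true
	fmt.Println(hasRole(grants, "super_admin", at("2024-01-09T00:00:00Z"))) // false
	fmt.Println(hasRole(grants, "support", at("2024-01-09T00:00:00Z")))     // true
}
```

Because expiry is evaluated at check time, an expired row needs no cleanup job to stop granting access; it can stay in the table as a record of who held the role and why.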

Migration narrative

Migration | Purpose | Notes
0006_admin_roles | Introduce admin_role_grants linked to users | First pass: re-used the users row for admin identity.
0007_admin_sessions | Add admin_sessions linked to users | Same approach: cookie sessions for admin actions.
0008_admin_accounts | Drop both and recreate with a separate admin_accounts table | Decision was that admin identity should be fully decoupled from users: different lifecycle, different password hashing, different recovery story.

The down.sql for 0008 reverts to 0007's shape so a rollback is possible during the switchover; once 0008 is the production schema, the 0007 shape is unreachable.

Putting it together

When an operator opens the admin "Worker X containers" page:

  1. Browser → wisehosting-admin :443 with admin session cookie.
  2. wisehosting-admin calls GET http://127.0.0.1:9091/internal-api/workers/123/podman/ps with the bearer token.
  3. internal-api checks the bearer, looks up worker 123 in the hub, returns 503 if offline.
  4. Hub.WorkerQuery picks a fresh QueryID, sends TypeAdminQuery{QueryID, "ps"} over WSS, blocks on a channel.
  5. Worker runs podman ps --format json, sends TypeAdminQueryReply{QueryID, Output, Err} back.
  6. Hub resolves the channel; the JSON output streams back through internal-api to the admin sibling, which renders it.

Total trust stack: a bearer token (admin → CP), WSS HMAC + JWT (CP → worker), and the podman whitelist (worker policy). The layers are independent: compromising one does not by itself grant the next.
