Changelog · Halton Meter Docs

v0.4.1 · 2026-06-08

Adds pricing for Cursor Composer 2.5 (Fast tier), which the existing Grok-CLI adapter already captured but left unpriced.

Added

Cursor Composer 2.5 pricing. Composer 2.5 (Fast tier) is served via the grok CLI on cli-chat-proxy.grok.com/v1/responses as model grok-composer-2.5-fast (api_backend=responses, agent_type=cursor), so it was already captured by the existing Grok-CLI adapter; it was just unpriced (NULL cost). Added the operator-supplied Fast-tier rate ($3.00 / $15.00 per MTok input / output); usage attributes to the xAI / Grok-CLI surface in reports.

v0.4.0 · 2026-06-08

First-class Linux and Windows support, plus cross-platform safety fixes. Driven by a multi-agent audit of the per-OS install/uninstall/lifecycle paths. macOS behaviour is unchanged except for the L-M7 stable-bundle adoption and the X1/X2 safety fixes, which apply to all platforms. Linux/Windows remain validated by mocked unit tests only; real-hardware soak is the next gate (a non-blocking macOS + windows-latest CI axis was added to start exercising them).

Added

Non-blocking macos-latest + windows-latest CI axes; a central test fixture that heals cross-test PEP-562 dispatch-package monkeypatch leaks.

Fixed

Cross-platform (all OSes): uninstall --purge could destroy the cost database (X1). When $TMPDIR/%TEMP% was on a different volume than $HOME, the move-aside of db.sqlite raised a cross-volume EXDEV, the file was skipped from the preserve list, and the subsequent wipe deleted it. The purge now stages on the same volume and hard-aborts (restoring and refusing the wipe) if any DB file cannot be moved aside intact.
Cross-platform (all OSes): Storage migration robustness. A covering-index migration step ran CREATE INDEX … ON request_bodies before that table existed on very old databases, rolling the whole migration back; it is now guarded. The migration chain is now fail-stop: a step that swallows a transient ALTER failure halts the chain instead of letting later steps advance user_version past the gap and permanently mask the missing column.
Linux: init --apps actually meters now. Previously it wired no HTTPS_PROXY and no cert trust (all macOS-gated) while the success panel claimed success, a silent no-op. It now persists the proxy + CA env via ~/.config/environment.d/ and a consent-gated shell-rc block (both pointing at a stable, venv-independent CA bundle at ~/.halton-meter/cacert.pem), makes a best-effort system-trust install (update-ca-certificates / p11-kit / update-ca-trust, admin-free degradation), and the post-install self-test now validates the Linux wiring.
Linux: systemd lifecycle repaired. StartLimitIntervalSec=0 on the units (a crash-looping edge no longer latches into a permanent unbound-port outage); start brings up daemon + edge (not a phantom watchdog) and reset-faileds first; status reports the edge with a STOPPED fail-open rescue and exits 0 on a healthy box; stop leaves the edge bound; install survives non-systemd distros.
Windows: The daemon no longer hijacks the machine-wide WinINET proxy. It is now sentinel-gated (apps mode is a no-op, apps inherit HTTPS_PROXY from HKCU\Environment), listener-guarded, and snapshots the user's prior proxy before any change, restoring it on uninstall instead of stranding a dead-port proxy.
Windows: Install establishes TLS trust (generates the CA, trusts it in the Windows store, patches certifi, writes the full env incl. NODE_EXTRA_CA_CERTS) so Claude Code (Node) and Python SDKs no longer hard-fail TLS. Uninstall untrusts the root CA by thumbprint (it previously passed a file path to certutil -delstore, leaving the universal MITM root trusted forever). init now installs (was a no-op stub); real start/stop/status; schtasks quoting survives spaced paths; doctor gains a Windows branch; the CA private key is ACL-restricted.
Security: uninstall --include-logs now removes ~/.mitmproxy/ (the universal MITM CA cert and private key) on all OSes, and macOS untrusts the cert from the keychain on uninstall; previously the forge-any-TLS key survived an "uninstall."

v0.3.11 · 2026-06-07

Cloud body uploader policy-rejection handling, new pricing/body status commands, and a covering index for fast body upload status.

Added

CloudBodyPolicyRejected exception type. New typed exception (errors.py) for the 200 + stored=false case, distinct from transient failures (retry with backoff), quarantine targets (terminal 4xx), and auth errors (pause). Never quarantines, never backs off, never pauses; just skips the record and continues draining. Wires into the same exception-dispatch table as CloudNotReady (425) and CloudSchemaMismatch (4xx).
records_policy_rejected counter on BodyDrainResult. Tick-level count of records skipped due to workspace policy. Surfaced in run_forever as a WARNING log when non-zero: cloud.body_uploader.policy_rejected_tick with count and hint.
halton-meter cloud bodies status command. Per-day body upload progress table showing uploaded / pending counts with done and in-progress indicators. Queries local SQLite directly, no daemon or network needed. Accepts --days N to widen or narrow the window (default 30 days).
"Body last upload" row in halton-meter cloud status. Shows MAX(uploaded_at) from local SQLite, a real upload-activity signal independent of the tick stamp. During a heavy backlog drain (where the tick stamp goes stale while 400 POSTs run for 10+ minutes) this row stays green and prevents a false "uploader may be dead" alarm.

Changed

Performance: Covering index (captured_at, uploaded_at) on request_bodies (migration 13 to 14). cloud bodies status was taking 20-53 s because SQLite had to read every row's data pages (body text up to 1 MiB each) to retrieve two small timestamps. The covering index lets the query run entirely from index leaf pages: 17 ms vs 19 s (1000x). Existing installs get the index on first daemon start after upgrade.

Fixed

Cloud body uploader now respects workspace store_prompt_content policy. Previously, when a workspace had body storage disabled, the cloud backend returned 200 OK (accepted but silently discarded), and the daemon treated this identically to a successful 204 No Content and stamped uploaded_at, permanently losing the body. The uploader now distinguishes 204 (stored: mark and advance) from 200 + {"stored": false, "reason": "store_prompt_content_disabled"} (policy rejected: do NOT mark uploaded_at). Bodies rejected by workspace policy remain pending and are automatically retried when the policy is re-enabled, no manual reset required.
Body uploader staleness threshold raised 90 s to 300 s. The prior threshold (3 × 30 s metadata interval) fired false positives during any non-trivial backlog drain. The body tick row now also clears to ● when uploads are actively flowing within the threshold window, even if the cycle-start stamp is older.

v0.3.9 · 2026-06-07

Attribution-correctness fixes: opportunistic eviction TTL, and auto-classification so new AI tools no longer fall to misc.

Fixed

Opportunistic eviction TTL corrected (attribution correctness). The attribution_store write path called evict_stale() with the 300 s default instead of the daemon prune-loop's 86 400 s (24 h). On a busy machine, any attribution row older than 5 minutes could be silently evicted mid-session, causing the remainder of a long Claude Code session to fall through to misc. Fixed to pass max_age_s=86400.0 explicitly, matching the prune loop.
Devin and other new AI tools no longer fall to misc. Sandboxed/Electron helper processes (Devin, future agents) report cwd=/, which exhausts every cwd-based attribution tier. Two fixes: (1) devin added to the default process_mappings; (2) new auto_classify_unknown_processes config flag (default True), when a process name is found via psutil but is not in process_mappings, the daemon derives a slug from the process name (e.g. Devin Helper (Plugin) becomes devin-helper-plugin) and uses it as the project label instead of falling through to misc. Unknown new tools are now attributed automatically; process_mappings overrides still win.

v0.3.8 · 2026-06-05

Pricing rate auto-pull (fetch + apply), default-off, so a rate change no longer requires a daemon version bump and PyPI republish. Includes fail-open behaviour and audit hardening folded into the unreleased feature.

Added

Pricing rate auto-pull (fetch + apply), default-off. The daemon can now refresh its pricing_rates table from the cloud's published rates out-of-band. A new supervised background worker (pricing.worker) runs under the existing CloudWorkerSupervisor (respawn + bounded backoff + per-tick heartbeat) and, every [pricing].pull_interval (default 6h, floor 1h), fetches the published rates over one of two channels: paired daemons use the authenticated cloud GET /v1/pricing/rates; unpaired daemons fall back to a public [pricing].pull_url, then run the same pure validate (pricing/pull.py) + upsert (pricing_cli.apply_rates, source='auto') path. No signing / no new dependency. Default OFF ([pricing].auto_pull = false); opt-in this release, with HALTON_METER_NO_RATES_PULL=1 as a hard opt-out.
halton-meter pricing refresh, one-shot fetch + validate + apply against the DB now (prints updated / skipped-operator-override / unchanged).
halton-meter pricing status, shows auto-pull on/off, pull URL, the channel that would be used, last pull timestamp, the applied watermark (version/date), active-row counts by source (bundled/auto/operator), and the last error.
halton-meter pricing auto-pull on|off, flips [pricing].auto_pull in config.toml.
Fail-open: Auto-pull is fully fail-soft. auto_pull=false never spawns the worker; a network/timeout/non-2xx fetch logs pricing.pull_failed, stamps last_error, and keeps the last rates; a malformed body logs pricing.pull_invalid and leaves the DB untouched; a stale-vs-bundled or downgrade payload is a no-op; an apply_rates failure rolls the whole batch back; a worker death is respawned by the supervisor. Outbound HTTP uses httpx.Client(trust_env=False). The cost path itself makes no network call and is unchanged.

Fixed

Hardening (audit fixes): Relative >10x fat-finger guard on auto-pulls. The coarse absolute $0-1000/MTok ceiling let a published $300-where-$3.00-was-meant typo (100x) pass (300 < 1000). apply_rates now rejects the whole auto pull (all-or-nothing, transaction rolled back, last_error stamped) if any priced field moves more than 10x in either direction versus the current active row. The guard is source='auto' only, bundled (human-reviewed wheel) and operator (pricing set) writes are exempt. A 0-to-nonzero or nonzero-to-0 move on input/output also trips the guard.
Hardening (audit fixes): Date-aware bundled-vs-auto precedence (silent-downgrade fix). The newer-auto guard was date-blind: it protected an auto row unconditionally, so after a wheel upgrade a stale auto row could shadow newer bundled rates. Precedence is now: operator always wins; within the non-operator tier the newer provenance date wins. A newer-dated bundled refresh supersedes a stale auto row; a fresher auto row still beats an older bundled refresh.
Hardening (audit fixes): Bundled refresh no longer pollutes the auto-pull heartbeat. A source='bundled' refresh (runs on every fresh install + wheel upgrade via refresh_bundled_if_newer) now updates only the provenance watermark (applied_bundled_date/applied_rates_version); it no longer stamps last_pull_at or clears last_error. Those belong to the auto-pull worker, so pricing status on a never-pulled machine no longer shows a phantom "Last pull at" or silently clears a prior auto-pull error. Only source='auto' touches them.
Hardening (audit fixes): HTTPS-only enforced on the rates fetch. Channel 2 ships unsigned this release, so plain HTTPS + sanity bounds + the monotonic gate are the entire transport trust story. fetch_rates now refuses any non-https:// resolved URL before sending (an http:// Channel-1 endpoint would have leaked the pairing token as a cleartext Authorization: Bearer), and [pricing].pull_url is validated at config-load (fail-open: an unsafe value is dropped to no Channel-2 fetch, never crashes the daemon).
Hardening (audit fixes): Deterministic startup jitter. The fleet-spread startup jitter now derives from a SHA-256 of the machine fingerprint instead of the builtin hash() (which is PYTHONHASHSEED-salted and varied run-to-run), so the "reproducible per machine" property actually holds.

Removed

Removed a stray committed Claude Code session lock (.claude/scheduled_tasks.lock) and gitignored .claude/*.lock.

v0.3.7 · 2026-06-05

Bundled pricing matrix refreshed to the 2026-05-31 cloud canonical, correcting several stale or drifting per-token rates across Claude, OpenAI, Gemini, and Grok.

Changed

Bundled pricing matrix refreshed to the 2026-05-31 cloud canonical. BUNDLED_RATES_DATE bumped 2026-05-01 to 2026-05-31; rates-manifest.json bumped in lockstep. All values cross-referenced against the cloud single source of truth (halton-meter-cloud rates.json, effective 2026-05-31).
Grok rates refreshed (finding D-3). Added grok-4.3 / grok-4.3-latest (flagship $1.25 in / $2.50 out / $0.20 cache_read) and the grok-4.20-* family. Re-pointed legacy ids grok-4, grok-4-mini, grok-3, grok-3-fast, grok-2-vision-1212 to the grok-4.3 rate per the canonical. (grok-3-mini, grok-3-mini-fast are not in the canonical and are left unchanged; the grok-build CLI SKUs already matched.)
Added the missing Gemini Flash models (finding D-4). gemini-3.5-flash ($1.50/$9.00), gemini-3.1-flash-lite ($0.25/$1.50), gemini-2.5-flash ($0.30/$2.50), gemini-2.5-flash-lite ($0.10/$0.40), all with cache_write=0 per the Gemini caching convention.

Fixed

claude-opus-4-8 now prices correctly in the local report (finding D-1). The bundled rate matrix had no claude-opus-4-8 entry, so compute_cost_millicents('claude-opus-4-8', …) returned None and the terminal report showed ~$0 for today's flagship Claude model. Added claude-opus-4-8 and the dated alias claude-opus-4-8-20260219 at the published Standard-tier rate ($5 in / $25 out / $0.50 cache_read / $6.25 cache_write).
Stale OpenAI flagship rates corrected (finding D-2). gpt-5.5 was a ~2.5x undercount at $2/$8, corrected to its published $5 in / $30 out / $0.50 cache_read. gpt-5.4 corrected to $2.50/$15/$0.25. gpt-5.4-mini corrected from the gpt-4.1-mini mapping to its own published $0.75/$4.50/$0.075. (gpt-5.2, gpt-5.3-codex, codex-auto-review, codex-unknown keep their closest-tier mapping, still no published per-token rate.)
Four drifting Gemini standard-tier rates reconciled to the cloud canonical. Most critically, gemini-3-flash-preview (the on-the-wire Code Assist SKU) was a live 2x undercount, priced at $0.25/$1.50, exactly half its published Standard rate of $0.50 in / $3.00 out / $0.05 cache_read. Now corrected (and the gemini-code-assist-unknown fallback row that mirrors it). The other three were cache_read/cache_write drift: gemini-3.1-pro-preview cache_read $0.50 to $0.20; gemini-2.5-pro cache_read $0.31 to $0.125; gemini-3.1-flash-live-preview cache_read $0.075 to $0 (no published per-MTok context-caching line). All four now carry cache_write=0 per the Gemini caching convention. Input/output and the >200k tiered/modal rows were already correct and are unchanged.

v0.3.6 · 2026-06-02

Large request bodies no longer silently vanish from the cloud, with byte-accurate truncation at capture and upload time plus added headroom on the per-upload cap.

Changed

The default per-upload body cap now has headroom over the store cap. [cloud.bodies] max_body_bytes_per_upload defaults to 1 MiB (was 512 KiB, exactly equal to the 512 KiB store cap, which left zero headroom for UTF-8 inflation). Decoupling them makes the drop impossible on a fresh install even before the truncation fixes matter.
max_body_bytes_per_upload is re-read each drain tick. Previously captured once at worker spawn, so a cap change needed a daemon restart, inconsistent with the bodies.upload_from watermark, which already re-reads per tick. A cap change now takes effect on the next tick without a restart. No DB migration.

Fixed

Large request bodies no longer silently vanish from the cloud. When a captured body leg exceeded the per-upload size cap, the cloud body uploader logged skip_oversize_* and uploaded nothing for that leg, yet still stamped the row uploaded_at, so the dropped leg was never retried and permanently lost (the dashboard showed "Request body: Not captured" for large LLM requests while the smaller response leg uploaded fine). The uploader now truncates an oversized leg to the cap on a valid UTF-8 boundary and uploads it, never dropping it; the wire envelope keeps the true original size so the dashboard can render "truncated (N of M bytes)". A multibyte character straddling the cap is never split. The skip_oversize_* logs are replaced by upload_truncated_* (info).
Stored bodies no longer exceed the store cap in bytes. Capture-time truncation sliced the body by character count (text[:max_bytes]), so a multibyte body could be stored a few bytes-to-KB OVER [bodies] max_body_bytes in encoded UTF-8. It now truncates by encoded UTF-8 byte count on a char boundary, so a stored body never exceeds the cap. (This byte/char mismatch, combined with the equal default caps below, is what made oversized legs hit the upload drop in the first place.)

v0.3.5 · 2026-06-01

Timeline-selective body upload via a `bodies.upload-from` watermark, and a privacy-by-default change so enabling body sync uploads from now rather than backfilling all history.

Added

Timeline-selective body upload (bodies.upload-from watermark). The cloud body uploader can now be scoped to only upload request/response bodies captured on or after a chosen point in time, instead of always draining the full local history oldest-first. Set it with halton-meter cloud privacy set bodies.upload-from now (only bodies from here forward), … bodies.upload-from 2026-05-15 (an ISO date or datetime, normalised to UTC), or … bodies.upload-from none (upload all history, the legacy behaviour). Surfaced in cloud privacy show. Config-driven ([cloud.bodies] upload_from), re-read each drain tick so a change takes effect without a daemon restart; no DB migration. Fail-open: a malformed value falls back to no filter and logs.

Changed

Enabling body sync now defaults to "from now," not a full-history backfill. halton-meter cloud privacy set bodies.enabled true now stamps upload_from = <now> on the off-to-on transition and prints a notice, so turning body sync on uploads new traffic going forward rather than retroactively shipping the entire local body store. Already-enabled installs are unaffected (no retroactive change). Run cloud privacy set bodies.upload-from none to opt into uploading history. Rationale: privacy-by-default and no surprise bulk upload (see memory/decisions.md).

v0.3.4 · 2026-06-01

The continuous cloud-sync workers now self-recover under a supervisor instead of silently dying, with per-worker heartbeats so cloud status can distinguish a dead worker from an idle one.

Fixed

The continuous cloud-sync worker now recovers itself instead of silently dying. Both background workers (the metadata sync worker and the body uploader) were spawned once and never supervised: if the task exited for any reason (a stray cancellation, a BaseException, even a clean return) it died with no log line and was never respawned, so cloud sync stalled until the next daemon restart. This surfaced with no error and no failure-counter movement (Consecutive failures: 0, Last error: (none)), because those fields only change on a failed sync; a dead worker writes nothing at all. Both workers are now wrapped in a CloudWorkerSupervisor that detects unexpected exits, logs cloud.supervisor.worker_exited (retrieving the exception so it's never swallowed), and respawns with exponential backoff (1s to 60s, capped at 8 consecutive fast restarts then degrade-not-crash; a healthy run resets the counter). Shutdown cancellations are classified and never trigger a respawn.
cloud status can now tell a dead worker from an idle one. Added per-worker heartbeats: Worker last tick and Body uploader last tick, stamped on every loop iteration including idle ones (new last_tick_at / last_body_tick_at columns, schema user_version 10 to 12, additive verify-before-bump migrations). Previously a stalled worker and a caught-up worker were indistinguishable in the status output, which is why a 47-minute stall went unnoticed.
Body-uploader teardown gap closed. The body-uploader task was never added to the daemon's teardown set, so it wasn't cleanly cancelled/awaited on shutdown. It now is.

v0.3.3 · 2026-05-31

Reconcile no longer 422s by sending calendar dates over a like-for-like window, plus macOS test-suite isolation fixes.

Fixed

halton-meter cloud reconcile no longer 422s. The command built full ISO datetimes (…T10:19:06Z) for the cloud's daemon-totals endpoint, but that endpoint types from_date/to_date as calendar date, so every reconcile was rejected 422. It now sends plain ISO dates, and the local daemon-side total is summed over the same inclusive calendar-date window so the daemon-vs-cloud variance line compares like-for-like (instead of a rolling 24h x N window vs N calendar days). Added a command-level regression test (the prior test only exercised the client method, not the command's date construction).
macOS test-suite isolation. Cleared two cross-file global-state leaks surfaced when the full suite runs on a macOS dev box with a live daemon: structlog's cache_logger_on_first_use bind cache (which reset_defaults() doesn't clear, silently bypassing capture_logs()), and the _macos_interfaces TTL cache (which seeded the live machine's interfaces into later mocked tests). Updated several FAIL-OPEN-1 tests to expect the blank-then-disable behaviour and to stub the listener/daemon-listener probes so they don't depend on a running daemon. Test-only; no product change.

v0.3.2 · 2026-05-31

Machine-identity stability: stop the daemon from re-identifying as a new machine (and accumulating sync keys).

Changed

halton-meter cloud connect is now idempotent. If valid credentials already exist and still authenticate (a whoami probe), it reports the existing connection and skips pairing, so re-running connect no longer mints a fresh key and revokes the prior one on every invocation (which left a trail of revoked keys). Pass --force to re-pair on demand (switch workspace, or recover after revocation). Only a confirmed whoami short-circuits; missing/invalid creds or an unreachable cloud fall through to normal pairing.

Fixed

Persist the machine fingerprint to ~/.halton-meter/machine-id (mode 0600, atomic write). The fingerprint was already derived from a stable hardware id (macOS IOPlatformUUID / Linux /etc/machine-id / Windows MachineGuid) but was recomputed live on every pair and never cached, so a transient probe failure made it return None, the daemon omitted it, and the cloud fell back to hostname+os dedup and minted a new machine + key. Now the derived value is cached and read first, so a probe blip or hostname rename can never change identity. The derivation basis is unchanged, so existing machines keep the same fingerprint, they just stop drifting. (Pairs with halton-meter-cloud #150: revoke-on-mint + one-active-sync-key-per-machine unique index.)

v0.3.1 · 2026-05-31

Production-readiness hardening: a system-proxy fail-open failsafe, activation of cloud quarantine, and the coordinated 425/404 body-upload contract.

Added

FAIL-OPEN-1, macOS system-proxy failsafe. The system proxy can no longer be left enabled pointing at a dead loopback port (the fail-CLOSED reinstall trap a user hit: uninstall left 127.0.0.1:8081 retained, so pip/uv got Connection refused). Four independent mechanisms: a sentinel-independent watchdog reaper that clears any enabled loopback proxy with no listener; an enable-guard that refuses to point the OS at an unbound port; an ungated edge atexit/signal-trap disable; and uninstall/stop/reset-proxy now BLANK the server:port, not just toggle state. See memory/decisions.md 2026-05-31.
Cloud quarantine activated (H7/M2). A single contract-incompatible record can no longer freeze metadata sync (H7) or silently vanish a body (M2). The supervisor now wires a quarantine writer into both the metadata worker and the body uploader; terminally-rejected records are recorded in cloud_quarantine and skipped past so the queue drains. halton-meter cloud status shows a Quarantined: N count (a contract-drift early-warning signal; healthy = 0).

Changed

Coordinated 425/404 body-upload contract. A 425 Too Early from POST /v1/requests/{id}/body (request not synced yet) is treated as transient, the not-ready body is skipped so it can never wedge the bodies behind it, and retried next tick, while 404 (wrong-tenant / genuine orphan) is terminal and quarantined (recorded, never silently discarded). The cloud worker's inter-tick sleep now wakes promptly on shutdown. Deploy order (load-bearing): the cloud's 425 split must be live BEFORE this daemon reaches production. A 0.3.1 daemon maps 404 to quarantine; against an older cloud that still returns 404 for the unsynced race it would wrongly quarantine early bodies. Deploy cloud-first.

Fixed

PR #46 prod-audit Tier 1/2 hardening (merged): migration verify-before-bump (no user_version stamp on a failed ALTER), _pending capture-dict eviction (no unbounded growth), busy_timeout, redact-before-truncate, edge connect/idle timeouts, shutdown drain + WAL checkpoint.
Added pytest-timeout (dev) with a global per-test timeout so a rare async-deadlock test fails by name instead of hanging the whole suite/CI.

v0.3.0 · 2026-05-29

End-to-end error classification across all four providers (Anthropic, Gemini, OpenAI, Grok/xAI).

Added

End-to-end error classification on outgoing cloud log records: error_class, provider_error_code, http_status, retryable. Shipped for Anthropic, Gemini, OpenAI, and Grok/xAI traffic.
OpenAI error classifier (daemon/halton_meter/adapters/openai.py) covering the full HTTP 4xx/5xx surface. 429 responses are bucketed by error.type then error.code: rate_limit_error becomes rate_limit (retryable), insufficient_quota becomes auth (non-retryable). This matches operator-remediation semantics, not raw HTTP status.
Grok/xAI provider routed through the OpenAI classifier, no separate adapter. xAI is OpenAI-SDK compatible at the error envelope level.

Changed

Internal error-class enum split timeout_or_network into distinct timeout and network values for end-to-end fidelity. Wire-side bucket vocabulary remains the canonical seven: rate_limit | server_error | bad_request | auth | timeout | network | unknown. Unknown local values fall through to unknown on the wire (forward-compat).
Anthropic overloaded_error (HTTP 529) classified as server_error with retryable=true, a provider-availability signal, not quota.
Wire contract: New fields on cloud log records (all nullable): error_class, provider_error_code, http_status, retryable. Backend tolerates older daemons emitting none of them. error_message_hash remains local-only and is never serialised to the wire.
Notes: Daemon version bump (pyproject.toml) and v0.3.0 git tag intentionally NOT included in this changelog entry's commit, gated on cloud migration 0040 deploying to prod RDS and the four new fields landing successfully in the requests table via smoke-test. Add the version bump in a follow-up commit immediately before tagging.

v0.2.15 · 2026-05-28

Restore Grok CLI metering; self-heal missing keychain trust settings.

Added

Tests: 5 new tests in tests/test_setup.py::TestHasAdminTrustSettingsMacos covering the populated/empty/timeout/missing-binary/substring-false-positive cases for the new helper. 1 new test test_s1b_drift_verify_ok_but_no_admin_trust_settings_returns_false reproducing the 2026-05-28 Grok incident at unit-test level. test_s1_* updated to mock both gates via side_effect; pins gate-2 argv to ["security", "dump-trust-settings", "-d"]. test_s2_* pins mock_run.call_count == 1 so a future gate-reorder cannot accidentally pass for the wrong reason.

Fixed

Grok CLI metering restored. Re-adds cli-chat-proxy.grok.com to LLM_INTERCEPT_HOSTS. The 0.2.14 exclusion was based on a misdiagnosis: Grok CLI's binary is linked against rustls-platform-verifier + rustls-native-certs (verified via strings), which consult the macOS keychain, they do not bundle WebPKI roots. The real cause of the tlsv1 alert unknown ca failure was a drift state in the local keychain (see next item), not a fundamental TLS-stack incompatibility. End-to-end verified on macOS 15.x with grok 0.2.3: four grok-build rows captured with full token + cost attribution.
trust_cert_macos() now self-heals missing admin trust settings. _is_cert_trusted_macos() previously short-circuited based on security verify-cert -p ssl alone, which is more lenient than the SecTrust settings enumerator (SecTrustSettingsCopyCertificates) used by stricter modern verifiers like rustls-platform-verifier. On a machine where the mitm CA was imported into /Library/Keychains/System.keychain but missing from security dump-trust-settings -d (a drift state observed on 2026-05-28), trust_cert_macos() reported "already trusted" and skipped re-running add-trusted-cert. The check is now two-gated: both verify-cert -p ssl rc=0 and an explicit Cert N: mitmproxy entry in dump-trust-settings -d are required. When the drift state is detected, the existing admin-dialog re-trust path is triggered automatically on the next halton-meter init.

v0.2.14 · 2026-05-28

Fix Grok CLI broken by TLS interception.

Fixed

Removed cli-chat-proxy.grok.com from the intercept allowlist. The Grok CLI is built with Rust's reqwest/rustls, which bundles its own WebPKI CA roots and ignores the macOS system keychain. It rejected the mitmproxy CA cert with tlsv1 alert unknown ca, breaking every Grok CLI request with "Retry failed: reqwest error stream". Removing the host from LLM_INTERCEPT_HOSTS restores fail-open passthrough, Grok CLI traffic tunnels through unmetered. The GrokCLIAdapter is retained in the registry; re-enable the host if Grok CLI adopts system-CA trust in a future release.

v0.2.13 · 2026-05-28

Hardware fingerprint for machine deduplication in pairing.

Added

Hardware fingerprint in pairing start. POST /v1/pairing/start now includes a fingerprint field, a 32-char hex string (SHA-256 of a platform-specific hardware ID, truncated). Source: IOPlatformUUID on macOS, /etc/machine-id on Linux, MachineGuid registry key on Windows, falling back to SHA-256(hostname + MAC). The backend uses the fingerprint to upsert rather than insert, preventing duplicate machine rows when a daemon re-pairs after a reinstall or version upgrade. Fingerprint derivation never raises and returns None gracefully so pairing is never blocked.
Tests: 5 new tests in tests/test_fingerprint.py, shape validation, macOS UUID parsing, Linux machine-id reading, total-failure None path, and determinism.

v0.2.11 · 2026-05-24

Windows apps-mode support (phase 0) plus cloud-connect stale task fix.

Added

Windows apps-mode support. uvx halton-meter now works on Windows 10 / 11. No admin rights required for apps-mode. The daemon and edge ship as Task Scheduler ONLOGON user-level tasks. The mitmproxy CA cert is installed into the user cert store via certutil -user -addstore Root. The system proxy is written to HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings (ProxyEnable, ProxyServer) with a WM_SETTINGCHANGE broadcast so running Electron / WinINet apps pick up the change without restart. HTTPS_PROXY is written to HKCU\Environment. Four new platform modules: system_proxy/_windows.py, install/_windows.py, lifecycle/_windows.py, setup/_windows.py. All winreg / ctypes.windll calls are fail-open and guarded so the module parses cleanly on macOS / Linux. Full-mode (machine-wide proxy, NSSM service, MDM cert) is deferred to post-v1.0.
Tests: 27 new tests in tests/test_windows_apps_mode.py, all mock winreg so they pass on macOS / Linux CI; cover registry proxy writes, env-var management, non-raising failure handling, snapshot/restore round-trip, dispatch wiring.

Fixed

POST /v1/cloud/connect no longer leaves a zombie task on re-pair. A previous handshake that settled (denied, expired, failed) left its background asyncio task referenced in _active.task. A new POST /v1/cloud/connect call now cancels the old task before replacing _active, preventing a rare race where two concurrent background tasks could both call _persist_credentials.

v0.2.10 · 2026-05-24

Attribution correctness, body capture memory gate, and audit hardening.

Added

Tests: 7 new tests in tests/test_attribution_store_source_workdir.py, write + lookup 2-tuple, workdir propagation, ALTER TABLE migration on old DB, tagging step 0 workdir pass-through. 8 new tests in tests/test_body_capture_size_gate.py, request gate fires at exactly 4 MiB, logs structured event, passes through normally below threshold; symmetric tests for response side.

Fixed

source_workdir now stored in the attribution cache. attribution_log gains a source_workdir TEXT column (fail-open ALTER TABLE migration for existing DBs). The edge process's _resolve_via_psutil propagates the resolved cwd through resolve_client_project to attribution_store.write. tagging.py Step 0 (edge-store cache hit) now returns the real source_workdir instead of always None. Every cached row written from v0.2.10 forward carries the originating directory; historical rows show NULL.
Body capture size gate. proxy.py skips capture_body() entirely for request or response bodies larger than 4 MiB. Logs addon.body_capture.skip with reason=body_too_large, size_bytes, and gate_bytes. Prevents a second memory spike from decoding large multi-modal payloads that mitmproxy already buffered. flow.request.content / flow.response.content access is isolated in its own try/except so a mitmproxy internal decode error cannot swallow the attribution block for that flow.
Attribution store schema-init lock corrected. _open_connection now holds _LOCK across the full DDL bootstrap block rather than claiming the flag before doing the work. The previous "claim early" approach created a TOCTOU window where concurrent threads skipped bootstrap while the table didn't exist yet, causing no such table: attribution_log failures under high concurrency at process startup.
Dead psutil.net_connections() scan removed. tagging._get_process_name assigned psutil.Process(psutil.net_connections()[0].pid) then immediately shadowed the variable in the for proc in psutil.process_iter() loop, a full connection-table scan that fired on every call and discarded the result.
cloud.worker.sync_paused log event name. Was cloud.worker.paused_unauthorised, which was inaccurate for the HTTP 403 to paused_forbidden terminal path. Renamed to cloud.worker.sync_paused (reason-agnostic; the pause reason lives in the DB row and the run_once drain log).
Proxy content access isolated. flow.request.content / flow.response.content reads in the size-gate path are wrapped in try/except so a mitmproxy decode exception cannot propagate to the outer request hook and drop attribution for the flow.

v0.2.9 · 2026-05-24

Daemon hardening and report row-cap removal.

Added

Loopback bind guard (halton_meter/security.py). On startup, the daemon resolves its configured listen_host and api_host via socket.getaddrinfo and hard-exits (SystemExit(1)) if either resolves to a non-loopback address. Prevents accidental LAN exposure if a user edits the config to 0.0.0.0. Override via HALTON_METER_ALLOW_NON_LOOPBACK=1 env var (logs a warning) for container / VPN topologies that legitimately bind on non-loopback.
Dependency upper-bound caps. mitmproxy>=10.0,<13, pydantic>=2.0,<3, httpx>=0.27,<1, click>=8.1,<9. Prevents silent breakage when those libraries ship major-version API changes.

Fixed

halton-meter report no longer caps at 10,000 rows. storage.read_records accepted a hard limit=10_000 default; the report command passed that value explicitly. read_records now accepts limit: int | None = None and omits the LIMIT clause when None. Reports across large databases are now complete.
Log rotation enabled. Daemon log files now rotate at 100 MiB with 5 backups retained (RotatingFileHandler). Previously a long-running daemon on a busy machine could produce an unbounded single log file.

v0.2.8 · 2026-05-24

Sync pause-classification + recovery fix. Three items.

Added

halton-meter cloud resume is a real recovery command. Reads the current pause reason; on paused_manual clears immediately; on paused_unauthorised / paused_forbidden / anything else hits GET /v1/daemon/whoami once with the stored token. If 200, clears pause + counters + last_error on both cloud_state and cloud_sync_state so cloud status shows green on the next read. If real 401, prints "API key is genuinely invalid, re-pair required" and exits non-zero. If transient, surfaces the real status. Unblocks the daemon without burning a cloud connect (which would revoke the still-valid existing key and grow the cloud's revoked-devices list).
CloudForbidden exception class. Distinct from CloudUnauthorised. HTTP 403 maps here; worker writes paused_reason='paused_forbidden'. The status banner explains that the recovery path is "have the workspace owner re-invite this machine, then cloud resume", not "re-pair".
cloud_sync_state.last_error_at column (user_version 6 to 7, additive ALTER TABLE). Stamped together with last_error by both pause and error writers. Surfaced in halton-meter cloud status as the "Last error at" row so a stored error can be told apart as fresh vs stale.
Tests: New tests/cloud/test_pause_classification.py, 12 tests covering real 401 to paused_unauthorised, real 403 to paused_forbidden, and that 500/502/503/RemoteProtocolError/ReadTimeout/ConnectTimeout/ConnectError/422/400/429 all skip the pause writer and route through error_writer. Belt-and-braces: each transient-failure test asserts both that pauses == [] AND that errors != []. Existing test_worker_drain.py continues to pass without modification, the worker's API gained an optional error_writer parameter that defaults to None. Storage user_version bumped 6 to 7 in engine.CURRENT_USER_VERSION; test_user_version_stamped updated implicitly via the constant.

Fixed

sync.paused_unauthorised triggers strictly on HTTP 401. Previously, any failure that escaped the worker's retry envelope inside cloud sync could be misclassified as 401, including the ALB-side connection reset seen during the 2026-05-24 rolling deploy. The daemon would then write paused_unauthorised, force the user to re-pair, and accumulate revoked entries in the cloud's paired-devices list. v0.2.8: only an actual HTTP 401 response on /v1/requests/batch writes that pause class. Transient errors (5xx, RemoteProtocolError, ReadTimeout, ConnectTimeout, httpx.ConnectError / DNS failure, HTTP 429, HTTP 422/400) route through a new error_writer seam that bumps consecutive_failures and stamps last_error + last_error_at but never touches paused_reason. Each pause-causing branch now emits a structured log line (sync.paused reason=… http_status=… last_request_id=…) so the cause is visible. A pause-classification table maps each wire condition to its before-v0.2.8 vs from-v0.2.8 behaviour (e.g. HTTP 403 now becomes paused_forbidden; 5xx, conn reset, timeouts, ConnectError, 429, and 400/422 are now transient with no pause).

v0.2.7 · 2026-05-23

`base_url` auto-heal for placeholder TLDs.

Fixed

Self-heal [cloud].base_url when it contains a placeholder TLD (.test, .example, .invalid, .local): rewrites to https://api.haltonmeter.com on next boot and logs once. Catches stale config from spike scripts.

v0.2.6 · 2026-05-23

Cloud-onboarding loopback API for the dashboard's first-login flow.

Added

GET /v1/cloud/state, returns { paired, version, hostname, port }. Lets the dashboard's onboarding shell detect a local daemon and skip the install/start steps.
POST /v1/cloud/connect, triggers a pairing-code mint without the user copying anything; returns { code }. GET /v1/cloud/connect/status polls for approval.
Chrome PNA Access-Control-Allow-Private-Network: true header on the loopback API so cross-origin browser fetches from https://app.haltonmeter.com work without flags.

v0.2.5 · 2026-05-22

SaaS-launch release. Six items: Cursor cold-start fix, edge attribution leak fix, CI Python 3.14 matrix, backfill v2 script, friendly transport-error UX, and official backend host lock.

Added

Official backend host lock (cloud/constants.py). HALTON_METER_CLOUD_URL = "https://api.haltonmeter.com", production default used by load_cloud_config when [cloud].base_url is absent or empty. HALTON_METER_CLOUD_URL env var + --base-url CLI flag remain as dev/staging overrides. cloud connect updated to use the production URL as its default.
Backfill v2 script (daemon/scripts/backfill_body_paths.py). Recovers Cursor-style rows by scanning file:// URIs and bare absolute paths in captured request bodies. Dry-run by default; --apply gated same as v1. Tags recovered rows with attribution_method='backfill_body_paths'. Does not auto-run --apply.
CI matrix for Python 3.14 (experimental). New daemon-py314 job in .github/workflows/ci.yml with continue-on-error: true. Main matrix covers 3.11, 3.12, 3.13.
Tests: New tests/cloud/test_cloud_constants.py, tests/cloud/test_transport_error_ux.py. Extended tests/attribution/test_resolver.py (lenient-slug cases), tests/test_edge_attribution.py (Step 8 fallback). Full daemon suite: 1906 passed, 35 skipped, 0 failed. Ruff clean.

Fixed

Cursor cold-start lenient-slug scan. find_project_root_by_slug in attribution/layers.py fires on Tier 4b/4c registry miss, an O(1-level) scan of ~/Documents and ~ warms EdgeAttributionRegistry for the duration of the process. Tier telemetry distinguishes cold-start recoveries (4b_lenient / 4c_lenient) from normal strict corroborations; attribution_method in DB is unchanged.
Edge unattributed/edge_store leak. edge_attribution._resolve_via_psutil Step 8 calls resolve_unified as a final fallback after its standalone chain (Steps 1-7.5) exhausts all options. Lazy import preserves the stdlib-only module-import invariant. Emits edge_attribution.resolver_fallback at INFO on recovery.
Friendly cloud.transport_error log. Transport error message now includes the base URL and config file path hint so each line is actionable. _check_base_url_not_placeholder raises immediately for placeholder TLDs (.test, .example, .invalid, .local) rather than burning the full retry budget; bypassed when a test transport is injected.

v0.2.4 · 2026-05-18

Unified, IDE-agnostic attribution resolver. Closes the edge-side unattributed / edge_store leak from v0.2.3 live verification. Widens Python pin to 3.14.

Added

Unified Tier 0-8 resolver (attribution/resolver.py) shared by daemon + edge; single entry point attribution.resolver.resolve_unified.
attribution/registry.py, per-edge slug to abspath map from rcfile-resolved workspaces.
ide_env_label layer, reads CURSOR_WORKSPACE_LABEL / VSCODE_WORKSPACE_LABEL from process environ, strict-slug corroborated via registry.
ide_argv_label layer, Electron-helper trailing-token regex r'.*\s(\S+)\s\[\d+-\d+\]\s*$' against argv, strict-slug.
IDE workspace recovery for Cursor / VSCode / Kiro / Zed / IntelliJ.
Per-tier attribution.tier_hit structured telemetry (hits + 4b_miss / 4c_miss).
Tests: 32 new attribution tests (6 registry + 26 resolver) including empirical Cursor argv fixtures, p99 perf < 3 ms, telemetry shape, source-audit of v0.2.3 helper deletions. Full daemon suite: 1877 passed, 35 skipped, 0 failed. Ruff clean.
Followups (tracked for v0.2.5): CI matrix entry for Python 3.14 (pin widened in v0.2.4; CI runner row is a bookkeeping change). Historical misc backfill (--apply), v0.2.4 ships the dry-run script daemon/scripts/backfill_misc_attribution.py only; --apply remains gated behind explicit operator approval and is not auto-run.

Changed

7 new attribution_method values: ide_argv_label, ide_env_label, parent_rcfile, parent_git, parent_ide_sniff, parent_ide_argv_label, parent_ide_env_label. All fit existing String(32); no cloud migration.
Renamed claude_parent_rcfile / claude_parent_workdir to parent_rcfile / parent_workdir (family-agnostic).
Parent walk in edge_attribution._resolve_via_psutil moved from Step 3.5 to Step 7.5 so the originating process's own signals win over an ancestor's.
Replaced IDE-family process-name allow-list with cross-uid / cross-session boundary stop in the parent walk.
requires-python = ">=3.11, <3.14" to ">=3.11, <3.15" (3.14 wheels available for pydantic_core, mitmproxy, aiosqlite).

Removed

v0.2.3 daemon Step 7.25 and its four helpers in tagging.py (~280 lines deleted, ~70 re-added calling the resolver).
_is_chromium_helper gate and IDE-family allow-list in edge_attribution.py.

v0.2.3 · 2026-05-18

Two correctness fixes: long-lived CONNECT tunnel attribution and edge plist atomic-write hardening.

Added

daemon/scripts/backfill_misc_attribution.py dry-run script, proposes recovered slugs from embedded /Users/.../CLAUDE.md paths. --apply gated behind operator approval; not auto-run.
daemon/tests/test_macos_install.py, 9 new tests covering plist-builder shape (RunAtLoad, KeepAlive, ThrottleInterval, ExitTimeOut, ProgramArguments, Label) and atomic-write paths (replace, fresh create, interrupted write, failed replace).

Changed

Known limitations (intentionally not shipped): Boot-time auto-start stays off (v0.1.21.1 manual-start stands). Cold-boot zombie has not been reproduced; re-enable deferred pending a reproducer + fresh-boot log capture.
Other patches deferred: Wire-field rename cost_usd_minor_units to cost_usd_millicents (breaking; needs coordinated daemon + cloud + dashboard + Alembic cycle). Wire contract SHA refresh, daemon/memory/plans/phase2-wire-contract.md still pinned to pre-v0.2.2 e8f252a.

Fixed

attribution_store.lookup dropped its 5-min read-side TTL filter. Long-lived CONNECT tunnels (multi-hour Claude Code sessions) now stay attributed for the life of the TCP socket. Write-side eviction loop unchanged; max_age_s= kept on the signature for backwards-compat but no longer gates hits.
tagging.py parent-PID walk for {claude, claude-code} family. Bounded 5-hop walk when originating cwd resolves to None or /; re-runs rcfile / git / mac_sandbox / workdir-basename layers against the first ancestor's cwd. (Superseded in v0.2.4 by the unified resolver.)
smart_default fall-through escalated debug to structured warning with edge_src_port, cwd, parent_pid, parent_name, a tripwire for future regressions.
Prune loop horizon raised to 24h in cli._run_daemon to match the read-side TTL removal. DEFAULT_MAX_AGE_SECONDS = 300.0 left untouched for other callers.
Edge plist atomic-write hardening. All four macOS plist writes (daemon, watchdog, edge, userenv) routed through new _atomic_write_bytes helper in install/_macos.py (tmp-file to fsync to os.replace). Prevents the 8-byte <plist/> stub corruption observed during same-day reinstalls.

v0.2.2 · 2026-05-12

End-to-end body sync. Daemon now pushes captured request and response bodies to the cloud on the same opt-in posture as metadata: disabled by default, per-project overrides win over the master switch.

Added

[cloud.bodies] config block. enabled (bool, default false), sync_interval_seconds (default 60), max_body_bytes_per_upload (default 524288, 512 KiB), per_project (dict of slug to bool). Bodies do not leave the machine until bodies.enabled = true.
halton-meter cloud privacy set bodies.enabled true|false, flips the master body-sync switch.
halton-meter cloud privacy set bodies.upload false --project SLUG, drops body uploads for one project while leaving the global switch on.
halton-meter cloud privacy show now renders a "Body sync (v0.2.2)" section: master state, interval, byte cap, per-project rules.
BodyUploader worker. Sibling to the metadata worker; reads request_bodies WHERE uploaded_at IS NULL, POSTs two envelopes per row (request + response) to POST /v1/requests/{id}/body, stamps uploaded_at on success. Project-skipped rows still advance to keep the cursor moving. Pause/last-error tracked independently in cloud_body_sync_state so a body 401 doesn't pause metadata sync.
Supervisor wiring. cli.py spawns the uploader next to the existing metadata worker with the same fail-open posture (gate misses are silent, spawn failures absorbed).
Schema migration 5 to 6. Adds request_bodies.uploaded_at DATETIME NULL plus a supporting index. Idempotent; safe on greenfield and upgrade-in-place installs.
Tests: Cloud (halton-meter-cloud): tests/test_request_bodies.py, 6 tests (happy path, wrong workspace, unknown id, idempotent re-post, base64 decode error, missing auth). Daemon: tests/cloud/test_body_uploader.py, 5 tests (happy path, per-project skip, 401 pause, transport error, oversize leg skipped). tests/cloud/test_supervisor_spawns_body_uploader.py, 4 tests (three gate misses + happy path spawn).

Changed

Wire contract: POST /v1/requests/{id}/body is now Live in PHASE2_CONTRACT.md. Envelope: BodyEnvelopeV021 (class name retained from the Wave-0 forward-declaration to keep daemon imports stable). Two POSTs per request (direction=request, direction=response) land in one cloud row keyed on request_id, with redaction_applied OR-merged across directions.
Documentation: daemon/PRIVACY.md gains a "Body sync" section (defaults, how to enable, per-project override pattern). daemon/CLOUD.md references cloud privacy set bodies.enabled true as the activation step.

v0.2.1 · 2026-05-12

Patch release. Operational bugs plus CLI UX polish for the cloud subgroup.

Changed

CLI UX for the cloud subgroup. The cloud commands previously printed raw Python dicts and one-line success messages. They now match the look-and-feel of halton-meter status (Rich Panels + Tables + state icons). cloud connect, pairing code rendered in a brand panel; a dots spinner runs during the approve-poll wait so the CLI doesn't appear hung; success message is a green panel. cloud status, state banner (one of ACTIVE / DEGRADED / PAUSED / NOT-CONFIGURED) plus a per-field table with health icons, relative ages, and unsynced-count highlighting. cloud whoami, labeled key-value panel. cloud reconcile, per-row cloud-side table, totals panel, and a daemon-vs-cloud variance line (zero variance = green check; <0.5% = yellow tolerance; otherwise red "investigate" copy). --json flag on cloud status still emits the machine-readable shape for scripting.

Fixed

cloud connect --base-url <URL> now persists the URL to ~/.halton-meter/config.toml. Pre-0.2.1, the flag was a transient override, pairing would succeed but every subsequent command (cloud status, cloud sync, the supervisor's worker-spawn gate) would read [cloud].base_url from TOML, find it empty, and report the daemon as un-paired. Users had to hand-edit the TOML after every cloud connect. Now the connect flow writes base_url and enabled=true next to the 0600 credentials write. Other top-level config sections ([daemon], [storage], [cloud.upload]) are preserved by the TOML rewriter.
halton-meter cloud reconcile no longer 422s. Daemon sent from/to query params; cloud's reconciliation router expects from_date/to_date. Reconcile returned Field required instead of variance data.

v0.2.0 · 2026-05-12

First minor release. Closes the Phase 2 cloud-sync arc (daemon to halton-meter-cloud) and ships upload-privacy controls. The daemon's local cost computation is unchanged from v0.1.x, it has been correctly charging Anthropic prompt-cache-write tokens all along.

Added

Cloud-sync (Phase 2): halton-meter cloud CLI group. connect, disconnect, status, whoami, sync, reconcile, pause, resume. Pairing handshake (pairing/start to user approves in dashboard to pairing/poll) mints a single-shot hm_sync_… token. Token stored in ~/.halton-meter/cloud-credentials.json (chmod 0600) with a Fernet-encrypted mirror in SQLite (cloud_state.api_key_ciphertext, key at ~/.halton-meter/cloud.key).
Cloud-sync (Phase 2): Cloud worker supervised as part of the daemon process. Same-process spawn alongside the proxy / API / heartbeat tasks; gated on cloud_state.enabled so users who don't pair see zero change. Crash-isolated, a cloud-side failure (transport, 401, schema mismatch) cannot block the proxy hot path. 401 sets cloud_sync_state.paused_reason='paused_unauthorised' and stops the worker until the operator re-runs cloud connect.
Cloud-sync (Phase 2): Default off. The daemon ships with cloud.enabled = false. Cloud sync is strictly opt-in.
Upload privacy: Tiered consent presets. [cloud.upload] section in ~/.halton-meter/config.toml: preset = "minimal" | "standard" | "full". Default for v0.2.0 is standard with source_workdir = false, the high-leak field (local path) is off by default. Cloud sees provider, model, tokens, cost, project slug, hostname, prompt hash. Cloud does NOT see your workdir paths unless you explicitly opt in.
Upload privacy: Per-field global overrides. fields.source_workdir = true (or any gateable field) overrides the preset.
Upload privacy: Per-project rules. [cloud.upload.per_project."acme-secret"] can set upload = false (the project is local-only) or lock down specific fields for that project. The daemon redacts before serialisation; the cloud cannot see what the daemon never sends.
Upload privacy: CLI halton-meter cloud privacy show|set. show prints the resolved policy; set preset standard / set field.source_workdir false / set upload false --project acme-secret mutate the TOML in place (other sections preserved).
Observability + lifecycle: Daemon supervisor logs cloud.supervisor.spawned / cloud.supervisor.skip with structured reason so it's obvious whether sync is active. cloud_state row schema: workspace name, machine id, hostname snapshot, last-connected-at, paused_reason, used by cloud status. Worker retry envelope: exponential backoff capped at 300s for transport + 5xx + 429 (with Retry-After honoured). 401 never retried.

Changed

Note for operators running halton-meter-cloud: Wave 3 soak on 2026-05-12 surfaced a cost-recompute bug in the cloud side (_compute_cost in services/ingest.py) that silently dropped cache_write_tokens from the sum, producing ~17% reconcile variance on cache-heavy Claude workloads against an otherwise-correct daemon. The fix lives in halton-meter-cloud. No daemon change was needed.
Daemon binds to the same proxy/edge ports as before, no migration needed. halton-meter sync is a one-shot alias for halton-meter cloud sync (single drain pass + exit).
Notes for operators: Upgrade with pipx upgrade halton-meter; the daemon needs a restart for the new code to take effect. After upgrade, historical cost totals will not retroactively update, only requests captured after the upgrade carry the corrected cache-write cost; run halton-meter recompute-costs for historical correction. Cloud sync stays OFF unless you run halton-meter cloud connect. Known limits: production hosted cloud requires Clerk (until then, only self-hosted cloud with AUTH_MODE=dev is supported); body-capture sync is deferred to v0.2.1.

v0.1.16.dev0 · 2026-05-02

Trial-run wheel. Built locally for soak; NOT published. Stability cycle responding to 9 bugs surfaced during the 0.1.15 evening soak. Carries everything from 0.1.15 plus the items below.

Added

P0 stability + observability: launchd NumberOfFiles soft+hard limits raised to 10240 (b6404c3), was 256 (macOS default). Bug 4: FD exhaustion under system-proxy load was instant. Applies to daemon, edge, and watchdog plists; takes effect on next halton-meter init.
P0 stability + observability: Watchdog probes the daemon's MITM port directly (0dafff3, port-fixed in 5bf2864), TCP-connect every 1.5s. 3 consecutive failures + /health still up gives ERROR watchdog.mitm_unhealthy. 5 consecutive failures kickstarts daemon via launchctl. Capped at 3 restarts/hour. Bug 1: silent MITM listener death is now caught and self-healed.
P0 stability + observability: halton-meter status row for MITM listener (8833482), TCP-connects to cfg.daemon.internal_port. Banner flips to BROKEN when daemon /health is up but MITM is dead, with explicit remediation message naming the port.
P0 stability + observability: Daemon listener heartbeat + asyncio task crash hook (0215511), heartbeat emits daemon.heartbeat mitm_port=:8090 mitm_listening=True api_listening=True db_writeable=True open_fds=N every 30s. Asyncio task crashes now surface via daemon.task_crashed instead of being silently swallowed. Two previously-nameless create_task calls got names.

Changed

Notes for operators: This wheel REPLACES 0.1.15 for the soak (Gemini fix + FD limits + watchdog auto-restart + observability). Rollback path is 0.1.15 (still in dist/). After install, run halton-meter init --apps again to apply the new launchd FD limits. New halton-meter status will show one extra row (MITM listener). Pending follow-ups (Sprint 2 + 3): P1.1 chain re-resolve audit (Bug 2), P1.2 doctor proxy auto-disable detection (Bug 5), P2 passthrough DNS / concurrency limits (Bug 3), P3 INSTALL.md/README.md gotchas for gemini-cli, Codex CLI, Antigravity bypass (Bugs 7/8/9). Suite: 1104 passed, 3 skipped, ruff clean.

Fixed

Gemini response-side adapter lookup (4a30183), request hook fired correctly with provider=gemini then immediately addon.response.no_adapter. Pass flow.request.path to response-side find_adapter. Zero Gemini rows had been landing in the DB.
Watchdog probes daemon's internal_port (8090), not the edge port (5bf2864), earlier 0dafff3 wired the new MITM probe against the wrong port. The edge keeps accepting connections even when the daemon's MITM is dead, so probing edge would never have detected Bug 1.

v0.1.15 · 2026-05-02

Release-shape, NOT published. Promoted from 0.1.15.dev0 after the joint Smart-Attribution-v0.3 + multi-provider build proved clean. Wheel sits in `daemon/dist/` per the locked distribution-sequencing decision (no public PyPI publish until the binary track lands at the final stage). Same scope as 0.1.15.dev0.

Changed

One operator follow-up still owed: XAI_RATES were set from training-cutoff knowledge of xAI's pricing. Spot-check against https://docs.x.ai/docs/models and patch in-place if any model rate has drifted. Fix is a tiny edit to pricing/matrix.py and does not require a new wheel.

v0.1.15.dev0 · 2026-05-02

Trial-run wheel. Built locally; NOT published to PyPI per the cadence rule. Carries everything from 0.1.14.dev0 plus the multi-provider adapter cycle (OpenAI / Gemini / xAI) plus the start-command health-probe fix.

Added

OpenAI adapter (api.openai.com): /v1/chat/completions, /v1/responses, /v1/embeddings. Streaming + non-streaming. o-series reasoning tokens routed to the thinking_tokens field and billed at the output rate. Prompt-cache reads via usage.prompt_tokens_details.cached_tokens. /v1/moderations is free, adapter declines via matches_url so $0 rows don't clutter the DB. 14 models priced (gpt-4o, gpt-4o-mini, gpt-4.1 family, o1/o3/o4-mini, embeddings models).
Gemini adapter (generativelanguage.googleapis.com): :generateContent, :streamGenerateContent, :embedContent, :batchEmbedContents. Model extracted from URL path (regex). Streaming is JSON-array (not SSE). Tiered pricing >200k input tokens already correct via GEMINI_TIERED_RATES. Multimodal under-counts logged.
xAI / Grok adapter (api.x.ai): /v1/chat/completions (OpenAI shape) + /v1/messages (Anthropic shape). URL-dispatches to sibling adapter parsers, rewrites provider to "xai". 7 models priced (grok-4, grok-4-mini, grok-3 family, grok-2-vision-1212).
ProviderAdapter.matches_url(host, path) added to the adapter Protocol, an additive bump with a getattr fallback in dispatch so existing adapters keep working without forced edits. Adapters can now opt out of specific URL paths (used by OpenAI to skip /v1/moderations).
Redaction patterns extended to OpenAI (sk-…, sk-proj-…) and xAI (xai-…) API keys. Google API keys (AIza…) were already covered. Body capture for the new providers inherits redaction for free.

Changed

Notes for operators: Body capture default is ON for all three new providers; per-project opt-out via halton-meter project <slug> set body-capture off works identically across providers. XAI_RATES were set from training-cutoff knowledge of xAI's pricing page, not a live fetch; spot-check against https://docs.x.ai/docs/models before promoting 0.1.15. This wheel REPLACES 0.1.14.dev0 for soak. Suite: 1055 passed, 3 skipped, ruff clean.

Fixed

halton-meter start's health probe now reads runtime.toml instead of hard-coded port defaults. Previously, when ports fell back from 8765 to 8766, start would print "daemon loaded but /health never came up on :8765 within 15s" and exit non-zero even though the daemon was healthy on the fallback port. Same shape as the watchdog port-fallback fix in 0.1.14.dev0.

v0.1.14.dev0 · 2026-05-02

Trial-run wheel. Built locally for soak; NOT published to PyPI per the cadence rule. Ships Smart Attribution v0.3, the biggest behavioural shift to project tagging since the edge attribution store landed.

Added

Smart Attribution v0.3, three new attribution layers in the resolver chain (both edge and daemon): Layer 4, Git repo basename (Option A, branch info intentionally dropped). ~/code/halton-meter becomes halton-meter. Per-cwd memo, no TTL. Layer 6, Container detection, emits k8s:<HOSTNAME> or docker:<HOSTNAME> only when actual container signals are present (/.dockerenv, KUBERNETES_SERVICE_HOST, or /proc/1/cgroup containing docker/kubepods/containerd); cannot fire on a developer laptop, pinned by a regression test. Layer 6.5, macOS sandbox bundle id, sandboxed Mac apps (Perplexity, ChatGPT desktop, etc.) whose cwd is ~/Library/Containers/<bundle>/Data now write mac:<bundle-id> (e.g. mac:ai.perplexity.mac) instead of collapsing to project Data.
New shared module daemon/halton_meter/attribution/layers.py, pure-function home for every per-layer resolver. tagging.py and edge_attribution.py delegate.
clear_git_cache() test helper for layer 4's per-cwd memo.
Daily background body-capture retention sweep inside the daemon process. Logs bodies.retention_sweep deleted=N retention_days=D elapsed_ms=X. Default retention_days = 90, configurable via ~/.halton-meter/config.toml [bodies] retention_days = ….
.haltonrc body_capture flag is now plumbed end-to-end. Drop body_capture: off (or = off) into a project's .haltonrc and the daemon stops persisting request_bodies rows for traffic from that cwd. Precedence: master switch to project_settings (CLI) to attribution_log (rcfile) to default ON.
attribution_log.body_capture_enabled BOOLEAN NOT NULL DEFAULT 1 column. PRAGMA user_version 2 to 3.
halton-meter bodies show <id> rich panelled output, header, redaction badges, request and response panels. (Was raw JSON.)
halton-meter bodies stats polish, totals, per-project breakdown, oldest/newest, redaction-applied %, retention horizon.

Changed

Slug normaliser is now stricter and lower-cased. [^a-z0-9:_/-.] -> -, collapse repeats, strip leading/trailing -, truncate to 64. . is allowed (required so mac:ai.perplexity.mac survives). Project tags from rcfiles with internal whitespace (e.g. "my project") start persisting as my-project. Existing live slugs are valid under the new rules, no rename. Old DB rows are NOT retroactively normalised.
Notes for operators: No PyPI publish in this release (per the distribution-sequencing ADR, the public PyPI publish is bundled with the signed-binary track at the final pre-scale stage). Soak this wheel locally before promoting; suggested 5-7 days on the daily driver plus at least one Linux machine. Suite: 943 passed, 3 skipped, ruff clean.

Fixed

Watchdog now reads the runtime ports file (~/.halton-meter/runtime.toml) at startup AND per tick. Previously hard-coded to defaults 8081/8765, which produced a noisy watchdog.no_interfaces_pointing_at_us log loop whenever the daemon fell back to 8082/8766.
Three pre-existing red edge tests (test_edge_chain_resolve.py x2, test_edge_e2e.py::test_edge_lifecycle_chain_then_passthrough_then_chain_then_invalidate) now pass, closed out indirectly by the dev2 close-path + IDE-args-sniff fixes.

v0.1.13 · 2026-05-02

Built 2026-05-02 (release-shape, NOT published). Built into `daemon/dist/halton_meter-0.1.13-py3-none-any.whl` and held under the same no-publish gate as 0.1.14.dev0. Superseded for soak purposes by 0.1.14.dev0; will be removed from the dist archive when 0.1.14 promotes.

Added

Body-capture PR2 + PR3 (hot-path capture, redaction, opt-out via CLI, daily retention sweep).
Edge attribution sniffs IDE workspace from cmdline (Windsurf / Cursor / VS Code).

Fixed

Body-capture race fix, chained counts then body writes.
Edge connection close path no longer awaits wait_closed (Python 3.14 compat).

v0.1.11 · 2026-05-02

Last public PyPI release (https://pypi.org/project/halton-meter/0.1.11/). No date is given in the changelog; the date shown here is inferred from surrounding entries and should be treated as approximate.

Added

Last public PyPI release. https://pypi.org/project/halton-meter/0.1.11/