v0.4.1 · 2026-06-08
Adds pricing for Cursor Composer 2.5 (Fast tier), which the existing Grok-CLI adapter already captured but left unpriced.
Added
- Cursor Composer 2.5 pricing. Composer 2.5 (Fast tier) is served via the `grok` CLI on `cli-chat-proxy.grok.com/v1/responses` as model `grok-composer-2.5-fast` (`api_backend=responses`, `agent_type=cursor`), so it was already captured by the existing Grok-CLI adapter; it was just unpriced (NULL cost). Added the operator-supplied Fast-tier rate ($3.00 / $15.00 per MTok input / output); usage attributes to the xAI / Grok-CLI surface in reports.
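As a worked example of the Fast-tier rate above, the following minimal sketch prices a single request using integer millicents. The function name and constants are hypothetical and only illustrate the arithmetic; they are not the daemon's actual pricing code.

```python
# Hypothetical helper: illustrates the $3.00 / $15.00 per MTok arithmetic only.
MILLICENTS_PER_USD = 100_000  # 1 USD = 100 cents = 100,000 millicents

INPUT_USD_PER_MTOK = 3    # Fast-tier input rate quoted in this entry
OUTPUT_USD_PER_MTOK = 15  # Fast-tier output rate quoted in this entry


def composer_fast_cost_millicents(input_tokens: int, output_tokens: int) -> int:
    """Return the request cost in millicents, using integer arithmetic only."""
    input_cost = input_tokens * INPUT_USD_PER_MTOK * MILLICENTS_PER_USD // 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MTOK * MILLICENTS_PER_USD // 1_000_000
    return input_cost + output_cost


# 200k input tokens cost $0.60 and 50k output tokens cost $0.75.
cost = composer_fast_cost_millicents(200_000, 50_000)
print(cost / MILLICENTS_PER_USD)  # 1.35
```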
v0.4.0 · 2026-06-08
First-class Linux and Windows support, plus cross-platform safety fixes. Driven by a multi-agent audit of the per-OS install/uninstall/lifecycle paths. macOS behaviour is unchanged except for the L-M7 stable-bundle adoption and the X1/X2 safety fixes, which apply to all platforms. Linux/Windows remain validated by mocked unit tests only; real-hardware soak is the next gate (a non-blocking macOS + windows-latest CI axis was added to start exercising them).
Added
- Non-blocking `macos-latest` + `windows-latest` CI axes; a central test fixture that heals cross-test PEP-562 dispatch-package monkeypatch leaks.
Fixed
- Cross-platform (all OSes): `uninstall --purge` could destroy the cost database (X1). When `$TMPDIR` / `%TEMP%` was on a different volume than `$HOME`, the move-aside of `db.sqlite` raised a cross-volume `EXDEV`, the file was skipped from the preserve list, and the subsequent wipe deleted it. The purge now stages on the same volume and hard-aborts (restoring and refusing the wipe) if any DB file cannot be moved aside intact.
- Cross-platform (all OSes): Storage migration robustness. A covering-index migration step ran `CREATE INDEX … ON request_bodies` before that table existed on very old databases, rolling the whole migration back; it is now guarded. The migration chain is now fail-stop: a step that swallows a transient `ALTER` failure halts the chain instead of letting later steps advance `user_version` past the gap and permanently mask the missing column.
- Linux: `init --apps` actually meters now. Previously it wired no `HTTPS_PROXY` and no cert trust (all macOS-gated) while the success panel claimed success, a silent no-op. It now persists the proxy + CA env via `~/.config/environment.d/` and a consent-gated shell-rc block (both pointing at a stable, venv-independent CA bundle at `~/.halton-meter/cacert.pem`), makes a best-effort system-trust install (`update-ca-certificates` / p11-kit / `update-ca-trust`, admin-free degradation), and the post-install self-test now validates the Linux wiring.
- Linux: systemd lifecycle repaired. `StartLimitIntervalSec=0` on the units (a crash-looping edge no longer latches into a permanent unbound-port outage); `start` brings up daemon + edge (not a phantom watchdog) and runs `reset-failed` first; `status` reports the edge with a STOPPED fail-open rescue and exits 0 on a healthy box; `stop` leaves the edge bound; `install` survives non-systemd distros.
- Windows: The daemon no longer hijacks the machine-wide WinINET proxy. It is now sentinel-gated (apps mode is a no-op, apps inherit `HTTPS_PROXY` from `HKCU\Environment`), listener-guarded, and snapshots the user's prior proxy before any change, restoring it on uninstall instead of stranding a dead-port proxy.
- Windows: Install establishes TLS trust (generates the CA, trusts it in the Windows store, patches certifi, writes the full env incl. `NODE_EXTRA_CA_CERTS`) so Claude Code (Node) and Python SDKs no longer hard-fail TLS. Uninstall untrusts the root CA by thumbprint (it previously passed a file path to `certutil -delstore`, leaving the universal MITM root trusted forever). `init` now installs (was a no-op stub); real `start` / `stop` / `status`; `schtasks` quoting survives spaced paths; doctor gains a Windows branch; the CA private key is ACL-restricted.
- Security: `uninstall --include-logs` now removes `~/.mitmproxy/` (the universal MITM CA cert and private key) on all OSes, and macOS untrusts the cert from the keychain on uninstall; previously the forge-any-TLS key survived an "uninstall."
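The X1 fix above (stage on the same volume, hard-abort if a file cannot be moved aside) can be sketched as follows. This is a minimal illustration, not the shipped purge code: `purge_keeping` and its parameters are hypothetical, and it assumes the data directory's parent sits on the same volume as the data directory itself.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import List, Sequence


def purge_keeping(data_dir: Path, keep: Sequence[str]) -> None:
    """Wipe data_dir but preserve the named files (hypothetical helper)."""
    # Stage next to data_dir rather than in $TMPDIR, so os.replace is a
    # same-volume rename and cannot fail with a cross-volume EXDEV.
    staging = Path(tempfile.mkdtemp(prefix=".purge-stage-", dir=data_dir.parent))
    moved: List[str] = []
    try:
        for name in keep:
            if (data_dir / name).exists():
                os.replace(data_dir / name, staging / name)
                moved.append(name)
    except OSError:
        # Hard abort: restore whatever was staged and refuse the wipe.
        for name in moved:
            os.replace(staging / name, data_dir / name)
        staging.rmdir()
        raise
    shutil.rmtree(data_dir)
    data_dir.mkdir()
    for name in moved:
        os.replace(staging / name, data_dir / name)
    staging.rmdir()
```

The key design choice is that the wipe is only reached after every preserved file has been moved successfully; any failure before that point leaves the directory exactly as it was.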
v0.3.11 · 2026-06-07
Cloud body uploader policy-rejection handling, new pricing/body status commands, and a covering index for fast body upload status.
Added
- `CloudBodyPolicyRejected` exception type. New typed exception (`errors.py`) for the `200 + stored=false` case, distinct from transient failures (retry with backoff), quarantine targets (terminal 4xx), and auth errors (pause). Never quarantines, never backs off, never pauses; just skips the record and continues draining. Wires into the same exception-dispatch table as `CloudNotReady` (425) and `CloudSchemaMismatch` (4xx).
- `records_policy_rejected` counter on `BodyDrainResult`. Tick-level count of records skipped due to workspace policy. Surfaced in `run_forever` as a `WARNING` log when non-zero: `cloud.body_uploader.policy_rejected_tick` with count and hint.
- `halton-meter cloud bodies status` command. Per-day body upload progress table showing uploaded / pending counts with done and in-progress indicators. Queries local SQLite directly, no daemon or network needed. Accepts `--days N` to widen or narrow the window (default 30 days).
- "Body last upload" row in `halton-meter cloud status`. Shows `MAX(uploaded_at)` from local SQLite, a real upload-activity signal independent of the tick stamp. During a heavy backlog drain (where the tick stamp goes stale while 400 POSTs run for 10+ minutes) this row stays green and prevents a false "uploader may be dead" alarm.
Changed
- Performance: Covering index `(captured_at, uploaded_at)` on `request_bodies` (migration 13 to 14). `cloud bodies status` was taking 20-53 s because SQLite had to read every row's data pages (body text up to 1 MiB each) to retrieve two small timestamps. The covering index lets the query run entirely from index leaf pages: 17 ms vs 19 s (roughly 1000x faster). Existing installs get the index on first daemon start after upgrade.
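To illustrate why the covering index helps, the sketch below builds a toy `request_bodies` table and asks SQLite for its query plan. The table layout, index name, and status query are assumptions made for the example; only the indexed column pair comes from this entry.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema: the real table has more columns.
conn.execute(
    "CREATE TABLE request_bodies ("
    " id INTEGER PRIMARY KEY, captured_at TEXT, uploaded_at TEXT, body TEXT)"
)
# Hypothetical index name; the column pair matches the changelog entry.
conn.execute(
    "CREATE INDEX idx_bodies_captured_uploaded"
    " ON request_bodies (captured_at, uploaded_at)"
)

# A per-day progress query that touches only the two indexed columns.
query = (
    "SELECT date(captured_at), COUNT(*), COUNT(uploaded_at)"
    " FROM request_bodies WHERE captured_at >= ?"
    " GROUP BY date(captured_at)"
)
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("2026-05-08",)).fetchall()
plan_text = " ".join(row[3] for row in plan)  # column 3 is the plan detail
print(plan_text)
```

When the plan mentions a covering index, SQLite answers the query from the index alone and never loads the large `body` column.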
Fixed
- Cloud body uploader now respects workspace `store_prompt_content` policy. Previously, when a workspace had body storage disabled, the cloud backend returned `200 OK` (accepted but silently discarded), and the daemon treated this identically to a successful `204 No Content` and stamped `uploaded_at`, permanently losing the body. The uploader now distinguishes `204` (stored: mark and advance) from `200 + {"stored": false, "reason": "store_prompt_content_disabled"}` (policy rejected: do NOT mark `uploaded_at`). Bodies rejected by workspace policy remain pending and are automatically retried when the policy is re-enabled, no manual reset required.
- Body uploader staleness threshold raised 90 s to 300 s. The prior threshold (3 × 30 s metadata interval) fired false positives during any non-trivial backlog drain. The body tick row now also clears to ● when uploads are actively flowing within the threshold window, even if the cycle-start stamp is older.
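The `204` versus `200 + stored=false` distinction described above can be sketched as a small classifier. The enum and function names are hypothetical; the real uploader raises `CloudBodyPolicyRejected` through its exception-dispatch table rather than returning a value.

```python
from enum import Enum
from typing import Optional


class UploadOutcome(Enum):
    STORED = "stored"                    # stamp uploaded_at and advance
    POLICY_REJECTED = "policy_rejected"  # leave pending, keep draining
    OTHER = "other"                      # left to the existing error handling


def classify_body_upload(status: int, payload: Optional[dict]) -> UploadOutcome:
    """Classify a body-upload response (hypothetical helper)."""
    if status == 204:
        return UploadOutcome.STORED
    # A 200 only counts as a policy rejection when the body says stored=false.
    if status == 200 and isinstance(payload, dict) and payload.get("stored") is False:
        return UploadOutcome.POLICY_REJECTED
    return UploadOutcome.OTHER


rejected = classify_body_upload(
    200, {"stored": False, "reason": "store_prompt_content_disabled"}
)
print(rejected)  # UploadOutcome.POLICY_REJECTED
```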
v0.3.9 · 2026-06-07
Attribution-correctness fixes: opportunistic eviction TTL, and auto-classification so new AI tools no longer fall to misc.
Fixed
- Opportunistic eviction TTL corrected (attribution correctness). The `attribution_store` write path called `evict_stale()` with the 300 s default instead of the daemon prune-loop's 86 400 s (24 h). On a busy machine, any attribution row older than 5 minutes could be silently evicted mid-session, causing the remainder of a long Claude Code session to fall through to `misc`. Fixed to pass `max_age_s=86400.0` explicitly, matching the prune loop.
- Devin and other new AI tools no longer fall to `misc`. Sandboxed/Electron helper processes (Devin, future agents) report `cwd=/`, which exhausts every cwd-based attribution tier. Two fixes: (1) `devin` added to the default `process_mappings`; (2) new `auto_classify_unknown_processes` config flag (default `True`): when a process name is found via psutil but is not in `process_mappings`, the daemon derives a slug from the process name (e.g. `Devin Helper (Plugin)` becomes `devin-helper-plugin`) and uses it as the project label instead of falling through to `misc`. Unknown new tools are now attributed automatically; `process_mappings` overrides still win.
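The slug derivation described above could look like the following minimal sketch. The exact normalisation rules are an assumption chosen to reproduce the `Devin Helper (Plugin)` example; the function names, the shape of the mappings dict, and the empty-slug fallback to `misc` are hypothetical.

```python
import re
from typing import Dict


def slug_from_process_name(name: str) -> str:
    """Collapse every run of non-alphanumerics to one hyphen (assumed rule)."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return slug or "misc"  # assumption: an empty slug still lands in misc


def project_label(name: str, process_mappings: Dict[str, str],
                  auto_classify: bool = True) -> str:
    """Explicit mappings win; otherwise auto-classify or fall back to misc."""
    if name in process_mappings:
        return process_mappings[name]
    return slug_from_process_name(name) if auto_classify else "misc"


print(slug_from_process_name("Devin Helper (Plugin)"))  # devin-helper-plugin
```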
v0.3.8 · 2026-06-05
Pricing rate auto-pull (fetch + apply), default-off, so a rate change no longer requires a daemon version bump and PyPI republish. Includes fail-open behaviour and audit hardening folded into the unreleased feature.
Added
- Pricing rate auto-pull (fetch + apply), default-off. The daemon can now refresh its `pricing_rates` table from the cloud's published rates out-of-band. A new supervised background worker (`pricing.worker`) runs under the existing `CloudWorkerSupervisor` (respawn + bounded backoff + per-tick heartbeat) and, every `[pricing].pull_interval` (default 6h, floor 1h), fetches the published rates over one of two channels: paired daemons use the authenticated cloud `GET /v1/pricing/rates`; unpaired daemons fall back to a public `[pricing].pull_url`. Both channels then run the same pure validate (`pricing/pull.py`) + upsert (`pricing_cli.apply_rates`, `source='auto'`) path. No signing / no new dependency. Default OFF (`[pricing].auto_pull = false`); opt-in this release, with `HALTON_METER_NO_RATES_PULL=1` as a hard opt-out.
- `halton-meter pricing refresh`: one-shot fetch + validate + apply against the DB now (prints updated / skipped-operator-override / unchanged).
- `halton-meter pricing status`: shows auto-pull on/off, pull URL, the channel that would be used, last pull timestamp, the applied watermark (version/date), active-row counts by source (bundled/auto/operator), and the last error.
- `halton-meter pricing auto-pull on|off`: flips `[pricing].auto_pull` in `config.toml`.
- Fail-open: Auto-pull is fully fail-soft. `auto_pull=false` never spawns the worker; a network/timeout/non-2xx fetch logs `pricing.pull_failed`, stamps `last_error`, and keeps the last rates; a malformed body logs `pricing.pull_invalid` and leaves the DB untouched; a stale-vs-bundled or downgrade payload is a no-op; an `apply_rates` failure rolls the whole batch back; a worker death is respawned by the supervisor. Outbound HTTP uses `httpx.Client(trust_env=False)`. The cost path itself makes no network call and is unchanged.
Fixed
- Hardening (audit fixes): Relative >10x fat-finger guard on auto-pulls. The coarse absolute $0-1000/MTok ceiling let a published $300-where-$3.00-was-meant typo (100x) pass (300 < 1000). `apply_rates` now rejects the whole auto pull (all-or-nothing, transaction rolled back, `last_error` stamped) if any priced field moves more than 10x in either direction versus the current active row. The guard is `source='auto'` only; bundled (human-reviewed wheel) and operator (`pricing set`) writes are exempt. A 0-to-nonzero or nonzero-to-0 move on input/output also trips the guard.
- Hardening (audit fixes): Date-aware bundled-vs-auto precedence (silent-downgrade fix). The newer-auto guard was date-blind: it protected an auto row unconditionally, so after a wheel upgrade a stale auto row could shadow newer bundled rates. Precedence is now: operator always wins; within the non-operator tier the newer provenance date wins. A newer-dated bundled refresh supersedes a stale auto row; a fresher auto row still beats an older bundled refresh.
- Hardening (audit fixes): Bundled refresh no longer pollutes the auto-pull heartbeat. A `source='bundled'` refresh (runs on every fresh install + wheel upgrade via `refresh_bundled_if_newer`) now updates only the provenance watermark (`applied_bundled_date` / `applied_rates_version`); it no longer stamps `last_pull_at` or clears `last_error`. Those belong to the auto-pull worker, so `pricing status` on a never-pulled machine no longer shows a phantom "Last pull at" or silently clears a prior auto-pull error. Only `source='auto'` touches them.
- Hardening (audit fixes): HTTPS-only enforced on the rates fetch. Channel 2 ships unsigned this release, so plain HTTPS + sanity bounds + the monotonic gate are the entire transport trust story. `fetch_rates` now refuses any non-`https://` resolved URL before sending (an `http://` Channel-1 endpoint would have leaked the pairing token as a cleartext `Authorization: Bearer`), and `[pricing].pull_url` is validated at config-load (fail-open: an unsafe value is dropped to no Channel-2 fetch, never crashes the daemon).
- Hardening (audit fixes): Deterministic startup jitter. The fleet-spread startup jitter now derives from a SHA-256 of the machine fingerprint instead of the builtin `hash()` (which is `PYTHONHASHSEED`-salted and varied run-to-run), so the "reproducible per machine" property actually holds.
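The relative fat-finger guard from the first hardening item can be sketched as below. This is an illustration under stated assumptions, not the `apply_rates` implementation: the helper names and the dict-of-fields row shape are hypothetical, and for brevity the zero-transition rule is applied to every field rather than only input/output.

```python
from typing import Dict


def trips_guard(current: float, proposed: float, factor: float = 10.0) -> bool:
    """True when a rate moves more than `factor` times in either direction."""
    if current == proposed:
        return False
    if current == 0 or proposed == 0:
        return True  # 0-to-nonzero or nonzero-to-0 move
    return proposed > current * factor or proposed * factor < current


def auto_pull_allowed(active: Dict[str, float], pulled: Dict[str, float]) -> bool:
    """All-or-nothing: one tripped field rejects the whole pulled row."""
    return not any(
        trips_guard(active[field], pulled[field])
        for field in pulled
        if field in active
    )


# The $300-where-$3.00-was-meant typo is a 100x move and is rejected.
print(auto_pull_allowed({"input": 3.00, "output": 15.00},
                        {"input": 300.00, "output": 15.00}))  # False
```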
Removed
- Removed a stray committed Claude Code session lock (`.claude/scheduled_tasks.lock`) and gitignored `.claude/*.lock`.
v0.3.7 · 2026-06-05
Bundled pricing matrix refreshed to the 2026-05-31 cloud canonical, correcting several stale or drifting per-token rates across Claude, OpenAI, Gemini, and Grok.
Changed
- Bundled pricing matrix refreshed to the 2026-05-31 cloud canonical. `BUNDLED_RATES_DATE` bumped 2026-05-01 to 2026-05-31; `rates-manifest.json` bumped in lockstep. All values cross-referenced against the cloud single source of truth (`halton-meter-cloud` `rates.json`, effective 2026-05-31).
- Grok rates refreshed (finding D-3). Added `grok-4.3` / `grok-4.3-latest` (flagship $1.25 in / $2.50 out / $0.20 cache_read) and the `grok-4.20-*` family. Re-pointed legacy ids `grok-4`, `grok-4-mini`, `grok-3`, `grok-3-fast`, `grok-2-vision-1212` to the grok-4.3 rate per the canonical. (`grok-3-mini`, `grok-3-mini-fast` are not in the canonical and are left unchanged; the `grok-build` CLI SKUs already matched.)
- Added the missing Gemini Flash models (finding D-4). `gemini-3.5-flash` ($1.50/$9.00), `gemini-3.1-flash-lite` ($0.25/$1.50), `gemini-2.5-flash` ($0.30/$2.50), `gemini-2.5-flash-lite` ($0.10/$0.40), all with `cache_write=0` per the Gemini caching convention.
Fixed
- `claude-opus-4-8` now prices correctly in the local report (finding D-1). The bundled rate matrix had no `claude-opus-4-8` entry, so `compute_cost_millicents('claude-opus-4-8', …)` returned `None` and the terminal report showed ~$0 for today's flagship Claude model. Added `claude-opus-4-8` and the dated alias `claude-opus-4-8-20260219` at the published Standard-tier rate ($5 in / $25 out / $0.50 cache_read / $6.25 cache_write).
- Stale OpenAI flagship rates corrected (finding D-2). `gpt-5.5` was a ~2.5x undercount at $2/$8, corrected to its published $5 in / $30 out / $0.50 cache_read. `gpt-5.4` corrected to $2.50/$15/$0.25. `gpt-5.4-mini` corrected from the gpt-4.1-mini mapping to its own published $0.75/$4.50/$0.075. (`gpt-5.2`, `gpt-5.3-codex`, `codex-auto-review`, `codex-unknown` keep their closest-tier mapping, still no published per-token rate.)
- Four drifting Gemini standard-tier rates reconciled to the cloud canonical. Most critically, `gemini-3-flash-preview` (the on-the-wire Code Assist SKU) was a live 2x undercount, priced at $0.25/$1.50, exactly half its published Standard rate of $0.50 in / $3.00 out / $0.05 cache_read. Now corrected (and the `gemini-code-assist-unknown` fallback row that mirrors it). The other three were cache_read/cache_write drift: `gemini-3.1-pro-preview` cache_read $0.50 to $0.20; `gemini-2.5-pro` cache_read $0.31 to $0.125; `gemini-3.1-flash-live-preview` cache_read $0.075 to $0 (no published per-MTok context-caching line). All four now carry `cache_write=0` per the Gemini caching convention. Input/output and the >200k tiered/modal rows were already correct and are unchanged.
v0.3.6 · 2026-06-02
Large request bodies no longer silently vanish from the cloud, with byte-accurate truncation at capture and upload time plus added headroom on the per-upload cap.
Changed
- The default per-upload body cap now has headroom over the store cap. `[cloud.bodies] max_body_bytes_per_upload` defaults to 1 MiB (was 512 KiB, exactly equal to the 512 KiB store cap, which left zero headroom for UTF-8 inflation). Decoupling them makes the drop impossible on a fresh install even before the truncation fixes matter.
- `max_body_bytes_per_upload` is re-read each drain tick. Previously captured once at worker spawn, so a cap change needed a daemon restart, inconsistent with the `bodies.upload_from` watermark, which already re-reads per tick. A cap change now takes effect on the next tick without a restart. No DB migration.
Fixed
- Large request bodies no longer silently vanish from the cloud. When a captured body leg exceeded the per-upload size cap, the cloud body uploader logged `skip_oversize_*` and uploaded nothing for that leg, yet still stamped the row `uploaded_at`, so the dropped leg was never retried and permanently lost (the dashboard showed "Request body: Not captured" for large LLM requests while the smaller response leg uploaded fine). The uploader now truncates an oversized leg to the cap on a valid UTF-8 boundary and uploads it, never dropping it; the wire envelope keeps the true original size so the dashboard can render "truncated (N of M bytes)". A multibyte character straddling the cap is never split. The `skip_oversize_*` logs are replaced by `upload_truncated_*` (info).
- Stored bodies no longer exceed the store cap in bytes. Capture-time truncation sliced the body by character count (`text[:max_bytes]`), so a multibyte body could be stored anywhere from a few bytes to several KB OVER `[bodies] max_body_bytes` in encoded UTF-8. It now truncates by encoded UTF-8 byte count on a char boundary, so a stored body never exceeds the cap. (This byte/char mismatch, combined with the equal default caps described under Changed above, is what made oversized legs hit the upload drop in the first place.)
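The difference between slicing by characters and truncating by encoded bytes can be shown with a short sketch. `truncate_utf8` is a hypothetical name and one possible way to cut on a character boundary; the daemon's own implementation may differ.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Truncate to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # The input was valid UTF-8, so only the tail of the slice can be a
    # partial multibyte sequence; errors="ignore" drops just that fragment.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


body = "日本語"  # 3 characters, 9 bytes in UTF-8
print(len(body[:4].encode("utf-8")))               # 9: the char slice overshoots a 4-byte cap
print(len(truncate_utf8(body, 4).encode("utf-8")))  # 3: one whole character fits
```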
v0.3.5 · 2026-06-01
Timeline-selective body upload via a `bodies.upload-from` watermark, and a privacy-by-default change so enabling body sync uploads from now rather than backfilling all history.
Added
- Timeline-selective body upload (`bodies.upload-from` watermark). The cloud body uploader can now be scoped to only upload request/response bodies captured on or after a chosen point in time, instead of always draining the full local history oldest-first. Set it with `halton-meter cloud privacy set bodies.upload-from now` (only bodies from here forward), `… bodies.upload-from 2026-05-15` (an ISO date or datetime, normalised to UTC), or `… bodies.upload-from none` (upload all history, the legacy behaviour). Surfaced in `cloud privacy show`. Config-driven (`[cloud.bodies] upload_from`), re-read each drain tick so a change takes effect without a daemon restart; no DB migration. Fail-open: a malformed value falls back to no filter and logs.
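The three accepted forms of the watermark (`now`, an ISO date or datetime, `none`) and the fail-open fallback can be sketched as follows. `parse_upload_from` is a hypothetical name, and treating a timezone-naive value as UTC is an assumption made for this example.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_upload_from(value: str, now: Optional[datetime] = None) -> Optional[datetime]:
    """Return the UTC watermark, or None meaning 'no filter' (hypothetical)."""
    value = value.strip()
    keyword = value.lower()
    if keyword == "none":
        return None
    if keyword == "now":
        return now or datetime.now(timezone.utc)
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        return None  # fail-open: a malformed value means no filter
    if parsed.tzinfo is None:
        # Assumption: values without an offset are interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


print(parse_upload_from("2026-05-15"))  # 2026-05-15 00:00:00+00:00
```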
Changed
- Enabling body sync now defaults to "from now," not a full-history backfill. `halton-meter cloud privacy set bodies.enabled true` now stamps `upload_from = <now>` on the off-to-on transition and prints a notice, so turning body sync on uploads new traffic going forward rather than retroactively shipping the entire local body store. Already-enabled installs are unaffected (no retroactive change). Run `cloud privacy set bodies.upload-from none` to opt into uploading history. Rationale: privacy-by-default and no surprise bulk upload (see `memory/decisions.md`).
v0.3.4 · 2026-06-01
The continuous cloud-sync workers now self-recover under a supervisor instead of silently dying, with per-worker heartbeats so cloud status can distinguish a dead worker from an idle one.
Fixed
- The continuous cloud-sync worker now recovers itself instead of silently dying. Both background workers (the metadata sync worker and the body uploader) were spawned once and never supervised: if the task exited for any reason (a stray cancellation, a `BaseException`, even a clean return) it died with no log line and was never respawned, so cloud sync stalled until the next daemon restart. This surfaced with no error and no failure-counter movement (`Consecutive failures: 0`, `Last error: (none)`), because those fields only change on a failed sync; a dead worker writes nothing at all. Both workers are now wrapped in a `CloudWorkerSupervisor` that detects unexpected exits, logs `cloud.supervisor.worker_exited` (retrieving the exception so it's never swallowed), and respawns with exponential backoff (1s to 60s, capped at 8 consecutive fast restarts then degrade-not-crash; a healthy run resets the counter). Shutdown cancellations are classified and never trigger a respawn.
- `cloud status` can now tell a dead worker from an idle one. Added per-worker heartbeats: `Worker last tick` and `Body uploader last tick`, stamped on every loop iteration including idle ones (new `last_tick_at` / `last_body_tick_at` columns, schema `user_version` 10 to 12, additive verify-before-bump migrations). Previously a stalled worker and a caught-up worker were indistinguishable in the status output, which is why a 47-minute stall went unnoticed.
- Body-uploader teardown gap closed. The body-uploader task was never added to the daemon's teardown set, so it wasn't cleanly cancelled/awaited on shutdown. It now is.
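A supervisor loop of the kind described above might be sketched like this. It is a simplified illustration, not `CloudWorkerSupervisor` itself: the doubling schedule is an assumption consistent with "1s to 60s", every cancellation is treated as a shutdown, and the healthy-run counter reset is omitted. The `sleep` parameter is injected only so the sketch can be exercised without waiting.

```python
import asyncio
from typing import Awaitable, Callable, List


async def supervise(
    worker: Callable[[], Awaitable[None]],
    *,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    max_fast_restarts: int = 8,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> List[float]:
    """Respawn `worker` after any unexpected exit; return the delays used."""
    delays: List[float] = []
    for attempt in range(max_fast_restarts):
        try:
            await worker()
            reason = "clean return"  # even a clean return counts as an exit
        except asyncio.CancelledError:
            raise  # simplification: treat cancellation as shutdown, no respawn
        except Exception as exc:
            reason = repr(exc)  # keep the exception so it is never swallowed
        delay = min(max_delay, base_delay * 2 ** attempt)
        print(f"worker exited ({reason}); respawning in {delay:.0f}s")
        delays.append(delay)
        await sleep(delay)
    return delays  # restart budget spent: degrade instead of crashing
```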
v0.3.3 · 2026-05-31
Reconcile no longer 422s by sending calendar dates over a like-for-like window, plus macOS test-suite isolation fixes.
Fixed
- `halton-meter cloud reconcile` no longer 422s. The command built full ISO datetimes (`…T10:19:06Z`) for the cloud's `daemon-totals` endpoint, but that endpoint types `from_date` / `to_date` as calendar `date`, so every reconcile was rejected 422. It now sends plain ISO dates, and the local daemon-side total is summed over the same inclusive calendar-date window so the daemon-vs-cloud variance line compares like-for-like (instead of a rolling 24h x N window vs N calendar days). Added a command-level regression test (the prior test only exercised the client method, not the command's date construction).
- macOS test-suite isolation. Cleared two cross-file global-state leaks surfaced when the full suite runs on a macOS dev box with a live daemon: structlog's `cache_logger_on_first_use` bind cache (which `reset_defaults()` doesn't clear, silently bypassing `capture_logs()`), and the `_macos_interfaces` TTL cache (which seeded the live machine's interfaces into later mocked tests). Updated several FAIL-OPEN-1 tests to expect the blank-then-disable behaviour and to stub the listener/daemon-listener probes so they don't depend on a running daemon. Test-only; no product change.
v0.3.2 · 2026-05-31
Machine-identity stability: stop the daemon from re-identifying as a new machine (and accumulating sync keys).
Changed
- `halton-meter cloud connect` is now idempotent. If valid credentials already exist and still authenticate (a `whoami` probe), it reports the existing connection and skips pairing, so re-running `connect` no longer mints a fresh key and revokes the prior one on every invocation (which left a trail of revoked keys). Pass `--force` to re-pair on demand (switch workspace, or recover after revocation). Only a confirmed `whoami` short-circuits; missing/invalid creds or an unreachable cloud fall through to normal pairing.
Fixed
- Persist the machine fingerprint to `~/.halton-meter/machine-id` (mode 0600, atomic write). The fingerprint was already derived from a stable hardware id (macOS `IOPlatformUUID` / Linux `/etc/machine-id` / Windows `MachineGuid`) but was recomputed live on every pair and never cached, so a transient probe failure made it return `None`, the daemon omitted it, and the cloud fell back to hostname+os dedup and minted a new machine + key. Now the derived value is cached and read first, so a probe blip or hostname rename can never change identity. The derivation basis is unchanged, so existing machines keep the same fingerprint; they just stop drifting. (Pairs with halton-meter-cloud #150: revoke-on-mint + one-active-sync-key-per-machine unique index.)
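The cache-first lookup with an atomic, owner-only write could be sketched as follows. Both function names are hypothetical, and the temp-file-then-rename approach is one common way to get an atomic write; it is not confirmed to be the daemon's exact mechanism.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Optional


def write_machine_id(path: Path, fingerprint: str) -> None:
    """Write the fingerprint atomically with owner-only permissions."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # mkstemp creates the file readable and writable only by the owner.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=".machine-id-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(fingerprint)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic rename within one filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def load_or_derive(path: Path, derive: Callable[[], Optional[str]]) -> Optional[str]:
    """Read the cached value first; only probe hardware when nothing is cached."""
    try:
        cached = path.read_text(encoding="utf-8").strip()
        if cached:
            return cached
    except OSError:
        pass  # no cache yet, fall through to the live probe
    fingerprint = derive()
    if fingerprint:
        write_machine_id(path, fingerprint)
    return fingerprint
```

Because the cached file is read before any probe runs, a later probe failure returns the stored value instead of `None`.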
v0.3.1 · 2026-05-31
Production-readiness hardening: a system-proxy fail-open failsafe, activation of cloud quarantine, and the coordinated 425/404 body-upload contract.
Added
- FAIL-OPEN-1, macOS system-proxy failsafe. The system proxy can no longer be left enabled pointing at a dead loopback port (the fail-CLOSED reinstall trap a user hit: uninstall left `127.0.0.1:8081` retained, so `pip` / `uv` got `Connection refused`). Four independent mechanisms: a sentinel-independent watchdog reaper that clears any enabled loopback proxy with no listener; an enable-guard that refuses to point the OS at an unbound port; an ungated edge `atexit` / signal-trap disable; and `uninstall` / `stop` / `reset-proxy` now BLANK the server:port, not just toggle state. See `memory/decisions.md` 2026-05-31.
- Cloud quarantine activated (H7/M2). A single contract-incompatible record can no longer freeze metadata sync (H7) or silently vanish a body (M2). The supervisor now wires a quarantine writer into both the metadata worker and the body uploader; terminally-rejected records are recorded in `cloud_quarantine` and skipped past so the queue drains. `halton-meter cloud status` shows a `Quarantined: N` count (a contract-drift early-warning signal; healthy = 0).
Changed
- Coordinated 425/404 body-upload contract. A `425 Too Early` from `POST /v1/requests/{id}/body` (request not synced yet) is treated as transient: the not-ready body is skipped, so it can never wedge the bodies behind it, and retried next tick. A `404` (wrong-tenant / genuine orphan) is terminal and quarantined (recorded, never silently discarded). The cloud worker's inter-tick sleep now wakes promptly on shutdown. Deploy order (load-bearing): the cloud's 425 split must be live BEFORE this daemon reaches production. A 0.3.1 daemon maps 404 to quarantine; against an older cloud that still returns 404 for the unsynced race it would wrongly quarantine early bodies. Deploy cloud-first.
Fixed
- PR #46 prod-audit Tier 1/2 hardening (merged): migration verify-before-bump (no `user_version` stamp on a failed `ALTER`), `_pending` capture-dict eviction (no unbounded growth), `busy_timeout`, redact-before-truncate, edge connect/idle timeouts, shutdown drain + WAL checkpoint.
- Added `pytest-timeout` (dev) with a global per-test timeout so a rare async-deadlock test fails by name instead of hanging the whole suite/CI.
v0.3.0 · 2026-05-29
End-to-end error classification across all four providers (Anthropic, Gemini, OpenAI, Grok/xAI).
Added
- End-to-end error classification on outgoing cloud log records: `error_class`, `provider_error_code`, `http_status`, `retryable`. Shipped for Anthropic, Gemini, OpenAI, and Grok/xAI traffic.
- OpenAI error classifier (`daemon/halton_meter/adapters/openai.py`) covering the full HTTP 4xx/5xx surface. 429 responses are bucketed by `error.type` then `error.code`: `rate_limit_error` becomes `rate_limit` (retryable), `insufficient_quota` becomes `auth` (non-retryable). This matches operator-remediation semantics, not raw HTTP status.
- Grok/xAI provider routed through the OpenAI classifier, no separate adapter. xAI is OpenAI-SDK compatible at the error envelope level.
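The 429 bucketing rule above (inspect `error.type`, then `error.code`) can be sketched as a small function. The function name is hypothetical, and the default for a 429 that matches neither value is an assumption; only the two mappings shown in this entry come from the changelog.

```python
from typing import Optional, Tuple


def classify_openai_429(error_type: Optional[str],
                        error_code: Optional[str]) -> Tuple[str, bool]:
    """Return (error_class, retryable) for an HTTP 429 error envelope."""
    # Check error.type first, then error.code, as the entry describes.
    for value in (error_type, error_code):
        if value == "rate_limit_error":
            return ("rate_limit", True)
        if value == "insufficient_quota":
            return ("auth", False)
    # Assumption: an unrecognised 429 is still treated as a retryable rate limit.
    return ("rate_limit", True)


print(classify_openai_429(None, "insufficient_quota"))  # ('auth', False)
```

Bucketing an exhausted quota as `auth` reflects who has to act: waiting does not help, whereas an operator fixing billing does.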
Changed
- Internal error-class enum split `timeout_or_network` into distinct `timeout` and `network` values for end-to-end fidelity. Wire-side bucket vocabulary remains the canonical seven: `rate_limit | server_error | bad_request | auth | timeout | network | unknown`. Unknown local values fall through to `unknown` on the wire (forward-compat).
- Anthropic `overloaded_error` (HTTP 529) classified as `server_error` with `retryable=true`, a provider-availability signal, not quota.
- Wire contract: New fields on cloud log records (all nullable): `error_class`, `provider_error_code`, `http_status`, `retryable`. Backend tolerates older daemons emitting none of them. `error_message_hash` remains local-only and is never serialised to the wire.
- Notes: Daemon version bump (`pyproject.toml`) and `v0.3.0` git tag intentionally NOT included in this changelog entry's commit, gated on cloud migration 0040 deploying to prod RDS and the four new fields landing successfully in the `requests` table via smoke-test. Add the version bump in a follow-up commit immediately before tagging.
v0.2.15 · 2026-05-28
Restore Grok CLI metering; self-heal missing keychain trust settings.
Added
- Tests: 5 new tests in `tests/test_setup.py::TestHasAdminTrustSettingsMacos` covering the populated/empty/timeout/missing-binary/substring-false-positive cases for the new helper. 1 new test `test_s1b_drift_verify_ok_but_no_admin_trust_settings_returns_false` reproducing the 2026-05-28 Grok incident at unit-test level. `test_s1_*` updated to mock both gates via `side_effect`; pins gate-2 argv to `["security", "dump-trust-settings", "-d"]`. `test_s2_*` pins `mock_run.call_count == 1` so a future gate-reorder cannot accidentally pass for the wrong reason.
Fixed
- Grok CLI metering restored. Re-adds `cli-chat-proxy.grok.com` to `LLM_INTERCEPT_HOSTS`. The 0.2.14 exclusion was based on a misdiagnosis: Grok CLI's binary is linked against `rustls-platform-verifier` + `rustls-native-certs` (verified via `strings`), which consult the macOS keychain; they do not bundle WebPKI roots. The real cause of the `tlsv1 alert unknown ca` failure was a drift state in the local keychain (see next item), not a fundamental TLS-stack incompatibility. End-to-end verified on macOS 15.x with grok 0.2.3: four `grok-build` rows captured with full token + cost attribution.
- `trust_cert_macos()` now self-heals missing admin trust settings. `_is_cert_trusted_macos()` previously short-circuited based on `security verify-cert -p ssl` alone, which is more lenient than the SecTrust settings enumerator (`SecTrustSettingsCopyCertificates`) used by stricter modern verifiers like `rustls-platform-verifier`. On a machine where the mitm CA was imported into `/Library/Keychains/System.keychain` but missing from `security dump-trust-settings -d` (a drift state observed on 2026-05-28), `trust_cert_macos()` reported "already trusted" and skipped re-running `add-trusted-cert`. The check is now two-gated: both `verify-cert -p ssl` rc=0 and an explicit `Cert N: mitmproxy` entry in `dump-trust-settings -d` are required. When the drift state is detected, the existing admin-dialog re-trust path is triggered automatically on the next `halton-meter init`.
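The two-gate decision described above can be sketched as a pure function over the two command results, which keeps it testable off macOS. The function name, the exact-line regular expression, and the sample outputs are assumptions for illustration; the real helper shells out to `security` itself.

```python
import re

# Assumption: a trusted entry appears on its own line as "Cert N: mitmproxy".
_TRUST_ENTRY = re.compile(r"^Cert \d+: mitmproxy\s*$", re.MULTILINE)


def is_cert_trusted(verify_rc: int, dump_trust_output: str) -> bool:
    """Both gates must pass: verify-cert rc=0 AND an explicit trust entry."""
    gate_verify = verify_rc == 0
    gate_settings = _TRUST_ENTRY.search(dump_trust_output) is not None
    return gate_verify and gate_settings


# Drift state: verify-cert succeeds but no admin trust setting is listed.
drift_output = "Number of trusted certs = 0\n"
print(is_cert_trusted(0, drift_output))  # False, so re-trust is triggered
```

Anchoring the pattern to a whole line is what guards against the substring false positive mentioned in the test list for this release.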
v0.2.14 · 2026-05-28
Fix Grok CLI broken by TLS interception.
Fixed
- Removed `cli-chat-proxy.grok.com` from the intercept allowlist. The Grok CLI is built with Rust's `reqwest` / `rustls`, which bundles its own WebPKI CA roots and ignores the macOS system keychain. It rejected the mitmproxy CA cert with `tlsv1 alert unknown ca`, breaking every Grok CLI request with "Retry failed: reqwest error stream". Removing the host from `LLM_INTERCEPT_HOSTS` restores fail-open passthrough: Grok CLI traffic tunnels through unmetered. The `GrokCLIAdapter` is retained in the registry; re-enable the host if Grok CLI adopts system-CA trust in a future release.
v0.2.13 · 2026-05-28
Hardware fingerprint for machine deduplication in pairing.
Added
- Hardware fingerprint in pairing start. `POST /v1/pairing/start` now includes a `fingerprint` field, a 32-char hex string (SHA-256 of a platform-specific hardware ID, truncated). Source: IOPlatformUUID on macOS, `/etc/machine-id` on Linux, `MachineGuid` registry key on Windows, falling back to `SHA-256(hostname + MAC)`. The backend uses the fingerprint to upsert rather than insert, preventing duplicate machine rows when a daemon re-pairs after a reinstall or version upgrade. Fingerprint derivation never raises and returns `None` gracefully so pairing is never blocked.
- Tests: 5 new tests in `tests/test_fingerprint.py`: shape validation, macOS UUID parsing, Linux machine-id reading, total-failure `None` path, and determinism.
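The fingerprint shape described above (SHA-256 of a hardware ID, truncated to 32 hex characters, never raising) can be sketched as below. The function name and the injected reader are hypothetical, and stripping whitespace before hashing is an assumption; platform-specific probing is deliberately left out.

```python
import hashlib
from typing import Callable, Optional


def derive_fingerprint(read_hardware_id: Callable[[], Optional[str]]) -> Optional[str]:
    """Return a 32-char hex fingerprint, or None if the probe fails."""
    try:
        hardware_id = read_hardware_id()
        if not hardware_id:
            return None
        digest = hashlib.sha256(hardware_id.strip().encode("utf-8")).hexdigest()
        return digest[:32]  # truncated SHA-256, as in the pairing field
    except Exception:
        return None  # never raise, so pairing is never blocked


fp = derive_fingerprint(lambda: "hypothetical-hardware-uuid")
print(len(fp))  # 32
```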
v0.2.11 · 2026-05-24
Windows apps-mode support (phase 0) plus cloud-connect stale task fix.
Added
- Windows apps-mode support. `uvx halton-meter` now works on Windows 10 / 11. No admin rights required for apps-mode. The daemon and edge ship as Task Scheduler `ONLOGON` user-level tasks. The mitmproxy CA cert is installed into the user cert store via `certutil -user -addstore Root`. The system proxy is written to `HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings` (`ProxyEnable`, `ProxyServer`) with a `WM_SETTINGCHANGE` broadcast so running Electron / WinINet apps pick up the change without restart. `HTTPS_PROXY` is written to `HKCU\Environment`. Four new platform modules: `system_proxy/_windows.py`, `install/_windows.py`, `lifecycle/_windows.py`, `setup/_windows.py`. All `winreg` / `ctypes.windll` calls are fail-open and guarded so the module parses cleanly on macOS / Linux. Full-mode (machine-wide proxy, NSSM service, MDM cert) is deferred to post-v1.0.
- Tests: 27 new tests in `tests/test_windows_apps_mode.py`; all mock `winreg` so they pass on macOS / Linux CI, and cover registry proxy writes, env-var management, non-raising failure handling, snapshot/restore round-trip, dispatch wiring.
Fixed
- `POST /v1/cloud/connect` no longer leaves a zombie task on re-pair. A previous handshake that settled (`denied`, `expired`, `failed`) left its background asyncio task referenced in `_active.task`. A new `POST /v1/cloud/connect` call now cancels the old task before replacing `_active`, preventing a rare race where two concurrent background tasks could both call `_persist_credentials`.
v0.2.10 · 2026-05-24
Attribution correctness, body capture memory gate, and audit hardening.
Added
- Tests: 7 new tests in `tests/test_attribution_store_source_workdir.py`: write + lookup 2-tuple, workdir propagation, `ALTER TABLE` migration on old DB, tagging Step 0 workdir pass-through. 8 new tests in `tests/test_body_capture_size_gate.py`: request gate fires at exactly 4 MiB, logs structured event, passes through normally below threshold; symmetric tests for response side.
Fixed
- `source_workdir` now stored in the attribution cache. `attribution_log` gains a `source_workdir TEXT` column (fail-open `ALTER TABLE` migration for existing DBs). The edge process's `_resolve_via_psutil` propagates the resolved `cwd` through `resolve_client_project` to `attribution_store.write`. `tagging.py` Step 0 (edge-store cache hit) now returns the real `source_workdir` instead of always `None`. Every cached row written from v0.2.10 forward carries the originating directory; historical rows show `NULL`.
- Body capture size gate. `proxy.py` skips `capture_body()` entirely for request or response bodies larger than 4 MiB. Logs `addon.body_capture.skip` with `reason=body_too_large`, `size_bytes`, and `gate_bytes`. Prevents a second memory spike from decoding large multi-modal payloads that mitmproxy already buffered. `flow.request.content` / `flow.response.content` access is isolated in its own `try/except` so a mitmproxy internal decode error cannot swallow the attribution block for that flow.
- Attribution store schema-init lock corrected. `_open_connection` now holds `_LOCK` across the full DDL bootstrap block rather than claiming the flag before doing the work. The previous "claim early" approach created a TOCTOU window where concurrent threads skipped bootstrap while the table didn't exist yet, causing `no such table: attribution_log` failures under high concurrency at process startup.
- Dead `psutil.net_connections()` scan removed. `tagging._get_process_name` assigned `psutil.Process(psutil.net_connections()[0].pid)` then immediately shadowed the variable in the `for proc in psutil.process_iter()` loop, a full connection-table scan that fired on every call and discarded the result.
- `cloud.worker.sync_paused` log event name. Was `cloud.worker.paused_unauthorised`, which was inaccurate for the HTTP 403 to `paused_forbidden` terminal path. Renamed to `cloud.worker.sync_paused` (reason-agnostic; the pause reason lives in the DB row and the `run_once` drain log).
- Proxy content access isolated. `flow.request.content` / `flow.response.content` reads in the size-gate path are wrapped in `try/except` so a mitmproxy decode exception cannot propagate to the outer request hook and drop attribution for the flow.
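The schema-init lock fix in this release boils down to setting the "bootstrapped" flag only after the DDL has run, with the lock held across both steps. A minimal sketch under stated assumptions: the flag name, the function shape, and the table columns are hypothetical, and only `_LOCK` and `attribution_log` come from the entry above.

```python
import sqlite3
import threading

_LOCK = threading.Lock()
_bootstrapped = False  # hypothetical flag name


def open_connection(path: str) -> sqlite3.Connection:
    """Open a connection, running the DDL bootstrap at most once per process."""
    global _bootstrapped
    conn = sqlite3.connect(path)
    with _LOCK:
        if not _bootstrapped:
            # Illustrative schema only; the real column set is not shown in the changelog.
            conn.execute(
                "CREATE TABLE IF NOT EXISTS attribution_log ("
                "key TEXT PRIMARY KEY, project TEXT, source_workdir TEXT)"
            )
            conn.commit()
            # The flag flips only after the table exists, so a concurrent
            # caller can never skip bootstrap and then hit a missing table.
            _bootstrapped = True
    return conn
```

In the "claim early" variant the flag is set before the `CREATE TABLE` runs, which is exactly the window in which a second thread would see `no such table`.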
v0.2.9 · 2026-05-24
Daemon hardening and report row-cap removal.
Added
- Loopback bind guard (`halton_meter/security.py`). On startup, the daemon resolves its configured `listen_host` and `api_host` via `socket.getaddrinfo` and hard-exits (`SystemExit(1)`) if either resolves to a non-loopback address. Prevents accidental LAN exposure if a user edits the config to `0.0.0.0`. Override via `HALTON_METER_ALLOW_NON_LOOPBACK=1` env var (logs a warning) for container / VPN topologies that legitimately bind on non-loopback.
- Dependency upper-bound caps. `mitmproxy>=10.0,<13`, `pydantic>=2.0,<3`, `httpx>=0.27,<1`, `click>=8.1,<9`. Prevents silent breakage when those libraries ship major-version API changes.
Fixed
- `halton-meter report` no longer caps at 10,000 rows. `storage.read_records` accepted a hard `limit=10_000` default; the report command passed that value explicitly. `read_records` now accepts `limit: int | None = None` and omits the `LIMIT` clause when `None`. Reports across large databases are now complete.
- Log rotation enabled. Daemon log files now rotate at 100 MiB with 5 backups retained (`RotatingFileHandler`). Previously a long-running daemon on a busy machine could produce an unbounded single log file.
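For readers who want to see what the loopback bind guard from this release amounts to, here is a rough sketch. The function names are hypothetical and the real `halton_meter/security.py` may differ; only `socket.getaddrinfo`, `SystemExit(1)`, and the `HALTON_METER_ALLOW_NON_LOOPBACK` override are taken from the entry.

```python
import ipaddress
import logging
import os
import socket
from typing import Iterable

log = logging.getLogger("loopback_guard_sketch")


def resolves_to_loopback_only(host: str) -> bool:
    """True when every address the host resolves to is a loopback address."""
    infos = socket.getaddrinfo(host, None)
    # sockaddr[0] is the address string; strip any IPv6 scope id ("%lo0").
    addrs = {info[4][0].split("%")[0] for info in infos}
    return bool(addrs) and all(ipaddress.ip_address(a).is_loopback for a in addrs)


def enforce_loopback(hosts: Iterable[str]) -> None:
    """Hard-exit on a non-loopback bind unless the override env var is set."""
    for host in hosts:
        if resolves_to_loopback_only(host):
            continue
        if os.environ.get("HALTON_METER_ALLOW_NON_LOOPBACK") == "1":
            log.warning("binding non-loopback host %s (override set)", host)
            continue
        raise SystemExit(1)
```

Checking every resolved address, rather than the first one, matters for hostnames that resolve to both a loopback and a routable address.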
v0.2.8 · 2026-05-24
Sync pause-classification + recovery fix. Three items.
Added
- `halton-meter cloud resume` is a real recovery command. Reads the current pause reason; on `paused_manual` clears immediately; on `paused_unauthorised` / `paused_forbidden` / anything else hits `GET /v1/daemon/whoami` once with the stored token. If 200, clears pause + counters + `last_error` on both `cloud_state` and `cloud_sync_state` so `cloud status` shows green on the next read. If real 401, prints "API key is genuinely invalid, re-pair required" and exits non-zero. If transient, surfaces the real status. Unblocks the daemon without burning a `cloud connect` (which would revoke the still-valid existing key and grow the cloud's revoked-devices list).
- `CloudForbidden` exception class. Distinct from `CloudUnauthorised`. HTTP 403 maps here; worker writes `paused_reason='paused_forbidden'`. The status banner explains that the recovery path is "have the workspace owner re-invite this machine, then `cloud resume`", not "re-pair".
- `cloud_sync_state.last_error_at` column (`user_version` 6 to 7, additive `ALTER TABLE`). Stamped together with `last_error` by both pause and error writers. Surfaced in `halton-meter cloud status` as the "Last error at" row so a stored error can be told apart as fresh vs stale.
- Tests: New `tests/cloud/test_pause_classification.py`, 12 tests covering real 401 to `paused_unauthorised`, real 403 to `paused_forbidden`, and that 500/502/503/RemoteProtocolError/ReadTimeout/ConnectTimeout/ConnectError/422/400/429 all skip the pause writer and route through `error_writer`. Belt-and-braces: each transient-failure test asserts both that `pauses == []` AND that `errors != []`. Existing `test_worker_drain.py` continues to pass without modification; the worker's API gained an optional `error_writer` parameter that defaults to `None`. Storage `user_version` bumped 6 to 7 in `engine.CURRENT_USER_VERSION`; `test_user_version_stamped` updated implicitly via the constant.
Fixed
- `sync.paused_unauthorised` triggers strictly on HTTP 401. Previously, any failure that escaped the worker's retry envelope inside `cloud sync` could be misclassified as 401, including the ALB-side connection reset seen during the 2026-05-24 rolling deploy. The daemon would then write `paused_unauthorised`, force the user to re-pair, and accumulate revoked entries in the cloud's paired-devices list. v0.2.8: only an actual HTTP 401 response on `/v1/requests/batch` writes that pause class. Transient errors (5xx, `RemoteProtocolError`, `ReadTimeout`, `ConnectTimeout`, `httpx.ConnectError` / DNS failure, HTTP 429, HTTP 422/400) route through a new `error_writer` seam that bumps `consecutive_failures` and stamps `last_error` + `last_error_at` but never touches `paused_reason`. Each pause-causing branch now emits a structured log line (`sync.paused reason=… http_status=… last_request_id=…`) so the cause is visible. A pause-classification table maps each wire condition to its before-v0.2.8 vs from-v0.2.8 behaviour (e.g. HTTP 403 now becomes `paused_forbidden`; 5xx, conn reset, timeouts, ConnectError, 429, and 400/422 are now transient with no pause).
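The classification rule introduced in v0.2.8 can be summarised in a few lines. The sketch below is illustrative rather than the worker's actual code: `classify_sync_failure` is a hypothetical name, and representing a transport-level failure as `http_status=None` is an assumption.

```python
from typing import Optional


def classify_sync_failure(http_status: Optional[int]) -> Optional[str]:
    """Return a pause reason, or None when the failure is transient.

    http_status is None for transport-level failures (connection reset,
    timeout, DNS failure) where no HTTP response was received.
    """
    if http_status == 401:
        return "paused_unauthorised"
    if http_status == 403:
        return "paused_forbidden"
    # 5xx, 429, 422/400 and transport errors never pause; they would go
    # through the error-writer path instead.
    return None


for status in (401, 403, 500, 429, None):
    print(status, classify_sync_failure(status))
```

The key property is that nothing except a real 401 or 403 response can produce a pause reason, so a dropped connection during a deploy cannot force a re-pair.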
v0.2.7 · 2026-05-23
`base_url` auto-heal for placeholder TLDs.
Fixed
- Self-heal `[cloud].base_url` when it contains a placeholder TLD (`.test`, `.example`, `.invalid`, `.local`): rewrites to `https://api.haltonmeter.com` on next boot and logs once. Catches stale config from spike scripts.
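As a rough illustration of the auto-heal rule above, the check reduces to a hostname suffix test. `heal_base_url` is a hypothetical name, and the sketch leaves out the config rewrite and the one-time log line that the entry describes.

```python
from urllib.parse import urlsplit

PLACEHOLDER_TLDS = (".test", ".example", ".invalid", ".local")
PRODUCTION_URL = "https://api.haltonmeter.com"


def heal_base_url(base_url: str) -> str:
    """Return the production URL when base_url points at a placeholder TLD."""
    host = (urlsplit(base_url).hostname or "").lower()
    if host.endswith(PLACEHOLDER_TLDS):
        return PRODUCTION_URL
    return base_url


print(heal_base_url("https://api.halton.test/v1"))
print(heal_base_url("http://localhost:8000"))
```

Note that a value without a scheme (for example `api.halton.test`) has no parsed hostname under `urlsplit`, so this sketch would leave it unchanged; whether the daemon handles that case is not stated in the changelog.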
v0.2.6 · 2026-05-23
Cloud-onboarding loopback API for the dashboard's first-login flow.
Added
- `GET /v1/cloud/state` returns `{ paired, version, hostname, port }`. Lets the dashboard's onboarding shell detect a local daemon and skip the install/start steps.
- `POST /v1/cloud/connect` triggers a pairing-code mint without the user copying anything; returns `{ code }`.
- `GET /v1/cloud/connect/status` polls for approval.
- Chrome PNA `Access-Control-Allow-Private-Network: true` header on the loopback API so cross-origin browser fetches from `https://app.haltonmeter.com` work without flags.
v0.2.5 · 2026-05-22
SaaS-launch release. Six items: Cursor cold-start fix, edge attribution leak fix, CI Python 3.14 matrix, backfill v2 script, friendly transport-error UX, and official backend host lock.
Added
- Official backend host lock (`cloud/constants.py`). `HALTON_METER_CLOUD_URL = "https://api.haltonmeter.com"`, production default used by `load_cloud_config` when `[cloud].base_url` is absent or empty. `HALTON_METER_CLOUD_URL` env var + `--base-url` CLI flag remain as dev/staging overrides. `cloud connect` updated to use the production URL as its default.
- Backfill v2 script (`daemon/scripts/backfill_body_paths.py`). Recovers Cursor-style rows by scanning `file://` URIs and bare absolute paths in captured request bodies. Dry-run by default; `--apply` gated same as v1. Tags recovered rows with `attribution_method='backfill_body_paths'`. Does not auto-run `--apply`.
- CI matrix for Python 3.14 (experimental). New `daemon-py314` job in `.github/workflows/ci.yml` with `continue-on-error: true`. Main matrix covers 3.11, 3.12, 3.13.
- Tests: New `tests/cloud/test_cloud_constants.py`, `tests/cloud/test_transport_error_ux.py`. Extended `tests/attribution/test_resolver.py` (lenient-slug cases), `tests/test_edge_attribution.py` (Step 8 fallback). Full daemon suite: 1906 passed, 35 skipped, 0 failed. Ruff clean.
Fixed
- Cursor cold-start lenient-slug scan. `find_project_root_by_slug` in `attribution/layers.py` fires on a Tier 4b/4c registry miss: an O(1-level) scan of `~/Documents` and `~` warms `EdgeAttributionRegistry` for the duration of the process. Tier telemetry distinguishes cold-start recoveries (`4b_lenient` / `4c_lenient`) from normal strict corroborations; `attribution_method` in DB is unchanged.
- Edge `unattributed` / `edge_store` leak. `edge_attribution._resolve_via_psutil` Step 8 calls `resolve_unified` as a final fallback after its standalone chain (Steps 1-7.5) exhausts all options. Lazy import preserves the stdlib-only module-import invariant. Emits `edge_attribution.resolver_fallback` at INFO on recovery.
- Friendly `cloud.transport_error` log. Transport error message now includes the base URL and config file path hint so each line is actionable. `_check_base_url_not_placeholder` raises immediately for placeholder TLDs (`.test`, `.example`, `.invalid`, `.local`) rather than burning the full retry budget; bypassed when a test transport is injected.
v0.2.4 · 2026-05-18
Unified, IDE-agnostic attribution resolver. Closes the edge-side unattributed / edge_store leak from v0.2.3 live verification. Widens Python pin to 3.14.
Added
- Unified Tier 0-8 resolver (`attribution/resolver.py`) shared by daemon + edge; single entry point `attribution.resolver.resolve_unified`.
- `attribution/registry.py`, per-edge slug-to-abspath map from rcfile-resolved workspaces.
- `ide_env_label` layer, reads `CURSOR_WORKSPACE_LABEL` / `VSCODE_WORKSPACE_LABEL` from process environ, strict-slug corroborated via registry.
- `ide_argv_label` layer, Electron-helper trailing-token regex `r'.*\s(\S+)\s\[\d+-\d+\]\s*$'` against argv, strict-slug.
- IDE workspace recovery for Cursor / VSCode / Kiro / Zed / IntelliJ.
- Per-tier `attribution.tier_hit` structured telemetry (hits + `4b_miss` / `4c_miss`).
- Tests: 32 new attribution tests (6 registry + 26 resolver) including empirical Cursor argv fixtures, p99 perf < 3 ms, telemetry shape, source-audit of v0.2.3 helper deletions. Full daemon suite: 1877 passed, 35 skipped, 0 failed. Ruff clean.
- Followups (tracked for v0.2.5): CI matrix entry for Python 3.14 (pin widened in v0.2.4; CI runner row is a bookkeeping change). Historical `misc` backfill (`--apply`): v0.2.4 ships the dry-run script `daemon/scripts/backfill_misc_attribution.py` only; `--apply` remains gated behind explicit operator approval and is not auto-run.
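To make the `ide_argv_label` trailing-token regex from the list above concrete, here is a small sketch of how it might be applied. The regex is quoted from the entry; joining argv with spaces, the helper name `label_from_argv`, and the sample command line are all assumptions, and the strict-slug registry corroboration step is omitted.

```python
import re
from typing import Optional

# Regex as quoted in the changelog entry.
ARGV_LABEL_RE = re.compile(r'.*\s(\S+)\s\[\d+-\d+\]\s*$')


def label_from_argv(argv: list[str]) -> Optional[str]:
    """Return the token that precedes a trailing "[N-M]" marker, if any."""
    match = ARGV_LABEL_RE.match(" ".join(argv))
    return match.group(1) if match else None


# Hypothetical Electron-helper command line, for illustration only.
sample = ["Helper", "--type=renderer", "halton-meter", "[1-2]"]
print(label_from_argv(sample))
```

A label extracted this way is only a candidate; per the entry it still has to match a known slug before it is used for attribution.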
Changed
- 7 new `attribution_method` values: `ide_argv_label`, `ide_env_label`, `parent_rcfile`, `parent_git`, `parent_ide_sniff`, `parent_ide_argv_label`, `parent_ide_env_label`. All fit existing `String(32)`; no cloud migration.
- Renamed `claude_parent_rcfile` / `claude_parent_workdir` to `parent_rcfile` / `parent_workdir` (family-agnostic).
- Parent walk in `edge_attribution._resolve_via_psutil` moved from Step 3.5 to Step 7.5 so the originating process's own signals win over an ancestor's.
- Replaced IDE-family process-name allow-list with cross-uid / cross-session boundary stop in the parent walk.
- `requires-python = ">=3.11, <3.14"` to `">=3.11, <3.15"` (3.14 wheels available for `pydantic_core`, `mitmproxy`, `aiosqlite`).
Removed
- v0.2.3 daemon Step 7.25 and its four helpers in `tagging.py` (~280 lines deleted, ~70 re-added calling the resolver).
- `_is_chromium_helper` gate and IDE-family allow-list in `edge_attribution.py`.
v0.2.3 · 2026-05-18
Two correctness fixes: long-lived CONNECT tunnel attribution and edge plist atomic-write hardening.
Added
- `daemon/scripts/backfill_misc_attribution.py` dry-run script, proposes recovered slugs from embedded `/Users/.../CLAUDE.md` paths. `--apply` gated behind operator approval; not auto-run.
- `daemon/tests/test_macos_install.py`, 9 new tests covering plist-builder shape (`RunAtLoad`, `KeepAlive`, `ThrottleInterval`, `ExitTimeOut`, `ProgramArguments`, `Label`) and atomic-write paths (replace, fresh create, interrupted write, failed replace).
Changed
- Known limitations (intentionally not shipped): Boot-time auto-start stays off (v0.1.21.1 manual-start stands). Cold-boot zombie has not been reproduced; re-enable deferred pending a reproducer + fresh-boot log capture.
- Other patches deferred: Wire-field rename `cost_usd_minor_units` to `cost_usd_millicents` (breaking; needs coordinated daemon + cloud + dashboard + Alembic cycle). Wire contract SHA refresh: `daemon/memory/plans/phase2-wire-contract.md` still pinned to pre-v0.2.2 `e8f252a`.
Fixed
- `attribution_store.lookup` dropped its 5-min read-side TTL filter. Long-lived CONNECT tunnels (multi-hour Claude Code sessions) now stay attributed for the life of the TCP socket. Write-side eviction loop unchanged; `max_age_s=` kept on the signature for backwards-compat but no longer gates hits.
- `tagging.py` parent-PID walk for `{claude, claude-code}` family. Bounded 5-hop walk when originating cwd resolves to `None` or `/`; re-runs rcfile / git / mac_sandbox / workdir-basename layers against the first ancestor's cwd. (Superseded in v0.2.4 by the unified resolver.)
- `smart_default` fall-through escalated from `debug` to a structured `warning` with `edge_src_port`, `cwd`, `parent_pid`, `parent_name`, a tripwire for future regressions.
- Prune loop horizon raised to 24h in `cli._run_daemon` to match the read-side TTL removal. `DEFAULT_MAX_AGE_SECONDS = 300.0` left untouched for other callers.
- Edge plist atomic-write hardening. All four macOS plist writes (daemon, watchdog, edge, userenv) routed through new `_atomic_write_bytes` helper in `install/_macos.py` (tmp-file, then `fsync`, then `os.replace`). Prevents the 8-byte `<plist/>` stub corruption observed during same-day reinstalls.
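The tmp-file, `fsync`, `os.replace` sequence named in the plist hardening entry is a standard atomic-write pattern. A minimal sketch follows; it is not the project's `_atomic_write_bytes` helper, whose signature and error handling are not shown in the changelog.

```python
import os
import tempfile


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Write data to path so readers see either the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target,
    # otherwise os.replace cannot rename it atomically.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the bytes to disk before the rename
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file on any failure and leave the target untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process dies before `os.replace`, the previous file is still intact, which is the property that rules out a truncated stub being left behind.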
v0.2.2 · 2026-05-12
End-to-end body sync. Daemon now pushes captured request and response bodies to the cloud on the same opt-in posture as metadata: disabled by default, per-project overrides win over the master switch.
Added
- `[cloud.bodies]` config block. `enabled` (bool, default `false`), `sync_interval_seconds` (default `60`), `max_body_bytes_per_upload` (default `524288`, 512 KiB), `per_project` (dict of slug to bool). Bodies do not leave the machine until `bodies.enabled = true`.
- `halton-meter cloud privacy set bodies.enabled true|false` flips the master body-sync switch.
- `halton-meter cloud privacy set bodies.upload false --project SLUG` drops body uploads for one project while leaving the global switch on.
- `halton-meter cloud privacy show` now renders a "Body sync (v0.2.2)" section: master state, interval, byte cap, per-project rules.
- `BodyUploader` worker. Sibling to the metadata worker; reads `request_bodies WHERE uploaded_at IS NULL`, POSTs two envelopes per row (request + response) to `POST /v1/requests/{id}/body`, stamps `uploaded_at` on success. Project-skipped rows still advance to keep the cursor moving. Pause/last-error tracked independently in `cloud_body_sync_state` so a body 401 doesn't pause metadata sync.
- Supervisor wiring. `cli.py` spawns the uploader next to the existing metadata worker with the same fail-open posture (gate misses are silent, spawn failures absorbed).
- Schema migration 5 to 6. Adds `request_bodies.uploaded_at DATETIME NULL` plus a supporting index. Idempotent; safe on greenfield and upgrade-in-place installs.
- Tests: Cloud (`halton-meter-cloud`): `tests/test_request_bodies.py`, 6 tests (happy path, wrong workspace, unknown id, idempotent re-post, base64 decode error, missing auth). Daemon: `tests/cloud/test_body_uploader.py`, 5 tests (happy path, per-project skip, 401 pause, transport error, oversize leg skipped). `tests/cloud/test_supervisor_spawns_body_uploader.py`, 4 tests (three gate misses + happy path spawn).
Changed
- Wire contract: `POST /v1/requests/{id}/body` is now Live in `PHASE2_CONTRACT.md`. Envelope: `BodyEnvelopeV021` (class name retained from the Wave-0 forward-declaration to keep daemon imports stable). Two POSTs per request (`direction=request`, `direction=response`) land in one cloud row keyed on `request_id`, with `redaction_applied` OR-merged across directions.
- Documentation: `daemon/PRIVACY.md` gains a "Body sync" section (defaults, how to enable, per-project override pattern). `daemon/CLOUD.md` references `cloud privacy set bodies.enabled true` as the activation step.
v0.2.1 · 2026-05-12
Patch release. Operational bugs plus CLI UX polish for the cloud subgroup.
Changed
- CLI UX for the cloud subgroup. The cloud commands previously printed raw Python dicts and one-line success messages. They now match the look-and-feel of `halton-meter status` (Rich Panels + Tables + state icons). `cloud connect`: pairing code rendered in a brand panel; a dots spinner runs during the approve-poll wait so the CLI doesn't appear hung; success message is a green panel. `cloud status`: state banner (one of ACTIVE / DEGRADED / PAUSED / NOT-CONFIGURED) plus a per-field table with health icons, relative ages, and unsynced-count highlighting. `cloud whoami`: labeled key-value panel. `cloud reconcile`: per-row cloud-side table, totals panel, and a daemon-vs-cloud variance line (zero variance = green check; <0.5% = yellow tolerance; otherwise red "investigate" copy). `--json` flag on `cloud status` still emits the machine-readable shape for scripting.
Fixed
- `cloud connect --base-url <URL>` now persists the URL to `~/.halton-meter/config.toml`. Pre-0.2.1, the flag was a transient override: pairing would succeed, but every subsequent command (`cloud status`, `cloud sync`, the supervisor's worker-spawn gate) would read `[cloud].base_url` from TOML, find it empty, and report the daemon as un-paired. Users had to hand-edit the TOML after every `cloud connect`. Now the connect flow writes `base_url` and `enabled=true` next to the 0600 credentials write. Other top-level config sections (`[daemon]`, `[storage]`, `[cloud.upload]`) are preserved by the TOML rewriter.
- `halton-meter cloud reconcile` no longer 422s. Daemon sent `from` / `to` query params; cloud's reconciliation router expects `from_date` / `to_date`. Reconcile returned `Field required` instead of variance data.
v0.2.0 · 2026-05-12
First minor release. Closes the Phase 2 cloud-sync arc (daemon to halton-meter-cloud) and ships upload-privacy controls. The daemon's local cost computation is unchanged from v0.1.x; it has been correctly charging Anthropic prompt-cache-write tokens all along.
Added
- Cloud-sync (Phase 2): `halton-meter cloud` CLI group. `connect`, `disconnect`, `status`, `whoami`, `sync`, `reconcile`, `pause`, `resume`. Pairing handshake (`pairing/start`, then the user approves in the dashboard, then `pairing/poll`) mints a single-shot `hm_sync_…` token. Token stored in `~/.halton-meter/cloud-credentials.json` (chmod 0600) with a Fernet-encrypted mirror in SQLite (`cloud_state.api_key_ciphertext`, key at `~/.halton-meter/cloud.key`).
- Cloud-sync (Phase 2): Cloud worker supervised as part of the daemon process. Same-process spawn alongside the proxy / API / heartbeat tasks; gated on `cloud_state.enabled` so users who don't pair see zero change. Crash-isolated: a cloud-side failure (transport, 401, schema mismatch) cannot block the proxy hot path. 401 sets `cloud_sync_state.paused_reason='paused_unauthorised'` and stops the worker until the operator re-runs `cloud connect`.
- Cloud-sync (Phase 2): Default off. The daemon ships with `cloud.enabled = false`. Cloud sync is strictly opt-in.
- Upload privacy: Tiered consent presets. `[cloud.upload]` section in `~/.halton-meter/config.toml`: `preset = "minimal" | "standard" | "full"`. Default for v0.2.0 is `standard` with `source_workdir = false`, so the high-leak field (local path) is off by default. Cloud sees provider, model, tokens, cost, project slug, hostname, prompt hash. Cloud does NOT see your workdir paths unless you explicitly opt in.
- Upload privacy: Per-field global overrides. `fields.source_workdir = true` (or any gateable field) overrides the preset.
- Upload privacy: Per-project rules. `[cloud.upload.per_project."acme-secret"]` can set `upload = false` (the project is local-only) or lock down specific fields for that project. The daemon redacts before serialisation; the cloud cannot see what the daemon never sends.
- Upload privacy: CLI `halton-meter cloud privacy show|set`. `show` prints the resolved policy; `set preset standard` / `set field.source_workdir false` / `set upload false --project acme-secret` mutate the TOML in place (other sections preserved).
- Observability + lifecycle: Daemon supervisor logs `cloud.supervisor.spawned` / `cloud.supervisor.skip` with structured reason so it's obvious whether sync is active. `cloud_state` row schema: workspace name, machine id, hostname snapshot, last-connected-at, paused_reason, used by `cloud status`. Worker retry envelope: exponential backoff capped at 300s for transport + 5xx + 429 (with `Retry-After` honoured). 401 never retried.
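The worker retry envelope described in the last entry above (exponential backoff capped at 300s, `Retry-After` honoured) might look roughly like this. The 1-second base, the doubling factor, and clamping `Retry-After` to the same 300s cap are assumptions; `next_delay` is a hypothetical name.

```python
from typing import Optional

MAX_BACKOFF_S = 300.0  # cap stated in the changelog


def next_delay(attempt: int, retry_after: Optional[float] = None, base: float = 1.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # A server-provided Retry-After wins over the computed backoff.
        return min(max(retry_after, 0.0), MAX_BACKOFF_S)
    exponent = min(attempt, 16)  # bound the exponent so the float math cannot overflow
    return min(base * (2 ** exponent), MAX_BACKOFF_S)


for attempt in range(10):
    print(attempt, next_delay(attempt))
```

A 401 would bypass this function entirely, since the entry states it is never retried.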
Changed
- Note for operators running halton-meter-cloud: Wave 3 soak on 2026-05-12 surfaced a cost-recompute bug in the cloud side (`_compute_cost` in `services/ingest.py`) that silently dropped `cache_write_tokens` from the sum, producing ~17% reconcile variance on cache-heavy Claude workloads against an otherwise-correct daemon. The fix lives in `halton-meter-cloud`. No daemon change was needed.
- Daemon binds to the same proxy/edge ports as before, no migration needed. `halton-meter sync` is a one-shot alias for `halton-meter cloud sync` (single drain pass + exit).
- Notes for operators: Upgrade with `pipx upgrade halton-meter`; the daemon needs a restart for the new code to take effect. After upgrade, historical cost totals will not retroactively update: only requests captured after the upgrade carry the corrected cache-write cost. Run `halton-meter recompute-costs` for historical correction. Cloud sync stays OFF unless you run `halton-meter cloud connect`. Known limits: production hosted cloud requires Clerk (until then, only self-hosted cloud with `AUTH_MODE=dev` is supported); body-capture sync is deferred to v0.2.1.
v0.1.16.dev0 · 2026-05-02
Trial-run wheel. Built locally for soak; NOT published. Stability cycle responding to 9 bugs surfaced during the 0.1.15 evening soak. Carries everything from 0.1.15 plus the items below.
Added
- P0 stability + observability: launchd `NumberOfFiles` soft+hard limits raised to 10240 (`b6404c3`), was 256 (macOS default). Bug 4: FD exhaustion under system-proxy load was instant. Applies to daemon, edge, and watchdog plists; takes effect on next `halton-meter init`.
- P0 stability + observability: Watchdog probes the daemon's MITM port directly (`0dafff3`, port-fixed in `5bf2864`), TCP-connect every 1.5s. 3 consecutive failures + `/health` still up gives ERROR `watchdog.mitm_unhealthy`. 5 consecutive failures kickstarts daemon via launchctl. Capped at 3 restarts/hour. Bug 1: silent MITM listener death is now caught and self-healed.
- P0 stability + observability: `halton-meter status` row for MITM listener (`8833482`), TCP-connects to `cfg.daemon.internal_port`. Banner flips to BROKEN when daemon `/health` is up but MITM is dead, with explicit remediation message naming the port.
- P0 stability + observability: Daemon listener heartbeat + asyncio task crash hook (`0215511`), heartbeat emits `daemon.heartbeat mitm_port=:8090 mitm_listening=True api_listening=True db_writeable=True open_fds=N` every 30s. Asyncio task crashes now surface via `daemon.task_crashed` instead of being silently swallowed. Two previously-nameless `create_task` calls got names.
Changed
- Notes for operators: This wheel REPLACES 0.1.15 for the soak (Gemini fix + FD limits + watchdog auto-restart + observability). Rollback path is 0.1.15 (still in `dist/`). After install, run `halton-meter init --apps` again to apply the new launchd FD limits. New `halton-meter status` will show one extra row (MITM listener). Pending follow-ups (Sprint 2 + 3): P1.1 chain re-resolve audit (Bug 2), P1.2 doctor proxy auto-disable detection (Bug 5), P2 passthrough DNS / concurrency limits (Bug 3), P3 INSTALL.md/README.md gotchas for gemini-cli, Codex CLI, Antigravity bypass (Bugs 7/8/9). Suite: 1104 passed, 3 skipped, ruff clean.
Fixed
- Gemini response-side adapter lookup (`4a30183`): the request hook fired correctly with `provider=gemini`, then the response hook immediately logged `addon.response.no_adapter`. The fix passes `flow.request.path` to the response-side `find_adapter`. Zero Gemini rows had been landing in the DB.
- Watchdog probes daemon's `internal_port` (8090), not the edge port (`5bf2864`): the earlier `0dafff3` wired the new MITM probe against the wrong port. The edge keeps accepting connections even when the daemon's MITM is dead, so probing edge would never have detected Bug 1.
v0.1.15 · 2026-05-02
Release-shape, NOT published. Promoted from 0.1.15.dev0 after the joint Smart-Attribution-v0.3 + multi-provider build proved clean. Wheel sits in `daemon/dist/` per the locked distribution-sequencing decision (no public PyPI publish until the binary track lands at the final stage). Same scope as 0.1.15.dev0.
Changed
- One operator follow-up still owed: `XAI_RATES` were set from training-cutoff knowledge of xAI's pricing. Spot-check against `https://docs.x.ai/docs/models` and patch in-place if any model rate has drifted. Fix is a tiny edit to `pricing/matrix.py` and does not require a new wheel.
v0.1.15.dev0 · 2026-05-02
Trial-run wheel. Built locally; NOT published to PyPI per the cadence rule. Carries everything from 0.1.14.dev0 plus the multi-provider adapter cycle (OpenAI / Gemini / xAI) plus the start-command health-probe fix.
Added
- OpenAI adapter (`api.openai.com`): `/v1/chat/completions`, `/v1/responses`, `/v1/embeddings`. Streaming + non-streaming. o-series reasoning tokens routed to the `thinking_tokens` field and billed at the output rate. Prompt-cache reads via `usage.prompt_tokens_details.cached_tokens`. `/v1/moderations` is free; the adapter declines it via `matches_url` so $0 rows don't clutter the DB. 14 models priced (`gpt-4o`, `gpt-4o-mini`, `gpt-4.1` family, `o1` / `o3` / `o4-mini`, embeddings models).
- Gemini adapter (`generativelanguage.googleapis.com`): `:generateContent`, `:streamGenerateContent`, `:embedContent`, `:batchEmbedContents`. Model extracted from URL path (regex). Streaming is JSON-array (not SSE). Tiered pricing >200k input tokens already correct via `GEMINI_TIERED_RATES`. Multimodal under-counts logged.
- xAI / Grok adapter (`api.x.ai`): `/v1/chat/completions` (OpenAI shape) + `/v1/messages` (Anthropic shape). URL-dispatches to sibling adapter parsers, rewrites `provider` to `"xai"`. 7 models priced (grok-4, grok-4-mini, grok-3 family, grok-2-vision-1212).
- `ProviderAdapter.matches_url(host, path)` added to the adapter Protocol, an additive bump with a `getattr` fallback in dispatch so existing adapters keep working without forced edits. Adapters can now opt out of specific URL paths (used by OpenAI to skip `/v1/moderations`).
- Redaction patterns extended to OpenAI (`sk-…`, `sk-proj-…`) and xAI (`xai-…`) API keys. Google API keys (`AIza…`) were already covered. Body capture for the new providers inherits redaction for free.
Changed
- Notes for operators: Body capture default is ON for all three new providers; per-project opt-out via `halton-meter project <slug> set body-capture off` works identically across providers. `XAI_RATES` were set from training-cutoff knowledge of xAI's pricing page, not a live fetch; spot-check against `https://docs.x.ai/docs/models` before promoting 0.1.15. This wheel REPLACES 0.1.14.dev0 for soak. Suite: 1055 passed, 3 skipped, ruff clean.
Fixed
- `halton-meter start`'s health probe now reads `runtime.toml` instead of hard-coded port defaults. Previously, when ports fell back from 8765 to 8766, `start` would print "daemon loaded but /health never came up on :8765 within 15s" and exit non-zero even though the daemon was healthy on the fallback port. Same shape as the watchdog port-fallback fix in 0.1.14.dev0.
v0.1.14.dev0 · 2026-05-02
Trial-run wheel. Built locally for soak; NOT published to PyPI per the cadence rule. Ships Smart Attribution v0.3, the biggest behavioural shift to project tagging since the edge attribution store landed.
Added
- Smart Attribution v0.3, three new attribution layers in the resolver chain (both edge and daemon). Layer 4, Git repo basename (Option A, branch info intentionally dropped): `~/code/halton-meter` becomes `halton-meter`. Per-cwd memo, no TTL. Layer 6, container detection: emits `k8s:<HOSTNAME>` or `docker:<HOSTNAME>` only when actual container signals are present (`/.dockerenv`, `KUBERNETES_SERVICE_HOST`, or `/proc/1/cgroup` containing `docker` / `kubepods` / `containerd`); cannot fire on a developer laptop, pinned by a regression test. Layer 6.5, macOS sandbox bundle id: sandboxed Mac apps (Perplexity, ChatGPT desktop, etc.) whose cwd is `~/Library/Containers/<bundle>/Data` now write `mac:<bundle-id>` (e.g. `mac:ai.perplexity.mac`) instead of collapsing to project `Data`.
- New shared module `daemon/halton_meter/attribution/layers.py`, pure-function home for every per-layer resolver. `tagging.py` and `edge_attribution.py` delegate.
- `clear_git_cache()` test helper for layer 4's per-cwd memo.
- Daily background body-capture retention sweep inside the daemon process. Logs `bodies.retention_sweep deleted=N retention_days=D elapsed_ms=X`. Default `retention_days = 90`, configurable via `~/.halton-meter/config.toml` `[bodies] retention_days = …`.
- `.haltonrc body_capture` flag is now plumbed end-to-end. Drop `body_capture: off` (or `= off`) into a project's `.haltonrc` and the daemon stops persisting `request_bodies` rows for traffic from that cwd. Precedence: master switch, then `project_settings` (CLI), then `attribution_log` (rcfile), then default ON.
- `attribution_log.body_capture_enabled BOOLEAN NOT NULL DEFAULT 1` column. PRAGMA `user_version` 2 to 3.
- `halton-meter bodies show <id>` rich panelled output: header, redaction badges, request and response panels. (Was raw JSON.)
- `halton-meter bodies stats` polish: totals, per-project breakdown, oldest/newest, redaction-applied %, retention horizon.
Changed
- Slug normaliser is now stricter and lower-cased. `[^a-z0-9:_/-.] -> -`, collapse repeats, strip leading/trailing `-`, truncate to 64. `.` is allowed (required so `mac:ai.perplexity.mac` survives). Project tags from rcfiles with internal whitespace (e.g. `"my project"`) start persisting as `my-project`. Existing live slugs are valid under the new rules, no rename. Old DB rows are NOT retroactively normalised.
- Notes for operators: No PyPI publish in this release (per the distribution-sequencing ADR, the public PyPI publish is bundled with the signed-binary track at the final pre-scale stage). Soak this wheel locally before promoting; suggested 5-7 days on the daily driver plus at least one Linux machine. Suite: 943 passed, 3 skipped, ruff clean.
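The slug normaliser rules listed above translate into a short pipeline. The following is a sketch under stated assumptions, not the shipped function: `normalise_slug` is a hypothetical name, "collapse repeats" is read as collapsing runs of `-`, and the character class is written with the hyphen escaped so that it is a valid Python regex.

```python
import re

# Anything outside a-z, 0-9, ":", "_", "/", "." and "-" becomes "-".
_DISALLOWED = re.compile(r"[^a-z0-9:_/.\-]")
_REPEATS = re.compile(r"-{2,}")


def normalise_slug(raw: str) -> str:
    """Lower-case, replace disallowed chars, collapse hyphen runs, strip, truncate to 64."""
    slug = _DISALLOWED.sub("-", raw.lower())
    slug = _REPEATS.sub("-", slug)
    slug = slug.strip("-")
    return slug[:64]


print(normalise_slug("my project"))
print(normalise_slug("mac:ai.perplexity.mac"))
```

Truncation runs last, as in the order the entry gives, so a slug cut at exactly 64 characters could still end in `-`; whether the real implementation strips again afterwards is not stated.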
Fixed
- Watchdog now reads the runtime ports file (`~/.halton-meter/runtime.toml`) at startup AND per tick. Previously hard-coded to defaults 8081/8765, which produced a noisy `watchdog.no_interfaces_pointing_at_us` log loop whenever the daemon fell back to 8082/8766.
- Three pre-existing red edge tests (`test_edge_chain_resolve.py` x2, `test_edge_e2e.py::test_edge_lifecycle_chain_then_passthrough_then_chain_then_invalidate`) now pass, closed out indirectly by the dev2 close-path + IDE-args-sniff fixes.
v0.1.13 · 2026-05-02
Built 2026-05-02 (release-shape, NOT published). Built into `daemon/dist/halton_meter-0.1.13-py3-none-any.whl` and held under the same no-publish gate as 0.1.14.dev0. Superseded for soak purposes by 0.1.14.dev0; will be removed from the dist archive when 0.1.14 promotes.
Added
- Body-capture PR2 + PR3 (hot-path capture, redaction, opt-out via CLI, daily retention sweep).
- Edge attribution sniffs IDE workspace from cmdline (Windsurf / Cursor / VS Code).
Fixed
- Body-capture race fix, chained counts then body writes.
- Edge connection close path no longer awaits `wait_closed` (Python 3.14 compat).
v0.1.11 · 2026-05-02
Last public PyPI release (https://pypi.org/project/halton-meter/0.1.11/). No date is given in the changelog; the date shown here is inferred from surrounding entries and should be treated as approximate.
Added
- Last public PyPI release. https://pypi.org/project/halton-meter/0.1.11/