
Source-cache mirror.

Fonteum mirrors every federal-source snapshot to S3-compatible storage with 90-day rolling retention. When a source goes down — DOGE-driven access disruption, scheduled CMS portal maintenance, DNS issues, anything — verifiers can still re-download the original archive bytes from Fonteum’s mirror, recompute SHA-256, and confirm the hash matches the public attestation. The cache is both a trust signal and a liability shield.

Why a cache mirror?

Per Q2 2026 Strategic Research (failure mode #1), DOGE-related CMS access disruption is modeled at 40–60% probability over a 12-month window. CMS contingency plans exist (issued February 2026) but exclude the survey/certification systems Care Compare depends on. Even outside the DOGE-specific risk, federal data portals have routine outages, planned maintenance windows, and occasional URL rotations that interrupt access.

The cache mirror is a defensive layer: when the upstream source is unavailable, verifiers can still complete the byte-for-byte hash-match flow against Fonteum’s published attestation. The integrity contract (“the hash of the bytes we ingested matches what we published”) stays verifiable even when the bytes themselves are temporarily unreachable upstream.

90-day rolling retention

Every cached archive carries a retention_expires timestamp, set at write time to cached_at + 90 days. A separate cleanup cron (queued §sprint3-cache-cleanup-cron) deletes the S3 object and drops the cache log row once retention_expires has passed.
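
The retention arithmetic can be sketched as follows. This is a minimal illustration, not Fonteum's cleanup cron: the function names and the choice of Unix epoch seconds for cached_at and retention_expires are assumptions made for the example.

```shell
# Hypothetical helpers; names and the epoch-second representation are assumptions.
RETENTION_DAYS=90

# retention_expires = cached_at + 90 days, both expressed as Unix epoch seconds.
retention_expires() {
  cached_at=$1
  echo $(( cached_at + RETENTION_DAYS * 86400 ))
}

# Succeeds (exit status 0) once the retention deadline has been reached at time "now".
is_expired() {
  expires=$1
  now=$2
  [ "$now" -ge "$expires" ]
}
```

A cleanup job along these lines would call is_expired with the current time (for example `date +%s`) and delete the object only when it succeeds.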

Why 90 days: long enough to span any plausible upstream outage (CMS scheduled maintenance windows are typically measured in days; even an extended DOGE-style policy disruption is unlikely to exceed a quarter without generating its own legal-system response). Short enough that aggregate storage cost stays bounded for a small healthcare-data publisher; rolling expiry keeps the bucket from accumulating decade-old snapshots that no verifier would re-fetch anyway.

SHA-256 hash-match flow

Every snapshot exposes a SHA-256 hash via /verify/[snapshot_id]. The response carries content_hash, source_archive_url (upstream), and cache_url (Fonteum’s S3 mirror). Two equivalent hash-match paths:

  1. Upstream path — fetch source_archive_url + shasum -a 256 + compare to content_hash. Works when the upstream source is healthy.
  2. Cache path — fetch cache_url + shasum -a 256 + compare to content_hash. Works regardless of upstream health (within the 90-day retention window).

# Path 1: upstream
RECORD=$(curl -fsS https://fonteum.com/verify/123)   # fetch the attestation record once
SOURCE=$(printf '%s' "$RECORD" | jq -r '.source_archive_url')
EXPECTED=$(printf '%s' "$RECORD" | jq -r '.content_hash')
# -f makes curl fail on HTTP errors instead of hashing an error page
ACTUAL=$(curl -fsSL "$SOURCE" | shasum -a 256 | awk '{print $1}')
[ "$ACTUAL" = "$EXPECTED" ] && echo "verified" || echo "MISMATCH"

# Path 2: cache mirror
RECORD=$(curl -fsS https://fonteum.com/verify/123)   # fetch the attestation record once
CACHE=$(printf '%s' "$RECORD" | jq -r '.cache_url')
EXPECTED=$(printf '%s' "$RECORD" | jq -r '.content_hash')
ACTUAL=$(curl -fsSL "$CACHE" | shasum -a 256 | awk '{print $1}')
[ "$ACTUAL" = "$EXPECTED" ] && echo "verified" || echo "MISMATCH"

Both paths produce the same hash because content_sha256 in the cache log is computed from the exact bytes that produced content_hash in the attestation log. The two tables are written in the same code path (the puller cron) from the same archive buffer; byte-for-byte equality is guaranteed at write time, not at read time.
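
Because both paths target the same content_hash, a verifier can try upstream first and fall back to the cache mirror. The sketch below is one way to do that, assuming the /verify response fields described above. The function names are hypothetical, and the choice between sha256sum and shasum is a portability assumption rather than part of Fonteum's flow.

```shell
# Portable SHA-256 of stdin: prefer sha256sum, fall back to shasum -a 256.
sha256_stdin() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | awk '{print $1}'
  else
    shasum -a 256 | awk '{print $1}'
  fi
}

# Download a URL and compare its hash to the expected value.
# A failed download hashes empty input, so it is reported as a non-match.
url_matches() {
  url=$1
  expected=$2
  actual=$(curl -fsSL "$url" | sha256_stdin)
  [ "$actual" = "$expected" ]
}

# Try the upstream archive first, then the cache mirror.
verify_snapshot() {
  record=$(curl -fsS "https://fonteum.com/verify/$1") || {
    echo "attestation unreachable"
    return 2
  }
  expected=$(printf '%s' "$record" | jq -r '.content_hash')
  source_url=$(printf '%s' "$record" | jq -r '.source_archive_url')
  cache_url=$(printf '%s' "$record" | jq -r '.cache_url')
  if url_matches "$source_url" "$expected"; then
    echo "verified (upstream)"
    return 0
  fi
  if url_matches "$cache_url" "$expected"; then
    echo "verified (cache mirror)"
    return 0
  fi
  echo "MISMATCH"
  return 1
}
```

Usage would be `verify_snapshot 123`. Note that this sketch does not distinguish an unreachable upstream from an upstream whose bytes have changed; both fall through to the cache path.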

Degraded-state surfacing

A cron job runs every 15 minutes, HEAD-probes every registered source URL, and writes the current health status to source_health_status. The /freshness page reads from it and surfaces an amber “⚠ source degraded” pill on each affected source card. Hover over the pill for the underlying reason: 404 (URL rotated), 401/403 (auth changed, a DOGE-disruption candidate), 5xx (upstream outage), or a network throw (DNS or connectivity).
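
The status-to-reason mapping above can be approximated from the command line. This is a rough sketch, not the probe Fonteum runs: the function names and reason labels are hypothetical, and the 10-second timeout is an arbitrary choice for the example.

```shell
# Map an HTTP status code (as printed by curl's %{http_code}) to a reason label.
# The labels are illustrative and are not Fonteum's stored values.
classify_status() {
  case "$1" in
    000)                     echo "network_error" ;;   # curl prints 000 when no HTTP response arrived
    404)                     echo "url_rotated" ;;
    401|403)                 echo "auth_changed" ;;
    5[0-9][0-9])             echo "upstream_outage" ;;
    2[0-9][0-9]|3[0-9][0-9]) echo "healthy" ;;
    *)                       echo "other" ;;
  esac
}

# HEAD-probe a URL and print the reason label for its current state.
probe() {
  # Any curl-level failure (DNS, connect, timeout) is treated as a network error.
  status=$(curl -sI -o /dev/null -w '%{http_code}' --max-time 10 "$1") || status=000
  classify_status "$status"
}
```

Running `probe "$SOURCE"` against a source_archive_url would print one label per call, which is enough to spot-check a degraded badge by hand.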

The freshness band (fresh / stale / overdue) and the degraded badge are independent signals. A source can be fresh (its last snapshot is recent) and degraded (upstream is currently down) at the same time: that is the cache mirror earning its keep. The alarming case is overdue + degraded, where operators know to check Sentry and the Inngest dashboard.

Storage backend

Fonteum’s upload helper writes via plain PUT + AWS Signature v4 against any S3-compatible store: AWS S3, Cloudflare R2, MinIO, Backblaze B2, etc. The endpoint is configurable per environment via the AWS_S3_ENDPOINT / AWS_S3_PUBLIC_BASE env vars; the backend choice is a deployment decision, not a contract change for verifiers.
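
As an illustration of how the two variables might relate, the sketch below builds a public URL from AWS_S3_PUBLIC_BASE and an object key. The example hostnames, the key layout, the helper name, and the assumption that cache_url is simply the public base joined to the key are all hypothetical; the real mapping is deployment-specific.

```shell
# Example values only; real endpoints and key layout are deployment-specific.
AWS_S3_ENDPOINT="https://s3.example.com"         # where an upload helper would send PUTs
AWS_S3_PUBLIC_BASE="https://cache.example.com"   # base of the publicly readable URL

# Join the public base and an object key, tolerating a trailing slash on the
# base and a leading slash on the key.
public_cache_url() {
  printf '%s/%s\n' "${AWS_S3_PUBLIC_BASE%/}" "${1#/}"
}
```

Under these assumptions, swapping the storage backend changes only the two variables; the URL shape that verifiers see stays the same.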

A public-readable bucket policy (or a CDN that reads from a private bucket) is the default, so verifiers fetch cache_url directly without signed URLs. Fonteum never logs or returns the secret key associated with the upload role.

Cross-links
  • /freshness — live snapshot age + status band + per-source upstream health badge.
  • /trust/integrity — public SHA-256 attestation index per snapshot.
  • /methodology/changelog — methodology version-bump log.
  • /docs/source-coverage — schedule of the puller crons that produce the snapshots this cache mirrors.

Reviewed by Jennifer Montecillo, MD, medical reviewer. Non-practicing medical reviewer.

© 2026 Fonteum, Inc. All rights reserved.
