EARLY ACCESS · v0.9

Clean-room your LLM tool use.

Smoke-test agents against deterministic replays of your real upstream traffic. No live API calls. No surprises.

Recorded upstream behavior, not synthesized. Sturdier architecture, not smarter models. Gostly proxies your real API traffic, learns from it, and replays it deterministically across environments or in your CI/CD — for tests, for local development, for agent runtimes calling production at machine speed.

─ ARCHITECTURE · INSIDE YOUR PERIMETER

[Diagram · CUSTOMER INFRASTRUCTURE · ON-PREM OR PRIVATE VPC]

  • your_app (CI · DEV · TESTS) · HTTP · agent (LEARN · MOCK · PASSTHROUGH · TRANSITIONING) · upstream (LEARN only) · event stream
  • control plane: 22 row-level-isolated tables · state of record
  • DRIFT DETECTION: schema-diff loop
  • FIDELITY SCORE: F1 · 5-min recompute
  • AI MOCK REPAIR: JSON Patch · operator-gated
  • WEBHOOK CAPTURE + replay: SSRF-guarded
  • COLD-START SEED: HAR · Postman · OpenAPI
  • MCP SERVER: team tier · model-context
  • RLS · 22 TABLES · RBAC · 4 ROLES · AUDIT · APPEND-ONLY
  • inference: local model · optional cloud
  • license: 5-min fresh · 4-hour stale · GOSTLY-MANAGED
  • STRUCTURAL REDACTION FLOOR: applied before any I/O · one-way scrub seal

the readiness gap
83% / 29%

of organizations plan agentic AI deployment; 29% feel secure enough to ship it.

Cisco · State of AI Security 2026

the silent rot
70%

of production API failures pass CI checks against hand-written mocks.

InstaTunnel

the verification tax
$14.2K

annual per-developer cost — the time spent proving things actually work.

Stack Overflow Developer Survey 2025

─ THE FAILURE MODE

Mocks are an unmaintained interface contract you wrote by hand. Then forgot about.

The contract between your service and someone else's API is the most load-bearing thing in your test suite — and the part with the least engineering rigor. Four failure modes show up on every team that grows past two services.

FIXTURES

Hand-written stubs go stale before the PR merges.

Every team writes the same brittle JSON. “Mocks are static, but reality evolves. The mock becomes a lie.”

DRIFT

Tests pass locally. Production fails because the API changed.

Three-week-old stubs against an upstream that added a new field. CI is green; reality isn’t. 75% of APIs don’t conform to their own spec (APIContext).

COVERAGE

5xx, 429, timeouts, partial responses get skipped.

Edge-case fixtures are written when a flaky test forces them — at 2 AM, by whoever’s on call. 70% of production API failures pass CI checks (InstaTunnel).

AGENTS

Multi-step flows are where stateful mocking falls apart.

An agent retrying the same upstream four ways needs the response trajectory to evolve. A static stub set says yes to everything, then the orchestration silently diverges.

─ THE MATCH CASCADE

Five stages, deterministic-first. The model is reached only at the cascade edge — and only on Pro and Team.

Every request in MOCK mode runs through this cascade. The first stage that produces a match short-circuits the rest. Stages 01 and 02 require zero inference and zero randomness: the same input produces the same bytes on every run. The model is a gap-fill at the edge of the contract, not the spine of it.

FIG. B · MATCH CASCADE · SHORT-CIRCUITS AT FIRST HIT

No LLM in the deterministic match cascade. The model is on the far side of a wall the architecture, not the marketing, enforces.

  • 01 · DETERMINISTIC · exact_match: byte-stable replay (ALL TIERS)
  • 02 · DETERMINISTIC · smart_swap: structural · value swap (FREE OPT-IN · PRO/TEAM)
  • THE WALL · AI ONLY BEYOND THIS LINE (stages 01–02 byte-stable · 03–04 model-assisted)
  • 03 · AI · PRO/TEAM · inference: local model · gated (PRO · TEAM)
  • 04 · AI · PRO/TEAM · generative: schema-inferred synthesis (PRO · TEAM)
  • 05 · FALLBACK · unmatched: configurable 404 · logged (ALL TIERS)

REQ enters at stage 01. Each MISS falls through to the next stage; the first hit is SERVED as the RESPONSE.

For most production calls, the cascade short-circuits at stage 01 or 02. The model is the last resort, not the first answer.
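
A minimal sketch of the short-circuit behavior, not Gostly's implementation. It assumes a request is a plain dict and a stage is a function that returns a response or `None` on a miss. The names `request_key`, `make_exact_match`, `unmatched`, and `run_cascade` are hypothetical, and only the exact-match and fallback stages are shown; the other stages would slot into the same list.

```python
import hashlib
import json
from typing import Callable, Optional

# Hypothetical shapes: requests and responses are plain dicts.
Request = dict
Response = dict
Stage = Callable[[Request], Optional[Response]]


def request_key(req: Request) -> str:
    """Stable fingerprint of method, path, and body via canonical JSON."""
    canonical = json.dumps(
        {"method": req["method"], "path": req["path"], "body": req.get("body")},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_exact_match(recordings: dict) -> Stage:
    """Stage 01 (assumed form): look the fingerprint up in recorded traffic."""
    def stage(req: Request) -> Optional[Response]:
        return recordings.get(request_key(req))
    return stage


def unmatched(req: Request) -> Response:
    """Stage 05 (assumed form): a fallback that always answers."""
    return {"status": 404, "body": {"error": "unmatched"}}


def run_cascade(req: Request, stages: list) -> tuple:
    """Try each (name, stage) in order; the first non-None response wins."""
    for name, stage in stages:
        resp = stage(req)
        if resp is not None:
            return name, resp
    raise RuntimeError("cascade must end with a fallback stage")


recorded = {"method": "GET", "path": "/v1/users/1"}
recordings = {request_key(recorded): {"status": 200, "body": {"id": 1}}}
stages = [("exact_match", make_exact_match(recordings)), ("unmatched", unmatched)]
```

Because the lookup key is a hash of canonical JSON, the same request always resolves to the same recording, which is the property the deterministic stages depend on.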

─ DRIFT DETECTION

When the upstream changes, the diff is a row. Not a Slack thread.

A schema-diff loop compares the current capture window against the prior accepted baseline per (method, route). Every change becomes a drift event that’s actionable and audit-logged.
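
As a rough illustration of what a schema diff per (method, route) might compute, the sketch below flattens two JSON bodies into path-to-type maps and emits events using the change types and severity rule named in the drift diagram. The function names, the event dict shape, and the `required` parameter are assumptions made for the example, not Gostly's schema.

```python
from typing import Any


def schema_of(body: Any) -> dict:
    """Flatten a JSON value into {dotted.path: type name}."""
    out = {}

    def walk(value: Any, prefix: str) -> None:
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{prefix}.{key}" if prefix else key)
        else:
            out[prefix] = type(value).__name__

    walk(body, "")
    return out


def diff_schemas(baseline: dict, current: dict,
                 baseline_status: int, current_status: int,
                 required: frozenset = frozenset()) -> list:
    """Compare a baseline recording with the current one; return drift events."""
    b, c = schema_of(baseline), schema_of(current)
    added = sorted(c.keys() - b.keys())
    removed = sorted(b.keys() - c.keys())
    changed = sorted(k for k in b.keys() & c.keys() if b[k] != c[k])

    events = []
    if added:
        events.append({"change_type": "fields_added", "fields": added, "severity": "MINOR"})
    if removed:
        # Removed fields are treated as breaking.
        events.append({"change_type": "fields_removed", "fields": removed, "severity": "MAJOR"})
    if changed:
        # MAJOR only when a required field changed type.
        severity = "MAJOR" if any(k in required for k in changed) else "MINOR"
        events.append({"change_type": "type_changed", "fields": changed, "severity": severity})
    if baseline_status != current_status:
        events.append({"change_type": "status_code_changed", "fields": [], "severity": "MINOR"})
    return events
```

For example, a baseline of `{"id": 1, "email": "a@b.c"}` against a current body of `{"id": "1", "seats": 3}` with `required={"id"}` would yield one event each for the added field, the removed field, and the type change.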

─ DRIFT DETECTION v1 · SCHEMA DIFF

  • current_session: last recording window · per (method, route)
  • baseline_session: prior accepted recording · stored in postgres
  • REFINE · schema_diff: fields · types · status codes
  • drift_events · row: change_type: fields_added · fields_removed · type_changed · status_code_changed; severity: MINOR · MAJOR; rls_scoped · acked_at · NULL
  • SYSTEM · operator: POST /v1/drift/{id}/ack · dashboard surface · polled by repair_proposer
  • EVALUATE · repair_proposer: LLM-assisted · opt-in loop · emits JSON Patch · RFC 6902
  • SYSTEM · operator_approval: approve · reject · no auto-apply · audit-logged
  • HARDEN · repair_applier: mock library write · scrubbed_at seal respected
  • ↳ mock library · /ghost/reload

※ severity = MAJOR when fields_removed (breaking) or type changed on a required field. MINOR otherwise.
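
The repair path in the diagram (a proposer that emits an RFC 6902 JSON Patch, an operator approval step, and no auto-apply) could be sketched roughly as follows. `RepairProposal`, `apply_patch`, and `apply_if_approved` are hypothetical names, and the patch code is deliberately partial: it handles only `add`, `replace`, and `remove` on object members, not arrays or the other RFC 6902 operations.

```python
import copy
from dataclasses import dataclass


def parse_pointer(path: str) -> list:
    """Split an RFC 6901 JSON Pointer into unescaped reference tokens."""
    if not path.startswith("/"):
        raise ValueError(f"unsupported pointer: {path!r}")
    return [t.replace("~1", "/").replace("~0", "~") for t in path[1:].split("/")]


def apply_patch(doc: dict, ops: list) -> dict:
    """Apply add/replace/remove ops to object members; returns a new document."""
    result = copy.deepcopy(doc)
    for op in ops:
        *parents, leaf = parse_pointer(op["path"])
        target = result
        for token in parents:
            target = target[token]
        if op["op"] == "add":
            target[leaf] = op["value"]
        elif op["op"] == "replace":
            if leaf not in target:
                raise KeyError(leaf)  # replace requires an existing member
            target[leaf] = op["value"]
        elif op["op"] == "remove":
            del target[leaf]
        else:
            raise ValueError(f"unsupported op: {op['op']!r}")
    return result


@dataclass
class RepairProposal:
    drift_event_id: str
    patch: list
    status: str = "pending"  # pending | approved | rejected (assumed states)


def apply_if_approved(mock: dict, proposal: RepairProposal) -> dict:
    """Refuse to touch the mock unless an operator has approved the proposal."""
    if proposal.status != "approved":
        raise PermissionError("repair proposals are never auto-applied")
    return apply_patch(mock, proposal.patch)
```

Working on a deep copy means a patch that fails partway through leaves the stored mock untouched, which suits a flow where every write is supposed to be deliberate.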

─ WEBHOOK CAPTURE + REPLAY

Webhooks aren’t requests you control. They’re requests you have to prove you handled.

The agent exposes a per-tenant capture endpoint gated by a shared secret with constant-time comparison. Captured payloads land in a tenant-isolated store, tagged with a signature kind — Stripe, GitHub, or Standard-Webhooks. Operator-triggered replay runs the target URL through an SSRF guard before any socket opens.
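
A shared-secret gate with constant-time comparison might look like the sketch below, which uses Python's standard `hmac.compare_digest`. The function name `capture_allowed` and the body limit are assumptions for illustration; the page does not state the actual bound.

```python
import hmac

# Hypothetical body bound; the real limit is not given in the source.
MAX_BODY_BYTES = 1_000_000


def capture_allowed(presented_secret: str, tenant_secret: str, body: bytes) -> bool:
    """Admit a webhook capture only if the body is bounded and the secret matches."""
    if len(body) > MAX_BODY_BYTES:
        return False
    # compare_digest takes time independent of where the inputs first differ,
    # so response timing does not leak a partially correct secret.
    return hmac.compare_digest(
        presented_secret.encode("utf-8"),
        tenant_secret.encode("utf-8"),
    )
```

A plain `==` on strings can return as soon as the first differing character is found, which is the timing side channel a constant-time comparison avoids.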

─ WEBHOOK CAPTURE + REPLAY · SSRF-GUARDED

  • PROVIDERS: stripe · github · standard-webhooks · POST
  • HARDEN · CAPTURE GATE: per-tenant capture endpoint · shared-secret gate · bounded body · redaction floor applied before disk
  • event log
  • REFINE · PERSIST: tenant-isolated capture store · signature kind tagged (stripe · github · standard-webhooks)
  • SYSTEM · OPERATOR: replay → target_url · caller-supplied URL
  • SSRF GUARD · BEFORE SOCKET OPEN: resolve · classify · gate · DNS resolved once · every address classified
  • REJECT · DESTINATION: loopback & private networks · cloud metadata endpoints · link-local & CGNAT ranges · known database & service ports
  • EXTERNAL TARGET: POST · re-signed · only if guard passes

※ The capture endpoint is tenant-scoped and gated by a shared secret. Replay only ever reaches the external target if the SSRF guard passes.
※ Operator-triggered. Scheduled fanout + re-signing are on the v2 roadmap.
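
The resolve, classify, gate sequence can be approximated with the standard library, as in the sketch below. It is an illustration under assumptions, not Gostly's guard: the function names and the blocked-port list are hypothetical, and the address classes are limited to what Python's `ipaddress` module exposes plus an explicit CGNAT range.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

CGNAT = ipaddress.ip_network("100.64.0.0/10")
# Hypothetical examples of database and service ports; the real list is not published here.
BLOCKED_PORTS = {22, 3306, 5432, 6379, 9200, 27017}


def address_is_blocked(addr: str) -> bool:
    """Classify one resolved address; True means replay must not connect to it."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped  # judge ::ffff:a.b.c.d by its IPv4 form
    return (
        ip.is_loopback
        or ip.is_private
        or ip.is_link_local  # includes 169.254.169.254 metadata endpoints
        or ip.is_unspecified
        or ip.is_multicast
        or ip.is_reserved
        or (ip.version == 4 and ip in CGNAT)
    )


def resolve_and_gate(url: str) -> list:
    """Resolve the target once and return its addresses only if all pass."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("unsupported target URL")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    if port in BLOCKED_PORTS:
        raise ValueError(f"port {port} is not an allowed replay destination")
    infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    addresses = sorted({info[4][0] for info in infos})
    if not addresses or any(address_is_blocked(a) for a in addresses):
        raise ValueError("target resolves to a blocked address")
    return addresses
```

The caller should connect to one of the returned addresses instead of resolving the hostname a second time; otherwise a DNS answer that changes between the check and the connect could bypass the guard.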

─ BUILT FOR THREE ROOMS

Three audiences. One architecture. The same answer in every room.

The platform team wants tests that don't depend on someone else's status page. The agent operator wants a deterministic substitute for the runtime that's suddenly calling production APIs. The regulated buyer wants the architecture to be the answer, not a Trust Pack PDF. Gostly is the same product to all three.

platform team

Tests that don’t depend on someone else’s status page.

Industry surveys put 70% of production API failures inside changes that passed CI against stale fixtures. Gostly replays the recording the operator approved, bit-for-bit. The 2 a.m. flaky run caused by a third party rate-limiting a build agent stops happening.

API MOCKING USE CASES

agent operator

Verifiable replay for agent loops at machine speed.

Cisco’s 2026 State of AI Security found 83% of organizations plan agentic AI deployments, but only 29% feel secure enough to ship. The gap is verifiability. Gostly gives the operator a deterministic substitute: record the conversation once, replay it in evaluations forever. The LLM is not in the substitute’s hot path.

AGENT RUNTIME PATTERNS

regulated buyer

Structural redaction the procurement reviewer can read.

Twenty-two RLS tables. A sixteen-header redaction floor. SAML, OIDC, RBAC, and an audit log: shipped, not roadmap. Four-hour offline grace through platform outages. The architecture, not a Trust Pack PDF, is the answer to “what happens if your platform is down?”

SECURITY MODEL

─ PRICING

Self-hosted free. Pro for the gap-fill. Team for the enterprise controls.

SAML, OIDC, RBAC, and the append-only audit log all ship today on Team and above — none of these are roadmap items. The full feature matrix lives on the register page.

Free

$0 forever

Self-hosted. No license key required.

  • Recording + replay
  • LEARN / MOCK / PASSTHROUGH modes
  • Smart-swap structural matching
  • Unlimited services
  • Dashboard UI
Get started free

Pro

$10 / month

Early-access pricing, locked through Dec 2026 for accounts created before Jul 1.

  • Everything in Free
  • AI gap-fill on unrecorded fields
  • LoRA fine-tuning on your traffic
  • Drift detection
  • Priority support
Start free trial

Team

$79 / seat / month

For shipping orgs. SAML + OIDC + audit log included.

  • Everything in Pro
  • Multi-user workspace
  • SAML SSO + OIDC
  • RBAC (owner / admin / member / viewer)
  • Team activity log
Start with Team

SELF-HOST COMMERCIAL

$499 / month flat

≤10 seats, 2 environments. For compliance-driven buyers who need to run Gostly entirely inside their perimeter.

Talk to us →

ENTERPRISE

Custom · floor $25K

Custom MSA, data residency, dedicated support. Talk to sales for the trust pack.

Contact sales →

※ The model is optional. With AI generation enabled the inference container needs a few gigabytes of RAM; with it disabled the rest of the cascade runs comfortably in 1 GB. Customers in latency-sensitive or regulated environments often disable generation and rely on exact + structural match alone.

─ READY WHEN YOU ARE

Recorded once. Replayed verbatim. Verifiable by construction.

Drop the proxy in. Run a real workload through it once. Have a byte-stable mock library by the end of the afternoon — plus drift detection, an SSRF-guarded webhook replay, and the audit trail to defend all of it.