Production crashes deserve more than a 45-minute repro.

Your production crash just got a failing test.

An AI agent investigates every crash. Deterministic code writes the failing test and sealed audit record. Your team gets a draft PR to review — with evidence compliance teams can verify.


Open source · Install in minutes · Python 3.11+

UC Berkeley AgentBeats · 1st place, testing track

500+ automated checks on every release

AI investigates. Code writes the proof.

SOC2 CC7.3 / CC7.4 · PCI DSS 12.10.5

~/checkout-service · logomesh repro · running

$ logomesh repro https://sentry.io/issues/4582/events/9a3c…

▸ fetching Sentry event ········· ✓ 312ms

▸ extracting innermost frame ···· ✓ checkout.py:42

▸ redacting PII (PAN, emails) ··· ✓ 4 fields scrubbed

▸ synthesizing pytest from locals ✓ deterministic

▸ running in airgapped sandbox ·· ✓ 8.4s

✗ FAIL test_repro_negative_qty_bypass

Property : order total should always be ≥ 0

Called : checkout(item_id=1, qty=-5)

Got : Order created with total -$49.95

Location : checkout.py, line 42

✓ artifact written → ./logomesh/4582-repro.json

signed · PCI DSS 12.10.5 · SOC2 CC7.3 / CC7.4

1 failing test · audit artifact · elapsed · 9.1s

For your security team

Built for security and compliance review.

No production DB access
Reproduction reads frame locals from the crash event. We never connect to your live database.
Airgapped sandbox
Docker container, unprivileged user, memory and PID limits, no outbound network.
No LLM in evidence path
The audit artifact is deterministic from redacted frame locals — the LLM never touches the seal.
PCI DSS 12.10.5 · SOC2 CC7.3 / CC7.4
Each record maps reproduction to post-incident response controls — ready for your next audit.

UC Berkeley AgentBeats: 1st place, testing track · Python 3.11+ · SOC2 Type I in progress · Type II Q3 2026

Every reproduction runs in isolation with strict boundaries — so your team can trust the results.

60-second reproduction
From a Sentry issue URL to a verified failing test.
Deterministic evidence
Audit records are generated from captured runtime values — never from model output.
Isolated execution
Every run uses a hardened Docker sandbox with no network access and strict resource limits.
500+ quality checks
Automated tests run on every release to keep the engine reliable.
Structured audit records
Each run produces JSON evidence designed for post-incident review and compliance handoff.
Python, today
Purpose-built for Python backend incidents where runtime context captures the failure.

What you get back

A failing test. The exact inputs. Audit-ready evidence.

When Sentry captures a crash, logomesh returns a failing test built from the real inputs your users hit, a draft pull request for your repository, and a sealed compliance record — typically within 60 seconds.

01 · failing pytest

tests/repro/test_4582_negative_qty.py

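from checkout import checkout  # assumed import path: checkout() defined in checkout.py


def test_repro_negative_qty_bypass():
    # synthesized from Sentry event 4582 frame locals
    order = checkout(
        item_id=1,
        qty=-5,  # observed in production
        currency="USD",
    )
    # property: an order total must never be negative
    assert order.total >= 0, f"total leaked: {order.total}"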

FAIL · total = -$49.95

02 · frame locals (verbatim, PII redacted)

checkout.py:42 · innermost app frame

{
  "item_id": 1,
  "qty": -5,
  "currency": "USD",
  "customer_email": "⟨redacted⟩",
  "card_pan": "⟨redacted⟩"
}

PII redacted before any LLM call or test synthesis
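As a rough illustration of what "innermost app frame" means, here is a minimal sketch. It assumes the event payload follows Sentry's documented layout (exceptions under `exception.values`, frames under `stacktrace.frames` ordered oldest call first, locals under `vars`, and an `in_app` flag per frame). The helper name `innermost_app_frame` and the sample event are hypothetical and are not part of logomesh.

```python
from typing import Any, Optional


def innermost_app_frame(event: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the most recent in-app frame of the last exception, if any."""
    values = (event.get("exception") or {}).get("values") or []
    if not values:
        return None
    # Assumption: with chained exceptions, the last entry is the one raised last.
    frames = (values[-1].get("stacktrace") or {}).get("frames") or []
    # Frames are ordered oldest call first, so walk them in reverse.
    for frame in reversed(frames):
        if frame.get("in_app"):
            return frame
    return None


# Made-up event for illustration only; real events carry many more fields.
event = {
    "exception": {
        "values": [
            {
                "stacktrace": {
                    "frames": [
                        {"filename": "app.py", "lineno": 10, "in_app": True, "vars": {}},
                        {
                            "filename": "checkout.py",
                            "lineno": 42,
                            "in_app": True,
                            "vars": {"item_id": 1, "qty": -5, "currency": "USD"},
                        },
                        {"filename": "decimal.py", "lineno": 7, "in_app": False, "vars": {}},
                    ]
                }
            }
        ]
    }
}

frame = innermost_app_frame(event)
```

In this sample the library frame at the bottom of the stack is skipped and the `checkout.py` frame is selected, which is the frame whose locals a reproduction would start from.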

03 · audit artifact (signed)

logomesh/4582-repro.json

{
  "sentry_event": "4582/9a3c…",
  "property_violated": "order.total >= 0",
  "repro_test": "tests/repro/test_4582_negative_qty.py",
  "controls": ["PCI DSS 12.10.5", "SOC2 CC7.3", "SOC2 CC7.4"],
  "evidence_hash": "sha256:bf17…e2",
  "llm_in_evidence_path": false
}

deterministic from frame locals · no LLM
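This page does not describe how `evidence_hash` is computed or how the artifact is signed. One common way to get a reproducible digest is to hash a canonical JSON encoding of the record. The sketch below assumes that approach; the function name and the `frame_locals` field are hypothetical.

```python
import hashlib
import json


def evidence_hash(record: dict) -> str:
    """Hash a record through canonical JSON: sorted keys, fixed separators, UTF-8."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record = {
    "property_violated": "order.total >= 0",
    "frame_locals": {"item_id": 1, "qty": -5, "currency": "USD"},  # hypothetical field
    "llm_in_evidence_path": False,
}

# The same content in a different key order yields the same digest.
reordered = dict(reversed(list(record.items())))
same_digest = evidence_hash(record) == evidence_hash(reordered)
```

Because the encoding is canonical, a reviewer can recompute the digest from the redacted record and compare it with the stored value. A digest alone shows integrity, not authorship; a real seal would also need a signature over it.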

The failing test
A pytest that reproduces the crash on your current branch, delivered as a draft pull request for your team to review.
Captured runtime values
The exact inputs present when the failure occurred. Tests are synthesized deterministically from those values, with sensitive data redacted before any model sees them.
The audit record
A sealed JSON evidence chain mapped to PCI DSS 12.10.5 and SOC 2 CC7.3 / CC7.4 — the post-incident controls your compliance reviewers expect.

Who it's for

Built for high-impact backend incidents.

Ideal for deterministic failures — billing math, validation edge cases, and rounding errors that can be replayed from captured runtime state.

Repro time

60s

from Sentry URL to a failing pytest

Engineer time saved

30–40 min

per crash vs. manual state reconstruction

AI in evidence path

None

audit records are deterministic

Quality checks

500+

automated tests on every release

Common crash paths

Crashes in Python business-logic modules — where the bad input is in frame locals — reproduce reliably.

billing/ · Subscription totals, proration, invoice math.
checkout/ · Cart validation, quantity rules, currency conversion.
pricing/ · Tier resolution, coupon stacking, tax calculation.
refund/ · Refund amounts, partial refunds, off-by-one on totals.
payments/ · Charge intent, idempotency, status reconciliation.

Off-by-one on a refund total
Negative quantity slipping past validation
Float rounding on a tax calculation
Wrong tier resolved for a coupon stack
Currency mismatch on a partial charge

Where we draw the line

When the root cause depends on shared state, timing, or external systems, logomesh reports that clearly instead of claiming a match.

Race conditions across async tasks
Distributed transactions that span services
Bugs requiring a live database row to reproduce
Failures from external API timeouts or rate limits

In these cases, logomesh returns a structured explanation so your team can triage with confidence — never a false positive.
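The shape of that structured explanation is not documented here. As a hedged sketch, a result type could make "reproduced" and "not reproducible" mutually exclusive so a run can never report both. The `Outcome` and `ReproResult` names and fields below are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    REPRODUCED = "reproduced"
    NOT_REPRODUCIBLE = "not_reproducible"


@dataclass(frozen=True)
class ReproResult:
    outcome: Outcome
    test_path: Optional[str] = None  # set only when a failing test was produced
    reason: Optional[str] = None     # set only when reproduction was not possible

    def __post_init__(self) -> None:
        # Reject results that claim an outcome without the matching evidence.
        if self.outcome is Outcome.REPRODUCED and not self.test_path:
            raise ValueError("a reproduced result needs a test path")
        if self.outcome is Outcome.NOT_REPRODUCIBLE and not self.reason:
            raise ValueError("a non-reproducible result needs a reason")


result = ReproResult(
    outcome=Outcome.NOT_REPRODUCIBLE,
    reason="failure depends on a live database row that is not in frame locals",
)
```

Validating at construction time means a "not reproducible" report always carries a reason a responder can act on.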

Why logomesh

Why teams choose logomesh

Give incident responders a repeatable starting point in minutes — so teams spend less time reconstructing state and more time shipping fixes.

Debug from facts, not guesswork
The same crash input produces the same failing test output, so teams can reproduce issues consistently.
Isolated by default
Every run executes in a hardened Docker sandbox with no outbound network, an unprivileged user, and strict memory and process limits.
You stay in control
logomesh reproduces incidents. Your team decides root cause, remediation, and every code change.
Clear incident trail
Every run includes a structured artifact for internal review, handoff, and post-incident documentation.

How it works

From crash to verified test, automatically

Four things happen between a Sentry alert and the failing test landing in your PR queue.

01
Every alert gets investigated
When Sentry captures a crash, the agent starts immediately — no manual triage and no queue to babysit.
02
The agent reproduces the failure
It reads the crash, locates the relevant code, and determines how to trigger the same failure. When reproduction isn't possible, you get a clear explanation instead of a false positive.
03
Evidence is written deterministically
A deterministic function generates the failing test and audit record. No model output enters your compliance evidence path.
04
Your team reviews a draft PR
A real failing test and sealed audit record mapped to SOC 2 and PCI controls. logomesh never merges code — your team owns the fix.

Security

Enterprise-grade isolation

Every reproduction runs in an isolated sandbox with the same security boundaries whether you use the CLI locally or a managed deployment.

Isolated sandbox

Each run executes in a hardened Docker container with no outbound network, an unprivileged user, and strict memory and process limits.
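For readers who want to see what those constraints look like in practice, here is a minimal sketch that builds a `docker run` command with standard Docker CLI flags. It is not the command logomesh runs: the image name, user ID, and limit values are placeholder assumptions, and the sketch only constructs the argument list without executing it.

```python
import shlex


def sandbox_command(
    image: str, test_path: str, memory: str = "512m", pids: int = 128
) -> list[str]:
    """Build a docker run command that applies the four limits described above."""
    return [
        "docker", "run", "--rm",
        "--network", "none",         # no network interfaces besides loopback
        "--user", "1000:1000",       # unprivileged UID:GID (placeholder value)
        "--memory", memory,          # hard memory cap (placeholder value)
        "--pids-limit", str(pids),   # cap on processes in the container
        image,
        "python", "-m", "pytest", "-q", test_path,
    ]


cmd = sandbox_command(
    image="repro-sandbox:latest",  # hypothetical image with pytest installed
    test_path="tests/repro/test_4582_negative_qty.py",
)
print(shlex.join(cmd))
```

Passing the command as a list to `subprocess.run` rather than as a shell string avoids quoting problems if a test path ever contains spaces.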

Your credentials, encrypted

Sentry and GitHub credentials are encrypted at rest and scoped to your installation. Rotate keys anytime from your dashboard.

No production database access

Reproduction uses runtime values captured at the moment of the crash. logomesh never connects to your live database.

PII redacted by default

Payment card numbers, government IDs, email addresses, tokens, and common API-key patterns are removed before any model call or audit record is written.
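The redaction rules themselves are not published on this page. The sketch below shows one plausible shape for two of the categories, email addresses and card numbers, using a regular expression plus a Luhn checksum to cut down on false matches. The patterns and helper names are assumptions, and government IDs, tokens, and API keys are left out.

```python
import re

REDACTED = "⟨redacted⟩"
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
PAN_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_text(text: str) -> str:
    """Replace emails and Luhn-valid card numbers with a fixed marker."""
    text = EMAIL_RE.sub(REDACTED, text)

    def _replace_pan(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        return REDACTED if luhn_valid(digits) else match.group()

    return PAN_RE.sub(_replace_pan, text)


def redact_locals(frame_locals: dict) -> dict:
    """Redact string values; non-string values pass through unchanged."""
    return {
        key: redact_text(value) if isinstance(value, str) else value
        for key, value in frame_locals.items()
    }


clean = redact_locals(
    {"qty": -5, "customer_email": "jane@example.com", "card_pan": "4111 1111 1111 1111"}
)
```

Running redaction as the first step, before anything is passed to a model or written to disk, is what lets the later stages treat frame locals as safe to store.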

Full security and compliance documentation is available for vendor review.

FAQ

Common questions.

When a Python crash lands in Sentry, an AI agent investigates it, identifies the failing code path, and produces a deterministic failing test that reproduces the issue. You receive a draft pull request with the test and a sealed audit record suitable for compliance review — typically within a minute.

Get started

From Sentry alert to verified evidence.

Install logomesh, point it at a production crash, and receive a failing test plus an audit record your team can stand behind.

Install logomesh
Connect your Sentry project
Reproduce your first crash
Export audit-ready evidence


Open source · MIT license · Python 3.11+
