Sealed artifact

What the compliance JSON envelope contains, what llm_in_evidence_path: false means, and how an auditor verifies it.

logomesh repro <url> --artifact writes a JSON envelope alongside the failing pytest file. It is designed for a human auditor, not for a dashboard badge. Every field has a deterministic definition that an auditor can verify independently.

What the artifact is

A signed JSON document that records the exact inputs, the synthesized test, the sandbox output, and the exception match verdict for a single reproduction run. It does not contain an AI opinion. It contains hashes and types.

{
  "control_mappings": ["SOC2 CC7.3", "SOC2 CC7.4", "PCI DSS 12.10.5"],
  "evidence_path_seal": {
    "llm_in_evidence_path": false,
    "synthesizer": "frame-locals → pytest (deterministic)",
    "sandbox_exception_type": "ValueError",
    "expected_exception_type": "ValueError",
    "verified_exception_match": true,
    "test_sha256_first16": "203ba3cdb6803e85"
  },
  "git": {
    "branch": "fix/checkout-qty",
    "commit": "abc1234"
  }
}

llm_in_evidence_path: false

The call expression, the test code, and the sandbox output in this artifact came from the deterministic synthesizer only. The synthesizer reads frame locals directly from the Sentry event and emits a pytest file without calling any language model.

An auditor can verify this by hashing the frame-locals input and the generated test file and comparing both against the values in the artifact. test_sha256_first16 is the first 16 hex characters of the SHA-256 of the test file as written to disk before sandbox execution.

Auxiliary LLM calls — hypothesis suggesters, context enrichers — are tagged in_evidence_path: false and excluded from the seal. They never appear inside evidence_path_seal.

verified_exception_match: true

“Reproduced” has a precise definition. The sandbox must raise the same exception type that the Sentry event recorded. A pytest run that reports failed > 0 is not sufficient on its own: a NameError from a missing import also produces a failure, and that is not a reproduction.

verified_exception_match is true only when sandbox_exception_type equals expected_exception_type exactly. If they diverge, the artifact sets needs_human_review: true and records the reason. No green verdict ships on a wrong match.
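The rule can be expressed as a small function. This is a sketch under stated assumptions, not the logomesh implementation: the name match_verdict is hypothetical, the wording of the reason string is invented for illustration, and the real artifact may place needs_human_review and review_reason elsewhere in the envelope.

```python
from typing import Optional


def match_verdict(sandbox_type: Optional[str], expected_type: str) -> dict:
    """Build verdict fields from the two recorded exception type names.

    sandbox_type is None when the sandbox produced no exception output.
    """
    # Exact string equality only: no subclass or prefix matching.
    matched = sandbox_type is not None and sandbox_type == expected_type
    verdict = {
        "sandbox_exception_type": sandbox_type,
        "expected_exception_type": expected_type,
        "verified_exception_match": matched,
    }
    if not matched:
        # Hypothetical reason text; the real review_reason format is not documented here.
        verdict["needs_human_review"] = True
        verdict["review_reason"] = (
            f"sandbox raised {sandbox_type!r}, expected {expected_type!r}"
        )
    return verdict
```

Under this sketch, match_verdict("NameError", "ValueError") yields verified_exception_match: false together with needs_human_review: true, even though pytest would count the run as a failed test.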

Control mappings

The three controls are post-incident response controls, not pre-release code-review controls.

  • SOC2 CC7.3 — evaluation of security events to identify those that are security incidents. The sealed reproduction is machine-witnessed evidence that the event was evaluated and the root-cause call was identified.
  • SOC2 CC7.4 — response to identified security incidents. The artifact + draft PR pair documents both the incident evidence and the remediation branch.
  • PCI DSS 12.10.5 — incident response plan includes monitoring and responding to alerts from security monitoring systems. The artifact provides audit-ready evidence that the Sentry alert was acted on with a verifiable response.

Do not confuse these with CC8.1 (change management) or PCI DSS 6.3.2 (pre-release code review). Those controls apply before a change ships. The artifact records what happened after a crash was captured in production.

needs_human_review: true

If the exception types do not match, or the sandbox produced no output, the artifact sets needs_human_review: true and includes a review_reason string explaining what diverged. The artifact is still written to disk so that there is a complete record. While this flag is set, the artifact is never promoted to a draft PR automatically.
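An auditor can turn the rules on this page into a consistency check over a parsed artifact. The sketch below is illustrative only. It assumes needs_human_review and review_reason sit at the top level of the envelope, which the sample artifact does not confirm, and audit_findings is a hypothetical name.

```python
from typing import List


def audit_findings(artifact: dict) -> List[str]:
    """Return a list of inconsistencies; an empty list means every check passed."""
    seal = artifact.get("evidence_path_seal", {})
    findings: List[str] = []

    if seal.get("llm_in_evidence_path") is not False:
        findings.append("llm_in_evidence_path is not false")

    sandbox_type = seal.get("sandbox_exception_type")
    types_equal = (
        sandbox_type is not None
        and sandbox_type == seal.get("expected_exception_type")
    )
    if seal.get("verified_exception_match") != types_equal:
        findings.append(
            "verified_exception_match does not agree with the recorded exception types"
        )

    # Assumed location of the review fields: top level of the envelope.
    flagged = artifact.get("needs_human_review") is True
    if flagged and not artifact.get("review_reason"):
        findings.append("needs_human_review is set without a review_reason")
    if not types_equal and not flagged:
        findings.append("exception types diverge but needs_human_review is not set")

    return findings
```

Returning a list of findings rather than a single boolean lets the auditor see every failed rule at once, which suits a document meant to be read by a person.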