2 Commits

Author SHA1 Message Date
BattleTag
8b964b5dec docs(strategist) S8/9: DESIGN.md updates + DESIGN_STRATEGIST.md spec
Adds the DESIGN_STRATEGIST.md spec and applies the DESIGN.md updates
listed in DESIGN_STRATEGIST.md §11. The strategist refit is the first
sub-design big enough to need its own document, so it lives as a
sibling to DESIGN.md rather than inline.

DESIGN_STRATEGIST.md (new, 543 lines) covers:
  §0  Scope, non-goals, invariants preserved
  §1  Data model (Lead extension, InvestigationRound)
  §2  Six tools (graph_overview / source_coverage / marginal_yield /
      budget_status / propose_lead / declare_investigation_complete)
      with full input_schema
  §3  InvestigationStrategist agent class
  §4  Orchestrator Phase 3 loop pseudocode
  §5  Persistence + resume strategy
  §6  Config schema
  §7  Test plan (8 scenarios)
  §8  9-step build order (matches commit history)
  §9  Risks + mitigations
  §10 Open questions
  §11 Required DESIGN.md updates (applied here)
  §12 What this design does NOT solve (exam-test coverage, vision-
      capable LLM, blockchain explorer, etc.)
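
A minimal sketch of what one of the §2 tool definitions might look
like, assuming the tools use the common name / description /
input_schema shape with a JSON Schema object. Only the tool name
propose_lead and the source_id parameter come from this history; the
other property names and the missing_required helper are hypothetical,
and DESIGN_STRATEGIST.md §2 remains the authoritative schema.

```python
# Hypothetical definition of the propose_lead tool. The properties
# "rationale" and "priority" are illustrative placeholders, not the
# real schema from DESIGN_STRATEGIST.md.
PROPOSE_LEAD_TOOL = {
    "name": "propose_lead",
    "description": "Propose a new investigative lead for the next round.",
    "input_schema": {
        "type": "object",
        "properties": {
            "source_id": {"type": "string"},
            "rationale": {"type": "string"},
            "priority": {"type": "integer", "minimum": 1, "maximum": 5},
        },
        "required": ["source_id", "rationale"],
    },
}


def missing_required(tool, arguments):
    """Return the required argument names absent from a tool call."""
    required = tool["input_schema"].get("required", [])
    return [name for name in required if name not in arguments]


if __name__ == "__main__":
    # A call that omits "rationale" is reported as incomplete.
    print(missing_required(PROPOSE_LEAD_TOOL, {"source_id": "src-1"}))
```

A pre-dispatch check of this kind would let the orchestrator reject a
malformed call before it reaches the tool body.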

DESIGN.md updates per §11:
  §4.5  Notes that harmonic damping has now landed
  §4.9  Phase 3 table row now points at the strategist loop +
        inline summary
  §5    Lead + InvestigationRound rows added to the data-model
        summary table

This commit closes the strategist refit. 174 tests pass, 1 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 02:28:06 -10:00
BattleTag
81ade8f7ac feat(refit): complete S1-S6 — case abstraction, grounding, log-odds, plugins, coref, multi-source
Consolidates the long-running refit work (DESIGN.md as authoritative spec)
into a single baseline commit. Six stages landed together:

  S1  Case + EvidenceSource abstraction; tools parameterised by source_id
      (case.py, main.py multi-source bootstrap, .bin extension support)
  S2  Grounding gateway in add_phenomenon: verified_facts cite real
      ToolInvocation ids; substring / normalised match enforced; agent +
      task scope checked. Phenomenon.description split into verified_facts
      (grounded) + interpretation (free text). [invocation: inv-xxx]
      prefix on every wrapped tool result so the LLM can cite.
  S3  Confidence as additive log-odds: edge_type → log10(LR) calibration
      table; commutative updates; supported / refuted thresholds derived
      from log_odds; hypothesis × evidence matrix view.
  S4  iOS plugin: unzip_archive + parse_plist / sqlite_tables /
      sqlite_query / parse_ios_keychain / read_idevice_info;
      IOSArtifactAgent; SOURCE_TYPE_AGENTS routing.
  S5  Cross-source entity resolution: typed identifiers on Entity,
      observe_identity gateway, auto coref hypothesis with shared /
      conflicting strong/weak LR edges, reversible same_as edges,
      actor_clusters() view.
  S6  Android partition probe + AndroidArtifactAgent; MediaAgent with
      OCR fallback; orchestrator Phase 1 iterates every analysable
      source; platform-aware get_triage_agent_type; ReportAgent renders
      actor clusters + per-source breakdown.
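
The grounding check described in S2 can be sketched roughly as
follows. This is an illustration under stated assumptions, not the
code in add_phenomenon: the ToolInvocation fields, the normalise
helper, and the is_grounded signature are all hypothetical.

```python
import re
import unicodedata
from dataclasses import dataclass


@dataclass
class ToolInvocation:
    # Hypothetical record shape; the real class may carry more fields.
    id: str
    agent: str
    task: str
    output: str


def normalise(text):
    """NFKC-fold, lowercase, and collapse whitespace for a lenient match."""
    text = unicodedata.normalize("NFKC", text).lower()
    return re.sub(r"\s+", " ", text).strip()


def is_grounded(fact, invocation_id, agent, task, invocations):
    """True if `fact` occurs in the cited invocation's output and the
    invocation belongs to the same agent and task as the caller."""
    if not fact.strip():
        return False  # an empty fact would trivially match anything
    inv = invocations.get(invocation_id)
    if inv is None:
        return False  # cited id does not refer to a real invocation
    if inv.agent != agent or inv.task != task:
        return False  # out of scope for this agent / task
    # Exact substring first, then the normalised fallback.
    return fact in inv.output or normalise(fact) in normalise(inv.output)
```

Checking the exact substring before the normalised form keeps the
common case cheap while still tolerating case and whitespace drift in
what the LLM quotes back.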

142 unit tests, 1 skipped, covering the new gateway, log-odds math,
coref hypothesis fall-out, and orchestrator multi-source dispatch.
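
For reference, the additive log-odds scheme from S3 can be illustrated
with a small sketch. The edge-type names, the log10(LR) values, and
the two thresholds below are placeholder assumptions rather than the
project's calibration table; only the overall shape (sum of log10
likelihood ratios, thresholds read off the total) follows the
description above.

```python
# Hypothetical calibration table: edge_type -> log10(likelihood ratio).
LOG10_LR = {
    "strong_support": 1.0,
    "weak_support": 0.3,
    "weak_refute": -0.3,
    "strong_refute": -1.0,
}

SUPPORTED_AT = 2.0   # assumed threshold: odds of 100:1 in favour
REFUTED_AT = -2.0    # assumed threshold: odds of 100:1 against


def log_odds(edge_types, prior=0.0):
    """Add each edge's log10(LR) to the prior log-odds."""
    return prior + sum(LOG10_LR[e] for e in edge_types)


def status(lo):
    """Derive the hypothesis status from its total log-odds."""
    if lo >= SUPPORTED_AT:
        return "supported"
    if lo <= REFUTED_AT:
        return "refuted"
    return "open"


def probability(lo):
    """Convert log10 odds back to a probability."""
    odds = 10.0 ** lo
    return odds / (1.0 + odds)


if __name__ == "__main__":
    edges = ["strong_support", "weak_refute", "strong_support"]
    total = log_odds(edges)
    print(total, status(total), probability(total))
```

Because the update is a plain sum, the order in which evidence arrives
does not change the total (up to floating-point rounding), which is
the commutativity property S3 refers to.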

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 02:12:10 -10:00