# EGPv2 Fix Checklist

## Goal

Reduce the current "always-anomaly" bias in EGPv2 without leaking benchmark labels.

## Checklist

- [x] Fix Matter value presentation in `signals.py`
  - Format `TemperatureMeasurement.MeasuredValue` as human-readable Celsius.
  - Add semantic formatting for common Matter attributes such as occupancy, on/off, and lock state.
  - Expose short protocol notes to downstream prompts.
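
The value formatting above could look roughly like the sketch below. It assumes that `MeasuredValue` arrives as an integer in hundredths of a degree Celsius (the usual Matter encoding), that bit 0 of the occupancy bitmap means "occupied", and that lock states 0, 1, and 2 map to not fully locked, locked, and unlocked. The function names `format_temperature` and `format_attribute` are hypothetical and may not match what `signals.py` actually exposes.

```python
from typing import Any, Optional

# Assumed subset of DoorLock.LockState values; extend as needed.
LOCK_STATES = {0: "not fully locked", 1: "locked", 2: "unlocked"}


def format_temperature(raw: Optional[int]) -> str:
    """Render a raw MeasuredValue (assumed 0.01 degree C units) as Celsius."""
    if raw is None:
        return "unknown (null)"
    return f"{raw / 100:.2f} °C"


def format_attribute(cluster: str, attribute: str, raw: Any) -> str:
    """Map a raw Matter attribute value to a short human-readable string."""
    if raw is None:
        return "unknown (null)"
    key = (cluster, attribute)
    if key == ("TemperatureMeasurement", "MeasuredValue"):
        return format_temperature(raw)
    if key == ("OnOff", "OnOff"):
        return "on" if raw else "off"
    if key == ("OccupancySensing", "Occupancy"):
        # Only bit 0 is inspected; other bits are ignored in this sketch.
        return "occupied" if int(raw) & 0x01 else "unoccupied"
    if key == ("DoorLock", "LockState"):
        return LOCK_STATES.get(raw, f"unrecognized lock state ({raw})")
    # Unknown attributes fall back to the raw value so nothing is hidden.
    return str(raw)
```

With this mapping, a raw reading of `2150` is shown as `21.50 °C` rather than as a bare four-digit number that a model might misread as an extreme temperature.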
- [x] Add Matter-aware prompt guidance in `prompts.py`
  - Tell the model these are Matter-style logs.
  - Warn that raw protocol values must not be interpreted naively.
  - Require direct evidence before claiming device faults.
- [x] Tighten `Triage`
  - Move from free-form mixed task labels to a primary task with an optional secondary task.
  - Cap first-round focus chunk selection to a small set.
  - Encourage query-anchored focus rather than generic anomaly hunting.
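
One possible shape for the tightened triage output is sketched below. The `TriageResult` class, the `cap_focus_chunks` helper, and the cap of 3 are all assumptions made for illustration; the checklist only specifies "a small set".

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional

MAX_FOCUS_CHUNKS = 3  # assumed cap; the checklist does not give a number


@dataclass
class TriageResult:
    """One primary task, at most one secondary task, and a short focus list."""
    primary_task: str
    secondary_task: Optional[str] = None
    focus_chunks: List[int] = field(default_factory=list)


def cap_focus_chunks(ranked_ids: Iterable[int],
                     limit: int = MAX_FOCUS_CHUNKS) -> List[int]:
    """Keep the first `limit` distinct chunk ids, preserving rank order."""
    if limit <= 0:
        return []
    kept: List[int] = []
    for chunk_id in ranked_ids:
        if chunk_id not in kept:
            kept.append(chunk_id)
        if len(kept) == limit:
            break
    return kept
```

Capping in code rather than in the prompt means the limit still holds when the model returns more chunks than requested.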
- [x] Strengthen `Supervisor`
  - Add explicit `risk_of_false_alarm`.
  - Add explicit `recommended_action` so code can enforce the result.
  - Ask the supervisor to flag protocol-format misunderstandings.
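
A minimal sketch of how the supervisor output might be validated before code acts on it. Only the field names `risk_of_false_alarm` and `recommended_action` and the value `abstain` come from this checklist; the other allowed values (`accept`, `refine`, `low`, `medium`, `high`) and the `protocol_misread` flag are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Dict

ALLOWED_ACTIONS = {"accept", "refine", "abstain"}  # assumed vocabulary
ALLOWED_RISK = {"low", "medium", "high"}           # assumed vocabulary


@dataclass(frozen=True)
class SupervisorVerdict:
    risk_of_false_alarm: str
    recommended_action: str
    protocol_misread: bool = False  # hypothetical flag for format misreadings


def parse_verdict(payload: Dict[str, Any]) -> SupervisorVerdict:
    """Validate a raw supervisor payload so later code can branch on it."""
    risk = str(payload.get("risk_of_false_alarm", "")).strip().lower()
    action = str(payload.get("recommended_action", "")).strip().lower()
    if risk not in ALLOWED_RISK:
        raise ValueError(f"unexpected risk_of_false_alarm: {risk!r}")
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unexpected recommended_action: {action!r}")
    return SupervisorVerdict(
        risk_of_false_alarm=risk,
        recommended_action=action,
        protocol_misread=bool(payload.get("protocol_misread", False)),
    )
```

Rejecting unknown values early keeps a malformed supervisor reply from silently being treated as approval.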
- [x] Turn the pipeline into a bounded 2-round workflow in `pipeline.py`
  - Round 1: Triage -> Investigator -> Supervisor.
  - If needed, expand chunks and rerun Investigator + Supervisor once.
  - No unbounded looping.
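
The bounded workflow can be sketched as a loop with a hard round limit. The stage callables (`triage`, `investigate`, `supervise`, `expand`) and the `"refine"` action value are placeholders assumed for illustration, not the actual interfaces in `pipeline.py`.

```python
from typing import Any, Callable, Dict, List

MAX_ROUNDS = 2  # hard bound from the checklist: at most one refinement


def run_bounded(
    episode: Any,
    triage: Callable[[Any], List[int]],
    investigate: Callable[[Any, List[int]], Dict[str, Any]],
    supervise: Callable[[Any, Dict[str, Any]], Dict[str, Any]],
    expand: Callable[[Any, List[int]], List[int]],
) -> List[Dict[str, Any]]:
    """Run Triage once, then Investigator + Supervisor for at most two rounds."""
    chunks = triage(episode)
    rounds: List[Dict[str, Any]] = []
    for round_no in range(1, MAX_ROUNDS + 1):
        finding = investigate(episode, chunks)
        verdict = supervise(episode, finding)
        # Keep a per-round record so the trace survives refinement.
        rounds.append({
            "round": round_no,
            "chunks": list(chunks),
            "finding": finding,
            "verdict": verdict,
        })
        if verdict.get("recommended_action") != "refine":
            break
        if round_no < MAX_ROUNDS:
            chunks = expand(episode, chunks)
    return rounds
```

Because the loop is a `for` over a fixed range, a supervisor that keeps asking for refinement still stops after the second round.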
- [x] Enforce supervisor authority in code
  - If the supervisor requests refinement, the code must refine once.
  - If the final supervisor says `abstain`, the code must not let the verifier override it.
  - If evidence is still insufficient and false-alarm risk is high, default to a conservative non-anomaly output.
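
The last two enforcement rules above can be expressed as a small precedence function. This is a sketch under assumptions: the label strings (`"abstain"`, `"non-anomaly"`), the `"high"` risk value, and the `evidence_sufficient` flag are illustrative, and the real pipeline may represent these differently.

```python
def final_decision(
    supervisor_action: str,
    risk_of_false_alarm: str,
    evidence_sufficient: bool,
    verifier_label: str,
) -> str:
    """Apply supervisor authority before the verifier's label is considered."""
    # Rule 1: a final `abstain` from the supervisor cannot be overridden.
    if supervisor_action == "abstain":
        return "abstain"
    # Rule 2: weak evidence plus high false-alarm risk falls back to the
    # conservative non-anomaly output.
    if not evidence_sufficient and risk_of_false_alarm == "high":
        return "non-anomaly"
    # Otherwise the verifier's label stands.
    return verifier_label
```

Ordering the checks this way makes the precedence explicit: the supervisor's abstention wins first, the conservative default second, and the verifier last.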
- [x] Preserve traceability
  - Keep round-by-round investigator and supervisor traces.
  - Keep backward-compatible top-level trace fields where practical.
- [ ] Re-run the 60-episode diagnosis subset after implementation