Lesson #1459

or-loop + wf-fix-loop + wf-survey-then-fix tooling stack (built 2026-05-05)

Medium authority: 75

ID: 1459
Author: ai
Agent: agent-claude
Reviewed: ✓ Yes
Source authority: 75 / 100
Source: Reliable Python OR-orchestrator stack — replaces flaky or-agent npm SDK
Source issue: —
Created at: 2026-05-12T10:00:23.307662+00:00
Valid until: —
Deprecated at: —
Supersedes: —
Obsidian path: /root/.claude/projects/-nvmetank1-projects/memory/feedback_or_loop_tooling.md
Obsidian hash: 39580fab7a5c681f2319dd36128dc0cf
Tags: claude-memory,feedback

Content

**Built 2026-05-05 to bypass or-agent npm SDK reliability issues.**

## /usr/local/bin/or-loop (~280 LOC Python)

Direct tool loop against the OpenRouter (OR) API. Tools: `read_file`, `write_file`, `edit_file` (single-replace), `list_dir`, `run_bash`. The system prompt enforces "wrap-up discipline" (commit before stop).

**Usage:**
```sh
or-loop --cwd /repo/path --max-turns 15 --max-cost 0.30 --model qwen/qwen3-coder-plus "prompt"
```

**Defaults:**
- model: `qwen/qwen3-coder-plus` ($0.20/M in, $0.80/M out)
- max-turns: 25
- max-cost: $1.00
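
The `--max-cost` cap only makes sense with a per-turn cost estimate. A minimal sketch of that arithmetic, using the prices listed above; the function names and the token counts in the example are hypothetical, not taken from `or-loop` itself:

```python
# Prices from the defaults above, in USD per million tokens (qwen/qwen3-coder-plus).
PRICE_IN_PER_M = 0.20
PRICE_OUT_PER_M = 0.80


def turn_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated USD cost of one API turn (hypothetical helper)."""
    return (
        prompt_tokens * PRICE_IN_PER_M + completion_tokens * PRICE_OUT_PER_M
    ) / 1_000_000


def over_budget(spent: float, max_cost: float = 1.00) -> bool:
    """True once accumulated spend reaches the cap (default mirrors --max-cost)."""
    return spent >= max_cost


if __name__ == "__main__":
    # Illustrative numbers: one turn with a 200k-token context and 4k tokens out.
    print(f"{turn_cost(200_000, 4_000):.4f}")  # 0.0432
```

Because input tokens dominate on large contexts, a handful of turns can cost more than many small ones, which is why a turn limit alone is a weak guard.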

**Sandbox-locked to --cwd.** Won't escape via `..`.
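
One common way to implement this kind of lock is to resolve every requested path and check that it still sits under the sandbox root. The sketch below illustrates that technique only; `resolve_in_sandbox` is a hypothetical name and the actual check in `or-loop` may differ:

```python
from pathlib import Path


def resolve_in_sandbox(cwd: str, requested: str) -> Path:
    """Resolve `requested` relative to `cwd`; refuse anything outside it."""
    root = Path(cwd).resolve()
    # Joining with an absolute `requested` discards `root`, so absolute
    # paths outside the sandbox are rejected by the same check below.
    target = (root / requested).resolve()
    if target != root and root not in target.parents:
        raise PermissionError(f"{requested!r} escapes sandbox {root}")
    return target


if __name__ == "__main__":
    print(resolve_in_sandbox("/repo/path", "src/app.py"))
    try:
        resolve_in_sandbox("/repo/path", "../../etc/passwd")
    except PermissionError as exc:
        print("blocked:", exc)
```

Resolving before comparing matters: a plain string prefix check would accept `/repo/path/../other`.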

## /usr/local/bin/wf-fix-loop (~180 LOC bash)

End-to-end pipeline: worktree → or-loop → push → PR → merge → redeploy → verify HTTP 200.

**Usage:**
```sh
PR_TITLE="fix(area): ..." wf-fix-loop /nvmetank1/projects/<repo> <branch-suffix> /tmp/prompt.md [--rebuild]
```

Redeploy is repo-aware: yoga rebuilds its image (Dockerfile.slim), glug (live-mounted) is restarted, and rag-stack is restarted.

**Failure rate ~50%** when the prompt asks "find and fix N bugs": qwen3-coder-plus over-investigates.

## /usr/local/bin/wf-survey-then-fix (~200 LOC bash) — RECOMMENDED

2-stage workflow that splits investigation from fix-application. Lower failure rate.

**Usage:**
```sh
wf-survey-then-fix /nvmetank1/projects/<repo> <branch-suffix> "<bug-pattern-description>" [--rebuild]
```

**Stage 1 (5-7 turns, $0.10):** or-loop produces `SURVEY.md` with 3 candidates (file:line, evidence, suggested-fix). Commits SURVEY.md.

**Stage 2 (12-15 turns, $0.20):** or-loop reads SURVEY.md, applies each fix surgically. Commits.

**Always produces a committable artifact** even if stage 2 stalls (SURVEY.md remains as audit trail).
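
The two stages can be pictured as two separate `or-loop` invocations with their own budgets. This is a rough sketch, assuming the turn and cost figures above map to `--max-turns` and `--max-cost`. The prompt wording and the `stage_commands` helper are hypothetical, and the real wrapper is bash:

```python
import shlex


def stage_commands(repo: str, bug_pattern: str) -> tuple[list[str], list[str]]:
    """Build argv lists for the survey stage and the fix stage."""
    survey_prompt = (
        f"Survey this repo for: {bug_pattern}. Write SURVEY.md with 3 candidates "
        "(file:line, evidence, suggested-fix). Commit SURVEY.md. Do not fix anything."
    )
    fix_prompt = "Read SURVEY.md and apply each suggested fix surgically. Commit."
    stage1 = ["or-loop", "--cwd", repo, "--max-turns", "7",
              "--max-cost", "0.10", survey_prompt]
    stage2 = ["or-loop", "--cwd", repo, "--max-turns", "15",
              "--max-cost", "0.20", fix_prompt]
    return stage1, stage2


if __name__ == "__main__":
    for argv in stage_commands("/repo/path", "href='#' placeholders"):
        # shlex.join quotes the prompt so the line can be pasted into a shell.
        print(shlex.join(argv))
```

Keeping the survey prompt free of any "fix" instruction is the point of the split: stage 1 can only end by writing its artifact.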

**When to use:**
- Prefer for "find-and-fix N bugs" prompts (wf-survey-then-fix succeeds ~70% of the time vs ~50% for wf-fix-loop)
- Use plain wf-fix-loop only when target is **specific and known** (e.g. "add csrf_check to these 3 specific routes")

**Smoke-test result 2026-05-05:** PR yoga#88 (href-hash-placeholders), $0.03, 3 files +47/-7, 3 SURVEY.md candidates → all 3 fixes attempted in stage 2.

## Cost snapshot (typical runs)

| Run-type | Cost | Wall-clock |
|---|---|---|
| or-loop trivial (smoke-test) | $0.001 | 30s |
| wf-fix-loop simple-fix (1 file) | $0.04 | 4-6 min |
| wf-fix-loop hunt-and-fix (3-5 files) | $0.05-0.10 | 8-12 min |
| wf-survey-then-fix end-to-end | $0.03-0.10 | 4-8 min |

OR balance check: `agent-quota-poll get openrouter`.

## Anti-patterns (learned the hard way)

1. **Don't ask qwen3-coder-plus to "find AND fix" in one prompt** — split via wf-survey-then-fix.
2. **Don't run multiple or-loop instances on the same or overlapping repo checkouts**: they race on the same files.
3. **Don't trust max-turns alone** — also set --max-cost (some prompts burn budget on 5-turn loops with huge contexts).
4. **Don't forget the wrap-up commit**: the system prompt enforces it, but qwen sometimes skips it. The orchestrator wrappers (wf-fix-loop / wf-survey-then-fix) auto-commit on exit as a safety net.
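
The auto-commit safety net from item 4 could look roughly like this. It is a Python sketch of the idea only (the real wrappers are bash): the `auto_commit` name, the commit message, and the injectable `run` parameter are assumptions for illustration.

```python
import subprocess
from typing import Callable


def auto_commit(
    repo: str,
    message: str = "wip: auto-commit on exit",
    run: Callable = subprocess.run,
) -> bool:
    """Commit any leftover changes in `repo`; return True if a commit was made."""
    status = run(
        ["git", "-C", repo, "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    if not status.stdout.strip():
        return False  # Working tree is clean: the model already committed.
    run(["git", "-C", repo, "add", "-A"], check=True)
    run(["git", "-C", repo, "commit", "-m", message], check=True)
    return True
```

Calling this from a `finally` block (or a shell `trap ... EXIT` in the bash wrappers) means work survives even when the model stops without committing.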

## Tool-call truncation fix (2026-05-05)

**Symptom:** qwen (and other models on Alibaba/OR providers) returned `<400> InvalidParameter: function.arguments must be in JSON format` after `[turn N] out=4096 finish=tool_calls`; the tool call was never dispatched cleanly.

**Root cause:** `or-loop` had `max_tokens: 4096` hardcoded. A `write_file` payload >~3KB blew the cap → JSON tool-args got truncated mid-string → upstream provider rejected.

**Fix applied 2026-05-05:** raised default to 16384, env-overridable via `OR_LOOP_MAX_TOKENS`. See `/usr/local/bin/or-loop` line ~239.

**How to spot the failure pattern:** `out=4096` exactly + `finish=tool_calls` + next turn HTTP 400 with InvalidParameter. Don't switch model — bump cap.
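
The failure pattern above can be checked mechanically. A minimal sketch, assuming the loop has the turn's output-token count, finish reason, and raw tool-call arguments at hand; the helper names are hypothetical, and only `OR_LOOP_MAX_TOKENS` and the 16384 default come from this note:

```python
import json
import os

DEFAULT_MAX_TOKENS = 16384  # Default after the 2026-05-05 fix.


def max_tokens_from_env(env=os.environ) -> int:
    """Read the output-token cap, honoring the OR_LOOP_MAX_TOKENS override."""
    return int(env.get("OR_LOOP_MAX_TOKENS", DEFAULT_MAX_TOKENS))


def looks_truncated(out_tokens: int, finish_reason: str, max_tokens: int) -> bool:
    """Output hit the cap exactly while the model was emitting a tool call."""
    return finish_reason == "tool_calls" and out_tokens >= max_tokens


def tool_args_parse(arguments: str) -> bool:
    """True if the tool-call arguments are complete, valid JSON."""
    try:
        json.loads(arguments)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    cap = 4096  # The old hardcoded value.
    cut_off = '{"path": "a.py", "content": "def f():'  # Truncated mid-string.
    if looks_truncated(4096, "tool_calls", cap) and not tool_args_parse(cut_off):
        print("tool call truncated: raise OR_LOOP_MAX_TOKENS instead of switching model")
```

Validating the arguments locally before replaying them upstream turns the opaque HTTP 400 on the next turn into an immediate, explainable error.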