Multi-Agent Parallel Orchestrator

This skill turns multi-agent collaboration into a durable coordination board instead of a chat-memory workflow.

Current Capabilities

  • Dual coordination modes: shared-board for lower token cost and easier human handoff, centralized-dispatch for higher-throughput ranking and automation.
  • Topology-aware auto mode selection using optimize_for, agent_count, machine_count, window_count, and shared_storage.
  • While-loop execution: workspace and branch are confirmed once, and that confirmation is reused through --while-session.
  • Stable agent identity with agent_id, plus host_id and window_id tracking for multi-machine and multi-window teams.
  • SQLite-backed source of truth with coordination.db, exported state.json, and human-readable STATUS.md / AGENTS.md / SESSIONS.md.
  • Shared-board local projections now also include NOTIFICATIONS.md, so direct CLI users can see actionable board signals without running the dispatch service.
  • Dependency-aware ranking with priority, specialty, critical-path pressure, downstream pressure, and path-scope ownership checks.
  • Semi-automatic owned_paths inference from git changes, entry files, and workspace scan candidates.
  • Generated handoff bundles under project-plan/handoffs/ for cross-window takeover and review handoff.
  • Optional lightweight HTTP dispatch service through coordination_dispatch_server.py and coordination_dispatch_client.py, now running coordination commands in-process instead of shelling out to a new Python subprocess per request.
  • Structured dispatch APIs for next-action, claim-next, and claim-review-next, so centralized mode can use JSON scheduling calls without replacing shared-board CLI flows.
  • Structured session and lease APIs for centralized-dispatch, so service users can register windows, bind later mutations to a returned session_id, inspect SESSIONS.md lease state as JSON, renew leases safely, reap stale idle sessions, and reclaim stale work without going back through raw CLI text parsing.
  • Optional persisted event stream for centralized-dispatch, so service users can poll restart-safe coordination events without changing shared-board behavior.
  • Structured centralized notifications plus event-retention pruning, so service users can subscribe to actionable review or blocker signals by workers, reviewers, workers:<specialty>, reviewers:<specialty>, or agent:<id>, and keep coordination.db history bounded over long runs.
  • Event-retention policy is projected back into state.json, STATUS.md, and doctor output, so later agents can see the current pruning rules without inspecting server startup flags.
  • Shared-board direct CLI can now read local event history and prune it with update_coordination_status.py events and update_coordination_status.py prune-events, so event-backed workflows do not require centralized-dispatch.
  • Those local shared-board events now cover preflight, intake scan, validate / validate-repair, and normal board mutations, so the persisted timeline is useful even without the dispatch service.
  • The shared-board notifications command now also supports local subscription-style views filtered by target, specialty, kind, priority, and agent:<id>, so separate windows can poll just the reviewer or worker slice they care about.
  • The shared-board notifications and events commands now accept --watch-seconds, so a local window can long-poll for the next matching signal without needing centralized-dispatch.
  • Shared-board next-action, next-claimable, and next-reviewable now also support --watch-seconds, so idle windows can wait for the next actionable work item without busy looping.
  • Session registry is now promoted into state.json, coordination.db, and SESSIONS.md, so host/window/agent activity is visible as a first-class coordination surface.
  • Handoff inbox state is now projected into HANDOFFS.md and handoff-index.json, and shared-board users can list, claim, acknowledge, and watch the next ready handoff without the dispatch service.
  • Event cursors now let both local CLI users and centralized readers resume from the last seen event or notification, then acknowledge what they have consumed.
  • Event pruning now archives deleted rows into project-plan/archive/events/*.jsonl, so retention stays bounded without losing audit history.
  • Centralized-dispatch now exposes lightweight SSE feeds for /events/stream and /notifications/stream, plus read-only /event-archives and /mode-advisor.
  • State mutations now support optimistic concurrency through revision and --expected-revision, so stale writers can fail fast instead of overwriting newer board state.
  • Centralized lease mutations can now return and consume lease_token, so a reclaimed or re-claimed module can reject stale heartbeats and stale completion/release attempts instead of silently accepting an older holder.
  • claim, claim-next, start-review, and claim-review-next now auto-reclaim expired leases inside the same state mutation, so stale work can be taken over without a separate reclaim pass.
  • Empty-board reads no longer rewrite revision behind the scenes; read-only summary, state, and other load paths now keep optimistic-concurrency counters stable until a real state mutation occurs.
  • A stdio MCP entrypoint now exposes state, notifications, handoffs, dispatch, and generic coordination commands without creating a second coordination backend.
  • Intake-created tasks now split candidate_* dependency and ownership hints from confirmed depends_on / owned_paths, so rough scan suggestions stop over-serializing the board by default.
  • Preflight now supports safe, balanced, and throughput safety profiles, and projects the chosen policy into STATUS.md, state.json, and preflight.json.
  • Verification coverage across workflow tests, enhancement tests, shared-board smoke, centralized-dispatch smoke, and bundle verification.
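
The optimistic-concurrency behavior described above (revision plus --expected-revision) can be pictured with a small compare-and-swap sketch. This is only an illustration: the single-row `board` table, `open_board`, `mutate`, and `StaleRevisionError` are hypothetical names, and the real coordination.db schema is not shown in this README.

```python
import sqlite3


class StaleRevisionError(Exception):
    """Raised when the caller's expected revision no longer matches the board."""


def open_board() -> sqlite3.Connection:
    # Hypothetical single-row table; NOT the real coordination.db schema.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE board (id INTEGER PRIMARY KEY, payload TEXT, revision INTEGER)"
    )
    conn.execute("INSERT INTO board (id, payload, revision) VALUES (1, '{}', 0)")
    conn.commit()
    return conn


def current_revision(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT revision FROM board WHERE id = 1").fetchone()[0]


def mutate(conn: sqlite3.Connection, payload: str, expected_revision: int) -> int:
    """Write payload only if the stored revision still equals expected_revision."""
    cur = conn.execute(
        "UPDATE board SET payload = ?, revision = revision + 1 "
        "WHERE id = 1 AND revision = ?",
        (payload, expected_revision),
    )
    conn.commit()
    if cur.rowcount != 1:
        # Another writer bumped the revision first, so fail fast.
        raise StaleRevisionError(f"expected revision {expected_revision}")
    return expected_revision + 1
```

A writer that read revision 0 succeeds once; a second writer still holding revision 0 is rejected instead of overwriting the newer state, and has to re-read the board before retrying.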

Best Fit

  • One machine with multiple terminals, windows, or chat threads
  • Multiple machines sharing the same visible project-plan/ directory
  • Projects that need recoverable, auditable, handoff-friendly coordination

Current Boundaries

  • This is still not a full centralized scheduler.
  • The lightweight dispatch server is a thin coordination entrypoint, not an independent orchestration engine.
  • For long-running, high-concurrency, cross-machine setups, the next step is still a real central coordination service.

Quick Start

0. Bootstrap the board and optionally emit host-ready MCP config

python scripts/bootstrap_coordination.py --project-root "<project-root>"
python scripts/bootstrap_coordination.py --project-root "<project-root>" --emit-mcp-config

When --emit-mcp-config is used, the skill writes project-plan/mcp-host-config.json with a ready-to-import stdio MCP server entry for the current project root.
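
As a rough picture of what such an entry could look like, the sketch below builds a config dict in Python. The `command` / `args` layout follows the convention many MCP hosts use, and keying the entry by the folder name is a guess based on the description in this README; the file actually generated by bootstrap may differ. `build_mcp_host_config` is a hypothetical helper, not part of the skill.

```python
import json
from pathlib import PurePosixPath


def build_mcp_host_config(project_root: str) -> dict:
    """Return an assumed mcpServers entry for the given project root."""
    root = PurePosixPath(project_root)
    server_name = root.name  # assumption: entry is keyed by the folder name
    return {
        "mcpServers": {
            server_name: {
                # Assumed host convention: a command plus its argument list.
                "command": "python",
                "args": [
                    "scripts/coordination_mcp_server.py",
                    "--project-root",
                    str(root),
                ],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_mcp_host_config("/work/demo-app"), indent=2))
```

Compare the output with the generated project-plan/mcp-host-config.json before relying on any field name shown here.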

1. Run preflight once before a while session

python scripts/preflight_coordination.py --project-root "<project-root>" --confirm-workspace --confirm-branch --session-kind while-loop --mode auto --optimize-for token --agent-count 2 --window-count 2
python scripts/preflight_coordination.py --project-root "<project-root>" --confirm-workspace --confirm-branch --session-kind while-loop --mode auto --optimize-for token --safety-profile safe

For cross-machine work that should optimize for throughput:

python scripts/preflight_coordination.py --project-root "<project-root>" --confirm-workspace --confirm-branch --session-kind while-loop --mode auto --optimize-for efficiency --agent-count 6 --machine-count 2 --window-count 4 --cross-machine
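
When preflight is launched from a script rather than typed by hand, the topology flags can be assembled programmatically. The helper below is a hypothetical convenience wrapper (`build_preflight_argv` is not part of the skill); it only uses flags that appear in the commands above and makes no claim about how auto mode weighs them.

```python
from typing import List, Optional


def build_preflight_argv(
    project_root: str,
    optimize_for: str,
    agent_count: Optional[int] = None,
    machine_count: Optional[int] = None,
    window_count: Optional[int] = None,
    cross_machine: bool = False,
    safety_profile: Optional[str] = None,
) -> List[str]:
    """Assemble a preflight command line for a while-loop session in auto mode."""
    argv = [
        "python", "scripts/preflight_coordination.py",
        "--project-root", project_root,
        "--confirm-workspace", "--confirm-branch",
        "--session-kind", "while-loop",
        "--mode", "auto",
        "--optimize-for", optimize_for,
    ]
    # Only pass topology hints the caller actually knows.
    for flag, value in (
        ("--agent-count", agent_count),
        ("--machine-count", machine_count),
        ("--window-count", window_count),
    ):
        if value is not None:
            argv += [flag, str(value)]
    if safety_profile is not None:
        argv += ["--safety-profile", safety_profile]
    if cross_machine:
        argv.append("--cross-machine")
    return argv
```

The resulting list can be handed to `subprocess.run(argv, check=True)` from the skill's root directory.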

2. Initialize the first agent and seed intake

python scripts/update_coordination_status.py init-agent --project-root "<project-root>" --thread-key "worker-a" --role worker --while-session
python scripts/scan_project_intake.py --project-root "<project-root>" --thread-key "worker-a" --seed-tasks

3. Keep looping with next-action

python scripts/update_coordination_status.py next-action --project-root "<project-root>" --agent-id "<agent-id>" --specialty "backend"
python scripts/update_coordination_status.py next-action --project-root "<project-root>" --agent-id "<agent-id>" --specialty "backend" --watch-seconds 5
python scripts/update_coordination_status.py claim-next --project-root "<project-root>" --agent-id "<agent-id>" --reviewer-id "<reviewer-id>" --specialty "backend" --while-session
python scripts/update_coordination_status.py claim-review-next --project-root "<project-root>" --agent-id "<reviewer-id>" --while-session
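
The looping step can be sketched as a small driver that repeatedly asks for the next action and hands the raw output to a callback. This is a minimal sketch under stated assumptions: `worker_loop`, `next_action_argv`, `run`, and `handle` are hypothetical names, and because the output format of next-action is not documented here, the sketch passes the text through without parsing it.

```python
from typing import Callable, List


def next_action_argv(
    project_root: str, agent_id: str, specialty: str, watch_seconds: int = 5
) -> List[str]:
    """Build the next-action command using only flags shown in this README."""
    return [
        "python", "scripts/update_coordination_status.py", "next-action",
        "--project-root", project_root,
        "--agent-id", agent_id,
        "--specialty", specialty,
        "--watch-seconds", str(watch_seconds),
    ]


def worker_loop(
    run: Callable[[List[str]], str],
    handle: Callable[[str], bool],
    project_root: str,
    agent_id: str,
    specialty: str,
    max_rounds: int = 10,
) -> int:
    """Poll next-action until handle() returns False or max_rounds is reached.

    run executes a command and returns its output text; handle decides what to
    do with that text and returns True to keep looping.
    """
    rounds = 0
    while rounds < max_rounds:
        output = run(next_action_argv(project_root, agent_id, specialty))
        rounds += 1
        if not handle(output):
            break
    return rounds
```

A real `run` could be `lambda argv: subprocess.run(argv, capture_output=True, text=True, check=True).stdout`; injecting it keeps the loop testable without touching a board.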

4. Generate a handoff bundle

python scripts/update_coordination_status.py handoff-bundle --project-root "<project-root>" --module "<module>" --agent-id "<agent-id>"

The bundle is written to:

project-plan/handoffs/<module>.handoff.md
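
A script that wants to open the bundle after generating it can derive the location from the module name. The helper below is hypothetical and assumes that project-plan/ sits directly under the project root.

```python
from pathlib import PurePosixPath


def handoff_bundle_path(project_root: str, module: str) -> PurePosixPath:
    """Return the expected bundle location for a module (assumed layout)."""
    return (
        PurePosixPath(project_root)
        / "project-plan"
        / "handoffs"
        / f"{module}.handoff.md"
    )
```

For example, `handoff_bundle_path("/repo", "core-auth")` yields `/repo/project-plan/handoffs/core-auth.handoff.md`.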

Lightweight Centralized Entry Point

Start the dispatch server:

python scripts/coordination_dispatch_server.py --project-root "<project-root>" --host 127.0.0.1 --port 8765

Call it through the client:

python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" preflight --confirm-workspace --confirm-branch --session-kind while-loop --mode centralized-dispatch --optimize-for efficiency --agent-count 6 --machine-count 2 --window-count 4 --cross-machine
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" update summary
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" validate --repair
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" state
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" state format=summary
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" events since=0 limit=50
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" events-stream since=0 limit=20 timeout=5
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" notifications since=0 limit=50 target=reviewers:backend kind=review-ready
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" notifications-stream since=0 limit=20 timeout=5 target=reviewers:backend kind=review-ready
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" mode-advisor
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" prune-events --max-rows 2000
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" event-archives
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" sessions
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" register-session --thread-key "worker-a" --role worker --host-id machine-a --window-id window-1
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" lease-heartbeat --agent-id "<agent-id>" --session-id "<session-id>" --module "core-auth" --ttl-seconds 600
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" dispatch claim-next --agent-id "<agent-id>" --session-id "<session-id>" --specialty backend
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" lease-heartbeat --agent-id "<agent-id>" --session-id "<session-id>" --lease-token "<lease-token>" --module "core-auth" --ttl-seconds 600
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" reap-sessions --max-age-seconds 86400
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" lease-reclaim
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" dispatch next-action --agent-id "<agent-id>" --specialty backend
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" dispatch claim-next --agent-id "<agent-id>" --reviewer-id "<reviewer-id>" --specialty backend --while-session
python scripts/coordination_dispatch_client.py --server-url "http://127.0.0.1:8765" dispatch claim-next --agent-id "<agent-id>" --specialty backend --expected-revision 12

This does not replace shared-board. In local or lower-overhead runs, direct CLI remains first-class:

python scripts/update_coordination_status.py next-action --project-root "<project-root>" --agent-id "<agent-id>" --specialty "backend"
python scripts/update_coordination_status.py next-reviewable --project-root "<project-root>" --agent-id "<reviewer-id>" --specialty "backend" --watch-seconds 5
python scripts/update_coordination_status.py claim-next --project-root "<project-root>" --agent-id "<agent-id>" --reviewer-id "<reviewer-id>" --specialty "backend" --while-session
python scripts/update_coordination_status.py notifications --project-root "<project-root>"
python scripts/update_coordination_status.py notifications --project-root "<project-root>" --target "reviewers:backend" --since 0 --limit 20
python scripts/update_coordination_status.py notifications --project-root "<project-root>" --target "reviewers:backend" --kind "review-ready" --cursor-id "review-feed"
python scripts/update_coordination_status.py ack-notifications --project-root "<project-root>" --cursor-id "review-feed"
python scripts/update_coordination_status.py notifications --project-root "<project-root>" --target "agent:agent-1234abcd" --since 0 --limit 20
python scripts/update_coordination_status.py notifications --project-root "<project-root>" --target "reviewers:backend" --kind "review-ready" --since 0 --limit 20 --watch-seconds 5
python scripts/update_coordination_status.py events --project-root "<project-root>" --since 0 --limit 50 --notification-only --target "reviewers:backend"
python scripts/update_coordination_status.py events --project-root "<project-root>" --cursor-id "event-feed"
python scripts/update_coordination_status.py ack-events --project-root "<project-root>" --cursor-id "event-feed"
python scripts/update_coordination_status.py events --project-root "<project-root>" --source "validate" --since 0 --limit 20 --watch-seconds 5
python scripts/update_coordination_status.py handoffs --project-root "<project-root>"
python scripts/update_coordination_status.py next-handoff --project-root "<project-root>" --agent-id "<agent-id>" --watch-seconds 5
python scripts/update_coordination_status.py claim-handoff --project-root "<project-root>" --module "<module>" --agent-id "<agent-id>"
python scripts/update_coordination_status.py ack-handoff --project-root "<project-root>" --module "<module>" --agent-id "<agent-id>"
python scripts/update_coordination_status.py mode-advisor --project-root "<project-root>"
python scripts/update_coordination_status.py prune-events --project-root "<project-root>" --max-rows 2000
python scripts/update_coordination_status.py event-archives --project-root "<project-root>"

In other words:

  • shared-board: direct CLI is the normal path, with local notifications, cursor-aware event history, handoff inbox commands, pruning, archive inspection, and NOTIFICATIONS.md / HANDOFFS.md projections.
  • centralized-dispatch: service endpoints, structured dispatch, persisted event polling, SSE streams, notification polling, archive inspection, and mode advice become the higher-efficiency path.
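
For hosts that read the centralized SSE streams directly instead of going through the client script, a small parser is usually enough. The sketch below assumes the server emits standard text/event-stream framing (`id:`, `event:`, and `data:` fields, with a blank line ending each event); the payload carried in `data` is not documented here, so it is returned as a raw string. `parse_sse` is a hypothetical helper and simplifies the full SSE rules.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per SSE event from an iterable of text lines."""
    event: Dict[str, str] = {}
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if event or data_parts:
                if data_parts:
                    event["data"] = "\n".join(data_parts)
                yield event
            event, data_parts = {}, []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is stripped
        if field == "data":
            data_parts.append(value)
        elif field in ("event", "id"):
            event[field] = value
```

The function accepts any iterable of lines, so it can be fed from a decoded HTTP response body or from a list of strings in a test. An event that is not yet terminated by a blank line is not yielded.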

MCP Entry Point

Run the stdio MCP bridge when another host wants structured coordination tools:

python scripts/coordination_mcp_server.py --project-root "<project-root>"

Or generate a host-importable config directly from bootstrap:

python scripts/bootstrap_coordination.py --project-root "<project-root>" --emit-mcp-config

That writes:

project-plan/mcp-host-config.json

The file contains a ready mcpServers entry pointing at scripts/coordination_mcp_server.py with the current project root and folder name.

It exposes:

  • coordination_state
  • coordination_events
  • coordination_notifications
  • coordination_handoffs
  • coordination_sessions
  • coordination_mode_advisor
  • coordination_session_register
  • coordination_session_reap
  • coordination_lease_heartbeat
  • coordination_lease_reclaim
  • coordination_dispatch
  • coordination_command

For hosts that keep a long-lived worker or reviewer window, prefer the structured MCP or dispatch path that carries both session_id and lease_token. That gives the host an explicit session binding plus per-lease compare-and-swap style protection for renew/release/finish flows.
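
The lease_token protection can be illustrated with a small in-memory model. It is a sketch of the general compare-and-swap idea only: `LeaseTable` and `StaleLeaseError` are hypothetical names, expiry and TTL handling are left out, and the real service keeps lease state in coordination.db rather than in a Python dict.

```python
import secrets
from typing import Dict, Tuple


class StaleLeaseError(Exception):
    """Raised when a caller presents a token that is no longer current."""


class LeaseTable:
    def __init__(self) -> None:
        # module -> (agent_id, lease_token); hypothetical in-memory stand-in.
        self._leases: Dict[str, Tuple[str, str]] = {}

    def claim(self, module: str, agent_id: str) -> str:
        """Claim or re-claim a module and issue a fresh token.

        Expiry checks are omitted; this models a takeover after a lease lapsed.
        """
        token = secrets.token_hex(8)
        self._leases[module] = (agent_id, token)
        return token

    def heartbeat(self, module: str, agent_id: str, lease_token: str) -> None:
        """Accept a renewal only from the current holder with the current token."""
        if self._leases.get(module) != (agent_id, lease_token):
            raise StaleLeaseError(f"stale lease for {module}")

    def release(self, module: str, agent_id: str, lease_token: str) -> None:
        """Release the module, rejecting stale holders the same way."""
        self.heartbeat(module, agent_id, lease_token)
        del self._leases[module]
```

If agent-a claims a module, loses it, and agent-b re-claims it, a late heartbeat or release from agent-a carries the old token and is rejected rather than silently accepted.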

Verification

Run the full bundle check:

python scripts/verify_skill_bundle.py

It now covers:

  • test_coordination_workflow.py
  • test_coordination_enhancements.py
  • default shared-board smoke
  • centralized-dispatch + while-session + dispatch server/client smoke

References

  • SKILL.md
  • references/coordination-files.md
  • references/example-walkthrough.md
  • references/example-snapshots.md
  • references/troubleshooting.md
