DAO Proposals & Community

View active proposals, submit new ideas, and connect with the SWARMS community.

structs
agents

# Description

New `FuguAgent` class (`swarms/agents/fugu.py`) implementing the Fugu/Trinity orchestration pattern: a multi-agent system that presents itself as a single model API.

### Core Architecture

The `FuguAgent` coordinates a pool of worker agents through a dedicated coordinator model using **tool-calling** (not text parsing). At each step the coordinator calls the `decide_next_step` tool, committing to a structured `AgentTask {role, worker, instruction, visibility}`. The result is stored directly via closure capture, bypassing fragile history parsing.

**Key components:**

- **`decide_next_step` tool**: Function-call based orchestration. The coordinator decides role, worker, instruction, and visibility for each step and commits via the tool. No JSON text parsing required.
- **Dynamic roles**: Roles are not hardcoded. The coordinator assigns whichever role fits: `planner`, `researcher`, `coder`, `writer`, `verifier`, `reviewer`, `executor`, `summarizer`, etc.
- **Model capability ranking**: Workers are ranked by `MODEL_TIER` scores. The coordinator's system prompt lists workers by tier, ensuring the most powerful models handle the hardest subtasks.
- **`MemoryStore`**: SQLite-backed persistent memory across turns and sessions.
- **Visibility routing**: Each `AgentTask` specifies which prior step outputs (by index) the worker can see, implementing the Conductor's access-list pattern.
- **Chain-of-thought aggregation**: Final answer synthesized by passing all step outputs through the coordinator.

**Files changed:**

| File | Change |
|------|--------|
| `swarms/agents/fugu.py` | New: core FuguAgent implementation (380 LOC) |
| `swarms/agents/__init__.py` | Added `FuguAgent` export |
| `swarms/structs/swarm_router.py` | Added `"FuguAgent"` to `SwarmType` + `_create_fugu_agent()` |
| `examples/single_agent/fugu_example.py` | New: minimal usage example |

## Architecture

```mermaid
flowchart TD
    User(["User Task"]) --> Coordinator
    Coordinator -->|"decide_next_step()"| Tool
    Tool -->|"AgentTask JSON"| Holder
    Holder --> T1
    subgraph T1[" "]
        direction LR
        T2["_decide_holder read"]
    end
    T2 --> ExecStep["execute_step()"]
    ExecStep --> ExRole{role}
    ExRole -->|planner / researcher / coder / writer| Execute
    ExRole -->|verifier / reviewer| Verify
    ExRole -->|any| Context["build visibility context"]
    subgraph Worker_Pool["Worker Pool"]
        W1["[7] general (gpt-4o)"]
        W2["[5] coder (gpt-4o-mini)"]
        W3["[5] researcher (claude-sonnet)"]
    end
    Execute --> Context
    Context --> W1
    Context --> W2
    Context --> W3
    W1 --> R1[/Result/]
    W2 --> R2[/Result/]
    W3 --> R3[/Result/]
    R1 --> WS[WorkflowState]
    R2 --> WS
    R3 --> WS
    Verify --> VResult{ver.accept?}
    VResult -->|ACCEPT| Done
    VResult -->|REVISE| Coordinator
    WS --> Aggregator
    Done --> Aggregator["coordinator.aggregate()"]
    Aggregator --> Final(["Final Answer"])
    subgraph Mem["MemoryStore (SQLite)"]
        M1["per-turn artifacts"]
        M2["session context"]
    end
    WS -.-> M1
    M1 -.-> M2
```

### Execution Loop

```mermaid
sequenceDiagram
    participant User
    participant Coord as Coordinator
    participant Tool
    participant Fugu
    participant Worker
    participant Verifier
    User->>Fugu: run(task)
    loop max_turns
        Fugu->>Coord: coordinator.run(history_ctx + memory)
        Coord->>Tool: decide_next_step(role, worker, instruction, visibility)
        Tool->>Tool: store task in _decide_holder
        Tool-->>Coord: AgentTask JSON
        Coord-->>Fugu: run() returns
        Fugu->>Fugu: agent_task = _decide_holder.pop()
        alt agent_task.role in verifier / reviewer
            Fugu->>Verifier: run(accumulated_work)
            Verifier-->>Fugu: ACCEPT or REVISE + diagnosis
            Fugu->>Fugu: if accept: break
        else
            Fugu->>Worker: run(instruction + visibility_context)
            Worker-->>Fugu: result
        end
        Fugu->>Fugu: WorkflowState.results.append()
        Fugu->>Fugu: MemoryStore.save(turn)
    end
    Fugu->>Coord: coordinator.run(synthesis_prompt)
    Coord-->>Fugu: Final Answer
    Fugu-->>User: Final Answer
```

## Usage

```python
from swarms import FuguAgent

agent = FuguAgent(
    coordinator_model="gpt-4o-mini",
    max_turns=5,
    verbose=True,
)
result = agent.run("Write a short story about a robot discovering music.")
```

Workers are auto-detected from `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` / `GOOGLE_API_KEY`, or can be passed explicitly:

```python
from swarms import FuguAgent, Agent

agent = FuguAgent(
    workers=[
        Agent(agent_name="coder", model_name="gpt-4o"),
        Agent(agent_name="researcher", model_name="claude-sonnet-4-5"),
    ],
    max_turns=5,
)
```

Also available via `SwarmRouter`:

```python
from swarms import SwarmRouter, Agent

router = SwarmRouter(
    agents=[
        Agent(agent_name="a", model_name="gpt-4o"),
        Agent(agent_name="b", model_name="claude-sonnet-4-5"),
    ],
    swarm_type="FuguAgent",
    max_loops=5,
)
result = router.run("Write a story about a robot.")
```

## Issue

https://github.com/kyegomez/swarms/issues/1698

## Dependencies

None beyond existing swarms dependencies. No new packages required.

## Tag Maintainer

kye@swarms.world

## Twitter Handle

x.com/Euroswarms
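The closure-capture and visibility-routing ideas described under Core Architecture can be sketched without any LLM calls. This is a minimal illustration only: `make_decide_tool`, `build_visibility_context`, and the field types are hypothetical names chosen here, not the code in `swarms/agents/fugu.py`.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class AgentTask:
    """One coordinator decision; field names follow the PR description."""
    role: str
    worker: str
    instruction: str
    visibility: List[int] = field(default_factory=list)


def make_decide_tool() -> Tuple[Callable[..., str], List[AgentTask]]:
    """Return a tool function plus the holder list it writes into (hypothetical helper)."""
    holder: List[AgentTask] = []

    def decide_next_step(role: str, worker: str, instruction: str,
                         visibility: List[int]) -> str:
        # The closure stores the structured task directly, so the caller
        # never has to parse it back out of the chat history.
        holder.append(AgentTask(role, worker, instruction, list(visibility)))
        return f"scheduled {role} on {worker}"

    return decide_next_step, holder


def build_visibility_context(results: List[str], visibility: List[int]) -> str:
    """Concatenate only the prior step outputs the task is allowed to see."""
    visible = [f"[step {i}] {results[i]}" for i in visibility
               if 0 <= i < len(results)]
    return "\n".join(visible)


# Simulate one coordinator turn: the "model" calls the tool, the loop pops the task.
tool, holder = make_decide_tool()
tool("coder", "gpt-4o", "Implement the parser", [0])
task = holder.pop()
context = build_visibility_context(["plan: parse, then test", "notes"], task.visibility)
```

Popping the task from the holder after each coordinator call keeps one decision per turn, which matches the `_decide_holder.pop()` step in the sequence diagram.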

Proposed by IlumCI

## Description

Self-explanatory from the title; example file below. The implementation is near-identical to the streaming-callback functions in other modules.

### Issue: #1696

### Files changed:

- `/swarms/structs/groupchat.py`: the main file.
- `examples/multi_agent/groupchat/groupchat_streaming_example.py`: the example file.

Proposed by IlumCI

## Description

Fixed two bugs in `multi_agent_exec.py`.

- Replaced deprecated `asyncio.get_event_loop()` calls with `asyncio.get_running_loop()` and `asyncio.run()`. The old calls can raise `RuntimeError` in thread contexts on Python 3.10+.
- Added `per_task_timeout` to `run_agents_concurrently` so a single hung agent no longer blocks the entire batch indefinitely; timed-out agents return a `TimeoutError` in the result instead of propagating cancellation.

## Files Changed

`swarms/structs/multi_agent_exec.py`

## Issue

#1679

## Dependencies

No extra dependencies required.

## Maintainer

@kyegomez

## Twitter

@akc__2025

Proposed by adichaudhary

- Implement a simple agent version of Fugu from Sakana AI
- https://x.com/SakanaAILabs/status/2068861630327443966
- https://sakana.ai/fugu-release/
- https://sakana.ai/fugu/
- https://x.com/hardmaru/status/2068884466056225025

Proposed by kyegomez

## Description

Add `streaming_callback` support to `GroupChat` in `swarms/structs/groupchat.py`, following the same pattern used in `ConcurrentWorkflow`, `HierarchicalSwarm`, and other structs.

### Changes:

- `__init__` accepts `streaming_callback: Optional[Callable[[str, str, bool], None]] = None` for an instance-level callback
- `run()` and `run_batch()` accept a `streaming_callback` parameter for per-call override
- The per-call callback takes priority over the instance callback (method parameter > instance variable)
- Callback arguments, in order: `agent_name: str`, `content: str`, `is_final: bool`; the callback returns `None`
- `is_final=True` with empty content is sent for each agent when the conversation ends

## Issue

Fixes #1696

## Dependencies

None

## Tag maintainer

@kyegomez (swarms.structures)

## Twitter handle

x.com/Euroswarms
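The precedence rule and callback shape listed above can be illustrated in isolation. This is a rough sketch, not the `GroupChat` implementation: `resolve_callback` and `stream_reply` are hypothetical helpers, and the final marker is emitted right after one agent's chunks as a simplification.

```python
from typing import Callable, List, Optional, Tuple

# Callback shape from the PR description: (agent_name, content, is_final) -> None
StreamingCallback = Callable[[str, str, bool], None]


def resolve_callback(per_call: Optional[StreamingCallback],
                     instance: Optional[StreamingCallback]) -> Optional[StreamingCallback]:
    """Per-call callback wins; otherwise fall back to the instance-level one."""
    return per_call if per_call is not None else instance


def stream_reply(agent_name: str, chunks: List[str],
                 callback: Optional[StreamingCallback]) -> None:
    """Forward each chunk, then send an empty final marker (hypothetical helper)."""
    if callback is None:
        return
    for chunk in chunks:
        callback(agent_name, chunk, False)
    callback(agent_name, "", True)


events: List[Tuple[str, str, bool]] = []


def collect(agent_name: str, content: str, is_final: bool) -> None:
    """Example consumer that records every streamed event."""
    events.append((agent_name, content, is_final))


# The per-call callback is supplied, so it is the one that receives events.
stream_reply("writer", ["Hello", " world"], resolve_callback(collect, None))
```

A consumer can treat the `is_final=True` event as the signal to flush whatever it buffered for that agent.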

Proposed by IlumCI

## Description

As the title says: create a video tutorial on using the Swarms API in external apps (Hermes Agent, OpenClaw, OpenCode, Cursor AI) via the Swarms API's OpenAI-compatible endpoints. **The video should be 1-3 minutes long.**

### To-Do:

- **Capability:** Show off the capabilities of the Swarms API: what it can be used for, what it's best for, and how easy it is to use.
- **Advertisement:** Emphasize the freedom to use any LLM that ever existed (within the bounds of LiteLLM). Redirect the user to https://swarms.world/signin/ to sign up and create an API key to use those models.
- **Live Usage:** Record yourself integrating the Swarms API into one of the external systems/harnesses *mentioned above*, and show off its wonders on a random task.

### References to learn from:

- Swarms API Docs: https://docs.swarms.ai/docs/documentation/getting-started/quickstart
- Swarms API reference: docs.swarms.ai/api-reference/
- Swarms Agent/Swarm Architecture Docs: https://docs.swarms.world/api/agent

Proposed by IlumCI

## Description

Fix two issues in `swarms/structs/multi_agent_exec.py` surfaced in PERFORMANCE_AUDIT.md §3.9:

### Bug Fix: `asyncio.get_event_loop()` deprecation (Python 3.10+)

Calling `asyncio.get_event_loop()` without a running loop has been deprecated since Python 3.10, and it raises `RuntimeError` in threads that have no current event loop.

**Changes:**

- **`run_agent_async()`**: replaced `asyncio.get_event_loop()` with `asyncio.get_running_loop()`, since this function is always called from an async context
- **`run_agents_concurrently_multiprocess()`**: replaced manual `loop = asyncio.get_event_loop()` + `loop.run_until_complete(...)` with `asyncio.run(...)`, which properly creates, runs, and closes its own event loop for sync→async bridges

### Correctness Fix: Per-task timeout for concurrent fan-out

The concurrent fan-out path previously awaited all agent tasks without any timeout. A single hung agent (slow LLM, stalled tool call, network stall) would block the entire batch indefinitely.

**Changes:**

- Added `per_task_timeout: float | None = None` parameter to `run_agents_concurrently()`
- Applied `future.result(timeout=per_task_timeout)` in both the dict and list return paths
- Timeout exceptions are caught and included in results (not propagated), preventing one hung agent from crashing the batch
- Fully backward compatible: default `None` means no timeout (existing behavior unchanged)

## Issue

Fixes #1679. Surfaced in `PERFORMANCE_AUDIT.md §3.9`.

## Dependencies

None. No new dependencies required.

## Tests

Added a unit test suite at `tests/structs/test_multi_agent_exec.py` with **21 tests** covering:

| Test Class | Tests | What's Covered |
|---|---|---|
| `TestRunSingleAgent` | 2 | Basic execution, exception propagation |
| `TestRunAgentAsync` | 2 | Async context execution, verifies `get_running_loop()` is used |
| `TestRunAgentsConcurrentlyAsync` | 1 | `asyncio.gather` fan-out |
| `TestRunAgentsConcurrently` | 3 | List/dict return paths, no-timeout default |
| `TestRunAgentsConcurrentlyTimeout` | 4 | Timeout in list path, timeout in dict path, all-succeed-within-timeout, backward compat with `None` |
| `TestRunAgentsConcurrentlyMultiprocess` | 3 | Sync context execution, verifies `asyncio.run()` is used, batch processing |
| `TestBatchedGridAgentExecution` | 3 | Grid execution, mismatched lengths, exception capture |
| `TestRunAgentsWithDifferentTasks` | 3 | Pair execution, empty input, batch ordering |

All tests use lightweight `_FakeAgent` stubs (no real LLM calls) and run in ~13 seconds.

```
pytest tests/structs/test_multi_agent_exec.py -v
# 21 passed in 12.71s
```

Linting passes:

```
black --check --line-length 70 : Done
ruff check : Done
```

## Tag Maintainer

@kyegomez: this touches `swarms.structs` (multi-agent execution utilities)

## Backward Compatibility

All changes are fully backward compatible:

- Existing code calling these functions continues to work without modification
- Default behavior unchanged when `per_task_timeout` is not specified
- No API removals or signature-breaking changes
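The per-task timeout behavior described above (apply `future.result(timeout=...)` and keep the timeout as a result rather than raising) might look roughly like the following. This is a sketch under assumptions: `run_all` is a hypothetical stand-in, not `run_agents_concurrently()`, and plain callables stand in for agents.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Any, Callable, List, Optional


def run_all(callables: List[Callable[[], Any]],
            per_task_timeout: Optional[float] = None) -> List[Any]:
    """Run callables in threads; store exceptions (including timeouts) as results."""
    pool = ThreadPoolExecutor(max_workers=max(1, len(callables)))
    futures = [pool.submit(fn) for fn in callables]
    results: List[Any] = []
    for future in futures:
        try:
            # Each wait starts when the result is requested, so the worst
            # case is roughly per_task_timeout multiplied by the task count.
            results.append(future.result(timeout=per_task_timeout))
        except FutureTimeout as exc:
            results.append(exc)  # timed out: record it, do not propagate
        except Exception as exc:
            results.append(exc)  # the task itself failed
    # Threads cannot be killed; we simply stop waiting for stragglers.
    pool.shutdown(wait=False)
    return results


def fast() -> str:
    return "ok"


def slow() -> str:
    time.sleep(0.5)
    return "late"


outcomes = run_all([fast, slow], per_task_timeout=0.05)
```

With `per_task_timeout=None` the loop waits indefinitely on each future, which matches the backward-compatible default described in the PR.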

Proposed by dbosmrt

## Description:

Added `computer_use` module with 8 security-hardened tools for filesystem and shell operations, plus a `create_computer_use_tools()` factory function for easy setup. Tools include read_file, list_directory, grep_files, write_file, edit_file, patch_file, delete_file, and run_command.

- **Issue:** fixes #1691
- **Dependencies:** None
- **Tag maintainer:** kye@swarms.world
- **Twitter/X username:** https://x.com/Euroswarms

## Changes

### `swarms/tools/computer_use.py`

The module provides 8 security-hardened computer-use tools:

| Tool | Description |
|------|-------------|
| `read_file` | Read a file's contents with workspace restriction and NUL-byte rejection |
| `list_directory` | List directory entries with glob filtering and hidden file control |
| `grep_files` | Search for patterns in files using ripgrep (fallback to Python) |
| `write_file` | Write content to files atomically with backup support |
| `edit_file` | Replace specified text in files with match counting |
| `patch_file` | Single replacement alias for edit_file |
| `delete_file` | Delete files/directories with snapshot backup |
| `run_command` | Execute shell commands with binary allowlist and substring denylist |

Added `create_computer_use_tools()` factory function with:

- Auto-detection of workspace root (uses `COMPUTER_USE_WORKSPACE` env var, or auto-detects swarms root from `/examples/tools/`)
- Sensible default `WritePolicy` (`require_confirm=False`, `follow_symlinks="reject"`)
- `ShellPolicy` configured to allow the workspace as cwd
- Pre-configured tool wrappers with docstrings and type annotations

### `examples/tools/computer-use/computer-use-example.py`

New example demonstrating an agent using the computer-use toolkit to create and debug a PyTorch model. Simplified from manual tool configuration to a single call to `create_computer_use_tools()`.

## Usage

### Factory Setup (Recommended)

```python
from swarms import Agent
from swarms.tools.computer_use import create_computer_use_tools

tools = create_computer_use_tools()

agent = Agent(
    agent_name="MyAgent",
    model_name="gpt-4o",
    tools=list(tools.values()),
    max_loops=5,
)
agent.run("Write a hello world script to /tmp/hello.py")
```

### Individual Tool Imports

You can also import and use tools individually:

```python
from swarms.tools.computer_use import (
    read_file,       # Read file contents
    list_directory,  # List directory entries
    grep_files,      # Search for patterns in files
    write_file,      # Write content to files
    edit_file,       # Replace text in files
    patch_file,      # Single replacement
    delete_file,     # Delete files
    run_command,     # Execute shell commands
)
```

### With Custom Workspace

```python
from swarms import Agent
from swarms.tools.computer_use import create_computer_use_tools

tools = create_computer_use_tools(workspace_root="$HOME/user/workspace/myproject")

agent = Agent(
    agent_name="FileBot",
    model_name="gpt-4o",
    tools=list(tools.values()),
    max_loops=3,
)
```

### With Environment Variable

```bash
export COMPUTER_USE_WORKSPACE="$HOME/workspace/myproject"
```

```python
from swarms.tools.computer_use import create_computer_use_tools

tools = create_computer_use_tools()  # reads from COMPUTER_USE_WORKSPACE env var
```

### All Available Tools

```python
from swarms import Agent
from swarms.tools.computer_use import create_computer_use_tools

tools = create_computer_use_tools()

agent = Agent(
    agent_name="ComputerAgent",
    model_name="gpt-4o",
    tools=list(tools.values()),
    max_loops=10,
)

# The agent has access to:
# - read_file: Read a file's contents (read-only)
# - list_directory: List directory contents (read-only)
# - grep_files: Search for pattern in files (read-only)
# - write_file: Write content to a file (overwrite mode)
# - edit_file: Edit a file by replacing old text with new text
# - patch_file: Patch a file (single replacement)
# - delete_file: Delete a file
# - run_command: Run a shell command
```

### Example Task

```python
from swarms import Agent
from swarms.tools.computer_use import create_computer_use_tools

tools = create_computer_use_tools()

agent = Agent(
    agent_name="CodeDebugger",
    model_name="gpt-4o",
    system_prompt="""You are a code debugging agent. Fix bugs in the code at $HOME/api/swarms/buggy.py.
1. Read the file to understand the code
2. Identify bugs
3. Fix them with edit_file
4. Test with run_command
5. Repeat until working""",
    tools=list(tools.values()),
    max_loops=10,
)
result = agent.run("Fix all bugs in $HOME/swarms/buggy.py")
```

---

Maintainer responsibilities:

- General / Misc / if you don't know who to tag: kye@swarms.world
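Two of the properties in the tool table, workspace restriction and atomic writes, can be sketched with the standard library alone. This is an illustration under assumptions, not the code in `swarms/tools/computer_use.py`: `resolve_in_workspace` and `atomic_write` are hypothetical helper names.

```python
import os
import tempfile
from pathlib import Path


def resolve_in_workspace(path: str, workspace_root: Path) -> Path:
    """Resolve `path` and refuse anything that lands outside the workspace."""
    if "\x00" in path:
        raise ValueError("path contains a NUL byte")
    root = workspace_root.resolve()
    candidate = (root / path).resolve()  # collapses ".." and follows symlinks
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{candidate} is outside {root}")
    return candidate


def atomic_write(target: Path, content: str) -> None:
    """Write to a temp file in the same directory, fsync, then rename over the target."""
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name + ".tmp.")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, target)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


# Demo in a throwaway directory standing in for the workspace root.
workspace = Path(tempfile.mkdtemp())
target = resolve_in_workspace("notes/hello.txt", workspace)
target.parent.mkdir(parents=True, exist_ok=True)
atomic_write(target, "hello")
```

Resolving before the containment check is the important ordering: a path such as `../escape.txt` only reveals itself as outside the workspace after normalization.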

Proposed by IlumCI

# Description

> **TL;DR.** `swarms/tools/` is currently a pure schema/registry/decorator layer (18 files, ≈4.6K LoC, zero shell, zero filesystem primitives). Agents can call *whatever Python functions the user passes in*, with no allow/deny gating. Add a first-party, security-hardened computer-use toolkit (`read_file`, `write_file`, `edit_file`, `patch_file`, `run_command`, `list_directory`, `delete_file`, `grep_files`, `apply_unified_patch`) and a thin policy engine that wraps the two existing function-dispatch chokepoints.

---

## Why this request exists

I audited `swarms/tools/` end-to-end and confirmed:

- The directory contains **no** `subprocess`, `os.system`, `os.popen`, `eval`, `exec`, `open(...)`, `shutil.rmtree`, or shell primitives. Grep on the whole tree returns zero matches.
- Components like `BaseTool`, `parse_and_execute_json`, and `handoff_task` only **convert functions to/from JSON schemas and dispatch by name**. They're correct plumbing but unsafe by default: any callable the user registers (`os.system`, `shutil.rmtree`, `subprocess.Popen(...)`) runs unrestricted.
- The agent *does* invoke functions when given a well-formed `tool_calls` payload, but users have no canonical, safety-gated path for filesystem/shell access. They have to roll their own every time, and every roll differs.

So: the agents describe what they would do with a tool, because giving them one is currently a foot-gun. This issue proposes adding both the tools and the gates in one cohesive patch.

---

## Architecture

### Two chokepoints to gate

There are exactly **two** sites where `swarms/tools/` invokes a Python callable. Cover both, and the whole surface is covered:

| # | File:line | Site |
|---|---|---|
| A | `swarms/tools/base_tool.py:2766` | `_execute_single_function_call`, line `result = func(**arguments)` |
| B | `swarms/tools/tool_parse_exec.py:88` | `parse_and_execute_json`, line `result = function_dict[function_name](**parameters)` |

Both are wrapped by exactly one new line each:

```python
enforce(policy, ToolCallRequest.from_callable(func, arguments, source="api_response"))
```

### New computer-use tools

**Convention:** every tool is a thin wrapper around stdlib `pathlib` / `subprocess` / `unittest`-style utilities. Every tool carries a default `ToolPolicy`. Users can override per-instance.

Each tool is documented separately below, grouped into **three themes** (read-only / write / shell). Grouping is by safety profile: read-only tools share one policy surface, write tools share another with strict `workspace_root` enforcement, and `run_command` has its own shell-specific rules.

---

## Theme 1: Read-only filesystem tools (grouped)

These three have identical policy shape (workspace-rooted read, no confirm) and identical error handling. Grouped because they're shipped as one module and users almost always want them together.

### `read_file(path: str, encoding: str = "utf-8", offset: int = 0, limit: int | None = None) -> str`

- Wraps `pathlib.Path.read_text(encoding=encoding)`.
- Hard caps: `offset >= 0`, `limit <= 10_000_000` (10 MiB). Beyond cap, return `{...truncated, +N bytes...}` sentinel.
- Policy: `allow` iff `realpath(path).is_relative_to(workspace_root)`. Follows symlinks.
- Return: file contents as string. On `IsADirectoryError` → `"Error: <path> is a directory; use list_directory"`.

### `list_directory(path: str = ".", glob: str = "*", include_hidden: bool = False) -> list[dict]`

- Wraps `pathlib.Path.iterdir()` + `fnmatch.filter`.
- Returns `[{name, path, type: "file"|"dir"|"symlink", size_bytes}]`, with no file contents.
- `glob` whitelist: alphanumeric + `*?[]!` (rejects `..`, `/`, null bytes, regex metas).
- Policy: workspace-rooted same as `read_file`.

### `grep_files(pattern: str, path: str = ".", glob: str = "*", context_lines: int = 2, max_matches: int = 200) -> list[dict]`

- Subprocess to `rg` (ripgrep) if available, fallback to Python `re` over `pathlib.Path.rglob(glob)`.
- Subprocess path uses argv list (no shell), `timeout=10s`, `max_output_bytes=1 MiB`.
- Returns `[{path, line, match, context_before, context_after}]`.
- Pattern rejected if it contains NUL bytes; rejects `--` flag-injection in `glob` by rejecting a leading `-`.
- Policy: workspace-rooted; deny listed dirs at argv level: `{"/proc", "/sys", "/dev"}`.

**Default policy block (read theme):**

```python
ReadPolicy(
    require_cwd_under="/workspace",
    deny_paths={"/etc", "/root", "/home/*/.ssh", "/proc", "/sys", "/var/lib", "/boot", "/usr/lib", "/usr/lib64"},
    follow_symlinks=False,
    max_file_size_bytes=10 * 1024 * 1024,
    max_output_bytes=1 * 1024 * 1024,
)
```

---

## Theme 2: Write/edit filesystem tools

These need **confirmation by default** + workspace-root enforcement + atomic write semantics. Grouped because they share the same `WritePolicy` and atomic-write helper.

### `write_file(path: str, content: str, mode: Literal["fail", "overwrite", "append"] = "fail") -> dict`

- `mode="fail"`: refuse if file exists. Default.
- `mode="overwrite"`: requires explicit `confirm: True` passed in dict-position.
- `mode="append"`: append with O_APPEND; same policy as overwrite.
- Atomic write: write to `<path>.tmp.<pid>.<ts>`, `fsync`, `os.replace()`. No partial writes on parent crash.
- Rejects NUL bytes and invalid filename characters via `os.pathconf`/`pathlib`.

### `edit_file(path: str, old: str, new: str, expected_replacements: int | None = 1) -> dict`

- Reads file, counts occurrences of `old`; rejects unless count matches `expected_replacements` exactly.
- `expected_replacements=None` means "exactly 1"; refuse 0 matches and >1 matches.
- Writes via atomic helper above.
- `old` and `new` length-capped at 1 MiB each.

### `apply_unified_patch(unified_diff: str, base_path: str = ".") -> dict`

- Stdlib parse (split on `---` / `+++` / `@@` headers). No external deps.
- Refuses if any hunk targets a path outside `workspace_root` *after realpath resolution*.
- Apply order: hunks applied top-to-bottom, recorded in audit log; rollback by reverting to `.bak` snapshot taken before apply.
- Returns `{applied_files: [...], rejected_hunks: [...], snapshot_id}`.

### `delete_file(path: str, recursive: bool = False) -> dict`

- `recursive=False`: `Path.unlink()` only; on directory returns error unless `recursive=True`.
- `recursive=True`: snapshot subtree first, then `shutil.rmtree`. Confirms twice (dry-run summary, then re-confirm).
- Default `recursive=False`, `confirm=True` required.

### `patch_file(path: str, old: str, new: str) -> dict`

Convenience wrapper: a thin alias of `edit_file(..., expected_replacements=1)`. Provided so the LLM has a name that matches its mental model and so the OpenAI schema surface stays symmetric (every doc pairing naturally produces a `read_file`/`write_file`/`patch_file` triplet, even though under the hood `patch_file` is `edit_file`). Symbol-only alias with the same implementation.

**Default policy block (write theme):**

```python
WritePolicy(
    require_cwd_under="/workspace",
    require_confirm=True,  # default off only for trusted agents
    mode_default="fail",
    atomic=True,
    max_file_size_bytes=10 * 1024 * 1024,
    backup_on_overwrite=True,
    backup_dir=".swarm_backups",
    follow_symlinks="reject",  # never follow write through a symlink
)
```

---

## Theme 3: Shell tooling

### `run_command(argv: list[str], cwd: str = "/workspace", env: dict | None = None, timeout: int = 30, stdin: str | None = None) -> dict`

- **Always** `subprocess.run(argv, shell=False, check=False, timeout=timeout, capture_output=True, text=True)`. No exceptions. The whole point of this tool is to refuse any code path that could ever invoke a shell.
- `argv` validated:
  - Length in `[1, 256]`.
  - Each token ≤ 4096 bytes.
  - No NUL bytes anywhere.
  - The first token (binary name) is checked against an **allowlist**. A sensible default is `{"ls","cat","grep","rg","find","head","tail","wc","sort","uniq","tr","cut","paste","sed","awk","git","npm","node","python","python3","pip","pip3","pytest","cargo","rustc","go","make","cmake","gcc","clang","curl","wget","jq","tar","gzip","gunzip","zip","unzip","ssh","scp","rsync","ps","top","htop","df","du","free","uname","whoami","id","env","printenv","date","cal","bc","diff","patch"}` plus per-agent additions.
  - Argv-pattern denylist (substring match on the joined argv): `{"rm -rf", "mkfs", "dd if=", ">/dev/sd", "chmod -R 777", ":(){:|:", "curl ... | sh", "wget ... | sh", ">/etc/", "/etc/passwd", "/etc/shadow", "/root/.ssh", "ssh-keygen", "iptables", "ufw ", "systemctl ", "service "}`.
- `cwd` policy: must `realpath(cwd).is_relative_to(workspace_root)` *unless* `cwd` is on a small explicit allowlist (e.g. `/tmp` for ephemeral work).
- Per-call resource cap:
  - `timeout ∈ [1, 600]` (default 30).
  - `max_output_bytes = 1 MiB` per stream (stdout and stderr separately; truncate beyond with sentinel `{... truncated at 1 MiB, see redirect files ...}`).
  - Kills the process group on timeout (`start_new_session=True`, then `os.killpg(SIGKILL)`).
- `env` policy: if `None`, inherits `os.environ` minus a denylist (`{"AWS_SECRET_ACCESS_KEY","GITHUB_TOKEN","ANTHROPIC_API_KEY","OPENAI_API_KEY",...}`). If provided, **only** keys in the agent's `allowed_env_keys` set are kept.
- `stdin` policy: length-capped at 64 KiB; never logged; HMAC'd into audit record only.
- Returns `{"argv": [...], "returncode": int, "stdout": str, "stderr": str, "duration_ms": int, "truncated": {"stdout": bool, "stderr": bool}}`.
- Refuses to run if any token matches the denylist: `SafetyError` is raised **before** `subprocess.run` is called.

**Default policy block (shell theme):**

```python
ShellPolicy(
    shell=False,  # never
    binary_allowlist=DEFAULT_BINARY_ALLOWLIST,
    argv_substring_denylist=DEFAULT_DENYLIST,
    cwd_under="/workspace",
    cwd_extra={"/tmp"},
    timeout_default=30,
    timeout_max=600,
    max_stdout_bytes=1 * 1024 * 1024,
    max_stderr_bytes=1 * 1024 * 1024,
    max_stdin_bytes=64 * 1024,
    kill_process_group_on_timeout=True,
    redact_stdin_in_logs=True,
    blocked_env_keys={...25 common secret keys...},
)
```

---

## Cross-cutting requirements (apply to **all** tools)

These belong at the policy-engine level, not per-tool. They're a contract for the whole feature.

1. **Symlink hard-stop on writes.** Write/edit/delete operations: refuse if any path component (including the leaf) is a symlink and `follow_symlinks == "reject"`.
2. **Path canonicalization.** Realpath resolved before allow/deny check on every FS tool, no exception.
3. **NUL byte rejection** in every string argument, no exception.
4. **Arg size caps** per tool (configurable; defaults listed above).
5. **Concurrency control.** `BaseTool._execute_function_calls_parallel` is wrapped in a `threading.Semaphore` whose counter distinguishes FS ops from shell ops (defaults: 4 FS, 2 shell).
6. **Confirmation flow.** For tools whose policy has `require_confirm=True`, `enforce()` raises `ConfirmationRequired(request)`. The Agent orchestrator handles this by pre-approving policies globally or asking the user. No silent confirmations.
7. **Audit log.** Per-call HMAC-chained JSONL record, written to `~/.swarms/audit/<agent_name>.jsonl`. Opt-in via `audit_log=True`. Records: timestamp, tool qualname, args hash, args preview (with redactions), decision (`allow|deny|confirm`), duration, returncode/filesize/error.
8. **`redaction.filter()`** runs over every record before disk write. Keys marked `redact_secrets=...` are replaced with a `sha256[:12]` prefix and never logged raw.
9. **`tool_parse_exec.parse_md=True`** blocks-and-executes each fenced code block independently, and that is now gated too. Each block goes through `enforce()` before execution; a denied block short-circuits the rest with a structured error in the response.
10. **`create_agent_tool.lru_cache(128)`** is preserved but adds `confirmation_nonce` to the cache key. Replay requires the same nonce, breaking the silent-cache-poisoning surface.
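The validate-then-run order proposed for `run_command` can be sketched as follows. This is a minimal illustration under assumptions: the allowlist and denylist are small subsets chosen for the demo, `validate_argv` is a hypothetical helper, and the `cwd`, `env`, output-cap, and process-group rules from the proposal are left out.

```python
import subprocess
import sys
from typing import Dict, List, Set

# Small illustrative subsets of the defaults proposed above.
ALLOWED_BINARIES: Set[str] = {"ls", "cat", "git", "python3"}
DENIED_SUBSTRINGS: Set[str] = {"rm -rf", "mkfs", "dd if=", "/etc/passwd"}


class SafetyError(Exception):
    """Raised before any process is started."""


def validate_argv(argv: List[str], allowlist: Set[str]) -> None:
    if not 1 <= len(argv) <= 256:
        raise SafetyError("argv length must be in [1, 256]")
    for token in argv:
        if "\x00" in token:
            raise SafetyError("NUL byte in argv")
        if len(token.encode("utf-8")) > 4096:
            raise SafetyError("argv token exceeds 4096 bytes")
    if argv[0] not in allowlist:
        raise SafetyError(f"binary not allowed: {argv[0]}")
    joined = " ".join(argv)
    for pattern in DENIED_SUBSTRINGS:
        if pattern in joined:
            raise SafetyError(f"denied pattern: {pattern}")


def run_command(argv: List[str], timeout: int = 30,
                allowlist: Set[str] = ALLOWED_BINARIES) -> Dict[str, object]:
    validate_argv(argv, allowlist)  # all checks happen before subprocess.run
    completed = subprocess.run(argv, shell=False, check=False, timeout=timeout,
                               capture_output=True, text=True)
    return {"argv": argv, "returncode": completed.returncode,
            "stdout": completed.stdout, "stderr": completed.stderr}


# Demo: allow only the current interpreter so the example runs anywhere.
result = run_command([sys.executable, "-c", "print('hi')"],
                     allowlist={sys.executable})
```

Substring denylists are easy to evade with different spacing or quoting, so in a design like this the allowlist and `shell=False` carry most of the weight and the denylist acts as a second check.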

Proposed by IlumCI

## Description

Adds `TournamentSwarm`, a new multi-agent structure where N candidate agents answer the task independently and a judge eliminates answers through head-to-head matches until one survives. Every judgment is a two-way comparison - a far easier task for a judge model than absolute scoring of N long-form answers - at O(N) comparisons.

```python
from swarms import Agent, TournamentSwarm

swarm = TournamentSwarm(
    agents=[...],                   # candidate generators
    judge=Agent(...),               # optional pairwise comparator
    bracket="single-elimination",   # or "swiss"
)
result = swarm.run("Write the strongest possible launch announcement.")
```

The judge picks each winner via a forced `pick_winner(winner, reasoning)` function call (same function-caller pattern as `MultiAgentRouter`/`HeavySwarm`), so verdicts are structured, never free-text. Single-elimination gives byes to top seeds; Swiss runs `ceil(log2(N))` rounds with rematch avoidance and head-to-head tie-breaks. Failed candidates or judge calls are logged and handled gracefully - a flaky call never aborts the tournament. The full bracket is available as metadata via `get_bracket()`, and `batch_run(tasks)` is supported.

## Files Changed

* `swarms/structs/tournament_swarm.py`
* `swarms/structs/__init__.py`
* `docs/MULTI_AGENT_STRUCTURES.md`

## Issue

#1683

## Dependencies

No extra dependencies required.

## Maintainer

@kyegomez

## Twitter

[@akc__2025](https://x.com/akc__2025)
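The O(N) claim for single elimination can be checked with a small bracket simulation. This is a sketch under assumptions, not the `TournamentSwarm` code: the judge is a plain function standing in for the `pick_winner` call, and the first entry in each round is treated as the top seed that receives the bye.

```python
from typing import Callable, List, Tuple

# Hypothetical judge signature: return 0 if the first answer wins, 1 otherwise.
Judge = Callable[[str, str], int]


def single_elimination(candidates: List[str], judge: Judge) -> Tuple[str, int]:
    """Return the surviving answer and the number of pairwise comparisons made."""
    if not candidates:
        raise ValueError("need at least one candidate")
    survivors = list(candidates)
    comparisons = 0
    while len(survivors) > 1:
        next_round: List[str] = []
        if len(survivors) % 2 == 1:
            # Odd field: the first (top-seeded) entry advances on a bye.
            next_round.append(survivors[0])
            pool = survivors[1:]
        else:
            pool = survivors
        for i in range(0, len(pool), 2):
            first, second = pool[i], pool[i + 1]
            next_round.append(first if judge(first, second) == 0 else second)
            comparisons += 1
        survivors = next_round
    return survivors[0], comparisons


def longer_wins(first: str, second: str) -> int:
    """Toy judge: prefer the longer answer."""
    return 0 if len(first) >= len(second) else 1


answers = ["ok", "a longer answer", "mid one", "tiny", "the longest answer of all"]
champion, matches = single_elimination(answers, longer_wins)
```

Every match eliminates exactly one answer, so a field of N always needs N - 1 comparisons, however the byes fall.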

Proposed by adichaudhary

## Description

Workers in `MixtureOfAgents` were re-receiving the full growing conversation transcript on every layer. On layer 0 workers now get only the original task; on layer 1+ they get the task + the previous layer's concatenated outputs. The aggregator still receives the full conversation. This cuts worker token cost from O(history × workers × layers) to O(1 layer of outputs × workers × layers).

## Files Changed

- `swarms/structs/mixture_of_agents.py`

## Issue

#1678

## Dependencies

No extra dependencies required.

## Maintainer

@kyegomez

## Twitter

[@akc__2025](https://x.com/akc__2025)
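The layer-by-layer context rule described above can be sketched with plain callables in place of agents. The names `worker_prompt` and `run_layers` are hypothetical and the prompt wording is an assumption; the point is only that worker prompt size stops growing after layer 1.

```python
from typing import Callable, List, Tuple

Worker = Callable[[str], str]  # stand-in for an agent's run(prompt) -> output


def worker_prompt(task: str, layer: int, previous_outputs: List[str]) -> str:
    """Layer 0 sees only the task; later layers also see the previous layer's outputs."""
    if layer == 0 or not previous_outputs:
        return task
    joined = "\n\n".join(previous_outputs)
    return f"{task}\n\nPrevious layer outputs:\n{joined}"


def run_layers(task: str, workers: List[Worker], layers: int) -> Tuple[List[str], List[int]]:
    """Return the full transcript (for the aggregator) and each layer's prompt size."""
    transcript: List[str] = [task]
    previous: List[str] = []
    prompt_sizes: List[int] = []
    for layer in range(layers):
        prompt = worker_prompt(task, layer, previous)
        prompt_sizes.append(len(prompt))
        previous = [worker(prompt) for worker in workers]
        transcript.extend(previous)  # only the aggregator reads the whole transcript
    return transcript, prompt_sizes


workers: List[Worker] = [lambda prompt: "A", lambda prompt: "B"]
transcript, sizes = run_layers("task", workers, layers=3)
```

The transcript keeps growing with each layer while the worker prompt stays bounded by one layer of outputs, which is the cost reduction the PR describes.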

Proposed by adichaudhary

## Description

Only rebuild the Rich panel that actually changed on each dashboard tick, instead of reconstructing the full layout tree. Added a `_refresh_section()` helper and cached the `Layout` object in `start()` so each `update_*` method touches only its own section.

## Files Changed

- `swarms/utils/hierarchical_swarm_dashboard.py`

## Issue

#1677

## Dependencies

No extra dependencies required.

## Maintainer

@kyegomez

## Twitter

[@akc__2025](https://x.com/akc__2025)

Proposed by adichaudhary

## Summary

Two issues in `multi_agent_exec.py`:

### 1. `asyncio.get_event_loop()` is deprecated / broken (bug)

- Lines: `227`, `232`
- `asyncio.get_event_loop()` has been deprecated since Python 3.10 when there is no running loop, and it raises in thread contexts. Calling this from a worker thread or from a sync context on 3.10+ either issues a `DeprecationWarning` or fails outright with `RuntimeError: There is no current event loop in thread`.

**Fix**: use `asyncio.run(...)` for one-shot sync→async bridges, or `asyncio.new_event_loop()` + explicit `loop.run_until_complete(...)` + `loop.close()` when a reusable loop is required.

### 2. No per-task timeout in the concurrent fan-out path (correctness)

- Lines: `152-160`
- The concurrent-fanout path awaits all agent tasks without a per-task timeout. A single hung agent (slow LLM, stalled tool call, network stall) blocks the entire batch indefinitely.

**Fix**: accept a `per_task_timeout: float | None` parameter; wrap each task in `asyncio.wait_for(task, timeout=per_task_timeout)`; collect timeouts as a typed failure result rather than propagating the cancellation up.

## Location

- File: `swarms/structs/multi_agent_exec.py`
- Lines: `152-160`, `227`, `232`

## Source

Surfaced in `PERFORMANCE_AUDIT.md` §3.9.
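Both suggested fixes can be shown together in a short asyncio sketch. The names `TaskTimeout`, `gather_with_timeout`, and `run_sync` are hypothetical and this is not the code in `multi_agent_exec.py`; it only illustrates `asyncio.wait_for` per task plus `asyncio.run` as the one-shot sync bridge.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, List, Optional


@dataclass
class TaskTimeout:
    """Typed failure result recorded instead of propagating the cancellation."""
    index: int
    timeout: Optional[float]


async def gather_with_timeout(tasks: List[Awaitable[Any]],
                              per_task_timeout: Optional[float] = None) -> List[Any]:
    async def guarded(index: int, awaitable: Awaitable[Any]) -> Any:
        try:
            return await asyncio.wait_for(awaitable, timeout=per_task_timeout)
        except asyncio.TimeoutError:
            return TaskTimeout(index, per_task_timeout)

    # gather preserves input order, so result i belongs to task i.
    return await asyncio.gather(*(guarded(i, t) for i, t in enumerate(tasks)))


async def fast() -> str:
    return "ok"


async def slow() -> str:
    await asyncio.sleep(1.0)
    return "late"


def run_sync(per_task_timeout: Optional[float]) -> List[Any]:
    """One-shot sync-to-async bridge: asyncio.run creates and closes its own loop."""
    async def main() -> List[Any]:
        return await gather_with_timeout([fast(), slow()], per_task_timeout)

    return asyncio.run(main())


results = run_sync(0.05)
```

Because every task has its own `wait_for`, the timeouts run concurrently and the batch finishes in about one timeout interval rather than one per task.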

Proposed by kyegomez

## Summary

`mixture_of_agents.py:171-182` re-passes `full_context = conversation.get_str()` to every worker on every layer. The full transcript is the dominant cost: a 2-layer run with 100 workers over a 20 KB history passes ~2 MB of context per layer.

## Impact

- **Latency**: workers wait on a much larger prompt than they need.
- **Token cost**: paying to re-send already-summarised content.
- **Quality**: workers can drift on stale earlier turns instead of focusing on the current task.

Combined with §1.1 of the audit (`Conversation.return_history_as_string()` rebuilt O(n) times per loop), this is the largest single hot path in MoA execution.

## Suggested fix

Pass to each worker:

1. The original `task`
2. The **previous-layer aggregated output** (already a synthesised summary by construction)

Skip the full transcript. The aggregator can still see the full thread if needed; workers do not.

## Location

- File: `swarms/structs/mixture_of_agents.py`
- Lines: `171-182`

## Source

Surfaced in `PERFORMANCE_AUDIT.md` §1.1 and §3.6.

Proposed by kyegomez

## Summary

`hiearchical_swarm.py` rebuilds **all** Rich panels on every status tick (`282-320`, `450-456`). Only the panel whose status actually changed needs to refresh.

## Impact

Each tick wastes CPU and causes flicker. The cost scales linearly with agent count × tick rate, and it becomes the dominant cost when running a HierarchicalSwarm with many workers and a fast refresh interval.

## Suggested fix

- Track per-panel dirty state; re-render only the panels whose status changed since the last tick.
- Keep a static `Group`/`Layout` of panels; mutate the contents of the changed one in place rather than rebuilding the whole panel tree.

## Location

- File: `swarms/structs/hiearchical_swarm.py`
- Lines: `282-320`, `450-456`

## Source

Surfaced in `PERFORMANCE_AUDIT.md` §3.5.
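The dirty-state idea can be sketched without Rich at all. In the example below, `PanelCache` is a hypothetical name and the `render` callable stands in for whatever builds a Rich panel in the real dashboard:

```python
from typing import Callable, Dict, List


class PanelCache:
    """Caches rendered panels and re-renders only those whose status changed."""

    def __init__(self, render: Callable[[str, str], str]) -> None:
        self._render = render
        self._status: Dict[str, str] = {}
        self.panels: Dict[str, str] = {}

    def update(self, statuses: Dict[str, str]) -> List[str]:
        """Apply one status tick; return the names of the panels re-rendered."""
        dirty = [
            name
            for name, status in statuses.items()
            if self._status.get(name) != status
        ]
        for name in dirty:
            self._status[name] = statuses[name]
            # Only the changed entry is replaced; other panels are untouched.
            self.panels[name] = self._render(name, statuses[name])
        return dirty
```

Per-tick work then scales with the number of agents that changed state rather than with the total agent count.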

Proposed by kyegomez

Catches individual agent failures and continues execution instead of aborting the whole run. Fixes #1613

Proposed by Oxygen56
tests

Summary:

- Add or tighten focused edge-case tests or type assertions in `tests/test_cli.py` and `tests/structs/test_moa.py` related to Python typing, tests, CLI ergonomics, and observability; avoid docs-only changes and broad refactors.
- Keep the change narrow so it is straightforward to review.

Notes:

- I kept this scoped to the relevant implementation and tests.

Proposed by lphuc2250gma

## What

Fixes #1553. When the judge rejects a director's plan in HierarchicalSwarm, the swarm now performs mid-flight replanning instead of blindly looping the same plan.

## Problem

Currently, when `agent_as_judge=True` and the judge rejects a plan, the swarm just feeds the raw feedback into the next loop iteration. The director receives unstructured context and typically produces the same plan again, leading to wasted iterations.

## Solution

### 1. Rejection Detection (`_is_judge_rejection`)

Detects rejection signals in both structured dicts and free-text feedback:

- Dict: `{"status": "REJECTED"}` / `"PARTIAL"` / `"REVISION_REQUIRED"`
- Text: Contains "REJECTED", "REVISION_REQUIRED", or "PLAN_FAILED"

### 2. Preserved Outputs

Approved subtask outputs are preserved across replanning iterations, so completed work is not redone.

### 3. Replanning Context

When a rejection is detected, the next loop's task includes:

- Judge's feedback (why the plan was rejected)
- Preserved outputs from approved subtasks
- Explicit "REPLAN REQUIRED" instruction
- Instruction NOT to redo completed subtasks

### 4. Dashboard Integration

Director status shows "REPLANNING" during replanning phases.

## Usage

```python
swarm = HierarchicalSwarm(
    agents=[...],
    agent_as_judge=True,
    judge_agent_model_name="gpt-5.4",
    max_loops=3,  # Allow up to 3 plan iterations
)
```
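A minimal sketch of the rejection check described under "Rejection Detection", written as a standalone function. The real `_is_judge_rejection` is not shown in this proposal, so the structure here is an assumption; upper-casing the input before matching is also an assumption.

```python
from typing import Any

# Signals listed in the proposal text.
REJECTION_STATUSES = {"REJECTED", "PARTIAL", "REVISION_REQUIRED"}
REJECTION_MARKERS = ("REJECTED", "REVISION_REQUIRED", "PLAN_FAILED")


def is_judge_rejection(feedback: Any) -> bool:
    """Return True when judge feedback signals that the plan was rejected."""
    if isinstance(feedback, dict):
        # Structured feedback: look at the "status" field only.
        return str(feedback.get("status", "")).upper() in REJECTION_STATUSES
    if isinstance(feedback, str):
        # Free-text feedback: plain substring match on the known markers.
        text = feedback.upper()
        return any(marker in text for marker in REJECTION_MARKERS)
    return False
```

Substring matching is deliberately crude: free text such as "not rejected" would also match, which is one reason the structured dict form is the safer signal.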

Proposed by Oxygen56

## What

Fixes #1554. Workers in HierarchicalSwarm calling external tools (MCP, browsers, shell) can hang indefinitely, blocking the entire plan. This adds per-worker timeout, automatic retry, and failure handling.

## Changes

- **worker_timeout** (default 300s): per-worker execution timeout
- **heartbeat_interval** (default 30s): heartbeat check interval
- **max_retries** (default 2): retries before marking FAILED

On timeout: task is automatically resubmitted to the executor. After max_retries: worker marked FAILED with structured error dict. Failed workers no longer block sibling workers.

## Backward Compatibility

Fully backward compatible: all new parameters have sensible defaults.
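The timeout-and-retry behavior can be illustrated with a small thread-pool sketch. `run_worker` and the shape of the result dict are hypothetical, the heartbeat check is omitted, and reading `max_retries=2` as "one initial attempt plus two retries" is an assumption about the PR's semantics.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Any, Callable, Dict


def run_worker(
    fn: Callable[[str], Any],
    task: str,
    worker_timeout: float = 300.0,
    max_retries: int = 2,
) -> Dict[str, Any]:
    """Run fn(task) with a per-attempt timeout, resubmitting on timeout."""
    attempts = max_retries + 1
    # One pool slot per attempt: a hung attempt keeps its thread busy.
    pool = ThreadPoolExecutor(max_workers=attempts)
    try:
        for _ in range(attempts):
            future = pool.submit(fn, task)
            try:
                return {"status": "OK", "output": future.result(timeout=worker_timeout)}
            except FutureTimeout:
                continue  # resubmit the task
        return {"status": "FAILED", "error": "timeout", "attempts": attempts}
    finally:
        # Do not wait for hung threads; Python cannot kill them.
        pool.shutdown(wait=False)
```

A timed-out thread keeps running in the background, so this pattern frees the caller but does not reclaim the worker's resources; a process-based executor would be needed for a hard kill.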

Proposed by Oxygen56

### Summary

Tool functions with an `Optional`/`Union` parameter, or with `*args`/`**kwargs`, currently crash OpenAI-schema generation. This fixes both, plus two smaller correctness issues in the same two helpers, and adds a regression test.

### Problems

1. `function_to_str` raises `KeyError('type')` for any `Optional`/`Union` parameter. Such a parameter's JSON schema uses `anyOf` and has no top-level `type`, but the formatter indexes `details['type']` directly.
2. `get_openai_function_schema_from_func` raises `TypeError` for a function with unannotated `*args`/`**kwargs`, because variadics are included in the missing-annotation check.
3. Annotated `*args`/`**kwargs` are emitted as scalar `args`/`kwargs` properties and marked required, which is not a correct OpenAI tool schema.
4. Default detection uses `==`/`!=` against `inspect.Signature.empty`; a default object whose `__eq__` returns a non-bool is misclassified.

### Reproduction (before this PR)

```python
from typing import Optional

from swarms.tools.py_func_to_openai_func_str import get_openai_function_schema_from_func
from swarms.tools.func_to_str import function_to_str


def lookup(name: str, title: Optional[str] = None) -> str:
    return name


schema = get_openai_function_schema_from_func(lookup, name="lookup", description="x")
print(function_to_str(schema["function"]))  # KeyError: 'type'
```

```python
def search(*args, **kwargs):
    return ""


get_openai_function_schema_from_func(search, name="search", description="x")  # TypeError
```

### Fix

- `func_to_str.function_to_str`: when a property has no top-level `type`, derive a readable type string from its `anyOf`/`oneOf` members (falling back to `any`).
- `py_func_to_openai_func_str`: exclude `VAR_POSITIONAL`/`VAR_KEYWORD` parameters from the required-params and missing-annotation passes, and compare defaults to `inspect.Signature.empty` by identity.

### Tests

Adds `tests/tools/test_schema_generation_robustness.py` covering an Optional parameter, unannotated variadics, and a non-bool-`__eq__` default. The tests fail on master and pass with this change. No model or network is required.

### Notes

Minimal, no behavior change for already-working tools, no new dependency.
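The two fixes can be illustrated in isolation with the standard library. The helper names `readable_type` and `required_params` are hypothetical and the `" | "` separator is an assumption; this is not the patched swarms code, only the technique it describes.

```python
import inspect
from typing import Any, Callable, Dict, List


def readable_type(details: Dict[str, Any]) -> str:
    """Derive a type string for a JSON-schema property, tolerating anyOf/oneOf."""
    if "type" in details:
        return str(details["type"])
    members = details.get("anyOf") or details.get("oneOf") or []
    names = [m["type"] for m in members if isinstance(m, dict) and "type" in m]
    return " | ".join(names) if names else "any"


def required_params(func: Callable[..., Any]) -> List[str]:
    """List parameters with no default, skipping *args and **kwargs."""
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    return [
        name
        for name, param in inspect.signature(func).parameters.items()
        # Identity check: never calls a custom __eq__ on the default object.
        if param.kind not in variadic and param.default is inspect.Signature.empty
    ]
```

The identity comparison matters for defaults such as NumPy arrays, whose `==` returns an array rather than a bool.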

Proposed by huntmythos
documentation
tests
prompts
structs

## Summary

- Renamed `hiearchical_swarm.py` -> `hierarchical_swarm.py` (git mv)
- Renamed `hybrid_hiearchical_peer_swarm.py` -> `hybrid_hierarchical_peer_swarm.py` (git mv)
- Renamed `hiearchical_system_prompt.py` -> `hierarchical_system_prompt.py` (git mv)
- Updated all import references in `__init__.py`, `swarm_router.py`, tests, examples
- Updated documentation references in `docs/`

Fixes #1504

Pure mechanical fix, no behavioral changes.

## Test Plan

- [x] No remaining 'hiearchical' references in codebase (grep returns empty)
- [x] Python files parse correctly (`ast.parse` on all modified files)

Proposed by Oxygen56

## Description

`Agent` now streams thinking tokens from reasoning models (Claude extended thinking, OpenAI o-series) to `streaming_callback`, `run_stream`, and `arun_stream` in real time as structured `{"type": "thinking", "token": "..."}` events. Previously, thinking deltas were swallowed and only flushed as a single Rich panel after the thinking phase ended, making real-time UI integration impossible.

Content tokens now carry a consistent `"type": "content"` field across all streaming modes including `stream=True` detailed mode.

`thinking_tokens` alone is now sufficient to activate extended thinking: `agent.py` automatically sets `reasoning_enabled=True` when `thinking_tokens` is provided, so callers no longer need to set both.

Existing `streaming_callback=lambda tok: str` integrations are unchanged.

## Files Changed

* `swarms/structs/agent.py`
* `examples/single_agent/streaming/thinking_stream_events_example.py`
* `tests/structs/test_streaming_thinking_events.py`

## Issue

#1621

## Dependencies

No extra dependencies required.

## Maintainer

@kyegomez

## Twitter

[@akc__2025](https://x.com/akc__2025)

Proposed by adichaudhary

## Summary

`Agent` currently swallows reasoning/thinking deltas from the LLM stream and only flushes them to the console as a single rich panel after the thinking phase ends. Integrators using `streaming_callback`, `arun_stream`, or `run_stream` cannot surface thinking tokens in real time; they only ever see content tokens.

We should stream thinking tokens through the same callback surface as content tokens, with a clear way to distinguish them.

## Today's behavior

`swarms/structs/agent.py:4036-4090`, `_yield_only_content_chunks`:

```python
reasoning = getattr(delta, "reasoning_content", None)
if reasoning:
    thinking_parts.append(reasoning)
    continue  # swallow the thinking chunk; don't pass to content stream

# First non-thinking chunk: flush accumulated thinking
if thinking_parts and not thinking_displayed:
    if self.print_on:
        formatter.print_thinking_panel("".join(thinking_parts), title=...)
    thinking_displayed = True
```

The reasoning deltas:

- Never reach `streaming_callback`.
- Never reach `arun_stream` / `run_stream` consumers.
- Are batched, so even console users see the thinking as one block at the end of the thinking phase, not as it's produced.

This means a dashboard, web UI, or terminal renderer integrating against `Agent` cannot show "thinking in progress" the way Claude.ai / OpenAI playground / Anthropic Console do.

## Repro

```python
from swarms import Agent

agent = Agent(
    agent_name="Reasoner",
    model_name="claude-sonnet-4-6",
    thinking_tokens=2000,
    streaming_callback=lambda tok: print(repr(tok)),
)
agent.run("Solve: a chicken and a half lays an egg and a half in a day and a half.")
# Expected: callback fires for thinking tokens AND content tokens, distinguishably.
# Actual: callback fires only for content tokens. Thinking is invisible to the callback.
```

Same gap exists for `arun_stream` / `run_stream`: they only yield content tokens.

## Proposed design

**Option A (preferred): tagged events.** Change `streaming_callback` to optionally accept a structured event dict, and switch `arun_stream` / `run_stream` to yield events by default when an opt-in flag is set:

```python
# Token event
{"type": "thinking", "token": "..."}
{"type": "content", "token": "..."}

# Phase boundaries (optional but useful)
{"type": "thinking_start"}
{"type": "thinking_end", "text": "<full thinking>"}
{"type": "content_start"}
{"type": "content_end", "text": "<full content>"}
```

Preserve back-compat: if the callback signature is `Callable[[str], None]`, keep delivering only content tokens (today's behavior). If it's `Callable[[dict], None]` (detect via `inspect.signature`) or the user passes `streaming_events=True`, deliver tagged events.

**Option B: separate `thinking_callback`.** Add a second kwarg:

```python
agent = Agent(
    ...,
    streaming_callback=on_content_token,
    thinking_callback=on_thinking_token,
)
```

Simpler to add, no signature detection, but doesn't generalize to `arun_stream`/`run_stream` cleanly.

I lean toward **Option A** because it composes with the existing `arun_stream(with_events=True)` pattern already established in `AgentRearrange` (`swarms/structs/agent_rearrange.py:1105-1129`): same event shape, just add `thinking` / `thinking_start` / `thinking_end` types.

## Acceptance criteria

- A reasoning model (`claude-sonnet-4-6` with `thinking_tokens=...`, or an OpenAI o-series model) streams thinking deltas to the registered callback in real time, one chunk at a time, before the first content token arrives.
- Thinking tokens are distinguishable from content tokens in the callback payload.
- `arun_stream(with_events=True)` yields `{"type": "thinking", "token": ...}` events for reasoning deltas alongside the existing content events.
- The console rich-panel UX for `print_on=True` is preserved (or rendered incrementally as a bonus).
- Back-compat: existing `streaming_callback=lambda tok: ...` integrations that only care about content keep working without code changes.

## Notes

- `_yield_only_content_chunks` (`agent.py:4036`) is the natural place to fire thinking events before swallowing the chunk. Pass the callback / event-sink through from `call_llm` (`agent.py:4092`).
- Reasoning content lives at `delta.reasoning_content` per LiteLLM; same accessor already used at L4056.
- `AgentRearrange.arun_stream(with_events=True)` already returns `agent_start` / `token` / `agent_end` events; extending the same shape with `thinking_start` / `thinking` / `thinking_end` keeps the multi-agent streaming layer consistent.
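To make Option A's back-compat rule concrete, here is a small sketch of the dispatch logic, assuming the explicit `streaming_events` flag rather than signature detection. `dispatch_stream` is a hypothetical helper, and the `(reasoning, content)` tuples stand in for LLM stream deltas.

```python
from typing import Any, Callable, Iterable, Optional, Tuple

# Each chunk is (reasoning_text, content_text); exactly one side is usually set.
Chunk = Tuple[Optional[str], Optional[str]]


def dispatch_stream(
    chunks: Iterable[Chunk],
    callback: Callable[[Any], None],
    streaming_events: bool = False,
) -> None:
    """Forward stream chunks to the callback, tagging them when opted in."""
    for reasoning, content in chunks:
        if reasoning:
            if streaming_events:
                callback({"type": "thinking", "token": reasoning})
            # Legacy callbacks never see thinking tokens (today's behavior).
            continue
        if content:
            if streaming_events:
                callback({"type": "content", "token": content})
            else:
                callback(content)
```

A plain `lambda tok: ...` callback keeps receiving bare content strings, while a caller that passes `streaming_events=True` receives both event types in arrival order.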

Proposed by kyegomez
enhancement
help wanted
good first issue
FEAT

## Summary

In `ConcurrentWorkflow`, one failed agent kills the entire run and discards all sibling outputs that already completed successfully. This contradicts the design goal of independent concurrent execution.

## Repro

```python
from swarms import Agent, ConcurrentWorkflow

a = Agent(agent_name="Good-1", model_name="gpt-4.1", max_loops=1)
b = Agent(agent_name="Good-2", model_name="gpt-4.1", max_loops=1)
c = Agent(agent_name="Bad", model_name="gpt-4.1", max_loops=1)
c.run = lambda *a, **kw: (_ for _ in ()).throw(RuntimeError("boom"))

wf = ConcurrentWorkflow(agents=[a, b, c])
wf.run("Summarise the news today.")
# Expected: results for Good-1 and Good-2, plus an error marker for Bad.
# Actual: RuntimeError propagates out; Good-1 and Good-2's outputs are lost.
```

## Root cause

`swarms/structs/concurrent_workflow.py:418-425` does not wrap `future.result()` in a try/except:

```python
for future in as_completed(futures):
    output = future.result()  # raises on any failed worker
    self.conversation.add(...)
```

The dashboard variant already handles this correctly at `concurrent_workflow.py:346-356`. The non-dashboard path was missed.

## Proposed fix

Wrap `future.result()` in try/except mirroring the dashboard path. On error, append a clearly-labeled error entry to the conversation (e.g., role `f"{agent_name} (failed)"`) so callers can filter it, and continue collecting sibling results.

Add an opt-in constructor flag for users who prefer the old behavior:

```python
on_error: Literal["raise", "store"] = "store"
```

`"store"` (default) collects per-agent errors and returns partial results. `"raise"` preserves today's fail-fast behavior.

## Tests to add

- `ConcurrentWorkflow` with one failing agent and two passing agents: assert partial results returned, failed agent's error captured under its name, no exception escapes.
- `on_error="raise"`: assert the exception still propagates.
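The proposed fix, reduced to a standalone sketch that does not depend on swarms. `run_all` is a hypothetical function; plain callables stand in for agents and a dict stands in for the conversation. Only the `on_error` values and the `"(failed)"` label come from the proposal.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict


def run_all(
    workers: Dict[str, Callable[[str], str]],
    task: str,
    on_error: str = "store",  # "store" keeps partial results, "raise" fails fast
) -> Dict[str, str]:
    """Run every worker concurrently and collect results keyed by worker name."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(workers))) as pool:
        futures = {pool.submit(fn, task): name for name, fn in workers.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                if on_error == "raise":
                    raise
                # Label the entry so callers can filter failed agents out.
                results[f"{name} (failed)"] = f"{type(exc).__name__}: {exc}"
    return results
```

With `on_error="store"`, sibling outputs survive a failing worker; with `"raise"`, the first failure seen by the collection loop propagates as it does today.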

Proposed by kyegomez
about 1 month ago · 1 comment
documentation

## Summary

- Fixed a broken Python string literal in the tools guide.
- Fixed the SwarmMatcher advanced import example so it is valid copy/paste Python.
- Retagged ReflexionAgent and AgentJudge API signature blocks as text instead of executable Python fences.
- Added the missing ReadTheDocs config at `docs/.readthedocs.yaml`, pinned the docs build dependencies, and removed the stale missing `theme.custom_dir` setting so docs previews can render.
- Added `docs/index.md` so ReadTheDocs generates the required root `index.html`.

## Proof

- Parsed all Python fenced blocks in the four touched docs files with `ast.parse`; no invalid Python fences remained.
- Ran `git diff --check`.
- Ran `/tmp/swarms-docs-venv/bin/python -m mkdocs build -f docs/mkdocs.yml --site-dir /tmp/swarms-site-preview`; the docs build completed successfully and generated `/tmp/swarms-site-preview/index.html`.

## Bounty / payment

This is a focused documentation-quality fix and I am submitting it for Swarms documentation bounty consideration.

Expected tier: Silver ($5-$20), if maintainers agree after review/merge.

Preferred payout method for legitimate completed/merged work: PayPal: https://www.paypal.com/paypalme/MouadBERQIA

Proposed by BerqiaMouad
about 1 month ago · 1 comment
documentation

## Summary

- Fix invalid Python in the custom `Agent` template in `docs/swarms/agents/new_agent.md`.
- Pass `agent_description` and optional `llm` through the base `Agent` initializer instead of assigning undefined local variables.
- Update the mock LLM and Griptape examples so the signatures and instantiation are syntactically valid.
- Add the ReadTheDocs configuration and docs dependency list needed for PR documentation previews.
- Add a small docs landing page and SIP page so ReadTheDocs has the root docs files referenced by `mkdocs.yml`.

## Validation

- `git diff --check`
- Parsed all 3 Python code fences in `docs/swarms/agents/new_agent.md` with `ast.parse`.

## Documentation bounty

Please mark this PR as `documentation` and `bounty-eligible`.

Expected tier: Silver ($5-$20). This is a focused API example correction and clarification for the custom-agent guide, including runnable constructor/signature fixes and explanation of the `llm` callable path.

Preferred payout method if awarded: PayPal.

Proposed by BerqiaMouad
tests
prompts
structs

## Description

Adds incremental replan support to HierarchicalSwarm when the judge returns a low score. The `JudgeReport` schema gains `verdict`, `feedback`, and `failed_subtasks` fields. On a REVISE verdict, the director is called with the judge feedback and previous per-agent outputs and returns a `ReplanAction` (ADD / REASSIGN / REORDER / DROP) whose orders are the only subtasks re-executed. Successful agents are never re-run. Includes 7 unit tests and a runnable example.

## Files Changed

* `swarms/structs/hiearchical_swarm.py`
* `swarms/prompts/agent_judge_prompt.py`
* `tests/structs/test_hierarchical_swarm_replan.py`
* `examples/multi_agent/hierarchical_swarm_replan.py`

## Issue

#1553

## Dependencies

No extra dependencies required.

## Maintainer

@kyegomez

## Twitter

[@akc__2025](https://x.com/akc__2025)

Documentation preview: https://swarms--1606.org.readthedocs.build/en/1606/

Proposed by adichaudhary