tota ships with 70+ built-in tools, grouped into capability areas.
Filesystem
| Tool | Description |
|---|---|
| read_file | Read file contents with optional line range |
| write_file | Write or overwrite a file |
| create_file | Create a new file |
| edit_file | Apply a targeted string replacement edit to an existing file |
| delete_file | Delete a file |
| list_dir | List directory contents |
| send_file | Send a file to the user (Telegram or API channel) |
| approve_scope | Approve a filesystem path for read or write access |
| find_files | Advanced file finder: search by glob pattern, content keyword, file type, date range, and size. Recurses up to depth 6 and skips node_modules/.git/dist. |
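As a rough illustration of the traversal limits described for find_files, the sketch below walks an in-memory tree, stops descending at depth 6, and skips node_modules, .git, and dist. The `Entry` type, the `findFiles` function, the suffix-only matching, and the choice to count the search root as depth 0 are all assumptions made for illustration; this is not tota's actual implementation.

```typescript
// Hypothetical in-memory directory tree; the real tool works on the filesystem.
type Entry = { name: string; children?: Entry[] };

const SKIPPED_DIRS = new Set(["node_modules", ".git", "dist"]);
const MAX_DEPTH = 6; // assumption: the search root counts as depth 0

// Collect the paths of files whose name ends with `suffix`.
function findFiles(root: Entry, suffix: string): string[] {
  const matches: string[] = [];

  function walk(dir: Entry, path: string, depth: number): void {
    for (const child of dir.children ?? []) {
      const childPath = `${path}/${child.name}`;
      if (child.children === undefined) {
        // A file: record it when the name matches.
        if (child.name.endsWith(suffix)) matches.push(childPath);
      } else if (!SKIPPED_DIRS.has(child.name) && depth < MAX_DEPTH) {
        // A directory: descend unless it is skipped or too deep.
        walk(child, childPath, depth + 1);
      }
    }
  }

  walk(root, root.name, 0);
  return matches;
}

const project: Entry = {
  name: "project",
  children: [
    { name: "index.ts" },
    { name: "node_modules", children: [{ name: "dep.ts" }] },
    { name: "src", children: [{ name: "app.ts" }, { name: "README.md" }] },
  ],
};

const found = findFiles(project, ".ts");
console.log(found);
```

Skipping dependency and build directories keeps results focused on source files, and the depth cap bounds the cost of a search started from a very large directory.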
Document Readers
| Tool | Description |
|---|---|
| read_pdf | Extract text, page count, and metadata from a PDF file |
| read_excel | Read an Excel or CSV file and render it as a markdown table or JSON (supports .xlsx, .xls, .ods, .csv) |
| write_excel | Create an .xlsx file from provided rows and headers, with auto-sized columns and a bold header row |
| read_docx | Extract plain text or HTML from a Word .docx file |
All document tools use permission-gated filesystem access and load their optional packages through dynamic imports, so tota starts normally even when those packages are not installed.
Browser Automation
tota supports three browser engines via Playwright — Chromium (default), Firefox, and WebKit (Safari-compatible). The browser opens as a visible window on your desktop by default.
| Tool | Description |
|---|---|
| browser_open | Open a URL in the browser; returns page title, URL, and visible text |
| browser_click | Click an element by CSS selector or visible text |
| browser_type | Type text into an input field (click-to-focus + fill; SPA-safe for Google, Gmail, etc.) |
| browser_key | Press a keyboard key (Enter, Tab, Escape, ArrowDown) or a combo such as Control+a |
| browser_wait | Wait for a CSS selector to appear or for page navigation to complete |
| browser_screenshot | Take a full-page or element screenshot and send it to the user |
| browser_extract | Extract text or attribute values from elements matching a CSS selector |
| browser_scroll | Scroll the page up, down, to the top, or to the bottom |
| browser_close | Close the browser session and free resources |
| browser_engine | Switch the active browser engine (chromium / firefox / webkit) |
Install browser binaries once:
npx playwright install chromium firefox webkit
# or via the wizard:
tota setup browser
Set BROWSER_ENGINE=firefox (or webkit) in ~/.tota/.env to change the default engine. Set PLAYWRIGHT_HEADLESS=true to run without a visible window (CI/server environments).
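To make the two settings above concrete, here is a minimal sketch of how they might be resolved into a browser configuration. The `resolveBrowserConfig` helper and the `BrowserConfig` type are hypothetical names, and the fallback to chromium for unrecognized values is an assumption; only the variable names, the three engines, and the chromium default come from this page.

```typescript
type BrowserEngine = "chromium" | "firefox" | "webkit";

interface BrowserConfig {
  engine: BrowserEngine;
  headless: boolean;
}

const ENGINES: readonly BrowserEngine[] = ["chromium", "firefox", "webkit"];

// Hypothetical helper: derive browser settings from environment variables.
function resolveBrowserConfig(env: Record<string, string | undefined>): BrowserConfig {
  const requested = (env.BROWSER_ENGINE ?? "").trim().toLowerCase();
  // Assumption: unset or unrecognized values fall back to the documented default.
  const engine = ENGINES.find((e) => e === requested) ?? "chromium";
  // The window stays visible unless PLAYWRIGHT_HEADLESS is exactly "true".
  const headless = (env.PLAYWRIGHT_HEADLESS ?? "").trim().toLowerCase() === "true";
  return { engine, headless };
}

console.log(resolveBrowserConfig({}));
console.log(resolveBrowserConfig({ BROWSER_ENGINE: "firefox", PLAYWRIGHT_HEADLESS: "true" }));
```

A result of this shape maps directly onto Playwright, whose `chromium`, `firefox`, and `webkit` browser types each accept a `headless` launch option.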
Computer-Use (Desktop Control)
Enable with COMPUTER_USE_ENABLED=true in ~/.tota/.env (or capabilities.computer.enabled: true in tota.yaml).
Desktop
| Tool | Description |
|---|---|
| computer_screenshot | Capture the full screen or a region; save to a temp file |
| computer_see | Take a screenshot and immediately analyze it with vision AI, to understand what is on screen before acting |
| computer_click | Left, right, or double-click at pixel coordinates |
| computer_move | Move the cursor to coordinates |
| computer_type | Type text at the current keyboard focus |
| computer_key | Press a key or key combo (cmd+c, ctrl+z, enter, tab, escape, arrow keys, etc.) |
| computer_scroll | Scroll up, down, left, or right at a position |
| computer_drag | Click and drag between two screen coordinates |
| computer_screen_size | Get the primary display resolution |
Mouse/keyboard control uses @nut-tree-fork/nut-js (cross-platform native module). On Linux, install libxtst-dev first:
sudo apt install libxtst-dev
Android (ADB)
Android tools work via adb in your PATH — no additional Node.js packages needed.
| Tool | Description |
|---|---|
| adb_devices | List connected Android devices |
| adb_screenshot | Take a screenshot of an Android device |
| adb_see | Android screenshot + vision AI analysis |
| adb_tap | Tap at pixel coordinates on the device |
| adb_swipe | Swipe between two coordinates (with optional duration) |
| adb_type | Type text into the focused field on the device |
| adb_key | Send an Android key event (3=HOME, 4=BACK, 26=POWER, 66=ENTER…) |
| adb_shell | Run any adb shell command on the device |
| adb_pull | Pull a file from the device to the local machine |
| adb_push | Push a local file to the device |
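The input tools above correspond naturally to `adb shell input` subcommands. The sketch below builds the argument lists such a wrapper might pass to the `adb` binary; the helper names are hypothetical, and whether tota invokes adb in exactly this way is an assumption. The key codes are the ones listed for adb_key.

```typescript
// Key codes listed in the adb_key description (standard Android KeyEvent values).
const KEYCODES = { HOME: 3, BACK: 4, POWER: 26, ENTER: 66 } as const;

type KeyName = keyof typeof KEYCODES;

// Hypothetical helpers that build argument lists for the `adb` binary.
function keyEventArgs(key: KeyName | number): string[] {
  const code = typeof key === "number" ? key : KEYCODES[key];
  return ["shell", "input", "keyevent", String(code)];
}

function tapArgs(x: number, y: number): string[] {
  return ["shell", "input", "tap", String(x), String(y)];
}

function swipeArgs(
  x1: number,
  y1: number,
  x2: number,
  y2: number,
  durationMs?: number,
): string[] {
  const args = ["shell", "input", "swipe", ...[x1, y1, x2, y2].map(String)];
  // The duration is optional, matching the adb_swipe description.
  if (durationMs !== undefined) args.push(String(durationMs));
  return args;
}

console.log(`adb ${keyEventArgs("HOME").join(" ")}`);
console.log(`adb ${tapArgs(540, 1200).join(" ")}`);
console.log(`adb ${swipeArgs(540, 1600, 540, 400, 300).join(" ")}`);
```

Returning an argument array instead of a single command string lets the caller hand it to a process-spawning API without shell quoting concerns.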
Shell
| Tool | Description |
|---|---|
| run_command | Run a shell command with timeout and output capture |
| cd | Change the working directory |
| approve_command | Approve a shell command pattern for the current session |
Code Sandbox
| Tool | Description |
|---|---|
| run_code | Execute a code snippet in an isolated sandbox. Supports Python, JavaScript (Node.js), Bash, TypeScript, Ruby, and Go. Output capped at 8,000 chars. |
The sandbox is isolated from your working directory; use run_command for in-project commands.
Git
| Tool | Description |
|---|---|
| git_status | Show working tree status |
| git_diff | Show diffs (staged, unstaged, or between refs) |
| git_log | Show commit history |
| git_commit | Stage and commit changes with a message |
| git_add | Stage files |
| git_push | Push commits to remote |
GitHub
| Tool | Description |
|---|---|
| create_pr | Create a pull request |
| review_pr | Read PR diff and post a review |
| list_issues | List open issues with filters |
| create_issue | Open a new issue |
| github_api | Raw call to any GitHub REST endpoint |
Web
| Tool | Description |
|---|---|
| web_search | Search the web using Brave, Serper, or Tavily (auto-detected from env keys) |
| read_url | Fetch and extract main content from a URL |
Vision
| Tool | Description |
|---|---|
| analyze_image | Analyze a local image file or image URL. Detects MIME type automatically. Supports JPEG, PNG, GIF, WebP. |
Memory
| Tool | Description |
|---|---|
| remember | Store a memory with type and optional scope |
| search_memory | Full-text search across all memories |
| delete_memory | Delete a specific memory by ID |
Messaging
| Tool | Description |
|---|---|
| send_telegram_message | Send a message to a Telegram user by ID |
Scheduler
| Tool | Description |
|---|---|
| schedule_task | Schedule a one-shot or recurring cron task |
| list_tasks | List all scheduled tasks |
| cancel_task | Cancel a scheduled task by ID |
System
| Tool | Description |
|---|---|
| get_current_time | Return current datetime and timezone |
| delegate_task | Delegate a focused sub-task to a fresh agent instance and return its result |
MCP Plugins
When MCP servers are configured, their tools appear automatically under names of the form mcp_<server>_<tool>:
mcp_filesystem_read_file
mcp_my_db_query
See MCP Plugins for setup instructions.
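The naming pattern can be sketched as a pair of small helpers. Both functions are hypothetical, and the assumption that server names are used verbatim (for example a server literally named `my_db`) is not confirmed by this page. Because server and tool names may both contain underscores, splitting a name back apart needs the list of configured servers.

```typescript
// Hypothetical helper: compose the exposed name for a tool from an MCP server.
function mcpToolName(server: string, tool: string): string {
  return `mcp_${server}_${tool}`;
}

// Hypothetical helper: recover server and tool from an exposed name.
function parseMcpToolName(
  name: string,
  servers: string[],
): { server: string; tool: string } | undefined {
  // Try longer server names first so "my_db" wins over a server named "my".
  const sorted = [...servers].sort((a, b) => b.length - a.length);
  for (const server of sorted) {
    const prefix = `mcp_${server}_`;
    if (name.startsWith(prefix) && name.length > prefix.length) {
      return { server, tool: name.slice(prefix.length) };
    }
  }
  return undefined;
}

console.log(mcpToolName("filesystem", "read_file"));
console.log(parseMcpToolName("mcp_my_db_query", ["my", "my_db"]));
```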
Tool Output Truncation
All tool outputs are capped at 12,000 characters. If a tool returns more, the output is truncated with a clear notice:
... [output truncated — 45320 chars total, showing first 12000]
This prevents token budget overflows on large file reads or command outputs.
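The capping behavior can be sketched in a few lines. The `truncateOutput` helper is a hypothetical name, and two details are assumptions: that "characters" means string length as the runtime counts it, and that the notice is appended on its own line after the retained text. The limit and the notice wording follow the example above.

```typescript
const MAX_OUTPUT_CHARS = 12000;

// Hypothetical helper mirroring the documented cap and notice format.
function truncateOutput(output: string, limit: number = MAX_OUTPUT_CHARS): string {
  // Outputs at or under the limit pass through unchanged.
  if (output.length <= limit) return output;
  // \u2014 is the dash character used in the documented notice.
  const notice = `... [output truncated \u2014 ${output.length} chars total, showing first ${limit}]`;
  return `${output.slice(0, limit)}\n${notice}`;
}

const long = "x".repeat(45320);
const capped = truncateOutput(long);
console.log(capped.slice(-80));
```

Including the original length in the notice tells the model how much was dropped, so it can decide whether to re-read a narrower range.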
| Tool | Description |
|---|---|
| sleep | Wait for N seconds (use sparingly) |