AI-powered Chrome extension that can read pages, reason with Gemini, and execute real DOM actions (click, type, extract, scroll, fill forms, and more).
Most browser assistants stop at suggestions. Do Nothing executes actions directly on the current page through a controlled tool loop.
- Run browser automation from natural language prompts.
- Auto-select domain-specific skills (for LinkedIn, jobs, extraction, summaries, etc.).
- Stream progress via pipeline steps and tool call states.
- Keep session history and long-term memory in local extension storage.
- Support voice typing in the sidepanel chat input.
flowchart LR
U[User Prompt in Sidepanel] --> BG[Background Service Worker]
BG --> CTX[Context Builder<br/>+ Memory + Skills]
CTX --> LLM[Gemini Model]
LLM -->|Function Calls| DOM[Content Script DOM Engine]
DOM -->|Action Results| LLM
LLM -->|Final Text| BG
BG --> UI[Sidepanel Updates<br/>chat, pipeline, tools]
For a deeper look at the architecture, see ARCHITECTURE.md.
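The loop in the flowchart can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the extension's actual code: `ToolCall`, `ModelTurn`, `HistoryItem`, `runToolLoop`, and the `maxSteps` cap are hypothetical names, and the model and the DOM executor are passed in as plain functions so the sketch runs without Gemini or a browser.

```typescript
// Hypothetical types: the real extension's message shapes may differ.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn =
  | { kind: "tool_calls"; calls: ToolCall[] }
  | { kind: "final"; text: string };
type HistoryItem =
  | { role: "user"; text: string }
  | { role: "tool"; name: string; result: string };

type Model = (history: HistoryItem[]) => Promise<ModelTurn>;
type Executor = (call: ToolCall) => Promise<string>;

// Alternate between the model and the DOM executor until the model
// returns final text or the (assumed) step budget runs out.
async function runToolLoop(
  prompt: string,
  model: Model,
  execute: Executor,
  maxSteps = 8,
): Promise<string> {
  const history: HistoryItem[] = [{ role: "user", text: prompt }];
  for (let step = 0; step < maxSteps; step++) {
    const turn = await model(history);
    if (turn.kind === "final") return turn.text;
    for (const call of turn.calls) {
      // Feed each action result back so the next model turn can see it.
      const result = await execute(call);
      history.push({ role: "tool", name: call.name, result });
    }
  }
  throw new Error(`No final answer after ${maxSteps} steps`);
}
```

Capping the number of steps is one way to keep the loop "controlled": a model that keeps requesting actions without converging ends in an error instead of running indefinitely.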
The registry lives in src/background/skills.ts and includes skills such as:
- LinkedIn Connect / Message / Follow / Comment
- X/Twitter Follow
- Job Applier
- Page Summarizer
- Data Extractor
- Form Filler
- Auto Scroller
- Email Drafter
- Shopping Assistant
Skills are matched by URL + intent score and can be explicitly selected from the Skills tab.
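One way to picture URL + intent matching is the sketch below. It is an assumption-laden illustration: the `Skill` fields, the `scoreSkill` and `pickSkill` helpers, the skill ids, and the weights (2 for a URL match, 1 per keyword) are all hypothetical, and the real registry in src/background/skills.ts may score skills differently.

```typescript
// Hypothetical skill shape; the real registry may store different fields.
interface Skill {
  id: string;
  urlPatterns: RegExp[]; // pages where the skill applies (no global flag)
  keywords: string[];    // lowercase words that signal user intent
}

// Score = URL bonus + one point per keyword found in the prompt.
function scoreSkill(skill: Skill, url: string, prompt: string): number {
  const urlBonus = skill.urlPatterns.some((p) => p.test(url)) ? 2 : 0;
  const text = prompt.toLowerCase();
  const intent = skill.keywords.filter((k) => text.includes(k)).length;
  return urlBonus + intent;
}

// An explicit selection (e.g. from the Skills tab) overrides scoring.
function pickSkill(
  skills: Skill[],
  url: string,
  prompt: string,
  explicitId?: string,
): Skill | undefined {
  if (explicitId) return skills.find((s) => s.id === explicitId);
  let best: Skill | undefined;
  let bestScore = 0;
  for (const skill of skills) {
    const score = scoreSkill(skill, url, prompt);
    if (score > bestScore) {
      best = skill;
      bestScore = score;
    }
  }
  return best; // undefined when nothing scores above zero
}

// Illustrative registry with made-up ids and patterns.
const exampleSkills: Skill[] = [
  { id: "linkedin-connect", urlPatterns: [/linkedin\.com/], keywords: ["connect"] },
  { id: "page-summarizer", urlPatterns: [], keywords: ["summarize", "summary"] },
];

const chosen = pickSkill(
  exampleSkills,
  "https://www.linkedin.com/in/someone",
  "Connect with this person",
);
console.log(chosen?.id);
```

Returning `undefined` when no skill scores above zero leaves room for a generic, skill-free run, while an explicit id gives the deterministic behavior that picking a skill by hand is meant to provide.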
npm install
npm run build
- Open chrome://extensions
- Enable Developer mode
- Click Load unpacked
- Select the dist/ folder
- Open extension sidepanel
- Go to Settings
- Add your Gemini API key
- Save settings
- Ask with direct intent: "Extract all job titles and company names from this page."
- Use explicit constraints: "Apply only to remote React roles with Easy Apply."
- If tasks are complex, split into smaller prompts.
- Pick a specific skill when you need deterministic behavior.
- Use sample prompts from each skill card as templates.
- Store durable facts you want future tasks to reuse.
- Keep memory concise and structured for better reuse.
- Click the mic icon in chat input.
- Allow microphone permission when prompted.
- Speak in short phrases for cleaner transcripts.
- Start with a dry run prompt: "Show me what you'd do first, then execute."
- Prefer exact selectors/targets in prompts when possible.
- For bulk actions, define limits: first 10, top 5, only visible.
- For form tasks, provide profile data once in memory, then reuse.
- Tune Action Delay in Settings to reduce rate-limit or anti-bot triggers.
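The bulk-limit and Action Delay tips boil down to capping how many targets are touched and pausing between actions. The sketch below shows that idea in isolation; `runBulk`, `limit`, and `delayMs` are hypothetical names, not the extension's actual settings or API.

```typescript
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Apply `action` to at most `limit` targets, pausing `delayMs` between calls.
async function runBulk<T, R>(
  targets: T[],
  action: (target: T) => Promise<R>,
  limit: number,
  delayMs: number,
): Promise<R[]> {
  const results: R[] = [];
  let first = true;
  for (const target of targets.slice(0, limit)) {
    if (!first) await sleep(delayMs); // no pause before the first action
    first = false;
    results.push(await action(target));
  }
  return results;
}
```

Running the actions sequentially, rather than with `Promise.all`, is what makes the delay meaningful: each action starts only after the previous one has finished and the pause has elapsed.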
src/
  background/   Service worker, Gemini loop, skills, memory, session
  content/      DOM interaction engine injected into pages
  sidepanel/    React UI (chat, skills, memory, history, settings)
dist/           Webpack build output for unpacked extension loading
npm run dev # watch mode
npm run build # production build
npm run clean # remove dist
- Memory and session data are stored in chrome.storage.local.
- API key is stored locally via extension settings.
- The extension executes DOM actions only on pages where content scripts run.
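To make the local-only storage model concrete, here is a minimal sketch of keeping durable facts under a single storage key. Everything named here is an assumption for illustration: `MemoryFact`, `loadFacts`, `rememberFact`, and the `"memory"` key are hypothetical, and the `StorageArea` interface only mirrors the promise-based `get`/`set` shape that `chrome.storage.local` exposes in Manifest V3. An in-memory stand-in lets the sketch run outside a browser.

```typescript
// Minimal subset of the promise-based chrome.storage.local surface.
interface StorageArea {
  get(key: string): Promise<Record<string, unknown>>;
  set(items: Record<string, unknown>): Promise<void>;
}

// Hypothetical record shape for a durable fact.
interface MemoryFact {
  key: string;
  value: string;
  updatedAt: number;
}

const MEMORY_KEY = "memory"; // assumed storage key, not the extension's real one

async function loadFacts(store: StorageArea): Promise<MemoryFact[]> {
  const data = await store.get(MEMORY_KEY);
  const facts = data[MEMORY_KEY];
  return Array.isArray(facts) ? (facts as MemoryFact[]) : [];
}

// Insert or overwrite a fact by key so memory stays concise.
async function rememberFact(
  store: StorageArea,
  key: string,
  value: string,
): Promise<void> {
  const facts = (await loadFacts(store)).filter((f) => f.key !== key);
  facts.push({ key, value, updatedAt: Date.now() });
  await store.set({ [MEMORY_KEY]: facts });
}

// In-memory stand-in so the sketch runs outside an extension.
function createFakeStore(): StorageArea {
  const data: Record<string, unknown> = {};
  return {
    async get(key) {
      return key in data ? { [key]: data[key] } : {};
    },
    async set(items) {
      Object.assign(data, items);
    },
  };
}
```

Inside an extension with the `storage` permission, `chrome.storage.local` should be usable as the `store` argument, since it offers `get` and `set` with this shape.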
- No response / errors: verify API key and selected model in Settings.
- Action fails repeatedly: ask for smaller scoped actions and clearer targets.
- Voice typing not starting: check browser microphone permissions for extensions.
- Content script issues: reload extension and the active tab.
Contributions are welcome, especially new day-to-day automation skills.
- Contribution guide: CONTRIBUTING.md
- Architecture details: ARCHITECTURE.md
- Buy me a coffee: ☕ buymeacoffee.com/bhuvan_ade
- Star this repo: ⭐ github.com/BhuvanAde/do-nothing