# AI Agents Web Browsing
AI agent web browsing is the capability of [[AI Agents]] to navigate and interact with web pages autonomously. It is one of the most impactful [[AI Agent Skills]], enabling agents to fill in forms, extract data, complete multi-step workflows, and operate web applications on behalf of users.
Several approaches exist:
- **LLM-driven browser automation**: tools like [[Browser Use]] connect [[Large Language Models (LLMs)]] to [[Playwright]] or similar engines, letting the agent express actions in natural language that are then translated into browser interactions
- **CLI-based agent tools**: [[Browser Use CLI]] and [[Vercel Agent Browser]] provide command-line interfaces optimized for agent consumption, with compact output formats that minimize token usage
- **Protocol-level integration**: [[WebMCP]] brings the [[Model Context Protocol (MCP)]] natively into browsers, letting web pages expose structured tools and resources directly to AI agents without scraping
- **Adaptive scraping**: [[Scrapling]] uses AI-friendly selectors and anti-bot bypass to handle dynamic content extraction
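
The translation step in LLM-driven automation can be sketched as a small dispatcher: the model emits one structured action, and a thin layer validates it and calls the matching browser method. This is a minimal sketch under stated assumptions, not the API of [[Browser Use]] or any other tool listed above. The JSON action format, `ACTION_SCHEMA`, `execute_action`, and `RecordingBrowser` are all hypothetical names, and the browser is a stand-in that only records calls so the example runs without a real browser.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RecordingBrowser:
    """Hypothetical stand-in for a browser driver; it only records requests.

    In a real setup these methods would wrap a browser engine, for example
    Playwright's page.goto, page.click, and page.fill.
    """
    log: list = field(default_factory=list)

    def goto(self, url: str) -> None:
        self.log.append(("goto", url))

    def click(self, selector: str) -> None:
        self.log.append(("click", selector))

    def fill(self, selector: str, value: str) -> None:
        self.log.append(("fill", selector, value))


# Assumed action vocabulary: action name -> required argument names, in call order.
ACTION_SCHEMA = {
    "goto": ("url",),
    "click": ("selector",),
    "fill": ("selector", "value"),
}


def execute_action(browser, raw_action: str) -> None:
    """Validate one JSON action emitted by the model and run it on the browser."""
    action = json.loads(raw_action)
    name = action.get("action")
    # Reject anything outside the allow-list instead of trusting model output.
    if name not in ACTION_SCHEMA:
        raise ValueError(f"unsupported action: {name!r}")
    missing = [arg for arg in ACTION_SCHEMA[name] if arg not in action]
    if missing:
        raise ValueError(f"action {name!r} is missing arguments: {missing}")
    args = [action[arg] for arg in ACTION_SCHEMA[name]]
    getattr(browser, name)(*args)
```

Restricting the model to a small allow-list of actions keeps the agent's behavior auditable: every step is a validated, loggable record rather than free-form text.
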
Key challenges include context efficiency (DOM trees are token-heavy), authentication persistence, CAPTCHA handling, and maintaining deterministic element selection across page changes. Ref-based and accessibility-tree approaches (as used by [[Vercel Agent Browser]]) address context bloat by providing compact element identifiers instead of full HTML.
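
The idea behind ref-based snapshots can be illustrated with a rough sketch: keep only interactive elements, give each a short identifier, and hand the agent that compact listing instead of the full markup. This is an approximation under stated assumptions, not how [[Vercel Agent Browser]] is implemented. It walks raw HTML with Python's standard `html.parser` rather than a real accessibility tree, and the `e1`-style ref format, `RefSnapshot`, `snapshot`, and `SAMPLE_PAGE` are illustrative names.

```python
from html.parser import HTMLParser

# Assumption: these tags approximate "interactive" elements for the sketch.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}

SAMPLE_PAGE = (
    '<div class="wrapper"><form>'
    '<input name="email" placeholder="Email address">'
    '<button type="submit"><span>Sign in</span></button>'
    '</form><a href="/help">Help</a></div>'
)


class RefSnapshot(HTMLParser):
    """Collect interactive elements and give each a short ref such as 'e1'."""

    def __init__(self):
        super().__init__()
        self.elements = []  # dicts with keys: ref, tag, label
        self._open = None   # link or button still collecting its inner text

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        attr = dict(attrs)
        label = attr.get("aria-label") or attr.get("placeholder") or attr.get("name") or ""
        entry = {"ref": f"e{len(self.elements) + 1}", "tag": tag, "label": label}
        self.elements.append(entry)
        # Links and buttons without an attribute label are named by their text.
        if tag in {"a", "button"} and not label:
            self._open = entry

    def handle_data(self, data):
        if self._open is not None:
            self._open["label"] = (self._open["label"] + " " + data.strip()).strip()

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open = None


def snapshot(html: str) -> str:
    """Return one compact line per interactive element."""
    parser = RefSnapshot()
    parser.feed(html)
    parser.close()
    return "\n".join(f'[{e["ref"]}] {e["tag"]} "{e["label"]}"' for e in parser.elements)
```

Calling `snapshot(SAMPLE_PAGE)` yields three short lines, one each for the email input, the sign-in button, and the help link. The agent can then refer to an element by its ref in a follow-up action instead of re-reading the surrounding HTML.
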
## References
- https://docs.browser-use.com
- https://agent-browser.dev
## Related
- [[AI Agents]]
- [[AI Agent Skills]]
- [[Browser Use]]
- [[Browser Use CLI]]
- [[Vercel Agent Browser]]
- [[WebMCP]]
- [[Scrapling]]
- [[Playwright]]
- [[Model Context Protocol (MCP)]]
- [[Large Language Models (LLMs)]]