# AI Agents Web Browsing

AI agents web browsing refers to the capability of [[AI Agents]] to autonomously navigate and interact with web pages. This is one of the most impactful [[AI Agent Skills]], enabling agents to perform tasks like filling forms, extracting data, navigating multi-step workflows, and interacting with web applications on behalf of users.

Several approaches exist:

- **LLM-driven browser automation**: tools like [[Browser Use]] connect [[Large Language Models (LLMs)]] to [[Playwright]] or similar engines, letting the agent describe actions in natural language that get translated to browser interactions
- **CLI-based agent tools**: [[Browser Use CLI]] and [[Vercel Agent Browser]] provide command-line interfaces optimized for agent consumption, with compact output formats that minimize token usage
- **Protocol-level integration**: [[WebMCP]] brings the [[Model Context Protocol (MCP)]] natively into browsers, letting web pages expose structured tools and resources directly to AI agents without scraping
- **Adaptive scraping**: [[Scrapling]] uses AI-friendly selectors and anti-bot bypass to handle dynamic content extraction

Key challenges include context efficiency (DOM trees are token-heavy), authentication persistence, CAPTCHA handling, and maintaining deterministic element selection across page changes. Ref-based and accessibility-tree approaches (as used by [[Vercel Agent Browser]]) address context bloat by providing compact element identifiers instead of full HTML.

## References

- https://docs.browser-use.com
- https://agent-browser.dev

## Related

- [[AI Agents]]
- [[AI Agent Skills]]
- [[Browser Use]]
- [[Browser Use CLI]]
- [[Vercel Agent Browser]]
- [[WebMCP]]
- [[Scrapling]]
- [[Playwright]]
- [[Model Context Protocol (MCP)]]
- [[Large Language Models (LLMs)]]
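
## Example: ref-based element identifiers

The ref-based, accessibility-tree idea mentioned under the key challenges can be illustrated with a small sketch. It walks a simplified accessibility tree, drops unnamed wrapper nodes, and gives every interactive element a short ref that an agent could cite in a later action instead of a full HTML selector. The `Node` type, the `snapshot` function, the role list, and the `[ref=e1]` output format are all assumptions made for illustration; they are not the actual data model or output of [[Vercel Agent Browser]] or any other tool named here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical set of roles treated as "actionable" for the agent.
INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox", "combobox"}


@dataclass
class Node:
    """Simplified accessibility-tree node (illustrative, not a real API)."""
    role: str
    name: str = ""
    children: List["Node"] = field(default_factory=list)


def snapshot(root: Node) -> Tuple[List[str], Dict[str, Node]]:
    """Return compact text lines plus a ref -> node lookup table."""
    lines: List[str] = []
    refs: Dict[str, Node] = {}

    def visit(node: Node, depth: int) -> None:
        indent = "  " * depth
        if node.role in INTERACTIVE_ROLES:
            # Interactive elements get a short, stable-within-snapshot ref.
            ref = f"e{len(refs) + 1}"
            refs[ref] = node
            lines.append(f'{indent}- {node.role} "{node.name}" [ref={ref}]')
            child_depth = depth + 1
        elif node.name:
            # Named structural nodes are kept as context, without a ref.
            lines.append(f'{indent}- {node.role} "{node.name}"')
            child_depth = depth + 1
        else:
            # Unnamed wrappers (e.g. layout containers) are skipped entirely.
            child_depth = depth
        for child in node.children:
            visit(child, child_depth)

    visit(root, 0)
    return lines, refs


page = Node("document", "Sign in", [
    Node("generic", "", [
        Node("textbox", "Email"),
        Node("textbox", "Password"),
        Node("button", "Sign in"),
    ]),
    Node("link", "Forgot password?"),
])

lines, refs = snapshot(page)
print("\n".join(lines))
```

In this sketch the agent sees five short lines rather than the page's markup, and an action such as "click e3" can be resolved through the `refs` table. Refs are only valid for the snapshot that produced them, so a real tool would need to re-snapshot after the page changes.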