
Integrating browser-use with Undetected-Chromedriver, Gemini, ML, and Crypto Dice Automation


Browser automation has come a long way, but avoiding detection while using frameworks like Playwright, Selenium, or Puppeteer remains a challenge. In this post, we explore how to integrate browser-use with undetected-chromedriver, Google's Gemini AI, and machine learning models to build bot-resistant web interactions, with crypto dice automation as a worked example.


Why Use browser-use with Undetected-Chromedriver?

Traditional browser automation tools can be detected easily, especially on sites with strong bot protection. By integrating undetected-chromedriver, we can bypass bot detection mechanisms while leveraging Playwright’s robust automation capabilities.


Step 1: Modify the browser-use Code to Use Undetected-Chromedriver

To enable bot-resistant browsing, modify browser.py in your browser-use repository with the following implementation:

async def _setup_undetected_browser(self, playwright: Playwright) -> PlaywrightBrowser:
    """Sets up and returns a Playwright Browser instance with anti-detection measures using undetected-chromedriver."""
    try:
        import undetected_chromedriver as uc
        from selenium import webdriver

        options = uc.ChromeOptions()
        if self.config.headless:
            options.add_argument('--headless=new')
        for arg in [
            '--no-sandbox',
            '--disable-infobars',
            '--disable-popup-blocking',
            '--no-first-run',
            '--no-default-browser-check'
        ] + self.disable_security_args + self.config.extra_chromium_args:
            options.add_argument(arg)

        if self.config.proxy:
            options.add_argument(f'--proxy-server={self.config.proxy.get("server")}')

        driver = uc.Chrome(options=options)  # type: ignore
        cdp_endpoint = driver.command_executor._url + '/devtools/browser/' + driver.session_id  # type: ignore
        browser = await playwright.chromium.connect_over_cdp(cdp_endpoint)

        # Ensure Selenium driver quits when Playwright browser closes
        def _close_undetected_chrome():
            try:
                driver.quit()
            except Exception as e:
                logger.warning(f"Error quitting undetected_chromedriver: {e}")

        browser._close_undetected_chrome = _close_undetected_chrome  # type: ignore
        return browser

    except ImportError:
        logger.error("undetected-chromedriver is not installed. Install it with `pip install undetected-chromedriver`.")
        raise
    except Exception as e:
        logger.error(f"Failed to launch undetected-chromedriver: {e}")
        raise

This implementation launches Chrome through undetected-chromedriver and attaches Playwright to it over CDP, so you keep Playwright's automation API while presenting a much less detectable browser fingerprint.


Step 2: Modify the Close Method for Clean Browser Shutdown

Add the following method to properly close the browser and avoid resource leaks:

async def close(self):
    """Close the browser instance."""
    try:
        if self.playwright_browser:
            if hasattr(self.playwright_browser, '_close_undetected_chrome') and self.playwright_browser._close_undetected_chrome:  # type: ignore
                self.playwright_browser._close_undetected_chrome()  # type: ignore

            await self.playwright_browser.close()
        if self.playwright:
            await self.playwright.stop()
    except Exception as e:
        logger.debug(f'Failed to close browser properly: {e}')
    finally:
        self.playwright_browser = None
        self.playwright = None

Step 3: Implement Browser Setup Selection

Ensure that the browser initialization method can properly select undetected-chromedriver:

async def _setup_browser(self, playwright: Playwright) -> PlaywrightBrowser:
    """Sets up and returns a Playwright Browser instance."""
    try:
        if self.config.cdp_url:
            return await self._setup_cdp(playwright)
        elif self.config.wss_url:
            return await self._setup_wss(playwright)
        elif self.config.chrome_instance_path:
            return await self._setup_browser_with_instance(playwright)
        elif self.config.use_undetected_chromedriver:
            return await self._setup_undetected_browser(playwright)
        else:
            return await self._setup_standard_browser(playwright)

    except Exception as e:
        logger.error(f'Failed to initialize Playwright browser: {str(e)}')
        raise

Step 4: Run a Sample Script to Test It

Once your modified browser-use setup is complete, use the following script to test bot-resistant browsing:

import asyncio
from browser_use import Agent, Browser, BrowserConfig

async def main():
    config = BrowserConfig(headless=False, use_undetected_chromedriver=True)
    browser = Browser(config=config)
    playwright_browser = await browser.get_playwright_browser()
    context = await playwright_browser.new_context()
    page = await context.new_page()
    await page.goto("https://nowsecure.nl")  # Site to test bot detection
    await asyncio.sleep(5)  # Observe results
    await browser.close()

if __name__ == "__main__":
    asyncio.run(main())

Step 5: Expanding with Machine Learning and AI

Now that we have a bot-resistant browser, let's explore how to integrate Gemini AI and machine learning to enhance automation.

Using Gemini for Text-Based Automation

If you're automating interactions that require AI-generated content, you can integrate Google's Gemini AI:

import google.generativeai as genai

# Configure the Gemini client and pick a model
genai.configure(api_key="your_gemini_api_key")
model = genai.GenerativeModel("gemini-1.5-flash")

async def generate_ai_response(prompt):
    response = await model.generate_content_async(prompt)
    return response.text

Crypto Dice Automation

For crypto dice betting, you can use your automated browser to interact with betting sites:

async def play_crypto_dice():
    # Assumes `context` is the Playwright BrowserContext created in Step 4
    page = await context.new_page()
    await page.goto("https://crypto-dice-betting-site.com")

    # Example: Place a bet
    await page.click("button#bet")
    await page.wait_for_timeout(2000)  # Wait for result

    result = await page.inner_text("div#result")
    print(f"Dice Result: {result}")

    await page.close()

Frequently Asked Questions (FAQ)

Q: What makes undetected-chromedriver useful?
A: It bypasses bot detection, making it useful for web scraping, automation, and avoiding anti-bot systems.

Q: Does this work with proxies?
A: Yes, you can configure proxy-server options in the browser configuration.
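A minimal sketch of that configuration follows; the proxy address is hypothetical, and the dict shape simply matches the `config.proxy.get("server")` lookup in Step 1:

```python
# Hypothetical proxy settings; the "server" key matches how Step 1 reads config.proxy
proxy = {"server": "http://user:pass@proxy.example.com:8080"}

# config = BrowserConfig(headless=False, use_undetected_chromedriver=True, proxy=proxy)
# This is the Chrome flag Step 1 derives from it:
print(f'--proxy-server={proxy.get("server")}')
```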

Q: How can I integrate machine learning?
A: You can use models like TensorFlow, Gemini AI, or OpenAI’s GPT-4 for decision-making within the automation workflow.


Final Thoughts

With browser-use, undetected-chromedriver, machine learning, and crypto automation, we unlock a powerful bot-resistant automation pipeline. Whether you're working with Gemini AI, web scraping, or crypto gambling automation, this setup enables high stealth and flexibility.

🔥 Let me know in the comments how this setup works for you or if you have any improvements! 🚀

Direct Access to Backend APIs - A Step-by-Step Guide to Bypassing HTML Scraping

Modern websites—especially single-page applications (SPAs)—often make calls to backend APIs in the background. Whether the site uses RESTful endpoints or GraphQL, these calls load data dynamically. Instead of the traditional (and sometimes messy) approach of scraping HTML, you can often directly access these APIs to get structured JSON data.

In this post, we’ll walk through how to discover these backend endpoints and replicate the requests, saving you both time and complexity.


1. Use Developer Tools to Inspect Network Requests

Modern browsers come equipped with powerful development tools that can show you every request being made as a webpage loads. Follow these steps:

  1. Open Developer Tools
     In Chrome, Firefox, Edge, or Safari, press F12 or right-click on the page and select Inspect.

  2. Navigate to the “Network” Tab
     This tab displays network activity including AJAX calls, fetch requests, and XHRs.

  3. Reload the Page
     As the page reloads, you’ll see each network request appear in real time. Look for requests returning JSON (they might have “application/json” in the Content-Type header, or you may see “graphql” in the URL).

  4. Inspect Each Request
     Click on the request to see its details:
     • Headers (e.g., Authorization, User-Agent)
     • Query Params (e.g., ?page=2&limit=20)
     • Request Body (for POST/PUT)
     You’ll often find URLs like:
     https://api.example.com/v1/some-resource

     or GraphQL endpoints like:
     https://api.example.com/graphql
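The JSON-spotting rule of thumb from the steps above can be written as a tiny filter. This is just a sketch; the function name and heuristic are ours, and real pipelines may need stricter checks:

```python
def looks_like_api_call(url, content_type):
    """Flag a captured request as a likely API call: a JSON Content-Type
    or a graphql path, per the inspection steps above."""
    ct = (content_type or "").lower()
    return "application/json" in ct or "graphql" in url.lower()

print(looks_like_api_call("https://api.example.com/graphql", None))  # True
print(looks_like_api_call("https://example.com/about", "text/html"))  # False
```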
    

2. Identify the Necessary Request Details

To replicate an API call outside the browser, you’ll need:

  • URL/Endpoint
    Example: https://api.example.com/v1/users?sort=desc
  • HTTP Method
    (GET, POST, PUT, DELETE, etc.)
  • Headers
    Look for authentication tokens, custom headers, or user-agent strings that might be required.
  • Query Parameters
    Anything after ? in the URL, such as page=2&limit=20.
  • Body/Payload (for POST or PUT)
    In GraphQL, you might see a JSON body containing:
    {
      "query": "...",
      "variables": { ... }
    }
    
  • Cookies or Tokens
    Some APIs require session cookies or Bearer tokens to authenticate or keep track of user sessions.
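Putting those pieces together, here is a standard-library sketch of assembling a GraphQL POST. The endpoint, query, and token placeholder are illustrative; the request is built but deliberately not sent:

```python
import json
import urllib.request

# Hypothetical GraphQL payload: query plus variables, as described above
payload = {
    "query": "query Products($page: Int) { products(page: $page) { id name } }",
    "variables": {"page": 1},
}

req = urllib.request.Request(
    "https://api.example.com/graphql",  # hypothetical endpoint
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <TOKEN_IF_NEEDED>",
    },
    method="POST",
)

# Send with urllib.request.urlopen(req) once a real token is in place.
print(req.get_method())  # POST
```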

3. Recreate the Request With a Tool or Script

Once you’ve gathered the request info, you can reproduce it using various tools or libraries:

  1. cURL or Postman
     Postman is a graphical tool that simplifies testing APIs. In Chrome DevTools, you can often right-click a request and choose Copy as cURL to get a ready-to-paste command.

  2. Programming Libraries
     Python (requests):
     import requests

     headers = {
       'Authorization': 'Bearer <TOKEN_IF_NEEDED>',
       'User-Agent': 'Mozilla/5.0 ...'
     }

     response = requests.get(
       'https://api.example.com/v1/endpoint',
       headers=headers
     )
     print(response.json())

     Node.js (axios):
     const axios = require('axios');

     axios.get('https://api.example.com/v1/endpoint', {
       headers: {
         'Authorization': 'Bearer <TOKEN_IF_NEEDED>',
         'User-Agent': 'Mozilla/5.0 ...',
       }
     })
     .then(response => {
       console.log(response.data);
     })
     .catch(error => {
       console.error(error);
     });

     These examples make it easy to authenticate and include headers or JSON bodies.

4. Understand Potential Security and Anti-Bot Measures

When dealing with APIs, be aware that:

  • Rate Limiting
    The site may allow only a certain number of requests per minute/hour/day.
  • API Keys or Tokens
    You might need a key, sometimes embedded in the front-end code. Check for domain restrictions or usage limits.
  • CSRF Tokens / Cookies
    Some requests need a valid session or a dynamically generated token for security.
  • CAPTCHA / Bot Detection
    If the site has advanced bot protection, you may encounter CAPTCHAs or behavioral detection (Cloudflare, reCAPTCHA, etc.).
  • Obfuscated Calls
    Rarely, sites encrypt or obfuscate requests to hide internal endpoints.

Pro Tip: If an “API key” is found in the front-end code or request payloads, handle it responsibly. Using that key outside its intended context could lead to blocks or legal issues if it violates the site’s policies.
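One way to stay on the right side of rate limits is exponential backoff between retries. The sketch below retries any callable; the RuntimeError stands in for an HTTP 429-style failure, and the flaky stub exists only so the demo runs offline:

```python
import time

def fetch_with_backoff(fetch, max_retries=4, base_delay=1.0):
    """Retry a callable with exponential backoff: a simple way to back off
    when an API signals it is rate-limiting you."""
    for attempt in range(max_retries):
        try:
            return fetch()
        except RuntimeError:  # stand-in for an HTTP 429/5xx error
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Offline demo: a stub that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return {"ok": True}

print(fetch_with_backoff(flaky, base_delay=0.01))  # {'ok': True}
```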


5. Use Proxies or Browser Emulation If Needed

For sites that employ stricter anti-scraping measures:

  • Proxies
    Configure your client or scripts to send requests through proxies (if permitted by the site’s terms of service).
  • Browser Emulation
    Tools like Selenium or Puppeteer can fully emulate user interactions, including JavaScript execution, cookies, and dynamic tokens.
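For plain scripts, proxies can be wired in at the HTTP-client level. A standard-library sketch follows; the proxy address is hypothetical and nothing is actually sent:

```python
import urllib.request

# Hypothetical proxy address: substitute your own (and honor the site's ToS).
proxy = urllib.request.ProxyHandler({
    "http": "http://user:pass@proxy.example.com:8080",
    "https": "http://user:pass@proxy.example.com:8080",
})
opener = urllib.request.build_opener(proxy)

# opener.open("https://api.example.com/v1/endpoint") would route through the proxy.
```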

Always ensure you:

  1. Review the site’s Terms of Service
    Some sites explicitly forbid automated calls or direct API usage.
  2. Check robots.txt
    Though not legally binding, it often indicates how the site prefers bots to behave.
  3. Avoid Violating Privacy Laws
    Make sure you’re not collecting personal data illegally.
  4. Watch Out for Intellectual Property Protections
    Even if endpoints aren’t strictly protected, they might still be covered by usage restrictions.

Example Real-World Flow

  1. Visit example.com.
  2. Open DevTools → Network.
  3. Observe requests. Suppose you see something like:
    GET https://api.example.com/v1/products?page=1&limit=20
    
  4. Right-click → Copy as cURL
    Then paste into your terminal:
    curl 'https://api.example.com/v1/products?page=1&limit=20' \
    -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)' \
    -H 'Accept: application/json' \
    --compressed
    
  5. Check the JSON response. If it works as expected, you can integrate it into your automation or data processing pipeline.
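Once a call like the products request above works, it usually slots into a paging loop. Here is a sketch with a stubbed page-fetcher so it runs offline; the endpoint shape (page/limit, short final page) mirrors the hypothetical API in this flow:

```python
def fetch_all_products(fetch_page, limit=20):
    """Page through a hypothetical /v1/products-style endpoint until a
    short page signals the end. fetch_page(page, limit) -> list of dicts."""
    items, page = [], 1
    while True:
        batch = fetch_page(page, limit)
        items.extend(batch)
        if len(batch) < limit:
            return items
        page += 1

# Offline demo: 45 fake products served 20 per page.
DATA = [{"id": i} for i in range(45)]
def fake_page(page, limit):
    start = (page - 1) * limit
    return DATA[start:start + limit]

print(len(fetch_all_products(fake_page)))  # 45
```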

Key Takeaways

  1. APIs Power Most Modern Front-Ends
    Scraping HTML is often unnecessary if you can directly fetch structured data from an endpoint.
  2. Efficiency & Reliability
    Direct API calls give you JSON or other machine-readable formats, which are more robust than parsing HTML.
  3. Mind Legal & Ethical Boundaries
    Always respect the site’s policies and relevant laws.
  4. Start Slowly
    Test a few requests to gauge how the API behaves, then scale your approach responsibly.

By following these steps, you can harness the power of backend APIs for faster, cleaner, and more direct data access—all while staying within site policies and best practices. Let me know if you have any questions or experiences to share in the comments below!