Support connecting to browser using `connect()` instead of `launch()` #892

Kilowhisky · 2025-01-15T05:08:04Z

Is your feature request related to a problem? Please describe.
While automating scraping, I've found that running the local browser inside a Lambda that handles multiple sessions causes problems with the browser when it's built into the app.

Describe the solution you'd like
Playwright lets you set up a remote server and connect to it over WebSockets.

See here for an example:
https://github.com/pixelfactoryio/playwright-server

I would like to see a configuration that lets me specify that it should connect to one of these servers instead of trying to launch the browser locally.

See here for how to `connect()` to the browser instead of `launch()` it:
https://playwright.dev/python/docs/api/class-browsertype#browser-type-connect
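For reference, a minimal sketch of what the `connect()` path looks like with Playwright's async Python API. The helper name `fetch_title` and the endpoint value are placeholders of mine, not anything from ScrapeGraphAI, and the remote server is assumed to be running a Playwright version compatible with the client.

```python
import asyncio


async def fetch_title(url: str, ws_endpoint: str) -> str:
    """Open `url` in a remote Playwright browser and return the page title."""
    # Imported here so the module still loads where Playwright is not installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        # connect() attaches to an already-running Playwright server
        # instead of spawning a local browser process.
        browser = await p.chromium.connect(ws_endpoint)
        try:
            page = await browser.new_page()
            await page.goto(url)
            return await page.title()
        finally:
            # Closing a connected browser drops this client's connection.
            await browser.close()


# Example call (placeholder endpoint, needs a reachable server):
# asyncio.run(fetch_title("https://example.com", "ws://my-browser-host:3000/"))
```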

Describe alternatives you've considered
I've played around with scrape_do and browser_base. They both have some drawbacks, and sometimes it's best to have total control over the browser that is being launched.

Additional context
There seem to be some bugs when running scrapegraph in a long-running process that handles multiple requests. It would be easier if I could place the Chrome server on a machine with a GPU and let scrapegraph control it remotely.

Kilowhisky · 2025-01-15T05:10:17Z

Injection point could be right here:

Scrapegraph-ai/scrapegraphai/docloaders/chromium.py

Lines 349 to 353 in aa6a76e

    
    browser = await p.chromium.launch(
        headless=self.headless,
        proxy=self.proxy,
        **self.browser_config,
    )
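One way the change at that spot could look, as a rough sketch only: branch on an optional WebSocket endpoint and fall back to the current `launch()` call when it is not set. The `ws_endpoint` option and the `open_browser` helper are hypothetical names, not existing ScrapeGraphAI configuration. `browser_type` stands in for `p.chromium`.

```python
import asyncio
from typing import Any, Optional


async def open_browser(
    browser_type: Any,
    *,
    ws_endpoint: Optional[str] = None,  # hypothetical option, not in ScrapeGraphAI today
    headless: bool = True,
    proxy: Optional[dict] = None,
    **browser_config: Any,
) -> Any:
    """Connect to a remote browser if `ws_endpoint` is set, else launch locally."""
    if ws_endpoint:
        # headless and proxy are launch() options; connect() does not accept them,
        # so they are left to however the remote server was started.
        return await browser_type.connect(ws_endpoint)
    # Same call as the existing code path.
    return await browser_type.launch(headless=headless, proxy=proxy, **browser_config)
```

Inside the loader this would be called as something like `browser = await open_browser(p.chromium, ws_endpoint=..., headless=self.headless, proxy=self.proxy, **self.browser_config)`, so existing users who never set the endpoint see no change in behavior.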

dosubot bot added the feature request label Jan 15, 2025
