`woob.browser.selenium`¶

class SeleniumBrowser(logger=None, proxy=None, responses_dirname=None, weboob=None, proxy_headers=None, preferences=None, remote_driver_url=None, woob=None)[source]¶

Bases: object

Browser similar to PagesBrowser, but using Selenium.

URL instances can be used, and so can the need_login decorator.

Differences:

- since JS code can be run anytime, the current url and page can change anytime
- it’s not possible to use open(), only location() can be used
- many options are not implemented yet (like proxies) or cannot be implemented at all

DRIVER¶

Selenium driver class

alias of WebDriver

HEADLESS = True¶: Run without any display

DEFAULT_WAIT = 10¶: Default wait time for wait_* methods

WINDOW_SIZE = None¶

Rendering window size

It can be useful for responsive websites which show or hide elements depending on the viewport size.

BASEURL = None¶

MAX_SAVED_RESPONSES = 1073741824¶

get_proxy_url(url)[source]¶

deinit()[source]¶

property url¶

property page¶

open(*args, **kwargs)[source]¶: Raises NotImplementedError.

location(url, data=None, headers=None, params=None, method=None, json=None, timeout=None)[source]¶

Change current url of the browser.

Warning: unlike other requests-based woob browsers, this function does not block until the page is loaded; it is completely asynchronous. To use the new page content, it is necessary to wait, either implicitly (e.g. with the implicit_wait() context manager) or explicitly (e.g. with the wait_until() method).

export_session()[source]¶

save_response_if_changed()[source]¶

save_response()[source]¶

absurl(uri, base=None)[source]¶

wait_xpath(xpath, timeout=None)[source]¶

wait_xpath_visible(xpath, timeout=None)[source]¶

wait_xpath_invisible(xpath, timeout=None)[source]¶

wait_xpath_clickable(xpath, timeout=None)[source]¶

wait_until_is_here(urlobj, timeout=None)[source]¶

wait_until(condition, timeout=None)[source]¶

Wait until some condition object is met

Wraps WebDriverWait. See https://seleniumhq.github.io/selenium/docs/api/py/webdriver_support/selenium.webdriver.support.wait.html

See CustomCondition.

Parameters:: timeout – wait time in seconds; if None, DEFAULT_WAIT is used (default: None)
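The polling idea behind an explicit wait can be sketched in plain Python. This is a minimal stand-in written for illustration, not the code of woob or of Selenium's WebDriverWait: `poll_until`, `WaitTimeout`, `FakeDriver` and the `poll_interval` parameter are hypothetical names, and the "driver" is simply whatever object gets passed to the condition.

```python
import time

DEFAULT_WAIT = 10  # mirrors the DEFAULT_WAIT attribute documented above


class WaitTimeout(Exception):
    """Hypothetical stand-in for the timeout error raised by a real wait."""


def poll_until(driver, condition, timeout=None, poll_interval=0.05):
    """Call condition(driver) repeatedly until it returns a truthy value."""
    if timeout is None:
        timeout = DEFAULT_WAIT  # same fallback rule as described for wait_until
    deadline = time.monotonic() + timeout
    while True:
        result = condition(driver)
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met after %s seconds" % timeout)
        time.sleep(poll_interval)


class FakeDriver:
    """Fake driver whose page becomes 'ready' on the third poll."""

    def __init__(self):
        self.polls = 0

    def page_ready(self):
        self.polls += 1
        return self.polls >= 3


driver = FakeDriver()
result = poll_until(driver, lambda d: d.page_ready(), timeout=1)
```

The condition is an ordinary callable receiving the driver, which is the same shape as the condition objects described further down this page.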

implicitly_wait(timeout)[source]¶

Set implicit wait time

When querying anything in the DOM in Selenium, like evaluating an XPath, if it is not found, Selenium will wait in a blocking manner until it is found or until the implicit wait times out. By default, the implicit wait is 0, so if an XPath is not found, the query fails immediately.

Parameters:: timeout – new implicit wait time in seconds

implicit_wait(timeout)[source]¶

Context manager to change implicit wait time and restore it

Example:

with browser.implicit_wait(10):
    # Within this block, the implicit wait will be set to 10 seconds
    # and be restored at the end of block.
    # If the link is not found immediately, it will be periodically
    # retried until found (for max 10 seconds).
    el = self.find_element_by_link_text("Show list")
    el.click()
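The set-then-restore behaviour of such a context manager can be sketched as follows. This is an illustrative stand-in under stated assumptions, not woob's implementation: `FakeDriver` and `SketchBrowser` are hypothetical classes, and the fake driver only records the value it is given instead of talking to a real browser.

```python
from contextlib import contextmanager


class FakeDriver:
    """Hypothetical stand-in that only records the implicit wait time."""

    def __init__(self):
        self.implicit_wait_seconds = 0  # default of 0, as stated above

    def implicitly_wait(self, timeout):
        self.implicit_wait_seconds = timeout


class SketchBrowser:
    """Hypothetical browser exposing the two methods documented above."""

    def __init__(self):
        self.driver = FakeDriver()
        self._implicit_wait = 0

    def implicitly_wait(self, timeout):
        # Remember the value so that it can be restored later.
        self._implicit_wait = timeout
        self.driver.implicitly_wait(timeout)

    @contextmanager
    def implicit_wait(self, timeout):
        old = self._implicit_wait
        self.implicitly_wait(timeout)
        try:
            yield
        finally:
            # Restore the previous value even if the block raised.
            self.implicitly_wait(old)


browser = SketchBrowser()
with browser.implicit_wait(10):
    inside = browser.driver.implicit_wait_seconds
after = browser.driver.implicit_wait_seconds
```

Restoring in a `finally` clause means an exception inside the block does not leave a long implicit wait active for later queries.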

in_frame(selector)[source]¶

Context manager to execute a block inside a frame and restore main page after.

In Selenium, to operate on a frame’s content, one must first switch to the frame and then return to the main page afterwards.

Parameters:: selector – selector to match the frame

Example:

with self.in_frame(xpath_locator('//frame[@id="foo"]')):
    el = self.find_element_by_xpath('//a[@id="bar"]')
    el.click()

get_storage()[source]¶

Get localStorage content for current domain.

As with cookies, this method only manipulates data for the current domain. It’s not possible to get all localStorage content at once. To get localStorage for multiple domains, the browser must change the URL to each domain and call get_storage after each change. To do so, it’s wise to choose a neutral URL (like an image file or JS file) so that the target page itself does not modify the stored data.

update_storage(d)[source]¶

Update local storage content for current domain.

It has the same restrictions as get_storage.

clear_storage()[source]¶: Clear local storage.

class SeleniumPage(browser)[source]¶

Bases: object

Page to use in a SeleniumBrowser

Differences with regular woob Pages:

- cannot access raw HTML text

logged = False¶

property doc¶

is_here()[source]¶

Method to determine if the browser is on this page and the page is ready.

Use XPath and page content to determine if we are on this page. Make sure the page is “ready” for the usage we want. For example, if there’s a splash screen in front of the page, preventing clicks, it should return False.

is_here can be a method or a CustomCondition instance.
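The two ways of declaring is_here can be illustrated with a small stand-alone sketch. Everything here is hypothetical and written only for illustration: the `CustomCondition` class is a local stand-in for the one documented on this page, `TitleIs`, `LoginPage`, `HomePage` and `page_is_here` are invented names, and a plain dict plays the role of the driver. How woob itself evaluates is_here is not shown in this page, so the dispatch below is an assumption.

```python
class CustomCondition:
    """Local stand-in for the marker class documented on this page."""

    def __call__(self, driver):
        raise NotImplementedError()


class TitleIs(CustomCondition):
    """Hypothetical condition: true when the fake driver has this title."""

    def __init__(self, title):
        self.title = title

    def __call__(self, driver):
        return driver["title"] == self.title


class LoginPage:
    # is_here declared as a condition object: it receives the driver.
    is_here = TitleIs("Login")


class HomePage:
    def __init__(self, driver):
        self.driver = driver

    # is_here declared as a plain method: it also checks that no splash
    # screen is in front of the page, so that the page is really "ready".
    def is_here(self):
        return self.driver["title"] == "Home" and not self.driver["splash"]


def page_is_here(page, driver):
    """Evaluate is_here whichever way it was declared (assumed dispatch)."""
    if isinstance(page.is_here, CustomCondition):
        return page.is_here(driver)
    return page.is_here()


login_driver = {"title": "Login", "splash": False}
splash_driver = {"title": "Home", "splash": True}
home_driver = {"title": "Home", "splash": False}
```

The marker base class is what lets a helper like `page_is_here` tell a condition object apart from a bound method.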

class HTMLPage(*args, **kwargs)[source]¶

Bases: HTMLPage

ENCODING: ClassVar[str | None] = 'utf-8'¶: Force a page encoding. It is recommended to use None for autodetection.

class CustomCondition[source]¶

Bases: object

Abstract condition class

In Selenium, waiting is done on callable objects named “conditions”. Basically, a condition is a function predicate returning True if some condition is met.

The builtin Selenium conditions are in the selenium.webdriver.support.expected_conditions module.

This class exists to differentiate normal methods from condition objects when calling SeleniumPage.is_here.

See https://seleniumhq.github.io/selenium/docs/api/py/webdriver_support/selenium.webdriver.support.expected_conditions.html

When using selenium.webdriver.support.expected_conditions, it’s better to wrap them using WrapException.

class AnyCondition(*conditions)[source]¶

Bases: CustomCondition

Condition that is true if any of several conditions is true.

class AllCondition(*conditions)[source]¶

Bases: CustomCondition

Condition that is true if all of several conditions are true.

class NotCondition(condition)[source]¶

Bases: CustomCondition

Condition that tests the inverse of another condition.
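How such combinators compose can be sketched in a few lines. The classes below reuse the documented names for readability but are local re-implementations written for illustration, not woob's source; the predicates `has_title` and `has_spinner` are hypothetical, and a dict stands in for the driver.

```python
class CustomCondition:
    """Local stand-in: a condition is a callable taking the driver."""

    def __call__(self, driver):
        raise NotImplementedError()


class AnyCondition(CustomCondition):
    def __init__(self, *conditions):
        self.conditions = tuple(conditions)

    def __call__(self, driver):
        # True as soon as one sub-condition is met.
        return any(cond(driver) for cond in self.conditions)


class AllCondition(CustomCondition):
    def __init__(self, *conditions):
        self.conditions = tuple(conditions)

    def __call__(self, driver):
        # True only if every sub-condition is met.
        return all(cond(driver) for cond in self.conditions)


class NotCondition(CustomCondition):
    def __init__(self, condition):
        self.condition = condition

    def __call__(self, driver):
        return not self.condition(driver)


def has_title(driver):
    """Hypothetical predicate: the fake page has a non-empty title."""
    return bool(driver.get("title"))


def has_spinner(driver):
    """Hypothetical predicate: a loading spinner is still displayed."""
    return bool(driver.get("spinner", False))


# "The page has a title and no spinner is displayed."
ready = AllCondition(has_title, NotCondition(has_spinner))
either = AnyCondition(has_title, has_spinner)
```

A composed object such as `ready` is itself a condition, so it can be nested again or handed to a wait function.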

class IsHereCondition(urlobj)[source]¶

Bases: CustomCondition

Condition that is true if a page “is here”.

This condition is to be passed to SeleniumBrowser.wait_until. It mustn’t be used in a SeleniumPage.is_here definition.

VisibleXPath(xpath)[source]¶: Wraps visibility_of_element_located

ClickableXPath(xpath)[source]¶: Wraps element_to_be_clickable

ClickableLinkText(text, partial=False)[source]¶: Wraps element_to_be_clickable

HasTextCondition(xpath)[source]¶: Condition to ensure some xpath is visible and contains non-empty text.

class WrapException(condition)[source]¶

Bases: CustomCondition

Wrap Selenium’s builtin expected_conditions to catch exceptions.

Selenium’s builtin expected_conditions return True when a condition is met but might throw exceptions when it’s not met, which might not be desirable.

WrapException wraps such expected_conditions to catch those exceptions and simply return False when one is thrown.
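The wrapping idea can be sketched without Selenium. This is a hedged illustration, not the actual WrapException code: `SwallowErrors`, `ElementMissing` and `button_is_enabled` are hypothetical names, the set of exceptions really caught by woob is not stated in this page, and a dict stands in for the driver.

```python
class ElementMissing(Exception):
    """Hypothetical stand-in for a Selenium lookup error."""


class SwallowErrors:
    """Sketch of the wrapping idea: turn a raised exception into False."""

    def __init__(self, condition, exceptions=(ElementMissing,)):
        self.condition = condition
        self.exceptions = exceptions

    def __call__(self, driver):
        try:
            return self.condition(driver)
        except self.exceptions:
            # "Not met yet" is reported as False instead of an error,
            # so a polling wait can simply try again.
            return False


def button_is_enabled(driver):
    """Raises when the button is absent, as some conditions might."""
    if "button" not in driver:
        raise ElementMissing("no button in DOM")
    return driver["button"] == "enabled"


safe = SwallowErrors(button_is_enabled)
```

Only the listed exception types are swallowed here, so unrelated errors still surface instead of being silently reported as "condition not met".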

xpath_locator(xpath)[source]¶

Creates an XPath locator from a string

Most Selenium functions don’t accept XPaths directly but “locators”. Locators can be XPaths or CSS selectors, for example.

link_locator(text, partial=False)[source]¶

Creates a link text locator from a string

Most Selenium functions don’t accept XPaths directly but “locators”.

Warning: if searched text is not directly in <a> but in one of its children, some webdrivers might not find the link.
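In Selenium, a locator is a (strategy, value) pair. The sketch below assumes that the two helpers documented above build such pairs; that is an assumption about their return value, and the `_sketch` function names are hypothetical. The strategy strings are the values of Selenium's `By.XPATH`, `By.LINK_TEXT` and `By.PARTIAL_LINK_TEXT` constants, copied here so that the example runs without Selenium installed.

```python
# String values of selenium.webdriver.common.by.By constants.
BY_XPATH = "xpath"
BY_LINK_TEXT = "link text"
BY_PARTIAL_LINK_TEXT = "partial link text"


def xpath_locator_sketch(xpath):
    """Build an XPath locator pair from a string."""
    return (BY_XPATH, xpath)


def link_locator_sketch(text, partial=False):
    """Build a link text locator pair; partial=True matches a substring."""
    strategy = BY_PARTIAL_LINK_TEXT if partial else BY_LINK_TEXT
    return (strategy, text)


frame_locator = xpath_locator_sketch('//frame[@id="foo"]')
exact_link = link_locator_sketch("Show list")
partial_link = link_locator_sketch("Show", partial=True)
```

Pairs of this shape are what Selenium's builtin expected conditions such as `visibility_of_element_located` take as their argument.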

class ElementWrapper(wrapped)[source]¶

Bases: object

Wrapper making a Selenium element resemble an lxml element.

Some differences:

- only a subset of lxml’s Element class API is available
- cannot access XPath “text()”, only Elements

See https://seleniumhq.github.io/selenium/docs/api/py/webdriver_remote/selenium.webdriver.remote.webelement.html

xpath(xpath)[source]¶

Returns a list of elements matching xpath.

Since it uses find_elements_by_xpath, it does not raise NoSuchElementException or TimeoutException.

text_content()[source]¶

property text¶

itertext()[source]¶

property attrib¶
