woob.browser.selenium
- class SeleniumBrowser(logger=None, proxy=None, responses_dirname=None, weboob=None, proxy_headers=None, preferences=None, remote_driver_url=None, woob=None)[source]¶
Bases: object
Browser similar to PagesBrowser, but using Selenium.
URL instances can be used. The need_login decorator can be used too.
Differences:
- since JS code can be run anytime, the current url and page can change anytime
- it’s not possible to use open(), only location() can be used
- many options are not implemented yet (like proxies) or cannot be implemented at all
- DRIVER¶
Selenium driver class
alias of WebDriver
- HEADLESS = True¶
Run without any display
- DEFAULT_WAIT = 10¶
Default wait time for wait_* methods
- WINDOW_SIZE = None¶
Rendering window size
It can be useful for responsive websites which show or hide elements depending on the viewport size.
- BASEURL = None¶
- MAX_SAVED_RESPONSES = 1073741824¶
- property url¶
- property page¶
- open(*args, **kwargs)[source]¶
Raises NotImplementedError.
- location(url, data=None, headers=None, params=None, method=None, json=None, timeout=None)[source]¶
Change current url of the browser.
Warning: unlike the requests-based woob browsers, this function does not block until the page is loaded; it’s completely asynchronous. To use the new page content, it’s necessary to wait, either implicitly (e.g. with the context manager implicit_wait()) or explicitly (e.g. using the method wait_until()).
- wait_until(condition, timeout=None)[source]¶
Wait until some condition object is met
Wraps WebDriverWait. See https://seleniumhq.github.io/selenium/docs/api/py/webdriver_support/selenium.webdriver.support.wait.html
See CustomCondition.
- Parameters:
timeout – wait time in seconds (else DEFAULT_WAIT if None) (default: None)
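The explicit-wait pattern behind wait_until() can be illustrated with a small polling loop. This is a minimal sketch, not woob's or Selenium's implementation: `poll_until`, `FakeDriver` and `title_is_ready` are hypothetical names, and the stand-in driver exists only so the example runs without a browser.

```python
import time


class FakeDriver:
    """Hypothetical stand-in for a driver whose page becomes ready after a few polls."""

    def __init__(self, ready_after):
        self.polls = 0
        self.ready_after = ready_after

    @property
    def title(self):
        # Each read counts as one poll of the page state.
        self.polls += 1
        return "Loaded" if self.polls >= self.ready_after else ""


def poll_until(driver, condition, timeout=10.0, poll_frequency=0.01):
    """Call condition(driver) repeatedly until it returns a truthy value.

    Raises TimeoutError if the condition is still falsy once the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition(driver)
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        time.sleep(poll_frequency)


def title_is_ready(driver):
    # A condition is simply a predicate that takes the driver.
    return driver.title == "Loaded"


driver = FakeDriver(ready_after=3)
result = poll_until(driver, title_is_ready, timeout=1.0)
```

With a real browser, the same idea applies after location(): the condition is evaluated repeatedly until the page reaches the expected state or the timeout expires.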
- implicitly_wait(timeout)[source]¶
Set implicit wait time
When querying anything in the DOM with Selenium, such as evaluating an XPath, if the target is not found, Selenium will block until it is found or until the implicit wait times out. By default, the implicit wait is 0, so if an XPath is not found, the query fails immediately.
- Parameters:
timeout – new implicit wait time in seconds
- implicit_wait(timeout)[source]¶
Context manager to change implicit wait time and restore it
Example:
with browser.implicit_wait(10):
    # Within this block, the implicit wait will be set to 10 seconds
    # and be restored at the end of block.
    # If the link is not found immediately, it will be periodically
    # retried until found (for max 10 seconds).
    el = self.find_element_link_text("Show list")
    el.click()
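The set-and-restore behaviour described above can be sketched with contextlib. This is only an illustration under stated assumptions: `WaitingBrowser` and `FakeDriver` are hypothetical stand-ins, and the only real Selenium call mirrored here is the driver's `implicitly_wait(seconds)` method. The sketch tracks the current value itself so that it can restore it afterwards.

```python
from contextlib import contextmanager


class FakeDriver:
    """Hypothetical stand-in that records calls to implicitly_wait()."""

    def __init__(self):
        self.calls = []

    def implicitly_wait(self, seconds):
        self.calls.append(seconds)


class WaitingBrowser:
    """Hypothetical browser that remembers the implicit wait it last set."""

    def __init__(self, driver):
        self.driver = driver
        self.current_wait = 0  # matches the default of 0 described above

    def implicitly_wait(self, timeout):
        self.current_wait = timeout
        self.driver.implicitly_wait(timeout)

    @contextmanager
    def implicit_wait(self, timeout):
        old = self.current_wait
        self.implicitly_wait(timeout)
        try:
            yield
        finally:
            # Restore the previous value even if the block raised.
            self.implicitly_wait(old)


browser = WaitingBrowser(FakeDriver())
with browser.implicit_wait(10):
    inside = browser.current_wait
after = browser.current_wait
```

Restoring in a `finally` clause keeps a failed lookup inside the block from leaving a long implicit wait active for later queries.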
- in_frame(selector)[source]¶
Context manager to execute a block inside a frame and restore main page after.
In selenium, to operate on a frame’s content, one needs to switch to the frame before and return to main page after.
- Parameters:
selector – selector to match the frame
Example:
with self.in_frame(xpath_locator('//frame[@id="foo"]')):
    el = self.find_element_by_xpath('//a[@id="bar"]')
    el.click()
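The switch-and-return sequence can be sketched as a context manager. The code below is an assumption about the general pattern, not woob's actual implementation: `FakeDriver` and `FakeSwitchTo` are hypothetical recorders, while `find_element(by, value)`, `switch_to.frame(...)` and `switch_to.default_content()` mirror the names of real Selenium driver calls.

```python
from contextlib import contextmanager


class FakeSwitchTo:
    """Hypothetical recorder mimicking driver.switch_to."""

    def __init__(self, log):
        self.log = log

    def frame(self, frame_reference):
        self.log.append(("frame", frame_reference))

    def default_content(self):
        self.log.append(("default_content",))


class FakeDriver:
    """Hypothetical driver that logs calls instead of driving a browser."""

    def __init__(self):
        self.log = []
        self.switch_to = FakeSwitchTo(self.log)

    def find_element(self, by, value):
        self.log.append(("find_element", by, value))
        return "frame-element"


@contextmanager
def in_frame(driver, locator):
    # Resolve the frame element, enter it, and always return to the main page.
    frame = driver.find_element(*locator)
    driver.switch_to.frame(frame)
    try:
        yield
    finally:
        driver.switch_to.default_content()


driver = FakeDriver()
with in_frame(driver, ("xpath", '//frame[@id="foo"]')):
    driver.log.append(("work inside frame",))
```

Note that this sketch always returns to the top-level document, so it does not handle nested frames.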
- get_storage()[source]¶
Get localStorage content for current domain.
As for cookies, this method only manipulates data for current domain. It’s not possible to get all localStorage content. To get localStorage for multiple domains, the browser must change the url to each domain and call get_storage each time after. To do so, it’s wise to choose a neutral URL (like an image file or JS file) to avoid the target page itself changing the cookies.
- class SeleniumPage(browser)[source]¶
Bases: object
Page to use in a SeleniumBrowser
Differences with regular woob Pages:
- cannot access raw HTML text
- logged = False¶
- property doc¶
- is_here()[source]¶
Method to determine if the browser is on this page and the page is ready.
Use XPath and page content to determine if we are on this page. Make sure the page is “ready” for the intended usage. For example, if a splash screen in front of the page prevents clicks, it should return False.
is_here can be a method or a CustomCondition instance.
- class HTMLPage(*args, **kwargs)[source]¶
Bases: HTMLPage
- ENCODING: ClassVar[str | None] = 'utf-8'¶
Force a page encoding. It is recommended to use None for autodetection.
- class CustomCondition[source]¶
Bases: object
Abstract condition class
In Selenium, waiting is done on callable objects named “conditions”. Basically, a condition is a function predicate returning True if some condition is met.
The builtin selenium conditions are in selenium.webdriver.support.expected_conditions.
This class exists to differentiate normal methods from condition objects.
See https://seleniumhq.github.io/selenium/docs/api/py/webdriver_support/selenium.webdriver.support.expected_conditions.html
When using selenium.webdriver.support.expected_conditions, it’s better to wrap them using WrapException.
- class AnyCondition(*conditions)[source]¶
Bases: CustomCondition
Condition that is true if any of several conditions is true.
- class AllCondition(*conditions)[source]¶
Bases: CustomCondition
Condition that is true if all of several conditions are true.
- class NotCondition(condition)[source]¶
Bases: CustomCondition
Condition that tests the inverse of another condition.
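How such combinators compose can be sketched with plain callables. The classes below are hypothetical stand-ins (`AnyOf`, `AllOf`, `Negate`), not the woob classes themselves, and a dict plays the role of the driver so the example runs without a browser.

```python
class AnyOf:
    """True if at least one wrapped condition is truthy."""

    def __init__(self, *conditions):
        self.conditions = conditions

    def __call__(self, driver):
        return any(cond(driver) for cond in self.conditions)


class AllOf:
    """True only if every wrapped condition is truthy."""

    def __init__(self, *conditions):
        self.conditions = conditions

    def __call__(self, driver):
        return all(cond(driver) for cond in self.conditions)


class Negate:
    """True when the wrapped condition is falsy."""

    def __init__(self, condition):
        self.condition = condition

    def __call__(self, driver):
        return not self.condition(driver)


def has_login_form(state):
    return state.get("login_form", False)


def has_spinner(state):
    return state.get("spinner", False)


# "Ready" means the form is present and the spinner is gone.
ready = AllOf(has_login_form, Negate(has_spinner))

state = {"login_form": True, "spinner": True}
ready_while_loading = ready(state)
state["spinner"] = False
ready_after_loading = ready(state)
either = AnyOf(has_login_form, has_spinner)(state)
```

Because each combinator is itself a condition, combinations can be nested and passed anywhere a single condition is expected.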
- class IsHereCondition(urlobj)[source]¶
Bases: CustomCondition
Condition that is true if a page “is here”.
This condition is to be passed to SeleniumBrowser.wait_until. It mustn’t be used in a SeleniumPage.is_here definition.
- HasTextCondition(xpath)[source]¶
Condition to ensure some xpath is visible and contains non-empty text.
- class WrapException(condition)[source]¶
Bases: CustomCondition
Wrap Selenium’s builtin expected_conditions to catch exceptions.
Selenium’s builtin expected_conditions return True when a condition is met but might throw exceptions when it’s not met, which might not be desirable.
WrapException wraps such expected_conditions to catch those exceptions and simply return False when such an exception is thrown.
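The wrapping idea can be sketched as follows. This is an illustration only: `SafeCondition` and `ElementMissing` are hypothetical names, and which Selenium exceptions the real WrapException catches is not stated here, so the caught exception types are left configurable.

```python
class ElementMissing(Exception):
    """Hypothetical stand-in for a Selenium lookup exception."""


class SafeCondition:
    """Wrap a condition so that selected exceptions count as 'not met'."""

    def __init__(self, condition, exceptions=(ElementMissing,)):
        self.condition = condition
        self.exceptions = exceptions

    def __call__(self, driver):
        try:
            return self.condition(driver)
        except self.exceptions:
            # The condition could not be evaluated yet: report it as unmet.
            return False


def button_is_enabled(dom):
    # Raises when the element is absent, like some builtin conditions may do.
    if "button" not in dom:
        raise ElementMissing("button")
    return dom["button"] == "enabled"


safe = SafeCondition(button_is_enabled)
missing = safe({})
present = safe({"button": "enabled"})
```

Returning False instead of raising lets a polling loop keep retrying until the element appears or the timeout expires.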
- xpath_locator(xpath)[source]¶
Creates an XPath locator from a string
Most Selenium functions don’t accept XPaths directly but “locators”. Locators can be XPaths or CSS selectors, among others.
- link_locator(text, partial=False)[source]¶
Creates a link text locator from a string
Most Selenium functions don’t accept XPaths directly but “locators”.
Warning: if the searched text is not directly in <a> but in one of its children, some webdrivers might not find the link.
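In Selenium, a locator is a (strategy, value) pair. The sketch below builds such pairs by hand; the strategy strings mirror the constants of `selenium.webdriver.common.by.By`, while `make_xpath_locator` and `make_link_locator` are hypothetical helpers. Whether woob's xpath_locator() and link_locator() return exactly these tuples is an assumption.

```python
# Strategy strings mirroring By.XPATH, By.LINK_TEXT and By.PARTIAL_LINK_TEXT.
XPATH = "xpath"
LINK_TEXT = "link text"
PARTIAL_LINK_TEXT = "partial link text"


def make_xpath_locator(xpath):
    """Build a (strategy, value) pair for an XPath expression."""
    return (XPATH, xpath)


def make_link_locator(text, partial=False):
    """Build a (strategy, value) pair matching a link by its text."""
    strategy = PARTIAL_LINK_TEXT if partial else LINK_TEXT
    return (strategy, text)


frame_locator = make_xpath_locator('//frame[@id="foo"]')
link = make_link_locator("Show list")
partial_link = make_link_locator("Show", partial=True)
```

A pair like this can be unpacked into a lookup call that takes a strategy and a value, which is why helper functions that produce locators are convenient.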
- class ElementWrapper(wrapped)[source]¶
Bases: object
Wrapper around a Selenium element to resemble lxml.
Some differences:
- only a subset of lxml’s Element class is available
- cannot access XPath “text()”, only Elements
- xpath(xpath)[source]¶
Returns a list of elements matching xpath.
Since it uses find_elements_by_xpath, it does not raise NoSuchElementException or TimeoutException.
- property text¶
- property attrib¶