Getting the HTML source of a WebElement in Selenium WebDriver using Python

I'm using the Python bindings to run Selenium WebDriver:

from selenium import webdriver
wd = webdriver.Firefox()

I know I can grab a WebElement like this:

elem = wd.find_element_by_css_selector('#my-id')

And I know I can get the source of the full page with...

wd.page_source

But is there a way to get the "element source"?

elem.source   # <-- returns the HTML as a string

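The Selenium WebDriver documentation for Python is basically non-existent, and I don't see anything in the code that seems to enable this.

What is the best way to access the HTML of an element (and its children)?

Current answer

In PHPUnit Selenium tests, it looks like this:

$text = $this->byCssSelector('.some-class-name')->attribute('innerHTML');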

2014-05-30 10:25:21

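Other answers

Use execute_script to get the HTML

bs4 (BeautifulSoup) also gives quick access to HTML tags.

from bs4 import BeautifulSoup

# `driver` is an already-running Selenium WebDriver instance
html = driver.execute_script("return document.documentElement.outerHTML")
soup = BeautifulSoup(html, "html.parser")
# "atag" and "attribute" are placeholders for a tag name and a class name
matches = soup.find_all("atag", class_="attribute")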

2021-09-11 02:49:56

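Updated for 2022: retrieving HTML with Selenium

First, download the Python bindings for Selenium WebDriver.

You can do this from the Selenium package's PyPI page, or install the package with pip. Python 3.4 and later bundle pip with the standard installers.

Method 1

Read the innerHTML attribute to get the source of the element's content. innerHTML is a property of a DOM element whose value is the HTML between its opening and closing tags.

For example, in the markup below the innerHTML property holds the value "a text" (plus the surrounding line breaks):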

<p>
a text
</p>

element.get_attribute('innerHTML')

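Method 2

Read outerHTML to get the source including the current element itself. outerHTML is an element property whose value is the HTML between the opening and closing tags plus the HTML of the selected element itself.

For example, the outerHTML property of the markup below holds a value containing both the div and the span.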

<div>
<span>Hello there!</span>
</div>

element.get_attribute("outerHTML")
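
The two methods above differ only in which property is read, so they can be folded into one small helper. The following is a minimal sketch, not part of the original answer: the names `element_html`, `include_self`, and `FakeElement` are hypothetical, and `FakeElement` is only a stand-in that lets the snippet run without a browser. With a live session you would pass a real WebElement instead.

```python
def element_html(element, include_self=True):
    """Return the HTML of a Selenium WebElement as a string.

    include_self=True reads outerHTML (the element's own tag plus its children);
    include_self=False reads innerHTML (only what sits between the tags).
    """
    name = "outerHTML" if include_self else "innerHTML"
    return element.get_attribute(name)


class FakeElement:
    """Hypothetical stand-in for a WebElement, so the sketch runs without a browser."""

    def __init__(self, tag, inner):
        self._tag = tag
        self._inner = inner

    def get_attribute(self, name):
        if name == "innerHTML":
            return self._inner
        if name == "outerHTML":
            return f"<{self._tag}>{self._inner}</{self._tag}>"
        return None


div = FakeElement("div", "<span>Hello there!</span>")
print(element_html(div, include_self=False))  # <span>Hello there!</span>
print(element_html(div))                      # <div><span>Hello there!</span></div>
```

The helper only relies on `get_attribute`, which is the same call both methods use, so nothing else about the element needs to be known.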

2022-12-15 10:51:06

WebElement element = driver.findElement(By.id("foo"));
String contents = (String)((JavascriptExecutor)driver).executeScript("return arguments[0].innerHTML;", element);

This code also works for getting JavaScript from the source!

2012-08-31 04:04:51

In PHP Selenium WebDriver, you can get the page source like this:

$html = $driver->getPageSource();

Or get the HTML of an element like this:

// use 'innerHTML' instead if you only need the HTML of the element's content
$html = $element->getDomProperty('outerHTML');

2021-12-22 07:51:44

It looks outdated, but let it stay here anyway. The correct way to do it in your case:

elem = wd.find_element_by_css_selector('#my-id')
html = wd.execute_script("return arguments[0].innerHTML;", elem)

or

html = elem.get_attribute('innerHTML')

Both work for me (selenium-server-standalone-2.35.0).
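
For readers on a current Selenium release, here is a hedged sketch of the JavaScript route. The `find_element_by_css_selector` helper used in this thread was removed in Selenium 4.3, so the sketch locates the element with `find_element(By.CSS_SELECTOR, ...)`. The function names `element_html_via_js` and `demo`, and the URL, are assumptions made for illustration; `'#my-id'` comes from the question.

```python
def element_html_via_js(driver, element, include_self=True):
    """Return an element's HTML by evaluating JavaScript in the browser."""
    prop = "outerHTML" if include_self else "innerHTML"
    # The WebElement is handed to the script as arguments[0].
    return driver.execute_script(f"return arguments[0].{prop};", element)


def demo():
    """Hypothetical end-to-end run; needs a working local Firefox setup."""
    # Imported lazily so the helper above does not require a browser session.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    wd = webdriver.Firefox()
    try:
        wd.get("https://example.com")  # placeholder URL
        elem = wd.find_element(By.CSS_SELECTOR, "#my-id")  # selector from the question
        print(element_html_via_js(wd, elem))
    finally:
        wd.quit()
```

Calling `demo()` opens a real browser, so it is left uninvoked here. The helper itself works with any driver object that provides `execute_script`.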

2014-03-06 14:52:17
