Get the HTML source of a WebElement in Selenium WebDriver using Python

2022-07-11 17:13:16, 玩技站长


I am using the Python bindings to run Selenium WebDriver:

from selenium import webdriver
wd = webdriver.Firefox()

I know I can grab a web element like this:

elem = wd.find_element_by_css_selector('#my-id')

I know I can get the full page source with...

wd.page_source

But is there any way to get the "element source"?

elem.source   # <-- returns the HTML as a string

The Selenium WebDriver documentation for Python is basically non-existent, and I don't see anything in the code that seems to support this.

What is the best way to access the HTML of an element (and its children)?


automated-tests
Python


Comments: 18, visitors: 18

Phillip 9
2022-07-11 17:06:10 Unknown region 10F
I hope this could help:
http://selenium.googlecode.com/svn/trunk/docs/api/java/org/openqa/selenium/WebElement.html
Here is described Java method:
java.lang.String getText()
Python has no method of that name (the Python bindings expose it as the text property), and it returns the element's visible text rather than its markup. So you can translate the Java method names to Python and work with the methods that exist, without fetching the whole page source.
E.g.
my_id = elem[0].get_attribute('my-id')
Peter Mortensen 9
2022-07-11 17:06:10 Unknown region 9F
innerHTML returns the markup inside the selected element, and outerHTML returns that markup together with the selected element's own tag.
Example:
Now suppose your Element is as below
<tr id="myRow"><td>A</td><td>B</td></tr>
innerHTML element output
<td>A</td><td>B</td>
outerHTML element output
<tr id="myRow"><td>A</td><td>B</td></tr>
Live Example:
http://www.java2s.com/Tutorials/JavascriptDemo/f/find_out_the_difference_between_innerhtml_and_outerhtml_in_javascript_example.htm
Below is the syntax for each language binding. Change innerHTML to outerHTML as required.
Python:
element.get_attribute('innerHTML')
Java:
elem.getAttribute("innerHTML");
If you want whole page HTML, use the below code:
driver.getPageSource();
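A minimal Python sketch of the inner/outer distinction described above. The helper name element_html and its include_self flag are illustrative assumptions, not part of Selenium; the only Selenium call it relies on is the element's get_attribute method.

```python
def element_html(element, include_self=True):
    """Return an element's markup as a string.

    `element` is expected to be a Selenium WebElement (any object with a
    get_attribute method works). The names `element_html` and
    `include_self` are hypothetical and used for illustration only.
    """
    # outerHTML includes the element's own tag; innerHTML is only its content.
    prop = "outerHTML" if include_self else "innerHTML"
    return element.get_attribute(prop)
```

For the myRow example, element_html(row) would be expected to give the whole tr markup, and element_html(row, include_self=False) only the two td cells.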
WltrRpo 9
2022-07-11 17:06:09 Unknown region 8F
Java with Selenium 2.53.0
driver.getPageSource();
undetected Selenium 9
2022-07-11 17:06:09 Unknown region 7F
The other answers provide a lot of detail about retrieving the markup of a WebElement. However, an important point is that modern websites increasingly use JavaScript, ReactJS, jQuery, Ajax, Vue.js, Ember.js, GWT, etc. to render dynamic elements within the DOM tree. Hence you need to wait for the element and its children to render completely before retrieving the markup.
Python
So ideally you should use WebDriverWait with visibility_of_element_located() and then read the markup in either of the following ways:
Using get_attribute("outerHTML"):
element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#my-id")))
print(element.get_attribute("outerHTML"))
Using execute_script():
element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#my-id")))
print(driver.execute_script("return arguments[0].outerHTML;", element))
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
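To make the idea of an explicit wait concrete, here is a rough standard-library sketch of the polling that such a wait performs. It is a stand-in for illustration, not Selenium's implementation: the name wait_for_outer_html and its parameters are assumptions, and it only relies on the element methods is_displayed and get_attribute.

```python
import time


def wait_for_outer_html(find_elements, timeout=20.0, poll=0.5):
    """Poll until the first matching element is displayed, then return its outerHTML.

    `find_elements` is a hypothetical zero-argument callable that returns a
    list of matching elements (an empty list while nothing matches yet).
    """
    deadline = time.monotonic() + timeout
    while True:
        matches = find_elements()
        # Only read the markup once the element exists and is visible.
        if matches and matches[0].is_displayed():
            return matches[0].get_attribute("outerHTML")
        if time.monotonic() >= deadline:
            raise TimeoutError("element did not become visible in time")
        time.sleep(poll)
```

With Selenium this could be called as wait_for_outer_html(lambda: driver.find_elements(By.CSS_SELECTOR, "#my-id")). The sketch does not handle an element that goes stale between the lookup and the read, so in real tests WebDriverWait remains the better tool.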
Peter Mortensen 9
2022-07-11 17:06:09 Unknown region 6F
It looks outdated, but let it be here anyway. The correct way to do it in your case:
elem = wd.find_element_by_css_selector('#my-id')
html = wd.execute_script("return arguments[0].innerHTML;", elem)
or
html = elem.get_attribute('innerHTML')
Both are working for me (selenium-server-standalone-2.35.0).
Peter Mortensen 9
2022-07-11 17:06:09 Unknown region 5F
Using the attribute method is, in fact, easier and more straightforward.
Using Ruby with the Selenium and PageObject gems, to get the class associated with a certain element, the line would be element.attribute(Class).
The same concept applies if you wanted to get other attributes tied to the element. For example, if I wanted the string of an element, element.attribute(String).
Ajinkya 9
2022-07-11 17:06:09 Unknown region 4F
In Ruby, using selenium-webdriver (2.32.1), there is a page_source method that contains the entire page source.
Samuel RIGAUD 9
2022-07-11 17:06:09 Unknown region 3F
Sure, we can get the whole page's HTML source with the script below in Selenium Python:
elem = driver.find_element_by_xpath("//*")
source_code = elem.get_attribute("outerHTML")
If you want to save it to a file:
with open('c:/html_source_code.html', 'w', encoding='utf-8') as f:
    f.write(source_code)
I suggest saving to a file because the source code is very long.
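Saving the markup can be sketched in Python 3 as follows. The helper name save_html is illustrative. It hands the text and an explicit encoding to the file API instead of encoding by hand, because a file opened in text mode expects str rather than bytes.

```python
from pathlib import Path


def save_html(source_code, path):
    """Write the markup to `path` as UTF-8 and return the number of characters written.

    `save_html` is a hypothetical helper name used for illustration.
    """
    # write_text opens the file in text mode, encodes the string, and closes the file.
    return Path(path).write_text(source_code, encoding="utf-8")
```

For example, save_html(source_code, "html_source_code.html") would write the page markup next to the script.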
Peter Mortensen 9
2022-07-11 17:06:09 Unknown region 2F
There is not really a straightforward way of getting the HTML source code of a WebElement. You will have to use JavaScript. I am not too sure about the Python bindings, but you can easily do it like this in Java. I am sure there must be something similar to the JavascriptExecutor class in Python.
WebElement element = driver.findElement(By.id("foo"));
String contents = (String)((JavascriptExecutor)driver).executeScript("return arguments[0].innerHTML;", element);
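In the Python bindings the counterpart of JavascriptExecutor is the driver's execute_script method, as other answers in this thread also use. A minimal sketch follows; the helper name inner_html_via_script is an assumption made for illustration.

```python
def inner_html_via_script(driver, element):
    """Python counterpart of the Java JavascriptExecutor snippet.

    `driver` is expected to be a Selenium WebDriver and `element` a
    WebElement found through it; `inner_html_via_script` is a hypothetical
    helper name.
    """
    # Extra arguments to execute_script are exposed to the script as arguments[0], ...
    return driver.execute_script("return arguments[0].innerHTML;", element)
```

Swapping innerHTML for outerHTML in the script string would return the element's own tag as well.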
Peter Mortensen 9
2022-07-11 17:06:08 Unknown region 1F
You can read the innerHTML attribute to get the source of the content of the element or outerHTML for the source with the current element.
Python:
element.get_attribute('innerHTML')
Java:
elem.getAttribute("innerHTML");
C#:
element.GetAttribute("innerHTML");
Ruby:
element.attribute("innerHTML")
JavaScript:
element.getAttribute('innerHTML');
PHP:
$element->getAttribute('innerHTML');
It was tested and worked with the ChromeDriver.