Powerful Python Packages (10): selenium (a Browser Robot)

Author: 可爱多多少 | Published 2021-09-02 14:02


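1. Introduction to selenium

Selenium is a tool for driving a browser from code. It can be used for browser automation, automated testing, and as a helper for web crawlers.

Everything we do in a browser is done through the mouse and keyboard. Selenium replaces those mouse and keyboard actions with program calls, so the same operations can run automatically.

When writing a crawler with scrapy, we can use selenium to drive a browser that loads the page, and then read the HTML after JavaScript has rendered it, without having to work out how the page loads its data or whether its API calls are encrypted.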

2. selenium at a glance

Browser drivers

By choosing the driver for a specific browser, we can control that browser from code through selenium.

Browser	Code
Chrome	driver = webdriver.Chrome()
IE	driver = webdriver.Ie()
Edge	driver = webdriver.Edge()
Opera	driver = webdriver.Opera()
PhantomJS	driver = webdriver.PhantomJS()
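
A minimal sketch of picking a driver by name, assuming the class names in the table above. `create_driver`, `DRIVER_CLASSES` and the `webdriver_module` parameter are hypothetical names, not part of selenium. Opera and PhantomJS are left out because they are no longer available in recent Selenium 4 releases.

```python
# Browser names from the table, mapped to class names on selenium.webdriver.
DRIVER_CLASSES = {"chrome": "Chrome", "ie": "Ie", "edge": "Edge"}


def create_driver(browser, webdriver_module=None):
    """Start a browser session for `browser` (case-insensitive)."""
    key = browser.strip().lower()
    if key not in DRIVER_CLASSES:
        raise ValueError(f"unsupported browser: {browser!r}")
    if webdriver_module is None:
        # Imported lazily so that the name check above works without selenium.
        from selenium import webdriver as webdriver_module
    driver_class = getattr(webdriver_module, DRIVER_CLASSES[key])
    return driver_class()


# Usage (needs the browser and a matching driver installed):
# driver = create_driver("edge")
# driver.get("https://www.suning.com")
# driver.quit()
```

Passing the module in as a parameter is only there so that the lookup can be tried out without launching a browser.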

Locating elements

Element locators can find any object on the loaded page, much as we would look over a page to find the information we want before taking the next step.

Locator	Code (legacy helper, By-based form)
id	find_element_by_id(), find_element(By.ID, 'id')
name	find_element_by_name(), find_element(By.NAME, 'name')
class	find_element_by_class_name(), find_element(By.CLASS_NAME, 'class_name')
link	find_element_by_link_text(), find_element(By.LINK_TEXT, 'link_text')
tag	find_element_by_tag_name(), find_element(By.TAG_NAME, 'tag_name')
xpath	find_element_by_xpath(), find_element(By.XPATH, 'xpath')
css	find_element_by_css_selector(), find_element(By.CSS_SELECTOR, 'css')
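
As an illustration of the By-based form, here is a small sketch that tries several locators in turn and returns the first match. `find_first` is a hypothetical helper, not a selenium function; it relies only on `driver.find_elements(by, value)`, which returns a list.

```python
def find_first(driver, locators):
    """Return the first element matched by any (strategy, value) pair, else None."""
    for by, value in locators:
        # find_elements returns an empty list instead of raising when nothing matches.
        matches = driver.find_elements(by, value)
        if matches:
            return matches[0]
    return None


# Usage with a real driver:
# from selenium.webdriver.common.by import By
# box = find_first(driver, [
#     (By.ID, "searchKeywords"),
#     (By.CSS_SELECTOR, "#searchKeywords"),
# ])
```

Listing a fallback locator can help when a page's ids change between versions, at the cost of one extra lookup per miss.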

Browser operations

Browser operations act on the browser window itself, such as maximizing and minimizing it.

Operation	Code
Maximize	browser.maximize_window()
Minimize	browser.minimize_window()
Set window size	browser.set_window_size(width, height)
Forward	browser.forward()
Back	browser.back()
Refresh	browser.refresh()
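
The sketch below strings several of these calls together. `tour` and its parameters are hypothetical names chosen for the example; the driver methods are the ones listed in the table.

```python
def tour(driver, first_url, second_url, size=(1280, 800)):
    """Visit two pages, then step back, forward and reload; return the URLs seen."""
    driver.set_window_size(*size)        # width, height in pixels
    seen = []
    driver.get(first_url)
    seen.append(driver.current_url)
    driver.get(second_url)
    seen.append(driver.current_url)
    driver.back()                        # same as the browser's Back button
    seen.append(driver.current_url)
    driver.forward()                     # same as the Forward button
    seen.append(driver.current_url)
    driver.refresh()                     # reload the current page
    return seen
```

With a real browser the recorded URLs may differ from the ones requested if the site redirects.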

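Operating on elements under test

These are the methods most often used in automated tests; they act on an element that has already been located.

Operation	Code
Click	click()
Simulate typing	send_keys()
Clear content	clear()
Submit content	submit()
Get the element's text	text (a property, read without parentheses)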
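
A short sketch of these element calls in combination. `fill_and_submit` and `read_text` are hypothetical helpers; `element` is assumed to be a WebElement returned by one of the locators.

```python
def fill_and_submit(element, text):
    """Replace the content of a form field with `text` and submit its form."""
    element.clear()            # remove whatever is already in the field
    element.send_keys(text)    # type the new text
    element.submit()           # submit the form that contains the field


def read_text(element):
    """Return the visible text of an element, stripped of surrounding whitespace."""
    # `text` is a property, so it is read without parentheses.
    return element.text.strip()
```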

Keyboard events

send_keys() also accepts keyboard events, which is equivalent to pressing special keys.

Key	Code
TAB	send_keys(Keys.TAB)
ENTER	send_keys(Keys.ENTER)
BackSpace	send_keys(Keys.BACK_SPACE)
Space	send_keys(Keys.SPACE)
Esc	send_keys(Keys.ESCAPE)
F1	send_keys(Keys.F1)
F12	send_keys(Keys.F12)
Select all	send_keys(Keys.CONTROL, 'a')
Copy	send_keys(Keys.CONTROL, 'c')
Cut	send_keys(Keys.CONTROL, 'x')
Paste	send_keys(Keys.CONTROL, 'v')
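
A minimal sketch of combining special keys with ordinary text. `replace_with_shortcuts` is a hypothetical helper, and its `keys` argument is assumed to be selenium's `Keys` class, passed in rather than imported so the function can be tried without selenium installed.

```python
def replace_with_shortcuts(element, text, keys):
    """Select everything in a field, delete it, type `text` and press Enter.

    `keys` is expected to be selenium.webdriver.common.keys.Keys.
    """
    element.send_keys(keys.CONTROL, "a")   # Ctrl+A: select all
    element.send_keys(keys.BACK_SPACE)     # delete the selection
    element.send_keys(text)
    element.send_keys(keys.ENTER)


# Usage with a real driver:
# from selenium.webdriver.common.keys import Keys
# replace_with_shortcuts(search_box, "iphone", Keys)
```

On macOS the select-all shortcut uses the Command key, so `Keys.COMMAND` would take the place of `Keys.CONTROL` there.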

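Mouse events

Mouse events can perform anything that can be done with a mouse. They are methods of ActionChains, and the queued actions only run when perform() is called.

Mouse event	Code
Run the queued ActionChains actions	perform()
Right-click	context_click()
Double-click	double_click()
Drag and drop	drag_and_drop()
Hover	move_to_element()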
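
The following sketch chains mouse actions and then runs them. The two helper names are hypothetical; `actions` is assumed to be an `ActionChains(driver)` instance, whose action methods return the chain itself so that calls can be strung together.

```python
def hover_and_right_click(actions, menu, item):
    """Hover over `menu`, then right-click `item`.

    `actions` is expected to be an ActionChains(driver) instance.
    """
    # Calls are only queued here; nothing reaches the browser until perform().
    actions.move_to_element(menu).context_click(item).perform()


def drag(actions, source, target):
    """Drag `source` onto `target` with the left mouse button."""
    actions.drag_and_drop(source, target).perform()


# Usage with a real driver:
# from selenium.webdriver import ActionChains
# hover_and_right_click(ActionChains(driver), menu_element, item_element)
```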

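Switching windows and frames

When several pages or frames are open, the switching methods change which window or frame the following commands act on.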
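
A minimal sketch of both kinds of switch, using `driver.window_handles` and `driver.switch_to`. The helper names `switch_to_new_window` and `run_in_frame` are hypothetical.

```python
def switch_to_new_window(driver, known_handles):
    """Switch to the first window whose handle is not in `known_handles`.

    Returns the new handle, or None if no new window has been opened.
    """
    for handle in driver.window_handles:
        if handle not in known_handles:
            driver.switch_to.window(handle)
            return handle
    return None


def run_in_frame(driver, frame, action):
    """Run `action(driver)` inside a frame, then return to the main document."""
    driver.switch_to.frame(frame)          # frame: index, name/id, or WebElement
    try:
        return action(driver)
    finally:
        driver.switch_to.default_content()
```

A typical call records `set(driver.window_handles)` before clicking a link that opens a new tab, and passes that set as `known_handles` afterwards.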

Getting information for assertions
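
A minimal sketch, assuming that the information meant here is what a test usually checks: the page title (`driver.title`) and the current address (`driver.current_url`). `check_page` is a hypothetical helper.

```python
def check_page(driver, title_part, url_part):
    """Raise AssertionError unless the title and URL contain the expected parts."""
    if title_part not in driver.title:
        raise AssertionError(f"unexpected title: {driver.title!r}")
    if url_part not in driver.current_url:
        raise AssertionError(f"unexpected URL: {driver.current_url!r}")


# Usage with a real driver, after a search has been submitted:
# check_page(driver, "iphone", "suning.com")
```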

Cookie operations
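
A minimal sketch of reading and replacing cookies with `get_cookies()`, `delete_all_cookies()` and `add_cookie()`. The helper names are hypothetical, and the sketch assumes that only each cookie's name and value matter.

```python
def snapshot_cookies(driver):
    """Return the current page's cookies as a {name: value} dict."""
    return {cookie["name"]: cookie["value"] for cookie in driver.get_cookies()}


def replace_cookies(driver, cookies):
    """Delete all cookies, then set those in the {name: value} dict `cookies`."""
    driver.delete_all_cookies()
    for name, value in cookies.items():
        # add_cookie needs at least the "name" and "value" keys.
        driver.add_cookie({"name": name, "value": value})
```

Cookies can only be added for the domain of the page that is currently loaded, so `driver.get(...)` has to come before `replace_cookies`.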

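3. Using selenium in crawlers

selenium is used in crawlers mainly to solve a problem that scrapy cannot solve on its own: getting the HTML of a page after JavaScript has rendered it.

In the earlier article on the scrapy library, we saw that downloader middlewares sit between the engine and the downloader, and that every request and response passes through them on the way to and from the downloader that fetches the page source. For pages rendered by JavaScript this default download path is not enough, because it returns only the raw HTML. In that case selenium, plugged in through a downloader middleware, takes over the job of fetching the page.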
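
The idea can be sketched as a downloader middleware whose `process_request` loads the page in a browser and hands the rendered HTML back to scrapy. This is an illustrative sketch, not the code from the earlier article: the class name is hypothetical, and the driver and the response class (for example `scrapy.http.HtmlResponse`) are passed in so that the logic is visible on its own.

```python
class SeleniumMiddleware:
    """Downloader middleware that lets a browser fetch every request."""

    def __init__(self, driver, response_class):
        self.driver = driver                  # a selenium WebDriver
        self.response_class = response_class  # e.g. scrapy.http.HtmlResponse

    def process_request(self, request, spider):
        # Let the browser load the page and run its JavaScript.
        self.driver.get(request.url)
        # Returning a response here makes scrapy skip its own downloader.
        return self.response_class(
            url=self.driver.current_url,
            body=self.driver.page_source,
            encoding="utf-8",
            request=request,
        )
```

In a real project the driver would normally be created in the middleware's `from_crawler` method and the class enabled under `DOWNLOADER_MIDDLEWARES` in the settings; both steps are omitted here.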

The main workflow of selenium in a crawler is shown in the figure below:

[Figure: workflow of selenium in a crawler]

"""Search Suning for iphone"""

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.edge.service import Service
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

# Selenium 4: the driver path is passed through a Service object
driver = webdriver.Edge(service=Service(executable_path='msedgedriver.exe'))
driver.get('https://www.suning.com')

search_box = driver.find_element(By.ID, 'searchKeywords')

search_box.clear()                    # empty the search box first
search_box.send_keys('iphone')
search_box.send_keys(Keys.RETURN)     # press Enter to start the search

# Wait up to 10 seconds for the results container to appear
wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'root990')))
print(driver.page_source)

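"""Scroll the page down automatically"""

import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.edge.service import Service

driver = webdriver.Edge(service=Service(executable_path='msedgedriver.exe'))
driver.get('https://www.suning.com/')
time.sleep(4)    # crude fixed pause while the home page loads

search_box = driver.find_element(By.ID, 'searchKeywords')
search_box.clear()
search_box.send_keys('iphone')
search_box.send_keys(Keys.RETURN)
# Run JavaScript in the page to jump to the bottom
driver.execute_script('window.scrollTo(0,document.body.scrollHeight)')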
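
A single `scrollTo` call reaches only the bottom of what has loaded so far. On pages that keep loading content as you scroll, a common approach is to repeat the scroll until the page height stops growing. The sketch below shows that idea; `scroll_to_bottom` and its `pause` and `max_rounds` parameters are hypothetical names and values.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll down until the page height stops growing; return the rounds used."""
    last_height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(pause)  # give lazy-loaded content time to arrive
        height = driver.execute_script("return document.body.scrollHeight")
        if height == last_height:
            return rounds
        last_height = height
    return max_rounds
```

The `max_rounds` cap keeps the loop from running forever on pages that never stop growing.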

"""Locate elements"""

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.edge.service import Service

driver = webdriver.Edge(service=Service(executable_path='msedgedriver.exe'))
driver.get('https://www.suning.com/')

# Four locators that all point at the same search box
input_id = driver.find_element(By.ID, 'searchKeywords')
input_name = driver.find_element(By.NAME, 'index1_none_search_ss2')
input_xpath = driver.find_element(By.XPATH, "//input[@id='searchKeywords']")
input_css = driver.find_element(By.CSS_SELECTOR, '#searchKeywords')
print(input_id, input_name, input_xpath, input_css)

"""Wait for the page to finish loading"""

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.edge.service import Service

driver = webdriver.Edge(service=Service(executable_path='msedgedriver.exe'))

# Set the page-load timeout (in seconds)
driver.set_page_load_timeout(5)
try:
    driver.get('https://www.suning.com/')
    driver.execute_script('window.scrollTo(0,document.body.scrollHeight)')
    print(driver.page_source)
except TimeoutException:
    print('timeout')
driver.quit()

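"""Implicit wait"""

from selenium import webdriver
from selenium.webdriver.edge.service import Service

driver = webdriver.Edge(service=Service(executable_path='msedgedriver.exe'))
# Element lookups keep retrying for up to 5 seconds before failing
driver.implicitly_wait(5)
driver.get("https://www.suning.com/")
print(driver.page_source)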


"""Explicit wait"""

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.edge.service import Service
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

driver = webdriver.Edge(service=Service(executable_path='msedgedriver.exe'))
driver.get('https://www.suning.com/')

try:
    # Wait up to 10 seconds for the search box to be present in the DOM
    search_box = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "searchKeywords")))
    print(search_box)
except TimeoutException:
    print('time out!')
driver.quit()
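
To make clear what an explicit wait does, here is a plain-Python sketch of the polling loop behind it. It is not selenium's implementation (`WebDriverWait` also ignores `NoSuchElementException` while polling, for example); `wait_until` is a hypothetical function and its defaults are chosen for the example.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call `condition()` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Usage with a real driver:
# from selenium.webdriver.common.by import By
# box = wait_until(lambda: driver.find_elements(By.ID, "searchKeywords"))[0]
```

Unlike an implicit wait, which applies to every element lookup, an explicit wait is attached to one specific condition.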

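On its own, the scrapy framework only fetches the HTML that the server returns and does not execute JavaScript. To crawl dynamic pages, combine it with the selenium library so that the JavaScript is rendered first.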

Closing words

Feel free to follow the WeChat official account 人类之奴!
Let's learn and improve together!
