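Python selenium: scraping traffic violation data from 交管12123 (Traffic Management 12123) by simulating web page clicks
The previous article, "Python Tutorial: Scraping a Vehicle Positioning System by Simulating Web Page Clicks", explained how to scrape vehicle location data by simulating clicks. This article shows how to use the same approach to log in to 交管12123 and scrape vehicle violation data. It goes straight through the process; for explanations of the commands used, see the previous article. Like that article, this is a real scraping case from a company, so it may be useful background if you plan to work in the automotive industry.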
Tools: Spyder, the selenium library, Google Chrome, and the chromedriver.exe that matches your Chrome version
Result (screenshot)
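Note: this case is shared to help peers free up their hands and manage company assets more effectively. The URL, account, and password have been removed from the program in this article. One troublesome point with this site is that other popups may appear when you first click to log in. They change the element paths, so the program raises an error because it cannot find the expected path; simply run the program again. Besides logging in by simulated clicks, you can also log in to the page directly with a Cookie, which bypasses the tedious login steps.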
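The Cookie login mentioned in the note is not shown in the original program. A minimal sketch of one way to do it follows, assuming the site accepts a saved session: the cookies are dumped to a JSON file after one manual login and added back on later runs. The helper names save_cookies and load_cookies and the file path are hypothetical; get_cookies() and add_cookie() are the standard selenium WebDriver methods.

```python
import json


def save_cookies(driver, path):
    """Dump the cookies of the current browser session to a JSON file."""
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(driver.get_cookies(), fp)


def load_cookies(driver, path):
    """Add previously saved cookies to the session and return how many were added."""
    with open(path, encoding="utf-8") as fp:
        cookies = json.load(fp)
    for cookie in cookies:
        driver.add_cookie(cookie)  # the browser must already be on the cookie's domain
    return len(cookies)
```

To use it, open the site with driver.get(...) first so the browser is on the right domain, call load_cookies(driver, 'cookies.json'), then driver.refresh(). Whether this skips the captcha depends on how long the site keeps a session valid.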
Import the libraries
from selenium import webdriver
import time
import csv
import datetime
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
import math
import xlrd
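Read the license plates to query
# Note: xlrd 2.0 and later read only .xls files; opening an .xlsx needs an older xlrd (such as 1.2.0) or a different reader
data = xlrd.open_workbook('cheliang.xlsx')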
Create the browser and open the page
opt = webdriver.ChromeOptions()  # create the browser options
# opt.add_argument('--headless')  # headless (no window) mode
driver = webdriver.Chrome(options=opt)  # create the browser object
driver.maximize_window()  # maximize the window
print('Opening the page')
driver.get('')  # open the page (URL removed)
In order: click organization login, enter the account and password, click the captcha input area to trigger the captcha image, tick the checkbox, enter the captcha, and click login
# find_element(By.XPATH, ...) is used because Selenium 4.3+ no longer provides find_element_by_xpath
time.sleep(3)  # wait for loading
print('Clicking organization login')
time.sleep(3)  # wait for loading
driver.find_element(By.XPATH, '/html/body/div[1]/div[2]/div/div[2]/div[2]/button').click()  # click organization login
time.sleep(3)  # wait for loading
print('Filling in the account')
elem = driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[1]/div/input')
elem.clear()  # clear existing content
elem.send_keys('')  # enter the account (removed)
time.sleep(1)  # wait for loading
print('Filling in the password')
elem = driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[2]/div/input')
elem.clear()  # clear existing content
elem.send_keys('')  # enter the password (removed)
time.sleep(1)  # wait for loading
print('Showing the captcha')
driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[3]/div/input').click()  # show the captcha
print('Please enter the captcha')
yanzhengma = input()  # typed by hand in the console
time.sleep(1)  # wait for loading
driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[4]/div/label/input').click()  # tick the checkbox
time.sleep(1)  # wait for loading
# enter the captcha
elem = driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[3]/div/input')
elem.clear()
elem.send_keys(str(yanzhengma))
time.sleep(1)  # wait for loading
print('Logging in')
driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[5]/button').click()  # click login
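The note near the top says the login step can fail when a popup shifts the element paths, and that rerunning the program fixes it. As a sketch of automating that rerun for a single step, the helper below retries a callable a few times before giving up. The name retry and its parameters are hypothetical and not part of the original program, and retrying only helps if the popup disappears between attempts.

```python
import time


def retry(action, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Call action() until it succeeds or the attempts run out.

    Returns whatever action() returns; re-raises the last error
    if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise
            time.sleep(delay)  # give the page time to settle before retrying
```

With selenium, a step could be wrapped as retry(lambda: driver.find_element(By.XPATH, xpath).click(), exceptions=(NoSuchElementException,)), where NoSuchElementException is imported from selenium.common.exceptions.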
Click the violation query and set the query date
time.sleep(3)  # wait for loading
driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/ul/li[5]/a').click()  # click violation query
time.sleep(1)  # wait for loading
driver.find_element(By.XPATH, '/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[1]/div/div[1]/span/i').click()  # open the date picker
for i in range(3):
    time.sleep(0.5)  # wait for loading
    driver.find_element(By.XPATH, '/html/body/div[6]/div[4]/table/thead/tr/th[1]/i').click()  # click the header arrow
time.sleep(0.5)  # wait for loading
driver.find_element(By.XPATH, '/html/body/div[6]/div[4]/table/tbody/tr/td/span[1]').click()  # click
time.sleep(0.5)  # wait for loading
driver.find_element(By.XPATH, '/html/body/div[6]/div[3]/table/tbody/tr[2]/td[1]').click()  # click
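Loop over the license plates and query each one's violations in turn. Each time, clear the previous input and enter the current plate, then read how many records there are and work out the number of pages. Each page shows at most 10 records, so also compute how many records are on the last page.
for ii in range(0, nrows):
    rowValues = table.row_values(ii)  # the cell values of one row
    print('Reading vehicle ' + str(ii + 1))
    # enter the license plate
    time.sleep(0.5)  # wait for loading
    elem = driver.find_element(By.XPATH, '/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[3]/div/input')
    elem.clear()
    elem.send_keys(rowValues)  # type the plate (the row is expected to hold only the plate)
    time.sleep(0.1)  # wait for loading
    driver.find_element(By.XPATH, '/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[4]/button').click()  # click query
    time.sleep(0.5)  # wait for loading
    result = driver.find_element(By.XPATH, '/html/body/div[3]/div/div[2]/div[2]/div[1]/div/p/span').text  # total number of violations
    result = int(result)
    a = math.ceil(result / 10)  # total number of pages
    b = result % 10  # remainder: records on the last page (0 means the last page is full)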
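The page arithmetic above (total pages from a ceiling division, last-page size from the remainder) can be checked in isolation. The sketch below returns the number of rows to read on each page; page_sizes is a hypothetical helper name, and the default of 10 rows per page comes from the description above.

```python
import math


def page_sizes(total, per_page=10):
    """Return a list with the number of rows on each result page."""
    pages = math.ceil(total / per_page)
    return [min(per_page, total - p * per_page) for p in range(pages)]


print(page_sizes(23))  # three pages: two full ones and a last page of 3
print(page_sizes(20))  # two full pages, remainder 0
print(page_sizes(0))   # no records, so no pages to read
```

Written this way, a vehicle with no violations yields an empty list, so no rows are read at all for it.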
Read the data from the list. The deducted points and the fine require clicking "view details" and reading the values from the popup.
result1 = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//table[@id="my-msg-list"]/tbody/tr[' + str(j) + ']/td[1]'))).text
result2 = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//table[@id="my-msg-list"]/tbody/tr[' + str(j) + ']/td[2]'))).text
result3 = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//table[@id="my-msg-list"]/tbody/tr[' + str(j) + ']/td[3]'))).text
result4 = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//table[@id="my-msg-list"]/tbody/tr[' + str(j) + ']/td[4]'))).text
result5 = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//table[@id="my-msg-list"]/tbody/tr[' + str(j) + ']/td[5]'))).text
result6 = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//table[@id="my-msg-list"]/tbody/tr[' + str(j) + ']/td[6]'))).text
result7 = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//table[@id="my-msg-list"]/tbody/tr[' + str(j) + ']/td[7]'))).text
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//table[@id="my-msg-list"]/tbody/tr[' + str(j) + ']/td[8]/a'))).click()  # view details, opens the popup
time.sleep(1)  # wait for loading
result8 = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//form[@class="form-horizontal"]/div[7]/span[2]'))).text
result9 = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//form[@class="form-horizontal"]/div[8]/span[2]'))).text
result = [result1, result2, result3, result4, result5, result6, result7, result8, result9]
R.append(result)
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//div[@class="modal-footer ui_modal"]/button'))).click()  # close the popup
time.sleep(0.5)  # wait for loading
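The seven cell lookups above differ only in the column index, so the XPath strings can be built by a small function instead of being typed out. This is a sketch of that idea; cell_xpath and row_xpaths are hypothetical names, while the table id my-msg-list and the column layout are taken from the code above.

```python
TABLE_ID = "my-msg-list"  # id of the result table, as used in the article's XPaths


def cell_xpath(row, col, link=False):
    """Build the XPath of one table cell, or of the link inside it."""
    xpath = f'//table[@id="{TABLE_ID}"]/tbody/tr[{row}]/td[{col}]'
    return xpath + "/a" if link else xpath


def row_xpaths(row, n_cols=7):
    """XPaths of the plain-text cells (columns 1 to n_cols) of one row."""
    return [cell_xpath(row, col) for col in range(1, n_cols + 1)]


print(cell_xpath(3, 2))
print(cell_xpath(1, 8, link=True))
```

Each returned string can be passed to EC.element_to_be_clickable((By.XPATH, ...)) exactly like the literal strings in the original code.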
After each vehicle's data is read, write it to the file
with open(wenjian, 'w', encoding='utf-8', newline='') as fp:
    writer = csv.writer(fp)
    writer.writerows(R)  # write the data
Complete code
from selenium import webdriver
import time
import csv
import datetime
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
import math
import xlrd

# Note: xlrd 2.0 and later read only .xls files
data = xlrd.open_workbook('cheliang.xlsx')
table = data.sheets()[0]
nrows = table.nrows  # number of rows
ncols = table.ncols  # number of columns

opt = webdriver.ChromeOptions()  # create the browser options
# opt.add_argument('--headless')  # headless (no window) mode
driver = webdriver.Chrome(options=opt)  # create the browser object
driver.maximize_window()  # maximize the window

print('Opening the page')
driver.get('')  # open the page (URL removed)

# find_element(By.XPATH, ...) is used because Selenium 4.3+ no longer provides find_element_by_xpath
time.sleep(3)  # wait for loading
print('Clicking organization login')
time.sleep(3)  # wait for loading
driver.find_element(By.XPATH, '/html/body/div[1]/div[2]/div/div[2]/div[2]/button').click()  # click organization login

time.sleep(3)  # wait for loading
print('Filling in the account')
elem = driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[1]/div/input')
elem.clear()  # clear existing content
elem.send_keys('')  # enter the account (removed)

time.sleep(1)  # wait for loading
print('Filling in the password')
elem = driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[2]/div/input')
elem.clear()  # clear existing content
elem.send_keys('')  # enter the password (removed)

time.sleep(1)  # wait for loading
print('Showing the captcha')
driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[3]/div/input').click()  # show the captcha
print('Please enter the captcha')
yanzhengma = input()  # typed by hand in the console

time.sleep(1)  # wait for loading
driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[4]/div/label/input').click()  # tick the checkbox

time.sleep(1)  # wait for loading
# enter the captcha
elem = driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[3]/div/input')
elem.clear()
elem.send_keys(str(yanzhengma))

time.sleep(1)  # wait for loading
print('Logging in')
driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[5]/button').click()  # click login

time.sleep(3)  # wait for loading
driver.find_element(By.XPATH, '/html/body/div[4]/div/div[1]/ul/li[5]/a').click()  # click violation query

time.sleep(1)  # wait for loading
driver.find_element(By.XPATH, '/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[1]/div/div[1]/span/i').click()  # open the date picker

for i in range(3):
    time.sleep(0.5)  # wait for loading
    driver.find_element(By.XPATH, '/html/body/div[6]/div[4]/table/thead/tr/th[1]/i').click()  # click the header arrow

time.sleep(0.5)  # wait for loading
driver.find_element(By.XPATH, '/html/body/div[6]/div[4]/table/tbody/tr/td/span[1]').click()  # click

time.sleep(0.5)  # wait for loading
driver.find_element(By.XPATH, '/html/body/div[6]/div[3]/table/tbody/tr[2]/td[1]').click()  # click

wenjian = datetime.datetime.now().strftime('%Y-%m-%d-%H%M%S')  # use the start time as the export file name
wenjian = wenjian + '.csv'


def wait_for(xpath):
    # wait up to 10 s until the element at xpath is clickable, then return it
    return WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, xpath)))


def read_row(j):
    # read row j of the result table; the same steps were repeated three times in the original listing
    row = '//table[@id="my-msg-list"]/tbody/tr[' + str(j) + ']'
    result = [wait_for(row + '/td[' + str(k) + ']').text for k in range(1, 8)]
    wait_for(row + '/td[8]/a').click()  # view details, opens the popup
    time.sleep(1)  # wait for loading
    result.append(wait_for('//form[@class="form-horizontal"]/div[7]/span[2]').text)
    result.append(wait_for('//form[@class="form-horizontal"]/div[8]/span[2]').text)
    wait_for('//div[@class="modal-footer ui_modal"]/button').click()  # close the popup
    time.sleep(0.5)  # wait for loading
    return result


R = []
for ii in range(0, nrows):
    rowValues = table.row_values(ii)  # the cell values of one row
    print('Reading vehicle ' + str(ii + 1))
    # enter the license plate
    time.sleep(0.5)  # wait for loading
    elem = driver.find_element(By.XPATH, '/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[3]/div/input')
    elem.clear()
    elem.send_keys(rowValues)  # type the plate (the row is expected to hold only the plate)
    time.sleep(0.1)  # wait for loading
    driver.find_element(By.XPATH, '/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[4]/button').click()  # click query
    time.sleep(0.5)  # wait for loading
    result = driver.find_element(By.XPATH, '/html/body/div[3]/div/div[2]/div[2]/div[1]/div/p/span').text  # total number of violations
    result = int(result)
    a = math.ceil(result / 10)  # total number of pages
    b = result % 10  # remainder: records on the last page (0 means the last page is full)
    for page in range(1, a + 1):
        # every page holds 10 rows except a partial last page; no pages are read when result is 0
        rows_on_page = b if (page == a and b > 0) else 10
        for j in range(1, rows_on_page + 1):
            R.append(read_row(j))
        if page < a:
            driver.find_element(By.LINK_TEXT, '下一页').click()  # next page; the text must match the site's link
            time.sleep(0.5)  # wait for loading
    time.sleep(0.5)  # wait for loading
    with open(wenjian, 'w', encoding='utf-8', newline='') as fp:
        writer = csv.writer(fp)
        writer.writerows(R)  # write the data
