BeautifulSoup Usage, Part 2

Author: suntwo | Published: 2019-05-04 16:38

title: BeautifulSoup Usage, Part 2
date: 2019-03-04 17:48:15
tags:



Relational selection

Child nodes

Example code:

from bs4 import BeautifulSoup
data="""<div class="subnav">
  <ul class="navbar">
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:7}"
          href="/board/7"
      >热映口碑榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:6}"
          href="/board/6"
      >最受期待榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:1}"
          data-state-val="{subnavId:1}"
          class="active" href="javascript:void(0);"
      >国内票房榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:2}"
          href="/board/2"
      >北美票房榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:4}"
          href="/board/4"
      >TOP100榜</a>
    </li>
  </ul>
</div>
"""

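# Parse the markup with Python's built-in 'html.parser'
data = BeautifulSoup(data, 'html.parser')
# .contents holds the direct children of the first <ul> as a list
print(type(data.ul.contents))
print(data.ul.contents)

Output: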

<class 'list'>
['\n', <li>
<a data-act="subnav-click" data-val="{subnavClick:7}" href="/board/7">热映口碑榜</a>
</li>, '\n', <li>
<a data-act="subnav-click" data-val="{subnavClick:6}" href="/board/6">最受期待榜</a>
</li>, '\n', <li>
<a class="active" data-act="subnav-click" data-state-val="{subnavId:1}" data-val="{subnavClick:1}" href="javascript:void(0);">国内票房榜</a>
</li>, '\n', <li>
<a data-act="subnav-click" data-val="{subnavClick:2}" href="/board/2">北美票房榜</a>
</li>, '\n', <li>
<a data-act="subnav-click" data-val="{subnavClick:4}" href="/board/4">TOP100榜</a>
</li>, '\n']

Explanation:

type(data.ul.contents)
<class 'list'>

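As shown, data.ul.contents returns a list of the direct children of the tag. The list holds more than tags: each <li> child is a bs4.element.Tag object, while the '\n' entries between them are whitespace text nodes (bs4.element.NavigableString). That is why index 1, not index 0, is the first <li>: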

print(type(data.ul.contents[1]))
<class 'bs4.element.Tag'>
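
Because the whitespace text nodes show up in .contents, it is often handy to keep only the tag children. The sketch below shows two ways to do that. It uses a shortened version of the navbar markup (an assumption made for brevity), and the names html, soup, tag_children and li_children are illustrative only.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Shortened version of the navbar markup (assumed here for brevity)
html = """<ul class="navbar">
<li><a href="/board/7">热映口碑榜</a></li>
<li><a href="/board/6">最受期待榜</a></li>
</ul>"""

soup = BeautifulSoup(html, 'html.parser')

# .children iterates over the same nodes that .contents stores in a list;
# keeping only Tag instances drops the '\n' text nodes
tag_children = [child for child in soup.ul.children if isinstance(child, Tag)]

# find_all(..., recursive=False) restricts the search to direct children
li_children = soup.ul.find_all('li', recursive=False)

print(len(soup.ul.contents))                 # whitespace nodes included
print(len(tag_children))                     # <li> tags only
print([li.a['href'] for li in li_children])
```

Filtering with isinstance keeps every kind of child tag, whereas find_all('li', recursive=False) keeps only the <li> children, so the second form is the better fit when the tag name is known in advance.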

Parent and ancestor nodes

Example:

from bs4 import BeautifulSoup
data="""<div class="subnav">
  <ul class="navbar">
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:7}"
          href="/board/7"
      >热映口碑榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:6}"
          href="/board/6"
      >最受期待榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:1}"
          data-state-val="{subnavId:1}"
          class="active" href="javascript:void(0);"
      >国内票房榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:2}"
          href="/board/2"
      >北美票房榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:4}"
          href="/board/4"
      >TOP100榜</a>
    </li>
    
  </ul>
  
</div>
<div>hfdfd</div>
"""

data = BeautifulSoup(data, 'html.parser')
# .parent is the direct parent of the first <ul>
print(type(data.ul.parent))
print(data.ul.parent)

Output:

<class 'bs4.element.Tag'>
<div class="subnav">
<ul class="navbar">
<li>
<a data-act="subnav-click" data-val="{subnavClick:7}" href="/board/7">热映口碑榜</a>
</li>
<li>
<a data-act="subnav-click" data-val="{subnavClick:6}" href="/board/6">最受期待榜</a>
</li>
<li>
<a class="active" data-act="subnav-click" data-state-val="{subnavId:1}" data-val="{subnavClick:1}" href="javascript:void(0);">国内票房榜</a>
</li>
<li>
<a data-act="subnav-click" data-val="{subnavClick:2}" href="/board/2">北美票房榜</a>
</li>
<li>
<a data-act="subnav-click" data-val="{subnavClick:4}" href="/board/4">TOP100榜</a>
</li>
</ul>
</div>

Explanation:

data.ul.parent returns a single bs4.element.Tag object: the direct parent of the <ul> tag, which here is <div class="subnav">.
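
A minimal sketch of how .parent behaves when chained and at the top of the tree. The one-line markup and the names html, soup, parent and grandparent are assumptions introduced for this illustration.

```python
from bs4 import BeautifulSoup

# Compact stand-in for the navbar markup (assumed for this illustration)
html = '<div class="subnav"><ul class="navbar"><li><a href="/board/7">TOP</a></li></ul></div>'
soup = BeautifulSoup(html, 'html.parser')

# The direct parent of <ul> is the enclosing <div>
parent = soup.ul.parent
print(parent.name)       # div
print(parent['class'])   # ['subnav']

# .parent can be chained to climb more than one level: <a> -> <li> -> <ul>
grandparent = soup.a.parent.parent
print(grandparent.name)  # ul

# The BeautifulSoup object sits at the root of the tree and has no parent
print(soup.parent)       # None
```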

Example:

from bs4 import BeautifulSoup
data="""<div class="subnav">
  <ul class="navbar">
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:7}"
          href="/board/7"
      >热映口碑榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:6}"
          href="/board/6"
      >最受期待榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:1}"
          data-state-val="{subnavId:1}"
          class="active" href="javascript:void(0);"
      >国内票房榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:2}"
          href="/board/2"
      >北美票房榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:4}"
          href="/board/4"
      >TOP100榜</a>
    </li>
    
  </ul>
  
</div>
<div>hfdfd</div>
"""

data = BeautifulSoup(data, 'html.parser')
# .parents is a generator, so wrap it in list() to print every ancestor
print(type(data.ul.parents))
print(list(enumerate(data.ul.parents)))

Output:

<class 'generator'>
[(0, <div class="subnav">
<ul class="navbar">
<li>
<a data-act="subnav-click" data-val="{subnavClick:7}" href="/board/7">热映口碑榜</a>
</li>
<li>
<a data-act="subnav-click" data-val="{subnavClick:6}" href="/board/6">最受期待榜</a>
</li>
<li>
<a class="active" data-act="subnav-click" data-state-val="{subnavId:1}" data-val="{subnavClick:1}" href="javascript:void(0);">国内票房榜</a>
</li>
<li>
<a data-act="subnav-click" data-val="{subnavClick:2}" href="/board/2">北美票房榜</a>
</li>
<li>
<a data-act="subnav-click" data-val="{subnavClick:4}" href="/board/4">TOP100榜</a>
</li>
</ul>
</div>), (1, <div class="subnav">
<ul class="navbar">
<li>
<a data-act="subnav-click" data-val="{subnavClick:7}" href="/board/7">热映口碑榜</a>
</li>
<li>
<a data-act="subnav-click" data-val="{subnavClick:6}" href="/board/6">最受期待榜</a>
</li>
<li>
<a class="active" data-act="subnav-click" data-state-val="{subnavId:1}" data-val="{subnavClick:1}" href="javascript:void(0);">国内票房榜</a>
</li>
<li>
<a data-act="subnav-click" data-val="{subnavClick:2}" href="/board/2">北美票房榜</a>
</li>
<li>
<a data-act="subnav-click" data-val="{subnavClick:4}" href="/board/4">TOP100榜</a>
</li>
</ul>
</div>
<div>hfdfd</div>
)]

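Explanation:

data.ul.parents is a generator that yields not only the direct parent but every ancestor in turn. Item 0 above is <div class="subnav">; item 1 is the BeautifulSoup object itself, which is why it contains the whole document, including the trailing <div>hfdfd</div>.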
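
Printing whole ancestors is verbose, so a quick way to inspect the chain is to list only the tag names. The sketch below does that and also uses find_parent to jump straight to a named ancestor. The compact markup and the names html, soup, ancestor_names and nearest_div are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Compact stand-in for the navbar markup (assumed for this illustration)
html = '<div class="subnav"><ul class="navbar"><li><a href="/board/7">TOP</a></li></ul></div>'
soup = BeautifulSoup(html, 'html.parser')

# Walk upward from <a>; the last ancestor is the BeautifulSoup object,
# whose .name is the special string '[document]'
ancestor_names = [ancestor.name for ancestor in soup.a.parents]
print(ancestor_names)   # ['li', 'ul', 'div', '[document]']

# find_parent returns the nearest ancestor matching the given tag name
nearest_div = soup.a.find_parent('div')
print(nearest_div['class'])   # ['subnav']
```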

Sibling nodes

Code:

from bs4 import BeautifulSoup
data="""<div class="subnav">
  <ul class="navbar">
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:7}"
          href="/board/7"
      >热映口碑榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:6}"
          href="/board/6"
      >最受期待榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:1}"
          data-state-val="{subnavId:1}"
          class="active" href="javascript:void(0);"
      >国内票房榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:2}"
          href="/board/2"
      >北美票房榜</a>
    </li>
    <li>
      <a data-act="subnav-click" data-val="{subnavClick:4}"
          href="/board/4"
      >TOP100榜</a>
    </li>
    
  </ul>
  
</div>
<div>hfdfd</div>
"""

data = BeautifulSoup(data, 'html.parser')
# next_sibling is the node right after the first <li>
print(type(data.li.next_sibling))
# print(data.li.previous_sibling)
print(data.li.next_sibling)
# next_siblings is a generator over every following sibling
# print(list(enumerate(data.li.previous_siblings)))
print(list(enumerate(data.li.next_siblings)))

Output:

<class 'bs4.element.NavigableString'>


[(0, '\n'), (1, <li>
<a data-act="subnav-click" data-val="{subnavClick:6}" href="/board/6">最受期待榜</a>
</li>), (2, '\n'), (3, <li>
<a class="active" data-act="subnav-click" data-state-val="{subnavId:1}" data-val="{subnavClick:1}" href="javascript:void(0);">国内票房榜</a>
</li>), (4, '\n'), (5, <li>
<a data-act="subnav-click" data-val="{subnavClick:2}" href="/board/2">北美票房榜</a>
</li>), (6, '\n'), (7, <li>
<a data-act="subnav-click" data-val="{subnavClick:4}" href="/board/4">TOP100榜</a>
</li>), (8, '\n')]

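Explanation:

data.li.next_sibling is the node immediately after the first <li>. Here that node is the '\n' text node (a bs4.element.NavigableString), not the next <li>, which is why printing it produces only blank lines. data.li.next_siblings is a generator over all following siblings, whitespace nodes included.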
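
Since the raw sibling attributes return whitespace text nodes as well, find_next_sibling and find_next_siblings can be used to skip straight to tags. A minimal sketch follows, using a shortened version of the navbar markup (an assumption made for brevity); the names html, soup, first_li, raw_next, next_li, later_hrefs and prev_li are illustrative.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Shortened version of the navbar markup (assumed here for brevity)
html = """<ul class="navbar">
<li><a href="/board/7">热映口碑榜</a></li>
<li><a href="/board/6">最受期待榜</a></li>
<li><a href="/board/2">北美票房榜</a></li>
</ul>"""

soup = BeautifulSoup(html, 'html.parser')
first_li = soup.li

# The raw next sibling is the newline between the two <li> tags
raw_next = first_li.next_sibling
print(isinstance(raw_next, NavigableString))

# find_next_sibling skips text nodes and returns the next matching tag
next_li = first_li.find_next_sibling('li')
print(next_li.a['href'])

# find_next_siblings returns every later matching sibling as a list
later_hrefs = [li.a['href'] for li in first_li.find_next_siblings('li')]
print(later_hrefs)

# The first <li> has no earlier <li> sibling, so the result is None
prev_li = first_li.find_previous_sibling('li')
print(prev_li)
```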


Article link: https://www.haomeiwen.com/subject/pxxfoqtx.html