Python - Extracting elements from a span with BeautifulSoup - Stack Overflow

顺晟科技

2022-10-19 13:28:06

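I am trying to extract the elements circled in red in the image below from this website: https://eresearch.fidelity.com/eresearch/markets_sectors/sectors_in_market.jhtml?tab=learn&sector=25

However, it always gives me the error "ResultSet object has no attribute 'find'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?"

My idea was to narrow the search down to the "td" tags and use find to get the elements from the "span" tags, but I just cannot get it to work. I tried both find() and find_all(), and both gave me this error. Here is my code: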

import requests
from bs4 import BeautifulSoup

url = "https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=25"
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
b = soup.select('div.left-content table.snapshot-data-tbl tr td')
print(b.find("span", class_='positive').text)  # this line raises the error quoted above

Can I get some help? Thanks!


顺晟科技:

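select() returns a container (a ResultSet) holding the matches for the given CSS selector. The find() method can only be called on an element node from the page source, not on a result set. Instead, iterate over the result of select() and target the span in each element: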

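import requests
from bs4 import BeautifulSoup

url = "https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=25"
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')

# select() returns a ResultSet, so call find() on each td, not on the result set
for td in soup.select('div.left-content table.snapshot-data-tbl tr td'):
    span = td.find("span", class_='positive')
    if span is not None:  # skip cells that have no matching span
        print(span.text)  # get the text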
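
The difference between a result set and a single element can be checked offline. The sketch below is only an illustration: the HTML snippet is made up to resemble the selector used in the question (it is not the real Fidelity markup), and it uses the built-in "html.parser" so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (an assumption for illustration).
html = """
<div class="left-content">
  <table class="snapshot-data-tbl">
    <tr>
      <td><span class="positive">+1.23%</span></td>
      <td><span class="negative">-0.45%</span></td>
      <td>no span here</td>
    </tr>
  </table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
cells = soup.select("div.left-content table.snapshot-data-tbl tr td")

# select() returns a ResultSet (a list subclass), not a single Tag.
print(type(cells).__name__, len(cells))

# Calling find() on the ResultSet itself reproduces the error from the question.
try:
    cells.find("span", class_="positive")
except AttributeError as exc:
    error_message = str(exc)
    print("AttributeError:", error_message)

# Calling find() on each Tag works; find() returns None when nothing matches.
positive_texts = []
for cell in cells:
    span = cell.find("span", class_="positive")
    if span is not None:
        positive_texts.append(span.get_text(strip=True))
print(positive_texts)
```

If only the first match is needed, select_one() returns a single Tag (or None), which does support find().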

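In general, where possible, it is cleaner to find elements by class or id than through a long CSS selector. In this case, the data you need is in the first 3 td elements of the table with class="snapshot-data-tbl".

Here is another solution to extract the values: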

import requests
from bs4 import BeautifulSoup

url = "https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=25"
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')

# Find the table by its class, then read the first 3 td elements
table = soup.find("table", class_="snapshot-data-tbl")
if table is not None:  # the table may be missing if the page layout changes
    for td in table.find_all("td", limit=3):
        print(td.get_text(strip=True))
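
The class-based lookup can also be wrapped in a small helper and tried without a network request. This is a rough sketch under stated assumptions: first_cell_texts is a hypothetical name, and the sample markup and values are invented for the example, so the real page may be structured differently.

```python
from bs4 import BeautifulSoup


def first_cell_texts(html, table_class="snapshot-data-tbl", count=3):
    """Return the text of the first `count` td cells in the table with the given class."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_=table_class)
    if table is None:
        return []  # no such table, e.g. the layout changed or the request was blocked
    # limit stops the search after `count` matching td elements
    return [td.get_text(strip=True) for td in table.find_all("td", limit=count)]


# Hypothetical markup; the real page's cells and values may differ.
sample_html = """
<table class="snapshot-data-tbl">
  <tr>
    <td><span class="positive">+1.23%</span></td>
    <td><span>2.50T</span></td>
    <td><span>6.10%</span></td>
    <td>ignored fourth cell</td>
  </tr>
</table>
"""

print(first_cell_texts(sample_html))
print(first_cell_texts("<p>no table</p>"))
```

Returning an empty list when the table is absent avoids an AttributeError on None, which is a common follow-up problem when a site serves different HTML to scripts than to browsers.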