
html - How to extract data from div tags with Python when the div class names are dynamic? - Stack Overflow


2022-10-19 12:14:36


I am scraping the tickertape website to extract information about a product.

The problem I am facing is that the div class names are highly dynamic.

The markup I am trying to extract the information from:

<div data-section-tag="key-metrics" class="jsx-382396230 ratios-card sp-card"><h2 class="jsx-382396230"><span class="jsx-382396230 content">Key Metrics</span></h2><div class="jsx-382396230 stats"><div class="jsx-1785027547 statbox "><div><div class="title  font-medium text-dark text-14 pointer"><span class="jsx-559150734 key-ratio-title relative"><span class="jsx-559150734 ellipsis desktop--only">Realtime NAV</span><span class="jsx-559150734 ellipsis  mob--only">Realtime NAV</span><div class="jsx-324047672 tooltip-root  arrow-bottom arrow-left  content-top content-left  font-regular text-13 lh-138" style="color: rgb(255, 255, 255);"><h4 class="jsx-559150734 tooltip-head mb4 font-medium">Realtime NAV</h4><p class="jsx-559150734 lh-138">Value of each share's portion of the underlying assets and cash</p></div></span></div><div class="value   text-15 ellipsis">₹ 181.73</div></div><div><div class="title  font-medium text-dark text-14 pointer"><span class="jsx-559150734 key-ratio-title relative"><span class="jsx-559150734 ellipsis desktop--only">AUM</span><span class="jsx-559150734 ellipsis text-center mob--only">AUM</span><div class="jsx-324047672 tooltip-root  arrow-bottom arrow-middle  content-top content-middle  font-regular text-13 lh-138" style="color: rgb(255, 255, 255);"><h4 class="jsx-559150734 tooltip-head mb4 font-medium">AUM</h4><p class="jsx-559150734 lh-138">The total market value of funds managed by the Asset Management Company</p></div></span></div><div class="value   text-15 ellipsis">₹ 1,335.35cr</div></div><div><div class="title  font-medium text-dark text-14 pointer"><span class="jsx-559150734 key-ratio-title relative"><span class="jsx-559150734 ellipsis desktop--only">Expense Ratio</span><span class="jsx-559150734 ellipsis text-right mob--only">Expense Ratio</span><div class="jsx-324047672 tooltip-root    font-regular text-13 lh-138" style="color: rgb(255, 255, 255);"><h4 class="jsx-559150734 tooltip-head mb4 font-medium">Expense Ratio</h4><p 
class="jsx-559150734 lh-138">The operating and administrative costs of running the fund measured as the percentage of fund assets</p></div></span></div><div class="value   text-15 ellipsis">0.12%</div></div><div><div class="title  font-medium text-dark text-14 pointer"><span class="jsx-559150734 key-ratio-title relative"><span class="jsx-559150734 ellipsis desktop--only">Category Exp Ratio</span><span class="jsx-559150734 ellipsis  mob--only">Cat. Expense Rat.</span><div class="jsx-324047672 tooltip-root  arrow-bottom arrow-left  content-top content-left  font-regular text-13 lh-138" style="color: rgb(255, 255, 255);"><h4 class="jsx-559150734 tooltip-head mb4 font-medium">Category Exp Ratio</h4><p class="jsx-559150734 lh-138">Average of the operating and administrative costs of running ETFs of the same sector measured as the percentage of fund assets</p></div></span></div><div class="value   text-15 ellipsis">0.22%</div></div><div><div class="title  font-medium text-dark text-14 pointer"><span class="jsx-559150734 key-ratio-title relative"><span class="jsx-559150734 ellipsis desktop--only">Tracking Error</span><span class="jsx-559150734 ellipsis text-center mob--only">Tracking Error</span><div class="jsx-324047672 tooltip-root    font-regular text-13 lh-138" style="color: rgb(255, 255, 255);"><h4 class="jsx-559150734 tooltip-head mb4 font-medium">Tracking Error</h4><p class="jsx-559150734 lh-138">The difference between the performance of the security and the benchmark index that it tracks</p></div></span></div><div class="value   text-15 ellipsis">0.08%</div></div><div><div class="title  font-medium text-dark text-14 pointer"><span class="jsx-559150734 key-ratio-title relative"><span class="jsx-559150734 ellipsis desktop--only">Category Tracking Err</span><span class="jsx-559150734 ellipsis text-right mob--only">Cat. 
Tracking Err.</span><div class="jsx-324047672 tooltip-root    font-regular text-13 lh-138" style="color: rgb(255, 255, 255);"><h4 class="jsx-559150734 tooltip-head mb4 font-medium">Category Tracking Err</h4><p class="jsx-559150734 lh-138">Average of the difference between the performance of the ETF's peers and the benchmark index that it tracks</p></div></span></div><div class="value   text-15 ellipsis">0.27%</div></div></div></div></div>

Currently I extract the text and split the resulting array based on its length.

Is there a better way to extract the information?


Answers:

I got the required output using only Scrapy to apply XPath, because XPath made it easy to get the data.
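The XPath idea can be sketched as follows. This is a minimal illustration, not the answerer's original code: it assumes that the `data-section-tag` attribute and the unhashed class names (`statbox`, `value`, `desktop--only`) are stable while the `jsx-...` classes change. It uses `lxml` so that it runs standalone; in a Scrapy spider the same expressions would be passed to `response.xpath(...)`. The function name `extract_key_metrics` and the trimmed `SAMPLE` markup are hypothetical.

```python
from lxml import html

# Trimmed copy of the markup from the question (two of the six metrics).
SAMPLE = """
<div data-section-tag="key-metrics" class="jsx-382396230 ratios-card sp-card">
  <div class="jsx-382396230 stats"><div class="jsx-1785027547 statbox ">
    <div>
      <div class="title  font-medium text-dark text-14 pointer">
        <span class="jsx-559150734 key-ratio-title relative">
          <span class="jsx-559150734 ellipsis desktop--only">Realtime NAV</span>
          <span class="jsx-559150734 ellipsis  mob--only">Realtime NAV</span>
        </span>
      </div>
      <div class="value   text-15 ellipsis">₹ 181.73</div>
    </div>
    <div>
      <div class="title  font-medium text-dark text-14 pointer">
        <span class="jsx-559150734 key-ratio-title relative">
          <span class="jsx-559150734 ellipsis desktop--only">Expense Ratio</span>
          <span class="jsx-559150734 ellipsis text-right mob--only">Expense Ratio</span>
        </span>
      </div>
      <div class="value   text-15 ellipsis">0.12%</div>
    </div>
  </div></div>
</div>
"""

# Match "value" as a whole class token, not as a substring of another class.
VALUE_DIV = './div[contains(concat(" ", normalize-space(@class), " "), " value ")]'


def extract_key_metrics(page_source):
    """Return {metric title: displayed value} for the key-metrics card."""
    tree = html.fromstring(page_source)
    # Anchor on the data attribute and unhashed classes, never on "jsx-..." names.
    boxes = tree.xpath(
        '//div[@data-section-tag="key-metrics"]'
        '//div[contains(@class, "statbox")]/div'
    )
    metrics = {}
    for box in boxes:
        # The desktop span carries the full title; the mobile span may abbreviate it.
        title = box.xpath('string(.//span[contains(@class, "desktop--only")])').strip()
        value = box.xpath(f"string({VALUE_DIV})").strip()
        if title:
            metrics[title] = value
    return metrics


metrics = extract_key_metrics(SAMPLE)
```

Because each title is paired with the value inside the same box, there is no need to split a flat list of text by length.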


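Another approach: I would extract the JS object embedded in the page markup, which contains all of the page's data, parse it with a suitable package, and then pull out the required values.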
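A rough sketch of that idea, under stated assumptions: the page is assumed to embed its data as JSON in a `<script id="__NEXT_DATA__">` tag (the Next.js convention; whether tickertape does this is not confirmed by the thread), and the key path `props.pageProps.keyMetrics` with its field names is purely hypothetical. Only the standard-library `re` and `json` modules are used; `SAMPLE_PAGE` and `load_page_data` are illustrative names.

```python
import json
import re

# Hypothetical page: the script id follows the Next.js convention, and the
# JSON structure inside it is invented for illustration only.
SAMPLE_PAGE = """
<html><body>
<div id="__next"></div>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"keyMetrics": {"expenseRatio": 0.12, "trackingError": 0.08}}}}
</script>
</body></html>
"""

# Capture everything between the opening and closing tags of the data script.
SCRIPT_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def load_page_data(page_source):
    """Parse the embedded JSON object, or raise ValueError if it is absent."""
    match = SCRIPT_RE.search(page_source)
    if match is None:
        raise ValueError("embedded JSON script tag not found")
    return json.loads(match.group(1))


data = load_page_data(SAMPLE_PAGE)
# Hypothetical key path; inspect the real object to find the actual one.
metrics = data["props"]["pageProps"]["keyMetrics"]
```

Reading the embedded object avoids the class names entirely, so it is unaffected when the `jsx-...` hashes change, but the key path has to be discovered by inspecting the real page.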
