
JavaScript - Python bs4: find script content from <head> - Stack Overflow

顺晟科技

2022-10-19 12:33:56


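Find script content

I am trying to get the content of a script tag from the <head>, but it is not working.

This is the script tag I am trying to scrape: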

<script type="text/javascript">_rg.update({"bootstrap":{"apps":{"comingLeaving":{},"canonicalsDir":{"data":[],"isLoading":false,"hasLoaded":false,"loadError":false},"sitemap":{}},"entities":{"entries":{"movie:3fe720fa-13dd-4421-9e0b-0ce6a2efdd4f:@global":{"title":"My Neighbor Totoro","released_on":"1988-04-16T00:00:00","imdb_rating":8.2,"rt_critics_rating":94,"rg_content_score":100,"has_poster":true,"has_backdrop":true,"slug":"my-neighbor-totoro-1988","rg_id":"3fe720fa-13dd-4421-9e0b-0ce6a2efdd4f...</script>

Here is my code: [code snippet not preserved]

When I print the content, nothing is printed. Can anyone help?


顺晟科技:


I tried the same request to see what the head contains, and there is no script tag in it.

Here is the head: [output not preserved]
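A quick way to confirm this kind of finding is to count the script tags that actually arrive inside the head of the fetched HTML before trying to parse them. The sketch below is only an illustration: it uses the standard library's `html.parser` so it runs without extra packages, and `HeadScriptCounter` and `count_head_scripts` are hypothetical names, not part of bs4 or of the original answer.

```python
from html.parser import HTMLParser


class HeadScriptCounter(HTMLParser):
    """Count <script> start tags that appear between <head> and </head>."""

    def __init__(self):
        super().__init__()
        self._in_head = False
        self.script_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self._in_head = True
        elif tag == "script" and self._in_head:
            self.script_count += 1

    def handle_endtag(self, tag):
        if tag == "head":
            self._in_head = False


def count_head_scripts(html):
    """Return how many <script> tags the given HTML has inside its head."""
    parser = HeadScriptCounter()
    parser.feed(html)
    parser.close()
    return parser.script_count
```

If this returns 0 for the response body while the browser shows the script, the server is most likely sending a different page to the scraper than to the browser.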

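You need to set 2 headers in the request to get the expected page source. Once you have extracted the JavaScript object that contains the data you want, you need to fix the unescaped " characters to produce valid JSON, which can then be parsed with the json package. I used some code from @tobias_k (referenced below) to handle the fix.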

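The extract-then-repair steps described in this answer can be sketched roughly as follows. This is not the answer's original code or @tobias_k's code. It is a minimal illustration that assumes the page source has already been fetched into a string, and it uses only the standard library (with bs4, `soup.find_all("script")` and each tag's `.string` would play the same role as the collector class here). The names `ScriptCollector`, `extract_rg_payload`, and `loads_with_quote_repair` are hypothetical, and the quote repair is a heuristic that assumes every decode error is caused by a pair of unescaped quotes inside a string value.

```python
import json
import re
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the text content of every <script> element."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def extract_rg_payload(html):
    """Return the raw text passed to _rg.update(...), or None if absent."""
    parser = ScriptCollector()
    parser.feed(html)
    parser.close()
    for script in parser.scripts:
        match = re.search(r"_rg\.update\((.*)\)\s*;?\s*$", script, re.DOTALL)
        if match:
            return match.group(1)
    return None


def loads_with_quote_repair(text, max_repairs=100):
    """Parse JSON, escaping one pair of stray quotes per failed attempt.

    Heuristic (an assumption, not a general fix): the decoder stops just
    after a stray opening quote, so escape the quote before the error
    position and the next quote after it, then retry.
    """
    for _ in range(max_repairs):
        try:
            return json.loads(text)
        except json.JSONDecodeError as err:
            opening = text.rfind('"', 0, err.pos)
            closing = text.find('"', err.pos)
            if opening == -1 or closing == -1:
                raise
            text = (text[:opening] + '\\"' + text[opening + 1:closing]
                    + '\\"' + text[closing + 1:])
    return json.loads(text)
```

A typical call would be `data = loads_with_quote_repair(extract_rg_payload(page_source))`, where `page_source` is the response text. If the payload is broken for some other reason (for example, it was truncated), the repair loop will not help and the final `json.loads` raises as usual.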