Get content of <script type="application/ld+json"> using PHP


[Title]: Get content of <script type="application/ld+json"> using PHP
[Posted]: 2016-06-28 17:19:59
[Question]:

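I can't find an API for Vine that returns the title, description and image of a page. The JSON is in the body of the page itself, inside a script tag: <script type="application/ld+json">. How can I get the content of this script tag (the JSON) with PHP so that I can parse it?

Vine page:

https://vine.co/v/igO3EbIXDlI

From the page source: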

<script type="application/ld+json">
            {
              "@context": "http://schema.org",
              "@type": "SocialMediaPosting",
              "url": "https://vine.co/v/igO3EbIXDlI",
              "datePublished": "2016-03-01T00:58:35",
              "author": {
                "@type": "Person",
                "name": "MotorAddicts\u2122",
                "image": "https://v.cdn.vine.co/r/avatars/39FEFED72B1242718633613316096_pic-r-1439261422661708f3e9755.jpg.jpg?versionId=LPjQUQ4KmTIPLu3iDbXw4FipgjEpC6fw",
                "url": "https://vine.co/u/989736283540746240"
              },
              "articleBody": "Mmm...  Black black blaaaaack!! \ud83d\ude0d ( Drift \u53d1 )",
              "image": "https://v.cdn.vine.co/r/videos/98C3799A811316254965085667328_SW_WEBM_14567938452154dc600dbde.webm.jpg?versionId=wPuaQvDxnpwF7KjSGao21hoddooc3eCl",
              "interactionCount": [{
                "@type": "UserInteraction",
                "userInteractionType": "http://schema.org/UserLikes",
                "value": "1382"
              }, {
                "@type": "UserInteraction",
                "userInteractionType": "http://schema.org/UserShares",
                "value": "368"
              }, {
                "@type": "UserInteraction",
                "userInteractionType": "http://schema.org/UserComments",
                "value": "41"
              }, {
                "@type": "UserInteraction",
                "userInteractionType": "http://schema.org/UserViews",
                "value": "80575"
              }],

              "sharedContent": {
                "@type": "VideoObject",
                "name" : "Mmm...  Black black blaaaaack!! \ud83d\ude0d ( Drift \u53d1 )",
                "description" : "",
                "thumbnailUrl" : "https://v.cdn.vine.co/r/videos/98C3799A811316254965085667328_SW_WEBM_14567938452154dc600dbde.webm.jpg?versionId=wPuaQvDxnpwF7KjSGao21hoddooc3eCl",
                "uploadDate" : "2016-03-01T00:58:35",
                "contentUrl" : "https://v.cdn.vine.co/r/videos_h264high/98C3799A811316254965085667328_SW_WEBM_14567938452154dc600dbde.mp4?versionId=w7ugLPYtj5LWeVUsXaH1bt2VuK8QE0qv",
                "embedUrl" : "https://vine.co/v/igO3EbIXDlI/embed/simple",
                "interactionCount" : "82366"
              }
            }
          </script>

What should I do after this?

$html = 'https://vine.co/v/igO3EbIXDlI';
$dom = new DOMDocument;
$dom->loadHTML($html);

Update:

I found instructions for the Vine API here:

https://dev.twitter.com/web/vine/oembed

To query the Vine API for JSON, make a GET request to:

https://vine.co/oembed.json?url=https%3A%2F%2Fvine.co%2Fv%2F[videoid]

Example:

https://vine.co/oembed.json?url=https%3A%2F%2Fvine.co%2Fv%2FMl16lZVTTxe
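
A minimal sketch of building and calling that endpoint from PHP, assuming the URL pattern quoted above. The helper names `vineOembedUrl` and `fetchVineOembed` are hypothetical, and since the fields of the response are not listed here, the response is simply decoded into an array.

```php
<?php
// Hypothetical helper: builds the oEmbed request URL for a Vine video id,
// following the URL pattern quoted above.
function vineOembedUrl(string $videoId): string
{
    $pageUrl = 'https://vine.co/v/' . $videoId;
    // rawurlencode() percent-encodes ":" and "/" as %3A and %2F.
    return 'https://vine.co/oembed.json?url=' . rawurlencode($pageUrl);
}

// Hypothetical helper: requests the endpoint and decodes the JSON response.
// Returns null if the request fails or the body is not a JSON object/array.
function fetchVineOembed(string $videoId): ?array
{
    $response = @file_get_contents(vineOembedUrl($videoId));
    if ($response === false) {
        return null;
    }
    $data = json_decode($response, true);
    return is_array($data) ? $data : null;
}

// Prints the same URL as the example above.
echo vineOembedUrl('Ml16lZVTTxe'), "\n";
```

Only the URL builder runs here; `fetchVineOembed` needs network access and a live endpoint, so it is left uncalled.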

[Comments]:

Use simplehtmldom.sourceforge.net; it has plenty of examples and is easy to understand.

[Answer 1]:

You can use DOMDocument and DOMXPath for this:

$html = file_get_contents( $url );      // $url holds the page address, e.g. the Vine page URL
$dom  = new DOMDocument();
libxml_use_internal_errors( true );     // collect parser warnings instead of printing them
$dom->loadHTML( $html );
$xpath = new DOMXPath( $dom );
$jsonScripts = $xpath->query( '//script[@type="application/ld+json"]' );
$json = trim( $jsonScripts->item(0)->nodeValue );

$data = json_decode( $json );           // decodes into a stdClass object

phpFiddle demo

This XPath pattern searches for all <script> nodes whose type attribute is "application/ld+json":

//                              Match the following path anywhere in the document
script                          <script> elements
[@type="application/ld+json"]   whose "type" attribute equals "application/ld+json"

You then retrieve the JSON string by reading the ->nodeValue of the first <script> node returned.
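
As a self-contained sketch of the same steps, the example below runs the XPath query against an inline HTML string rather than a downloaded page. The sample markup is an assumption: a trimmed stand-in for the page source in the question, with a placeholder image URL. It decodes into an associative array and reads the fields the question asks about.

```php
<?php
// Stand-in page source (an assumption): trimmed from the question,
// with a placeholder image URL.
$html = <<<'HTML'
<html><head><title>Vine</title></head><body>
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "SocialMediaPosting",
  "url": "https://vine.co/v/igO3EbIXDlI",
  "image": "https://example.invalid/thumb.jpg",
  "sharedContent": {
    "@type": "VideoObject",
    "name": "Black black black",
    "description": ""
  }
}
</script>
</body></html>
HTML;

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$node  = $xpath->query('//script[@type="application/ld+json"]')->item(0);

// Passing true as the second argument returns associative arrays
// instead of stdClass objects.
$data = json_decode(trim($node->nodeValue), true);

$title       = $data['sharedContent']['name'];
$description = $data['sharedContent']['description'];
$image       = $data['image'];

echo $title, "\n", $image, "\n";
```

Decoding to arrays is a design choice made here because keys such as "@type" are awkward to read as object properties; with objects they would need the `$data->{'@type'}` syntax.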

If you don't know in advance whether the node exists and/or where it is, use:

$jsonScripts = $xpath->query( '//script[@type="application/ld+json"]' );
if( $jsonScripts->length < 1 )
{
    die( "Error: No script node found" );
}
else
{
    foreach( $jsonScripts as $node )
    {
        $json = json_decode( $node->nodeValue );

        // your stuff with JSON ...
    }
}

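The loop above can also check whether each block actually decoded. The sketch below is one possible way to do that, not part of the original answer: the function name `extractJsonLd` is hypothetical, and skipping malformed blocks (rather than aborting) is an assumption about the desired behavior.

```php
<?php
// Hypothetical helper: returns the decoded data of every ld+json block
// found in $html, skipping blocks whose JSON is malformed.
function extractJsonLd(string $html): array
{
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // loadHTML() rejects an empty string, so guard against it first.
    if ($html === '' || !$dom->loadHTML($html)) {
        return [];
    }
    libxml_clear_errors();

    $xpath   = new DOMXPath($dom);
    $results = [];
    foreach ($xpath->query('//script[@type="application/ld+json"]') as $node) {
        $decoded = json_decode(trim($node->nodeValue), true);
        if (json_last_error() !== JSON_ERROR_NONE) {
            // Log the reason and move on to the next block.
            error_log('Skipping malformed JSON-LD: ' . json_last_error_msg());
            continue;
        }
        $results[] = $decoded;
    }
    return $results;
}

// Sample input (an assumption): one valid block and one with a trailing comma.
$sample = '<html><body>'
    . '<script type="application/ld+json">{"@type": "VideoObject", "name": "ok"}</script>'
    . '<script type="application/ld+json">{"@type": "broken", }</script>'
    . '</body></html>';

$blocks = extractJsonLd($sample);
echo count($blocks), "\n";   // only the valid block is returned
```

Checking `json_last_error()` matters because `json_decode()` returns null both for invalid input and for the literal JSON value `null`, so the return value alone cannot distinguish the two.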
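[Comments]:

Very nice. I had just worked out my own answer using simple_html_dom.php and then came back to yours. Now I don't have to require simple_html_dom.php in the page.

It seems to me that $json = trim( $script->item(0)->nodeValue ); should be $json = trim( $jsonScripts->item(0)->nodeValue );

$jsonScript->length length

If you get no output, check whether the JSON is malformed. That happened to me, and I sorted it out by following this suggestion: ***.com/a/20845642/801949

[Answer 2]:
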
$html_content = file_get_contents('https://vine.co/v/igO3EbIXDlI');

$target_tag = 'script';

$dom_object = new DOMDocument;
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom_object->loadHTML($html_content);
$xpath_object = new DOMXPath($dom_object);

// select every <script> element in the document by tag name
$elements = $xpath_object->query("//{$target_tag}");

$output = [];
foreach ($elements as $element)
{
    $output[] = $dom_object->saveHTML($element);
}

# you now have a list of strings, each containing the serialized HTML of a
# non-overlapping script tag
