Get text of element using DOM with PHP but return error [duplicate]

Posted: 2021-03-26 01:52:04

Question:

I'm going crazy, I don't understand why this doesn't work...

I have a web page with this HTML:

<table>
  <tbody>
    <tr><tr>
    <tr>
        <td> 
            <table class="style124">
                <tbody>
                    <tr>
                        <td>A</td>
                    </tr>
                    <tr>
                        <td>B</td> //THIS !!
                    </tr>
                </tbody>
            </table>
        </td>
    </tr>
    <tr></tr>
  </tbody>
</table>

I want to print the text of the <td> marked with the comment ("THIS"), and this is what I'm trying:

$vuoto= $dom-> getElementsByClassName('style124') -> getElementsByTagName('tbody') -> getElementsByTagName('tr')[1] -> getElementsByTagName('td') ->textContent;
echo $vuoto;

But I get this error:

Uncaught Error: Call to undefined method DOMDocument::getElementsByClassName()

Edit

 $url = "https://www.nihilscio.it/Manuali/Lingua%20latina/Verbi/Coniugazione_latino.asp?verbo=fero1+&lang=IT_";


$ch = curl_init($url);
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.52 Safari/537.17');
curl_setopt($ch, CURLOPT_AUTOREFERER, true); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_VERBOSE, 1);

$html = curl_exec($ch);
$dom = new DOMDocument();

// set error level
$internalErrors = libxml_use_internal_errors(true);

$dom->loadHTML($html);
// Restore error level
libxml_use_internal_errors($internalErrors);

Edit 2

    <body>
      Il sito è in manutenzione,
      <a href="../../../Posta/contatti.asp?testoerrore=fero1 /">segnalateci</a> eventuali errori.

      <table>
        <!-- INIZIO TABELLA PRINCIPALE-->
        <tbody>
          <tr>
            <td style="height: 34px">
              Buona navigazione con NihilScio!&nbsp;&nbsp;<a target="_blank" href="https://www.facebook.com/NihilScio/"><img src="https://www.nihilscio.it/images/fb.jpg" ></a>

            </td>
          </tr>

          <tr>
            <td class="style105">
              <!--title="Herculaneum (Ercolano) - bassorilievo"-->
              <br>

              <span class="style112"><em>NS NihilScio</em></span>
              <br><a>Educational search engine</a>&nbsp;&nbsp;&nbsp;&nbsp;<a href="../../../index.asp">Home</a>
            </td>
          </tr>

          <tr>
            <td>

              <table cellspacing="0" cellpadding="0">


                <tbody>
                  <tr>
                    <td>

                      <form name="uscita">

                        <table class="style123" cellspacing="0" cellpadding="0">

                          <tbody>
                            <tr>
                              <td style="height: 22px">
                                Coniugazione/declinazione
                              </td>
                              <td style="height: 22px">
                              </td>

                            </tr>

                            <tr title="Per favore non utilizzare caratteri accentati o speciali">
                              <td>
                                <input type="text" id="verbo" name="verbo" value="fero1 " size="60" style="font-size: 16pt; width: 200px">
                                <!--    <input type="submit" onclick="javascript:validainput()" value="&gt;&gt;" style="font-size: 14pt; width: 37px; background-color: #6699cc; color:blue; height: 25px;"/>&nbsp; -->
                                <input type="submit" onclick="speciali(' ')" value=" >> " style=" width: 40px;  height: 28px;  font-size:large; background-color:#99CCFF;  ">
                                <br>
                                <a onclick="speciali('á')"> <span>á&nbsp;&nbsp;&nbsp;&nbsp;</span> </a>
                                <a onclick="speciali('é')"> <span>é&nbsp;&nbsp;&nbsp;&nbsp;</span> </a>
                                <a onclick="speciali('è')"> <span>è&nbsp;&nbsp;&nbsp;&nbsp;</span> </a>
                                <a onclick="speciali('ì')"> <span>ì&nbsp;&nbsp;&nbsp;&nbsp;</span> </a>
                                <a onclick="speciali('í')"> <span>í&nbsp;&nbsp;&nbsp;&nbsp;</span> </a>
                                <a onclick="speciali('ò')"> <span>ò</span> </a><br><br>
                                <a onclick="speciali('ó')"> <span>ó&nbsp;&nbsp;&nbsp;&nbsp;</span> </a>
                                <a onclick="speciali('ù')"> <span>ù&nbsp;&nbsp;&nbsp;&nbsp;</span> </a>
                                <a onclick="speciali('ú')"> <span>ú&nbsp;&nbsp;&nbsp;&nbsp;</span> </a>
                                <a onclick="speciali('ü')"> <span>ü&nbsp;&nbsp;&nbsp;&nbsp;</span> </a>
                                <a onclick="speciali('ñ')"> <span>ñ&nbsp;&nbsp;&nbsp;&nbsp;</span> </a>
                                <a onclick="speciali('ç')"> <span>ç</span> </a>


                              </td>
                              <td>

                                <a href="Coniugazione_latino.asp?verbo=fero1 &amp;lang=EN_"> English</a>

                              </td>
                            </tr>

                            <tr>

                              <td title="Questa opzione è necessaria solo quando il vocabolo (non latino) da cercare  può essere confuso con parole latine" style="height: 21px; ">
                                Traduci in Latino
                                <input type="checkbox" name="tradinv" value="1">


                              </td>
                              <td style="height: 21px">

                                <a href="Coniugazione_latino.asp?verbo=fero1 &amp;lang=ES_">Español</a>
                              </td>
                            </tr>

                            <tr>

                              <td style="height: 22px; " valign="bottom"><span class="HTML_TAG">


                                  It<input type="radio" name="lang" value="IT_" checked="checked">

                                  En<input type="radio" name="lang" value="EN_">

                                  Es<input type="radio" name="lang" value="ES_">



                                </span>
                              </td>
                              <td style="height: 22px">

                                <img  src="https://www.nihilscio.it/Ita.png">Italiano

                              </td>
                            </tr>

                          </tbody>
                        </table>
                      </form>

                    </td>
                  </tr>

                </tbody>
              </table>

            </td>
          </tr>
          <tr>
            <td>
              <!--style="height: 136px>------ SECONDA RIGA T. P. -->
              <button id="frasi" onclick="javascript:PopupCentrata1('https://www.nihilscio.it/NS/EstraiFrasiClassici.asp?verbo=fero1 /&amp;verboconsel=&amp;lang0=IT_&amp;lang=LT_')" class="style119"> Vocabolari e frasi</button>
              <!--  <table cellspacing="0"> -->
            </td>
          </tr>

          <tr>
            <td>
              <table class="style124">
                <tbody>
                  <tr>
                    <td><span class="HTML_TAG">Vocaboli trovati: </span></td>
                  </tr>
                  <!--<td></td></tr> -->

                  <tr>
                    <td>

                    </td>
                    <!--<td title="">more...</td> -->
                  </tr>

                </tbody>
              </table>
            </td>
          </tr>

            

    <tr><td>
        
        <span class="style13"> fero1:</span>
        <span class="HTML_ELM">vocabolo non trovato.</span><a href="../../../Posta/contatti.asp?testoerrore=fero1 /">Comunica errori</a>
        

    </td></tr>
    <tr><td>
        La parola non è ancora in archivio.<br>Grazie per averla cercata<br>Riprova più tardi<br>Si raccomanda di non utilizzare segni di punteggiatura e caratteri speciali e di scrivere al massimo 5 parole interspaziate.<br>es.: puella a bonis pueris amata est <br>In caso di errore ci scusiame e la preghiamo di <a href="../../../Posta/contatti.asp?testoerrore=fero1 /">comunicarcelo.</a>
      
    </td></tr>


         <tr><td>
        <a href="https://www.nihilscio.it/nihilscio.asp?cat=Latino&amp;keyw=fero1 /" class="HTML_ELM">
        <span class="style43"><strong>Continua la ricerca sul web con NihilScio</strong></span></a>
        </td></tr>
        
    <tr><td><span><em>NS-NihilScio</em>©2009-2020</span></td></tr>
    </tbody></table>
                


    <script type="text/javascript">


    function Popup(apri)
    {

    var asse_x = event.clientX;
        var asse_y = event.clientY;
        
     var stile = "top="+asse_y + ",left="+asse_x + ",width=250, height=30, lacation=no, status=no, menubar=no, toolbar=no, scrollbars=no";
       
    let newWin = window.open("", "traduzione", stile  ); /*"width=200,height=200"); about:blank*/
    /*newWin.document.write("Hello, world!"); */
    /*let newWin = window.open("traduzione", "hello", "width=200,height=200");*/

    newWin.document.write(apri); 
    //newWin.close();
     /* window.open(apri, "", stile);*/
    }

            
    function PopupCentrata1(apri)
    {
        var w = 900;
        var h = 350;
        document.getElementById("frasi").innerHTML = '<iframe' + ' src=' + '"' + apri + '"' + ' width=' + w  + ' height=' + h + '></iframe>'; //"\"" + + "\"" 
    }


    function visualizzatuttoatt0() {
        document.getElementById("ConCompAtt0").innerHTML =  "";
    }
            
    function visualizzatuttopass0() {
        document.getElementById("ConCompPass0").innerHTML =  "";
    }
            
    function visualizzatuttoatt1() {
        document.getElementById("ConCompAtt1").innerHTML =  "";
    }

    function visualizzatuttopass1() {
        document.getElementById("ConCompPass1").innerHTML = "";
    }
                
    function visualizzatuttoatt2() {
        document.getElementById("ConCompAtt2").innerHTML = "";
    }
                
    function visualizzatuttopass2() {
        document.getElementById("ConCompPass2").innerHTML = "";
    }
                
    function visualizzatuttoatt3() {
        document.getElementById("ConCompAtt3").innerHTML =  "";
    }
                
    function visualizzatuttopass3() {
        document.getElementById("ConCompPass3").innerHTML =  "";
    }
        </script>  


        
    <script type="text/javascript">
    function speciali(lettera) {
    document.uscita.verbo.value+= lettera;
    document.uscita.verbo.focus();
    };
    </script>

    </body>

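Comments:

Where did you see getElementsByClassName?

Answer 1:

DOMDocument has no getElementsByClassName method. You can use getElementsByTagName (https://www.php.net/manual/en/domdocument.getelementsbytagname.php) to iterate over all elements of a given type and then check each one's class attribute for the value you want with getAttribute (https://www.php.net/manual/en/domelement.getattribute.php). Alternatively, you can use XPath: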

$html = '<table>
  <tbody>
    <tr><tr>
    <tr>
        <td> 
            <table class="style124">
                <tbody>
                    <tr>
                        <td>A</td>
                    </tr>
                    <tr>
                        <td>B</td> //THIS !!
                    </tr>
                </tbody>
            </table>
        </td>
    </tr>
    <tr></tr>
  </tbody>
</table>';
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$query = '//table[contains(@class, "style124")]/tbody/tr[2]/td';
$entries = $xpath->query($query);
foreach($entries as $entry)
    echo $entry->nodeValue;

https://3v4l.org/pDFhB

Also, getElementsByTagName returns a result set (a DOMNodeList), so you can't simply chain -> off it; you need to tell it which item you want.
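To illustrate the point about result sets, here is a minimal sketch that picks a single node out of each list before going deeper. The helper name findByClass is hypothetical (it is not part of PHP's DOM extension), the HTML fragment is a shortened stand-in for the question's table, and the helper assumes a simple class name without quotes.

```php
<?php
// Hypothetical helper, not part of PHP's DOM extension: returns every element
// whose class attribute contains $class as a whole, space-separated token.
// Assumes $class is a plain class name (no quotes or spaces).
function findByClass(DOMDocument $dom, string $class): DOMNodeList
{
    $xpath = new DOMXPath($dom);
    // Padding with spaces lets "style124" match "a style124 b" but not "style1245".
    $query = sprintf(
        '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );
    return $xpath->query($query);
}

$dom = new DOMDocument();
$dom->loadHTML('<table class="style124"><tr><td>A</td></tr><tr><td>B</td></tr></table>');

$tables = findByClass($dom, 'style124'); // a DOMNodeList, not a single element
$table  = $tables->item(0);              // pick one node; null if nothing matched

$text = null;
if ($table instanceof DOMElement) {
    $rows  = $table->getElementsByTagName('tr');          // again a list
    $cells = $rows->item(1)->getElementsByTagName('td');  // second row, zero-based
    $text  = $cells->item(0)->textContent;
}
echo $text, PHP_EOL;
```

Each call that returns a list is followed by item(), which is the step missing from the chained expression in the question.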

An alternative approach could be:

$html = '<table>
  <tbody>
    <tr><tr>
    <tr>
        <td> 
            <table class="style124">
                <tbody>
                    <tr>
                        <td>A</td>
                    </tr>
                    <tr>
                        <td>B</td> //THIS !!
                    </tr>
                </tbody>
            </table>
        </td>
    </tr>
    <tr></tr>
  </tbody>
</table>';
$dom = new DOMDocument;
$dom->loadHTML($html);
$tables = $dom->getElementsByTagName('table');
foreach($tables as $table) {
    // Only handle the table whose class attribute contains the style124 token
    if(preg_match('/\bstyle124\b/', $table->getAttribute('class'))) {
        $trs = $table->getElementsByTagName('tr');
        $tds = $trs[1]->getElementsByTagName('td');
        echo $tds[0]->nodeValue;
    }
}

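Note the difference in index base between PHP and XPath: PHP lists start at 0 and XPath positions start at 1, so $trs[1] and tr[2] both refer to the second row.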

https://3v4l.org/hLEFV
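The index difference can be checked side by side. This is a small sketch on a shortened, made-up two-row table rather than the page from the question; the variable names are only illustrative.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<table><tr><td>A</td></tr><tr><td>B</td></tr></table>');

// PHP side: DOMNodeList positions start at 0, so the second row is item(1).
$rows    = $dom->getElementsByTagName('tr');
$viaList = $rows->item(1)->textContent;

// XPath side: positions start at 1, so the second row is tr[2].
$xpath    = new DOMXPath($dom);
$viaXPath = $xpath->query('//tr[2]/td')->item(0)->textContent;

// Both expressions address the same row.
var_dump($viaList === $viaXPath);
```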

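Comments:

I used your first approach and it works! But if it is empty, how can I tell?

@Borja Like 3v4l.org/Gpaq9 or 3v4l.org/Nfr7a?

Ah, just empty :D I tried strlen... thanks a lot!

If whitespace alone should also count as invalid, you can use trim as well: 3v4l.org/h0H8q

Nothing... it doesn't work :( This HTML is inside a web page; I tried your code but nothing... This is the web page: nihilscio.it/Manuali/Lingua%20latina/Verbi/…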
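For the empty-cell question raised in the comments, a minimal sketch of the trim check might look like the following. The fragment is modelled on the style124 table in the page source under "Edit 2", where the second row's cell holds only whitespace; the $result variable and its messages are illustrative assumptions.

```php
<?php
// Fragment modelled on the "Edit 2" page source: the second row's cell
// contains nothing but whitespace.
$html = '<table class="style124"><tbody>
  <tr><td><span class="HTML_TAG">Vocaboli trovati: </span></td></tr>
  <tr><td>

  </td></tr>
</tbody></table>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// item(0) returns null when the query matched nothing.
$cell = $xpath->query('//table[contains(@class, "style124")]/tbody/tr[2]/td')->item(0);

if ($cell === null) {
    $result = 'cell not found';
} elseif (trim($cell->textContent) === '') {
    $result = 'cell is empty';      // whitespace-only counts as empty here
} else {
    $result = trim($cell->textContent);
}
echo $result, PHP_EOL;
```

Separating "not found" from "empty" makes it easier to tell a query that matched nothing from a cell that exists but has no text. Note that trim() with its default character list does not strip non-breaking spaces.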
