Regex to select all the content within the div

Posted: 2017-12-27 17:51:18

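I am trying to build a PHP regex pattern for a multi-line string that removes the div with the class sn_published.

I have already tried (?s)<div class="sn_published".*?<\/div>. It matches, but it does not capture the full content of the div: it stops at the first closing div.

Any ideas? Thanks in advance.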

  <div class="sn_published">

            <div id="snfwid4" class="sn_published_inner clearfix">
                <script type="text/javascript">
                /* <![CDATA[ */
                if(!$.browser.msie||$.browser.version>7)if($.browser.msie&&$.browser.version<9)
                    document.write('<div class="fb"><iframe src="/9jagallery//www.facebook.com/plugins/like.php;jsessionid=4C9A788DF2A51CA6C494B8A4150A4605?href=http://www.9jagallery.ng/6999862?utm_source=facebook&utm_medium=social&utm_campaign=9jagallery_web&amp;action=recommend&amp;send=false&amp;layout=button_count&amp;width=168&amp;show_faces=false&amp;font&amp;colorscheme=light&amp;height=21&amp;appId=964432556951162" scrolling="no" frameborder="0" style="border:none;overflow:hidden;width:168px;height:21px;" allowTransparency="true"></iframe></div>');
                else
                    document.write('<div id="fwid5" class="fb"><fb:share-button type="button_count" href="http://www.9jagallery.ng/6999862?utm_source=facebook&utm_medium=social&utm_campaign=9jagallery_web"></fb:share-button></div>');
                    fbAsyncIds.push('fwid5');
                    fbStatUrls.push('https://www.blick.ch/stats/?rt=1&amp;objId=6999862&amp;type=article&amp;ctxId=1912&amp;pubId=2&amp;cat=news&amp;meta=like&amp;title=See+the+male+celebrities+in+trad%2C+dapper+looks+on+the+red+carpet&amp;url=http%3A%2F%2Fwww.9jagallery.ng%2Ffashion%2Famaa-2017-see-the-male-celebrities-in-trad-dapper-looks-on-the-red-carpet-id6999862.html');
                
                /* ]]> */
            </script><div id="fwid5" class="fb"><fb:share-button type="button_count" href="http://www.9jagallery.ng/6999862?utm_source=facebook&amp;utm_medium=social&amp;utm_campaign=9jagallery_web" class=" fb_iframe_widget" fb-xfbml-state="rendered" fb-iframe-plugin-query="app_id=&amp;container_width=0&amp;href=http%3A%2F%2Fwww.9jagallery.ng%2F6999862%3Futm_source%3Dfacebook%26utm_medium%3Dsocial%26utm_campaign%3D9jagallery_web&amp;locale=en_US&amp;sdk=joey&amp;type=button_count"><span style="vertical-align: bottom; width: 69px; height: 20px;"><iframe name="fa3d670dae4582"   frameborder="0" allowtransparency="true" allowfullscreen="true" scrolling="no" title="fb:share_button Facebook Social Plugin" src="https://www.facebook.com/plugins/share_button.php?app_id=&amp;channel=http%3A%2F%2Fstaticxx.facebook.com%2Fconnect%2Fxd_arbiter%2Fr%2FXBwzv5Yrm_1.js%3Fversion%3D42%23cb%3Df3203d50df82472%26domain%3Dwww.9jagallery.ng%26origin%3Dhttp%253A%252F%252Fwww.9jagallery.ng%252Ff31a62e488eb1d4%26relation%3Dparent.parent&amp;container_width=0&amp;href=http%3A%2F%2Fwww.9jagallery.ng%2F6999862%3Futm_source%3Dfacebook%26utm_medium%3Dsocial%26utm_campaign%3D9jagallery_web&amp;locale=en_US&amp;sdk=joey&amp;type=button_count" class="" style="border: none; visibility: visible; width: 69px; height: 20px;"></iframe></span></fb:share-button></div>
    <div class="gp">
        <div id="___plusone_0" style="text-indent: 0px; margin: 0px; padding: 0px; background-color: transparent; border-style: none; float: none; line-height: normal; font-size: 1px; vertical-align: baseline; display: inline-block; width: 90px; height: 20px; background-position: initial initial; background-repeat: initial initial;"><iframe ng-non-bindable="" frameborder="0" hspace="0" margin margin scrolling="no" style="position: static; top: 0px; width: 90px; margin: 0px; border-style: none; left: 0px; visibility: visible; height: 20px;" tabindex="0" vspace="0"  id="I0_1500641915432" name="I0_1500641915432" src="https://apis.google.com/u/0/se/0/_/+1/fastbutton?usegapi=1&amp;size=medium&amp;hl=en&amp;origin=http%3A%2F%2Fwww.9jagallery.ng&amp;url=http%3A%2F%2Fwww.9jagallery.ng%2F6999862&amp;gsrc=3p&amp;jsh=m%3B%2F_%2Fscs%2Fapps-static%2F_%2Fjs%2Fk%3Doz.gapi.en.m8KuVzGTpkA.O%2Fm%3D__features__%2Fam%3DAQ%2Frt%3Dj%2Fd%3D1%2Frs%3DAGLTcCNcaOvNVX1pvUOBoBGzpH6DVnAaSQ#_methods=onPlusOne%2C_ready%2C_close%2C_open%2C_resizeMe%2C_renderstart%2Concircled%2Cdrefresh%2Cerefresh&amp;id=I0_1500641915432&amp;parent=http%3A%2F%2Fwww.9jagallery.ng&amp;pfname=&amp;rpctoken=76330505" data-gapiattached="true" title="G+"></iframe></div>
    </div><div class="tw" style="margin-right:0px;">
        <iframe id="twitter-widget-1" scrolling="no" frameborder="0" allowtransparency="true" class="twitter-share-button twitter-share-button-rendered twitter-tweet-button" title="Twitter Tweet Button" src="http://platform.twitter.com/widgets/tweet_button.5f60791584f95f2ec483faec8b16a58b.en.html#dnt=false&amp;id=twitter-widget-1&amp;lang=en&amp;original_referer=http%3A%2F%2Fwww.9jagallery.ng%2Ffashion%2Famaa-2017-see-the-male-celebrities-in-trad-dapper-looks-on-the-red-carpet-id6999862.html&amp;size=m&amp;text=AMAA%202017%3A%20See%20the%20male%20celebrities%20in%20trad%2C%20dapper%20looks%20on%20the%20red%20carpet%20%409jagalleryNigeria247&amp;time=1500641916562&amp;type=share&amp;url=http%3A%2F%2Fwww.9jagallery.ng%2F6999862%3Futm_source%3Dtwitter%26utm_medium%3Dsocial%26utm_campaign%3D9jagallery_web" data-url="http://www.9jagallery.ng/6999862?utm_source=twitter&amp;utm_medium=social&amp;utm_campaign=9jagallery_web" style="position: static; visibility: visible; width: 61px; height: 20px;"></iframe>
    </div>
    </div>
        </div>
<div foobar>jhuj </div>

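Comments:

What is your expected output?

See this answer on another question.

Don't parse HTML with regex when a DOM parser exists.

Have you tried DOMDocument?

I rarely pass up the chance to post a link to the best answer on Stack Overflow ever - thanks!

Answer 1: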

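You can't. The lazy .*? stops at the first </div> it finds, and the pattern has no way to tell which closing tag belongs to the outer div once divs are nested. For this case you should use DOMDocument. It will make your job much easier.

You can load the HTML into a DOMDocument, iterate over all div elements with the getElementsByTagName method, and filter out the ones you need.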
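The approach described above could look roughly like the following. This is only a minimal sketch, not the answerer's own code: the function name removeDivsByClass and the shortened sample markup are made up for illustration, and the UTF-8 hint prepended to the input is a common workaround rather than something the question requires.

```php
<?php
// Hypothetical helper (the name is illustrative): removes every <div> whose
// class attribute contains $class, together with everything nested inside it.
function removeDivsByClass(string $html, string $class): string
{
    // Real-world markup rarely validates, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // The prepended declaration is an assumption: it hints that the input is UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName returns a live list, so copy it before removing nodes.
    $divs = iterator_to_array($doc->getElementsByTagName('div'));

    foreach ($divs as $div) {
        $classes = preg_split('/\s+/', trim($div->getAttribute('class')));
        if (in_array($class, $classes, true) && $div->parentNode !== null) {
            $div->parentNode->removeChild($div);
        }
    }

    // loadHTML wraps fragments in <html><body>; return only the body's children.
    $body = $doc->getElementsByTagName('body')->item(0);
    $result = '';
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $result .= $doc->saveHTML($child);
        }
    }
    return $result;
}

// Shortened stand-in for the markup in the question.
$html = '<div class="sn_published"><div class="sn_published_inner">'
      . '<div class="fb">share</div></div></div><div id="keep">jhuj </div>';

echo removeDivsByClass($html, 'sn_published'), PHP_EOL;
```

Because the parser builds a tree, the nested divs are removed along with their parent, which is exactly the part the regex could not handle. Note that DOMDocument may normalize the remaining markup (attribute quoting, entities), so the output is not guaranteed to be byte-identical to the input.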
