Regex to select all the content within the div
【Posted】: 2017-12-27 17:51:18
【Question】: I am trying to build a PHP regex pattern for a multiline string that removes the div with the class sn_published.
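I have tried (?s)<div class="sn_published".*?<\/div>
It matches, but it does not capture the whole content of the div: the lazy .*? stops at the first closing </div>, which belongs to a nested div.
Any ideas? Thanks in advance.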
<div class="sn_published">
<div id="snfwid4" class="sn_published_inner clearfix">
<script type="text/javascript">
/* <![CDATA[ */
if(!$.browser.msie||$.browser.version>7)if($.browser.msie&&$.browser.version<9)
document.write('<div class="fb"><iframe src="/9jagallery//www.facebook.com/plugins/like.php;jsessionid=4C9A788DF2A51CA6C494B8A4150A4605?href=http://www.9jagallery.ng/6999862?utm_source=facebook&utm_medium=social&utm_campaign=9jagallery_web&action=recommend&send=false&layout=button_count&width=168&show_faces=false&font&colorscheme=light&height=21&appId=964432556951162" scrolling="no" frameborder="0" style="border:none;overflow:hidden;width:168px;height:21px;" allowTransparency="true"></iframe></div>');
else
document.write('<div id="fwid5" class="fb"><fb:share-button type="button_count" href="http://www.9jagallery.ng/6999862?utm_source=facebook&utm_medium=social&utm_campaign=9jagallery_web"></fb:share-button></div>');
fbAsyncIds.push('fwid5');
fbStatUrls.push('https://www.blick.ch/stats/?rt=1&objId=6999862&type=article&ctxId=1912&pubId=2&cat=news&meta=like&title=See+the+male+celebrities+in+trad%2C+dapper+looks+on+the+red+carpet&url=http%3A%2F%2Fwww.9jagallery.ng%2Ffashion%2Famaa-2017-see-the-male-celebrities-in-trad-dapper-looks-on-the-red-carpet-id6999862.html');
/* ]]> */
</script><div id="fwid5" class="fb"><fb:share-button type="button_count" href="http://www.9jagallery.ng/6999862?utm_source=facebook&utm_medium=social&utm_campaign=9jagallery_web" class=" fb_iframe_widget" fb-xfbml-state="rendered" fb-iframe-plugin-query="app_id=&container_width=0&href=http%3A%2F%2Fwww.9jagallery.ng%2F6999862%3Futm_source%3Dfacebook%26utm_medium%3Dsocial%26utm_campaign%3D9jagallery_web&locale=en_US&sdk=joey&type=button_count"><span style="vertical-align: bottom; width: 69px; height: 20px;"><iframe name="fa3d670dae4582" frameborder="0" allowtransparency="true" allowfullscreen="true" scrolling="no" title="fb:share_button Facebook Social Plugin" src="https://www.facebook.com/plugins/share_button.php?app_id=&channel=http%3A%2F%2Fstaticxx.facebook.com%2Fconnect%2Fxd_arbiter%2Fr%2FXBwzv5Yrm_1.js%3Fversion%3D42%23cb%3Df3203d50df82472%26domain%3Dwww.9jagallery.ng%26origin%3Dhttp%253A%252F%252Fwww.9jagallery.ng%252Ff31a62e488eb1d4%26relation%3Dparent.parent&container_width=0&href=http%3A%2F%2Fwww.9jagallery.ng%2F6999862%3Futm_source%3Dfacebook%26utm_medium%3Dsocial%26utm_campaign%3D9jagallery_web&locale=en_US&sdk=joey&type=button_count" class="" style="border: none; visibility: visible; width: 69px; height: 20px;"></iframe></span></fb:share-button></div>
<div class="gp">
<div id="___plusone_0" style="text-indent: 0px; margin: 0px; padding: 0px; background-color: transparent; border-style: none; float: none; line-height: normal; font-size: 1px; vertical-align: baseline; display: inline-block; width: 90px; height: 20px; background-position: initial initial; background-repeat: initial initial;"><iframe ng-non-bindable="" frameborder="0" hspace="0" margin margin scrolling="no" style="position: static; top: 0px; width: 90px; margin: 0px; border-style: none; left: 0px; visibility: visible; height: 20px;" tabindex="0" vspace="0" id="I0_1500641915432" name="I0_1500641915432" src="https://apis.google.com/u/0/se/0/_/+1/fastbutton?usegapi=1&size=medium&hl=en&origin=http%3A%2F%2Fwww.9jagallery.ng&url=http%3A%2F%2Fwww.9jagallery.ng%2F6999862&gsrc=3p&jsh=m%3B%2F_%2Fscs%2Fapps-static%2F_%2Fjs%2Fk%3Doz.gapi.en.m8KuVzGTpkA.O%2Fm%3D__features__%2Fam%3DAQ%2Frt%3Dj%2Fd%3D1%2Frs%3DAGLTcCNcaOvNVX1pvUOBoBGzpH6DVnAaSQ#_methods=onPlusOne%2C_ready%2C_close%2C_open%2C_resizeMe%2C_renderstart%2Concircled%2Cdrefresh%2Cerefresh&id=I0_1500641915432&parent=http%3A%2F%2Fwww.9jagallery.ng&pfname=&rpctoken=76330505" data-gapiattached="true" title="G+"></iframe></div>
</div><div class="tw" style="margin-right:0px;">
<iframe id="twitter-widget-1" scrolling="no" frameborder="0" allowtransparency="true" class="twitter-share-button twitter-share-button-rendered twitter-tweet-button" title="Twitter Tweet Button" src="http://platform.twitter.com/widgets/tweet_button.5f60791584f95f2ec483faec8b16a58b.en.html#dnt=false&id=twitter-widget-1&lang=en&original_referer=http%3A%2F%2Fwww.9jagallery.ng%2Ffashion%2Famaa-2017-see-the-male-celebrities-in-trad-dapper-looks-on-the-red-carpet-id6999862.html&size=m&text=AMAA%202017%3A%20See%20the%20male%20celebrities%20in%20trad%2C%20dapper%20looks%20on%20the%20red%20carpet%20%409jagalleryNigeria247&time=1500641916562&type=share&url=http%3A%2F%2Fwww.9jagallery.ng%2F6999862%3Futm_source%3Dtwitter%26utm_medium%3Dsocial%26utm_campaign%3D9jagallery_web" data-url="http://www.9jagallery.ng/6999862?utm_source=twitter&utm_medium=social&utm_campaign=9jagallery_web" style="position: static; visibility: visible; width: 61px; height: 20px;"></iframe>
</div>
</div>
</div>
<div foobar>jhuj </div>
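【Comments】:
What is your expected output?
See this answer on another question.
Do not parse HTML with regex when a DOM parser exists.
Have you tried DOMDocument?
I rarely pass up a chance to post a link to the best answer on Stack Overflow ever - thanks!
【Answer 1】:
You can't. For this case you should use DOMDocument. It will make your job a lot easier.
You can load the HTML into a DOM, use the getElementsByTagName method to iterate over all the divs, and filter for the ones you need.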
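A minimal sketch of that approach follows, assuming the goal is to drop every div whose class list contains sn_published and keep the rest of the fragment. The function name removeDivsByClass and the shortened sample markup are made up for illustration and are not from the question. Character-encoding handling is left out, and real pages with unusual markup may need extra care.

```php
<?php
// Hypothetical helper: removes every <div> whose class attribute contains $class.
function removeDivsByClass(string $html, string $class): string
{
    $doc = new DOMDocument();

    // Real-world markup (e.g. <fb:share-button>) makes libxml emit warnings;
    // collect them internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName returns a live list, so copy it to a plain array
    // before removing nodes; otherwise removals can cause nodes to be skipped.
    $divs = iterator_to_array($doc->getElementsByTagName('div'));

    foreach ($divs as $div) {
        // Split the class attribute on whitespace and look for an exact match.
        $classes = preg_split('/\s+/', trim($div->getAttribute('class')));
        if (in_array($class, $classes, true)) {
            // Removing the node also removes everything nested inside it.
            $div->parentNode->removeChild($div);
        }
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Shortened stand-in for the markup in the question.
$html = '<div class="sn_published"><div class="sn_published_inner clearfix">'
      . '<div class="fb">share</div></div></div><div foobar>jhuj </div>';

echo removeDivsByClass($html, 'sn_published');
```

Because the parser builds a tree, the nested closing tags that trip up the lazy regex are not an issue here: removing the outer node takes all of its descendants with it.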