Robots鍗忚

Posted LuBaogui

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了Robots鍗忚相关的知识,希望对你有一定的参考价值。

Robots鍗忚

Robots 鍗忚涔熺О浣滄満鍣ㄤ汉鍗忚锛屼富瑕佺敤浜庢悳绱㈠紩鎿庡幓鎶撳彇缃戠珯椤甸潰銆傪煍嶉€氬父瀛樻斁鍦ㄧ綉绔欐牴鐩綍涓嬬殑robots.txt鏂囦欢銆傝鍗忚涓昏鍏嶅幓涓嶅繀瑕佺殑缃戠珯璺緞杩涜鐖彇銆傚浜庨拡瀵规€х殑鐖櫕銆傪煒勪篃灏辨病浠€涔堟剰涔変簡銆傚氨濂芥瘮鍛婅瘔灏忓伔锛屽埆鍋蜂綘鐨勪笢瑗裤€?/p>

鍩轰簬閬靛惊Robots鍗忚杩涜鐨勭埇铏紝棣栧厛浼氭鏌ョ珯鐐规牴鐩綍涓嬫槸鍚﹀瓨鍦╮obots.txt鏂囦欢锛屽鏋滃瓨鍦ㄥ垯鏍规嵁鍏朵腑瀹氫箟鐨勭埇鍙栬寖鍥磋繘琛岀埇鍙栥€傚鏋滄病鏈夊垯鐩存帴璁块棶椤甸潰銆?/p>

Robots瑙勮寖

鐢ㄦ埛浠g悊鎸囦护

浣跨敤user-agent鎸囦护鐢ㄤ簬鎸囧畾瑙勫垯閫傜敤浜庢墍鏈夌埇缃戠▼搴忥細

User-agent锛?*

涓昏鏈塆ooglebot銆丅aiduSpider绛夋爣璇?/p>

绂佹鎸囦护

閫氳繃涓€涓垨澶氫釜disallow 鎸囦护鏉ラ伒寰敤鎴蜂唬鐞嗚锛?/p>

User-agent锛? 
Disallow锛?User

disallow鎸囧畾url鍚庣紑绱ф帴鐫€/User鐨勯摼鎺ュ垯琚樆姝€?/p>

鍏佽鎸囦护

閫氳繃allow鎸囦护鍙互閬垮紑disallow闃绘鐨勯摼鎺ワ細

User-agent锛? 
Allow锛?User/007
Disallow锛?User

鍦╠isallow鎸囧畾url鍚庣紑绱ф帴鐫€/User鐨勯〉闈㈠垯琚樆姝㈠悗锛屽厑璁哥埇鍙?User/007閾炬帴鍦板潃銆?/p>

Sitemap 鎸囦护

涓昏鐢ㄤ簬鏍囪瘑缃戠珯鍦板浘锛?/p>

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://www.lubaogui.com/wp-sitemap.xml

鍦ㄥ悇涓悳绱㈠紩鎿庣珯闀垮伐鍏蜂笂浣跨敤锛屼究浜庢悳绱㈠紩鎿庢敹褰曫煍嶃€?/p>

Robots 浣跨敤

浣跨敤robotparser妯″潡杩涜妫€楠屾槸鍚﹂伒寰猺obots鍗忚锛屼唬鐮佸涓嬶細

from urllib.robotparser import RobotFileParser 
from urllib.request import urlopen

rp =RobotFileParser()
rp.parse(urlopen(\'https://www.lubaogui.com/robots.txt\')read()decode(\'utf-8\').split(\'n\')) 
print(rp.can_fetch(\'*\', \'https://www.lubaogui.com/96\'))

=== 鎵撳嵃缁撴灉 ===
True

馃懆鈥嶐煉昏櫧鐒剁埇铏櫘閬嶅苟娌$敤鍘婚伒寰猺obots.txt鍗忚锛屼絾鏄缓璁悇浣嶈繘琛岀埇鍙栨椂锛屽悎鐞嗗鐞嗘彁鍙栨晥鐜囥€傞伩鍏嶅奖鍝嶇洰鏍囩珯鐐圭殑璐熻浇銆?/p>

以上是关于Robots鍗忚的主要内容,如果未能解决你的问题,请参考以下文章

ppp鍗忚

KoP 姝e紡寮€婧愶細鍦?Apache Pulsar 涓婃敮鎸佸師鐢?Kafka 鍗忚

钃濈墮SDP鍗忚姒傝堪

HTTP1.1鍗忚涓枃鐗?RFC2616

璁$畻鏈虹綉缁溿€?銆戔€斺€?CSMA/CD鍗忚

娣卞叆鐞嗚В Http 鍜?Https