Using BeautifulSoup to Build Post Summaries and Filter Malicious Tags (XSS Attacks)

Posted 2021-01-05 changwoo


1. The BeautifulSoup Module

2. Post Summaries

3. Filtering Malicious Tags

1. The BeautifulSoup Module

pip install bs4  # install bs4
 
from bs4 import BeautifulSoup  # import BeautifulSoup

2. Post Summaries

from bs4 import BeautifulSoup
 
content = '<a href="http://example.com/">I linked to <i>example.com</i></a>'
soup = BeautifulSoup(content, 'html.parser')
overview = soup.text[0:9]  # first 9 characters of the text, with all tags stripped
print(overview)
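
The slice above can be wrapped in a small reusable helper. The following is only a sketch: the function name `make_overview`, the `max_chars` parameter, the whitespace normalization, and the trailing ellipsis are all assumptions added for illustration and are not part of the original post.

```python
from bs4 import BeautifulSoup


def make_overview(html, max_chars=9):
    """Return at most max_chars characters of the visible text in html.

    Hypothetical helper: the name and the ellipsis convention are assumptions.
    """
    soup = BeautifulSoup(html, 'html.parser')
    # get_text() drops every tag; split/join collapses runs of whitespace.
    text = ' '.join(soup.get_text().split())
    if len(text) <= max_chars:
        return text
    # Mark a truncated summary with an ellipsis.
    return text[:max_chars].rstrip() + '...'


content = '<a href="http://example.com/">I linked to <i>example.com</i></a>'
print(make_overview(content))  # I linked...
```

Because the summary is built from plain text only, any markup in the post body (including script tags) never reaches the summary itself.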

3. Filtering Malicious Tags

from bs4 import BeautifulSoup
 
content = '<a href="http://example.com/">I linked to <i>example.com</i></a><div><img src=""></img>image</div><a>link</a><script>alert(123)</script>'
soup = BeautifulSoup(content, 'html.parser')
print(soup)  # the output still contains the script tag and its code
 
for tag in soup.find_all():
    # Remove every <script> and <link> element together with its contents.
    if tag.name in ['script', 'link']:
        tag.decompose()
 
print(soup)  # the script tag and its code have now been removed
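
The loop above is a blacklist: it removes only the tags it names, so other risky markup (for example an `<iframe>` or an `onclick` attribute) would pass through. One common alternative is a whitelist that keeps only known-safe tags and attributes. The sketch below shows that idea under stated assumptions: the `ALLOWED_TAGS` table and the `sanitize` function are hypothetical names, and the list of allowed tags is an example only, not a vetted policy. It uses `extract()` instead of `decompose()` so that nested tags already collected by `find_all()` can still be visited safely.

```python
from bs4 import BeautifulSoup

# Hypothetical whitelist: tag name -> attributes allowed on that tag.
ALLOWED_TAGS = {
    'a': ['href'],
    'i': [],
    'p': [],
    'div': [],
    'img': ['src'],
}


def sanitize(html):
    """Return html with non-whitelisted tags and attributes removed."""
    soup = BeautifulSoup(html, 'html.parser')
    for tag in soup.find_all():
        if tag.name not in ALLOWED_TAGS:
            # Detach the tag and everything inside it from the tree.
            tag.extract()
            continue
        allowed = ALLOWED_TAGS[tag.name]
        # Keep only whitelisted attributes (drops onclick, style, and so on).
        tag.attrs = {k: v for k, v in tag.attrs.items() if k in allowed}
    return str(soup)


dirty = ('<a href="http://example.com/" onclick="alert(1)">ok</a>'
         '<script>alert(123)</script><iframe src="x"></iframe>')
print(sanitize(dirty))
```

This sketch does not validate URL schemes, so a `javascript:` value in `href` would still get through; real user input is usually better handled by a maintained sanitizing library.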
