Setting up a file upload stream scan using ClamAV in a Django back-end
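【Title】: Setting up a file upload stream scan using ClamAV in a Django back-end 【Posted】: 2018-05-24 00:14:40 【Question】: I'm developing a React/Django application. Users upload files through the React front-end, and those files end up in the Django/DRF back-end. We already run antivirus (AV) continuously on the server, but we want to add stream scanning before a file is written to disk.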
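How to set this up has me a bit stumped. Here are some of the sources I've been looking at.
How do you virus scan a file being uploaded to your java webapp as it streams?
Although the accepted answer describes it as "...very easy" to set up, I'm struggling.
I apparently need to run cat testfile | clamscan -
per that post and the corresponding documentation:
How do you virus scan a file being uploaded to your java webapp as it streams?
Suppose my back-end looks like the following: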
class SaveDocumentAPIView(APIView):
    permission_classes = [IsAuthenticated]

    def post(self, request, *args, **kwargs):
        # this is for handling the files we do want
        # it writes the files to disk and writes them to the database
        for f in request.FILES.getlist('file'):
            max_id = Uploads.objects.all().aggregate(Max('id'))
            if max_id['id__max'] is None:
                max_id = 1
            else:
                max_id = max_id['id__max'] + 1
            data = {
                'user_id': request.user.id,
                'sur_id': kwargs.get('sur_id'),
                'co': User.objects.get(id=request.user.id).co,
                'date_uploaded': datetime.datetime.now(),
                'size': f.size
            }
            filename = str(data['co']) + '_' + \
                str(data['sur_id']) + '_' + \
                str(max_id) + '_' + \
                f.name
            data['doc_path'] = filename
            self.save_file(f, filename)
            serializer = SaveDocumentSerializer(data=data)
            if serializer.is_valid(raise_exception=True):
                serializer.save()
        return Response(status=HTTP_200_OK)

    # Handling the document
    def save_file(self, file, filename):
        with open('fileupload/' + filename, 'wb+') as destination:
            for chunk in file.chunks():
                destination.write(chunk)
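I think I need to add something to the save_file method, such as (pseudocode):
for chunk in file.chunks():
    # run bash command from python
    cat chunk | clamscan -
    if passes_clamscan:
        destination.write(chunk)
        return HttpResponse('It passed')
    else:
        return HttpResponse('Virus detected')
So my questions are:
1) How do I run Bash from Python?
2) How do I receive the result of the scan so that it can be sent back to the user and so that other things can be done with it on the back-end? (For example, logic that emails both the user and an admin to say the file contained a virus.)
I have been playing around with this, but without much luck.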
Running Bash commands in Python
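Also, there are GitHub repositories that claim to integrate ClamAV with Django nicely, but they either haven't been updated in years or their documentation is very poor. See the following: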
https://github.com/vstoykov/django-clamd
https://github.com/musashiXXX/django-clamav-upload
https://github.com/QueraTeam/django-clamav
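For the two questions above (running the scanner from Python and getting its result back), one option is Python's subprocess module rather than a shell. The following is a minimal sketch, not a definitive implementation: the names scan_bytes and ScanResult are hypothetical, the 120-second timeout is an arbitrary assumption, and it assumes clamscan is installed and on the PATH. It pipes the whole upload to `clamscan -` on stdin and maps the exit code (0 for clean, 1 for a detection, anything else for an error) to a status the view can act on.

```python
import subprocess
from typing import NamedTuple, Optional, Sequence


class ScanResult(NamedTuple):
    status: str            # "OK", "FOUND" or "ERROR"
    detail: Optional[str]  # scanner output, useful for emails and logs


def scan_bytes(data: bytes,
               cmd: Sequence[str] = ("clamscan", "--no-summary", "-")) -> ScanResult:
    """Pipe `data` to the scanner's stdin, like `cat testfile | clamscan -`."""
    try:
        proc = subprocess.run(list(cmd), input=data,
                              stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                              timeout=120)  # assumed limit; tune for your uploads
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Scanner missing, not executable, or too slow: report, do not guess.
        return ScanResult("ERROR", str(exc))

    output = proc.stdout.decode(errors="replace").strip()
    # clamscan exit codes: 0 = clean, 1 = virus found, anything else = error.
    if proc.returncode == 0:
        return ScanResult("OK", None)
    if proc.returncode == 1:
        return ScanResult("FOUND", output)
    error_text = proc.stderr.decode(errors="replace").strip()
    return ScanResult("ERROR", error_text or output)
```

Inside save_file, this could be called once on the complete upload, for example `scan_bytes(b"".join(file.chunks()))`, and the file written only when the status is "OK". Scanning the whole file rather than each chunk matters because a signature can span chunk boundaries. Note that clamscan loads its signature database on every run, so a long-running clamd daemon is usually the faster choice for per-request scanning.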
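【Comments】:
A part of a file is unlikely to be detected as a virus. The scanner probably needs the whole file.
【Answer 1】: OK, I got this working with clamd. I modified SaveDocumentAPIView to the following. It scans each file before it is written to disk and prevents infected files from being written. Uninfected files in the same request are still let through, so users don't have to re-upload them.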
import datetime

import clamd
from django.core.mail import send_mail
from django.db.models import Max
from rest_framework.permissions import IsAuthenticated
from rest_framework.response import Response
from rest_framework.status import HTTP_200_OK
from rest_framework.views import APIView

# Uploads, User and SaveDocumentSerializer are the project's own classes;
# import them from wherever your app defines them.


class SaveDocumentAPIView(APIView):
    permission_classes = [IsAuthenticated]

    def post(self, request, *args, **kwargs):
        # create array for files if infected
        infected_files = []
        # setup unix socket to scan stream
        cd = clamd.ClamdUnixSocket()
        # user_obj was not defined in the original snippet;
        # request.user is assumed here
        user_obj = request.user
        # this is for handling the files we do want
        # it writes the files to disk and writes them to the database
        for f in request.FILES.getlist('file'):
            # scan stream
            scan_results = cd.instream(f)
            if scan_results['stream'][0] == 'OK':
                # start to create the file name
                max_id = Uploads.objects.all().aggregate(Max('id'))
                if max_id['id__max'] is None:
                    max_id = 1
                else:
                    max_id = max_id['id__max'] + 1
                data = {
                    'user_id': request.user.id,
                    'sur_id': kwargs.get('sur_id'),
                    'co': User.objects.get(id=request.user.id).co,
                    'date_uploaded': datetime.datetime.now(),
                    'size': f.size
                }
                filename = str(data['co']) + '_' + \
                    str(data['sur_id']) + '_' + \
                    str(max_id) + '_' + \
                    f.name
                data['doc_path'] = filename
                self.save_file(f, filename)
                serializer = SaveDocumentSerializer(data=data)
                if serializer.is_valid(raise_exception=True):
                    serializer.save()
            elif scan_results['stream'][0] == 'FOUND':
                send_mail(
                    'Virus Found in Submitted File',
                    'The user %s %s with email %s has submitted the following file '
                    'flagged as containing a virus: \n\n %s' %
                    (
                        user_obj.first_name,
                        user_obj.last_name,
                        user_obj.email,
                        f.name
                    ),
                    'The Company <no-reply@company.com>',
                    ['admin@company.com']
                )
                infected_files.append(f.name)
        # report the rejected files back to the front-end
        return Response({'filename': infected_files}, status=HTTP_200_OK)

    # Handling the document
    def save_file(self, file, filename):
        with open('fileupload/' + filename, 'wb+') as destination:
            for chunk in file.chunks():
                destination.write(chunk)
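The view above branches on `scan_results['stream'][0]`. With the clamd package, instream returns a dict keyed by 'stream' whose value is a (status, detail) tuple, for example ('OK', None) for a clean stream or ('FOUND', 'Eicar-Test-Signature') for a detection. The sketch below is one possible way to wrap that, not the answer author's code: interpret_scan and scan_upload are hypothetical helper names, and treating every status other than 'OK', as well as an unreachable daemon, as "not clean" is an assumption about the policy you want.

```python
from io import BytesIO
from typing import Optional, Tuple


def interpret_scan(scan_results: dict) -> Tuple[bool, Optional[str]]:
    """Return (is_clean, detail) for the dict returned by `instream`."""
    status, detail = scan_results["stream"]
    if status == "OK":
        return True, None
    # "FOUND" carries the signature name; "ERROR" carries clamd's message.
    return False, detail


def scan_upload(data: bytes) -> Tuple[bool, Optional[str]]:
    """Scan bytes through a local clamd socket, failing closed on errors."""
    # Third-party package; imported lazily so interpret_scan works without it.
    import clamd

    try:
        # Uses the package's default socket path; pass path=... if yours differs.
        cd = clamd.ClamdUnixSocket()
        return interpret_scan(cd.instream(BytesIO(data)))
    except clamd.ClamdError as exc:
        # Covers connection failures and streams larger than clamd accepts.
        return False, "scanner unavailable: %s" % exc
```

One design point worth noting: the if/elif in the view skips a file silently when the status is neither 'OK' nor 'FOUND', so the user gets no feedback for that file. Returning a single is_clean flag makes that third case explicit, and the caller can decide whether to reject the upload or retry.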
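【Discussion】:
How has this implementation worked out for you so far? Would you recommend this approach for scanning images/PDFs?
It works well, with no problems! We've had only a few infected files that people tried to upload. We get an email when it happens so we can follow up with them.
Glad to hear it. I'm only working with PDFs right now, so I'm using PDFiD (but am considering adding ClamAV). Why do you think there isn't a more highly rated or more active Python package for integrating ClamAV? It seems like something many web apps should have, and I'm surprised I had to dig this hard to find a thread from the last 3 years that discusses it.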