Download attachments from Gmail using the Gmail API
[Original title]: Download attachments from Gmail using Gmail API [Posted]: 2014-11-08 01:40:26
[Question]: I am using the Gmail API to access my Gmail data, together with the Google Python API client.
The documentation for getting a message attachment provides a Python sample. I tried the same code and got this error:
AttributeError: 'Resource' object has no attribute 'user'
The line that raises the error:
message = service.user().messages().get(userId=user_id, id=msg_id).execute()
So I replaced user() with users():
message = service.users().messages().get(userId=user_id, id=msg_id).execute()
But now I do not get part['body']['data'] inside the loop for part in message['payload']['parts'].
[Answer 1]: Expanding on @Eric's answer, I wrote the following corrected version of the GetAttachments function from the documentation:
# based on Python example from
# https://developers.google.com/gmail/api/v1/reference/users/messages/attachments/get
# which is licensed under Apache 2.0 License
import base64
from googleapiclient import errors  # "apiclient" is the legacy alias of this package

def GetAttachments(service, user_id, msg_id):
    """Get and store attachment from Message with given id.

    :param service: Authorized Gmail API service instance.
    :param user_id: User's email address. The special value "me" can be used to indicate the authenticated user.
    :param msg_id: ID of Message containing attachment.
    """
    try:
        message = service.users().messages().get(userId=user_id, id=msg_id).execute()
        for part in message['payload']['parts']:
            if part['filename']:
                if 'data' in part['body']:
                    # Small attachments are returned inline.
                    data = part['body']['data']
                else:
                    # Larger attachments must be fetched separately by ID.
                    att_id = part['body']['attachmentId']
                    att = service.users().messages().attachments().get(userId=user_id, messageId=msg_id, id=att_id).execute()
                    data = att['data']
                file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
                path = part['filename']
                # Binary mode: the decoded attachment is bytes, not text.
                with open(path, 'wb') as f:
                    f.write(file_data)
    except errors.HttpError as error:
        print('An error occurred: %s' % error)
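[Comments]:
For those who cannot write the file: use 'wb', because the data is sometimes not a string but binary.
What about inline images?
[Answer 2]: You can still miss attachments with @Ilya V. Schurov's or @Cam T's answer, because the email structure can vary from message to message.
Inspired by this answer, here is how I solved the problem.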
import base64
import os
from googleapiclient import errors  # "apiclient" is the legacy alias of this package

def GetAttachments(service, user_id, msg_id, store_dir=""):
    """Get and store attachment from Message with given id.

    Args:
        service: Authorized Gmail API service instance.
        user_id: User's email address. The special value "me"
            can be used to indicate the authenticated user.
        msg_id: ID of Message containing attachment.
        store_dir: The directory used to store attachments.
    """
    try:
        message = service.users().messages().get(userId=user_id, id=msg_id).execute()
        # Walk the whole MIME tree, starting from the top-level payload.
        parts = [message['payload']]
        while parts:
            part = parts.pop()
            if part.get('parts'):
                parts.extend(part['parts'])
            if part.get('filename'):
                if 'data' in part['body']:
                    file_data = base64.urlsafe_b64decode(part['body']['data'].encode('UTF-8'))
                    #self.stdout.write('FileData for %s, %s found! size: %s' % (message['id'], part['filename'], part['size']))
                elif 'attachmentId' in part['body']:
                    attachment = service.users().messages().attachments().get(
                        userId=user_id, messageId=message['id'], id=part['body']['attachmentId']
                    ).execute()
                    file_data = base64.urlsafe_b64decode(attachment['data'].encode('UTF-8'))
                    #self.stdout.write('FileData for %s, %s found! size: %s' % (message['id'], part['filename'], attachment['size']))
                else:
                    file_data = None
                if file_data:
                    # do some stuff, e.g.
                    path = os.path.join(store_dir, part['filename'])
                    # Binary mode: the decoded attachment is bytes, not text.
                    with open(path, 'wb') as f:
                        f.write(file_data)
    except errors.HttpError as error:
        print('An error occurred: %s' % error)
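The stack-based traversal in the function above can also be factored into a small generator, which makes it easy to check against a hand-built payload without calling the API. This is only a sketch: iter_parts, attachment_parts, and sample_payload are hypothetical names, and the sample dictionary is a simplified stand-in for a real Gmail payload.

```python
def iter_parts(payload):
    """Yield the payload itself and every nested MIME part, depth first."""
    stack = [payload]
    while stack:
        part = stack.pop()
        yield part
        # Push children in reverse so they come back out in document order.
        stack.extend(reversed(part.get('parts', [])))


def attachment_parts(payload):
    """Yield only the parts that carry a non-empty filename."""
    for part in iter_parts(payload):
        if part.get('filename'):
            yield part


# Simplified, hand-written stand-in for message['payload'] (an assumption,
# not captured API output): a text body nested two levels deep plus a PDF.
sample_payload = {
    'mimeType': 'multipart/mixed', 'filename': '', 'body': {'size': 0},
    'parts': [
        {'mimeType': 'multipart/alternative', 'filename': '', 'body': {'size': 0},
         'parts': [
             {'mimeType': 'text/plain', 'filename': '', 'body': {'data': 'aGk='}},
         ]},
        {'mimeType': 'application/pdf', 'filename': 'report.pdf',
         'body': {'attachmentId': 'ATT1'}},
    ],
}

visited = [p['mimeType'] for p in iter_parts(sample_payload)]
names = [p['filename'] for p in attachment_parts(sample_payload)]
print(visited)
print(names)
```

Because the top-level payload is yielded too, a message whose only content is a single attachment with no 'parts' key is still picked up.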
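[Comments]:
How do you handle the missing attachments? All I see is a file_data = None, and then it does nothing.
Look at the while statement, that is where the difference comes from. The final else: file_data = None is only there for code safety.
Ah, I see. The difference is that you also handle data at the top level (payload['body']['data']), while the other answers only look at the bodies inside the parts (payload['parts']).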
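Use 'wb' instead of 'w'.
I can download PDF attachments with your approach, but the PDFs will not open. I get the error "There was an error opening this document. The file is damaged and could not be repaired."
[Answer 3]: I tested the code above but it did not work, so I updated a few things based on other posts. WriteFileError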
import base64
from googleapiclient import errors  # "apiclient" is the legacy alias of this package

def GetAttachments(service, user_id, msg_id, prefix=""):
    """Get and store attachment from Message with given id.

    Args:
        service: Authorized Gmail API service instance.
        user_id: User's email address. The special value "me"
            can be used to indicate the authenticated user.
        msg_id: ID of Message containing attachment.
        prefix: prefix which is added to the attachment filename on saving
    """
    try:
        message = service.users().messages().get(userId=user_id, id=msg_id).execute()
        # Messages without a 'parts' key simply yield no attachments here.
        for part in message['payload'].get('parts', []):
            if part['filename']:
                if 'data' in part['body']:
                    data = part['body']['data']
                else:
                    att_id = part['body']['attachmentId']
                    att = service.users().messages().attachments().get(userId=user_id, messageId=msg_id, id=att_id).execute()
                    data = att['data']
                file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
                path = prefix + part['filename']
                with open(path, 'wb') as f:
                    f.write(file_data)
    except errors.HttpError as error:
        print('An error occurred: %s' % error)
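The switch to 'wb' is likely what resolves the write failures reported in this thread. In Python 3, base64.urlsafe_b64decode returns bytes, and writing bytes to a file opened in text mode raises a TypeError. In Python 2 on Windows, text mode translates newline bytes, which can corrupt binary files such as PDFs. The sketch below round-trips some binary data the same way; decode_gmail_data and save_bytes are hypothetical helper names, and restoring stripped '=' padding is a defensive assumption rather than documented Gmail behavior.

```python
import base64
import os
import tempfile


def decode_gmail_data(data):
    """Decode a URL-safe base64 string into raw bytes."""
    # Defensive assumption: restore any trailing '=' padding that was stripped.
    padded = data + '=' * (-len(data) % 4)
    return base64.urlsafe_b64decode(padded)


def save_bytes(path, file_data):
    """Write raw bytes to disk without any text encoding or newline translation."""
    with open(path, 'wb') as f:
        f.write(file_data)


# Made-up binary content standing in for a real attachment.
original = b'%PDF-1.4\r\n\xff\xfe binary payload'
encoded = base64.urlsafe_b64encode(original).decode('ascii').rstrip('=')

decoded = decode_gmail_data(encoded)
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.pdf')
save_bytes(demo_path, decoded)

with open(demo_path, 'rb') as f:
    round_trip = f.read()
print(round_trip == original)
```

If a downloaded file still will not open, comparing its size on disk with the 'size' field of the attachment body is a quick sanity check.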
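[Answer 4]: It is definitely users().
The format of the response Message depends largely on the format parameter you use. With the default (FULL), each part will have either part['body']['data'] or, when the data is large, an attachmentId field that you can pass to messages().attachments().get().
If you look at the attachments documentation you will see this: https://developers.google.com/gmail/api/v1/reference/users/messages/attachments
(It would be nice if this were also mentioned on the main Message documentation page.)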
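The two cases described above, inline data versus an attachment ID that has to be fetched separately, can be sketched as one helper. The name get_part_bytes is hypothetical, and fake_service is a unittest.mock stand-in so the snippet runs without credentials. With a real client, service would be the authorized Gmail API service object.

```python
import base64
from unittest.mock import MagicMock


def get_part_bytes(service, user_id, msg_id, part):
    """Return the decoded bytes of one message part, or None if it has no data."""
    body = part.get('body', {})
    if 'data' in body:
        # Small bodies are returned inline with the message.
        data = body['data']
    elif 'attachmentId' in body:
        # Larger bodies are fetched with a second request.
        att = service.users().messages().attachments().get(
            userId=user_id, messageId=msg_id, id=body['attachmentId']
        ).execute()
        data = att['data']
    else:
        return None
    return base64.urlsafe_b64decode(data)


# Stand-in for an authorized service object (an assumption for this demo only).
fake_service = MagicMock()
fake_get = fake_service.users.return_value.messages.return_value.attachments.return_value.get
fake_get.return_value.execute.return_value = {
    'data': base64.urlsafe_b64encode(b'big file').decode('ascii'),
    'size': 8,
}

inline = get_part_bytes(fake_service, 'me', 'MSG1', {'body': {'data': 'aGVsbG8='}})
remote = get_part_bytes(fake_service, 'me', 'MSG1', {'body': {'attachmentId': 'ATT1'}})
empty = get_part_bytes(fake_service, 'me', 'MSG1', {'body': {'size': 0}})
print(inline, remote, empty)
```

Returning None for a part with neither field lets the caller skip container parts such as multipart/alternative without raising a KeyError.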
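[Answer 5]: I made the following changes to the code above, and it works fine for every email ID whose message contains an attachment. I hope this helps, because the API sample raises a KeyError.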
import base64
import os
from googleapiclient import errors  # "apiclient" is the legacy alias of this package

def GetAttachments(service, user_id, msg_id, store_dir):
    """Get and store attachment from Message with given id.

    Args:
        service: Authorized Gmail API service instance.
        user_id: User's email address. The special value "me"
            can be used to indicate the authenticated user.
        msg_id: ID of Message containing attachment.
        store_dir: directory in which the attachments are saved
    """
    try:
        message = service.users().messages().get(userId=user_id, id=msg_id).execute()
        for part in message['payload']['parts']:
            newvar = part['body']
            # Only parts that reference a separately stored attachment are handled.
            if 'attachmentId' in newvar:
                att_id = newvar['attachmentId']
                att = service.users().messages().attachments().get(userId=user_id, messageId=msg_id, id=att_id).execute()
                data = att['data']
                file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
                print(part['filename'])
                path = os.path.join(store_dir, part['filename'])
                with open(path, 'wb') as f:
                    f.write(file_data)
    except errors.HttpError as error:
        print('An error occurred: %s' % error)
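Google Official API for Attachments
[Comments]:
Use 'wb' instead of 'w'.
[Answer 6]: Thanks to @Ilya V. Schurov and @Todor for their answers. You may still miss messages if the same search string matches mail both with and without attachments. Here is my approach for getting the message body for both kinds of mail, with and without attachments.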
import base64
from googleapiclient import errors  # "apiclient" is the legacy alias of this package

def get_attachments(service, msg_id):
    try:
        message = service.users().messages().get(userId='me', id=msg_id).execute()
        for part in message['payload']['parts']:
            if part['filename']:
                if 'data' in part['body']:
                    data = part['body']['data']
                else:
                    att_id = part['body']['attachmentId']
                    att = service.users().messages().attachments().get(userId='me', messageId=msg_id, id=att_id).execute()
                    data = att['data']
                file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
                path = part['filename']
                with open(path, 'wb') as f:
                    f.write(file_data)
    except errors.HttpError as error:
        print('An error occurred: %s' % error)

def get_message(service, msg_id):
    try:
        message = service.users().messages().get(userId='me', id=msg_id).execute()
        data = None  # stays None when no text/plain part is found
        if message['payload']['mimeType'] == 'multipart/mixed':
            for part in message['payload']['parts']:
                # Attachment parts have no nested 'parts', so default to an empty list.
                for sub_part in part.get('parts', []):
                    if sub_part['mimeType'] == 'text/plain':
                        data = sub_part['body']['data']
                        break
                if data:
                    break
        else:
            for part in message['payload'].get('parts', []):
                if part['mimeType'] == 'text/plain':
                    data = part['body']['data']
                    break
        if data is None:
            return None
        # Body data is URL-safe base64, so use the matching decoder.
        content = base64.urlsafe_b64decode(data).decode('utf-8')
        print(content)
        return content
    except errors.HttpError as error:
        print('An error occurred: %s' % error)
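The body lookup above depends on how deeply the text/plain part is nested, so a recursive search may be more robust. The following is a minimal sketch under stated assumptions: find_plain_text is a hypothetical name, the body is assumed to be UTF-8 (the real charset is declared in the part headers), and the sample payloads are hand-built rather than captured API output.

```python
import base64


def find_plain_text(payload):
    """Return the first non-attachment text/plain body in the payload, or None."""
    if payload.get('mimeType') == 'text/plain' and not payload.get('filename'):
        data = payload.get('body', {}).get('data')
        if data:
            # Assumption: UTF-8; undecodable bytes are replaced, not raised.
            return base64.urlsafe_b64decode(data).decode('utf-8', errors='replace')
    for part in payload.get('parts', []):
        text = find_plain_text(part)
        if text is not None:
            return text
    return None


def _b64(text):
    """Encode a string the way body data is encoded (URL-safe base64)."""
    return base64.urlsafe_b64encode(text.encode('utf-8')).decode('ascii')


plain = {'mimeType': 'text/plain', 'filename': '', 'body': {'data': _b64('héllo')}}
html = {'mimeType': 'text/html', 'filename': '', 'body': {'data': _b64('<p>héllo</p>')}}
pdf = {'mimeType': 'application/pdf', 'filename': 'a.pdf', 'body': {'attachmentId': 'ATT1'}}

with_attachment = {'mimeType': 'multipart/mixed', 'parts': [
    {'mimeType': 'multipart/alternative', 'parts': [plain, html]},
    pdf,
]}
without_attachment = {'mimeType': 'multipart/alternative', 'parts': [plain, html]}

print(find_plain_text(with_attachment))
print(find_plain_text(without_attachment))
print(find_plain_text(html))
```

The same function handles a message whose top-level payload is itself text/plain, a case the two-level loops do not cover.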
[Answer 7]:
from __future__ import print_function
import base64
import os.path
from googleapiclient.discovery import build
from oauth2client import client, file, tools

SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
store_dir = os.getcwd()

def attachment_download():
    # Reuse stored credentials, or run the OAuth flow if they are missing or invalid.
    store = file.Storage('credentials_gmail.json')
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets('client_secrets.json', SCOPES)
        creds = tools.run_flow(flow, store)
    try:
        service = build('gmail', 'v1', credentials=creds)
        results = service.users().messages().list(userId='me', labelIds=['XXXX']).execute()  # XXXX is a label ID; use INBOX to download from the inbox
        messages = results.get('messages', [])
        for message in messages:
            msg = service.users().messages().get(userId='me', id=message['id']).execute()
            for part in msg['payload'].get('parts', []):
                if part['filename']:
                    if 'data' in part['body']:
                        data = part['body']['data']
                    else:
                        att_id = part['body']['attachmentId']
                        att = service.users().messages().attachments().get(userId='me', messageId=message['id'], id=att_id).execute()
                        data = att['data']
                    file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
                    filename = part['filename']
                    print(filename)
                    # The 'Downloaded files' directory must already exist.
                    path = os.path.join(store_dir, 'Downloaded files', filename)
                    with open(path, 'wb') as f:
                        f.write(file_data)
    except Exception as error:
        print(error)
See also, for obtaining label IDs: https://developers.google.com/gmail/api/v1/reference/users/labels/list
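As a rough sketch of how the label ID used in place of 'XXXX' could be looked up by its display name: the helper name find_label_id is hypothetical, and fake_service is a unittest.mock stand-in that returns a made-up label list so the snippet runs without credentials. With a real client, service would be the object returned by build('gmail', 'v1', ...).

```python
from unittest.mock import MagicMock


def find_label_id(service, label_name, user_id='me'):
    """Return the ID of the label with the given display name, or None."""
    response = service.users().labels().list(userId=user_id).execute()
    for label in response.get('labels', []):
        if label.get('name') == label_name:
            return label['id']
    return None


# Stand-in for an authorized service object; the label data is invented.
fake_service = MagicMock()
fake_list = fake_service.users.return_value.labels.return_value.list
fake_list.return_value.execute.return_value = {
    'labels': [
        {'id': 'INBOX', 'name': 'INBOX', 'type': 'system'},
        {'id': 'Label_12', 'name': 'Invoices', 'type': 'user'},
    ]
}

invoices_id = find_label_id(fake_service, 'Invoices')
missing_id = find_label_id(fake_service, 'No such label')
print(invoices_id, missing_id)
```

The returned ID is what goes into labelIds=[...] in the messages().list() call.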