Download attachments from Gmail using Gmail API


【Title】Download attachments from Gmail using Gmail API 【Posted】2014-11-08 01:40:26 【Question】:

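I am using the Gmail API to access my Gmail data, together with the Google Python API client.

The documentation for getting a message attachment provides a sample for Python. But when I tried the same code, I got this error:

AttributeError: 'Resource' object has no attribute 'user'

The line where I get the error: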

message = service.user().messages().get(userId=user_id, id=msg_id).execute()

So I tried replacing user() with users():

message = service.users().messages().get(userId=user_id, id=msg_id).execute()

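But I don't get part['body']['data'] inside for part in message['payload']['parts'].

【Comments】:

【Answer 1】:

Expanding on @Eric's answer, I wrote the following corrected version of the GetAttachments function from the documentation: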

# based on Python example from 
# https://developers.google.com/gmail/api/v1/reference/users/messages/attachments/get
# which is licensed under Apache 2.0 License

import base64
from apiclient import errors

def GetAttachments(service, user_id, msg_id):
    """Get and store attachment from Message with given id.

    :param service: Authorized Gmail API service instance.
    :param user_id: User's email address. The special value "me" can be used to indicate the authenticated user.
    :param msg_id: ID of Message containing attachment.
    """
    try:
        message = service.users().messages().get(userId=user_id, id=msg_id).execute()

        for part in message['payload']['parts']:
            if part['filename']:
                if 'data' in part['body']:
                    data = part['body']['data']
                else:
                    att_id = part['body']['attachmentId']
                    att = service.users().messages().attachments().get(userId=user_id, messageId=msg_id,id=att_id).execute()
                    data = att['data']
                file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
                path = part['filename']

                # Attachment content is binary, so open the file in 'wb' mode.
                with open(path, 'wb') as f:
                    f.write(file_data)

    except errors.HttpError as error:
        print('An error occurred: %s' % error)

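【Comments】:

For those who can't write the file: use 'wb', because sometimes the data is not a string; it is actually binary.

And inline images?

【Answer 2】:

You can still miss attachments with @Ilya V. Schurov's or @Cam T's answers, because the structure of an email can differ from message to message.

Inspired by this answer, here is how I solved the problem.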

import base64
from apiclient import errors

def GetAttachments(service, user_id, msg_id, store_dir=""):
    """Get and store attachment from Message with given id.
        Args:
            service: Authorized Gmail API service instance.
            user_id: User's email address. The special value "me"
                can be used to indicate the authenticated user.
            msg_id: ID of Message containing attachment.
            store_dir: The directory used to store attachments.
    """
    try:
        message = service.users().messages().get(userId=user_id, id=msg_id).execute()
        parts = [message['payload']]
        while parts:
            part = parts.pop()
            if part.get('parts'):
                parts.extend(part['parts'])
            if part.get('filename'):
                if 'data' in part['body']:
                    file_data = base64.urlsafe_b64decode(part['body']['data'].encode('UTF-8'))
                    #self.stdout.write('FileData for %s, %s found! size: %s' % (message['id'], part['filename'], part['size']))
                elif 'attachmentId' in part['body']:
                    attachment = service.users().messages().attachments().get(
                        userId=user_id, messageId=message['id'], id=part['body']['attachmentId']
                    ).execute()
                    file_data = base64.urlsafe_b64decode(attachment['data'].encode('UTF-8'))
                    #self.stdout.write('FileData for %s, %s found! size: %s' % (message['id'], part['filename'], attachment['size']))
                else:
                    file_data = None
                if file_data:
                    # do some stuff, e.g.
                    path = ''.join([store_dir, part['filename']])
                    with open(path, 'wb') as f:
                        f.write(file_data)
    except errors.HttpError as error:
        print('An error occurred: %s' % error)
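
To illustrate why the stack-based walk above finds attachments that a single loop over message['payload']['parts'] can miss, here is a minimal sketch that runs on a hand-written payload rather than a real API response. The name iter_attachment_parts and the sample payload (including its attachment ID) are hypothetical and made up for illustration; only the 'parts', 'filename' and 'body' keys come from the code above.

```python
def iter_attachment_parts(payload):
    """Yield every part that has a non-empty filename, at any nesting depth."""
    stack = [payload]
    while stack:
        part = stack.pop()
        # Container parts (multipart/*) carry their children under 'parts'.
        stack.extend(part.get('parts', []))
        if part.get('filename'):
            yield part


# Hypothetical payload: the attachment sits two levels deep.
sample_payload = {
    'mimeType': 'multipart/mixed',
    'filename': '',
    'body': {'size': 0},
    'parts': [
        {'mimeType': 'text/plain', 'filename': '', 'body': {'data': 'aGk='}},
        {
            'mimeType': 'multipart/related',
            'filename': '',
            'body': {'size': 0},
            'parts': [
                {
                    'mimeType': 'application/pdf',
                    'filename': 'report.pdf',
                    'body': {'attachmentId': 'hypothetical-id'},
                },
            ],
        },
    ],
}

# A loop over the top-level parts only sees the first level.
top_level_only = [p['filename'] for p in sample_payload['parts'] if p['filename']]
# The stack-based walk also descends into nested containers.
all_levels = [p['filename'] for p in iter_attachment_parts(sample_payload)]
print(top_level_only)
print(all_levels)
```

In this made-up case the top-level loop finds no filename at all, while the walk reaches the nested PDF part.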

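【Comments】:

How do you handle those missing attachments? All I see is a file_data = None, and then it does nothing.

Look at the while statement; that is where the difference comes from. The final else: file_data = None is just for code safety.

Ah, I see. The difference is that you also handle the data at the top level (payload['body']['data']), whereas the other answers only look at the bodies inside the parts (payload['parts']).

'wb' instead of 'w'

I can download PDF attachments with your approach, but when I open those PDFs they fail to open. I get the error "There was an error opening this document. The file is damaged and could not be repaired."

【Answer 3】:

I tested the code above, but it didn't work. I updated a few things based on the other posts. WriteFileError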

    import base64
    from apiclient import errors


    def GetAttachments(service, user_id, msg_id, prefix=""):
        """Get and store attachment from Message with given id.

        Args:
            service: Authorized Gmail API service instance.
            user_id: User's email address. The special value "me"
                can be used to indicate the authenticated user.
            msg_id: ID of Message containing attachment.
            prefix: prefix which is added to the attachment filename on saving
        """
        try:
            message = service.users().messages().get(userId=user_id, id=msg_id).execute()

            # .get() with a default avoids a KeyError on messages without parts.
            for part in message['payload'].get('parts', []):
                if part['filename']:
                    if 'data' in part['body']:
                        data = part['body']['data']
                    else:
                        att_id = part['body']['attachmentId']
                        att = service.users().messages().attachments().get(userId=user_id, messageId=msg_id, id=att_id).execute()
                        data = att['data']
                    file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
                    path = prefix + part['filename']

                    # Write once per attachment, inside the loop, in binary mode.
                    with open(path, 'wb') as f:
                        f.write(file_data)

        except errors.HttpError as error:
            print('An error occurred: %s' % error)

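【Comments】:

【Answer 4】:

It is definitely users().

The format of the response message depends heavily on the format parameter you use. With the default (FULL), a part will either have part['body']['data'] or, when the data is large, an attachmentId field in its body that can be passed to messages().attachments().get().

If you look at the attachments documentation, you will see this: https://developers.google.com/gmail/api/v1/reference/users/messages/attachments

(It would be nice if this were also mentioned on the main messages documentation page.)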
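
The two cases described above (inline data versus an attachment ID) can be sketched as one small helper. This is only an illustration under stated assumptions: part_bytes and fetch_attachment are hypothetical names, not part of the Gmail client library, and fetch_attachment stands in for whatever performs the second request.

```python
import base64


def part_bytes(part, fetch_attachment):
    """Return the decoded bytes of one message part, or None if it has no content.

    fetch_attachment is a hypothetical callable that takes an attachment ID
    and returns the attachment resource as a dict with a 'data' key.
    """
    body = part.get('body', {})
    if 'data' in body:
        # Small parts carry their content inline.
        encoded = body['data']
    elif 'attachmentId' in body:
        # Large parts only carry an ID; the content needs a second request.
        encoded = fetch_attachment(body['attachmentId'])['data']
    else:
        return None
    # Decode as URL-safe base64, as the other answers in this thread do.
    return base64.urlsafe_b64decode(encoded)
```

With a real service object, fetch_attachment could be something like lambda att_id: service.users().messages().attachments().get(userId=user_id, messageId=msg_id, id=att_id).execute(), which is the same call the other answers make.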

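【Comments】:

【Answer 5】:

I made the following changes to the code above, and it works fine for every email ID that contains attachments. I hope this helps, because with the API example you will get a key error.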

import base64
from apiclient import errors


def GetAttachments(service, user_id, msg_id, store_dir):
    """Get and store attachment from Message with given id.

    Args:
        service: Authorized Gmail API service instance.
        user_id: User's email address. The special value "me"
            can be used to indicate the authenticated user.
        msg_id: ID of Message containing attachment.
        store_dir: directory prepended as-is to the attachment filename
            on saving (include a trailing path separator)
    """
    try:
        message = service.users().messages().get(userId=user_id, id=msg_id).execute()
        for part in message['payload']['parts']:
            body = part['body']
            # Only parts whose body carries an attachmentId are downloaded.
            if 'attachmentId' in body:
                att_id = body['attachmentId']
                att = service.users().messages().attachments().get(userId=user_id, messageId=msg_id, id=att_id).execute()
                data = att['data']
                file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
                print(part['filename'])
                path = ''.join([store_dir, part['filename']])
                with open(path, 'wb') as f:
                    f.write(file_data)
    except errors.HttpError as error:
        print('An error occurred: %s' % error)

Google Official API for Attachments

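【Comments】:

'wb' instead of 'w'

【Answer 6】:

Thanks to @Ilya V. Schurov and @Todor for their answers. If the same search string matches mails both with and without attachments, you can still miss messages. Here is my approach for getting the mail body for both kinds of mail (with and without attachments).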

import base64
from apiclient import errors


def get_attachments(service, msg_id):
    try:
        message = service.users().messages().get(userId='me', id=msg_id).execute()

        for part in message['payload'].get('parts', []):
            if part['filename']:
                if 'data' in part['body']:
                    data = part['body']['data']
                else:
                    att_id = part['body']['attachmentId']
                    att = service.users().messages().attachments().get(userId='me', messageId=msg_id, id=att_id).execute()
                    data = att['data']
                file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
                path = part['filename']

                with open(path, 'wb') as f:
                    f.write(file_data)

    except errors.HttpError as error:
        print('An error occurred: %s' % error)


def get_message(service, msg_id):
    try:
        message = service.users().messages().get(userId='me', id=msg_id).execute()
        data = None  # stays None if no text/plain part is found
        if message['payload']['mimeType'] == 'multipart/mixed':
            for part in message['payload']['parts']:
                # Attachment parts have no nested 'parts', hence .get().
                for sub_part in part.get('parts', []):
                    if sub_part['mimeType'] == 'text/plain':
                        data = sub_part['body']['data']
                        break
                if data:
                    break
        else:
            for part in message['payload'].get('parts', []):
                if part['mimeType'] == 'text/plain':
                    data = part['body']['data']
                    break

        if data is None:
            return None

        # Body data is URL-safe base64, like the attachment data above.
        content = base64.urlsafe_b64decode(data).decode('utf-8')
        print(content)

        return content

    except errors.HttpError as error:
        print('An error occurred: %s' % error)
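
The two branches in get_message cover a fixed nesting depth. As a rough alternative, the same search can be written recursively so that the depth no longer matters. This is a sketch, not a drop-in replacement: find_plain_text is a hypothetical name, and it assumes the body text sits under body['data'] as URL-safe base64 encoded UTF-8.

```python
import base64


def find_plain_text(part):
    """Return the first text/plain body in part or its descendants, else None."""
    data = part.get('body', {}).get('data')
    # Skip parts with a filename: those are attachments, not the mail body.
    if part.get('mimeType') == 'text/plain' and data and not part.get('filename'):
        return base64.urlsafe_b64decode(data).decode('utf-8')
    for child in part.get('parts', []):
        text = find_plain_text(child)
        if text is not None:
            return text
    return None
```

It would be called as find_plain_text(message['payload']) on a message fetched the same way as in get_message.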

【Comments】:

【Answer 7】:
from __future__ import print_function
import base64
import os
from googleapiclient.discovery import build
from oauth2client import client, file, tools

SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
store_dir = os.getcwd()


def attachment_download():
    # Load cached credentials, or run the OAuth flow if none are valid.
    store = file.Storage('credentials_gmail.json')
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets('client_secrets.json', SCOPES)
        creds = tools.run_flow(flow, store)

    try:
        service = build('gmail', 'v1', credentials=creds)
        # XXXX is a label ID; use INBOX to download from the inbox.
        results = service.users().messages().list(userId='me', labelIds=['XXXX']).execute()
        messages = results.get('messages', [])
        for message in messages:
            msg = service.users().messages().get(userId='me', id=message['id']).execute()
            for part in msg['payload'].get('parts', []):
                if part['filename']:
                    if 'data' in part['body']:
                        data = part['body']['data']
                    else:
                        att_id = part['body']['attachmentId']
                        att = service.users().messages().attachments().get(userId='me', messageId=message['id'], id=att_id).execute()
                        data = att['data']
                    file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))

                    filename = part['filename']
                    print(filename)
                    # Assumes the 'Downloaded files' directory already exists.
                    path = os.path.join(store_dir, 'Downloaded files', filename)

                    with open(path, 'wb') as f:
                        f.write(file_data)
    except Exception as error:
        print(error)

See also, for getting label IDs: https://developers.google.com/gmail/api/v1/reference/users/labels/list
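
As a small sketch of how the label ID might be looked up by name with the labels list call linked above: find_label_id is a hypothetical helper name, and it assumes the response is a dict with a 'labels' list whose entries have 'id' and 'name' keys.

```python
def find_label_id(service, label_name, user_id='me'):
    """Return the ID of the label whose name equals label_name, or None."""
    response = service.users().labels().list(userId=user_id).execute()
    for label in response.get('labels', []):
        if label['name'] == label_name:
            return label['id']
    return None
```

The returned ID could then be passed as labelIds=[label_id] to messages().list() in place of the 'XXXX' placeholder.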

【Comments】:
