How to increase the maximum size of the AWS Lambda deployment package (RequestEntityTooLargeException)?

Posted 2023-02-19

【Title】: How to increase the maximum size of the AWS lambda deployment package (RequestEntityTooLargeException)? 【Posted】: 2019-07-05 00:47:34 【Question】:

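I upload my Lambda function source from AWS CodeBuild. My Python script uses NLTK, so it needs a lot of data. My .zip package is too large, and I get a RequestEntityTooLargeException. I would like to know how to increase the size of the deployment package sent via the UpdateFunctionCode command.

I use AWS CodeBuild to deploy the source code from a GitHub repository to AWS Lambda. Here is the relevant buildspec file: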

version: 0.2
phases:
 install:
   commands:
     - echo "install step"
     - apt-get update
     - apt-get install zip -y
     - apt-get install python3-pip -y
     - pip install --upgrade pip
     - pip install --upgrade awscli
     # Define directories
     - export HOME_DIR=`pwd`
     - export NLTK_DATA=$HOME_DIR/nltk_data
 pre_build:
   commands:
     - echo "pre_build step"
     - cd $HOME_DIR
     - virtualenv venv
     - . venv/bin/activate
     # Install modules
     - pip install -U requests
     # NLTK download
     - pip install -U nltk
     - python -m nltk.downloader -d $NLTK_DATA wordnet stopwords punkt
     - pip freeze > requirements.txt
 build:
   commands:
     - echo 'build step'
     - cd $HOME_DIR
     - mv $VIRTUAL_ENV/lib/python3.6/site-packages/* .
     - sudo zip -r9 algo.zip .
     - aws s3 cp --recursive --acl public-read ./ s3://hilightalgo/
     - aws lambda update-function-code --function-name arn:aws:lambda:eu-west-3:671560023774:function:LaunchHilight --zip-file fileb://algo.zip
     - aws lambda update-function-configuration --function-name arn:aws:lambda:eu-west-3:671560023774:function:LaunchHilight --environment 'Variables={NLTK_DATA=/var/task/nltk_data}'
 post_build:
   commands:
     - echo "post_build step"

When I launch the pipeline, I get a RequestEntityTooLargeException because my .zip package contains too much data. See the build log below:

[Container] 2019/02/11 10:48:35 Running command aws lambda update-function-code --function-name arn:aws:lambda:eu-west-3:671560023774:function:LaunchHilight --zip-file fileb://algo.zip
 An error occurred (RequestEntityTooLargeException) when calling the UpdateFunctionCode operation: Request must be smaller than 69905067 bytes for the UpdateFunctionCode operation
 [Container] 2019/02/11 10:48:37 Command did not exit successfully aws lambda update-function-code --function-name arn:aws:lambda:eu-west-3:671560023774:function:LaunchHilight --zip-file fileb://algo.zip exit status 255
[Container] 2019/02/11 10:48:37 Phase complete: BUILD Success: false
[Container] 2019/02/11 10:48:37 Phase context status code: COMMAND_EXECUTION_ERROR Message: Error while executing command: aws lambda update-function-code --function-name arn:aws:lambda:eu-west-3:671560023774:function:LaunchHilight --zip-file fileb://algo.zip. Reason: exit status 255

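When I reduce the amount of NLTK data to download, everything works fine (I tried with only the stopwords and wordnet packages).

Does anyone have an idea how to solve this "size limit problem"?

【Comments】:

【Answer 1】: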

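You cannot increase Lambda's deployment package size. The AWS Lambda limits are described in the AWS Lambda developer guide. For more information on how these limits work, see here. Essentially, your unzipped package size must be less than 250 MB (262144000 bytes).

PS: Using layers does not solve the size problem, although it helps with management and possibly faster cold starts. The package size includes layers - Lambda layers.

A function can use up to 5 layers at a time. The total unzipped size of the function and all layers can't exceed the unzipped deployment package size limit of 250 MB.

PPS: According to the AWS blog, and as pointed out by user jonnocraig in this answer, you can overcome these limits if you build a container for your application and run it on Lambda.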

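【Comments】:

So if I try to include pandas, it pulls in numpy, which by itself takes 126 MB. Add botocore, and that is another 48 MB.

Yes. I am not familiar enough with Python, but check whether botocore is already included in the Python SDK. If it is, you do not have to include it and it does not count toward the package size. Deleting pycache files and test files may help reduce the size: github.com/aws/sagemaker-python-sdk/issues/1200

【Answer 2】:

You cannot increase the package size, but you can use AWS Lambda layers to store some of your application dependencies.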

https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html#configuration-layers-path

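Before layers, a common pattern for working around this limit was to download large dependencies from S3.

【Comments】:

This does not solve the size problem. The package size includes layers - docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html.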

A function can use up to 5 layers at a time. The total unzipped size of the function and all layers can't exceed the unzipped deployment package size limit of 250 MB.

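【Answer 3】:

I have not tried it myself, but the people at Zappa describe a trick that might help. Quoting https://blog.zappa.io/posts/slim-handler:

Zappa zips up the large application and sends the project zip file up to S3. Second, Zappa creates a very minimal slim handler that just contains Zappa and its dependencies and sends that to Lambda.

When the slim handler is called on a cold start, it downloads the large project zip from S3 and unzips it in Lambda's shared /tmp space. All subsequent calls to that warm Lambda share the /tmp space and have access to the project files; so it is possible for the file to only download once if the Lambda stays warm.

This way you should get 500 MB in /tmp.

Update:

I have used the following code in the Lambdas of several projects. It is based on the method Zappa uses, but can be used directly.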

# Based on the code in https://github.com/Miserlou/Zappa/blob/master/zappa/handler.py
# We need to load the layer from an S3 bucket into /tmp, bypassing the normal
# AWS layer mechanism, since it is too large: the AWS unzipped Lambda function
# size, including layers, is limited to 250MB.
import io
import os
import sys
import zipfile

import boto3


def load_remote_project_archive(remote_bucket, remote_file, layer_name):

    # Puts the project files from S3 in /tmp and adds to path
    project_folder = '/tmp/{0!s}'.format(layer_name)
    if not os.path.isdir(project_folder):
        # The project folder doesn't exist in this cold lambda, get it from S3
        boto_session = boto3.Session()

        # Download zip file from S3
        s3 = boto_session.resource('s3')
        archive_on_s3 = s3.Object(remote_bucket, remote_file).get()

        # unzip from stream
        with io.BytesIO(archive_on_s3["Body"].read()) as zf:

            # rewind the file
            zf.seek(0)

            # Read the file as a zipfile and process the members
            with zipfile.ZipFile(zf, mode='r') as zipf:
                zipf.extractall(project_folder)

    # Add to project path
    sys.path.insert(0, project_folder)

    return True

It can then be called as follows (I pass the bucket containing the layers to the Lambda function via an environment variable):

load_remote_project_archive(os.environ['MY_ADDITIONAL_LAYERS_BUCKET'], 'lambda_my_extra_layer.zip', 'lambda_my_extra_layer')

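At the time I wrote this code, /tmp also had a cap, I think 250MB, but the call to zipf.extractall(project_folder) above can be replaced with extraction directly into memory: unzipped_in_memory = {name: zipf.read(name) for name in zipf.namelist()}, which is what I did for some machine learning models. I think @rahul's answer is more general for this.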
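The in-memory variant mentioned above can be sketched as follows. This is only an illustration under assumptions: the helper names unzip_in_memory and make_demo_archive are hypothetical, and the demo builds a tiny archive locally instead of reading one from S3. On Lambda, archive_bytes would be the bytes read from the S3 object body.

```python
import io
import zipfile


def unzip_in_memory(archive_bytes):
    """Return a dict mapping each file name in the archive to its bytes."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes), mode="r") as zipf:
        return {
            name: zipf.read(name)
            for name in zipf.namelist()
            if not name.endswith("/")  # skip directory entries
        }


def make_demo_archive():
    """Build a tiny zip in memory so the sketch runs without S3 (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w") as zipf:
        zipf.writestr("model/weights.bin", b"\x00\x01\x02")
        zipf.writestr("model/labels.txt", "cat\ndog\n")
    return buffer.getvalue()


files = unzip_in_memory(make_demo_archive())
```

Everything extracted this way counts against the function's configured memory, and the files cannot be imported through sys.path, so the approach suits data files such as models better than Python packages.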

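【Comments】:

Zappa is nice! But I solved my problem with Serverless (serverless.com): a great framework :)

@LouisSinger, great! Does Serverless have something similar? Or how did you do it with them?

Take a look at this article: read.iopipe.com/…

Thanks, nice introduction.

Louis, how did you solve the problem with Serverless? I looked at the iopipe material and did not see a way to solve it.

【Answer 4】:

Actually, you can request an increase of the deployment package size limit.

1. After logging in to the AWS console, go to AWS Support.

https://docs.aws.amazon.com/awssupport/latest/user/getting-started.html

2. Then request a service limit increase.

3. Fill in the maximum size you need.

An AWS support engineer will contact you regarding approval of the limit increase.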

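【Comments】:

This is incorrect: you cannot request an increase of the function code size (the 250MB limit). What you can increase is the total size of all functions and layers (default: 75GB), not the code size of a single function. See docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html

250MB is a hard limit and cannot be increased. This includes the packaged files, dependencies, and layers.

OK guys, I have good friends working at Amazon, OK? This approach is feasible and possible; you just did not try it and downvoted, which is sad for both you and me.

【Answer 5】:

You can try the workaround used in the excellent serverless-python-requirements plugin.

The ideal solution is to use Lambda layers, if that serves the purpose. If the total dependencies exceed 250MB, you can side-load the less frequently used dependencies from an S3 bucket at runtime, using the 512 MB available in the /tmp directory. The zipped dependencies are stored in S3, and the Lambda fetches the files from S3 during initialization. Unpack the dependency package and add its path to sys.path.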
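A minimal sketch of the unpack-and-add-to-path step, assuming the zipped bundle has already been fetched from S3 to a local file. The function name sideload_dependencies, the module rarely_used_dep, and the paths are hypothetical, and this is not the plugin's actual code. The demo builds a throwaway bundle in a temporary directory so that it runs outside Lambda.

```python
import os
import sys
import tempfile
import zipfile


def sideload_dependencies(archive_path, target_dir):
    """Extract a zipped dependency bundle once and make it importable."""
    if not os.path.isdir(target_dir):
        # Cold start: nothing extracted yet in this execution environment.
        with zipfile.ZipFile(archive_path, mode="r") as zipf:
            zipf.extractall(target_dir)
    if target_dir not in sys.path:
        sys.path.insert(0, target_dir)
    return target_dir


# Demo with a throwaway bundle; on Lambda the archive would first be
# downloaded from S3 and target_dir would live under /tmp.
workdir = tempfile.mkdtemp()
bundle = os.path.join(workdir, "deps.zip")
with zipfile.ZipFile(bundle, mode="w") as zipf:
    zipf.writestr("rarely_used_dep.py", "ANSWER = 42\n")

sideload_dependencies(bundle, os.path.join(workdir, "deps"))
import rarely_used_dep  # resolved from the extracted bundle
```

Checking for the target directory before extracting means a warm execution environment skips the unzip on later invocations.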

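Note that the Python dependencies need to be built on Amazon Linux, which is the operating system of the Lambda container. I used an EC2 instance to create the zip package.

You can check the code used in serverless-python-requirements here.

【Comments】:

【Answer 6】: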

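AWS Lambda functions can mount EFS. You can use EFS to load libraries or packages that are larger than AWS Lambda's 250 MB deployment package size limit.

The detailed setup steps are here: https://aws.amazon.com/blogs/aws/new-a-shared-file-system-for-your-lambda-functions/

At a high level, the changes include:

1. Create and set up an EFS file system
2. Use EFS with the Lambda function
3. Install the pip dependencies inside the EFS access point
4. Set the PYTHONPATH environment variable to tell Python where to look for the dependencies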
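As an alternative to setting PYTHONPATH in the function configuration, the same effect can be had at runtime by extending sys.path before the heavy imports. The sketch below assumes a hypothetical mount path /mnt/efs/lib and a hypothetical EFS_LIB_DIR environment variable; the real path depends on how the file system is mounted for the function.

```python
import os
import sys


def add_dependency_dir(mount_path):
    """Put a directory of installed packages at the front of sys.path."""
    if not os.path.isdir(mount_path):
        raise FileNotFoundError(f"dependency directory not found: {mount_path}")
    if mount_path not in sys.path:
        sys.path.insert(0, mount_path)


# Hypothetical location; the real path depends on the function's EFS mount.
EFS_LIB_DIR = os.environ.get("EFS_LIB_DIR", "/mnt/efs/lib")

if os.path.isdir(EFS_LIB_DIR):  # only true where the file system is mounted
    add_dependency_dir(EFS_LIB_DIR)
```

Imports of the large libraries would follow this call, so that Python resolves them from the mounted directory.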

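【Comments】:

【Answer 7】:

The following are hard limits for Lambda (they may change in the future):

3 MB for in-console editing
50 MB zipped, for package upload
250 MB unzipped, including layers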
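To see which of these limits a package runs into before deploying, a build step could inspect the archive. The following is a rough sketch: check_package_size and the constants are hypothetical names, the limits are hard-coded from the list above and may change, and the sizes of any layers would have to be added to the unzipped total separately.

```python
import io
import zipfile

# Limits as quoted in the answer above; they may change over time.
MAX_ZIPPED_BYTES = 50 * 1024 * 1024
MAX_UNZIPPED_BYTES = 250 * 1024 * 1024  # 262144000 bytes


def check_package_size(archive_bytes):
    """Compare a .zip package against the zipped and unzipped limits."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zipf:
        unzipped = sum(info.file_size for info in zipf.infolist())
    zipped = len(archive_bytes)
    return {
        "zipped_bytes": zipped,
        "unzipped_bytes": unzipped,
        "fits_direct_upload": zipped <= MAX_ZIPPED_BYTES,
        "fits_unzipped_limit": unzipped <= MAX_UNZIPPED_BYTES,
    }


# Demo: a highly compressible 1 MB payload.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zipf:
    zipf.writestr("nltk_data/placeholder.txt", b"a" * (1024 * 1024))
report = check_package_size(buffer.getvalue())
```

A package that fails only the first check can still be deployed through S3, whereas one that fails the second needs one of the other approaches discussed on this page.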

A sensible way to work around this is to mount EFS from your Lambda. This is useful not only for loading libraries, but also for other storage.

Go through these blog posts:

https://aws.amazon.com/blogs/compute/using-amazon-efs-for-aws-lambda-in-your-serverless-applications/
https://aws.amazon.com/blogs/aws/new-a-shared-file-system-for-your-lambda-functions/

【Comments】:

【Answer 8】:

If anyone stumbles upon this after December 2020: AWS released a major update that supports packaging Lambda functions as container images (up to 10 GB!). More info here.

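【Comments】:

【Answer 9】:

From the AWS documentation:

If your deployment package is larger than 50 MB, we recommend uploading your function code and dependencies to an Amazon S3 bucket.

You can create a deployment package and upload the .zip file to your Amazon S3 bucket in the AWS Region where you want to create a Lambda function. When you create your Lambda function, specify the S3 bucket name and object key name on the Lambda console, or using the AWS Command Line Interface (AWS CLI).

To deploy the package with the AWS CLI, use the --code parameter instead of passing the deployment package with the --zip-file parameter. For example: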

aws lambda create-function --function-name my_function --code S3Bucket=my_bucket,S3Key=my_file
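For an existing function, such as the one in the question, the same idea can be sketched with boto3: upload the archive to S3, then call update_function_code with S3Bucket and S3Key instead of sending the zip bytes in the request. The helper name deploy_via_s3 and the bucket, key, and function names are placeholders. The clients are passed in so the sketch can be exercised without AWS credentials.

```python
def deploy_via_s3(s3_client, lambda_client, zip_path, bucket, key, function_name):
    """Upload the package to S3, then point the function at that object."""
    s3_client.upload_file(zip_path, bucket, key)
    return lambda_client.update_function_code(
        FunctionName=function_name,
        S3Bucket=bucket,
        S3Key=key,
    )


# Typical wiring (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# deploy_via_s3(boto3.client("s3"), boto3.client("lambda"),
#               "algo.zip", "my_bucket", "algo.zip", "my_function")
```

This avoids the request size error from the question, but the 250 MB unzipped limit described in the other answers still applies.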

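【Comments】:

【Answer 10】:

The trick for working with large Lambda projects in AWS is to store a Docker image in the AWS ECR service instead of using a ZIP file. You can use Docker images of up to 10 GB.

The AWS documentation provides an example to help you: Create an image from an AWS base image for Lambda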
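Once an image has been pushed to ECR, a function that was created with the container image package type can be pointed at it through boto3's update_function_code and its ImageUri parameter. The sketch below assumes such a function already exists; the helper name point_function_at_image and the URI in the comment are placeholders.

```python
def point_function_at_image(lambda_client, function_name, image_uri):
    """Update an image-based Lambda function to run the given ECR image."""
    return lambda_client.update_function_code(
        FunctionName=function_name,
        ImageUri=image_uri,
    )


# Typical wiring (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# point_function_at_image(
#     boto3.client("lambda"),
#     "my_function",
#     "123456789012.dkr.ecr.eu-west-3.amazonaws.com/my-repo:latest",  # placeholder URI
# )
```

A function created from a .zip package cannot be switched to an image this way, so the function would have to be created with the image package type first.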

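【Comments】:

【Answer 11】:

I may be late to the party, but you can use a Docker image to get around the Lambda layer constraints. This can be done with serverless stack development or just through the console.

【Comments】:

One of the top answers already suggests this.

【Answer 12】:

Before 2021, the best approach was to deploy the jar file to S3 and create the AWS Lambda from it.

Since 2021, AWS Lambda supports container images. Read about it here: https://aws.amazon.com/de/blogs/aws/new-for-aws-lambda-container-image-support/

So from now on, you should probably consider packaging and deploying your Lambda function as a container image (up to 10 GB).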

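【Comments】:

【Answer 13】:

The AWS Data Wrangler zip file from GitHub (https://github.com/awslabs/aws-data-wrangler/releases) includes many other libraries, such as pandas and pymysql. In my case, it is the only layer I need, since it comes with so much else. It may be useful for some people.

【Comments】: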
