How to use Azure Blob storage from Python

Overview

Azure Blob storage is a service that stores unstructured data in the cloud as objects/blobs. Blob storage can store any type of text or binary data, such as a document, media file, or application installer. Blob storage is also referred to as object storage.

This article will show you how to perform common scenarios using Blob storage. The samples are written in Python and use the Microsoft Azure Storage SDK for Python. The scenarios covered include uploading, listing, downloading, and deleting blobs.

What is Blob Storage?

Azure Blob storage is a service for storing large amounts of unstructured object data, such as text or binary data, that can be accessed from anywhere in the world via HTTP or HTTPS. You can use Blob storage to expose data publicly to the world, or to store application data privately.

Common uses of Blob storage include:

    Serving images or documents directly to a browser

    Storing files for distributed access

    Streaming video and audio

    Storing data for backup and restore, disaster recovery, and archiving

    Storing data for analysis by an on-premises or Azure-hosted service

    Blob service concepts

    The Blob service contains the following components:

    Storage Account: All access to Azure Storage is done through a storage account. This can be a general-purpose storage account or a Blob storage account, which is specialized for storing objects/blobs. See About Azure storage accounts for more information.

    Container: A container provides a grouping of a set of blobs. All blobs must be in a container. An account can contain an unlimited number of containers. A container can store an unlimited number of blobs. Note that the container name must be lowercase.

    Blob: A file of any type and size. Azure Storage offers three types of blobs: block blobs, page blobs, and append blobs.

    Block blobs are ideal for storing text or binary files, such as documents and media files. Append blobs are similar to block blobs in that they are made up of blocks, but they are optimized for append operations, so they are useful for logging scenarios. A single block blob can contain up to 50,000 blocks of up to 100 MB each, for a total size of slightly more than 4.75 TB (100 MB X 50,000). A single append blob can contain up to 50,000 blocks of up to 4 MB each, for a total size of slightly more than 195 GB (4 MB X 50,000).

    Page blobs can be up to 1 TB in size, and are more efficient for frequent read/write operations. Azure Virtual Machines use page blobs as OS and data disks.
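
The block blob and append blob limits quoted above are simply the block count multiplied by the block size. The short calculation below reproduces them. It is only a sketch: it assumes binary units (1 MB = 1024 * 1024 bytes), which is why the totals land slightly above 4.75 TB and 195 GB, and the helper name max_blob_bytes is made up for illustration.

```python
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

MAX_BLOCKS = 50000  # maximum number of blocks per blob, as stated in the text


def max_blob_bytes(block_size_mb, max_blocks=MAX_BLOCKS):
    """Return the largest blob size, in bytes, for a given block size in MB."""
    return block_size_mb * MB * max_blocks


block_blob_max = max_blob_bytes(100)  # block blob: blocks of up to 100 MB
append_blob_max = max_blob_bytes(4)   # append blob: blocks of up to 4 MB

print("block blob:  %.2f TB" % (block_blob_max / float(TB)))   # block blob:  4.77 TB
print("append blob: %.2f GB" % (append_blob_max / float(GB)))  # append blob: 195.31 GB
```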

    For details about naming containers and blobs, see Naming and Referencing Containers, Blobs, and Metadata.

    Create an Azure storage account

    The easiest way to create your first Azure storage account is by using the Azure portal. To learn more, see Create a storage account.

    You can also create an Azure storage account by using Azure PowerShell, Azure CLI, or the Storage Resource Provider Client Library for .NET.

    If you prefer not to create a storage account at this time, you can also use the Azure storage emulator to run and test your code in a local environment. For more information, see Use the Azure Storage Emulator for Development and Testing.

    Download and Install Azure Storage SDK for Python

    Azure Storage SDK for Python requires Python 2.7, 3.3, 3.4, 3.5, or 3.6, and comes in four packages: azure-storage-blob, azure-storage-file, azure-storage-table, and azure-storage-queue. This tutorial uses the azure-storage-blob package.

    Install via PyPI

    To install via the Python Package Index (PyPI), type:

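    pip install azure-storage-blob

    The samples in this article use the BlockBlobService API from the legacy 2.x releases of this package. Version 12 and later replaced that API with BlobServiceClient, so to run the samples as written you may need to pin a 2.x release, for example pip install azure-storage-blob==2.1.0.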

    Note

    If you are upgrading from the Azure Storage SDK for Python version 0.36 or earlier, you will first need to uninstall using pip uninstall azure-storage as we are no longer releasing the Storage SDK for Python in a single package.

    For alternative installation methods, visit the Azure Storage SDK for Python on GitHub.

    Create a container

    Based on the type of blob you would like to use, create a BlockBlobService, AppendBlobService, or PageBlobService object. The following code uses a BlockBlobService object. Add the following near the top of any Python file in which you wish to programmatically access Azure Block Blob Storage.

    from azure.storage.blob import BlockBlobService

    The following code creates a BlockBlobService object using the storage account name and account key. Replace 'myaccount' and 'mykey' with your account name and key.

    block_blob_service = BlockBlobService(account_name='myaccount', account_key='mykey')

    Every blob in Azure storage must reside in a container. The container forms part of the blob name: a blob URI typically has the form https://<account>.blob.core.windows.net/<container>/<blob>, so the container name is the first path segment after the host name.

    A container name must be a valid DNS name, conforming to the following naming rules:

    Container names must start with a letter or number, and can contain only letters, numbers, and the dash (-) character.

    Every dash (-) character must be immediately preceded and followed by a letter or number; consecutive dashes are not permitted in container names.

    All letters in a container name must be lowercase.

    Container names must be from 3 through 63 characters long.

    Important

    Note that the name of a container must always be lowercase. If you include an upper-case letter in a container name, or otherwise violate the container naming rules, you may receive a 400 error (Bad Request).
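
Checking a name locally before calling the service can avoid the 400 error described above. The following is a minimal sketch that encodes only the four rules listed in this section; the function name is_valid_container_name is an assumption for illustration and is not part of the SDK.

```python
import re

# One or more runs of lowercase letters/digits, joined by single dashes.
# This rules out leading, trailing, and consecutive dashes.
_CONTAINER_NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*\Z")


def is_valid_container_name(name):
    """Return True if name satisfies the container naming rules listed above."""
    return 3 <= len(name) <= 63 and _CONTAINER_NAME_RE.match(name) is not None


for candidate in ("mycontainer", "my-container-01", "MyContainer", "my--container", "ab"):
    print(candidate, is_valid_container_name(candidate))
```

The first two candidates pass; the others fail because of an upper-case letter, consecutive dashes, and a length below three characters respectively.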

    The following code uses a BlockBlobService object to create the container if it doesn't already exist.

    block_blob_service.create_container('mycontainer')
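
In the legacy SDK, create_container returns True when it created the container and False when the container already existed (with the default fail_on_exist=False). A small wrapper can make that outcome explicit. This is only a sketch: ensure_container is a hypothetical helper, and service is assumed to be a BlockBlobService-like object.

```python
def ensure_container(service, container_name):
    """Create the container if needed and report what happened.

    `service` is assumed to behave like the legacy BlockBlobService:
    create_container returns True if it created the container and
    False if the container already existed.
    """
    created = service.create_container(container_name)
    if created:
        return "created"
    return "already exists"
```

With the object created earlier, this could be called as ensure_container(block_blob_service, 'mycontainer').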

    By default, the new container is private, so you must specify your storage access key (as you did earlier) to download blobs from this container. If you want to make the blobs within the container available to everyone, you can create the container and pass the public access level using the following code.

    from azure.storage.blob import PublicAccess
    block_blob_service.create_container('mycontainer', public_access=PublicAccess.Container)

    Alternatively, you can modify a container after you have created it using the following code.

    block_blob_service.set_container_acl('mycontainer', public_access=PublicAccess.Container)

    After this change, anyone on the Internet can see blobs in a public container, but only you can modify or delete them.

    Upload a blob into a container

    To create a block blob and upload data, use the create_blob_from_path, create_blob_from_stream, create_blob_from_bytes or create_blob_from_text methods. They are high-level methods that perform the necessary chunking when the size of the data exceeds 64 MB.

    create_blob_from_path uploads the contents of a file from the specified path, and create_blob_from_stream uploads the contents from an already opened file/stream. create_blob_from_bytes uploads an array of bytes, and create_blob_from_text uploads the specified text value using the specified encoding (defaults to UTF-8).
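
To give a feel for what "chunking" means here, the sketch below splits a binary stream into fixed-size pieces. It is not the SDK's implementation, only an illustration of the idea; iter_chunks is a hypothetical helper, and the tiny chunk size is chosen so the result is easy to read.

```python
import io


def iter_chunks(stream, chunk_size):
    """Yield successive chunks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Demonstrate with a small in-memory stream and a chunk size of 4 bytes.
data = io.BytesIO(b"0123456789")
chunks = list(iter_chunks(data, 4))
print(chunks)  # [b'0123', b'4567', b'89']
```

The high-level create_blob_from_* methods take care of this splitting for you, so application code normally never needs a helper like this.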

    The following example uploads the contents of the sunset.png file into the myblockblob blob.

    from azure.storage.blob import ContentSettings

    block_blob_service.create_blob_from_path(
        'mycontainer',
        'myblockblob',
        'sunset.png',
        content_settings=ContentSettings(content_type='image/png')
    )
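
The upload example hard-codes the content type. When uploading arbitrary files, one option is to derive it from the file extension with the standard library's mimetypes module. The helper below is a sketch; guess_content_type and its fallback value are assumptions for illustration, not part of the SDK.

```python
import mimetypes


def guess_content_type(path, default="application/octet-stream"):
    """Guess a MIME type from the file extension, falling back to a default."""
    content_type, _encoding = mimetypes.guess_type(path)
    return content_type or default


print(guess_content_type("sunset.png"))  # image/png
```

The result could then be passed along as ContentSettings(content_type=guess_content_type('sunset.png')).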

    List the blobs in a container

    To list the blobs in a container, use the list_blobs method. This method returns a generator. The following code outputs the name of each blob in a container to the console.

    generator = block_blob_service.list_blobs('mycontainer')
    for blob in generator:
        print(blob.name)
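
When the names are needed more than once, or in a predictable order, it can be convenient to collect them into a list instead of printing them as they arrive. The helper below is a sketch; list_blob_names is a hypothetical name, and service is assumed to be a BlockBlobService-like object whose list_blobs yields objects with a name attribute, as in the example above.

```python
def list_blob_names(service, container_name):
    """Return the names of all blobs in a container as a sorted list."""
    return sorted(blob.name for blob in service.list_blobs(container_name))
```

For example, list_blob_names(block_blob_service, 'mycontainer') would return the same names the loop above prints, sorted alphabetically.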

    Download blobs

    To download data from a blob, use get_blob_to_path, get_blob_to_stream, get_blob_to_bytes, or get_blob_to_text. They are high-level methods that perform the necessary chunking when the size of the data exceeds 64 MB.

    The following example demonstrates using get_blob_to_path to download the contents of the myblockblob blob and store it to the out-sunset.png file.

    block_blob_service.get_blob_to_path('mycontainer', 'myblockblob', 'out-sunset.png')
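
After an upload and download round trip, a simple way to confirm that the downloaded file matches the original is to compare checksums of the two local files. The sketch below uses only the standard library; file_sha256 and files_match are hypothetical helper names, and the choice of SHA-256 is an assumption rather than anything the SDK requires.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(path_a, path_b):
    """Return True if both files have identical contents."""
    return file_sha256(path_a) == file_sha256(path_b)
```

With the file names used in this article, the check would be files_match('sunset.png', 'out-sunset.png').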

    Delete a blob

    Finally, to delete a blob, call delete_blob.

    block_blob_service.delete_blob('mycontainer', 'myblockblob')
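
If the blob might already be gone, one option is to check for it first. The sketch below assumes service exposes exists(container_name, blob_name) and delete_blob(container_name, blob_name), as the legacy BlockBlobService does; delete_blob_if_exists is a hypothetical helper. Note that check-then-delete is not atomic, so another client could still remove the blob between the two calls.

```python
def delete_blob_if_exists(service, container_name, blob_name):
    """Delete a blob if it is present; return True if a delete was issued."""
    if service.exists(container_name, blob_name):
        service.delete_blob(container_name, blob_name)
        return True
    return False
```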

    Writing to an append blob

    An append blob is optimized for append operations, such as logging. Like a block blob, an append blob is composed of blocks, but when you add a new block to an append blob, it is always appended to the end of the blob. You cannot update or delete an existing block in an append blob. The block IDs for an append blob are not exposed as they are for a block blob.

    Each block in an append blob can be a different size, up to a maximum of 4 MB, and an append blob can include a maximum of 50,000 blocks. The maximum size of an append blob is therefore slightly more than 195 GB (4 MB X 50,000 blocks).

    The example below creates a new append blob and appends some data to it, simulating a simple logging operation.

    from azure.storage.blob import AppendBlobService

    append_blob_service = AppendBlobService(account_name='myaccount', account_key='mykey')

    # The same containers can hold all types of blobs
    append_blob_service.create_container('mycontainer')

    # Append blobs must be created before they are appended to
    append_blob_service.create_blob('mycontainer', 'myappendblob')
    append_blob_service.append_blob_from_text('mycontainer', 'myappendblob', u'Hello, world!')

    # Read the blob back; the returned Blob object holds the text in its content attribute
    append_blob = append_blob_service.get_blob_to_text('mycontainer', 'myappendblob')
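
For a logging scenario, each appended entry usually needs its own timestamp and line terminator, since the text is appended as given. The sketch below wraps append_blob_from_text for that purpose. It assumes Python 3 and a service object that behaves like AppendBlobService; format_log_line and append_log are hypothetical helpers, and the timestamp format is just one reasonable choice.

```python
from datetime import datetime, timezone


def format_log_line(message, timestamp=None):
    """Return one newline-terminated log line prefixed with a UTC timestamp."""
    timestamp = timestamp or datetime.now(timezone.utc)
    return u"%s %s\n" % (timestamp.strftime("%Y-%m-%dT%H:%M:%SZ"), message)


def append_log(service, container_name, blob_name, message, timestamp=None):
    """Append a single formatted log line to an existing append blob."""
    line = format_log_line(message, timestamp)
    service.append_blob_from_text(container_name, blob_name, line)
    return line
```

With the objects from the example above, a call would look like append_log(append_blob_service, 'mycontainer', 'myappendblob', 'Hello, world!').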
