Редактиране

Споделяне чрез


How to mount Azure Blob Storage as a file system with BlobFuse v1

Important

BlobFuse2 is the latest version of BlobFuse and has many significant improvements over the version discussed in this article, BlobFuse v1. To learn about the improvements made in BlobFuse2, see the list of BlobFuse2 enhancements.

BlobFuse is a virtual file system driver for Azure Blob Storage. BlobFuse allows you to access your existing block blob data in your storage account through the Linux file system. BlobFuse uses the virtual directory scheme with the forward-slash '/' as a delimiter.

This guide shows you how to use BlobFuse v1 and mount a Blob Storage container on Linux and access data. To learn more about BlobFuse v1, see the readme and wiki.

Warning

BlobFuse doesn't guarantee 100% POSIX compliance as it simply translates requests into Blob REST APIs. For example, rename operations are atomic in POSIX, but not in BlobFuse. For a full list of differences between a native file system and BlobFuse, visit the BlobFuse source code repository.

Install BlobFuse v1 on Linux

BlobFuse binaries are available on the Microsoft software repositories for Linux for Ubuntu, Debian, SUSE, Oracle Linux and RHEL distributions. To install BlobFuse on those distributions, configure one of the repositories from the list. You can also build the binaries from source code following the Azure Storage installation steps if there are no binaries available for your distribution.

BlobFuse is published in the Linux repo for Ubuntu versions: 16.04, 18.04, and 20.04, RHEL versions: 7.5, 7.8, 7.9, 8.0, 8.1, 8.2, Debian versions: 9.0, 10.0, SUSE version: 15, Oracle Linux 8.1. Run this command to make sure that you have one of those versions deployed:

cat /etc/*-release

Configure the Microsoft package repository

Configure the Linux Package Repository for Microsoft Products.

As an example, on a Redhat Enterprise Linux 8 distribution:

sudo rpm -Uvh https://packages.microsoft.com/config/rhel/8/packages-microsoft-prod.rpm

Similarly, change the URL to .../rhel/7/... to point to a Redhat Enterprise Linux 7 distribution.

Install BlobFuse v1

sudo yum install blobfuse

Prepare for mounting

BlobFuse provides native-like performance by requiring a temporary path in the file system to buffer and cache any open files. For this temporary path, choose the most performant disk, or use a ramdisk for best performance.

Note

BlobFuse stores all open file contents in the temporary path. Make sure to have enough space to accommodate all open files.

(Optional) Use a ramdisk for the temporary path

The following example creates a ramdisk of 16 GB and a directory for BlobFuse. Choose the size based on your needs. This ramdisk allows BlobFuse to open files up to 16 GB in size.

sudo mkdir /mnt/ramdisk
sudo mount -t tmpfs -o size=16g tmpfs /mnt/ramdisk
sudo mkdir /mnt/ramdisk/blobfusetmp
sudo chown <youruser> /mnt/ramdisk/blobfusetmp

Use an SSD as a temporary path

In Azure, you may use the ephemeral disks (SSD) available on your VMs to provide a low-latency buffer for BlobFuse. Depending on the provisioning agent used, the ephemeral disk would be mounted on '/mnt' for cloud-init or '/mnt/resource' for waagent VMs.

Make sure your user has access to the temporary path:

sudo mkdir /mnt/resource/blobfusetmp -p
sudo chown <youruser> /mnt/resource/blobfusetmp

Authorize access to your storage account

You can authorize access to your storage account by using the account access key, a shared access signature, a managed identity, or a service principal. Authorization information can be provided on the command line, in a config file, or in environment variables. For details, see Valid authentication setups in the BlobFuse readme.

For example, suppose you are authorizing with the account access keys and storing them in a config file. The config file should have the following format:

accountName myaccount
accountKey storageaccesskey
containerName mycontainer
authType Key

The accountName is the name of your storage account, and not the full URL. You need to update myaccount, storageaccesskey, and mycontainer with your storage information.

Create this file using:

sudo touch /path/to/fuse_connection.cfg

Once you've created and edited this file, make sure to restrict access so no other users can read it.

sudo chmod 600 /path/to/fuse_connection.cfg

Note

If you have created the configuration file on Windows, make sure to run dos2unix to sanitize and convert the file to Unix format.

Create an empty directory for mounting

sudo mkdir ~/mycontainer

Mount

Note

For a full list of mount options, check the BlobFuse repository.

To mount BlobFuse, run the following command with your user. This command mounts the container specified in '/path/to/fuse_connection.cfg' onto the location '/mycontainer'.

sudo blobfuse ~/mycontainer --tmp-path=/mnt/resource/blobfusetmp  --config-file=/path/to/fuse_connection.cfg -o attr_timeout=240 -o entry_timeout=240 -o negative_timeout=120

Note

If you use an ADLS account, you must include --use-adls=true.

You should now have access to your block blobs through the regular file system APIs. The user who mounts the directory is the only person who can access it, by default, which secures the access. To allow access to all users, you can mount via the option -o allow_other.

sudo cd ~/mycontainer
sudo mkdir test
sudo echo "hello world" > test/blob.txt

Persist the mount

To learn how to persist the mount, see Persisting in the BlobFuse wiki.

Feature support

Support for this feature might be impacted by enabling Data Lake Storage Gen2, Network File System (NFS) 3.0 protocol, or the SSH File Transfer Protocol (SFTP). If you've enabled any of these capabilities, see Blob Storage feature support in Azure Storage accounts to assess support for this feature.

Next steps