Customize file write-back in Azure HPC Cache

HPC Cache users can request that the cache write specific files to back-end storage on demand by using the flush_file.py utility. This utility is a separately downloaded software package that you install and use on Linux client machines.

This feature is designed for situations where you want changes to cached files to be made available as soon as possible to systems that don't mount the cache.

For example, you might use Azure HPC Cache to scale your computing jobs in the cloud, but store your data set permanently in an on-premises data center. If compute tasks at the data center depend on changes created with Azure HPC Cache, you can use this utility to "push" the output or changes generated by a cloud task back to the on-premises NAS storage. This lets on-premises compute resources use the new files almost immediately.

Choose between custom write-back and flush

You can force data to be written back with the "storage target flush" option built into Azure HPC Cache - but this approach might not be right for all situations.

  • Writing all of the modified files back to the storage system can take several minutes or even hours, depending on the quantity of data and the speed of the network link to the on-premises system. Also, you can't choose to write only the files you've finished with; files that are still actively being modified are included in the flush.

  • The cache might block serving some requests from that storage target during the flush process. This can delay processing if there are other compute clients using files that reside on the same storage target.

  • Triggering this action requires Contributor access in Azure Resource Manager, which end users might not have.

For example, you might have multiple parallel (but not overlapping) compute jobs that consume data residing on the same HPC Cache storage target. When one job completes, you want to immediately write that job's output from the cache to your long-term storage on the back end.

You have three options:

  • Wait for the cached files to be automatically written back from the cache - but files might sit in the cache for more than an hour before they're completely written back. The timing depends on the write-back delay of your cache usage model, along with other factors such as network link performance and the size of the files. (Read Understand cache usage models to learn more about write-back delay.)

  • Immediately flush the cached files for the entire storage target - but that would disrupt other compute jobs that are also using this storage target's data.

  • Use this customized write-back utility to send a special NFS request to the cache to write back only the specific files you want. This approach doesn't disrupt access for other clients and can be triggered at any point in the computing task.

About the write-back utility

The write-back utility provides a script that you use to specify the individual files to write from the cache to the long-term storage system.

The script takes an input stream listing the files to write, the cache namespace path to your storage target export, and an HPC Cache mount IP address.
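For illustration, here's a minimal sketch of driving the script from a Linux client in Python. The option names (--namespace-path, --mount-ip), the stdin format, and the example paths are placeholder assumptions, not the script's documented interface - check the GitHub repository for the actual options:

```python
import subprocess

# Hypothetical file paths for a finished job's output. The real paths
# depend on where your client mounts the HPC Cache namespace.
files_to_flush = [
    "/mnt/hpccache/output/job42/result-001.dat",
    "/mnt/hpccache/output/job42/result-002.dat",
]

# Assumed invocation: the script reads the list of files to write back
# on stdin and takes the cache namespace path to the storage target
# export plus a cache mount IP address. Option names are placeholders.
subprocess.run(
    [
        "python3", "flush_file.py",
        "--namespace-path", "/storage-target-export",  # placeholder option
        "--mount-ip", "10.0.0.5",                      # placeholder option
    ],
    input="\n".join(files_to_flush) + "\n",
    text=True,
    check=True,
)
```

A wrapper like this could run as the last step of a compute job, so each job pushes only its own output back to long-term storage without touching files that other jobs are still using.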

The script uses an NFSv3 "commit" call with special arguments enabled. The Linux nfs-common client can't pass these arguments appropriately, so the flush_file.py utility uses an NFS client emulator in a Python library to communicate with the HPC Cache NFS service. The library includes everything it needs, so it bypasses any limitations in your compute client's Linux-kernel-based NFS client.
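To make the mechanism concrete, here's a rough sketch of the commit flow at the protocol level. The NfsEmulatorClient class and its methods are hypothetical stand-ins for the utility's bundled NFS library, and the cache-specific commit arguments aren't shown because they aren't part of the standard protocol:

```python
# Illustrative sketch only: NfsEmulatorClient and its methods are
# hypothetical stand-ins for the NFS client emulator library that
# ships with the flush_file.py utility.

class NfsEmulatorClient:
    """Placeholder for a user-space (non-kernel) NFSv3 client."""
    def __init__(self, server_ip): ...
    def lookup(self, path): ...   # resolve a path to an NFS filehandle
    def commit(self, filehandle, offset, count): ...  # NFSv3 COMMIT call

def flush_one_file(cache_mount_ip, namespace_path, relative_path):
    client = NfsEmulatorClient(cache_mount_ip)
    # Resolve the file inside the cache namespace to a filehandle.
    filehandle = client.lookup(f"{namespace_path}/{relative_path}")
    # In standard NFSv3, a COMMIT with offset=0 and count=0 covers the
    # whole file; the real utility adds its special arguments here to
    # tell HPC Cache to write the file through to back-end storage.
    client.commit(filehandle, offset=0, count=0)
```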

To use this feature, you install the utility on a Linux client machine and run the script with the details of the files you want written back.

Learn more about installing and using the flush_file.py script in the GitHub repository.