Metrics for Azure NetApp Files

Article
07/17/2024

Azure NetApp Files provides metrics on allocated storage, actual storage usage, volume IOPS, and latency. By analyzing these metrics, you can better understand the usage patterns and volume performance of your NetApp accounts.

Definitions

Understanding the terminology related to performance and capacity in Azure NetApp Files is essential to understanding the metrics available:

Capacity pool: A capacity pool is how capacity is billed in Azure NetApp Files. Capacity pools contain volumes.
Volume quota: The amount of capacity provisioned to an Azure NetApp Files volume. Volume quota is directly tied to automatic Quality of Service (QoS), which impacts the volume performance. For more information, see QoS types for capacity pools.
Throughput: The amount of data transmitted across the wire (read/write/other) between Azure NetApp Files and the client. Throughput in Azure NetApp Files is measured in bytes per second.
Latency: Latency is the amount of time for a storage operation to complete within storage from the time it arrives to the time it's processed and is ready to be sent back to the client. Latency in Azure NetApp Files is measured in milliseconds (ms).

About storage performance operation metrics

An operation in Azure NetApp Files is defined as something that happens during a client/server conversation. For instance, when a client requests a file to be read from Azure NetApp Files, read and other operations are sent and received between the client and server.

When monitoring the Azure NetApp Files volume, read and write operations are self-explanatory. Also included in the metrics is a metric called Other IOPS, meaning any operation that isn't a read or write. Other IOPS encompasses operations such as metadata, which is present alongside most read and write operations.

The following types of metadata operations are included in the Other IOPS metric:

NFSv3

NFSv3 metadata calls included in Other IOPS as covered in RFC-1813:

Procedure 0: NULL - Do nothing
Procedure 1: GETATTR - Get file attributes
Procedure 2: SETATTR - Set file attributes
Procedure 3: LOOKUP - Lookup filename
Procedure 4: ACCESS - Check access permission
Procedure 5: READLINK - Read from symbolic link
Procedure 8: CREATE - Create a file
Procedure 9: MKDIR - Create a directory
Procedure 10: SYMLINK - Create a symbolic link
Procedure 11: MKNOD - Create a special device
Procedure 12: REMOVE - Remove a file
Procedure 13: RMDIR - Remove a directory
Procedure 14: RENAME - Rename a file or directory
Procedure 15: LINK - Create link to an object
Procedure 16: READDIR - Read from directory
Procedure 17: READDIRPLUS - Extended read from directory
Procedure 18: FSSTAT - Get dynamic file system information
Procedure 19: FSINFO - Get static file system information
Procedure 20: PATHCONF - Retrieve POSIX information
Procedure 21: COMMIT - Commit cached data on a server to stable storage
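As a rough illustration of how the list above maps onto the three IOPS metrics, the sketch below sorts NFSv3 procedure numbers into read, write, and other. It assumes that procedures 6 (READ) and 7 (WRITE), the two RFC 1813 procedures missing from the list, are the ones counted as Read IOPS and Write IOPS. The function names and the sample trace are hypothetical and are not part of Azure NetApp Files.

```python
# NFSv3 procedure numbers from RFC 1813. READ (6) and WRITE (7) are the only
# two that the list above leaves out, so everything else is treated as "other".
# Assumption: this mirrors how the service buckets operations.
NFS3_READ = 6
NFS3_WRITE = 7


def classify_nfs3_procedure(proc: int) -> str:
    """Return 'read', 'write', or 'other' for an NFSv3 procedure number."""
    if not 0 <= proc <= 21:
        raise ValueError(f"not an NFSv3 procedure number: {proc}")
    if proc == NFS3_READ:
        return "read"
    if proc == NFS3_WRITE:
        return "write"
    return "other"


def count_by_category(procs: list[int]) -> dict[str, int]:
    """Tally a sequence of procedure numbers into the three categories."""
    counts = {"read": 0, "write": 0, "other": 0}
    for proc in procs:
        counts[classify_nfs3_procedure(proc)] += 1
    return counts


# Hypothetical trace for reading one file: LOOKUP, ACCESS, GETATTR, READ, READ.
trace = [3, 4, 1, 6, 6]
print(count_by_category(trace))  # {'read': 2, 'write': 0, 'other': 3}
```

In this made-up trace, three of the five operations are metadata calls, which is why Other IOPS can be significant even for a read-only workload.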

NFSv4.1

NFSv4.1 metadata calls included in Other IOPS as covered in RFC-7530:

Procedure 0: NULL - Do nothing
Procedure 1: COMPOUND - Combining multiple NFS operations into a single request
Operation 3: ACCESS - Check access rights
Operation 4: CLOSE - Close file
Operation 5: COMMIT - Commit cached data
Operation 6: CREATE - Create a nonregular file object
Operation 7: DELEGPURGE - Purge delegations awaiting recovery
Operation 8: DELEGRETURN - Return delegation
Operation 9: GETATTR - Get attributes
Operation 10: GETFH - Get current filehandle
Operation 11: LINK - Create link to a file
Operation 12: LOCK - Create lock
Operation 13: LOCKT - Test for Lock
Operation 14: LOCKU - Unlock file
Operation 15: LOOKUP - Look Up filename
Operation 16: LOOKUPP - Look Up parent directory
Operation 17: NVERIFY - Verify difference in attributes
Operation 18: OPEN - Open a regular file
Operation 19: OPENATTR - Open named attribute directory
Operation 20: OPEN_CONFIRM - Confirm open
Operation 21: OPEN_DOWNGRADE - Reduce open file access
Operation 22: PUTFH - Set current filehandle
Operation 23: PUTPUBFH - Set public filehandle
Operation 24: PUTROOTFH - Set root filehandle
Operation 26: READDIR - Read directory
Operation 27: READLINK - Read symbolic link
Operation 28: REMOVE - Remove file system object
Operation 29: RENAME - Rename directory entry
Operation 30: RENEW - Renew a lease
Operation 32: SAVEFH - Save current filehandle
Operation 33: SECINFO - Obtain available security
Operation 34: SETATTR - Set attributes
Operation 35: SETCLIENTID - Negotiate client ID
Operation 36: SETCLIENTID_CONFIRM - Confirm client ID
Operation 37: VERIFY - Verify same attributes
Operation 39: RELEASE_LOCKOWNER - Release lock-owner state

SMB (includes SMB2 and SMB3.x)

SMB commands included in Other IOPS with opcode value:

SMB command	Opcode value
SMB2 NEGOTIATE	0x0000
SMB2 SESSION_SETUP	0x0001
SMB2 LOGOFF	0x0002
SMB2 TREE_CONNECT	0x0003
SMB2 TREE_DISCONNECT	0x0004
SMB2 CREATE	0x0005
SMB2 CLOSE	0x0006
SMB2 FLUSH	0x0007
SMB2 LOCK	0x000A
SMB2 IOCTL	0x000B
SMB2 CANCEL	0x000C
SMB2 ECHO	0x000D
SMB2 QUERY_DIRECTORY	0x000E
SMB2 CHANGE_NOTIFY	0x000F
SMB2 QUERY_INFO	0x0010
SMB2 SET_INFO	0x0011
SMB2 OPLOCK_BREAK	0x0012
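The same idea can be sketched for SMB. The table above skips opcodes 0x0008 and 0x0009, which the SMB2 protocol assigns to READ and WRITE. The snippet assumes those two opcodes feed the Read IOPS and Write IOPS metrics and that every opcode in the table counts as Other IOPS. The helper name is hypothetical.

```python
SMB2_READ = 0x0008
SMB2_WRITE = 0x0009

# Opcodes 0x0000 through 0x0012 minus READ and WRITE: the 17 commands
# listed in the table above.
SMB2_OTHER_OPCODES = set(range(0x0000, 0x0013)) - {SMB2_READ, SMB2_WRITE}


def classify_smb2_opcode(opcode: int) -> str:
    """Return 'read', 'write', or 'other' for an SMB2 command opcode."""
    if opcode == SMB2_READ:
        return "read"
    if opcode == SMB2_WRITE:
        return "write"
    if opcode in SMB2_OTHER_OPCODES:
        return "other"
    raise ValueError(f"unknown SMB2 opcode: {opcode:#06x}")


# Opening, reading, and closing a file: CREATE, READ, CLOSE.
for opcode in (0x0005, 0x0008, 0x0006):
    print(f"{opcode:#06x} -> {classify_smb2_opcode(opcode)}")
```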

Ways to access metrics

Azure NetApp Files metrics are natively integrated into Azure Monitor. From within the Azure portal, you can find metrics for Azure NetApp Files capacity pools and volumes from two locations:

From Azure Monitor, select Metrics, then select a capacity pool or volume. Then select Metric to view the available metrics.
From the Azure NetApp Files capacity pool or volume, select Metrics. Then select Metric to view the available metrics.
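Because the metrics live in Azure Monitor, they can in principle also be read outside the portal through the Azure Monitor metrics REST endpoint. The sketch below only builds the request URL for a volume and sends nothing. The resource names are placeholders, the metric name `ReadIops` is an assumption to check against the metric list shown in the portal, and a real request also needs an Azure bearer token, which is out of scope here.

```python
from urllib.parse import urlencode

ARM_ENDPOINT = "https://management.azure.com"


def volume_resource_id(subscription_id: str, resource_group: str,
                       account: str, pool: str, volume: str) -> str:
    """Build the Azure resource ID of an Azure NetApp Files volume."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.NetApp/netAppAccounts/{account}"
        f"/capacityPools/{pool}/volumes/{volume}"
    )


def metrics_url(resource_id: str, metric_names: list[str], timespan: str,
                interval: str = "PT5M", aggregation: str = "Average") -> str:
    """Build an Azure Monitor metrics query URL for one resource."""
    query = urlencode({
        "api-version": "2018-01-01",
        "metricnames": ",".join(metric_names),
        "timespan": timespan,          # ISO 8601 start/end
        "interval": interval,          # ISO 8601 duration
        "aggregation": aggregation,
    })
    return f"{ARM_ENDPOINT}{resource_id}/providers/Microsoft.Insights/metrics?{query}"


# All identifiers below are placeholders for illustration only.
rid = volume_resource_id("<subscription-id>", "my-rg", "my-account", "my-pool", "my-volume")
print(metrics_url(rid, ["ReadIops"], "2024-07-17T00:00:00Z/2024-07-17T01:00:00Z"))
```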

Usage metrics for capacity pools

Pool Allocated Size
The provisioned size of the pool.
Pool Allocated to Volume Size
The total of volume quota (GiB) in a given capacity pool (that is, the total of the volumes' provisioned sizes in the capacity pool).
This size is the size you selected during volume creation.
Pool Consumed Size
The total of logical space (GiB) used across volumes in a capacity pool.
Total Snapshot Size for the Pool
The sum of snapshot size from all volumes in the pool.
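The pool metrics above can be combined to see how much of a pool is still unassigned and how full the assigned space is. The following is a minimal sketch with made-up numbers, assuming all three values have already been converted to GiB. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PoolUsage:
    allocated_gib: float             # Pool Allocated Size
    allocated_to_volumes_gib: float  # Pool Allocated to Volume Size
    consumed_gib: float              # Pool Consumed Size

    @property
    def unallocated_gib(self) -> float:
        """Pool capacity not yet assigned to any volume quota."""
        return self.allocated_gib - self.allocated_to_volumes_gib

    @property
    def percent_allocated(self) -> float:
        """Share of the pool that is assigned to volume quotas."""
        return 100.0 * self.allocated_to_volumes_gib / self.allocated_gib

    @property
    def percent_consumed(self) -> float:
        """Logical space used, relative to the provisioned pool size."""
        return 100.0 * self.consumed_gib / self.allocated_gib


# Hypothetical 4-TiB pool with 3 TiB of volume quota and 1.5 TiB in use.
pool = PoolUsage(allocated_gib=4096, allocated_to_volumes_gib=3072, consumed_gib=1536)
print(pool.unallocated_gib)    # 1024
print(pool.percent_allocated)  # 75.0
print(pool.percent_consumed)   # 37.5
```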

Usage metrics for volumes

Percentage Volume Consumed Size
The percentage of the volume consumed, including snapshots.
Aggregation metrics (for example, min, max) aren't supported for percentage volume consumed size.
Volume Allocated Size
The provisioned size of a volume.
Volume Quota Size
The quota size (GiB) the volume is provisioned with.
Volume Consumed Size
Logical size of the volume (used bytes).
This size includes logical space used by active file systems and snapshots.
Volume Snapshot Size
The size of all snapshots in a volume.
Throughput limit reached

Throughput limit reached is a Boolean metric that indicates whether the volume is hitting its QoS limit. A value of 1 means that the volume has reached its maximum throughput and that throughput for this volume will be throttled. A value of 0 means the limit hasn't been reached.

Note

The Throughput limit reached metric is collected every 5 minutes and is displayed as a hit if it has been collected in the last 5 minutes.
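Since the metric is a series of 0 and 1 samples taken every 5 minutes, one way to judge how often a volume is throttled is to summarize those samples over a window. The sketch below assumes the samples have already been exported as a list of integers. The function names and the sample data are hypothetical.

```python
def throttled_fraction(samples: list[int]) -> float:
    """Fraction of collection intervals in which the limit was reached."""
    if not samples:
        raise ValueError("no samples")
    if any(s not in (0, 1) for s in samples):
        raise ValueError("samples must be 0 or 1")
    return sum(samples) / len(samples)


def throttled_minutes(samples: list[int], interval_minutes: int = 5) -> int:
    """Approximate time at the limit, given one sample per interval."""
    return sum(samples) * interval_minutes


# One hypothetical hour of data: twelve 5-minute samples, three at the limit.
hour = [0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
print(throttled_fraction(hour))  # 0.25
print(throttled_minutes(hour))   # 15
```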

If the volume is hitting the throughput limit, it's not sized appropriately for the application's demands. To resolve throughput issues:
- Resize the volume:
  
  Increase the volume size to allocate more throughput to the volume so it's not throttled.
- Modify the service level:
  
  The Premium and Ultra service levels in Azure NetApp Files cater to workloads with higher throughput requirements. Moving the volume to a capacity pool in a higher service level automatically increases these limits for the volume.
- Change the workloads/application:
  
  Consider repurposing the volume and delegating a different volume with a larger size and/or in a higher service level to meet your application requirements. If it's an NFS volume, consider changing mount options to reduce data flow if your application supports those changes.
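To compare the first two options, it can help to estimate what quota a target throughput would need at each service level. The sketch below assumes an auto QoS capacity pool, where the throughput limit scales linearly with volume quota, and assumes per-TiB rates of 16, 64, and 128 MiB/s for Standard, Premium, and Ultra. Those figures are not stated in this article, so verify them against the current service-level documentation before relying on them.

```python
# Assumed throughput per provisioned TiB of volume quota, in MiB/s.
# These values are an assumption, not taken from this article.
MIBPS_PER_TIB = {"Standard": 16, "Premium": 64, "Ultra": 128}


def throughput_limit_mibps(quota_tib: float, service_level: str) -> float:
    """Estimated throughput limit for a volume under auto QoS."""
    return quota_tib * MIBPS_PER_TIB[service_level]


def quota_needed_tib(target_mibps: float, service_level: str) -> float:
    """Estimated quota required to reach a target throughput."""
    return target_mibps / MIBPS_PER_TIB[service_level]


# Hypothetical workload that needs 256 MiB/s.
for level in MIBPS_PER_TIB:
    print(level, quota_needed_tib(256, level), "TiB")
```

Under these assumptions, the same target needs far less quota at a higher service level, which is the trade-off between resizing the volume and changing the service level.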

Performance metrics for volumes

Note

Volume latency for Average Read Latency and Average Write Latency is measured within the storage service and does not include network latency.

Average Read Latency
The average roundtrip time (RTT) for reads from the volume in milliseconds.
Average Write Latency
The average roundtrip time (RTT) for writes to the volume in milliseconds.
Read IOPS
The number of read operations to the volume per second.
Write IOPS
The number of write operations to the volume per second.
Other IOPS
The number of other operations to the volume per second.
Total IOPS
The sum of the read, write, and other operations to the volume per second.
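The relationship between the four IOPS metrics is simple addition. The short sketch below also computes what share of the total is metadata-style traffic. The numbers are invented and the function names are hypothetical.

```python
def total_iops(read_iops: float, write_iops: float, other_iops: float) -> float:
    """Total IOPS as the sum of read, write, and other operations."""
    return read_iops + write_iops + other_iops


def other_share(read_iops: float, write_iops: float, other_iops: float) -> float:
    """Fraction of all operations that are neither reads nor writes."""
    total = total_iops(read_iops, write_iops, other_iops)
    return other_iops / total if total else 0.0


# Hypothetical sample: 1,200 reads, 300 writes, 500 other operations per second.
print(total_iops(1200, 300, 500))   # 2000
print(other_share(1200, 300, 500))  # 0.25
```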

Volume replication metrics

Note

Network transfer size (for example, the Volume replication total transfer metrics) might differ from the size of the source or destination volumes of a cross-region replication. This behavior is the result of an efficient replication engine being used to minimize the network transfer cost.
Volume replication metrics are currently populated for replication destination volumes and not for the source of the replication relationship.

Is volume replication status healthy
The condition of the replication relationship. A healthy state is denoted by 1. An unhealthy state is denoted by 0.
Is volume replication transferring
Whether the status of the volume replication is transferring.
Volume replication lag time
Lag time is the actual amount of time the replication lags behind the source. It indicates the age of the replicated data in the destination volume relative to the source volume.

Note

When assessing the health status of the volume replication, consider the volume replication lag time. If the lag time is greater than the replication schedule, the replication volume will not catch up to the source. To resolve this issue, adjust the replication speed or the replication schedule.

Volume replication last transfer duration
The amount of time in seconds it took for the last transfer to complete.
Volume replication last transfer size
The total number of bytes transferred as part of the last transfer.
Volume replication progress
The total amount of data transferred for the current transfer operation.
Volume replication total transfer
The cumulative bytes transferred for the relationship.
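The lag-time guidance in the note above can be expressed as a simple comparison, and the last transfer metrics can be combined into an effective transfer rate. This sketch assumes the lag time and the schedule interval are both expressed in seconds. The function names and sample values are hypothetical.

```python
MIB = 1024 ** 2


def is_falling_behind(lag_seconds: float, schedule_interval_seconds: float) -> bool:
    """True when the lag exceeds the replication schedule interval."""
    return lag_seconds > schedule_interval_seconds


def last_transfer_rate_mibps(transfer_bytes: float, duration_seconds: float) -> float:
    """Effective rate of the last transfer in MiB/s."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return transfer_bytes / duration_seconds / MIB


# Hypothetical hourly schedule with a 90-minute lag.
print(is_falling_behind(lag_seconds=5400, schedule_interval_seconds=3600))  # True

# Hypothetical last transfer: 3 GiB in 10 minutes.
print(last_transfer_rate_mibps(3 * 1024 ** 3, 600))  # 5.12
```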

Throughput metrics for capacity pools

Pool allocated throughput
Sum of the throughput of all the volumes belonging to the pool.
Provisioned throughput for the pool
Provisioned throughput of this pool.

Throughput metrics for volumes

Read throughput
Read throughput in bytes per second.
Total throughput
Sum of all throughput (read, write, and other) in bytes per second.
Write throughput
Write throughput in bytes per second.
Other throughput
Other throughput (that isn't read or write) in bytes per second.
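Because these metrics are reported in bytes per second, converting them to MiB/s and pairing them with the matching IOPS metric gives a rough average operation size. The following is a minimal sketch with invented values. The function names are hypothetical.

```python
MIB = 1024 ** 2


def to_mibps(bytes_per_second: float) -> float:
    """Convert a throughput metric from bytes per second to MiB/s."""
    return bytes_per_second / MIB


def average_op_size_bytes(bytes_per_second: float, iops: float) -> float:
    """Average bytes moved per operation over the same interval."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return bytes_per_second / iops


# Hypothetical sample: 67,108,864 bytes/s of read throughput at 1,024 read IOPS.
print(to_mibps(67_108_864))                     # 64.0
print(average_op_size_bytes(67_108_864, 1024))  # 65536.0
```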

Volume backup metrics

Is Volume Backup Enabled
Shows whether backup is enabled for the volume. 1 is enabled. 0 is disabled.
Is Volume Backup Operation Complete
Shows whether the last volume backup or restore operation is successfully completed. 1 is successful. 0 is unsuccessful.
Is Volume Backup Suspended
Shows whether the backup policy is suspended for the volume. A value of 1 means the policy isn't suspended. A value of 0 means it's suspended.
Volume Backup Bytes
The total bytes backed up for this volume.
Volume Backup Last Transferred Bytes
The total bytes transferred for the last backup or restore operation.
Volume Backup Operation Last Transferred Bytes
Total bytes transferred for last backup operation.
Volume Backup Restore Operation Last Transferred Bytes
Total bytes transferred for last backup restore operation.
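The three status metrics above use 1 and 0 with slightly different meanings, and Is Volume Backup Suspended is inverted relative to its name (1 means not suspended). The sketch below shows one way to turn the three values into a readable status. The function name and the wording of the statuses are hypothetical.

```python
def describe_backup(enabled: int, operation_complete: int, suspended_metric: int) -> str:
    """Summarize the three backup status metrics for one volume.

    enabled:            Is Volume Backup Enabled (1 = enabled)
    operation_complete: Is Volume Backup Operation Complete (1 = successful)
    suspended_metric:   Is Volume Backup Suspended (1 = NOT suspended)
    """
    if enabled != 1:
        return "backup disabled"
    if suspended_metric == 0:
        return "backup policy suspended"
    if operation_complete == 0:
        return "last backup or restore operation was unsuccessful"
    return "backup healthy"


print(describe_backup(enabled=1, operation_complete=1, suspended_metric=0))
# backup policy suspended
```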

Cool access metrics

Volume cool tier size
Volume footprint for the cool tier.
Volume cool tier data read size
Data read in using GET per volume.
Volume cool tier data write size
Data tiered out using PUT per volume.
