How do I benchmark disk I/O? How can I confidently say whether disk I/O is optimal?

We recently migrated from on-prem to Azure, and I have been exploring disk I/O. The on-prem storage is ESX storage. On Azure, we use D8 series VMs with Premium SSD.

I ran the DiskSpd tool that Microsoft provides. With software caching disabled in DiskSpd, on-prem performed several times better than Azure: both throughput and latency were better on-prem. I then enabled software caching and tested again. Throughput was similar on Azure and on-prem, but latency was still much better on-prem.

On the Azure side, I have two VMs: one has read/write host caching enabled and the other does not. On the VM without host caching, running DiskSpd with software caching enabled gives latency similar to on-prem but slightly worse throughput. If I disable software caching, both latency and throughput are bad.

My questions:
- Is there a tool other than DiskSpd that can be used to test disk I/O on Azure SSDs? Do I have to write my own C/C++ tool, or is DiskSpd the right tool for this? How does the software cache differ from the BlobCache that Premium SSD uses?
- In Perfmon, I can see a "Notify ChangeDirectory" event when host caching is enabled. The Microsoft blog (on Premium SSD) says that the call to flush from cache to disk is synchronous. Does that mean write caching has an impact on disk I/O performance?
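
To make concrete what I mean by writing my own tool, here is a minimal sketch in Python (rather than C/C++) of the kind of measurement I have in mind. It is not a replacement for DiskSpd. The function names `measure_sync_writes`, `summarize`, and `run_benchmark`, the 4 KiB block size, and the count of 256 writes are my own choices for illustration. It calls `os.fsync` after every write so that each write is flushed rather than left in the OS cache, then reports average latency, an approximate 99th-percentile latency, and throughput. It does not control queue depth, threads, or Azure host caching.

```python
import os
import statistics
import tempfile
import time


def measure_sync_writes(path, block_size=4096, count=256):
    """Write `count` blocks, calling fsync after each one.

    Returns a list of per-write latencies in seconds.
    """
    block = os.urandom(block_size)
    latencies = []
    # O_BINARY only exists on Windows; fall back to 0 elsewhere.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, block)
            os.fsync(fd)  # force the write out of the OS cache
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies


def summarize(latencies, block_size):
    """Reduce raw latencies to a few summary numbers."""
    ordered = sorted(latencies)
    total = sum(ordered)
    # Simple nearest-rank style index for an approximate 99th percentile.
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    bytes_written = len(ordered) * block_size
    return {
        "ops": len(ordered),
        "avg_ms": statistics.mean(ordered) * 1000,
        "p99_ms": ordered[p99_index] * 1000,
        "throughput_mib_s": (
            bytes_written / total / (1024 * 1024) if total > 0 else float("inf")
        ),
    }


def run_benchmark(directory=None, block_size=4096, count=256):
    """Run the write test against a temp file in `directory` and clean up."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        latencies = measure_sync_writes(path, block_size, count)
        return summarize(latencies, block_size)
    finally:
        os.remove(path)


if __name__ == "__main__":
    # Pass directory="..." to target the disk you want to test.
    print(run_benchmark())
```

Running the same script on the on-prem VM and on both Azure VMs, pointed at the disk under test, would give numbers that can be compared side by side, although only for this single-threaded, synchronous write pattern.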