Hi Anil. It looks like you investigated further and came up with some good answers and insights.
As you point out, starting chaos testing in a Production environment is not a great idea. It is best to "shift left" and do this validation earlier in the pipeline, in a place such as a Test stage, where you can control the blast radius and avoid impacting customers.
Chaos Studio enables orchestration of sequential and parallel fault actions so that you can craft experiments that represent real-world scenarios and outages. You can author experiments that disrupt each dependency one by one, or you can create experiments that represent resiliency scenarios such as an availability zone going down, a DNS outage, an AAD outage, a region failover, and so on.
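To make the sequential/parallel idea concrete, here is a minimal Python sketch of how such an experiment could be laid out. It assumes the experiment is described as a list of steps that run one after another, where each step holds branches that run in parallel and each branch holds fault actions. The field names (`steps`, `branches`, `actions`, `selectorId`, `duration`) reflect my understanding of the experiment definition and should be checked against the current Chaos Studio documentation. The fault URNs and selector IDs are made-up placeholders, not real entries from the fault library.

```python
import json

# Placeholder fault identifiers (hypothetical, NOT real fault library URNs).
CACHE_FAULT = "urn:example:cache:reboot/1.0"
VM_SHUTDOWN = "urn:example:virtualMachine:shutdown/1.0"
NET_BLOCK = "urn:example:network:blockTraffic/1.0"


def action(fault_urn, selector_id, duration="PT10M"):
    """Build one continuous fault action (field names are assumptions)."""
    return {
        "type": "continuous",
        "name": fault_urn,
        "selectorId": selector_id,  # which targets the fault applies to
        "duration": duration,       # ISO 8601 duration string
        "parameters": [],
    }


def branch(name, *actions):
    """A branch groups actions; branches in the same step run in parallel."""
    return {"name": name, "actions": list(actions)}


def step(name, *branches):
    """A step groups branches; steps run sequentially, in list order."""
    return {"name": name, "branches": list(branches)}


# Step 1 disrupts a single dependency; step 2 models a zone-down scenario
# by hitting VMs and networking at the same time.
experiment_steps = [
    step(
        "Step 1: single dependency",
        branch("cache", action(CACHE_FAULT, "cache-selector")),
    ),
    step(
        "Step 2: zone-down scenario",
        branch("vms", action(VM_SHUTDOWN, "zone1-vms")),
        branch("network", action(NET_BLOCK, "zone1-network")),
    ),
]


def summarize(steps):
    """Return (step name, number of parallel branches) in execution order."""
    return [(s["name"], len(s["branches"])) for s in steps]


if __name__ == "__main__":
    print(json.dumps({"steps": experiment_steps}, indent=2))
    print(summarize(experiment_steps))
```

Starting with one-dependency steps like the first one and only later combining faults into scenario steps like the second keeps the impact of each run small and easy to reason about.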
Regarding limitations, as you point out, having the right faults in the fault library is important so that you can impact your dependencies and build the scenarios you are interested in. The good news is that the fault library is growing each month. At this time, there is no support for customer-added faults, but there are plans for a BYOF (bring-your-own-fault) feature in the future.
Customers use Chaos Studio to validate the resilience of both PaaS and IaaS solutions. Network faults, service-direct faults, and agent-based faults running in VMs offer a variety of ways to introduce disruptions and validate different scenarios. The key point is that Chaos Studio is Azure-focused right now.