Hi @Mantas Urnieza,
Thank you for reaching out to the Microsoft Q&A forum!
Azure OpenAI's quota feature lets you manage rate limits for model deployments by assigning Tokens-per-Minute (TPM) per region and model. Each deployment has a TPM and Requests-per-Minute (RPM) limit. Once your quota is reached, you can reduce TPM on existing deployments or request an increase. TPM allocation is flexible, allowing you to create multiple deployments as long as the total TPM does not exceed your quota. You can adjust TPM from Azure AI Studio and request quota increases as needed through the portal.
For more info you can refer to: Manage Azure OpenAI Service quota.
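To make the allocation rule concrete, here is a minimal sketch of the arithmetic: the TPM assigned to all deployments of a model in a region must stay within the quota for that region and model. The quota figure, deployment names, and helper functions are hypothetical and for illustration only; this does not call any Azure API, and your actual quota will depend on your subscription, region, and model.

```python
# Hypothetical numbers: real quotas vary by subscription, region, and model.

def remaining_tpm(quota_tpm: int, deployments: dict[str, int]) -> int:
    """Return the TPM still unallocated for one region/model pair."""
    allocated = sum(deployments.values())
    return quota_tpm - allocated


def can_deploy(quota_tpm: int, deployments: dict[str, int], requested_tpm: int) -> bool:
    """Check whether a new deployment with requested_tpm fits in the quota."""
    return requested_tpm <= remaining_tpm(quota_tpm, deployments)


# Assumed quota of 240,000 TPM for one model in one region.
quota = 240_000
# Assumed existing deployments and their TPM assignments.
deployments = {"chat-prod": 150_000, "chat-dev": 50_000}

print(remaining_tpm(quota, deployments))       # 40000
print(can_deploy(quota, deployments, 60_000))  # False: reduce another deployment or request an increase
print(can_deploy(quota, deployments, 30_000))  # True
```

In this example, a new 60,000 TPM deployment would not fit, so you would either lower the TPM on an existing deployment or request a quota increase.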
To view your quota allocations across deployments in a given region, select Shared Resources > Quota in Azure OpenAI Studio. To request more quota, click the link on that page to increase the quota.
To migrate from OpenAI to Azure OpenAI, refer to this documentation.
I hope this helps! Thank you.