In your code where you call the Computer Vision API, you can specify which version of the API to use.
The latest version is 3.2, but 3.1 and 3.0 can also be used to get different results.
For instance, API 1.0 - 3.0 don't seem to return different captions (they can return different tags though).
3.1 and 3.2 return slightly different captions.
If you are comfortable with Python, you can try out different API captions with this sample:
https://github.com/stephen-howell/AI-for-Accessibility-Vision-AI-Captioning
the line
analyze_url = endpoint + "vision/v" + api_version + "/analyze"
is the line where the Azure Computer Vision endpoint is combined with the vision service, the api, and the request to analyze the results into a single URL for the REST call.
Usage: py caption.py api_version image.png
api_version should be 3.2 for the latest one, but try 3.1 and 3.0 to compare too.