@keroido To get more than one description of the image, you can use the describeImage API and set the maxCandidates query parameter to the number of captions you need. Your request URI would look like this:
https://eastus.api.cognitive.microsoft.com/vision/v3.0/describe?maxCandidates=3&language=en
The result in this case would be the following:
{
  "description": {
    "tags": ["outdoor", "building", "photo", "city", "large", "sitting", "old", "water", "skyscraper", "many", "boat", "river", "group", "people", "street", "tall", "field", "bird", "standing"],
    "captions": [{
      "text": "a large city",
      "confidence": 0.95549135022361287
    }, {
      "text": "an old photo of a large city",
      "confidence": 0.93256271335599006
    }, {
      "text": "an old photo of a city",
      "confidence": 0.93156271335599006
    }]
  },
  "requestId": "<request_id>",
  "metadata": {
    "height": 300,
    "width": 239,
    "format": "Png"
  }
}
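If it helps, here is a minimal Python sketch of building that request URI and reading the captions back out of a response shaped like the one above. The helper names `build_describe_url` and `extract_captions` are hypothetical (they are not part of any SDK), and the shortened `sample` body is made up for illustration; the sketch only assumes the `description.captions` layout shown in the result.

```python
import json
from urllib.parse import urlencode


def build_describe_url(endpoint, max_candidates=3, language="en"):
    """Build the describe request URI (hypothetical helper, not an SDK call)."""
    query = urlencode({"maxCandidates": max_candidates, "language": language})
    return f"{endpoint.rstrip('/')}/vision/v3.0/describe?{query}"


def extract_captions(response_body):
    """Return (text, confidence) pairs from a describe response, best first."""
    captions = json.loads(response_body)["description"]["captions"]
    ranked = sorted(captions, key=lambda c: c["confidence"], reverse=True)
    return [(c["text"], c["confidence"]) for c in ranked]


# Shortened, made-up response body with the same shape as the result above.
sample = json.dumps({
    "description": {
        "captions": [
            {"text": "an old photo of a large city", "confidence": 0.933},
            {"text": "a large city", "confidence": 0.955},
        ]
    }
})

url = build_describe_url("https://eastus.api.cognitive.microsoft.com")
captions = extract_captions(sample)
```

You would still need to POST the image to `url` with your subscription key; that part is left out here so the sketch runs without network access.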