Batch-Transcription and Power Automate does not work for speaker diarization

Question

Marisa Henn 0

Hello,

I am trying to use batch transcription for my research. Following a YouTube tutorial, I used Power Automate to run the batch transcription (you can find the tutorial here: YouTube Tutorial). Everything worked flawlessly until I tried to add speaker diarization. With speaker diarization enabled in the workflow, I receive the following JSON output (the text output does not include any speaker information at all):

"}],"recognizedPhrases":[{"recognitionStatus":"Success","channel":0,"speaker":1,"offset":"PT0.56S","duration":"PT2.88S","offsetInTicks":5600000.0,"durationInTicks":28800000.0,"nBest":[{"confidence":0.088088915,"lexical":"this episode of yap is sponsored by shopify","itn":"this episode of yap is sponsored by shopify","maskedITN":"","display":"This episode of Yap is sponsored by Shopify.","words":[{"word":"this","offset":"PT0.56S","duration":"PT0.32S","offsetInTicks":5600000.0,"durationInTicks":3200000.0, (...)

I would like to receive output similar to this:

[Sprecher 1 00:00] This episode of Yap is sponsored by Shopify. Shopify simplifies selling online and in person so you can focus on successfully growing your business. Sign up for a $1.00 per month trial period at shopify.com/profiting.

[Sprecher 2 00:16] Auto fence Here comes Ana Richte Stolen Narcisch. Then go to Gaussian Lieben van Fossettzenman Zuvi Baiden Audi Gabrock wagon plus Vagin Best some dresses in October Vatten by the Audi Gabrock Wagon Plus partner Fila Modeller. So top leasing condicion and of dish Allah otos Zenzo fart for fikbar to evaluate and auto solisen then fended in Audi that sudia past Ali in force of Audi de podcast auda direct by tiny Minden Audi Gabriel wagon plus partner.

Do you know which steps I should take to achieve the intended result?

Best regards,
Marisa

1 answer

Answer 1

Ivan Dragov (CONCENTRIX Corporation) 2,640 External Microsoft staff

Hello Marisa,

Since you are posting in the German-language Q&A, I will continue the conversation in German. Diarization is the process of recognizing and separating speakers in mono-channel audio data. Are you using mono-channel audio as input? You can find more information in this article:

Ingestion Client with Azure AI services > Ingestion Client features

The service works best with at least 7 seconds of continuous audio from a single speaker. The system can then distinguish the speakers properly. Otherwise, the speaker ID is returned as Unknown, as explained here:

Quickstart: Real-time diarization (Preview) > Diarization from file with conversation transcription

Do your speakers each have at least 7 seconds of uninterrupted speaking time?

Regards,

Ivan Dragov
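
The 7-second guideline from the answer can be checked against the batch transcription JSON itself. The following is a minimal Python sketch and not part of the original answer. It assumes the result file has been loaded into a dict whose recognizedPhrases entries carry speaker, offsetInTicks and durationInTicks, as in the excerpt posted in the question, and that one second equals 10,000,000 ticks (consistent with "PT0.56S" and 5600000.0 in that excerpt). The function name longest_turn_per_speaker is made up for illustration.

```python
from collections import defaultdict

# Assumption taken from the thread's excerpt: "PT0.56S" corresponds to 5600000.0 ticks.
TICKS_PER_SECOND = 10_000_000


def longest_turn_per_speaker(result: dict) -> dict:
    """Return, per speaker label, the longest uninterrupted turn in seconds.

    A turn runs from the start of a speaker's first phrase to the end of the
    last phrase before another speaker is labelled, so short pauses inside a
    turn are counted as part of it.
    """
    phrases = sorted(result["recognizedPhrases"], key=lambda p: p["offsetInTicks"])
    longest = defaultdict(float)
    current_speaker, turn_start = None, 0.0
    for phrase in phrases:
        start = phrase["offsetInTicks"] / TICKS_PER_SECOND
        end = start + phrase["durationInTicks"] / TICKS_PER_SECOND
        speaker = phrase.get("speaker")
        if speaker != current_speaker:
            # A different speaker label starts a new turn.
            current_speaker, turn_start = speaker, start
        longest[speaker] = max(longest[speaker], end - turn_start)
    return dict(longest)


# Hypothetical sample: only the first phrase's numbers come from the thread.
sample = {
    "recognizedPhrases": [
        {"speaker": 1, "offsetInTicks": 5600000.0, "durationInTicks": 28800000.0},
        {"speaker": 1, "offsetInTicks": 40000000.0, "durationInTicks": 50000000.0},
        {"speaker": 2, "offsetInTicks": 100000000.0, "durationInTicks": 20000000.0},
    ]
}
for speaker, seconds in longest_turn_per_speaker(sample).items():
    print(f"speaker {speaker}: longest turn {seconds:.2f} s")
```

Any speaker whose longest turn falls well below 7 seconds is a candidate for unreliable labelling under the guideline quoted above.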

Marisa Henn 0 Reputation points

2024-01-30T20:15:44.2666667+00:00

Hello Ivan,

Thank you very much for your reply. Yes, the speakers have at least 7 minutes of uninterrupted speaking time. The JSON file also outputs the different speakers together with their timestamps. However, it only ever lists the speaker's first word and not the whole sentence. For example:

"recognizedPhrases":[{"recognitionStatus":"Success","channel":0,"speaker":1,"offset":"PT0.56S","duration":"PT2.88S","offsetInTicks":5600000.0,"durationInTicks":28800000.0,"nBest":[{"confidence":0.088088915,"lexical":"this episode of yap is sponsored by shopify","itn":"this episode of yap is sponsored by shopify","maskedITN":"This episode of Yap is sponsored by Shopify","display":"This episode of Yap is sponsored by Shopify.

When I have the text file generated, the speakers have disappeared completely. I would really like the output to look like this:

Speaker 1 (1:0000): Text
Speaker 2 (1:30): Text

I set up the batch transcription according to this and set diarization to "yes", with speakers min. 1 and max. 2. I used mono MP3 as the file format. I hope I have explained this clearly. Thank you very much for your help.

Best regards,
Marisa

Marisa Henn 0 Reputation points

2024-01-30T20:16:47.46+00:00

This is the link for the batch transcription: https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/transcribe-audio-to-text-from-blob-storage-without-writing-any/ba-p/3778471
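
The output format requested in this thread (speaker label, start time, full text) can be derived from the JSON result in a separate post-processing step. This is because each recognizedPhrases entry in the excerpts above already carries speaker, offset and a full display sentence. The following is a minimal Python sketch under stated assumptions, not the method of the linked tutorial. It assumes the result has been loaded into a dict, that offsets use the simple "PT...H...M...S" form seen in the thread, and that the first nBest entry is the one to print. The names parse_offset and format_transcript are hypothetical.

```python
import re

# Matches the simple ISO 8601 durations seen in the thread, e.g. "PT0.56S" or "PT1M30.5S".
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def parse_offset(value: str) -> float:
    """Convert a duration string such as 'PT1M30.5S' to seconds."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


def format_transcript(result: dict, label: str = "Speaker") -> str:
    """Merge consecutive phrases of one speaker into '[Speaker N mm:ss] text' lines."""
    phrases = sorted(result["recognizedPhrases"], key=lambda p: parse_offset(p["offset"]))
    turns = []  # each entry: [speaker, start_seconds, [display texts]]
    for phrase in phrases:
        if phrase.get("recognitionStatus") != "Success" or not phrase.get("nBest"):
            continue  # skip phrases without a usable recognition result
        speaker = phrase.get("speaker", "?")
        text = phrase["nBest"][0]["display"]  # assumption: first candidate is preferred
        if turns and turns[-1][0] == speaker:
            turns[-1][2].append(text)  # same speaker keeps talking: extend the turn
        else:
            turns.append([speaker, parse_offset(phrase["offset"]), [text]])
    lines = []
    for speaker, start, texts in turns:
        minutes, seconds = divmod(int(start), 60)
        lines.append(f"[{label} {speaker} {minutes:02d}:{seconds:02d}] {' '.join(texts)}")
    return "\n".join(lines)
```

Loading the downloaded result file with json.load and passing the dict to format_transcript(result, label="Sprecher") would produce lines in the style shown in the question. Minutes are not wrapped into hours, so a phrase starting at 1 h 5 min is printed as 65:00.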
