Get batch transcription results

Article
09/12/2024

To get transcription results, first check the status of the transcription job. When the job is complete, you can retrieve the transcriptions and the transcription report.

Get transcription status

To get the status of the transcription job, call the Transcriptions_Get operation of the Speech to text REST API.

Important

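Batch transcription jobs are scheduled on a best-effort basis. At peak hours, it can take 30 minutes or longer for a transcription job to start processing. For most of its execution, the transcription status is Running, because the job is assigned the Running status the moment it moves to the batch transcription backend system. With a base model, this assignment happens almost immediately; with a custom model, it is slightly slower. As a result, the time a transcription job spends in the Running state doesn't correspond to the actual transcription time alone; it also includes waiting time in internal queues.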

Make an HTTP GET request using the URI as shown in the following example. Replace YourTranscriptionId with your transcription ID, YourSubscriptionKey with your Speech resource key, and YourServiceRegion with your Speech resource region.

curl -v -X GET "https://YourServiceRegion.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/YourTranscriptionId" -H "Ocp-Apim-Subscription-Key: YourSubscriptionKey"

You should receive a response body in the following format:

{
  "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/637d9333-6559-47a6-b8de-c7d732c1ddf3",
  "model": {
    "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/models/base/aaa321e9-5a4e-4db1-88a2-f251bbe7b555"
  },
  "links": {
    "files": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/637d9333-6559-47a6-b8de-c7d732c1ddf3/files"
  },
  "properties": {
    "diarizationEnabled": false,
    "wordLevelTimestampsEnabled": false,
    "displayFormWordLevelTimestampsEnabled": true,
    "channels": [
      0,
      1
    ],
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked",
    "duration": "PT3S",
    "languageIdentification": {
      "candidateLocales": [
        "en-US",
        "de-DE",
        "es-ES"
      ]
    }
  },
  "lastActionDateTime": "2024-05-10T18:39:09Z",
  "status": "Succeeded",
  "createdDateTime": "2024-05-10T18:39:07Z",
  "locale": "en-US",
  "displayName": "My Transcription"
}

The status property indicates the current status of the transcription. The transcriptions and the transcription report are available when the transcription status is Succeeded.
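
The status check above can be repeated until the job finishes. The following is a minimal Python sketch of such a polling loop, not an official client. The function names (`transcription_url`, `fetch_json`, `wait_for_transcription`), the 30-second interval, and the poll limit are illustrative assumptions. The code treats `Succeeded` (shown in the response above) and `Failed` (assumed here as the other terminal status) as the states that end polling.

```python
import json
import time
import urllib.request

# "Succeeded" appears in the sample response; "Failed" is an assumed terminal state.
TERMINAL_STATES = {"Succeeded", "Failed"}


def transcription_url(region, transcription_id):
    """Build the same status URI that the curl example uses."""
    return (f"https://{region}.api.cognitive.microsoft.com"
            f"/speechtotext/v3.2/transcriptions/{transcription_id}")


def fetch_json(url, key):
    """Send an HTTP GET with the Speech resource key and decode the JSON body."""
    request = urllib.request.Request(
        url, headers={"Ocp-Apim-Subscription-Key": key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_transcription(url, key, fetch=fetch_json, interval=30.0,
                           max_polls=120, sleep=time.sleep):
    """Poll the status URI until the job reaches a terminal state.

    Returns the last response body. Raises TimeoutError if the job is still
    not finished after max_polls requests. The fetch and sleep parameters are
    injectable so the loop can be exercised without network access.
    """
    for attempt in range(max_polls):
        body = fetch(url, key)
        if body.get("status") in TERMINAL_STATES:
            return body
        if attempt < max_polls - 1:
            sleep(interval)  # jobs can wait in internal queues, so poll sparingly
    raise TimeoutError(f"transcription not finished after {max_polls} polls")
```

A caller would pass `transcription_url("YourServiceRegion", "YourTranscriptionId")` and the resource key, then read `body["links"]["files"]` from the returned body once the status is `Succeeded`. Because a job can sit in the Running state while queued, a generous interval is likely more appropriate than a tight loop.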

To get the status of the transcription job with the Speech CLI, use the spx batch transcription status command. Construct the request parameters according to the following instructions:

Set the transcription parameter to the ID of the transcription that you want to get.

Here's an example Speech CLI command that gets the transcription status:

spx batch transcription status --api-version v3.2 --transcription YourTranscriptionId

You should receive a response body in the following format:

{
  "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/637d9333-6559-47a6-b8de-c7d732c1ddf3",
  "model": {
    "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/models/base/aaa321e9-5a4e-4db1-88a2-f251bbe7b555"
  },
  "links": {
    "files": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/637d9333-6559-47a6-b8de-c7d732c1ddf3/files"
  },
  "properties": {
    "diarizationEnabled": false,
    "wordLevelTimestampsEnabled": false,
    "displayFormWordLevelTimestampsEnabled": true,
    "channels": [
      0,
      1
    ],
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked",
    "duration": "PT3S"
  },
  "lastActionDateTime": "2024-05-10T18:39:09Z",
  "status": "Succeeded",
  "createdDateTime": "2024-05-10T18:39:07Z",
  "locale": "en-US",
  "displayName": "My Transcription"
}

For Speech CLI help with transcriptions, run the following command:

spx help batch transcription

Get transcription results

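The Transcriptions_ListFiles operation returns a list of result files for a transcription. A transcription report file is provided for each submitted batch transcription job. In addition, one transcription file (the end result) is provided for each successfully transcribed audio file.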

Make an HTTP GET request using the "files" URI from the previous response body. Replace YourTranscriptionId with your transcription ID, YourSubscriptionKey with your Speech resource key, and YourServiceRegion with your Speech resource region.

curl -v -X GET "https://YourServiceRegion.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/YourTranscriptionId/files" -H "Ocp-Apim-Subscription-Key: YourSubscriptionKey"

You should receive a response body in the following format:

{
  "values": [
    {
      "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/637d9333-6559-47a6-b8de-c7d732c1ddf3/files/2dd180a1-434e-4368-a1ac-37350700284f",
      "name": "contenturl_0.json",
      "kind": "Transcription",
      "properties": {
        "size": 3407
      },
      "createdDateTime": "2024-05-10T18:39:09Z",
      "links": {
        "contentUrl": "YourTranscriptionUrl"
      }
    },
    {
      "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/637d9333-6559-47a6-b8de-c7d732c1ddf3/files/c027c6a9-2436-4303-b64b-e98e3c9fc2e3",
      "name": "contenturl_1.json",
      "kind": "Transcription",
      "properties": {
        "size": 8233
      },
      "createdDateTime": "2024-05-10T18:39:09Z",
      "links": {
        "contentUrl": "YourTranscriptionUrl"
      }
    },
    {
      "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/637d9333-6559-47a6-b8de-c7d732c1ddf3/files/faea9a41-c95c-4d91-96ff-e39225def642",
      "name": "report.json",
      "kind": "TranscriptionReport",
      "properties": {
        "size": 279
      },
      "createdDateTime": "2024-05-10T18:39:09Z",
      "links": {
        "contentUrl": "YourTranscriptionReportUrl"
      }
    }
  ]
}

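The response body returns the location of each transcription file and transcription report file, along with more details. The contentUrl property contains the URL of the transcription ("kind": "Transcription") or transcription report ("kind": "TranscriptionReport") file.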
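
To act on the file list, the entries can be grouped by their kind value. The following Python sketch assumes the response body has already been parsed into a dictionary; the helper name `content_urls_by_kind` is hypothetical and not part of any SDK.

```python
def content_urls_by_kind(files_response):
    """Group the contentUrl of each listed file under its "kind" value.

    files_response is the parsed JSON body of the files request. Entries
    keep the order in which the service returned them.
    """
    urls = {}
    for entry in files_response.get("values", []):
        urls.setdefault(entry["kind"], []).append(entry["links"]["contentUrl"])
    return urls


# Trimmed-down data in the shape of the response shown above.
example = {
    "values": [
        {"name": "contenturl_0.json", "kind": "Transcription",
         "links": {"contentUrl": "YourTranscriptionUrl"}},
        {"name": "report.json", "kind": "TranscriptionReport",
         "links": {"contentUrl": "YourTranscriptionReportUrl"}},
    ]
}
grouped = content_urls_by_kind(example)
report_urls = grouped.get("TranscriptionReport", [])
transcription_urls = grouped.get("Transcription", [])
```

Each URL in `transcription_urls` points to one result file, and `report_urls` should hold the single report for the job.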

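If you didn't specify a container in the destinationContainerUrl property of the transcription request, the results are stored in a container managed by Microsoft. When the transcription job is deleted, the transcription result data is also deleted.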

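The spx batch transcription list command returns a list of result files for a transcription. A transcription report file is provided for each submitted batch transcription job. In addition, one transcription file (the end result) is provided for each successfully transcribed audio file.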

Set the required files flag.
Set the required transcription parameter to the ID of the transcription whose result files you want to list.

Here's an example Speech CLI command that gets a list of result files for a transcription:

spx batch transcription list --api-version v3.2 --files --transcription YourTranscriptionId

You should receive a response body in the following format:

{
  "values": [
    {
      "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/637d9333-6559-47a6-b8de-c7d732c1ddf3/files/2dd180a1-434e-4368-a1ac-37350700284f",
      "name": "contenturl_0.json",
      "kind": "Transcription",
      "properties": {
        "size": 3407
      },
      "createdDateTime": "2024-05-10T18:39:09Z",
      "links": {
        "contentUrl": "YourTranscriptionUrl"
      }
    },
    {
      "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/637d9333-6559-47a6-b8de-c7d732c1ddf3/files/c027c6a9-2436-4303-b64b-e98e3c9fc2e3",
      "name": "contenturl_1.json",
      "kind": "Transcription",
      "properties": {
        "size": 8233
      },
      "createdDateTime": "2024-05-10T18:39:09Z",
      "links": {
        "contentUrl": "YourTranscriptionUrl"
      }
    },
    {
      "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/637d9333-6559-47a6-b8de-c7d732c1ddf3/files/faea9a41-c95c-4d91-96ff-e39225def642",
      "name": "report.json",
      "kind": "TranscriptionReport",
      "properties": {
        "size": 279
      },
      "createdDateTime": "2024-05-10T18:39:09Z",
      "links": {
        "contentUrl": "YourTranscriptionReportUrl"
      }
    }
  ]
}

By default, the results are stored in a container managed by Microsoft. When the transcription job is deleted, the transcription result data is also deleted.

Transcription report file

One transcription report file is provided for each submitted batch transcription job.

The contents of the transcription report file are formatted as JSON, as shown in this example.

{
  "successfulTranscriptionsCount": 2,
  "failedTranscriptionsCount": 0,
  "details": [
    {
      "source": "https://crbn.us/hello.wav",
      "status": "Succeeded"
    },
    {
      "source": "https://crbn.us/whatstheweatherlike.wav",
      "status": "Succeeded"
    }
  ]
}
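
A report like the one above can be checked programmatically to find the audio files that were not transcribed. This Python sketch assumes the report has been downloaded and parsed into a dictionary; `failed_sources` is a hypothetical helper, and it treats any per-file status other than `Succeeded` as a failure.

```python
def failed_sources(report):
    """Return the source URLs whose per-file status is not "Succeeded"."""
    return [
        detail["source"]
        for detail in report.get("details", [])
        if detail.get("status") != "Succeeded"
    ]


# The report from the example above: both files succeeded.
report = {
    "successfulTranscriptionsCount": 2,
    "failedTranscriptionsCount": 0,
    "details": [
        {"source": "https://crbn.us/hello.wav", "status": "Succeeded"},
        {"source": "https://crbn.us/whatstheweatherlike.wav", "status": "Succeeded"},
    ],
}
failures = failed_sources(report)  # empty list for this report
```

Comparing `len(failures)` with `failedTranscriptionsCount` is a simple consistency check before downloading the individual result files.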

Transcription result file

One transcription result file is provided for each successfully transcribed audio file.

The contents of each transcription result file are formatted as JSON, as shown in this example.

{
  "source": "...",
  "timestamp": "2023-07-10T14:28:16Z",
  "durationInTicks": 25800000,
  "duration": "PT2.58S",
  "combinedRecognizedPhrases": [
    {
      "channel": 0,
      "lexical": "hello world",
      "itn": "hello world",
      "maskedITN": "hello world",
      "display": "Hello world."
    }
  ],
  "recognizedPhrases": [
    {
      "recognitionStatus": "Success",
      "channel": 0,
      "offset": "PT0.76S",
      "duration": "PT1.32S",
      "offsetInTicks": 7600000.0,
      "durationInTicks": 13200000.0,
      "nBest": [
        {
          "confidence": 0.5643338,
          "lexical": "hello world",
          "itn": "hello world",
          "maskedITN": "hello world",
          "display": "Hello world.",
          "displayWords": [
            {
              "displayText": "Hello",
              "offset": "PT0.76S",
              "duration": "PT0.76S",
              "offsetInTicks": 7600000.0,
              "durationInTicks": 7600000.0
            },
            {
              "displayText": "world.",
              "offset": "PT1.52S",
              "duration": "PT0.56S",
              "offsetInTicks": 15200000.0,
              "durationInTicks": 5600000.0
            }
          ]
        },
        {
          "confidence": 0.1769063,
          "lexical": "helloworld",
          "itn": "helloworld",
          "maskedITN": "helloworld",
          "display": "helloworld"
        },
        {
          "confidence": 0.49964225,
          "lexical": "hello worlds",
          "itn": "hello worlds",
          "maskedITN": "hello worlds",
          "display": "hello worlds"
        },
        {
          "confidence": 0.4995761,
          "lexical": "hello worm",
          "itn": "hello worm",
          "maskedITN": "hello worm",
          "display": "hello worm"
        },
        {
          "confidence": 0.49418187,
          "lexical": "hello word",
          "itn": "hello word",
          "maskedITN": "hello word",
          "display": "hello word"
        }
      ]
    }
  ]
}
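
For many uses, only the readable text per channel is needed. The following Python sketch reads it from a parsed result file. Both helper names are hypothetical. Selecting the alternative with the highest confidence value is one possible policy chosen for illustration, not something the service prescribes.

```python
def display_text_by_channel(result):
    """Map each channel number to its combined display text."""
    return {
        phrase["channel"]: phrase["display"]
        for phrase in result.get("combinedRecognizedPhrases", [])
    }


def most_confident_alternative(phrase):
    """Pick the nBest entry with the highest confidence for one phrase."""
    return max(phrase["nBest"], key=lambda alternative: alternative["confidence"])


# Abbreviated version of the result file shown above.
result = {
    "combinedRecognizedPhrases": [{"channel": 0, "display": "Hello world."}],
    "recognizedPhrases": [
        {
            "channel": 0,
            "nBest": [
                {"confidence": 0.5643338, "display": "Hello world."},
                {"confidence": 0.1769063, "display": "helloworld"},
                {"confidence": 0.49964225, "display": "hello worlds"},
            ],
        }
    ],
}
texts = display_text_by_channel(result)
best = most_confident_alternative(result["recognizedPhrases"][0])
```

Here `texts` maps channel 0 to "Hello world.", and `best["display"]` is also "Hello world." because that alternative has the highest confidence in the sample.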

Depending in part on the request parameters that you set when you created the transcription job, the transcription file can contain the following result properties.

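Property	Description
`channel`	The channel number of the results. For stereo audio streams, the left and right channels are split during the transcription. A JSON result file is created for each input audio file.
`combinedRecognizedPhrases`	The concatenated results of all phrases for the channel.
`confidence`	The confidence value for the recognition.
`display`	The display form of the recognized text. Added punctuation and capitalization are included.
`displayWords`	The timestamps for each word of the transcription. The `displayFormWordLevelTimestampsEnabled` request property must be set to `true`; otherwise this property isn't present. Note: This property is only available with Speech to text REST API version 3.1 and later.
`duration`	The audio duration. The value is an ISO 8601 encoded duration.
`durationInTicks`	The audio duration in ticks (one tick is 100 nanoseconds).
`itn`	The inverse text normalized (ITN) form of the recognized text. Abbreviations such as "Doctor Smith" to "Dr Smith", phone numbers, and other transformations are applied.
`lexical`	The actual words recognized.
`locale`	The locale identified from the input audio. The `languageIdentification` request property must be set; otherwise this property isn't present. Note: This property is only available with Speech to text REST API version 3.1 and later.
`maskedITN`	The ITN form with profanity masking applied.
`nBest`	A list of possible transcriptions for the current phrase, each with a confidence value.
`offset`	The offset in the audio of this phrase. The value is an ISO 8601 encoded duration.
`offsetInTicks`	The offset in the audio of this phrase in ticks (one tick is 100 nanoseconds).
`recognitionStatus`	The recognition state. For example: "Success" or "Failed".
`recognizedPhrases`	The list of results for each phrase.
`source`	The URL that was provided as the input audio source. The source corresponds to the `contentUrls` or `contentContainerUrl` request property. The `source` property is the only way to confirm the audio input for a transcription.
`speaker`	The identified speaker. The `diarization` and `diarizationEnabled` request properties must be set; otherwise this property isn't present.
`timestamp`	The creation date and time of the transcription. The value is an ISO 8601 encoded timestamp.
`words`	A list of results with lexical text for each word of the phrase. The `wordLevelTimestampsEnabled` request property must be set to `true`; otherwise this property isn't present.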
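
The tick-based properties in the table convert to seconds with simple arithmetic: one tick is 100 nanoseconds, so there are 10,000,000 ticks per second. The following Python sketch applies that conversion to `displayWords` entries. The helper names are hypothetical, and the sample data is copied from the result file shown earlier.

```python
TICKS_PER_SECOND = 10_000_000  # one tick is 100 nanoseconds


def ticks_to_seconds(ticks):
    """Convert a tick count (int or float) to seconds."""
    return ticks / TICKS_PER_SECOND


def word_timings(display_words):
    """Return (text, start_seconds, end_seconds) for each displayWords entry."""
    timings = []
    for word in display_words:
        start = word["offsetInTicks"]
        end = start + word["durationInTicks"]  # add in ticks, then convert once
        timings.append((word["displayText"],
                        ticks_to_seconds(start),
                        ticks_to_seconds(end)))
    return timings


# displayWords entries from the example result file.
words = [
    {"displayText": "Hello", "offsetInTicks": 7600000.0, "durationInTicks": 7600000.0},
    {"displayText": "world.", "offsetInTicks": 15200000.0, "durationInTicks": 5600000.0},
]
timings = word_timings(words)
# -> [("Hello", 0.76, 1.52), ("world.", 1.52, 2.08)]
```

The results line up with the ISO 8601 values in the same file, for example an `offset` of PT0.76S for the first word.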
