Problem with Azure Speech to Text on the LAVIE Tab 10FD3 tablet

Question

We are implementing Azure Speech to Text in our app and have tested it on several devices, but we ran into problems on the LAVIE Tab 10FD3 tablet (https://www.nec-lavie.jp/products/tablet/lavie/laviet10/).

The Google Speech to Text API works fine on the same tablet, so it does not seem to be a microphone or hardware problem. Is there anything we can change in our implementation to make it work on this device?

Here is the Kotlin code that we implemented:


// Lazily create the SpeechRecognizer that takes its input from the microphone.
private val speechRecognizer: SpeechRecognizer by lazy {
    speechConfig = SpeechConfig.fromSubscription(SPEECH_SUBSCRIPTION_KEY, SPEECH_REGION)
    destroyMicrophoneStream() // in case it was previously initialized

    // Create a single stream and keep a reference to it, so that the instance
    // passed to the recognizer is the one destroyMicrophoneStream() releases later.
    val stream = MicrophoneStream.create()
    microphoneStream = stream

    SpeechRecognizer(
        speechConfig,
        AudioConfig.fromStreamInput(stream),
    )
}
	
	
// Start recording the user's voice.
// The AudioRecord here is only used to detect silence: if the user says nothing
// for longer than silenceTimeout (in milliseconds), we stop the recording.
private fun startRecording() {
    if (ActivityCompat.checkSelfPermission(
            this,
            Manifest.permission.RECORD_AUDIO,
        ) != PackageManager.PERMISSION_GRANTED
    ) {
        return
    }
    audioRecord = AudioRecord(
        MediaRecorder.AudioSource.MIC,
        44100,
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT,
        bufferSize,
    )

    audioRecord.startRecording()
    speechRecognizer.recognized.addEventListener(eventHandler)

    speechRecognizer.startContinuousRecognitionAsync().get()
    isRecording = true

    val handler = Handler(Looper.getMainLooper())
    val buffer = ShortArray(bufferSize)

    handler.post(object : Runnable {
        var lastVoiceTimestamp = System.currentTimeMillis()

        override fun run() {
            if (!isRecording) return

            // read() returns the number of samples read, or a negative error code.
            val read = audioRecord.read(buffer, 0, buffer.size)
            var sum = 0.0

            for (i in 0 until read) {
                sum += buffer[i] * buffer[i].toDouble()
            }

            // Treat an empty or failed read as silence instead of dividing by zero.
            val rms = if (read > 0) sqrt(sum / read) else 0.0

            if (rms > silenceThreshold) {
                lastVoiceTimestamp = System.currentTimeMillis()
            }

            if (System.currentTimeMillis() - lastVoiceTimestamp > silenceTimeout) {
                stopRecording()
            } else {
                // Check the input level again in 100 ms.
                handler.postDelayed(this, 100)
            }
        }
    })
}
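To make the silence check easier to discuss separately from the Android and Azure parts, here is a minimal sketch of the same logic as two standalone functions. The names `rmsOf` and `hasTimedOut` are hypothetical and exist only for this illustration; they are not part of the Azure Speech SDK or the Android API. The sketch assumes 16-bit PCM samples and treats a zero or negative sample count (which is how an `AudioRecord` read reports "nothing read" or an error) as silence.

```kotlin
import kotlin.math.sqrt

// Hypothetical helper (illustration only): root-mean-square level of the
// first `read` samples in `buffer`. A zero or negative `read` is treated
// as silence so that we never divide by zero.
fun rmsOf(buffer: ShortArray, read: Int): Double {
    if (read <= 0) return 0.0
    val count = minOf(read, buffer.size)
    var sum = 0.0
    for (i in 0 until count) {
        val sample = buffer[i].toDouble()
        sum += sample * sample
    }
    return sqrt(sum / count)
}

// Hypothetical helper (illustration only): true once more than `timeoutMs`
// milliseconds have passed since the last block that counted as voice.
fun hasTimedOut(lastVoiceMs: Long, nowMs: Long, timeoutMs: Long): Boolean =
    nowMs - lastVoiceMs > timeoutMs
```

In our `startRecording()` this would correspond to comparing `rmsOf(buffer, read)` against `silenceThreshold` and calling `stopRecording()` once `hasTimedOut(lastVoiceTimestamp, System.currentTimeMillis(), silenceTimeout)` returns true.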

Here is a video of the problem: https://drive.google.com/file/d/1WV_Rkwc5WzcNQh9sYgT6954im2b0scOg/view?usp=sharing
