Configure OpenSSL for Linux
With the Speech SDK, OpenSSL is dynamically configured to the host-system version.
This article is only applicable where the Speech SDK is supported on Linux.
To ensure connectivity, verify that OpenSSL certificates are installed on your system. Run the following command:
openssl version -d
The output on Ubuntu/Debian based systems should be:
OPENSSLDIR: "/usr/lib/ssl"
Check whether there's a certs subdirectory under OPENSSLDIR. In the example above, it would be /usr/lib/ssl/certs. If /usr/lib/ssl/certs exists and contains many individual certificate files (with the .pem extension), no further action is needed.
If OPENSSLDIR is something other than /usr/lib/ssl, or there's a single certificate bundle file instead of multiple individual files, you need to set an appropriate SSL environment variable to indicate where the certificates can be found.
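The check described above can also be scripted. The following is a minimal Python sketch, assuming the openssl binary is on the PATH. The helper names (parse_openssldir, query_openssldir, has_individual_certs) and the threshold used for "many" files are illustrative assumptions, not part of the Speech SDK.

```python
import re
import subprocess
from pathlib import Path


def parse_openssldir(output: str) -> str:
    """Extract the directory from the output of `openssl version -d`."""
    match = re.search(r'OPENSSLDIR:\s*"([^"]+)"', output)
    if match is None:
        raise ValueError(f"unexpected openssl output: {output!r}")
    return match.group(1)


def query_openssldir() -> str:
    """Run `openssl version -d` and return the reported OPENSSLDIR."""
    result = subprocess.run(
        ["openssl", "version", "-d"],
        capture_output=True, text=True, check=True,
    )
    return parse_openssldir(result.stdout)


def has_individual_certs(openssl_dir: str, minimum: int = 2) -> bool:
    """Return True if <openssl_dir>/certs holds at least `minimum` .pem files.

    The threshold is an assumption; the article only says "many" files.
    """
    certs = Path(openssl_dir) / "certs"
    return certs.is_dir() and len(list(certs.glob("*.pem"))) >= minimum


# Example (not executed here, since it needs the openssl binary):
#   openssl_dir = query_openssldir()
#   print(openssl_dir, has_individual_certs(openssl_dir))
```

If has_individual_certs returns True for /usr/lib/ssl, no further action should be needed. Otherwise, an SSL environment variable has to be set.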
Here are some example environment variables to configure per OpenSSL directory.
- OPENSSLDIR is /opt/ssl. There's a certs subdirectory with many .pem files. Set the environment variable SSL_CERT_DIR to point at /opt/ssl/certs before using the Speech SDK. For example:
  export SSL_CERT_DIR=/opt/ssl/certs
- OPENSSLDIR is /etc/pki/tls (like on RHEL/CentOS based systems). There's a certs subdirectory with a certificate bundle file, for example ca-bundle.crt. Set the environment variable SSL_CERT_FILE to point at that file before using the Speech SDK. For example:
  export SSL_CERT_FILE=/etc/pki/tls/certs/ca-bundle.crt
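The two cases above follow a simple rule: many individual files call for SSL_CERT_DIR, and a single bundle calls for SSL_CERT_FILE. Here is a minimal Python sketch of that decision. The function name choose_cert_env is hypothetical, and treating the first .crt file in the certs subdirectory as the bundle is an assumption made for illustration.

```python
from pathlib import Path

# Location for which the article says no variable is needed when
# individual certificate files are present.
DEFAULT_OPENSSLDIR = "/usr/lib/ssl"


def choose_cert_env(openssl_dir: str) -> dict:
    """Suggest the SSL environment variable for a given OPENSSLDIR.

    Returns an empty dict when no variable appears to be needed or
    when no certificates are found.
    """
    certs = Path(openssl_dir) / "certs"
    # Several .pem files: treat the directory as individual certificates.
    if len(list(certs.glob("*.pem"))) > 1:
        if Path(openssl_dir) == Path(DEFAULT_OPENSSLDIR):
            return {}
        return {"SSL_CERT_DIR": str(certs)}
    # Otherwise look for a bundle file (assumption: it ends in .crt).
    bundles = sorted(certs.glob("*.crt"))
    if bundles:
        return {"SSL_CERT_FILE": str(bundles[0])}
    return {}
```

One possible use is os.environ.update(choose_cert_env("/etc/pki/tls")) at the very start of a program, before the Speech SDK is imported. Setting the variable in the shell that launches the process is the more conservative option, because it doesn't depend on when the SDK reads its environment.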
Certificate revocation checks
When the Speech SDK connects to the Speech Service, it checks the Transport Layer Security (TLS/SSL) certificate. The Speech SDK verifies that the certificate reported by the remote endpoint is trusted and hasn't been revoked. This verification provides a layer of protection against attacks involving spoofing and other related vectors. The check is accomplished by retrieving a certificate revocation list (CRL) from a certificate authority (CA) used by Azure. A list of Azure CA download locations for updated TLS CRLs can be found in this document.
If a destination posing as the Speech Service reports a certificate that's been revoked in a retrieved CRL, the SDK will terminate the connection and report an error via a Canceled event. The authenticity of a reported certificate can't be checked without an updated CRL. Therefore, the Speech SDK will also treat a failure to download a CRL from an Azure CA location as an error.
Large CRL files (>10 MB)
One cause of CRL-related failures is the use of large CRL files. This class of error is typically only applicable to special environments with extended CA chains. Standard public endpoints shouldn't encounter this class of issue.
The default maximum CRL size used by the Speech SDK (10 MB) can be adjusted per config object. The property key for this adjustment is CONFIG_MAX_CRL_SIZE_KB and the value, specified as a string, is by default "10000" (10 MB). For example, when creating a SpeechRecognizer object (that manages a connection to the Speech Service), you can set this property in its SpeechConfig. In the snippet below, the configuration is adjusted to permit a CRL file size up to 15 MB.
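As a minimal Python sketch of this adjustment, the helpers below build the property pair and apply it to a config object. The helper names are hypothetical. The sketch assumes the config object exposes set_property_by_name, as SpeechConfig does in the Speech SDK for Python, and that 1 MB corresponds to 1000 KB, in line with the "10000" default for 10 MB.

```python
def max_crl_size_property(max_mb: int) -> tuple:
    """Return the (key, value) pair for a CRL size limit given in MB.

    The article equates "10000" with 10 MB, so 1 MB is taken as 1000 KB.
    The value is a string because the property is set as a string.
    """
    if max_mb <= 0:
        raise ValueError("max_mb must be positive")
    return "CONFIG_MAX_CRL_SIZE_KB", str(max_mb * 1000)


def apply_max_crl_size(speech_config, max_mb: int) -> None:
    """Set the limit on any config object exposing set_property_by_name."""
    key, value = max_crl_size_property(max_mb)
    speech_config.set_property_by_name(key, value)


# Usage with the Speech SDK for Python (needs the
# azure-cognitiveservices-speech package; key and region are placeholders):
#
#   import azure.cognitiveservices.speech as speechsdk
#   speech_config = speechsdk.SpeechConfig(
#       subscription="YourSubscriptionKey", region="YourServiceRegion")
#   apply_max_crl_size(speech_config, 15)  # permits CRL files up to 15 MB
#   recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
```

The property has to be set before the recognizer is created, because the recognizer takes its connection settings from the config object.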
Bypassing or ignoring CRL failures
If an environment can't be configured to access an Azure CA location, the Speech SDK will never be able to retrieve an updated CRL. You can configure the SDK either to continue and log download failures or to bypass all CRL checks.
CRL checks are a security measure and bypassing them increases susceptibility to attacks. They should not be bypassed without thorough consideration of the security implications and alternative mechanisms for protecting against the attack vectors that CRL checks mitigate.
To continue with the connection when a CRL can't be retrieved, set the corresponding property to "true". An attempt will still be made to retrieve a CRL and failures will still be emitted in logs, but connection attempts will be allowed to continue.
To turn off certificate revocation checks, set the corresponding property to "true". Then, while connecting to the Speech Service, there will be no attempt to check or download a CRL and no automatic verification of a reported TLS/SSL certificate.
CRL caching and performance
By default, the Speech SDK will cache a successfully downloaded CRL on disk to improve the initial latency of future connections. When no cached CRL is present or when the cached CRL is expired, a new list will be downloaded.
Some Linux distributions don't have a TMPDIR environment variable defined, so the Speech SDK won't cache downloaded CRLs. Without a TMPDIR environment variable defined, the Speech SDK will download a new CRL for each connection. To improve initial connection performance in this situation, you can create a TMPDIR environment variable and set it to the accessible path of a temporary directory.
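A minimal Python sketch of this setup follows. The function name ensure_tmpdir is hypothetical, and falling back to tempfile.gettempdir() is an illustrative choice rather than an SDK requirement. The sketch assumes it runs before the Speech SDK opens its first connection.

```python
import os
import tempfile


def ensure_tmpdir(env=os.environ) -> str:
    """Make sure TMPDIR is defined and names a writable directory.

    If TMPDIR is unset or empty, fall back to tempfile.gettempdir(),
    which typically resolves to /tmp on Linux systems.
    """
    path = env.get("TMPDIR")
    if not path:
        path = tempfile.gettempdir()
        env["TMPDIR"] = path
    # A cache location is only useful if files can be written there.
    if not (os.path.isdir(path) and os.access(path, os.W_OK)):
        raise RuntimeError(f"TMPDIR is not a writable directory: {path}")
    return path
```

Calling ensure_tmpdir() at program start leaves an existing TMPDIR untouched and defines one otherwise. Setting TMPDIR in the shell or service definition that launches the process has the same effect and doesn't depend on call order.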