How can I translate database content with Azure Translator using ADF or a Synapse notebook?
There is an Azure SQL Database table. Some of its columns need to be translated from one language to another into additional columns, for example from English to Spanish, or from Portuguese to English.
I am exploring how I can use ADF or a Synapse notebook to leverage Azure Translator to achieve this.
There are more tables with the same requirement.
I am open to all other possible options.
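For the table-translation goal itself, one option outside Spark is to call the Translator REST API directly from a notebook and write the results back as additional columns. Below is a minimal, standard-library-only sketch: the endpoint, header names, and v3 response shape follow the Translator documentation, while the function names and batching are hypothetical illustration.

```python
import json
import urllib.parse
import urllib.request

# Global (non-regional) Translator endpoint from the service documentation.
ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"

def build_body(texts):
    # Translator v3 takes a JSON array of {"text": ...} objects
    # (up to 100 items per request).
    return [{"text": t} for t in texts]

def parse_translations(response_json):
    # One result object per input item; "translations" holds one entry
    # per requested target language.
    return [item["translations"][0]["text"] for item in response_json]

def translate_batch(texts, from_lang, to_lang, key, region):
    # key/region come from the Translator resource (e.g. via Key Vault).
    query = urllib.parse.urlencode(
        {"api-version": "3.0", "from": from_lang, "to": to_lang}
    )
    request = urllib.request.Request(
        ENDPOINT + "?" + query,
        data=json.dumps(build_body(texts)).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Ocp-Apim-Subscription-Region": region,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_translations(json.load(response))
```

The translated values could then be written back into the additional columns with an ordinary UPDATE or a staging-table upsert, batching rows to stay within the per-request limits.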
Azure SQL Database
Azure Translator
Azure Synapse Analytics
Azure Data Factory
-
BhargavaGunnam-MSFT 26,306 Reputation points • Microsoft Employee
2024-04-22T17:46:27.6133333+00:00 Hello Dataholic,
A similar question was answered here: https://learn.microsoft.com/en-us/answers/questions/1601976/how-to-translate-the-data-with-azure-ai-translator
You can leverage the Azure Cognitive Services Translator API to translate text from one language to another.
The API supports a wide range of languages, including English, Spanish, French, Chinese, and many others.
https://learn.microsoft.com/en-us/azure/ai-services/translator/
I hope this answers your question.
Please let me know if you have any further questions.
-
Dataholic 80 Reputation points
2024-04-23T15:02:08.3066667+00:00 I found a SynapseML example from Microsoft.
This looks like what we need!
However, when I followed the steps and ran the sample code, I received an error message.
Can you help take a look?
Here is one of the code samples:

```python
# imports added for completeness; the SynapseML module path can vary by version
from pyspark.sql.functions import col
from synapse.ml.cognitive import Detect

detectDf = spark.createDataFrame(
    [(["Hello, what is your name?"],)],
    ["text"],
)

detect = (
    Detect()
    .setLinkedService(ai_service_name)
    .setTextCol("text")
    .setOutputCol("result")
)

display(
    detect.transform(detectDf)
    .withColumn("language", col("result.language"))
    .select("language")
)
```
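As an aside, since the end goal here is translation rather than language detection, SynapseML also ships a `Translate` transformer that follows the same linked-service pattern. A hedged sketch, assuming the same `ai_service_name` linked service and a SynapseML version that exposes `Translate` under `synapse.ml.cognitive` (this only runs on a Synapse Spark pool with a reachable Translator endpoint, so treat it as untested):

```python
# Hedged sketch: translate a text column with SynapseML's Translate
# transformer. Assumes the same ai_service_name linked service as above;
# the module path may differ across SynapseML versions.
from pyspark.sql.functions import col, flatten
from synapse.ml.cognitive import Translate

df = spark.createDataFrame(
    [(["Hello, what is your name?"],)],
    ["text"],
)

translate = (
    Translate()
    .setLinkedService(ai_service_name)
    .setTextCol("text")
    .setToLanguage(["es"])          # one or more target languages
    .setOutputCol("translation")
    .setConcurrency(5)
)

display(
    translate.transform(df)
    .withColumn("translation", flatten(col("translation.translations")))
    .withColumn("translation", col("translation.text"))
    .select("translation")
)
```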
Here is the error message:

```
Py4JJavaError                             Traceback (most recent call last)
Cell In [81], line 10
```
-
Dataholic 80 Reputation points
2024-04-23T15:07:24.57+00:00 not sure if all the error message show, just incase, I past it here again
```
Py4JJavaError                             Traceback (most recent call last)
Cell In [81], line 10
      1 detectDf = spark.createDataFrame([
      2     (["Hello, what is your name?"],)
      3 ], ["text",])
      5 detect = (Detect()
      6     .setLinkedService(ai_service_name)
      7     .setTextCol("text")
      8     .setOutputCol("result"))
---> 10 display(detect
     11     .transform(detectDf)
     12     .withColumn("language", col("result.language"))
     13     .select("language"))

File ~/cluster-env/env/lib/python3.10/site-packages/notebookutils/visualization/display.py:242, in display(data, summary)
    239 success = False
    240 log4jLogger \
    241     .error(f"display failed with error, language: python, error: {err}, correlationId={correlation_id}")
--> 242 raise err
    243 finally:
    244 duration_ms = ceil((time.time() - start_time) * 1000)

File ~/cluster-env/env/lib/python3.10/site-packages/notebookutils/visualization/display.py:222, in display(data, summary)
    218 if is_ipython_enabled(runtime.host_nbutils_version):
    219     # pylint: disable=C0415
    220     from IPython.display import publish_display_data
    221     publish_display_data({
--> 222         "application/vnd.synapse.display-widget+json": sc._jvm.display.getDisplayResultForIPython(
    223             df._jdf, summary, correlation_id)
    224     })
    225 else:
    226     print(sc._jvm.display.getDisplayResult(df._jdf, summary))

File ~/cluster-env/env/lib/python3.10/site-packages/py4j/java_gateway.py:1321, in JavaMember.__call__(self, *args)
   1315 command = proto.CALL_COMMAND_NAME +\
   1316     self.command_header +\
   1317     args_command +\
   1318     proto.END_COMMAND_PART
   1320 answer = self.gateway_client.send_command(command)
-> 1321 return_value = get_return_value(
   1322     answer, self.gateway_client, self.target_id, self.name)
   1324 for temp_arg in temp_args:
   1325     temp_arg._detach()

File /opt/spark/python/lib/pyspark.zip/pyspark/sql/utils.py:190, in capture_sql_exception.<locals>.deco(*a, **kw)
    188 def deco(*a: Any, **kw: Any) -> Any:
    189     try:
--> 190         return f(*a, **kw)
    191     except Py4JJavaError as e:
    192         converted = convert_exception(e.java_exception)

File ~/cluster-env/env/lib/python3.10/site-packages/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
    324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325 if answer[1] == REFERENCE_TYPE:
--> 326     raise Py4JJavaError(
    327         "An error occurred while calling {0}{1}{2}.\n".
    328         format(target_id, ".", name), value)
    329 else:
    330     raise Py4JError(
    331         "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
    332         format(target_id, ".", name, value))

Py4JJavaError: An error occurred while calling z:com.microsoft.spark.notebook.visualization.display.getDisplayResultForIPython.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 35.0 failed 4 times, most recent failure: Lost task 2.3 in stage 35.0 (TID 125) (vm-15992475 executor 2): org.apache.http.conn.ConnectTimeoutException: Connect to api.cognitive.microsofttranslator.com:443 [api.cognitive.microsofttranslator.com/20.49.96.128] failed: connect timed out
	at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
	at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
	at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
	at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
	at com.microsoft.azure.synapse.ml.io.http.HandlingUtils$.sendWithRetries(HTTPClients.scala:93)
	at com.microsoft.azure.synapse.ml.io.http.HandlingUtils$.advanced(HTTPClients.scala:157)
	at com.microsoft.azure.synapse.ml.io.http.HandlingUtils$.$anonfun$advancedUDF$1(HTTPClients.scala:170)
	at org.apache.spark.injections.UDFUtils$$anon$2.call(UDFUtils.scala:29)
	at org.apache.spark.sql.functions$.$anonfun$udf$93(functions.scala:5325)
	at org.apache.spark.injections.UDFUtils$$anon$2.call(UDFUtils.scala:29)
	at org.apache.spark.sql.functions$.$anonfun$udf$93(functions.scala:5325)
	at org.apache.spark.injections.UDFUtils$$anon$2.call(UDFUtils.scala:29)
	at org.apache.spark.sql.functions$.$anonfun$udf$93(functions.scala:5325)
	at com.microsoft.azure.synapse.ml.io.http.SingleThreadedHTTPClient.$anonfun$handle$2(HTTPClients.scala:198)
	at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:57)
	at scala.concurrent.package$.blocking(package.scala:146)
	at com.microsoft.azure.synapse.ml.io.http.SingleThreadedHTTPClient.handle(HTTPClients.scala:198)
	at com.microsoft.azure.synapse.ml.io.http.HTTPClient.$anonfun$sendRequestWithContext$1(HTTPClients.scala:59)
	at scala.Option.map(Option.scala:230)
	at com.microsoft.azure.synapse.ml.io.http.HTTPClient.sendRequestWithContext(HTTPClients.scala:58)
	at com.microsoft.azure.synapse.ml.io.http.HTTPClient.sendRequestWithContext$(HTTPClients.scala:57)
	at com.microsoft.azure.synapse.ml.io.http.SingleThreadedHTTPClient.sendRequestWithContext(HTTPClients.scala:194)
	at com.microsoft.azure.synapse.ml.io.http.SingleThreadedClient.$anonfun$sendRequestsWithContext$1(Clients.scala:43)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:764)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:400)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:57)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:366)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:330)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:136)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: java.net.SocketTimeoutException: connect timed out
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:607)
	at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:368)
	at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
	... 49 more

Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2682)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2618)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2617)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2617)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1190)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1190)
	at scala.Option.foreach(Option.scala:407)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1190)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2870)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2812)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2801)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:958)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2398)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2419)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2438)
	at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:542)
	at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:495)
	at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:48)
	at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:3894)
	at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:2876)
	at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:3884)
	at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:626)
	at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3882)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:111)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:183)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:97)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66)
	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3882)
	at org.apache.spark.sql.Dataset.head(Dataset.scala:2876)
	at org.apache.spark.sql.Dataset.take(Dataset.scala:3097)
	at org.apache.spark.sql.GetRowsHelper$.getRowsInJsonString(GetRowsHelper.scala:51)
	at com.microsoft.spark.notebook.visualization.display$.generateTableConfig(Display.scala:452)
	at com.microsoft.spark.notebook.visualization.display$.exec(Display.scala:270)
	at com.microsoft.spark.notebook.visualization.display$.getDisplayResultInternal(Display.scala:197)
	at com.microsoft.spark.notebook.visualization.display$.getDisplayResultForIPython(Display.scala:113)
	at com.microsoft.spark.notebook.visualization.display.getDisplayResultForIPython(Display.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to api.cognitive.microsofttranslator.com:443 [api.cognitive.microsofttranslator.com/20.49.96.128] failed: connect timed out
	at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
	at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
	at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
	at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
	at com.microsoft.azure.synapse.ml.io.http.HandlingUtils$.sendWithRetries(HTTPClients.scala:93)
	at com.microsoft.azure.synapse.ml.io.http.HandlingUtils$.advanced(HTTPClients.scala:157)
	at com.microsoft.azure.synapse.ml.io.http.HandlingUtils$.$anonfun$advancedUDF$1(HTTPClients.scala:170)
	at org.apache.spark.injections.UDFUtils$$anon$2.call(UDFUtils.scala:29)
	at org.apache.spark.sql.functions$.$anonfun$udf$93(functions.scala:5325)
	at org.apache.spark.injections.UDFUtils$$anon$2.call(UDFUtils.scala:29)
	at org.apache.spark.sql.functions$.$anonfun$udf$93(functions.scala:5325)
	at org.apache.spark.injections.UDFUtils$$anon$2.call(UDFUtils.scala:29)
	at org.apache.spark.sql.functions$.$anonfun$udf$93(functions.scala:5325)
	at com.microsoft.azure.synapse.ml.io.http.SingleThreadedHTTPClient.$anonfun$handle$2(HTTPClients.scala:198)
	at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:57)
	at scala.concurrent.package$.blocking(package.scala:146)
	at com.microsoft.azure.synapse.ml.io.http.SingleThreadedHTTPClient.handle(HTTPClients.scala:198)
	at com.microsoft.azure.synapse.ml.io.http.HTTPClient.$anonfun$sendRequestWithContext$1(HTTPClients.scala:59)
	at scala.Option.map(Option.scala:230)
	at com.microsoft.azure.synapse.ml.io.http.HTTPClient.sendRequestWithContext(HTTPClients.scala:58)
	at com.microsoft.azure.synapse.ml.io.http.HTTPClient.sendRequestWithContext$(HTTPClients.scala:57)
	at com.microsoft.azure.synapse.ml.io.http.SingleThreadedHTTPClient.sendRequestWithContext(HTTPClients.scala:194)
	at com.microsoft.azure.synapse.ml.io.http.SingleThreadedClient.$anonfun$sendRequestsWithContext$1(Clients.scala:43)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:764)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:400)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:57)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:366)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:330)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:136)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	... 1 more
Caused by: java.net.SocketTimeoutException: connect timed out
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:607)
	at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:368)
	at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
	... 49 more
```
-
BhargavaGunnam-MSFT 26,306 Reputation points • Microsoft Employee
2024-04-23T19:07:17.5833333+00:00 From the error message, it looks like there was a connection timeout when trying to reach the Microsoft Translator API. Can you please check your linked service configuration values?
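One quick way to narrow down a timeout like this is to test raw TCP reachability from the Spark pool itself, since the stack trace fails before any authentication happens. A minimal, standard-library-only sketch; `can_connect` is a hypothetical helper (not part of any Azure SDK), and the host below is simply the endpoint named in the stack trace:

```python
import socket

def can_connect(host, port, timeout=5.0):
    # Returns True if a TCP connection to host:port succeeds within
    # `timeout` seconds, False on refusal or timeout. A False here for
    # the Translator endpoint points at networking (private endpoint /
    # DNS / firewall) rather than keys or linked-service settings.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run inside the Synapse notebook to see whether the Spark pool can
# reach the endpoint from the stack trace at all:
# print(can_connect("api.cognitive.microsofttranslator.com", 443))
```

If this returns False from the pool while the managed private endpoint shows "approved", the vnet/DNS path between the pool and the Translator resource would be the place to look.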
-
Dataholic 80 Reputation points
2024-04-23T21:05:04.5866667+00:00 What kind of config values should I check?
The pricing tier is pay-as-you-go, accessed over a managed private endpoint. The connection status is "approved".
What else should I check? And how can I test the linked service?
Thanks!
-
BhargavaGunnam-MSFT 26,306 Reputation points • Microsoft Employee
2024-04-23T22:38:43.5266667+00:00 Can you please check the Azure Cognitive Services endpoint, the AKV linked service connection, and sec
-
Dataholic 80 Reputation points
2024-04-24T12:56:22.5533333+00:00 Here is the screenshot reference,
-
Dataholic 80 Reputation points
2024-04-24T14:20:07.3733333+00:00 repeat
-
BhargavaGunnam-MSFT 26,306 Reputation points • Microsoft Employee
2024-04-24T18:55:13.6866667+00:00 Do you see the secret name as a dropdown? I see you clicked "Edit" and then manually entered the name. Can you please confirm this?
1 answer
-
Dataholic 80 Reputation points
2024-04-24T21:55:33.4933333+00:00 No, it is not a dropdown.
-
BhargavaGunnam-MSFT 26,306 Reputation points • Microsoft Employee
2024-04-25T17:47:22.9833333+00:00 Did you get any error message like "loading failed"?
If yes, can you click on "more" and check the error message?
-
Dataholic 80 Reputation points
2024-04-25T19:09:15.17+00:00 No, no error message
-
BhargavaGunnam-MSFT 26,306 Reputation points • Microsoft Employee
2024-04-26T00:22:11.5966667+00:00 Can you select the secret name from the dropdown without clicking "Edit"?
-
Dataholic 80 Reputation points
2024-04-26T13:17:09.4266667+00:00 Yes, the secret name can be selected without "Edit" checked.
-
BhargavaGunnam-MSFT 26,306 Reputation points • Microsoft Employee
2024-04-26T16:31:02.7966667+00:00 Thanks for the details.
Sorry, I couldn't reproduce the error on my end. I suggest filing a support request so that a support engineer can further troubleshoot the issue during a screen-sharing session. If you don't have a support plan, please let me know. I can provide a one-time free support request to work on this case.
I look forward to hearing from you.
-
Dataholic 80 Reputation points
2024-04-29T13:38:44.4466667+00:00 If you can provide me with a one-time free support request, that would be great. Thanks a lot for your help!
-
BhargavaGunnam-MSFT 26,306 Reputation points • Microsoft Employee
2024-04-29T16:33:47.5733333+00:00 Hello Dataholic,
Thank you. I have sent a private message. Please check and provide the requested details.