Copy Activity - BigQuery to Azure SQL Database - GHighThroughputApiError

Gabriel Levaillant 35 Reputation points
2023-07-09T12:34:04.4733333+00:00

I am using a Copy Activity to copy data from BigQuery to Azure SQL Database and I get this error:

Failure happened on 'Source' side. ErrorCode=UserErrorOdbcOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=ERROR [HY000] [Microsoft][DSI] An error occurred while attempting to retrieve the error message for key 'GHighThroughputApiError' with message parameters ['Error: { code: UNKNOWN, message: Stream removed }'] and component ID 100: Message not found in file "D:\batch\tasks\shared\ODBC Drivers\Microsoft Google BigQuery ODBC Driver_2.4.5.1014\lib..\ErrorMessages\en-US\SimbaBigQueryODBCMessages.xml",Source=Microsoft.DataTransfer.ClientLibrary.Odbc.OdbcConnector,''Type=System.Data.Odbc.OdbcException,Message=ERROR [HY000] [Microsoft][DSI] An error occurred while attempting to retrieve the error message for key 'GHighThroughputApiError' with message parameters ['Error: { code: UNKNOWN, message: Stream removed }'] and component ID 100: Message not found in file "D:\batch\tasks\shared\ODBC Drivers\Microsoft Google BigQuery ODBC Driver_2.4.5.1014\lib..\ErrorMessages\en-US\SimbaBigQueryODBCMessages.xml",Source=Microsoft ODBC Driver for Google BigQuery,'

Any idea where it could come from?

PS: I can copy from BigQuery to Azure Data Lake and then from Azure Data Lake to SQL; it works 100% of the time this way.
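For reference, the working two-step route can be expressed as a single pipeline with two chained Copy activities. A minimal sketch, assuming hypothetical dataset names (BigQueryTableDs, LakeParquetDs, AzureSqlTableDs) and an illustrative source query:

```json
{
  "name": "CopyBigQueryToSqlStaged",
  "properties": {
    "activities": [
      {
        "name": "BigQueryToLake",
        "type": "Copy",
        "inputs": [ { "referenceName": "BigQueryTableDs", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "LakeParquetDs", "type": "DatasetReference" } ],
        "typeProperties": {
          "source": { "type": "GoogleBigQuerySource", "query": "SELECT * FROM mydataset.mytable" },
          "sink": { "type": "ParquetSink" }
        }
      },
      {
        "name": "LakeToAzureSql",
        "type": "Copy",
        "dependsOn": [ { "activity": "BigQueryToLake", "dependencyConditions": [ "Succeeded" ] } ],
        "inputs": [ { "referenceName": "LakeParquetDs", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "AzureSqlTableDs", "type": "DatasetReference" } ],
        "typeProperties": {
          "source": { "type": "ParquetSource" },
          "sink": { "type": "AzureSqlSink" }
        }
      }
    ]
  }
}
```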

Best,

Gabriel


Accepted answer
  1. ShadowWalker 90 Reputation points
    2023-10-05T12:52:32.6366667+00:00

    Hi,

    I have been facing the same issue, and it has started to appear more often (almost daily on our BQ copy jobs).

    GCP Support: After talking to GCP support, they pointed out that the ODBC driver used by the BigQuery connector in ADF is outdated. According to them, a new version (3.0.2.1005) was released in July 2023, whereas Microsoft is still using 2.4.5.1014. This may be the cause. I don't know how often Microsoft updates the drivers for serverless instances. On top of that, Microsoft doesn't let us create a support ticket from within the portal: somehow the automatic troubleshooter finds no problem for the pipeline RunID and terminates the request. This is absolutely bizarre.

    For context: Copying data from BigQuery to Azure goes through the ODBC driver, a piece of software written by Google as an efficient interface to BigQuery. That driver has to run on a server, which uses it to load the data and then writes it into the destination; that server side is operated by Microsoft Azure.
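    Until the driver is updated, one mitigation is to let ADF retry the Copy activity automatically, since the failure is intermittent. A minimal sketch of the activity's policy block (the retry count and interval are illustrative, not prescriptive):

    ```json
    {
      "name": "CopyFromBigQuery",
      "type": "Copy",
      "policy": {
        "retry": 3,
        "retryIntervalInSeconds": 120
      }
    }
    ```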

    3 people found this answer helpful.

4 additional answers

  1. Konstantinos Passadis 19,586 Reputation points MVP
    2023-07-09T12:48:46.5633333+00:00

    Hello @Gabriel Levaillant!

    Welcome to Microsoft Q&A!

    I understand that you are facing an issue copying data from BigQuery to Azure SQL Database.

    Please review these items:

    - Verify that Azure SQL allows BigQuery connections, and also verify the firewall rules on Azure SQL.
    - Once you are 100% sure about authentication and authorization, try with smaller amounts of data (see the sketch after this list).
    - Limitations on the API could plausibly lead to this message.
    - Verify that the data itself matches the source and destination configs, and avoid using special characters.
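    For the smaller-data test mentioned above, you can temporarily override the Copy activity's source with a bounded query. A sketch, with a hypothetical table name:

    ```json
    {
      "source": {
        "type": "GoogleBigQuerySource",
        "query": "SELECT * FROM mydataset.mytable LIMIT 10000"
      }
    }
    ```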

    Let's start with these and send us your feedback.

    The fact that the Data Lake route works points to firewall and/or auth issues.

    Links :

    https://learn.microsoft.com/en-us/azure/data-factory/connector-google-bigquery?tabs=data-factory

    https://www.sqlshack.com/link-google-bigquery-resources-into-an-azure-sql-resource/


    I hope this helps!

    Kindly mark the answer as Accepted and Upvote in case it helped!

    Regards


  2. Konstantinos Passadis 19,586 Reputation points MVP
    2023-07-09T16:58:51.36+00:00

    Hello @Gabriel Levaillant!

    Thank you for the update.

    I have found the link below, which references the Simba drivers:

    D:\batch\tasks\shared\ODBC Drivers\Microsoft Google BigQuery ODBC Driver_2.4.5.1014\lib..\ErrorMessages\en-US\SimbaBigQueryODBCMessages.xml

    https://stackoverflow.com/questions/59368276/errors-when-using-bigquery-storage-api#:~:text=Do%20you%20set%20the%20values%20of%20Minimum%20Query,number%20of%20table%20operations%20per%20day%20is%201%2C000.

    Also, from the BigQuery docs regarding ODBC:

    Known issues and FAQ

    Can I use these drivers to ingest or export data between BigQuery and my existing environment?

    These drivers leverage the query interface for BigQuery and don't provide functionality to leverage BigQuery's large scale ingestion mechanisms or export functionality.

    While you can use DML to issue small volumes of INSERT requests, it is subject to the limits on DML.
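    In other words, bulk movement out of BigQuery is meant to go through its native extract jobs rather than the ODBC query interface. A minimal sketch of an extract job configuration for the BigQuery Jobs API, with hypothetical project, dataset, and bucket names:

    ```json
    {
      "configuration": {
        "extract": {
          "sourceTable": {
            "projectId": "my-project",
            "datasetId": "mydataset",
            "tableId": "mytable"
          },
          "destinationUris": [ "gs://my-bucket/export/part-*.parquet" ],
          "destinationFormat": "PARQUET"
        }
      }
    }
    ```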

    Also:

    https://community.qlik.com/t5/Official-Support-Articles/Random-GHighThroughputApiError-error-when-reloading-data-from/ta-p/1983458

    I think we are looking at a BigQuery problem with these ODBC drivers.

    In case you can export more logs, please do; otherwise, my suggestion is to open a case with GCP!

    I hope this helps!

    Kindly mark the answer as Accepted and Upvote in case it helped!

    Regards


  3. SSingh-MSFT 16,371 Reputation points Moderator
    2023-07-10T04:56:10.2933333+00:00

    Hi Gabriel Levaillant,

    Welcome to Microsoft Q&A forum and thanks for using Azure Services.

    We are sorry about the inconvenience you are facing.

    In addition to the suggestions by Konstantinos Passadis, I would recommend raising a request with the GCP team, as the error on the Source side (GCP BigQuery 'GHighThroughputApiError') seems to be a known intermittent issue, as mentioned here:

    https://issuetracker.google.com/issues/240305081?pli=1

    https://community.qlik.com/t5/Official-Support-Articles/Random-GHighThroughputApiError-error-when-reloading-data-from/ta-p/1983458

    You may try creating the issue in the GCP forum https://issuetracker.google.com/issues/new to get assistance.

    Hope this helps. Do let us know if you have any further questions regarding the pipeline or Azure; we are happy to help.

    Thank you.


  4. Mirek Chorąży 0 Reputation points
    2023-09-08T12:07:42.6666667+00:00

    Hi,

    I have the same problem.

    I am using a Copy Activity to copy data from BigQuery to Azure SQL Database.

    I am using AutoResolveIntegrationRuntime (Azure Public type).

    I get this error:

    Operation on target ... failed: Failure happened on 'Source' side. ErrorCode=UserErrorOdbcOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=ERROR [HY000] [Microsoft][DSI] An error occurred while attempting to retrieve the error message for key 'GHighThroughputApiError' with message parameters ['Error: { code: UNKNOWN, message: Stream removed }'] and component ID 100: Message not found in file "D:\batch\tasks\shared\ODBC Drivers\Microsoft Google BigQuery ODBC Driver_2.4.5.1014\lib..\ErrorMessages\en-US\SimbaBigQueryODBCMessages.xml",Source=Microsoft.DataTransfer.ClientLibrary.Odbc.OdbcConnector,''Type=System.Data.Odbc.OdbcException,Message=ERROR [HY000] [Microsoft][DSI] An error occurred while attempting to retrieve the error message for key 'GHighThroughputApiError' with message parameters ['Error: { code: UNKNOWN, message: Stream removed }'] and component ID 100: Message not found in file "D:\batch\tasks\shared\ODBC Drivers\Microsoft Google BigQuery ODBC Driver_2.4.5.1014\lib..\ErrorMessages\en-US\SimbaBigQueryODBCMessages.xml",Source=Microsoft ODBC Driver for Google BigQuery,'

    I will add that the task usually fails when it runs as part of the scheduled pipeline, but when started manually it usually succeeds.

    What could be the problem?

