Azure data factory link on-premise data pipeline

Han Shih 施學翰 146 Reputation points
2022-01-03T09:20:55.377+00:00

I have lots of data pipelines already running on local VMs, as well as serverless solutions on other cloud services (GCP, AWS, ...).

Is it possible to manage these pipelines using Data Factory as an overall solution, to check whether any pipeline fails?
(with the least modification)

Thanks

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. Niels 236 Reputation points
    2022-01-03T09:38:37.527+00:00

    You mention that you have data pipelines running on local VMs. What do you mean by this?

    • You have created pipelines to copy data from lots of different VMs to Azure? Or...
    • You have installed data gateways on lots of different VMs to copy data from each one of them?

    If you mean data gateways, my advice is to rethink your architecture. To get data from an on-premises environment, you only need to install the gateway on one VM in your on-premises domain network. Through this gateway you should be able to reach all other servers and resources (SQL, files, etc.) in that network, so there's no need to install the data gateway on each VM that contains data resources. If you have lots of servers and data to copy, you might use one or more dedicated data gateway VMs that don't do anything else.

    If you mean you have lots of pipelines whose failures you want to monitor, check the Monitor section in ADF Studio and then Metrics & Alerts. There you can create alerts that fire when pipelines fail and send notifications to an action group. Use the "Failed pipeline runs" metric to detect failed pipeline runs, and the "Integration runtime available node count" metric to monitor the availability of your data gateways.

    Alternatively, if you want someone who has the appropriate Azure permissions, but no access to ADF Studio, to create alerts, you can also add them in the regular Azure portal: go to the ADF resource > Monitoring > Alerts, where the same alerts can be created.

    Check this link for creating alerts using ADF Studio: https://azure.microsoft.com/en-us/blog/create-alerts-to-proactively-monitor-your-data-factory-pipelines/
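    For reference, the same "Failed pipeline runs" alert can also be scripted instead of clicked together in the portal. A minimal sketch using the Azure CLI, where the factory name, resource group, subscription ID, and action group are placeholders you would replace with your own:

    ```shell
    # Create a metric alert on the Data Factory metric "PipelineFailedRuns"
    # that fires whenever any pipeline run fails in a 5-minute window.
    # All resource names below are placeholders, not real resources.
    az monitor metrics alert create \
      --name "adf-pipeline-failures" \
      --resource-group "my-rg" \
      --scopes "/subscriptions/<sub-id>/resourceGroups/my-rg/providers/Microsoft.DataFactory/factories/my-adf" \
      --condition "total PipelineFailedRuns > 0" \
      --window-size 5m \
      --evaluation-frequency 5m \
      --action "/subscriptions/<sub-id>/resourceGroups/my-rg/providers/microsoft.insights/actionGroups/my-action-group" \
      --description "Alert when any ADF pipeline run fails"
    ```

    The action group referenced by `--action` decides what actually happens on failure (email, SMS, webhook, etc.), so the same alert definition can notify different teams just by swapping the action group.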

    1 person found this answer helpful.

1 additional answer

  1. Han Shih 施學翰 146 Reputation points
    2022-01-04T04:07:28.753+00:00

    @Niels thanks for your reply.

    What I mean is: I have lots of pipelines currently running on local machines, plus serverless solutions on GCP and AWS. I see that ADF is a pipeline orchestration solution, so I am trying to find a way to manage all pipelines from these different sources in a single service (ADF) — e.g. to detect that a pipeline execution failed, or that a pipeline did not execute (or did not execute in order), etc.

    I checked the link you provided; it seems that I should link all my pipelines to ADF so that I can use its monitoring features.

    Correct me if I am wrong.

