How to connect and query with MySQL server using Azure Machine Learning

Nguyễn Thanh Tú 170 Reputation points
2024-07-23T07:30:11.6466667+00:00

Hi,

I am currently working on a project where I need to connect to an Azure Database for MySQL using Azure Machine Learning (AML) and Python scripts. However, I want to use AML resources such as Datastore and Dataset for the connection rather than conventional methods like SQLAlchemy or pyodbc.

I've tried the Dataset.Tabular.from_sql_query() method, but it does not support MySQL servers.

Are there any possible methods that could apply to my issue?

Tags: Azure Machine Learning, Azure Storage Accounts, Azure Database for MySQL, Azure Data Factory, Azure Database for PostgreSQL

Accepted answer
  1. dupammi 8,535 Reputation points Microsoft Vendor
    2024-07-29T10:24:53.61+00:00

    Hi @Tú Nguyễn

    Here’s how you can connect to and query a MySQL database using Python scripts with the Azure Machine Learning SDK v2:

    1. Register the MySQL Database as a Datastore: Dataset.Tabular.from_sql_query() does not support MySQL, so a custom approach is needed. Set up your MySQL connection details and, if required, register the server as a datastore (see the sketch after this list).
    2. Create a Python Script for Data Retrieval: Use a library such as mysql-connector-python or pymysql to connect to your MySQL database and run queries, and write a script that retrieves the data you need.
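
    For step 1, note that the v2 SDK does not expose a MySQL datastore type. The v1 SDK (azureml-core) does offer Datastore.register_azure_my_sql(), although datastores registered this way are mainly intended for pipeline data-transfer steps rather than Dataset.Tabular.from_sql_query(). A minimal sketch, assuming the v1 SDK is installed and using placeholder server details:

    from azureml.core import Workspace, Datastore
    
    # Load the AML workspace from a local config.json
    ws = Workspace.from_config()
    
    # Register Azure Database for MySQL as a datastore (v1 SDK).
    # All values below are placeholders; replace them with your own.
    mysql_datastore = Datastore.register_azure_my_sql(
        workspace=ws,
        datastore_name="mysql_datastore",
        server_name="your-mysql-server",
        database_name="your-database",
        user_id="your-username",
        user_password="your-password",
    )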

    Example Python script for step 2 (query_mysql.py):

    import mysql.connector
    import pandas as pd
    
    # MySQL connection details (for Azure Database for MySQL the host is
    # typically <server>.mysql.database.azure.com)
    db_config = {
        'host': 'your-mysql-host',
        'user': 'your-username',
        'password': 'your-password',
        'database': 'your-database'
    }
    
    # Create a connection to the database
    connection = mysql.connector.connect(**db_config)
    query = "SELECT * FROM your_table"
    
    try:
        # Execute the query and load data into a pandas DataFrame
        # (pandas may warn that it prefers a SQLAlchemy connectable;
        # the DBAPI connection still works)
        df = pd.read_sql(query, con=connection)
    finally:
        # Close the connection even if the query fails
        connection.close()
    
    # Save DataFrame to a CSV file (or any other format)
    df.to_csv('output_data.csv', index=False)
    

    3. Submit a Job to AML: Use the command job type in AML SDK v2 to run your Python script. This approach allows you to execute your custom Python script within an AML environment.

    Example of submitting a job using AML SDK V2:

    from azure.ai.ml import MLClient, command
    from azure.identity import DefaultAzureCredential
    
    # Enter details of your AML workspace
    subscription_id = "your-subscription-id"
    resource_group = "your-resource-group"
    workspace = "your-workspace"
    
    # Create MLClient
    ml_client = MLClient(
        DefaultAzureCredential(), subscription_id, resource_group, workspace
    )
    
    # Define your job
    job = command(
        code="./src",  # Local folder containing query_mysql.py
        command="python query_mysql.py",
        environment="azureml://registries/azureml/environments/sklearn-1.5/labels/latest",
        compute="cpu-cluster",
    )
    
    # Submit the job
    returned_job = ml_client.jobs.create_or_update(job)
    print(f"Job submitted: {returned_job.studio_url}")
    

    4. Handle Data Storage: Once your job completes, you can handle the output data as needed. The data can be uploaded to a datastore or processed further within AML, for example through a job output as sketched below.
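
    A minimal sketch of routing the script's CSV to the workspace's default datastore via a job output, assuming query_mysql.py is adapted to take a hypothetical --output_dir argument and write output_data.csv into that folder:

    from azure.ai.ml import command, Output
    from azure.ai.ml.constants import AssetTypes
    
    # Declare a job output; AML mounts a writable folder for the run and
    # uploads its contents to the workspace's default datastore on completion.
    job = command(
        code="./src",
        command="python query_mysql.py --output_dir ${{outputs.query_output}}",
        outputs={"query_output": Output(type=AssetTypes.URI_FOLDER)},
        environment="azureml://registries/azureml/environments/sklearn-1.5/labels/latest",
        compute="cpu-cluster",
    )

    In the script, parse --output_dir with argparse and call df.to_csv(os.path.join(output_dir, "output_data.csv"), index=False) so the file lands in the mounted output folder.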

