How to connect and query with MySQL server using Azure Machine Learning
Hi,
I am currently working on a project where I need to connect to an Azure Database for MySQL using Azure Machine Learning (AML) and Python scripts. However, I want to use AML resources such as Datastore and Dataset for the connection rather than conventional methods like SQLAlchemy or pyodbc.
I've tried to use Dataset.Tabular.from_sql_query() method but this method does not support MySQL server.
Are there any possible methods that could apply to my issue?
Azure Machine Learning
Azure Storage Accounts
Azure Database for MySQL
Azure Data Factory
Azure Database for PostgreSQL
-
dupammi 8,035 Reputation points • Microsoft Vendor
2024-07-23T11:59:16.75+00:00 Hi @Tú Nguyễn
Thank you for your question.
To connect and query with an Azure Database for MySQL using Azure Machine Learning (AML) and Python scripts, leveraging AML resources such as Datastore and Dataset, you can use the register_azure_my_sql method to register the MySQL database as a datastore and then use the DataTransferStep class to transfer data between storage options.
For more complex queries or operations, consider using a custom Python script and the PythonScriptStep class along with the registered datastore.
Please refer below doc for more details:
Azure Machine Learning Pipeline with DataTransferStep
I hope this helps. Thank you.
-
Tú Nguyễn 70 Reputation points
2024-07-24T03:44:59.6333333+00:00 Hi dupammi
Thank you for your support, I've followed the instructions Azure Machine Learning Pipeline with DataTransferStep and it works for me.But this instruction is applied to SDK V1, Is there a similar guide available for SDK V2?
-
Tú Nguyễn 70 Reputation points
2024-07-24T03:47:22.55+00:00 If you have any instructions for SDK v2, I would appreciate it if you share it with me
-
dupammi 8,035 Reputation points • Microsoft Vendor
2024-07-24T12:50:04.6666667+00:00 Hi @Tú Nguyễn
Please browse data.ipynb , as mentioned in azure-machine-learning-sdk-v2-examples/
and then check the cell containing information for MySQL.
I hope this helps. Thank you.
-
dupammi 8,035 Reputation points • Microsoft Vendor
2024-07-25T03:17:57.96+00:00 Hi @Tú Nguyễn
We haven’t heard from you on the last response and was just checking back to see if you got a chance to check above suggestions.
-
Tú Nguyễn 70 Reputation points
2024-07-29T04:20:23.4966667+00:00 Hi dupanmi
Thank you for your support, I have reviewed the data.ipynb, but I might have missed something in the instructions because this template only provides guidance on connecting to Azure SQL, Amazon S3, and Snowflake, and I don't see any instructions for connecting to MySQL.
-
Tú Nguyễn 70 Reputation points
2024-07-29T06:17:04.6066667+00:00 .....
-
dupammi 8,035 Reputation points • Microsoft Vendor
2024-07-29T10:24:53.61+00:00 Hi @Tú Nguyễn
Here’s how you can connect and query a MySQL database using Python scripts within Azure Machine Learning SDK V2:
- Register the MySQL Database as a Datastore: Although direct support for MySQL in the
Dataset.Tabular.from_sql_query()
method is limited, you can use a custom approach. First, you need to set up your MySQL connection details and register it as a datastore if required. - Create a Python Script for Data Retrieval: You can use Python libraries such as
mysql-connector-python
orpymysql
to connect to your MySQL database and perform queries. Write a Python script that connects to your MySQL database and retrieves data.
Example of a Python script:
import mysql.connector import pandas as pd # MySQL connection details db_config = { 'host': 'your-mysql-host', 'user': 'your-username', 'password': 'your-password', 'database': 'your-database' } # Create a connection to the database connection = mysql.connector.connect(**db_config) query = "SELECT * FROM your_table" # Execute the query and load data into a pandas DataFrame df = pd.read_sql(query, con=connection) connection.close() # Save DataFrame to a CSV file (or any other format) df.to_csv('output_data.csv', index=False)
Submit a Job to AML: Use the
command
job type in AML SDK V2 to run your Python script. This approach allows you to execute your custom Python script within an AML environment.Example of submitting a job using AML SDK V2:
from azure.ai.ml import MLClient from azure.ai.ml import command, Input, Output from azure.ai.ml.constants import AssetTypes from azure.identity import DefaultAzureCredential # Enter details of your AML workspace subscription_id = "your-subscription-id" resource_group = "your-resource-group" workspace = "your-workspace" # Create MLClient ml_client = MLClient( DefaultAzureCredential(), subscription_id, resource_group, workspace ) # Define your job job = command( code="./src", # Local path where the code is stored command="python query_mysql.py", environment="azureml://registries/azureml/environments/sklearn-1.5/labels/latest", compute="cpu-cluster", ) # Submit the job returned_job = ml_client.jobs.create_or_update(job) print(f"Job submitted: {returned_job.studio_url}")
Handle Data Storage: Once your job completes, you can handle the output data as needed. The data can be uploaded to a datastore or processed further within AML.
- Register the MySQL Database as a Datastore: Although direct support for MySQL in the
Sign in to comment