Szerkesztés

Megosztás a következőn keresztül:


Insert data from a SQL table into a Python pandas dataframe

Applies to: SQL Server Azure SQL Database Azure SQL Managed Instance

This article describes how to insert SQL data into a pandas dataframe using the pyodbc package in Python. The rows and columns of data contained within the dataframe can be used for further data exploration.

Prerequisites

Verify restored database

You can verify that the restored database exists by querying the Person.CountryRegion table:

USE AdventureWorks;
SELECT * FROM Person.CountryRegion;

Install Python packages

Download and Install Azure Data Studio.

Install the following Python packages:

  • pyodbc
  • pandas

To install these packages:

  1. In your Azure Data Studio notebook, select Manage Packages.
  2. In the Manage Packages pane, select the Add new tab.
  3. For each of the following packages, enter the package name, click Search, then click Install.

Insert data

Use the following script to select data from Person.CountryRegion table and insert into a dataframe. Edit the connection string variables: 'server', 'database', 'username', and 'password' to connect to SQL.

To create a new notebook:

  1. In Azure Data Studio, select File, select New Notebook.
  2. In the notebook, select kernel Python3, select the +code.
  3. Paste code in notebook, select Run All.
import pyodbc
import pandas as pd
# Some other example server values are
# server = 'localhost\sqlexpress' # for a named instance
# server = 'myserver,port' # to specify an alternate port
server = 'servername' 
database = 'AdventureWorks' 
username = 'yourusername' 
password = 'databasename'  
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
cursor = cnxn.cursor()
# select 26 rows from SQL table to insert in dataframe.
query = "SELECT [CountryRegionCode], [Name] FROM Person.CountryRegion;"
df = pd.read_sql(query, cnxn)
print(df.head(26))

Output

The print command in the preceding script displays the rows of data from the pandas dataframe df.

CountryRegionCode                 Name
0                 AF          Afghanistan
1                 AL              Albania
2                 DZ              Algeria
3                 AS       American Samoa
4                 AD              Andorra
5                 AO               Angola
6                 AI             Anguilla
7                 AQ           Antarctica
8                 AG  Antigua and Barbuda
9                 AR            Argentina
10                AM              Armenia
11                AW                Aruba
12                AU            Australia
13                AT              Austria
14                AZ           Azerbaijan
15                BS         Bahamas, The
16                BH              Bahrain
17                BD           Bangladesh
18                BB             Barbados
19                BY              Belarus
20                BE              Belgium
21                BZ               Belize
22                BJ                Benin
23                BM              Bermuda
24                BT               Bhutan
25                BO              Bolivia

Next steps