Insert a Python dataframe into a SQL table

Article
04/26/2023

Applies to: SQL Server, Azure SQL Database, Azure SQL Managed Instance

This article describes how to insert a pandas dataframe into a SQL database by using the pyodbc package in Python.

Prerequisites

SQL Server for Windows or SQL Server for Linux

Azure SQL Database

Azure SQL Managed Instance
SQL Server Management Studio, for restoring the sample database to Azure SQL Managed Instance.

Azure Data Studio. To install it, see Download and install Azure Data Studio.
Follow the steps in AdventureWorks sample databases to restore the OLTP version of the AdventureWorks sample database for your version of SQL Server.

You can verify that the database was restored correctly by querying the HumanResources.Department table:
```
USE AdventureWorks;
SELECT * FROM HumanResources.Department;
```

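Install Python packages

In Azure Data Studio, open a new notebook and connect to the Python 3 kernel.
Select Manage Packages.
In the Manage Packages pane, select the Add new tab.
For each of the following packages, enter the package name, select Search, and then select Install.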
- pyodbc
- pandas
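
After both packages are installed, a quick check from a notebook cell can confirm that the Python 3 kernel sees them. The following is a minimal sketch that uses only the standard library. The helper name `missing_packages` is hypothetical and is not part of Azure Data Studio, pyodbc, or pandas.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the subset of `names` that cannot be imported in this kernel."""
    return [name for name in names if find_spec(name) is None]


# An empty list means both packages are visible to the current kernel.
print(missing_packages(["pyodbc", "pandas"]))
```

If the list is not empty, the named packages were probably installed into a different Python environment than the one the notebook kernel uses.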

Create a sample CSV file

Copy the following text and save it to a file named department.csv.

DepartmentID,Name,GroupName
1,Engineering,Research and Development
2,Tool Design,Research and Development
3,Sales,Sales and Marketing
4,Marketing,Sales and Marketing
5,Purchasing,Inventory Management
6,Research and Development,Research and Development
7,Production,Manufacturing
8,Production Control,Manufacturing
9,Human Resources,Executive General and Administration
10,Finance,Executive General and Administration
11,Information Services,Executive General and Administration
12,Document Control,Quality Assurance
13,Quality Assurance,Quality Assurance
14,Facilities and Maintenance,Executive General and Administration
15,Shipping and Receiving,Inventory Management
16,Executive,Executive General and Administration
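
Copy-and-paste errors in the CSV file tend to surface only later, as confusing insert failures. As a sanity check, the saved file could be validated before it is loaded. The sketch below uses the standard `csv` module. The helper name `count_csv_rows` is hypothetical, and the sketch assumes the file is UTF-8 encoded. It ignores an empty trailing field (which a trailing comma would produce) and skips blank lines.

```python
import csv


def count_csv_rows(path, expected_columns=3):
    """Return the number of data rows, raising if any row has the wrong width."""

    def fields(row):
        # A trailing comma yields one extra empty field; ignore it.
        return row[:-1] if row and row[-1] == "" else row

    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = fields(next(reader))
        if len(header) != expected_columns:
            raise ValueError(f"header has {len(header)} columns: {header}")
        rows = 0
        for line_number, row in enumerate(reader, start=2):
            if not row:
                continue  # skip blank lines
            if len(fields(row)) != expected_columns:
                raise ValueError(f"line {line_number} has {len(fields(row))} columns")
            rows += 1
    return rows


# Usage, assuming department.csv is in the current working directory:
# count_csv_rows("department.csv")
```

For the sample text above, the function should report 16 data rows.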

Create a new database table

Follow the steps in Connect to a SQL Server to connect to the AdventureWorks database.

Create a table named HumanResources.DepartmentTest. This SQL table is used for the dataframe insertion.

CREATE TABLE [HumanResources].[DepartmentTest](
[DepartmentID] [smallint] NOT NULL,
[Name] [dbo].[Name] NOT NULL,
[GroupName] [dbo].[Name] NOT NULL
)
GO

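Load a dataframe from the CSV file

Use the Python pandas package to create a dataframe, load the CSV file into it, and then load the dataframe into the new SQL table, HumanResources.DepartmentTest.

Connect to the Python 3 kernel.

Paste the following code into a code cell, and update it with the correct values for server, database, username, password, and the location of the CSV file.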

import pyodbc
import pandas as pd
# insert data from csv file into dataframe.
# working directory for csv file: type "pwd" in Azure Data Studio or Linux
# working directory in Windows c:\users\username
df = pd.read_csv(r"c:\users\username\department.csv")
# Some other example server values are
# server = 'localhost\sqlexpress' # for a named instance
# server = 'myserver,port' # to specify an alternate port
server = 'yourservername' 
database = 'AdventureWorks' 
username = 'username' 
password = 'yourpassword' 
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
cursor = cnxn.cursor()
# Insert Dataframe into SQL Server:
for index, row in df.iterrows():
     cursor.execute("INSERT INTO HumanResources.DepartmentTest (DepartmentID,Name,GroupName) values(?,?,?)", row.DepartmentID, row.Name, row.GroupName)
cnxn.commit()
cursor.close()

Run the cell.
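
The loop in the cell above sends one INSERT statement per row, which is fine for 16 rows. For larger dataframes, a batched insert is a common alternative. The following is a minimal sketch, not part of the original sample: the helper name `insert_departments` is hypothetical, and it assumes `cnxn` is an open pyodbc connection like the one created above. `fast_executemany` is a pyodbc cursor option, and whether it improves throughput depends on the ODBC driver in use.

```python
def insert_departments(cnxn, rows):
    """Insert (DepartmentID, Name, GroupName) tuples in one batch and commit.

    `cnxn` is assumed to be an open pyodbc connection.
    Returns the number of rows submitted.
    """
    sql = (
        "INSERT INTO HumanResources.DepartmentTest "
        "(DepartmentID, Name, GroupName) VALUES (?, ?, ?)"
    )
    params = [tuple(row) for row in rows]
    if not params:
        return 0  # nothing to insert

    cursor = cnxn.cursor()
    try:
        # pyodbc option: send the parameter sets as a batch, not one by one.
        cursor.fast_executemany = True
        cursor.executemany(sql, params)
        cnxn.commit()
    finally:
        cursor.close()
    return len(params)


# Usage with the dataframe from the cell above (plain tuples, no index):
# cols = ["DepartmentID", "Name", "GroupName"]
# insert_departments(cnxn, df[cols].itertuples(index=False, name=None))
```

Use this in place of the row-by-row loop, not in addition to it. Running both would insert every row twice.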

Confirm data in the database

Connect to the SQL kernel and the AdventureWorks database, and run the following SQL statement to confirm that the table was loaded with the data from the dataframe.

SELECT count(*) from HumanResources.DepartmentTest;

Results

(No column name)
16
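
The same check can also be made from the Python 3 kernel, without switching to the SQL kernel. The sketch below is an illustration only. The helper name `count_rows` is hypothetical, and it assumes an open connection object such as the pyodbc `cnxn` from the earlier cell.

```python
def count_rows(cnxn, table="HumanResources.DepartmentTest"):
    """Return the number of rows in `table` using an open database connection."""
    cursor = cnxn.cursor()
    try:
        # The table name is interpolated into the SQL text,
        # so pass only fixed, trusted names here.
        cursor.execute(f"SELECT COUNT(*) FROM {table}")
        return cursor.fetchone()[0]
    finally:
        cursor.close()


# Usage, assuming `cnxn` and `df` from the earlier cell are still in scope:
# assert count_rows(cnxn) == len(df)
```

Comparing the count against `len(df)` rather than a hard-coded 16 keeps the check valid if the CSV file is later extended.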

Next steps

Plot histograms for data exploration with Python