@Subhadip Roy - Thanks for the question and for using the MS Q&A platform.
To connect to an On-Prem FTP server using Azure Databricks, you can use the ftplib library in Python.
Here are the steps to connect to an On-Prem FTP server and read the files:
Step1: Import the ftplib library:
import ftplib
Step2: Connect to the FTP server and log in using the FTP class:
ftp = ftplib.FTP('ftp.server.com', 'username', 'password')
Replace ftp.server.com with the hostname or IP address of your FTP server, and username and password with your FTP credentials.
Step3: Change to the directory where your files are located using the cwd method:
ftp.cwd('/path/to/files')
Replace /path/to/files with the path to the directory where your files are located.
Step4: List the files in the directory using the nlst method:
files = ftp.nlst()
This will return a list of filenames in the directory.
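Step5: Download the files using the retrbinary method:
for name in files:
    with open(name, 'wb') as local_file:
        ftp.retrbinary('RETR ' + name, local_file.write)
This downloads each file in the directory into the current working directory on the driver node of the cluster.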
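The five steps above can be combined into one helper. The following is a minimal sketch rather than a definitive implementation: the function name download_directory and its parameters are hypothetical, and it assumes that every entry returned by nlst is a regular file that the FTP account is allowed to read.

```python
import ftplib
import os


def download_directory(ftp: ftplib.FTP, remote_dir: str, local_dir: str) -> list:
    """Download every entry of remote_dir into local_dir and return the local paths.

    ftp must already be connected and logged in (Step 2 above).
    """
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)                       # Step 3: change directory
    downloaded = []
    for name in ftp.nlst():                   # Step 4: list the file names
        # Some servers return full paths, so keep only the last component
        target = os.path.join(local_dir, name.split("/")[-1])
        with open(target, "wb") as local_file:
            # Step 5: stream the remote file into the local file
            ftp.retrbinary("RETR " + name, local_file.write)
        downloaded.append(target)
    return downloaded
```

With the ftp object from Step2, it could be called as download_directory(ftp, '/path/to/files', '/tmp/ftp_files'), where /tmp/ftp_files is only an example target directory. Calling ftp.quit() afterwards closes the connection.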
Regarding the pre-requisites to connect to an On-Prem FTP server, you need to ensure that the FTP server is accessible from the Azure Databricks cluster. If the FTP server is behind a firewall, you may need to whitelist the IP address of the Azure Databricks cluster. Additionally, you need to ensure that you have the necessary FTP credentials to connect to the server.
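One way to check the first prerequisite from a notebook is to attempt a plain TCP connection to the FTP control port before using ftplib. This is a rough sketch: the helper name can_reach is hypothetical, and port 21 is assumed only because it is the default FTP control port. A successful result shows that the control port is reachable, not that the login or the data transfers will work.

```python
import socket


def can_reach(host: str, port: int = 21, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False
```

If can_reach('ftp.server.com') returns False from the cluster, the firewall and network route between the cluster and the FTP server are worth checking first.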
OR
In the code snippet below, the IP address, password, port, and username are read from a secret scope. The snippet also sets the directory on the FTP server and a local path, which here is a Unity Catalog volume path.
# Import the FTP module from the ftplib library
from ftplib import FTP
# Retrieve FTP connection details from a Key Vault-backed secret scope
ip_address = dbutils.secrets.get(scope='dev',key='ip-address')
password = dbutils.secrets.get(scope='dev',key='password')
port = int(dbutils.secrets.get(scope='dev',key='port'))
username = dbutils.secrets.get(scope='dev',key='username')
# Assign FTP connection details
ftp_host = ip_address
ftp_user = username
ftp_password = password
# Set FTP server path
ftp_path = "/Path/To/FTP"
# Set local paths for storing downloaded files
local_path = "/Volumes/bronze_dev/ftp-files/ftp_files"
# Create an FTP object
ftp = FTP()
# Initialize an empty list to store filenames from the FTP server
flat_files=[]
# Connect to the FTP server using the provided credentials
ftp.connect(ip_address, port)
ftp.login(username, password)
print("connected")
# Change the working directory on the FTP server
ftp.cwd(ftp_path)
# List the file names in the current directory on the FTP server
files_list = ftp.nlst()
# Iterate through each file in the FTP server directory
for file in files_list:
    # Keep only the file name in case the server returns full paths
    file_name = file.split("/")[-1]
    target = local_path + "/" + file_name
    # Display progress message
    print("Downloading " + file + " to " + target)
    # Open a local file for writing in binary mode; the with block closes it
    with open(target, "wb") as local_file:
        # Download the file from the FTP server and write it to the local file
        ftp.retrbinary("RETR " + file, local_file.write)
# Close the FTP connection
ftp.quit()
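Because nlst returns names only, the list may also contain subdirectories, and a RETR on such an entry typically fails with a permanent error reply. A possible refinement of the download loop is sketched below. The helper name download_files is hypothetical, and the sketch assumes that skipping such entries and reporting them is acceptable.

```python
import ftplib
import os


def download_files(ftp, names, local_dir):
    """Try to download each name into local_dir; return (downloaded, skipped)."""
    downloaded, skipped = [], []
    for name in names:
        target = os.path.join(local_dir, name.split("/")[-1])
        try:
            with open(target, "wb") as local_file:
                ftp.retrbinary("RETR " + name, local_file.write)
        except ftplib.error_perm as exc:
            # Permanent 5xx reply, e.g. the entry is a directory or is not readable
            os.remove(target)                 # drop the empty or partial local file
            skipped.append((name, str(exc)))
        else:
            downloaded.append(target)
    return downloaded, skipped
```

It could be called as download_files(ftp, files_list, local_path) while the FTP connection is still open. The second list in the result then shows which entries were skipped and why.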
For more details, refer to the following articles on connecting to an On-Prem FTP server from Azure Databricks:
Uploading Files from FTP Server to Databricks Unity Catalog.
How to connect and process FTP Data from Azure Databricks.
Disclaimer: This response contains a reference to a third-party World Wide Web site. Microsoft is providing this information as a convenience to you. Microsoft does not control these sites and has not tested any software or information found on these sites; therefore, Microsoft cannot make any representations regarding the quality, safety, or suitability of any software or information found there. There are inherent dangers in the use of any software found on the Internet, and Microsoft cautions you to make sure that you completely understand the risk before retrieving any software from the Internet.
Hope this helps. Do let us know if you have any further queries.