Sending data frame as an attachment in email using synapse notebook (PySpark)?

Satheesh K 45 Reputation points
2024-10-03T17:25:48.6866667+00:00

Hello,

Is it possible to send a PySpark DataFrame as an email attachment from a Synapse notebook (PySpark)?

Thanks

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

1 answer

  1. Ganesh Gurram 6,040 Reputation points Microsoft External Staff
    2024-10-04T07:13:33.0833333+00:00

    @Satheesh K - Thanks for the question and for using the MS Q&A platform.

    You can send a data frame as an attachment in an email using a Synapse notebook with PySpark.

    Here's an example code snippet that reads the data with pandas and sends the data frame as an HTML table in the email body:

    from email.mime.text import MIMEText
    from email.mime.multipart import MIMEMultipart
    import smtplib
    import pandas as pd
    
    # Read the source data into a pandas DataFrame.
    df_test = pd.read_csv('abfss://******@samplesynapseadlsgen2.dfs.core.windows.net/Data/moviesDB.csv')
    
    email_user = '******@gmail.com'
    email_password = 'PASSWORD'  # for Gmail, use an app password here
    
    # One address per list element; sendmail() expects a flat list of strings.
    recipients = ['******@gmail.com']
    msg = MIMEMultipart()
    msg['Subject'] = 'SUBJECT'
    msg['From'] = '******@gmail.com'
    msg['To'] = ', '.join(recipients)
    
    # Render the DataFrame as an HTML table inside the message body.
    html = """\
    <html>
      <head></head>
      <body>
        {0}
      </body>
    </html>
    """.format(df_test.to_html())
    
    part1 = MIMEText(html, 'html')
    msg.attach(part1)
    
    # Connect with STARTTLS, log in, send the message, and close the connection.
    server = smtplib.SMTP('smtp.gmail.com', 587)
    server.starttls()
    server.login(email_user, email_password)
    server.sendmail(msg['From'], recipients, msg.as_string())
    server.quit()
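
    To send the data as a file attachment rather than in the message body, one option is to serialize the DataFrame to CSV text and attach it with MIMEApplication. The following is a minimal sketch, not a tested Synapse recipe: the helper name build_csv_email, the sample CSV text, and the example.com addresses are assumptions made for illustration.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_csv_email(csv_text, filename, sender, recipients, subject):
    """Return a MIME message that carries csv_text as a file attachment.

    build_csv_email is a hypothetical helper, not part of any library.
    """
    msg = MIMEMultipart()
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = ', '.join(recipients)

    # Short plain-text body so the message is not attachment-only.
    msg.attach(MIMEText('The data is attached as a CSV file.', 'plain'))

    # MIMEApplication base64-encodes the payload by default.
    part = MIMEApplication(csv_text.encode('utf-8'))
    part.add_header('Content-Disposition', 'attachment', filename=filename)
    msg.attach(part)
    return msg


# In a notebook the CSV text would typically come from the DataFrame:
#   csv_text = spark_df.toPandas().to_csv(index=False)   # PySpark DataFrame
#   csv_text = df_test.to_csv(index=False)               # pandas DataFrame
csv_text = 'title,year\nExample Movie,2001\n'  # hypothetical sample data
msg = build_csv_email(csv_text, 'moviesDB.csv', 'sender@example.com',
                      ['recipient@example.com'], 'SUBJECT')
```

    The resulting msg can be passed to server.sendmail(...) through msg.as_string(), exactly like the HTML message. Note that toPandas() collects the whole Spark DataFrame onto the driver, so this approach only suits data small enough to fit in driver memory and in an email.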
    
    

    Here is a screenshot of the code, which I ran in my Synapse notebook:
    User's image

    Here is the email sent using synapse notebook (PySpark):
    User's image

    If you run into this error: SMTPAuthenticationError: (535, b'5.7.8 Username and Password not accepted. For more information, go to\n5.7.8 https://support.google.com/mail/?p=BadCredentials 98e67ed59e1d1-2e1e8625ccbsm714495a91.48 - gsmtp')
    User's image

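    Reason: This error typically appears when the regular account password is used for SMTP sign-in on an account that has 2-Step Verification turned on. Gmail then expects an app password instead.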

    App passwords can only be used with accounts that have 2-Step Verification turned on.

    If you do not have a Gmail app password, generate a new one and use it in place of the account password. You can check and create app passwords at https://myaccount.google.com/apppasswords
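
    The login step can also be wrapped so that a rejected password produces a clearer hint instead of a raw traceback. This is only a sketch under assumptions: send_via_gmail and its smtp_factory parameter are hypothetical names introduced here (the factory exists so the function can be tried without a network connection), and the host and port are the ones used in the snippet above.

```python
import smtplib


def send_via_gmail(msg, user, app_password, recipients,
                   smtp_factory=smtplib.SMTP):
    """Send msg through Gmail SMTP.

    Returns True on success and False when the login is rejected.
    send_via_gmail and smtp_factory are hypothetical, for illustration only.
    """
    with smtp_factory('smtp.gmail.com', 587) as server:
        server.starttls()
        try:
            server.login(user, app_password)
        except smtplib.SMTPAuthenticationError as err:
            # A 535 or 534 reply usually means Gmail wants an app password.
            print(f'Login rejected ({err.smtp_code}); '
                  'check that an app password is being used.')
            return False
        server.sendmail(user, recipients, msg.as_string())
    return True
```

    Calling send_via_gmail(msg, email_user, email_password, recipients) would replace the four server.* lines in the original snippet. The with block closes the connection even when the login fails.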

    For more details, refer to the SO thread "smtplib.SMTPAuthenticationError: (534, b'5.7.9 Application-specific password required", which addresses a similar issue.

    For more details, refer to the SO thread "Is there a way to send email with a dataframe attachment?", which addresses a similar question.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".

