Why won't a PySpark notebook in Azure Synapse accept input() in Python?

az-sayl-emp-7411 30 Reputation points
2023-03-16T01:20:54.65+00:00

The issue I'm facing is that my Python code never finishes (it runs indefinitely) in a PySpark notebook when the code contains a variable read with input(). If the variable is given a hardcoded number instead, the code runs smoothly.

Snippets of the code are as follows

With a variable read from input() (runs indefinitely)

[Screenshot 2023-03-16 091418: code using input()]

Without input(), using hardcoded values (runs successfully)

[Screenshot: code without input()]

Thank you.

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

Accepted answer
  1. MartinJaffer-MSFT 26,236 Reputation points
    2023-03-16T22:15:01.3933333+00:00

    @az-sayl-emp-7411 Hello and welcome to Microsoft Q&A.

    I understand you are seeing unexpected behavior while running your PySpark notebook, specifically with input().

    Your code reminds me of my school days. From context, it looks like input() is supposed to wait for the user to type something in.

    You stated the notebook runs indefinitely. I think it is actually waiting for input, as instructed. However, PySpark notebooks are not set up to expect user input after execution starts. Your code would work perfectly on your local computer using normal Python.

    A PySpark notebook behaves somewhat differently from normal Python. PySpark is built for distributed computing, and for each part to work independently, none of them can be waiting for input. Instead, all input must be either referenced (for example, a file to open) or made available (for example, by calling the notebook with parameters) at the start of execution.

    This is why hardcoding num1 and num2 works.
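As a rough illustration of "made available at the start of execution", here is a minimal sketch. It assumes the values live in a cell at the top of the notebook (which, as far as I know, Synapse lets you mark as a parameters cell so a caller can override them). The names num1 and num2 come from the question; add_numbers and the default values are made up for the example.

```python
# Values are defined up front instead of being read with input().
# Assumption: this would sit in the notebook's parameters cell, so a
# pipeline or calling notebook could override the defaults.
num1 = 10.0
num2 = 5.0


def add_numbers(a: float, b: float) -> float:
    """Hypothetical helper: add two numbers that are already known."""
    return a + b


total = add_numbers(num1, num2)
print(f"{num1} + {num2} = {total}")
```

Nothing in this cell blocks on the user, so the run can finish on its own.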

    For your case, try adding widgets. I think the FloatText widget would let you type in a number and then run the code without hardcoding it. It is basically a text box. You would type in the number before execution, but replacing the number during execution would still work. Instructions are in the picture below.

    [Image: instructions for adding the widget]
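In case the picture does not load, a minimal sketch of the widget idea might look like the following. It assumes the ipywidgets package is available in the notebook session; num1_box, num2_box, and current_value are hypothetical names chosen for this example.

```python
# Cell 1: create the text boxes, then type your numbers into them.
try:
    import ipywidgets as widgets
    from IPython.display import display

    num1_box = widgets.FloatText(value=0.0, description="num1")
    num2_box = widgets.FloatText(value=0.0, description="num2")
    display(num1_box, num2_box)
except ImportError:
    # Assumption: outside a notebook, ipywidgets may not be installed.
    num1_box = num2_box = None


# Cell 2: read whatever is currently in the boxes.
def current_value(box, fallback: float = 0.0) -> float:
    """Return the widget's current value, or the fallback if there is no widget."""
    return float(box.value) if box is not None else fallback


total = current_value(num1_box) + current_value(num2_box)
print(total)
```

Re-running the second cell after changing a box picks up the new value, which is the closest equivalent to typing at an input() prompt.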

    Also, a couple of corrections, since you are learning. input isn't a data type the way float is a type. input() is a function.
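To make the function-versus-type distinction concrete, here is a small sketch that does not actually call input(), so it cannot hang. The string "3.5" is a made-up stand-in for what a user might type.

```python
# input is a built-in function, not a data type.
print(callable(input))      # True

# input() always hands back text (a str); "3.5" stands in for typed input.
raw = "3.5"
print(type(raw).__name__)   # str

# float is the type; calling float(...) converts the text to a number.
value = float(raw)
print(type(value).__name__)  # float
```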

    Also, PySpark is more powerful than you need for school tasks. For simple things, it actually takes longer than regular Python on your local computer because of the overhead of starting and coordinating a Spark session.

    Have I explained thoroughly?

    2 people found this answer helpful.
