Hi @Shambhu Rai ,
Thanks for posting your question in the Microsoft Q&A forum and for using Azure services.
As I understand your question, you want to create an id column that acts as a key with incrementing numbers during table creation in Databricks.
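To accomplish this, we can use GENERATED ALWAYS AS IDENTITY when creating the table. Identity columns are supported on Delta tables, and the generated values are unique and increasing but not guaranteed to be consecutive: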
CREATE OR REPLACE TABLE demo (
  id BIGINT GENERATED ALWAYS AS IDENTITY,
  product_type STRING,
  sales BIGINT
);
If the table already exists and we want to add a surrogate key column, we can use the Spark SQL function monotonically_increasing_id, which generates unique, increasing (but not consecutive) IDs, or the window function row_number, which generates consecutive numbers, as shown below:
from pyspark.sql.functions import monotonically_increasing_id
# Option 1: add a unique, increasing (but not consecutive) 64-bit ID to each row
df1 = df.withColumn("ID", monotonically_increasing_id())
display(df1)
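# Option 2: consecutive row numbers via a window function.
# The constant PARTITION BY / ORDER BY moves all rows into a single partition,
# and the numbering order is arbitrary; order by a real column if the order matters.
df.createOrReplaceTempView('v_view')
df = spark.sql("""
    SELECT
        row_number() OVER (
            PARTITION BY ''
            ORDER BY ''
        ) AS id,
        *
    FROM
        v_view
""")
display(df)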
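To see why the two approaches produce different kinds of keys, here is a small pure-Python sketch (no Spark needed). It assumes the layout described in the Spark documentation for monotonically_increasing_id, where the partition index goes in the upper 31 bits and the row position within the partition in the lower 33 bits. The function simulated_ids and the sample partitions are hypothetical names used only for illustration; this is not how you would generate keys in practice.

```python
# Illustration only: mimics the documented ID layout of Spark's
# monotonically_increasing_id() (partition index in the upper 31 bits,
# row position within the partition in the lower 33 bits).
# simulated_ids and the sample data are hypothetical, not Spark APIs.

def simulated_ids(partitions):
    """Return one ID per row, following the documented bit layout."""
    ids = []
    for partition_index, rows in enumerate(partitions):
        for row_position, _ in enumerate(rows):
            ids.append((partition_index << 33) + row_position)
    return ids

# Two partitions holding three rows and two rows respectively.
partitions = [["a", "b", "c"], ["d", "e"]]
ids = simulated_ids(partitions)
print(ids)  # [0, 1, 2, 8589934592, 8589934593]

# row_number()-style keys for the same five rows, for comparison.
consecutive = list(range(1, len(ids) + 1))
print(consecutive)  # [1, 2, 3, 4, 5]
```

The IDs are unique and increasing, but they jump at each partition boundary, so row_number is the better fit when a gap-free sequence is required.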
Hope this helps. Please let us know if you have any further queries.