When I check the documentation:
You can define a table as having the IDENTITY property when you first create the table by using syntax that is similar to the following statement:
CREATE TABLE dbo.T1
( C1 INT IDENTITY(1,1) NOT NULL
, C2 INT NULL
)
WITH
( DISTRIBUTION = HASH(C2)
, CLUSTERED COLUMNSTORE INDEX
)
;
Update:
After verifying, I think the GENERATED ALWAYS AS IDENTITY syntax is not supported directly in Delta Lake.
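For reference, this is the kind of statement that fails (a sketch; the table name is illustrative, and on runtimes that do support Delta identity columns this form is accepted):
# Sketch: the identity syntax rejected on this runtime
try:
    spark.sql("""
        CREATE TABLE Staging.DimContract_NAC_Test_Identity
        (
            DimContractKey BIGINT GENERATED ALWAYS AS IDENTITY,
            IdentCode INT
        )
        USING DELTA
    """)
except Exception as e:
    print(e)  # expect a parse/analysis error here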
Instead, you can use a workaround to simulate an auto-incrementing column.
- Create the table with a plain BIGINT key column in place of the identity column (Spark will generate the values at insert time).
%%sql
CREATE TABLE Staging.DimContract_NAC_Test
(
    DimContractKey BIGINT,
    IdentCode INT,
    ContractId INT,
    SourceUpdatedDate TIMESTAMP,
    EffectiveStartDate DATE,
    EffectiveEndDate DATE,
    ETLId INT,
    SourceId INT,
    IsIdentCodeDuplicate BOOLEAN
)
USING DELTA;
- Insert data into the table, generating the auto-incrementing column with Spark.
from pyspark.sql import SparkSession
from pyspark.sql.functions import monotonically_increasing_id, to_timestamp, to_date

# Initialize the Spark session (in a notebook, `spark` usually already exists)
spark = SparkSession.builder.appName("AutoIncrementColumn").getOrCreate()

# Load your data into a DataFrame; date/time values arrive as strings and are cast below
data = [
    (123, 1, '2023-06-01 12:00:00', '2023-06-01', '2024-06-01', 1001, 2001, True),
    (124, 2, '2023-06-02 13:00:00', '2023-06-02', '2024-06-02', 1002, 2002, False),
    # Add more rows as needed
]
schema = ("IdentCode INT, ContractId INT, SourceUpdatedDate STRING, "
          "EffectiveStartDate STRING, EffectiveEndDate STRING, "
          "ETLId INT, SourceId INT, IsIdentCodeDuplicate BOOLEAN")
df = spark.createDataFrame(data, schema)

# Cast the string columns to the TIMESTAMP/DATE types declared in the table,
# otherwise Delta's schema enforcement rejects the append
df = (df.withColumn("SourceUpdatedDate", to_timestamp("SourceUpdatedDate"))
        .withColumn("EffectiveStartDate", to_date("EffectiveStartDate"))
        .withColumn("EffectiveEndDate", to_date("EffectiveEndDate")))

# Add an auto-incrementing column: values are unique and increasing,
# but not consecutive (gaps appear across partitions)
df_with_id = df.withColumn("DimContractKey", monotonically_increasing_id())

# Reorder columns to match the table definition
df_with_id = df_with_id.select("DimContractKey", "IdentCode", "ContractId",
                               "SourceUpdatedDate", "EffectiveStartDate",
                               "EffectiveEndDate", "ETLId", "SourceId",
                               "IsIdentCodeDuplicate")

# Write the DataFrame to the Delta table
df_with_id.write.format("delta").mode("append").saveAsTable("Staging.DimContract_NAC_Test")
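Note that monotonically_increasing_id() only guarantees unique, increasing values: the IDs are not consecutive, and they restart on every run, so a second append can collide with existing keys. If you need sequential keys that continue from the table's current maximum, a common alternative (a sketch, assuming the table and the `df` built in the steps above) is row_number() over a window:
from pyspark.sql.functions import row_number, lit, coalesce, max as max_
from pyspark.sql.window import Window

# Current highest key in the target table (0 when the table is empty)
max_key = (spark.table("Staging.DimContract_NAC_Test")
           .agg(coalesce(max_("DimContractKey"), lit(0)))
           .first()[0])

# A global window funnels all rows through one partition, which is
# usually acceptable for dimension-sized loads; the sort key is illustrative
w = Window.orderBy("IdentCode")
df_seq = (df.withColumn("DimContractKey", row_number().over(w) + max_key)
            .select("DimContractKey", *df.columns))

df_seq.write.format("delta").mode("append").saveAsTable("Staging.DimContract_NAC_Test")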