Share via

Errors installing Databricks External Hive Metastore

Tarapareddy,Saikrishna 61 Reputation points
2021-12-10T18:03:23.26+00:00

Hi ,
I am getting below error when i try to implement Databriks Hive Metastore V2.3.7 pointing to Azure SQL DB.

Below are the settings i am using in Cluster settings

spark.hadoop.javax.jdo.option.ConnectionDriverName com.microsoft.sqlserver.jdbc.SQLServerDriver
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:sqlserver:/myserver:1433;database=npushivems237test
spark.databricks.delta.preview.enabled true
spark.hadoop.javax.jdo.option.ConnectionUserName HMSTest
datanucleus.fixedDatastore false
spark.hadoop.javax.jdo.option.ConnectionPassword HMSTes987t1234@
hive.metastore.schema.verification.record.version false
datanucleus.autoCreateSchema true
spark.master local[*, 4]
spark.databricks.cluster.profile singleNode
spark.sql.hive.metastore.jars builtin
hive.metastore.schema.verification false
datanucleus.schema.autoCreateTables true
spark.sql.hive.metastore.version 2.3.7

But getting these errors when i try to create a table.

create table test237.test(id integer , name varchar(100))

(3) Spark Jobs
Error in SQL statement: AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:An exception was thrown while adding/validating class(es) : The size (32672) given to the column 'PARAM_VALUE' exceeds the maximum allowed for any data type (8000).
com.microsoft.sqlserver.jdbc.SQLServerException: The size (32672) given to the column 'PARAM_VALUE' exceeds the maximum allowed for any data type (8000).

also found similar error in the driver logs..

com.microsoft.sqlserver.jdbc.SQLServerException: The size (32672) given to the column 'TYPENAME' exceeds the maximum allowed for any data type (8000)

CREATE TABLE SD_PARAMS
(
SD_ID bigint NOT NULL,
PARAM_KEY varchar(256) NOT NULL,
PARAM_VALUE varchar(32672) NULL
) : The size (32672) given to the column 'PARAM_VALUE' exceeds the maximum allowed for any data type (8000).

So it seems it hasn't created all the meta data tables. i only see these below on my Azure SQL DB.
dbo.DATABASE_PARAMS
dbo.DBS
dbo.FUNC_RU
dbo.FUNCS
dbo.GLOBAL_PROVS
dbo.ROLES
dbo.SEQUENCE_TABLE
dbo.VERSION

Azure Databricks
Azure Databricks

An Apache Spark-based analytics platform optimized for Azure.

0 comments No comments

1 answer

Sort by: Most helpful
  1. PRADEEPCHEEKATLA 91,866 Reputation points
    2021-12-13T09:21:59.057+00:00

    Hello @Tarapareddy,Saikrishna ,

    Thanks for the question and using MS Q&A platform.

    An important point to keep in consideration, we can use string length up to varchar(8000) only as this is the maximum number of characters that SQL varchar(n) data type can hold. So in cases when there are chances that the string length of the varchar column might exceed 8000 bytes, using varchar(8001) or anything higher will result into an error - com.microsoft.sqlserver.jdbc.SQLServerException: The size (32672) given to the column 'PARAM_VALUE' exceeds the maximum allowed for any data type (8000).

    If you use char or varchar, we recommend to:

    • Use char when the sizes of the column data entries are consistent.
    • Use varchar when the sizes of the column data entries vary considerably.
    • Use varchar(max) when the sizes of the column data entries vary considerably, and the string length might exceed 8,000 bytes.

    For more details, refer to Data Types and char and varchar

    Hope this will help. Please let us know if any further queries.

    ------------------------------

    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators

    Was this answer helpful?


Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.