Databricks Feature Store

Jakub Stavinoha 6 Reputation points
2022-10-24T14:47:33.847+00:00

Hello,

I'm using Databricks Runtime 11.0 ML in my workspace, and FeatureStoreClient commands have become very slow. For data manipulation I use a simple 128 GB single-node cluster. With a small table (400k rows, 20 columns), drop_table and create_table used to finish almost instantly, both within a minute. Since some change a few days ago, dropping the identical table takes almost 10 minutes and create_table never finishes. The workspace admin told me that no changes were made to my workspace and is busy at the moment, but I'm sure something is not working properly. I was able to create a very small table with just 100 rows and write into it, but even that took minutes to complete, which is far too long.

I suspect the problem lies somewhere between the notebook and the Hive metastore, but I don't have the relevant knowledge. Could you please recommend any checks that would narrow down the possible root causes, so that I can discuss them with my admin?

Thanks a lot in advance for your time.
Regards,
Jakub

Azure Databricks

An Apache Spark-based analytics platform optimized for Azure.


1 answer

  1. Jakub Stavinoha 6 Reputation points
    2022-10-31T14:57:34.107+00:00

    Hello, no, but I used the quickest possible workaround:

    Instead of calling the fs commands create_table or write_table, I saved the dataframe as a Delta table and afterwards registered that table as a feature store table. This way there were no problems or delays, apart from not being able to define partitions directly, which was not a big issue in my use case.

    The command to drop the table, which was also very slow, now runs fast. However, the above-mentioned commands still run endlessly for the originally described dataframe of 400k rows (I think the longest run was over 1.5 hours before I cancelled the command).
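
The workaround described above could look roughly like the sketch below. It is only an illustration, not the exact code I ran: the helper name `save_and_register` and the table and key names in the usage comment are made up, and it assumes that `FeatureStoreClient.register_table` (taking `delta_table` and `primary_keys`) is available in your runtime's feature store client.

```python
from typing import Any, List


def save_and_register(df: Any, fs: Any, table_name: str, primary_keys: List[str]) -> None:
    """Write `df` as a Delta table, then register it as a feature table.

    `df` is expected to be a Spark DataFrame and `fs` a FeatureStoreClient;
    both are passed in so the helper itself has no Databricks imports.
    """
    # Step 1: plain Spark write, bypassing fs.create_table / fs.write_table.
    # Note: no partition columns are defined here.
    df.write.format("delta").mode("overwrite").saveAsTable(table_name)

    # Step 2: register the already existing Delta table with the Feature Store.
    # Assumption: register_table exists with these keyword arguments.
    fs.register_table(delta_table=table_name, primary_keys=primary_keys)


# Hypothetical usage inside a Databricks notebook:
#   from databricks.feature_store import FeatureStoreClient
#   save_and_register(df, FeatureStoreClient(), "my_db.my_features", ["id"])
```

Later updates to the table would then also go through a normal Delta write instead of write_table.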
