databricks feature store

Question

Jakub Stavinoha 6

Hello,

I'm using a Databricks workspace with Runtime 11.0 for ML, and FeatureStoreClient commands have become very slow. For data manipulation I use a simple 128 GB single-node cluster. With a small table (400k rows, 20 columns), drop_table and create_table used to finish almost instantly (both were normally done within a minute). Since some change a few days ago, dropping the identical table takes almost 10 minutes and create_table never finishes. The workspace admin told me that no changes were made to my workspace and is busy right now, but I'm sure something is not working properly. I was able to create a very small table with just 100 rows and write into it, but even that took minutes to complete, which is an unreasonable amount of time.

I suspect the problem lies somewhere between the notebook and the Hive metastore, but I don't have the relevant knowledge to confirm that. Could you please recommend any checks that would narrow down the possible root causes, so that I can discuss them with my admin?

Thanks a lot in advance for your time.
Regards,
Jakub
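
One way to narrow down where the time goes is to time a plain Delta write and the equivalent FeatureStoreClient call side by side. The following is only a minimal sketch: `timed` and `compare_steps` are hypothetical helper names, and the table names, the `id` key column, `df`, and `fs` in the commented usage are assumptions, not details from this thread.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def compare_steps(steps):
    """Run each (label, callable) pair in order and return {label: seconds}."""
    results = {}
    for label, step in steps:
        with timed(label, results):
            step()
    return results


# Hypothetical usage in a Databricks notebook (all names below are assumptions):
# from databricks.feature_store import FeatureStoreClient
# fs = FeatureStoreClient()
# timings = compare_steps([
#     ("plain delta write",
#      lambda: df.write.format("delta").mode("overwrite").saveAsTable("default.tmp_plain")),
#     ("fs.create_table",
#      lambda: fs.create_table(name="default.tmp_fs", primary_keys=["id"], df=df)),
# ])
# print(timings)
```

If the plain write is fast and only the feature store call is slow, that would point at the feature store layer rather than at the cluster or the Delta write itself, which is a more specific observation to bring to an admin.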

PRADEEPCHEEKATLA 91,866 Reputation points

2022-10-25T09:13:55.66+00:00

Hello @Jakub Stavinoha ,

Thanks for the question and for using the MS Q&A platform.

Could you please check with the workspace admin which changes were made at the cluster level? If that is not possible, try creating a new cluster with Runtime 11.0 for ML and see whether the same behaviour occurs.

Jakub Stavinoha 6 Reputation points

2022-10-25T09:38:41.643+00:00

Hello Pradeep,
I manage cluster creation myself. I currently use a single-node cluster with Runtime 11.0 for ML and node type E16D_v4, and unfortunately the issue is still present.

PRADEEPCHEEKATLA 91,866 Reputation points

2022-10-31T05:51:58.41+00:00

Hello @Jakub Stavinoha ,

Were you able to resolve the issue, or are you still experiencing the same behaviour?

Answer 1

Jakub Stavinoha 6

Hello, no, but I used the quickest possible workaround.

Instead of calling the fs commands create_table or write_table, I saved the dataframe as a Delta table and afterwards registered that table as a feature store table. This way there were no problems or delays; the only drawback was not being able to define partitions directly, which was not a big pain in my use case.
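
The workaround described above could look roughly like the sketch below. It assumes the installed feature store client provides `FeatureStoreClient.register_table`; the helper name `save_and_register`, the table name, and the `id` key column in the commented usage are hypothetical and not taken from this thread.

```python
def save_and_register(df, fs, table_name, primary_keys, description=None):
    """Write `df` as a Delta table, then register it with the Feature Store.

    `df` is expected to be a Spark DataFrame and `fs` a FeatureStoreClient;
    both are passed in so the helper itself needs no Databricks imports.
    """
    # Step 1: plain Delta write, bypassing fs.create_table / fs.write_table.
    df.write.format("delta").mode("overwrite").saveAsTable(table_name)

    # Step 2: register the already existing Delta table as a feature table.
    return fs.register_table(
        delta_table=table_name,
        primary_keys=primary_keys,
        description=description,
    )


# Hypothetical usage in a Databricks notebook (names are assumptions):
# from databricks.feature_store import FeatureStoreClient
# fs = FeatureStoreClient()
# save_and_register(df, fs, "default.my_features", primary_keys=["id"])
```

Passing the client and the dataframe in as arguments keeps the two steps easy to time or swap out separately when checking which one is slow.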

The drop table command, which had also been very slow, now runs fast. However, the above-mentioned commands still run endlessly for the originally described dataframe of 400k rows (I think the longest run was over 1.5 h before I cancelled the command).

PRADEEPCHEEKATLA 91,866 Reputation points

2022-11-02T09:54:11.643+00:00

Hello @Jakub Stavinoha ,

Glad to know that your issue has been resolved. Thanks for sharing the solution, which might be beneficial to other community members reading this thread.
