question

FlashGordon-7254 avatar image
0 Votes"
FlashGordon-7254 asked FlashGordon-7254 commented

What is the Polybase Architecture?

Hi, I recently connected MS SQL Community 2019 with a local Oracle 19c database using Polybase. Polybase seems really insistent on using "Create External Table" to get access to the external database. So do I have to Create External Table for every foreign table I want to read? That seems really tedious and you have to deal with column data type issues between ms sql and the foreign database. So is the architecture of Polybase to individually connect every table in the foreign database one at a time? Prior approaches like Linked Server let you make one connection to Oracle and you could then just use queries to refer to all the table resources in the other database.

sql-server-general
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

1 Answer

ErlandSommarskog avatar image
0 Votes"
ErlandSommarskog answered FlashGordon-7254 commented

Yeah, if you are connecting to say an Oracle database, it's kind of tedious.

But if the data source is Hadoop or Spark or something which does not really have tables, it makes more sense, and I think is what has been driving this design.

Then they have also added this for relational sources so that you can use the same interface all over the place.

And there are some advantages with external tables over linked servers. One is that you can opt to only expose some of the columns on the remote source. Another is security. A linked server is not a securable, so once you have set up a linked server, every one can access the linked server as such. If you have self-mapping, it's not so bad, because everyone needs to have access on the other server. But if you have set up a login mapping and set it up for everyone - that's really bad. But manage so that only certain users that can access it - that's a nightmare.

With external tables, you can grant SELECT just like you can on any other table, so that is really neat.

But what I have found when testing is that latency is too high. The prime area is big data, and if you are to inject a 200 GB log file from Hadoop, there is no issue if there is a startup time of two seconds for the query. But it is unacceptable for a point-lookup query.

So for traditional database application, Polybase is not that hot. That is my conclusion.

· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Erland, Thanks, yes that makes sense that Polybase has more of a Big Data focus. And as you mention, volume awareness is important.

0 Votes 0 ·