Distributed Query Architecture

Microsoft SQL Server supports two methods for referencing heterogeneous OLE DB data sources in Transact-SQL statements:

  • Linked server names

    The system stored procedures sp_addlinkedserver and sp_addlinkedsrvlogin are used to give a server name to an OLE DB data source. Objects in these linked servers can be referenced in Transact-SQL statements using four-part names. For example, if a linked server name of DeptSQLSrvr is defined against another instance of SQL Server, the following statement references a table on that server:

    SELECT JobTitle, HireDate 
       FROM DeptSQLSrvr.AdventureWorks2008R2.HumanResources.Employee;
    

    The linked server name can also be specified in an OPENQUERY statement to open a rowset from the OLE DB data source. This rowset can then be referenced like a table in Transact-SQL statements.

  • Ad hoc connector names

    For infrequent references to a data source, the OPENROWSET or OPENDATASOURCE functions are specified with the information needed to connect to the linked server. The rowset can then be referenced the same way a table is referenced in Transact-SQL statements:

    SELECT *
    FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
            'c:\MSOffice\Access\Samples\Northwind.mdb';'Admin';'';
            Employees);
    

SQL Server uses OLE DB to communicate between the relational engine and the storage engine. The relational engine breaks down each Transact-SQL statement into a series of operations on simple OLE DB rowsets opened by the storage engine from the base tables. This means the relational engine can also open simple OLE DB rowsets on any OLE DB data source.

Relational engine, OLE DB layer, storage engine

The relational engine uses the OLE DB application programming interface (API) to open the rowsets on linked servers, fetch the rows, and manage transactions.

For each OLE DB data source accessed as a linked server, an OLE DB provider must be present on the server running SQL Server. The set of Transact-SQL operations that can be used against a specific OLE DB data source depends on the capabilities of the OLE DB provider. For more information, see OLE DB Provider Reference for Distributed Queries.

For each instance of SQL Server, members of the sysadmin fixed server role can enable or disable the use of ad-hoc connector names for an OLE DB provider using the SQL Server DisallowAdhocAccess property. When ad-hoc access is enabled, any user logged on to that instance can execute SQL statements containing ad-hoc connector names, referencing any data source on the network that can be accessed using that OLE DB provider. To control access to data sources, members of the sysadmin role can disable ad-hoc access for that OLE DB provider, thereby limiting users to only those data sources referenced by linked server names defined by the administrators. By default, ad-hoc access is enabled for the SQL Server OLE DB provider, and disabled for all other OLE DB providers.

Distributed queries can allow users to access another data source (for example, files, non-relational data sources such as Active Directory, and so on) using the security context of the Microsoft Windows account under which the SQL Server service is running. SQL Server impersonates the login appropriately for Windows logins; however, that is not possible for SQL Server logins. This can potentially allow a distributed query user to access another data source for which they do not have permissions, but the account under which the SQL Server service is running does have permissions. Use sp_addlinkedsrvlogin to define the specific logins that are authorized to access the corresponding linked server. This control is not available for ad-hoc names, so use caution in enabling an OLE DB provider for ad-hoc access.

When possible, SQL Server pushes relational operations such as joins, restrictions, projections, sorts, and group by operations to the OLE DB data source. SQL Server does not default to scanning the base table into SQL Server and performing the relational operations itself. SQL Server queries the OLE DB provider to determine the level of SQL grammar it supports, and, based on that information, pushes as many relational operations as possible to the provider. For more information, see SQL Dialect Requirements for OLE DB Providers.

SQL Server specifies a mechanism for an OLE DB provider to return statistics indicating how key values are distributed within the OLE DB data source. This lets the SQL Server query optimizer better analyze the pattern of data in the data source against the requirements of each SQL statement, increasing the ability of the query optimizer to generate optimal execution plans. For more information, see Distribution Statistics Requirements for OLE DB Providers.