Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Here are some notes on "SQL Server 2008 Client Design" I took while attending an advanced class on SQL Server taught by Adam Machanic (https://sqlblog.com/blogs/adam_machanic/default.aspx).
Please note that, although these notes were taken during the class, I might have added some of my own (mis)interpretation :-). Always check your facts on Books Online (I try to provide links when applicable). As with anything else you get from a blog, never use any of this in production before you thoroughly validate it a test environment. Please keep in mind that some of those will be hard to follow without the right pre-requisite knowledge and the right context. Reading the post from top to bottom will help.
Talking about code
- ADO.NET SqlClient, SQLXML and SQLCLR
- Could you write a small ADO.NET in a few hours
- Could you write a few queries and return tables
- Is XML developer technology? Yes, but also admin...
- Need to understand what a developer feels like
- Most of the time waiting for the SQL Server...
Interfaces, ADO.NET
- Interface for an application to get to the data
- Chunky vs Chatty - Big vs Small interfaces
- Impacts performance - Each has its place
Chunky Interfaces
- Extreme case: SELECT with no WHERE clause
- Looks like: GetAllProducts()
- Get lots of data in single a calls
- Large network hit, locks longer, longer load time
- Less batches, potentially less overall I/O and network traffic
Chatty Interface
- Extreme case: Always SELECT ... WHERE Column=value
- Looks like: GetProducts(int[])
- Get just the exact data you need at that time
- Less network traffic, less locking, more seeks, more batches
- Potentially more overall I/O and network traffic
- Sometimes unecessary calls over time
Finding a balance
- How much data does your app need to work at once?
- Is a redesign possible?
- From GetProductsAll() to GetProducts(int[])
- From single updates to bulk/batch updates
Demo
- SQL Profiler running on the SQL side
- RowGetter demo application on client - Queries on SalesDetails table
- App options from MinSalesID/MaxSalesID, Chunky/Chatty, etc...
- Comparing batch query (chunky) vs. One request per row (chatty)
- SQL can also execute the batch as chatty/chunky/semi-chatty
General Rules
- Think also about network (not only CPU and IO).
- The queries are sometimes larger than the results. Think outbound and inbound.
- Prioritize performance first, then maintainability, then productivity
- If it performs and is hard to maintain, they'll use it and curse you
- If it does not perform, they'll not even use it
- Can you fix the performance problem later on the app? Not cheap...
Service Orientated Architecture (SOA)
- Boundaries are explicit
- Services are autonomous
- Services share schema and contract, not class
- Compatibility is based upon policy
Mapping Stored Procedures to SOA
- We've been doing this all along with stored procedures
- Explicit - Access only via SP
- Autonomy - If SPs don't change, can be independently deployed
- Contract - SPs are black block interface to DB
- Policies - Access control to SPs
- You don't necessarily need queues and other things to get to SOA
- Don't over think the problem and waste money
SQLClient Classes
- SQLConnection - connection to database
- SQLCommand - command or stored procedure
- SQLDataReader - connected, tabular "firehose"
- DataTable - disconnected, tabular result
- DataSet - collection of DataTables
SQLConnection
- Exploit connection pool as much as possible
- Golden rule: open late, close early – or, better worded – acquire later, release early
- This is important - worth making your code “less maintainable” to achieve it
- The scope is the app domain
- 16KB for each connection
- Connection pooling – same string, same context, connections are reused
- Connection pooling is not a SQL Server feature, it’s part of ADO.NET
SQLCommand
- Parameterize your queries
- Cache SQLCommand objects when possible
- “Non-result” methods – could be a lot faster
- ExecuteScalar – gets first column of first row, better than DataReader (use TOP,
- ExecuteNonQuery – No result sets, but can use output parameters (could be faster - test it!)
Prepared command
- Use parameters in your commands and turn it on using command.Prepare(), reuse the command
- ADO.NET will use sp_prepare, sp_execute – first sends text, then sends handle and parameters
- Fewer bytes over the wire if you’re calling this a number of times
- This does not apply to SP, since they’re already handled in a similar fashion
SQLDataReader
- Faster method to get data, forward only, connected while reading the data. Very fast, but inflexible
- Bad pattern – DataReader read a row, queries a web service, then moves to next one.
- Buffering of data for DataReader could be on the server or on the client, you don’t know.
- Release your connections explicitly and early. Or else you end up with orphaned connections.
- Use batches and .NextResult to get to the other result sets. This is forward only.
Closing the SQLDataReader
- Garbage collector will release readers once you get out of scope, but it could take a while.
- Use .ExecuteReader(CommandBehavior.CloseConnection) to close connection when reader is closed.
- Use .ExecuteReader(CommandBehavior.SequentialAccess) to read only column at a time (good for LOBs).
- If you call .Close it will get all the other rows in the set (for instance, report interrupted by user).
- Use .Cancel/.Dispose instead of .Close if you don’t care about the rest of the row and output parameters.
- You can monitor attention events in SQL Profiler to see if app is using .Cancel/.Dispose
DataTable
- Cached, in memory access to all of the rows you requested.
- Disconnected access, can be cached, passed around, read backwards, etc.
- Sacrifice performance for ultimate flexibility.
- DataTable is first filled in memory using a DataAdaptor, which uses the DataReader.
DataReader is better than DataTable
- Forward-only
- Populating lists, hydrating objects
- Too much data to fit in memory
DataTable is better than DataReader
- Need caching
- If you need to do Sorting, Filtering, DataViews
- If you need to bind to a Grid or Chart control
- If you are related sets of data (DataSet – also see MARS)
- Simplification of disconnected updates (though you could code it more efficiently)
- Complex processing per row (processing will scale better)
MARS
- Multiple Active Result Sets
- Single batch can return multiple results, without a DataTable
- Multiple Readers interleaved on related keys
- Fast loading of related data, similar to a “merge join”
- See https://msdn.microsoft.com/en-us/library/ms345109.aspx
GetOrdinal
- Column name indexer calls GetOrdinal under the covers
- You can optimize by using <int columnx = reader.GetOrdinal(“columnx”)> outside the loop
- This can make a huge difference in client performance in long loops
ADO.NET is slower SSMS
- Queries will run perform more slowly in ADO.NET than when running with SSMS
- Main culprit is usually ARITHABORT setting (affects things like indexed views, indexes on computed columns)
- Look at sys.dm_exec_plan_attributes WHERE (attribute=’set options’) and (value & 64 = 0)
- See https://msdn.microsoft.com/en-us/library/ms189472.aspx for a sample query
- Fix setting user options in the client with “sp_configure ‘user options’”
Parameterization vs. Concatenation
- Security – prevents SQL injection
- Performance – Fewer plans in cache, less compilation
- Consider using “force parameterization” option in database
- In the app, replace cmd.Text = “SELECT… WHERE ID=”+value with cmd.Parameter.Add(“@ID”,sqlDBType.int)=value;
- It is optional to specify data size or else it will vary based on the actual size.
- To overcome, make sure to use cmd.Parameter.Add(“@ID”,sqlDBType.VarChar,4000)=string
- Do not use Parameters.AddWithValue, since it does not specify size
- Careful with “parameter sniffing” – SQL can optimize the plan for the first value used
Connection Pooling
- New connections are expensive. Pooling caches open connections. Acquire late, release early.
- Eliminates much of cost, make sure there is always a connection available in the pool.
- Pooling is controlled by unique connection string and user identity.
- Application roles will create security context that cannot be reset.
- If your security model uses per-user authentication, it will likely disable pooling.
- Avoid “initial catalog” option. Use “database” option with three-part naming.
- See https://msdn.microsoft.com/en-us/library/8xx3tyca.aspx
SQLConnectionStringBuilder
- SQLConnectionStringBuilder knows how to build connection strings.
- It’s great for setting just the properties your need.
- It can also read connection string in, change just a few properties.
Closing connection
- Understand .iDisposable. “Using” or “try/finally” around the block that uses the connection.
- Acquire, process and release in the same method. Benefit for DataAdapter.
- Remember SQLDataReader(CommandBehavior.CloseConnection).
- Connection leaks. Don’t leave it to the garbage collection…
Pool Size
- Default Maximum is 100. Consider increasing the maximum for busy apps.
- Default Minimum is 0. Consider increasing the minimum for a faster warm-up.
- Both settings are controlled by the connection string: “Max Pool Size” and “Min Pool Size”
- This is a client-side setting. Monitor using Performance Monitor in the client.
- Different for web farm, hosted environment, desktop client, etc.
- See https://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlconnection.connectionstring.aspx
Monitoring
- .NET provider
- - Hard connects/disconnects per second
- - Soft connects/disconnects per second
- - Reclaimed connections – should be 0 (garbage collection happened)
- - Active Pools – should not climb
- SQL Trace - New behavior for Audit Login and Audit Logout
- - Hard (nonpooled) logins/logouts – EventSubClass is 1
- - Soft (pooled) logins/logouts – EventSubClass is 2
Data Caching
- Proper use of caching can be key to application scalability
- Consider granularity (user/server), location (client or server), update frequency, expiration, form (data/object)
Query Notification
- Application tells SQL to watch the query, send notification when it sends
- Case: Run “SELECT … FROM Orders WHERE Customer=10” with query notification.
- Case: If new orders for customer 10 show up, SQL will send notification.
- Uses indexed views, broker. We don’t have to worry so much about when to update the cache.
- See https://msdn.microsoft.com/en-us/library/ms175110.aspx
Query Notification Issues
- Indexed views are not free, there’s an overhead.
- Cache updates can be expensive. Strike a balance.
- Consider splitting large cache into several smaller caches, with more restrictive queries.
- Mark cache as “shareable” if possible. Avoid per-user notification.
- Case: Application global data this is only occasionally updated.
- Consider pairing with Change Tracking to reduce potential issues.
SQLBulkCopy
- .NET class that exposes Bulk Copy API
- Sends rows to SQL, minimally-logged inserts
- Input via DataTable or iDataReader
- More efficient than multiple insert statements, especially if minimally-logged
- Could even be used with FULL recovery mode for better performance
- Could be used for Bulk updates as well
- Consider other options like bcp and SSIS, but this is definitely a good one
- Demo: 10,000 rows using Row-by-row inserts and Bulk Insert
- Row-by-row Inserts takes 16.7s while Bulk Insert takes less than 1s
SQLBulkCopy - How
- Extremely easy to use:
- - Instantiate SqlBulkCopy class
- - Specify destination table
- - Call WriteToServer passing DataTable or iDataReader
- For ML insert, use SqlBulkCopyOptions and specify TableLock option.
- See https://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx
SQLBulkCopy - Details
- Column mapping overrides supported ColumnMappings property
- Event notification to update UI (every 1000 rows, for instance)
- Don’t notify too frequently or it will slow things down
- Tune batch size. SQLCAT found that 10,000 is a sweet spot for ETL
- See https://msdn.microsoft.com/en-us/library/dd425070.aspx
Table-valued parameters
- Create a named type that is a table. SQL Server 2008, available from ADO.NET 3.5
- Syntax: CREATE TYPE name AS TABLE…
- It’s a table entirely in memory, can be use to pass tables to stored procedure
- Table types support CHECK, DEFAULT, PRIMARY KEY and UNIQUE constraints
- Must specify READONLY option in stored procedure parameter (cannot be OUTPUT)
TVP – How
- Create stored procedures with table type parameter
- Set up parameters collection as usual
- Set SqlDBType = SQLDBType.Structure
- Argument typed as DataTable, dbDataReader or iList<SqlDataRecord>
- Recommended in books online for up to 1,000 rows (SQLBulkCopy for higher)
- It uses the Bulk API, but SQL Profiler shows as individual insers (do not replay)
- See https://msdn.microsoft.com/en-us/library/bb675163.aspx
The Object-Relational Issue
- Database normalization and object-oriented normalization are different
- A whole breed of tools called Object-Relation Mappers (ORM) – Create a layer for the Developer
- Eliminates type issues, better tools, compile-time verification of queries
- .NET 3.5 shipped with two ORM-style frameworks: Entity Framework (EF) and LINQ to SQL (L2S)
LINQ
- LINQ = Language Integrated Query – FROM… WHERE… SELECT…
- It’s actually a good order, makes intellisense work better.
- This is not a database technology
EF
- XML-based ORM technology – new layer between app code and data access code
- Two ways
- – Entity SQL (eSQL) – SQL dialect design to work with entities
- – LINQ to Entities (L2E) – Very similar to LINQ to SQL
- EF queries are always parameterized
- Developer productivity, reduce amount of bad SQL, reduce cursors, temp table usage
EF details
- Careful: Learning curve, Variable-length parameters, automated T-SQL, permissions at database
- Best practices: Log and monitor SQL being produced, consider using SP
- See https://code.msdn.microsoft.com/esql
- See https://www.thedatafarm.com/blog/2009/02/16/LoggingEntityFrameworkQueriesToLookForPerfImprovementOpportunities.aspx
- Treat EF layer as Data Access layer
Comments
- Anonymous
January 01, 2003
PingBack from http://technology.linkablez.info/2009/03/23/sql-server-2008-client-design/ - Anonymous
January 01, 2003
I took some advanced SQL Server 2008 classes recently and shared my notes in a series of blog posts. - Anonymous
January 01, 2003
Thank you for submitting this cool story - Trackback from DotNetShoutout