Tuesday, March 24, 2009

SQL Server Optimization Tips

General Tips:
Try to restrict the query's result set by returning only the columns you actually need, not all of the table's columns. This can yield a real performance benefit, because SQL Server returns only those columns to the client, which reduces network traffic and improves the overall performance of the query.
Try to avoid using SQL Server cursors whenever possible. Cursors can cause some performance degradation in comparison with set-based SELECT statements. If you need to perform row-by-row operations, try a correlated subquery or a derived table instead.
If you need to return a table's total row count, you can use an alternative to the SELECT COUNT(*) statement. Because SELECT COUNT(*) performs a full table scan to return the total row count, it can take a very long time on a large table. In this case you can use the sysindexes system table instead: its rows column contains the total row count for each table in your database. So, you can use the following select statement instead of SELECT COUNT(*): SELECT rows FROM sysindexes WHERE id = OBJECT_ID('table_name') AND indid < 2.
Use table variables instead of temporary tables. Table variables require less locking and logging resources than temporary tables, so table variables should be used whenever possible. Note that table variables are available only in SQL Server 2000 and later.
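For illustration, here is a rough sketch of both techniques against a hypothetical Orders table (the table and column names are made up for this example):

-- Approximate row count without scanning the table (SQL Server 2000 system table)
SELECT rows
FROM sysindexes
WHERE id = OBJECT_ID('Orders') AND indid < 2

-- A table variable in place of a temporary table
DECLARE @RecentOrders TABLE (OrderID int, OrderDate datetime)
INSERT INTO @RecentOrders (OrderID, OrderDate)
SELECT OrderID, OrderDate
FROM Orders
WHERE OrderDate >= '20090101'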
Try to avoid using the DISTINCT clause whenever possible. Because removing duplicates requires extra work, the DISTINCT clause results in some performance degradation, so use it only when it is really necessary.
Include a SET NOCOUNT ON statement in your stored procedures to stop the message that indicates the number of rows affected by each T-SQL statement. Because the client no longer receives those messages, this can reduce network traffic.
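A minimal sketch (the procedure, table, and column names are hypothetical):

CREATE PROCEDURE dbo.usp_GetCustomerOrders
    @CustomerID int
AS
SET NOCOUNT ON   -- suppress the "n rows affected" messages sent to the client
SELECT OrderID, OrderDate
FROM Orders
WHERE CustomerID = @CustomerID
GO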
Use SELECT statements with the TOP keyword or the SET ROWCOUNT statement if you need to return only the first n rows. Because a smaller result set is returned, this can improve the performance of your queries and also reduce the traffic between the server and the clients.
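For example, again using the hypothetical Orders table:

-- Return only the first 10 rows with TOP
SELECT TOP 10 OrderID, OrderDate
FROM Orders
ORDER BY OrderDate DESC

-- The same idea with SET ROWCOUNT; remember to reset it afterwards
SET ROWCOUNT 10
SELECT OrderID, OrderDate FROM Orders ORDER BY OrderDate DESC
SET ROWCOUNT 0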
Try to use the UNION ALL statement instead of UNION whenever possible. UNION ALL is much faster than UNION, because UNION ALL does not look for duplicate rows, while UNION always checks for duplicates whether or not any exist.
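For example, when duplicates are acceptable or cannot occur (the table names here are only illustrative):

-- UNION ALL simply appends the two result sets without a duplicate-removal step
SELECT City FROM Customers
UNION ALL
SELECT City FROM Suppliers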
Try to use constraints instead of triggers whenever possible. Constraints are much more efficient than triggers and can boost performance.
Use user-defined functions to encapsulate code for reuse. A user-defined function (UDF) contains one or more Transact-SQL statements that can be reused, and calling UDFs can reduce network traffic.
You can specify whether index keys are stored in ascending or descending order. For example, using the CREATE INDEX statement with the DESC option (descending order) can increase the speed of queries that return rows in descending order. By default, ascending order is used.
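A quick sketch of the idea, assuming queries usually want the most recent orders first (names are hypothetical):

-- Store the index keys in descending order to match the common sort order
CREATE INDEX IX_Orders_OrderDate_Desc ON Orders (OrderDate DESC)

SELECT TOP 10 OrderID, OrderDate
FROM Orders
ORDER BY OrderDate DESC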
If you need to delete all of a table's rows, consider using TRUNCATE TABLE instead of the DELETE command. TRUNCATE TABLE is a much faster way to delete all rows, because it removes them without logging the individual row deletes.
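For comparison, on a hypothetical staging table (keep in mind that TRUNCATE TABLE resets identity values and cannot be used on a table referenced by a foreign key):

-- Logs every deleted row
DELETE FROM StagingOrders

-- Deallocates the data pages instead, which is much faster
TRUNCATE TABLE StagingOrders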
Don't use Enterprise Manager to access remote servers over a slow link or to maintain very large databases. Because Enterprise Manager is very resource-intensive, use stored procedures and T-SQL statements in these cases instead.
Use SQL Server cursors to allow your application to fetch a small subset of rows instead of fetching all of a table's rows. SQL Server cursors allow an application to fetch any block of rows from the result set, including the next n rows, the previous n rows, or n rows starting at a certain row number in the result set. Using SQL Server cursors can reduce network traffic because a smaller result set is returned.


Tips for designing Tables:


Try to use constraints instead of triggers, rules, and defaults whenever possible. Constraints are much more efficient than triggers and can boost performance. Constraints are more consistent and reliable in comparison to triggers, rules and defaults, because you can make errors when you write your own code to perform the same actions as the constraints.
Use char/varchar columns instead of nchar/nvarchar if you do not need to store Unicode data. A char/varchar value uses one byte per character, while an nchar/nvarchar value uses two bytes per character, so char/varchar columns take half the space of nchar/nvarchar columns to store the same data.
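You can see the difference with the DATALENGTH function, for example:

DECLARE @a varchar(20), @n nvarchar(20)
SET @a = 'SQL Server'
SET @n = N'SQL Server'
SELECT DATALENGTH(@a) AS varchar_bytes,   -- 10 bytes
       DATALENGTH(@n) AS nvarchar_bytes   -- 20 bytes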
If you work with SQL Server 2000, use cascading referential integrity constraints instead of triggers whenever possible. For example, if you need to make cascading deletes or updates, specify the ON DELETE or ON UPDATE clause in the REFERENCES clause of the CREATE TABLE or ALTER TABLE statements. The cascading referential integrity constraints are much more efficient than triggers and can boost performance.
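A sketch of the syntax, assuming a parent Orders table with a primary key on OrderID (all names are illustrative):

CREATE TABLE OrderDetails (
    OrderDetailID int NOT NULL PRIMARY KEY,
    OrderID int NOT NULL
        REFERENCES Orders (OrderID)
        ON DELETE CASCADE    -- deleting an order removes its details automatically
        ON UPDATE CASCADE,
    Quantity int NOT NULL
)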


Tips for designing Stored Procedures:


Use stored procedures instead of heavy-duty queries. This can reduce network traffic, because the client sends the server only the stored procedure name (perhaps with some parameters) instead of the text of a large, heavy-duty query. Stored procedures can also be used to enhance security and conceal underlying data objects. For example, you can give users permission to execute a stored procedure that works with only a restricted set of columns and data.
Call stored procedures using their fully qualified names. The complete name of an object consists of four identifiers: the server name, database name, owner name, and object name. An object name that specifies all four parts is known as a fully qualified name. Using fully qualified names eliminates any confusion about which stored procedure you want to run and can boost performance, because SQL Server has a better chance of reusing a stored procedure's execution plan when it is executed by its fully qualified name.
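For example, reusing the hypothetical procedure from the earlier sketch (the Northwind database name is only illustrative):

-- Unqualified name: SQL Server must resolve the owner at run time
EXEC usp_GetCustomerOrders @CustomerID = 10

-- Owner-qualified and database-qualified names help plan reuse
EXEC dbo.usp_GetCustomerOrders @CustomerID = 10
EXEC Northwind.dbo.usp_GetCustomerOrders @CustomerID = 10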
Consider returning an integer value with a RETURN statement instead of as part of a recordset. The RETURN statement exits unconditionally from a stored procedure, so the statements following RETURN are not executed. Though the RETURN statement is generally used for error checking, you can use it to return an integer value for any other reason. Using the RETURN statement can boost performance because SQL Server will not create a recordset.
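A minimal sketch with hypothetical names:

CREATE PROCEDURE dbo.usp_CustomerExists
    @CustomerID int
AS
SET NOCOUNT ON
IF EXISTS (SELECT * FROM Customers WHERE CustomerID = @CustomerID)
    RETURN 1   -- the integer comes back as the return code, not as a recordset
RETURN 0
GO

-- Caller
DECLARE @exists int
EXEC @exists = dbo.usp_CustomerExists @CustomerID = 10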
Don't use the prefix "sp_" in the name of a stored procedure that you create in a database other than the master database. The prefix "sp_" is used for the system stored procedures, and Microsoft does not recommend it for user-created stored procedures, because SQL Server always looks for a stored procedure beginning with "sp_" in the following order: first in the master database, then based on any qualifiers provided (database name or owner), and finally using dbo as the owner if none is specified. So, when you have a stored procedure with the prefix "sp_" in a database other than master, the master database is always checked first, and if the user-created stored procedure has the same name as a system stored procedure, the user-created stored procedure will never be executed.
Use the sp_executesql stored procedure instead of the EXECUTE statement. The sp_executesql stored procedure supports parameters, so using it instead of the EXECUTE statement improves the readability of your code when many parameters are used. When you use sp_executesql to execute a Transact-SQL statement that will be reused many times, the SQL Server query optimizer will reuse the execution plan it generates for the first execution when the only variation is the change in parameter values.
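A short sketch of parameterized dynamic SQL (table and column names are again hypothetical):

DECLARE @sql nvarchar(500)
SET @sql = N'SELECT OrderID, OrderDate FROM Orders WHERE CustomerID = @CustomerID'

-- The statement text stays constant, so its execution plan can be reused
EXEC sp_executesql @sql,
     N'@CustomerID int',
     @CustomerID = 10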


Tips for designing Cursors:


Do not forget to close a SQL Server cursor when its result set is no longer needed. To close a cursor, use the CLOSE {cursor_name} command. This command releases the cursor result set and frees any cursor locks held on the rows on which the cursor is positioned.
Do not forget to deallocate a SQL Server cursor when the data structures comprising the cursor are no longer needed. To deallocate a cursor, use the DEALLOCATE {cursor_name} command. This command removes a cursor reference and releases the data structures comprising the cursor.
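A sketch of the full cursor lifecycle against the hypothetical Orders table:

DECLARE @OrderID int

DECLARE orders_cursor CURSOR FOR
    SELECT OrderID FROM Orders

OPEN orders_cursor
FETCH NEXT FROM orders_cursor INTO @OrderID
WHILE @@FETCH_STATUS = 0
BEGIN
    -- process @OrderID here
    FETCH NEXT FROM orders_cursor INTO @OrderID
END

CLOSE orders_cursor        -- releases the result set and the cursor locks
DEALLOCATE orders_cursor   -- removes the cursor's data structures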
Try to reduce the number of columns processed by the cursor. Include only the necessary columns in the cursor's SELECT statement. This reduces the size of the cursor result set, so the cursor uses fewer resources, which can increase cursor performance and reduce SQL Server overhead.
Use READ ONLY cursors instead of updatable cursors whenever possible. Because cursors can reduce concurrency and lead to unnecessary locking, use READ ONLY cursors if you do not need to update the cursor result set.
Try to avoid using INSENSITIVE, STATIC, and KEYSET cursors whenever possible. These cursor types produce the largest amount of overhead on SQL Server, because they cause a temporary table to be created in tempdb, which results in some performance degradation.
Use FAST_FORWARD cursors whenever possible. FAST_FORWARD cursors produce the least amount of overhead on SQL Server, because they are read-only and can only be scrolled forward from the first row to the last. Use a FAST_FORWARD cursor if you do not need to update the cursor result set and FETCH NEXT will be the only fetch option used.
Use FORWARD_ONLY cursors if you need an updatable cursor and FETCH NEXT will be the only fetch option used. If you need a read-only cursor and FETCH NEXT will be the only fetch option used, use a FAST_FORWARD cursor instead of a FORWARD_ONLY cursor. Note that FAST_FORWARD and FORWARD_ONLY are mutually exclusive; if one is specified, the other cannot be.
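The two declarations, with hypothetical table and column names:

-- Read-only, forward-only, FETCH NEXT only: the cheapest cursor type
DECLARE orders_cursor CURSOR FAST_FORWARD FOR
    SELECT OrderID FROM Orders

-- Updatable, but still limited to FETCH NEXT
DECLARE orders_upd_cursor CURSOR FORWARD_ONLY FOR
    SELECT ShippedDate FROM Orders
    FOR UPDATE OF ShippedDate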


Tips for Indexes:


Consider creating indexes on the column(s) frequently used in WHERE, ORDER BY, and GROUP BY clauses. These columns are the best candidates for indexing. Analyze your queries carefully so you do not create indexes that are never useful.
Drop indexes that are not used. Because each index takes up disk space and slows the adding, deleting, and updating of rows, you should drop indexes that are not used. You can use the Index Tuning Wizard to identify indexes that are not used by your queries.
Try to create indexes on columns that have integer values rather than character values. Because integer values are usually smaller than character values (the int data type is 4 bytes, the bigint data type is 8 bytes), you can reduce the number of index pages used to store the index keys. This reduces the number of reads required to read the index and boosts overall index performance.
Limit the number of indexes if your application updates data very frequently. Because each index takes up disk space and slows the adding, deleting, and updating of rows, you should create new indexes only after analyzing how the data is used, the types and frequencies of the queries performed, and how your queries will use the new indexes. In many cases, the speed advantage of a new index outweighs the disadvantages of the additional space used and slower row modification. However, avoid redundant indexes; create them only when necessary. For a read-only table, the number of indexes can be increased.
Check that the index you are about to create does not already exist. Keep in mind that when you create a primary key constraint or a unique key constraint, SQL Server automatically creates an index on the column(s) participating in that constraint. If you specify a different index name, you can end up creating indexes on the same column(s) again and again.
Create a clustered index instead of a nonclustered one to increase the performance of queries that return a range of values and of queries that contain GROUP BY or ORDER BY clauses and return sorted results. Because every table can have only one clustered index, you should choose the column(s) for this index very carefully. Try to analyze all your queries, choose the most frequently used ones, and include in the clustered index only those column(s) that provide the most performance benefit from the clustered index.
Create nonclustered indexes to increase the performance of queries that return few rows and where the index has good selectivity. In comparison with a clustered index, of which there can be only one per table, each table can have as many as 249 nonclustered indexes. However, you should consider nonclustered index creation as carefully as clustered index creation, because each index takes up disk space and slows data modification.
Avoid creating a clustered index based on an incrementing key. For example, if a table has a surrogate integer primary key declared as IDENTITY and the clustered index was created on this column, then every time data is inserted into this table, the rows are added to the end of the table. When many rows are added, a "hot spot" can occur. A "hot spot" occurs when many queries try to read or write data in the same area at the same time, and it results in an I/O bottleneck.

Note: By default, SQL Server creates a clustered index for the primary key constraint. So, in this case, you should explicitly specify the NONCLUSTERED keyword to indicate that a nonclustered index should be created for the primary key constraint.
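A sketch of that syntax, with illustrative names:

CREATE TABLE Orders (
    OrderID int NOT NULL
        CONSTRAINT PK_Orders PRIMARY KEY NONCLUSTERED,
    OrderDate datetime NOT NULL
)

-- The single clustered index is then free for a column that suits range queries
CREATE CLUSTERED INDEX IX_Orders_OrderDate ON Orders (OrderDate)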
Create a clustered index for each table. If you create a table without a clustered index, the data rows are not stored in any particular order; this structure is called a heap. Every time data is inserted into such a table, the row is added to the end of the table. When many rows are added, a "hot spot" can occur. To avoid hot spots and improve concurrency, you should create a clustered index for each table.
If you create a composite (multi-column) index, try to order the columns in the key so that the WHERE clauses of the frequently used queries match the column(s) that are leftmost in the index. The order of the columns in a composite index is very important: the index will be used to evaluate a query only if the leftmost index key column is specified in the WHERE clause of the query. For example, if you create a composite index such as "Name, Age", then a query with a WHERE clause such as "WHERE Name = 'Alex'" will use the index, but a query with a WHERE clause such as "WHERE Age = 28" will not.
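The same example in code, assuming a hypothetical Customers table:

CREATE INDEX IX_Customers_Name_Age ON Customers (Name, Age)

-- Can use the index: the leftmost key column appears in the WHERE clause
SELECT * FROM Customers WHERE Name = 'Alex'

-- Cannot seek on this index: the leftmost key column is missing
SELECT * FROM Customers WHERE Age = 28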
If you need to join several tables very frequently, consider creating an index on the joined columns. This can significantly improve the performance of queries against the joined tables.
If your application will perform the same query over and over on the same table, consider creating a covering index that includes the columns from this query. A covering index is an index that includes all of the columns referenced in the query. Creating a covering index can improve performance because all the data for the query is contained within the index itself, so only the index pages, not the data pages, are used to retrieve the data. Covering indexes can give a query a big performance boost, because they can save a huge number of I/O operations.
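A sketch with the hypothetical Orders table (in SQL Server 2000 the covering columns simply go into the index key):

-- The query references only CustomerID, OrderDate, and OrderID
SELECT OrderID, OrderDate
FROM Orders
WHERE CustomerID = 10

-- An index on all referenced columns covers the query,
-- so it can be answered from index pages alone
CREATE INDEX IX_Orders_Covering ON Orders (CustomerID, OrderDate, OrderID)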
Use the SQL Server Profiler Create Trace Wizard with the "Identify Scans of Large Tables" trace to determine which tables in your database may need indexes. This trace will show which tables are being scanned by queries instead of using an index.