Saturday, August 10, 2013

Sparse Columns

Sparse Columns



Introductions

In this article I am trying to discuss about a very interesting feture of MS SQL Server and it is called the Sparse Columns. This feature is started from MS SQL Server 2008 and onwards. Hope it will be interesting and informative.

Point in Focus

What is Sparse columns

Understand the Sparse columns with Example

Which columns not support it

Which datatypes not support it.

Where it is Best use

Others restrictions

What is Sparse columns

Sparse columns are nothing but a ordinary columns which is designed for optimized the NULL values. The benefits of the Sparse columns is they reduce space required for NULL values.
Another benefits is its reduce the costs of increased processing overhead to retrive the non-null values.

Let see the MSDN defination related to Sparse columns.

Sparse columns are ordinary columns that have an optimized storage for null values. Sparse columns reduce the space requirements for null values at the cost of more overhead to retrieve nonnull values. Consider using sparse columns when the space saved is at least 20 percent to 40 percent


In MS SQL Server 2008 the maximum columns supported by a table objects is 1024. But the Sparse columns does not comes under this limit. The maximum Sparse coum supported by a table is 100,000. So a table objects may conatins 1024 regular columns + 100,000 Sparse columns.

Understand the Sparse columns with Example

In this example I am using a system stored procedure sp_spaceused. Before staring the example a samll note related to sp_spaceused from MSDN is mentioned bellow. For more details, please follow the MSDN link.

Displays the number of rows, disk space reserved, and disk space used by a table, indexed view, or Service Broker queue in the current database, or displays the disk space reserved and used by the whole database.


Step-1 [ Creating the Base Table ]

-- Without SPARSE Columns
CREATE TABLE ExmapleTable1
      (EMPID   INT IDENTITY(1,1),
       FNAME   VARCHAR(50),
       LNAME   VARCHAR(50));

GO
-- With SPARSE
CREATE TABLE ExmapleTable2
      (EMPID   INT IDENTITY(1,1),
       FNAME   VARCHAR(50) SPARSE,
       LNAME   VARCHAR(50) SPARSE);

Step-2 [ Now Insert Some Records ]

DECLARE @idx INT = 0;
WHILE @idx < 50000
      BEGIN
         INSERT INTO ExmapleTable1 VALUES (NULL, NULL);
         INSERT INTO ExmapleTable2 VALUES (NULL, NULL);
         SET @idx+=1;
      END

Step-3 [ Now compre both ]

sp_spaceused 'ExmapleTable1'
GO
sp_spaceused 'ExmapleTable2'


Name
Rows
Reserved
Data
index_size
Unused
ExmapleTable1
50000
2824 KB
2808 KB
8 KB
8 KB
Name
Rows
Reserved
Data
index_size
Unused
ExmapleTable2
50000
1416 KB
1408 KB
8 KB
8 KB
                                                           
Which columns not support it

Some of the columns mentioed bellow is not supported by Sparse
Computed Columns / RowGuid / Filestream / Identity / XML

Which datatypes not support it

The following datatype not supported Sparse
Geography / Geometry / Image / Ntext / Text / Timestamp / Userdefine data type

Where it is Best use

The Sparse columns is best used in Fintered Index. Where data are filtered in the row.

Others restrictions


We can not bound rueles with sparse columns and it can not have any defualt value.
It can not be a part of Clustered or a unique Primary key indexes. It can not be used as a partition key of a clustered index or heap.

Sparse columns can not be used with userdefine table type (Table variable and Table valued parameters)



Hope you like it.




Introduces the concept of the sparse column

SQL Server 2008 introduces the concept of the sparse column, which is a type of column that optimizes storage for null values. When a column contains a substantial number of null values, defining the column as sparse can save a significant amount of disk space. In fact, a null value in a sparse column takes up no space at all.
There are trade-offs, however, in using sparse columns because more space is needed for non-null values. In a column configured as sparse, the non-null value requires an additional 4 bytes of storage. For example, a DATETIME value in a non-sparse column requires 8 bytes of storage. In a sparse column, however, that value requires 12 bytes. For this reason, Microsoft recommends that you use sparse columns only when the space saved is at least 20 to 40%.
If you decide to implement sparse columns, keep in mind that there are a number of restrictions. For example, the column must be nullable and cannot be configured with the ROWGUIDCOL or IDENTITY properties, cannot include a default, and cannot be bound to a rule. In addition, you cannot define a column as sparse if it is configured with certain data types, such as TEXT, IMAGE, or TIMESTAMP.
EXAMPLE
CREATE TABLE dbo.DocumentStore
            (DocID int PRIMARY KEY,
             Title varchar(200) NOT NULL,
             ProductionSpecification varchar(20) SPARSE NULL,
             ProductionLocation smallint SPARSE NULL,
             MarketingSurveyGroup varchar(20) SPARSE NULL ) ;

You can find a complete list of restrictions on sparse columns via Microsoft's