SQL data type
SQL data type defines the type and range of the data that can be used with SQL Server.
Commonly used SQL data types:
Data Type | Syntax | Description |
integer | integer | Integer number. |
smallint | smallint | Small integer number. |
numeric | numeric(p,s) | Where p is a precision value; s is a scale value. For example, numeric(7,3) is a number that has 4 digits before the decimal and 3 digits after the decimal. |
decimal | decimal(p,s) | Where p is a precision value; s is a scale value. |
real | real | Single-precision floating point number |
double precision | double precision | Double-precision floating point number |
float | float(p) | Where p is a precision value. |
character | char(x) | Where x is the number of characters to store. This data type is space padded to fill the number of characters specified. |
character varying | varchar2(x) | Where x is the number of characters to store. This data type does NOT space pad. |
bit | bit(x) | Where x is the number of bits to store. |
bit varying | bit varying(x) | Where x is the number of bits to store. The length can vary up to x. |
date | date | Stores year, month, and day values. |
time | time | Stores the hour, minute, and second values. |
timestamp | timestamp | Stores year, month, day, hour, minute, and second values. |
time with time zone | time with time zone | Exactly the same as time, but also stores an offset from UTC of the time specified. |
timestamp with time zone | timestamp with time zone | Exactly the same as timestamp, but also stores an offset from UTC of the time specified. |
DATATYPE
DataType
|
Storage Size (Bytes)
|
Values
|
Comment
|
Tinyint
|
1
|
0 to 255
| |
Smallint
|
2
|
-2 x to 2 x - 1
(X: 15)
| |
Int
|
4
|
-2 x to 2 x - 1
(X: 31)
| |
Bigint
|
8
|
-2 x to 2 x - 1
(X: 63)
| |
Numeric
|
5 to 17
| ||
Datetime
|
8
|
01/01/1753 to 12/31/9999
|
Accuracy: 3 miliseconds
|
smalldatetime
|
4
|
01/01/1900 to 06/06/2079
|
Accuracy: 1 minute
|
Datetime2
|
6 to 8
|
01/01/0001 to 12/31/9999
|
Accuracy: 100 nanoseconds
|
datetimeoffset
|
8 to 10
|
01/01/0001 to 12/31/9999
|
Accuracy: 100 nanoseconds
|
Date
|
3
|
01/01/0001 to 12/31/9999
|
Accuracy: 1 Day
|
Time
|
3 to 5
|
00:00:00.0000000 to
23:59:59.9999999
|
Accuracy: 100 nanoseconds
|
Char
|
As specified
|
Only english characters are allowed
| |
Text
|
generally not used now
| ||
Varchar
|
as specified
max 2 GB |
Only english characters are allowed
| |
Nchar
|
As specified
(double the char) |
Non-english characters are allowed
| |
Ntext
|
generally not used now
| ||
Nvarchar
|
as specified
(double the varchar) max: upto 2 GB |
Can store non-english characters
| |
Binary
|
As specified
| ||
Varbinary
|
Variable As specified
max: upto 2 GB |
FILESTREAM can be used with VARBINARY(MAX)
| |
Image
|
16
|
generally not used now
| |
Cursor
| |||
Hierarchyid
|
892 bytes
|
Binary number maintains the
hierarchy | |
sql_variant
| |||
timestamp/
rowversion |
nullable : varbinary(8)
non-nullable: binary(8) |
Unique Binary number
throught database | |
uniqueidentifier
|
16
|
NEWID function can be used
| |
Xml
|
max: upto 2 GB
|
Maximum 128 levels
| |
Geography
|
max: upto 2 GB
| ||
Geometry
|
max: upto 2 GB
|
User Defined Data Types:
CREATE TYPE City FROM VARCHAR(255) NOT NULL
CREATE TABLE Info (A INT, B CITY)
Data Types
As we all know the selection of proper data types is very important for a SQL Server Table objects. Without proper data types the performance is going down and it is un manageable for SQL developer.
In this article I am trying to gather and provide a chart that contains most useful data types used in Microsoft SQL Server Version 2000/2005/2008.
Datatype
|
Min
|
Max
|
Storage
|
Type
|
Notes
|
Bigint
|
-2^63
|
2^63-1
|
8 bytes
|
Exact numeric
| |
Int
|
-2,147,483,648
|
2,147,483,647
|
4 bytes
|
Exact numeric
| |
Smallint
|
-32,768
|
32,767
|
2 bytes
|
Exact numeric
| |
Tinyint
|
0
|
255
|
1 bytes
|
Exact numeric
| |
Bit
|
0
|
1
|
1 to 8 bit columns in the same table requires a total of 1 byte, 9 to 16 bits = 2 bytes, etc...
|
Exact numeric
| |
Decimal
|
-10^38+1
|
10^38–1
|
Precision 1-9 = 5 bytes, precision 10-19 = 9 bytes, precision 20-28 = 13 bytes, precision 29-38 = 17 bytes
|
Exact numeric
|
Decimal and numeric data type is exactly the same. Precision is the total number of digits. Scale is the number of decimals. For both the minimum is 1 and the maximum is 38.
|
Numeric
|
no
| ||||
Money
|
-2^63 / 10000
|
2^63-1 / 10000
|
8 bytes
|
Exact numeric
| |
Smallmoney
|
-214,748.3648
|
214,748.3647
|
4 bytes
|
Exact numeric
| |
Float
|
-1.79E + 308
|
1.79E + 308
|
4 bytes when precision is less than 25 and 8 bytes when precision is 25 through 53
|
Approximate numeric
|
Precision is specified from 1 to 53.
|
Real
|
-3.40E + 38
|
3.40E + 38
|
4 bytes
|
Approximate numeric
|
Precision is fixed to 7.
|
Datetime
|
1753-01-01 00:00:00.000
|
9999-12-31 23:59:59.997
|
8 bytes
|
Date and time
|
If you are running SQL Server 2008 and need milliseconds precision, use datetime2(3) instead to save 1 byte.
|
Smalldatetime
|
1900-01-01 00:00
|
2079-06-06 23:59
|
Date and time
| ||
Date
(SQL2008)
|
0001-01-01
|
9999-12-31
|
Date and time
| ||
Time
(SQL2008)
|
00:00:00.0000000
|
23:59:59.9999999
|
Date and time
|
Specifying the precision is possible. TIME(3) will have milliseconds precision. TIME(7) is the highest and the default precision. Casting values to a lower precision will round the value.
| |
Datetime2
(SQL2008)
|
0001-01-01 00:00:00.0000000
|
9999-12-31 23:59:59.9999999
|
Presicion 1-2 = 6 bytes precision 3-4 = 7 bytes precision 5-7 = 8 bytes
|
Date and time
|
Combines the date data type and the time data type into one. The precision logic is the same as for the time data type.
|
Datetimeoffset
(SQL2008)
|
0001-01-01 00:00:00.0000000 -14:00
|
9999-12-31 23:59:59.9999999 +14:00
|
Presicion 1-2 = 8 bytes precision 3-4 = 9 bytes precision 5-7 = 10 bytes
|
Date and time
|
Is a datetime2 data type with the UTC offset appended.
|
Char
|
0 chars
|
8000 chars
|
Defined width
|
Character string
|
Fixed width
|
Varchar
|
0 chars
|
8000 chars
|
2 bytes + number of chars
|
Character string
|
Variable width
|
Varchar(max)
(SQL2005/2008)
|
0 chars
|
2^31 chars
|
2 bytes + number of chars
|
Character string
|
Variable width
|
Text
|
0 chars
|
2,147,483,647 chars
|
4 bytes + number of chars
|
Character string
|
Variable width
|
Nchar
|
0 chars
|
4000 chars
|
Defined width x 2
|
Unicode character string
|
Fixed width
|
Nvarchar
|
0 chars
|
4000 chars
|
Unicode character string
|
Variable width
| |
Nvarchar(max)
(SQL2005/2008)
|
0 chars
|
2^30 chars
|
Unicode character string
|
Variable width
| |
Ntext
|
0 chars
|
1,073,741,823 chars
|
Unicode character string
|
Variable width
| |
Binary
|
0 bytes
|
8000 bytes
|
Binary string
|
Fixed width
| |
Varbinary
|
0 bytes
|
8000 bytes
|
Binary string
|
Variable width
| |
Varbinary(max)
(SQL2005/2008)
|
0 bytes
|
2^31 bytes
|
Binary string
|
Variable width
| |
Image
|
0 bytes
|
2,147,483,647 bytes
|
Binary string
|
Variable width
| |
Sql_variant
|
Other
|
Stores values of various SQL Server-supported data types, except text, ntext, and timestamp.
| |||
Timestamp
|
Other
|
Stores a database-wide unique number that gets updated every time a row gets updated.
| |||
Uniqueidentifier
|
Other
|
Stores a globally unique identifier (GUID).
| |||
Xml
(SQL2005/2008)
|
Other
|
Stores XML data. You can store xml instances in a column or a variable.
| |||
Cursor
|
Other
|
A reference to a cursor.
| |||
Table
|
Other
|
Stores a result set for later processing.
|
Hope you like it.
Why nVARCHAR
One of my friends ask me a question why use nVARCHAR not the VARCHAR. He also told me when he makes any Table objects by nVARCHAR data types it takes double in length. I try to explain his query by T-SQL statements.
CREATE TABLE Table_1
(
Emp_name1 VARCHAR(20),
Emp_Name2 nVARCHAR(20)
)
GO
sp_columns Table_1
GO
COLUMN_NAME TYPE_NAME LENGTH
Emp_name1 varchar 20
Emp_Name2 nvarchar 40
This article is dedicated to him to understand the difference between VARCHAR and nVARCHAR and why it takes double in length.
To explain it I am just inserting some values to my above table and observe the result closely.
INSERT INTO Table_1
( Emp_name1, Emp_Name2)
VALUES ('A', 'A')
GO
SELECT DATALENGTH(Emp_name1) [VARCHAR_LENGHT],
DATALENGTH(Emp_name2) [nVARCHAR_LENGHT]
FROM Table_1
GO
VARCHAR_LENGHT nVARCHAR_LENGHT
1 2
If we observe the output closely we can see the VARCHAR takes 1 bytes and nVARCHAR takes 2 bytes.
VARCHAR is ASCII based and nVARCHAR is UNICODE based and 1 byte for ASCII and 2 byte for UNICODE.
So what it means, by using nVARCHAR we just loose the disk space?
In this article I am trying to explain the Difference UNICODE vs. ASCII and UNICODE advantage. To support my point I am gathering some facts that are mentioned bellow.
Here I am just mentioned some major differences only.
1. UNICODE takes 2 byte. 1 byte for language page 1 byte for sign value.
2. ASCII takes 1 byte. It doesn't contain info about language page and all bytes contain sign info.
3. If we will use in our application different language in one time. I mean we can see record on English and Japan language - UNICODE can solve this problem as because it has language page info.
4. If we will use in our application different language in one time. I mean we can see record on English and Japan language - ASCII can't solve this problem, because it can store info only about one language.
Hope you like it.
TIME data type
SQL Server 2008 introduced a new data type called TIME, which allow the TIME without DATE. Before SQL Server 2008 it is not possible.
In this article I am going to explain the TIME data type by creating scenario to understand it better way.
Scenario-1 [ The TIME data type ]
The Example
DECLARE @tm TIME = '16:30:12'
SELECT @tm As [Time]
Result
Time
16:30:12.0000000
Scenario-2 [ The TIME data type Accuracy ]
The default accuracy of TIME data type is 100 nanoseconds. It also allows us to define the accuracy. This indicates how many places to the right of the decimal are stored for the second's portion. We can use 0 to 7 places to the right of the decimal.
Example
DECLARE @tm0 TIME(0) = '16:32:19.1234567',
@tm7 TIME(7) = '16:32:19.1234567'
SELECT @tm0 AS [Time0], @tm7 AS [Time7]
Result Set
Time0 Time7
16:32:19 16:32:19.1234567
Scenario-3 [ The TIME data type Storage ]
A TIME(0) takes three bytes to store and a TIME(7) takes five bytes to store.
Scenario-4 [ The TIME data type Conversion ]
TIME will do the implicit conversion from DATETIME and stores only the time portion of it.
Example
DECLARE @dt1 DATETIME = '12/29/2007 12:43:24.42',
@tm1 TIME(2)
SELECT @tm1 = @dt1
SELECT @tm1 As [TimeOnly]
Result Set
TimeOnly
12:43:24.42
Scenario-5 [ The TIME data with TIME ZONE ]
The TIME() does not include any time zone information, it will accept a time with time zone information but will ignore the time zone info when displayed.
Example
DECLARE @tm TIME(0) = '12:45:11 -05:30'
SELECT @tm AS [Time]
Result
Time
12:45:11
DATETIME vs DATETIME2
The new data type DATETIME2 is introduced in Microsoft SQL Server 2008.
It is recommended by Microsoft to use DATETIME2 instead of previous version DATETIME.
Before using the DATETIME2 we have to understand the differences between DATETIME and DATETIME2.
The differences are mentioned in the tabular format
Options
|
DATETIME
|
DATETIME2[(n)]
|
Min Value
|
1753-01-01 00:00:00
|
0001-01-01 00:00:00
|
Max Value
|
9999-12-31 23:59:59.997
|
9999-12-31 23:59:59.9999999
|
Storage Size
|
8 Bytes
|
6 to 8 bytes
Parameter n is optional and if it is not specified then fractional seconds precision is 7 digit and it can be from 0 to 7 digit. For fractional seconds precision <3, takes 6 bytes For fractional seconds precision 3 or 4 it will take 7 bytes For fractional seconds precision >4 it will take 8 bytes |
How to use it
|
DECLARE @param DATETIME
|
DECLARE @param DATETIME2(7)
|
Compliance
|
Is not an ANSI/ISO compliant
|
Is an ANSI/ISO compliant
|
Function Used
|
GETDATE() – It returns DB Current Date and Time of DateTime Data Type
|
SYSDATETIME()- It returns DB Current Date and Time of DateTime2 Data Type
Example: SELECT SYSDATETIME() |
+/- of (days)
|
WORKS
Example: DECLARE @nowDateTime DATETIME = GETDATE() SELECT @nowDateTime + 1 Result: 2011-09-17 13:44:31.247 |
FAILS – Need to use only DATEADD function
Example: DECLARE @nowDateTime2 DATETIME2= SYSDATETIME() SELECT @nowDateTime2+1 Result: Msg 206, Level 16, State 2, Line 2 Operand type clash: datetime2 is incompatible with int |
Hope you like it.
HerichyID in MS SQL 2008
One of the most tuffs' queries is to finding the hierarchy. It may be employee or other things. It is always the headache of developer to make tree like structure.
In SQL 2000 we use temporary table to solve this. But in SQL 2005 gives us the better facility to work like this by CTE. An example of SQL 2005 CTE is given bellow. Hope you understand it.
Table Structure:
CREATE TABLE my_EMP
(empid int NOT NULL primary key,
empname varchar(50) NULL,
managerid int)
Values Inserted:
INSERT INTO my_EMP(empid, empname, managerid)
VALUES(1, 'Sudip Das', NULL),(2, 'Joydeep Das', 1),(3, 'Tufan',2),
(4, 'Sangram',2)
SELECT * FROM my_EMP
The Output:
empid empname managerid
1 Sudip Das NULL
2 Joydeep Das 1
3 Tufan 2
4 Sangram 2
The CTE and Output (Hierarchy level)
WITH empharicy(empid, empname, haricyOrder)
AS
(
SELECT empid, empname, 1 as haricyOrder
FROM my_EMP
WHERE managerid IS NULL
UNION ALL
SELECT a.empid, b.empname, haricyOrder+1
FROM my_EMP a
INNER JOIN empharicy b ON a.managerid=b.empid
)
SELECT * FROM empharicy
empid empname haricyOrder
1 Sudip Das 1
2 Sudip Das 2
3 Sudip Das 3
4 Sudip Das 3
Great news in SQL 2008, no pain for developer there is a new data type namedHerichyID is your ultimate weapons.
Let's starts it.
To provide a real support of hierarchies, SQL Server 2008 introduces a new type of data: HierarchyID.
It is a managed type (.NET), handled by the SQLCLR of SQL Server. It not only stores the parent elements but also stores the set of information to identify the entire hierarchy.
My main purpose of this article to convert our existing table named "my_EMP" to newly created table which contains data type HierarchyID.
We are taking a new columns named "OrgNode" to understand the migration properly. So let's create the new table.
CREATE TABLE My_EEMP_Converted
(
OrgNode hierarchyid,
empid int,
empname varchar(50),
managerid int
CONSTRAINT PK_My_EEMP_Converted
PRIMARY KEY CLUSTERED (OrgNode)
);
Now create a temporary table in the name of "#children" to contain columns name "Num" that contains the number of children's in each node. Create index on it.
CREATE TABLE #Children
(
empid int,
managerid int,
Num int
);
GO
CREATE CLUSTERED INDEX tmpind ON #Children(managerid, empid);
Now populate the temporary table. Here ROW_NUMBER function is used to populate the "Num" columns. I know you have idea on it.
INSERT #Children (empid, managerid, Num)
SELECT empid, managerid,
ROW_NUMBER() OVER (PARTITION BY managerid ORDER BY managerid)
FROM my_EMP
GO
Now review the child table.
SELECT * FROM #Children ORDER BY ManagerID, Num
The desired result set is displayed.
empid managerid Num
1 NULL 1
2 1 1
3 2 1
4 2 2
Now populate the table "My_EEMP_Converted"
WITH paths(path, empid)
AS
(
-- This section provides the value for the root of the hierarchy
SELECT hierarchyid::GetRoot() AS OrgNode, empid
FROM #Children C
WHERE ManagerID IS NULL
UNION ALL
-- This section provides values for all nodes except the root
SELECT CAST(CONVERT(varchar,p.path)
+ CONVERT(varchar(30),C.Num)
+ '/' AS hierarchyid),
C.empid
FROM #Children C
INNER JOIN paths p ON C.managerid = p.empid
)
INSERT My_EEMP_Converted (OrgNode, O.empid, O.empname, O.managerid)
SELECT P.path, O.empid, O.empname, O.managerid
FROM my_EMP AS O
INNER JOIN Paths AS P ON O.empid = P.empid
GO
SELECT * FROM My_EEMP_Converted
To more understanding of "OrgNode" convert it to string.
SELECT CONVERT(VARCHAR,OrgNode)OrgNode , empid, empname, managerid
FROM My_EEMP_Converted
Now drop the temp table
DROP TABLE #Children
GO
To help queries at the same level in the hierarchy, use the Get Level method to create a computed column that contains the level in the hierarchy. Create compound index on level and the " Hierarchyid".
ALTER TABLE My_EEMP_Converted
ADD H_Level AS OrgNode.GetLevel()
CREATE UNIQUE INDEX My_EEMP_Converted_idx
ON My_EEMP_Converted(H_Level, OrgNode)
GO
SELECT * FROM My_EEMP_Converted
Now create unique index of empid.
Complete view
SELECT OrgNode, CONVERT(varchar,OrgNode) logical_level,
empid, empname, managerid, H_Level
FROM My_EEMP_Converted
OrgNode logical_level empid empname managerid H_Level
0x / 1 Sudip Das NULL 0
0x58 /1/ 2 Joydeep Das 1 1
0x5AC0 /1/1/ 3 Tufan 2 2
0x5B40 /1/2/ 4 Sangram 2 2
Hope you like it.
Trick- Date Range
To display continues date between 2 date ranges uses this trick
SET DATEFORMAT DMY
GO
IF NOT EXISTS(SELECT * FROM sysobjects WHERE xtype='U' and name like'Numbers')
BEGIN
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO Numbers
FROM sys.columns s1
CROSS JOIN sys.columns s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEYCLUSTERED (Number)
END
GO
DECLARE @Start DATETIME
DECLARE @End DATETIME
SET @Start='01-01-2011'
SET @End='28-02-2011'
SELECT
@Start+Number-1
FROM Numbers
WHERE Number<=DATEDIFF(day,@Start,@End)+1
GO
Which Data Type is good For Primary Key
The most important thing of a table is defining the primary key; it could be single column based or the composite primary key.
In my development life, I saw a lot of primary key with different type or patterns.
Such as single decimal number, some use character with decimal numbers and some others use character+decimal+special character like "SOM/001/2011-2012" like this.
When I am working with SQL server, it always my mind, that what is the better method to implement primary key rather what data type I choose to implement. For searching the suitable answer I read several articles related to it.
One thing is clear, Microsoft provide you the data type UNIQUE IDENTIFITRE witNEWID() function to generate the world wide unique primary key. It 16 byte binary value and called globally unique identifiers (GUIDs). It takes your processor ID, NIC card ID etc to generate the Primary key.
But is it a good data type for a primary key?
To get the suitable answer, we have to understand the definition of the primary key.
1. It is unique and NOT NULL.
2. The primary key contains the default clustered index (Physically sorted environment).
So, is the UNIQUE IDENTIFIRE is the good choice for cluster indexed data? In my view it is not. If your primary key is not well indexed is it going for good quality of performance? No not at all.
So in my view
For primary key always choose the Integer (int) as best data type.
If you're primary key range is very large and not supported the range of Integer then go to Big integer (bigint).
Please review my article, and give me suitable comments related to this as all that I mentioned is not listed in MSDN or any other MS guide line. It is just my point of view only.
The NEW Data Types MS SQL 2008
The data type of SQL server is a very important thing to building the table objects. So we must have clear point of view before using different data types.
In my professional life I have seen may DBA uses the wrong type of data type in table objects and results, performance slow down, Time out …expire etc, as they don't have enough idea of data type related to SQL server.
In these articles, I try to define some new data types that came with SQL 2008 and not used in previous version of MS SQL server.
DATE TIME
It's much easier with SQL 2008; MS provide us better control related to Date and time data types. In SQL 2008 introduced: DATE, TIME, DATETIME2, and DATETIMEOFFSET.
[A]
DATE
The DATE data type stores only date in YYYY-MM-DD format.
It stores range of 0001-01-01 through 9999-12-32. It takes 3 bytes to store.
[B]
TIME
It stores only time in the format hh:mm:ss.nnnnnnn, and range between 00:00:00.0000000 through 23:59:59:9999999 and is accurate to 100 nanoseconds. It takes 3 to 5 bytes to store.
[C]
DATETIME2
It is as similar to the SQL 2005 DATETIME but have greater range value. It runs from 6 to 8 bytes to store.
[D]
DATETIMEOFFSET
It is similar to DATETIME2, but includes additional information to track the time zone. The format is YYYY-MM-DD hh:mm:ss[.nnnnnnn] [+|-]hh:mm with a range of 0001-01-01 00:00:00.0000000 through 0001-01-01 00:00:00.0000000 through 9999-12-31 23:59:59.9999999 (in UTC), and an accuracy of 100 nanoseconds. Storage depends on the precision and scale selected, and runs from 8 to 10 bytes.
SPECIAL
Two new spatial data types have been added- GEOMETRY and GEOGRAPHY-which you can use to natively store and manipulate location-based information, such as Global Positioning System (GPS) data.
[A]
GEOMETRY
The GEOMETRY data type is used to store planar (flat-earth) data. It is generally used to store XY coordinates that represent points, lines, and polygons in a two-dimensional space. For example storing XY coordinates in the GEOMETRY data type can be used to map the exterior of a building.
[B]
GEOGRAPHY
The GEOGRAPHY data type is used to store ellipsoidal (round-earth) data. It is used to store latitude and longitude coordinates that represent points, lines, and polygons on the earth's surface. For example, GPS data that represents the lay of the land is one example of data that can be stored in the GEOGRAPHY data type.
HIERARCHYID
The HIERARCHYID data type is used to enable database applications to model hierarchical tree structures, such as the organization chart of a business.
The HIERARCHYID data type makes it easier to express these types of relationships without requiring multiple parent/child tables and complex joins.
· Organizational structures
· A set of tasks that make up a larger projects (like a GANTT chart)
· File systems (folders and their sub-folders)
· A classification of language terms
· A bill of materials to assemble or build a product
· A graphical representation of links between web pages
FILESTREAM
FILESTREAM is not a data type as such, but is a variation of the VARBINARY(MAX) data type that allows unstructured data to be stored in the file system instead of inside the SQL Server database.
|