Microsoft Technologies: 07/24/13

Wednesday, July 24, 2013

Temporary Tables in SQL Server

Temporary tables are similar to Permanent tables but Temporary tables are stored in TEMPDB System Database. Temporary tables supports all kinds of operation that one normal table supports. Temporary tables can be created at run time in Stored Procedure. Temporary tables can’t be used in User Defined Function. Temporary tables help the developer to improve performance tuning of query.You can’t use stored procedure SP_Rename to rename the Temporary Table name; it can be used to rename columns of Temporary table. Stored procedures can reference temporary tables that are created during the current session. Within a stored procedure, you cannot create a temporary table, drop it, and then create a new temporary table with the same name.

In SQL Server based on scope and behavior there are two types of temp tables.

1. Local Temp Table
2. Global Temp Table

Local Temp Table: Local temporary table name is stared with single hash ("#") sign. Local Temporary Table scope is Local which means Local temp tables are only available to the current connection for the user; and they are automatically deleted when the user disconnects the connection from instance of SQL Server. If User wants to drop the table then using DROP command it can be deleted.Local temporary tables are visible only to their creators during the same connection to an instance of SQL Server. They cannot therefore be used in views and you cannot associate triggers with them.Constraints and Indexes can be created on Local Temporary table, but we can't create Foreign key reference.

Syntax to create Local Temporary Table

Create Table #Table_Name (Column data_type [Width], ….n)

Let’s create one Local Scope Temporary Table and understands the above explained concept.

CREATE TABLE #LOCALEMPTEMPTABLE (EMPID INT, EMPNAME VARCHAR(100))

The above script will create a temporary local scope table in TEMPDB database. You can find it in Tempdb database using below query

GO
SELECT * FROM TEMPDB.SYS.TABLES

As I have already told, you can perform any operation on Temporary table that a Permanent table supports. Let’s insert few records in recently created temporary table. Insert query will be same to insert data in Temporary tables.

GO
INSERT INTO #LOCALEMPTEMPTABLE VALUES ( 1, 'GHANESH');

Let’s fetch the record from the #LocalEMPTempTable table using below query.

GO
SELECT * FROM #LOCALEMPTEMPTABLE

After execution of all these statements, if you close the query window and again open a new connection and then execute "Insert" or "Select" Command on #LocalEMPTempTable table, it will throw below error.

Msg 208, Level 16, State 0, Line 1

Invalid object name '#LOCALEMPTEMPTABLE'.

This is because the scope of Local Temporary table is only bounded with the current connection of current user. You are using a new connection in which the Local Temporary Table was not created.

Global Temp Table: Global Temporary tables name starts with a double hash ("##"). Scope of Global temporary table is Global which means Once this table has been created by a connection, like a permanent table it is then available to any user by any connection. It will be automatically deleted once all connections have been closed from the instance of SQL Server. User can also drop the table using DROP command. It is suggested to Drop the tables manually using DROP statement if it is not required anymore. Constraints and Indexes can be created on Global Temporary table, but we can't create foreign key references.They cannot be used in views and you cannot associate triggers with them.

Syntax to create Global Temporary Table

Create Table ##Table_Name (Column data_type [Width], ….n)

Let’s create one Global Scope Temporary Table and understands the above explained concept.

GO
CREATE TABLE ##GLOBALEMPTEMPTABLE (EMPID INT, EMPNAME VARCHAR(100))

The above script will create a temporary Global scope table in TEMPDB database. You can find the table in Tempdb database using below query.

GO
SELECT * FROM TEMPDB.SYS.TABLES

As I have already told, you can perform any operation on Temporary table that a Permanent table supports. Let’s insert few records in recently created temporary table. Insert query will be same to insert data in Temporary tables.

GO
INSERT INTO ##GlobalEMPTEMPTABLE VALUES ( 1, 'PRASAD'), ( 2, 'ANSHUL')

Let’s fetch the record from the ##GlobalEMPTempTable table using below query.

GO
SELECT * FROM ##GLOBALEMPTEMPTABLE

After execution of all these statements, if you open a new query window and then execute "Insert" or "Select" Command on ##GLOBALEMPTempTable table, it will execute successfully.

This is because the scope of Global Temporary table is not bound with the current connection of current user. Global temporary tables are visible to all SQL Server connections. When you create one of these, all the users can see it.

In What scenarios Temp Tables should be used?

When a large number of row manipulation is required in stored procedures.
When we are having a complex join operation.
Temporary tables can replace the cursor because we can store the result set data into a temp table, and then we can manipulate the data from there.

Key points to remember about Temp Tables.

Temporary Tables can’t have foreign key constraint.
We can’t create Views on Temporary Tables.
We can’t create Triggers on Temporary Tables.
Temporary Tables will be stored in Tempdb system database.
Temporary Tables are automatically deleted based on their scope.
Temporary Tables can’t be renamed using SP_Rename sytem stored prodecure.
Temporary Tables can’t be used in User Defined Function.
The best way to use a temporary table is to create it and then fill it with data. i.e., instead of using Select into temp table, create temp table 1st and populate with data later.
Temporary Tables can have Indexes, Constraints (except Foreign key constraint).
Temporary Tables Supports Transactions.
Temporary Tables Supports Error Handling.
Temporary Tables Supports DDL, DML commands.
Tables need to be deleted when they are done with their work.

Table Variable:

Alternative of Temporary table is the Table variable which can do all kinds of operations that we can perform in Temp table.

In SQL Server we have a Data Type Table. We can make use of this data type to create temporary tables in database. Table variables are partially stored on disk and partially stored in memory. It's a common misconception that table variables are stored only in memory. Because they are partially stored in memory, the access time for a table variable can be faster than the time it takes to access a temporary table.Table variable is always useful for less data. If the result set returns a large number of records, we need to go for temp table. We don’t use CREATE command to Create Temporary Table Variable, we use DECLARE keyword to create Temporary Table Variable. As I have already mentioned that Table is a data type in SQL Server that is why we use declare keyword just like we use DECLARE keyword for any data type. Functions and variables can be declared to be of type Table.

A table variable behaves like a local variable. It has a well-defined scope. This is the function, stored procedure, or batch that it is declared in.

Within its scope, a table variable can be used like a regular table. It may be applied anywhere a table or table expression is used in SELECT, INSERT, UPDATE, and DELETE statements.

However, table variable cannot be used in the following statement:

SELECT select_list INTO table_variable;

Table variables are automatically cleaned up at the end of the function, stored procedure, or batch in which they are defined.

Here is the syntax for temporary table variable.

DECLARE @TempTableVariable (Column Data_Type Width, … n)

Let’s declare a Table variable and perform Insert and Select operation on it.

GO
DECLARE @EMPTABLEVARIABLE TABLE(EMP INT, EMPNAME VARCHAR(MAX))
INSERT INTO @EMPTABLEVARIABLE VALUES(1,'SUMIT')

SELECT * FROM @EMPTABLEVARIABLE

GO

We can’t run INSERT and SELECT commands separately, we have to include DECLARE Table variable command otherwise it will give below error.

Msg 1087, Level 15, State 2, Line 1

Must declare the table variable "@EMPTABLEVARIABLE".

If you have less than 100 rows generally use a table variable. Otherwise use a temporary table. This is because SQL Server won't create statistics on table variables.

Temporary Table with Dynamic Columns in SP

What the Problem is

When we are working with temporary table on a stored procedure with dynamic columns name we have some problem.

To understand it properly we are just taking an example.

BEGIN
DECLARE @TblCol nVARCHAR(max);
CREATE TABLE #TMPDATE
(date1 DATETIME);

INSERT INTO #TMPDATE
(date1)
VALUES ('2014-01-01'),('2014-01-02'),('2014-01-03'),
('2014-01-04'),('2014-01-05'),('2014-01-06');

SELECT @TblCol =
STUFF(( SELECT DISTINCT TOP 100 PERCENT
'] DECIMAL(20,2), [' + t2.date1
FROM (
SELECTCONVERT(VARCHAR(25),date1,105) date1
FROM #TMPDATE
) AS t2
ORDER BY '] DECIMAL(20,2), [' +t2.date1
FOR XML PATH('')
), 1, 2, '') + '] DECIMAL(20,2)'

SET @TblCol = SUBSTRING(@TblCol,LEN('VARCHAR(20,2),')+2, LEN(@TblCol));
SET @TblCol = 'CREATE TABLE #tmp_Example ('+@TblCol +')';

EXEC sp_executesql @TblCol;

-- Here i am trying to Use temp table

SELECT * FROM #tmp_Example;
END

Here in this example temp table named #temp_Example has variable columns depending on the value of temp table #TEMPDATE.

Here the table named #tmp_Example is created successfully but not expose it, so we cannot use this tempt able within the procedure.

Solutions of the Problem

BEGIN
DECLARE @TblCol nVARCHAR(max);
CREATE TABLE #TMPDATE
(date1 DATETIME);

INSERT INTO #TMPDATE
(date1)
VALUES ('2014-01-01'),('2014-01-02'),('2014-01-03'),
('2014-01-04'),('2014-01-05'),('2014-01-06');

CREATE TABLE #tmp_Example
(COL DECIMAL);

SELECT @TblCol =
STUFF(( SELECT DISTINCT TOP 100 PERCENT
'] DECIMAL(20,2), [' + t2.date1
FROM (
SELECTCONVERT(VARCHAR(25),date1,105) date1
FROM #TMPDATE
) AS t2
ORDER BY '] DECIMAL(20,2), [' +t2.date1
FOR XML PATH('')
), 1, 2, '') + '] DECIMAL(20,2)'

SET @TblCol = SUBSTRING(@TblCol,LEN('VARCHAR(20,2),')+2, LEN(@TblCol));
SET @TblCol = 'ALTER TABLE #tmp_Example ADD'+@TblCol;

EXEC sp_executesql @TblCol;

ALTER TABLE #tmp_Example DROP COLUMN COL;

-- Here i am trying to Use temp table

SELECT * FROM #tmp_Example;
END

Here we create the temp table named #tmp_Example first and then we just alter it and drop the extra columns used in creation.

Output is

01-01-2014 02-01-2014
---------- ---------- …. n

(0 row(s) affected)

No more temporary Table Use CTE

Avoid using temporary tables and derived tables as it uses more disks I/O. Instead use CTE (Common Table Expression); its scope is limited to the next statement in SQL query.

The following example shows the components of the CTE structure: expression name, column list, and query. The CTE expression Sales_CTE has three columns (SalesPersonID, SalesOrderID, and OrderDate) and is defined as the total number of sales orders per year for each salesperson.

CTE WITH SINGLE USE

USE ARTEK;

GO
-- DEFINE THE CTE EXPRESSION NAME AND COLUMN LIST.

WITH SALES_CTE (SALESPERSONID, SALESORDERID, SALESYEAR)
AS
-- DEFINE THE CTE QUERY.
(
SELECT SALESPERSONID, SALESORDERID, YEAR(ORDERDATE) AS SALESYEAR
FROM SALES.SALESORDERHEADER
WHERE SALESPERSONID IS NOT NULL
)

-- DEFINE THE OUTER QUERY REFERENCING THE CTE NAME.

SELECT SALESPERSONID, COUNT(SALESORDERID) AS TOTALSALES, SALESYEAR
FROM SALES_CTE
GROUP BY SALESYEAR, SALESPERSONID
ORDER BY SALESPERSONID, SALESYEAR;
GO

CTE WITH MULTIPLE USE

WITH CT (CITYID,STATEID,CITYNAME) AS
(
SELECT CITYID,STATEID,CITYNAME
FROM CITY
),
ST (STATEID,STATENAME, REGIONID) AS
(
SELECT STATEID,STATENAME, REGIONID FROM STATE
),
RT (REGIONID,REGIONNAME) AS
(
SELECT RR.REGIONID,RR.REGIONNAME FROM REGION RR
JOIN ST ON ST.REGIONID = RR.REGIONID
)
SELECT CT.CITYNAME, ST.STATENAME, RT.REGIONNAME
FROM CT
JOIN ST ON CT.STATEID = ST.STATEID
JOIN RT ON RT.REGIONID = RT.REGIONID

IN ABOVE QUERY THE ST CTE IS BEING USED TWICE.

Global Vs Local Temporary Table

Temporary table is a very important part for SQL developer. Here in this article we are focusing about the local and global temporary table.

A global temporary table is visible to any session while in existence. It lasts until all users that are referencing the table disconnect.

The global temporary table is created by putting (##) prefix before the table name.

CREATE TABLE ##sample_Table
(emp_code DECIMAL(10) NOT NULL,
emp_name VARCHAR(50) NOT NULL)

A local temporary table, like #California below, is visible only the local session and child session created by dynamic SQL (sp_executeSQL). It is deleted after the user disconnects.

The local temporary table is created by putting (#) prefix before the table name.

CREATE TABLE #sample_Table
(emp_code DECIMAL(10) NOT NULL,
emp_name VARCHAR(50) NOT NULL)

If we use the block like BEGIN TRANSACTION and COMMIT TRANSACTION/ROLLBACK TRANSACTION the scope of the Local temporary table is within the transaction not out site of transaction.

Ranking Functions in SQL Server

What is Ranking Function?
What are different Ranking Functions in SQL Server?

What is Ranking Function?

In SQL Server Ranking functions are used to give a rank to each record among all records or in a group.

Let’s understand what does this means. I will take a simple real world example to make you understand this. Let’s go to in your school days. Our class teacher maintains a register in which all student names are written. Each student is given a roll number based on their names for example if there are 50 students then each student will get one roll number between 1 to 50, this is nothing but teacher has given a serial number to each student. When your result comes out, based on student marks now student will be given rank, which will be different from roll number (serial number) i.e roll number 10 came first, roll number 21 cane second and so on. This is nothing but teacher has given a rank based on the student marks. In some case suppose two students get the same marks, then same rank will be given to both the students.

All Ranking functions are non deterministic function.

What are the different Ranking Functions in SQL Server?

In SQL Server there are four types of Ranking Functions

1. RANK
2. DENSE_RANK
3. ROW_NUMBER
4. NTILE

Let’s understand each ranking function one by one.

CREATE DATABASE DATABASENAME

Let’s create one table name as EMP in our database and populate data. To create and populate data you can run below set of SQL statements.

GO
IF EXISTS(SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME='EMP')

DROP TABLE EMP

GO

CREATE TABLE EMP(EMPNAME VARCHAR(MAX),DEPARTMENT VARCHAR(MAX), SALARY MONEY)

GO

INSERT INTO EMP VALUES ('GHANESH','IT',18000)
INSERT INTO EMP VALUES ('PRASAD','HR',2000)
INSERT INTO EMP VALUES ('GAUTAM','SALES',5000)
INSERT INTO EMP VALUES ('KUMAR','HR',85000)
INSERT INTO EMP VALUES ('SUMIT','IT',18000)
INSERT INTO EMP VALUES ('PRIYANKA','HR',25000)
INSERT INTO EMP VALUES ('KAPIL','SALES',5000)
INSERT INTO EMP VALUES ('GOLDY','HR',12000)
INSERT INTO EMP VALUES ('SACHIN','IT',21500)
INSERT INTO EMP VALUES ('OJAL','SALES',19500)
INSERT INTO EMP VALUES ('ISHU','HR',28000)
INSERT INTO EMP VALUES ('GAURAV','IT',15500)
INSERT INTO EMP VALUES ('SAURABH','SALES',20500)
INSERT INTO EMP VALUES ('MADHU','IT',18000)
INSERT INTO EMP VALUES('ATUL','SALES',35000)

GO
SELECT * FROM EMP

As you can see from the above query result set, data has been successfully populated. I will be using this data set in coming examples to explain all ranking functions.

RANK

Rank function returns the rank of each record among all records or in a partition. If two or more rows tie for a rank, each tied rows receives the same rank but RANK function does not always return consecutive integers. Don’t worry if second point is not clear to you I will explain this with example.

Syntax:

RANK () OVER (PARTITION BY [column_list] ORDER BY [column_list])

PARTITION BY [Column_list] – This is optional if you want to divide the result set into partitions to which the function is applied then must give the column name, if you don’t use this clause then the function treats all records of the query result set as a single group.

ORDER BY [Column_list] – This is not optional it is must clause; this determines the order of the data before the function is applied.

Let’s understand the RANK function using example.

Problem 1:

You want to get each employee rank among all other employees based on their highest salary.

Solution:

You can use RANK function without PARTITION By Clause and ORDER BY Clause on SALARY in Descending order.

SQL Code:

SELECT EMPNAME, DEPARTMENT, SALARY , RANK() OVER(ORDER BY SALARY DESC) AS OVER_ALL_RANK_BY_SALARY FROM EMP

Output:

Explanation:

As you can see from the result set each employees has given a rank based on their salary. Employee KUMAR has the highest Salary rank is 1 and PRASAD has the lowest salary rank is 15. If you notice Employee MADHU, SUMIT and GHANESH have given rank 8 because all three has the same salary. Employee GAURAV has given 11 because 8 rank can be given to a least one person but two more employees shared the same rank so next two ranks 9 and 10 will not be given to any employee. This is the problem with the RANK Function it does not always return consecutive integers.

Problem 2:

You want to get each employee rank in their department based on their highest salary.

Solution:

You can use RANK function with PARTITION BY Clause on DEPARTMENT
and ORDER BY Clause on SALARY in Descending order.

SQL Code:

SELECT EMPNAME, DEPARTMENT, SALARY , RANK() OVER(PARTITION BY DEPARTMENT ORDER BY SALARY DESC) ASRANK_IN_DEP_BY_SALARY FROM EMP

Output:

Explanation:

As you can see from the result set each employees has given a rank based on their salary in their department. In HR Department every employee has distinct salary so each has given different rank but in IT Department three employees has same salary so they has given rank 2 and next employee gets rank 5.This is the problem with the RANK Function it does not always return consecutive integers.

DENSE_RANK

RANK function does not always return consecutive integers to overcome this problem we have another Ranking function in SQL Server which is known as DENSE_RANK. DENSE_RANK function Returns the rank of rows within the partition of a result set or within all records, without any gaps in the ranking. The rank of a row is one plus the number of distinct ranks that come from the previous row.

Syntax:

DENSE_RANK () OVER (PARTITION BY [column_list] ORDER BY [column_list])

PARTITION BY [Column_list] – This is optional if you want to divide the result set into partitions to which the function is applied then you must give the column name, if you don’t use this clause then the function treats all records of the query result set as a single group.

ORDER BY [Column_list] – This is not optional it is a must clause; this determines the order of the data before the function is applied.

Let’s understand the DENSE_RANK function using the old examples.

Problem 1:

You want to get each employee rank without gap among all other employees based on their highest salary.

Solution:

You can use DENSE_RANK function without PARTITION By Clause and ORDER BY Clause on SALARY in Descending order.

SQL Code:

SELECT EMPNAME, DEPARTMENT, SALARY , DENSE_RANK() OVER(ORDER BY SALARY DESC) AS OVER_ALL_RANK_BY_SALARY FROMEMP

Output:

Explanation:

As you can see from the result set each employees has given a rank based on their salary. Whoever is having same salary has been given same rank i.e 8 and 11 but there is no gap in the ranking like RANK function.

Problem 2:

You want to get each employee rank without any gap in their department based on their highest salary.

Solution:

You can use DENSE_RANK function with PARTITION BY Clause on DEPARTMENT and ORDER BY Clause on SALAY in Descending order.

SQL Code:

SELECT EMPNAME, DEPARTMENT, SALARY , DENSE_RANK() OVER(PARTITION BY DEPARTMENT ORDER BY SALARY DESC) ASRANK_IN_DEP_BY_SALARY FROM EMP

Output:

Explanation:

As you can see from the result set each employees has given a rank based on their salary in their department. In HR Department every employee has distinct salary so each has given different rank but in IT Department three employees has same salary so they has given rank 2 and next Employee gets rank as 3.

ROW_NUMBER

ROW_NUMBER function Returns the serial number of each record within a partition of a result set or within all records. It starts serial number as 1 for the first record and for next records it adds one. We get unique serial number for each record if using ROW_NUMBER function. If we are using ROW_NUMBER function within a partition of a result set then it starts with 1 in each partition.

Syntax:

ROW_NUMBER () OVER (PARTITION BY [column_list] ORDER BY [column_list])

PARTITION BY [Column_list] – This is optional if you want to divide the result set into partitions to which the function is applied then you must give the column name, if you don’t use this clause then the function treats all records of the query result set as a single group.

ORDER BY [Column_list] – This is not optional it is a must clause; this determines the order of the data before the function is applied.

Let’s understand the ROW_NUMBER function with simple examples.

Problem 1:

You want to get serial number for each employee among all employees based on their highest salary.

Solution:

You can use ROW_NUMBER function without PARTITION By Clause and ORDER BY Clause on SALARY in Descending order.

SQL Code:

SELECT EMPNAME, DEPARTMENT, SALARY, ROW_NUMBER() OVER(ORDER BY SALARY DESC) AS ROW_NUMBER_BY_SALARY FROM EMP

Output:

Explanation:

As you can see from the result set each employees has given a serial number based on their salary among all employees. Serial number starts with 1 and goes till 15 (because we have total 15 records in EMP table).Whoever is having same salary has not given same serial number like DENSE_RANK function.

Problem 2:

You want to get serial number for each employee in their department based on their highest salary.

Solution:

You can use ROW_NUMBER function with PARTITION BY Clause on DEPARTMENT and ORDER BY Clause on SALARY in Descending order.

SQL Code:

SELECT EMPNAME, DEPARTMENT, SALARY, ROW_NUMBER() OVER(PARTITION BY DEPARTMENT ORDER BY SALARY DESC) AS ROW_NUMBER_IN_DEP_BY_SALARY FROM EMP

Output:

Explanation:

As you can see from the result set each employee has given a serial number based on their salary in their department and it’s unique, it starts with 1 and goes till 5 because each department has max 5 employees.

NTILE

NTILE function distributes all your records in an ordered groups with a specified group number. The groups are numbered, starting at 1 for the first group and adds 1 for the next group. For each record, NTILE returns the number of the group to which the record belongs.

Syntax:

NTILE (Interger_Expression) OVER (PARTITION BY [column_list] ORDER BY [column_list])

Integer_Expression – It is a positive integer constant expression that specifies the number of groups into which each partition must be divided

PARTITION BY [Column_list] – This is optional if you want to divide the result set into partitions to which the function is applied then you must give the column name, if you don’t use this clause then the function treats all records of the query result set as a single group.

ORDER BY [Column_list] – This is not optional it is a must clause; this determines the order of the data before the function is applied.

Let’s understand the NTILE function with simple examples.

Problem 1:

You want to divide all employees into 3 groups based on their highest salary.

Solution:

You can use NTILE function without PARTITION By Clause and with Inter_expression equals to 3 and ORDER BY Clause on SALAY in Descending order.

SQL Code:

SELECT EMPNAME, DEPARTMENT, SALARY , NTILE(3) OVER(ORDER BY SALARY DESC) AS GROUP_NUMBER_BY_SALARY FROM EMP

Output:

Explanation:

As you can see from the result set there are total 15 records in EMP table. All records have been divided into three groups. Each group has it group number with 5 records.

Problem 2:

You want divide each employee in their department based on their highest salary into 2 groups.

Solution:

You can use NTILE function with INTEGER EXPRESSION value equal to 2, PARTITION BY Clause on DEPARTMENT and ORDER BY Clause on SALAY in Descending order.

SQL Code:

SELECT EMPNAME, DEPARTMENT, SALARY , NTILE(3) OVER(PARTITION BY DEPARTMENT ORDER BY SALARY DESC) ASGROUP_NUMBER_IN_DEP_BY_SALARY FROM EMP

Output:

Explanation:

As you can see from the result set there are total 3 Departments, each department has 5 records in it. All records in each department have been divided into 3 groups. Each group has it group number with its records.

#################################################################################

Example - 2

################################################################################

Let's take following sample table and data to know about RANK, RANK_DENSE, NTILE and ROW_NUMBER with examples.

Create table ExamResult(name varchar(50),Subject varchar(20),Marks int) insert into ExamResult values('Adam','Maths',70) insert into ExamResult values ('Adam','Science',80) insert into ExamResult values ('Adam','Social',60) insert into ExamResult values('Rak','Maths',60) insert into ExamResult values ('Rak','Science',50) insert into ExamResult values ('Rak','Social',70) insert into ExamResult values('Sam','Maths',90) insert into ExamResult values ('Sam','Science',90) insert into ExamResult values ('Sam','Social',80)

RANK()

Returns the rank of each row in the result set of partitioned column.select Name,Subject,Marks, RANK() over(partition by name order by Marks desc)Rank From ExamResult order by name,subject

DENSE_RANK()

This is same as RANK() function. Only difference is returns rank without gaps.select Name,Subject,Marks, DENSE_RANK() over(partition by name order by Marks desc)Rank From
ExamResult order by name

In RANK() result set screenshot, you can notice that there is gap in Rank(2) for the name Sam and same gap is removed in DENSE_RANK().
NTILE()

Distributes the rows in an ordered partition into a specified number of groups.

It divides the partitioned result set into specified number of groups in an order.
Example for NTILE(2)select Name,Subject,Marks, NTILE(2) over(partition by name order by Marks desc) Quartile From ExamResult order by name,subject

Example for NTILE(3)select Name,Subject,Marks, NTILE(3) over(partition by name order by Marks desc) Quartile From ExamResult order by name,subject

ROW_NUMBER()

Returns the serial number of the row order by specified column.select Name,Subject,Marks, ROW_NUMBER() over(order by Name) RowNumber From ExamResult order by name,subject

#################################################################################

Example - 3

################################################################################

DENSE_RANK

The DENSE_RANK Function is one of the SQL Server ranking function. This function will assign the rank number to each record present in a partition without skipping the rank numbers.

NOTE: In SQL, If the DENSE_RANK function encounters two equal values in the same partition, it will assign the same rank number to both values.
SQL DENSE_RANK Function Syntax

In SQL Server, The basic syntax of the DENSE_RANK Function is:

SELECT DENSE_RANK() OVER (PARTITION_BY_Clause ORDER_BY_Clause)
FROM [Source]

Partition_By_Clause: This will divide the records selected by the SELECT Statement into partitions.
If you specified the Partition By Clause then, DENSE_RANK Function will assign the rank number to each partition.
If you havent specified the Partition By Clause then, DENSE_RANK Function will consider all the records as single partition so, it will assign the rank numbers from top to bottom.

Order_By_Clause: This is used to sort the Partitioned data into specified order. Please referSQL Order By Clause for better understanding.

In this article we will show you, How to write DENSE_RANK Function in SQL Server 2014 with example. For this, We are going to use the below shown data

SQL DENSE_RANK Function Example

DENSE_RANK Function allows you to assign the rank number to each record present in a partition. In this example we will show you, How to rank the partitioned records present in a table. The following SQL Query will partition the data by Occupation and assign the rank number using the yearly income.

T-SQL CODE

USE [SQL Server Tutorials]
GO
SELECT [FirstName]
,[LastName]
,[Education]
,[Occupation]
,[YearlyIncome]
,DENSE_RANK() OVER (
PARTITION BY [Occupation]
ORDER BY [YearlyIncome] DESC
) AS RANK
FROM [Customers]

OUTPUT

ANALYSIS

Below statement will divide the selected data into partition using their Occupation. From the above you can observe, We got four partitions

PARTITION BY [Occupation]

Below statement will sort the partitioned data in the descending order using their [yearly income]

ORDER BY [YearlyIncome] DESC

In the below statement we used DENSE_RANK() Function with Partition by clause so, DENSE_RANK function will assign the rank numbers for each individual partition.

DENSE_RANK() OVER (
PARTITION BY [Occupation]
ORDER BY [YearlyIncome] DESC
) AS RANK

It has given same rank to 3 and 4 records because their yearly income is same. Next, it assigned 2nd rank to next record (it wont skip any rank numbers).
SQL DENSE_RANK Function without Partition By Clause

In this example we will show you, What will happen if we miss the Partition By Clause in DENSE_RANK () Function. For instance, The following SQL Query will use the above example query without Partition by clause.

T-SQL CODE

USE [SQL Server Tutorials]
GO
SELECT [FirstName]
,[LastName]
,[Education]
,[Occupation]
,[YearlyIncome]
,DENSE_RANK() OVER (
ORDER BY [YearlyIncome] DESC
) AS RANK
FROM [Customers]

OUTPUT

ANALYSIS

Below statement will sort the yearly income data in the descending order

ORDER BY [YearlyIncome] DESC

In the below statement we used DENSE_RANK() Function without Partition by clause so, DENSE_RANK function will consider them as single partition and assign the rank numbers from begining to end.

DENSE_RANK() OVER (
ORDER BY [YearlyIncome] DESC
) AS RANK

It has given same rank to 2, 3 and 4 records because their yearly income is same (80000). Next, it has given same rank to 5 and 6 records because their yearly income is same and so on.
SQL DENSE_RANK Function On String Column

DENSE_RANK Function also allows you to rank the string columns. In this example, We are going to use the DENSE_RANK() function to assign the rank numbers for [First name]

T-SQL CODE

USE [SQL Server Tutorials]
GO

SELECT [LastName]
,[Education]
,[YearlyIncome]
,[Occupation]
,[FirstName]
,DENSE_RANK() OVER (
PARTITION BY [Occupation]
ORDER BY [FirstName] DESC
) AS RANK
FROM [Customers]

OUTPUT

If you observe the above screenshot, DENSE_RANK Function has given same rank to 6 and 7 records because their First name is same. Next, it assigned 2nd rank to next record.

NTILE

The NTILE Function is one of the SQL Server ranking function. This function will assign the rank number to each record present in a partition.
SQL NTILE Function Syntax

In SQL Server, The basic syntax of the NTILE Function is:

SELECT NTILE(Interger_Value) OVER (PARTITION_BY_Clause ORDER_BY_Clause)
FROM [Source]

Integer_Value: NTILE Function will use this integer value to decide, the number of ranks it has to assign for each partition. For instance, If we specify 2, NTILE Function will assign 2 rank numbers for each partition.

Partition_By_Clause: This will divide the records selected by the SELECT Statement into partitions.
If you specified the Partition By Clause then, NTILE Function will assign the rank number to each partition.
If you havent specified the Partition By Clause then, NTILE Function will consider all the records as single partition so, it will assign the rank numbers from top to bottom.

Order_By_Clause: This is used to sort the Partitioned data into specified order. Please referSQL Order By Clause for better understanding.

In this article we will show you, How to write NTILE Function in SQL Server 2014 with example. For this, We are going to use the below shown data

SQL NTILE Function Example 1

NTILE Function allows you to assign the rank number to each record present in a partition. In this example we will show you, How to rank the partitioned records present in a table. The following SQL Query will partition the data by Occupation and assign the rank number using the yearly income.

T-SQL CODE

USE [SQL Server Tutorials]
GO
SELECT [FirstName]
,[LastName]
,[Education]
,[Occupation]
,[YearlyIncome]
,NTILE(2) OVER (
PARTITION BY [Occupation]
ORDER BY [YearlyIncome] DESC
) AS [NTILE NUMBER]
FROM [Customers]

OUTPUT

ANALYSIS

Below statement will divide the selected data into partition using their Occupation. From the above you can observe, We got four partitions

PARTITION BY [Occupation]

Below statement will sort the partitioned data in the descending order using their [yearly income]

ORDER BY [YearlyIncome] DESC

In the below statement we used NTILE(2) Function with Partition by clause. Here integer value is 2, It means NTILE Function will assign two rank number for each individual partition (4 partitions).

NTILE(2) OVER (
PARTITION BY [Occupation]
ORDER BY [YearlyIncome] DESC
) AS [NTILE NUMBER]

Although the yearly income of is not same for 6 and 7 records (90000 and 80000), It has given same rank to them because NTILE(2) can assign only 2 two ranks for a partition.
SQL NTILE Function Example 2

In this example we will show you, What will happen if we assign different value atInteger_Value position. For instance, The following SQL Query will use the above example query but we changed the NTILE (2) to NTILE(3).

T-SQL CODE

USE [SQL Server Tutorials]
GO

SELECT [FirstName]
,[LastName]
,[Education]
,[Occupation]
,[YearlyIncome]
,NTILE(3) OVER (
PARTITION BY [Occupation]
ORDER BY [YearlyIncome] DESC
) AS [NTILE NUMBER]
FROM [Customers]

Above query will assign three rank number per partition.

OUTPUT

Although the yearly income of is same for 3 and 4 records (80000), It has given different rank to them because NTILE(3) will assign 3 ranks for a partition.
SQL NTILE Function without Partition By Clause 1

In this example we will show you, What will happen if we miss the Partition By Clause in NTILE() Function. For instance, The following SQL Query will use the above example query without Partition by clause.

T-SQL CODE

USE [SQL Server Tutorials]
GO

SELECT [LastName]
,[Education]
,[YearlyIncome]
,[Occupation]
,[FirstName]
,NTILE(2) OVER (
ORDER BY [YearlyIncome] DESC
) AS [NTILE NUMBER]
FROM [Customers]

OUTPUT

ANALYSIS

Below statement will sort the partitioned data in the descending order using their [yearly income]

ORDER BY [YearlyIncome] DESC

In the below statement we used NTILE(2) Function without Partition by clause. Here integer value is 2, It means NTILE Function will assign two rank number for each individual partition.

Since we havent used any partition, NTILE Function will consider all the 10 records as single partition. Next, It will assign 2 rank number to that 10 records.

NTILE(2) OVER (
PARTITION BY [Occupation]
ORDER BY [YearlyIncome] DESC
) AS [NTILE NUMBER]

SQL NTILE Function without Partition By Clause 2

In this example we will show you, What will happen if we assign different value at Integer_Value position. For instance, The following SQL Query will use the above example query but we changed the NTILE (2) to NTILE(3).

T-SQL CODE

USE [SQL Server Tutorials]
GO

SELECT [LastName]
,[Education]
,[YearlyIncome]
,[Occupation]
,[FirstName]
,NTILE(3) OVER (
ORDER BY [YearlyIncome] DESC
) AS [NTILE NUMBER]
FROM [Customers]

OUTPUT

Here we haven’t used any partition, NTILE Function will consider all the 10 records as single partition. Next, It will assign 3 rank number to that 10 records because we have givenInteger_Value as 3.

RANK

The RANK Function is one of the SQL Server ranking function. This function will assign the rank number to each record present in a partition.

NOTE: In SQL, If the RANK function encounters two equal values in the same partition, it will assign the same rank number to both values and skips the next number in the ranking.
SQL RANK Function Syntax

In SQL Server, The basic syntax of the RANK Function is:

SELECT RANK() OVER (PARTITION_BY_Clause ORDER_BY_Clause)
FROM [Source]

Partition_By_Clause: This will divide the records selected by the SELECT Statement into partitions.
If you specified the Partition By Clause then, RANK Function will assign the rank number to each partition.
If you haven't specified the Partition By Clause then, RANK Function will consider all the records as single partition so, it will assign the rank numbers from top to bottom.

Order_By_Clause: This is used to sort the Partitioned data into specified order. Please refer SQL Order By Clause for better understanding.

In this article we will show you, How to write RANK Function in SQL Server 2014 with example. For this, We are going to use the below shown data

SQL RANK Function Example

RANK Function allows you to assign the rank number to each record present in a partition. In this example we will show you, How to rank the partitioned records present in a table. The following SQL Query will partition the data by Occupation and assign the rank number using the yearly income.

T-SQL CODE

USE [SQL Server Tutorials]
GO

SELECT [FirstName]
,[LastName]
,[Education]
,[Occupation]
,[YearlyIncome]
,RANK() OVER (
PARTITION BY [Occupation]
ORDER BY [YearlyIncome] DESC
) AS RANK
FROM [Customers]

OUTPUT

ANALYSIS

Below statement will divide the selected data into partition using their Occupation. From the above you can observe, We got four partitions

PARTITION BY [Occupation]

Below statement will sort the partitioned data in the descending order using their [yearly income]

ORDER BY [YearlyIncome] DESC

In the below statement we used RANK() Function with Partition by clause so, Rank function will assign the rank numbers for each individual partition.

RANK() OVER (
PARTITION BY [Occupation]
ORDER BY [YearlyIncome] DESC
) AS RANK

It has given same rank to 3 and 4 records because their yearly income is same. Next, it skipped one rank and assigned 3rd rank to next record.
SQL RANK Function without Partition By Clause

In this example we will show you, What will happen if we miss the Partition By Clause in RANK () Function. For instance, The following SQL Query will use the above example query without Partition by clause.

T-SQL CODE

USE [SQL Server Tutorials]
GO

SELECT [FirstName]
,[LastName]
,[Education]
,[Occupation]
,[YearlyIncome]
,RANK() OVER (
ORDER BY [YearlyIncome] DESC
) AS RANK
FROM [Customers]

OUTPUT

ANALYSIS

Below statement will sort the yearly income data in the descending order

ORDER BY [YearlyIncome] DESC

In the below statement we used RANK() Function without Partition by clause so, Rank function will consider them as single partition and assign the rank numbers from begining to end.

RANK() OVER (
ORDER BY [YearlyIncome] DESC
) AS RANK

It has given same rank to 2, 3 and 4 records because their yearly income is same. Next, it skipped two ranks for those two records (3, 4) and assigned 5th rank to next record.
SQL RANK Function On String Column

RANK Function also allows you to rank the string columns. In this example, We are going to use the RANK() function to assign the rank numbers for [First name]

T-SQL CODE

USE [SQL Server Tutorials]
GO

SELECT [LastName]
,[Education]
,[YearlyIncome]
,[Occupation]
,[FirstName]
,RANK() OVER (
PARTITION BY [Occupation]
ORDER BY [FirstName] DESC
) AS RANK
FROM [Customers]

OUTPUT

If you observe the above screenshot, RANK Function has given same rank to 6 and 7 records because their First name is same. Next, it skipped one rank and assigned 3rd rank to next record.

ROW_NUMBER

The ROW_NUMBER Function is one of the SQL Server ranking function. This function will assign the sequential rank number to each unique record present in a partition.

NOTE: In SQL, If the ROW_NUMBER function encounters two equal values in the same partition, it will assign the different rank number to both values. Here rank number will depends up on the order they displayed.
SQL ROW_NUMBER Function Syntax

In SQL Server, The basic syntax of the ROW_NUMBER Function is:

SELECT ROW_NUMBER() OVER (PARTITION_BY_Clause ORDER_BY_Clause)
FROM [Source]

Partition_By_Clause: This will divide the records selected by the SELECT Statement into partitions.
If you specified the Partition By Clause then, ROW_NUMBER Function will assign the rank number to each partition.
If you haven't specified the Partition By Clause then, ROW_NUMBER Function will consider all the records as single partition so, it will assign the rank numbers from top to bottom.

Order_By_Clause: This is used to sort the Partitioned data into specified order. Please refer SQL Order By Clause for better understanding.

In this article we will show you, How to write ROW_NUMBER Function in SQL Server 2014 with example. For this, We are going to use the below shown data

SQL ROW_NUMBER Function Example

ROW_NUMBER Function allows you to assign the rank number to each record present in a partition. In this example we will show you, How to rank the partitioned records present in a table. The following SQL Query will partition the data by Occupation and assign the rank number using the yearly income.

T-SQL CODE

USE [SQL Server Tutorials]
GO

SELECT [FirstName]
,[LastName]
,[Education]
,[Occupation]
,[YearlyIncome]
,ROW_NUMBER() OVER (
PARTITION BY [Occupation]
ORDER BY [YearlyIncome] DESC
) AS [ROW NUMBER]
FROM [Customers]

OUTPUT

ANALYSIS

Below statement will divide the selected data into partition using their Occupation. From the above you can observe, We got four partitions

PARTITION BY [Occupation]

Below statement will sort the partitioned data in the descending order using their [yearly income]

ORDER BY [YearlyIncome] DESC

In the below statement we used ROW_NUMBER() Function with Partition by clause so, ROW_NUMBER function will assign the rank numbers for each individual partition.

ROW_NUMBER() OVER (
PARTITION BY [Occupation]
ORDER BY [YearlyIncome] DESC
) AS [ROW NUMBER]

Although the yearly income of is same for 3 and 4 records (80000), It has given different rank to them because ROW_NUMBER() assign unique ranks to each individual record.
SQL ROW_NUMBER Function without Partition By Clause

In this example we will show you, What will happen if we miss the Partition By Clause in ROW_NUMBER() Function. For instance, The following SQL Query will use the above example query without Partition by clause.

T-SQL CODE

USE [SQL Server Tutorials]
GO

SELECT [FirstName]
,[LastName]
,[Education]
,[Occupation]
,[YearlyIncome]
,ROW_NUMBER() OVER (
ORDER BY [YearlyIncome] DESC
) AS [ROW NUMBER]
FROM [Customers]

OUTPUT

ANALYSIS

Below statement will sort the yearly income data in the descending order

ORDER BY [YearlyIncome] DESC

In the below statement we used ROW_NUMBER() Function without Partition by clause so, Rank function will consider them as single partition and assign the rank numbers from beginning to end.

ROW_NUMBER() OVER (
ORDER BY [YearlyIncome] DESC
) AS [ROW_NUMBER]

SQL ROW_NUMBER Function On String Column

ROW_NUMBER Function also allows you to rank the string columns. In this example, We are going to use the ROW_NUMBER() function to assign the rank numbers for [First name]

T-SQL CODE

USE [SQL Server Tutorials]
GO

SELECT [LastName]
,[Education]
,[YearlyIncome]
,[Occupation]
,[FirstName]
,ROW_NUMBER() OVER (
PARTITION BY [Occupation]
ORDER BY [FirstName] DESC
) AS [ROW NUMBER]
FROM [Customers]

OUTPUT

If you observe the above screenshot, Although the first name is same for 6 and 7 records (John), It has given different rank to them because ROW_NUMBER() assign unique ranks to each individual record.

#################################################################################

Example - 4

################################################################################

Ranking Function

Ranking functions return a ranking value for each row in a partition. Depending on the function that is used, some rows might receive the same value as other rows. Ranking functions are non-deterministic and return Big Integer values.

This article is related to RANKING Function.

ROW_NUMBER function

The ROW_NUMBER function returns the row number over a named or unnamed window specification.

The ROW_NUMBER function does not take any arguments, and for each row over the window it returns an ever increasing BIGINT. It is normally used to limit the number of rows returned for a query. The LIMIT keyword used in other databases is not defined in the SQL standard, and is not supported.
Derby does not currently allow the named or unnamed window specification to be specified in the OVER() clause, but requires an empty parenthesis. This means the function is evaluated over the entire result set.
The ROW_NUMBER function cannot currently be used in a WHERE clause.
Derby does not currently support ORDER BY in sub queries, so there is currently no way to guarantee the order of rows in the SELECT sub query. An optimizer override can be used to force the optimizer to use an index ordered on the desired column(s) if ordering is a firm requirement.

RANK() function

If two or more rows tie for a rank, each tied rows receives the same rank.

For example, if the two top salespeople have the same SalesYTD value, they are both ranked one. The salesperson with the next highest SalesYTD is ranked number three, because there are two rows that are ranked higher. Therefore, the RANK function does not always return consecutive integers.

The sort order that is used for the whole query determines the order in which the rows appear in a result set.

DENSE_RANK()function

If two or more rows tie for a rank in the same partition, each tied rows receives the same rank.

For example, if the two top salespeople have the same SalesYTD value, they are both ranked one. The salesperson with the next highest SalesYTD is ranked number two. This is one more than the number of distinct rows that come before this row. Therefore, the numbers returned by the DENSE_RANK function do not have gaps and always have consecutive ranks.

NTILE() function

If the number of rows in a partition is not divisible by integer_expression, this will cause groups of two sizes that differ by one member. Larger groups come before smaller groups in the order specified by the OVER clause. For example if the total number of rows is 53 and the number of groups is five, the first three groups will have 11 rows and the two remaining groups will have 10 rows each. If on the other hand the total number of rows is divisible by the number of groups, the rows will be evenly distributed among the groups. For example, if the total number of rows is 50, and there are five groups, each bucket will contain 10 rows.

Example are mentioned bellow

CREATE TABLE #EXMAPLE_TABLE
( CODE INT IDENTITY(1,1) NOT NULL,
SNAME VARCHAR(50) NOT NULL,
GRADE CHAR(1) NOT NULL)

GO

INSERT INTO #EXMAPLE_TABLE(SNAME, GRADE)
VALUES
('Raja', 'A'),
('Prja', 'B'),
('Saja', 'C'),
('Khaja', 'A'),
('Baja', 'B'),
('Balaram', 'C'),
('Majhi', 'B'),
('Sajal', 'B'),
('Sarita', 'C'),
('Sarika', 'C')

SELECT ROW_NUMBER() OVER(ORDER BY GRADE) [ROW_NUMBER],
RANK() OVER(ORDER BY GRADE) [RANK],
DENSE_RANK() OVER(ORDER BY GRADE) [DENSE_RANK],
NTILE(4) OVER(ORDER BY GRADE) [NTILE],
SNAME, GRADE
FROM #EXMAPLE_TABLE

ROW_NUMBER RANK DENSE_RANK NTILE SNAME GRADE

1 1 1 1 Raja A
2 1 1 1 Khaja A
3 3 2 1 Baja B
4 3 2 2 Prja B
5 3 2 2 Majhi B
6 3 2 2 Sajal B
7 7 3 3 Sarita C
8 7 3 3 Sarika C
9 7 3 4 Saja C
10 7 3 4 Balaram C

SELECT ROW_NUMBER() OVER(PARTITION BY GRADE ORDER BY GRADE)[ROW_NUMBER],
SNAME, GRADE
FROM #EXMAPLE_TABLE

ROW_NUMBER SNAME GRADE

1 Raja A
2 Khaja A
1 Baja B
2 Prja B
3 Majhi B
4 Sajal B

1 Sarita C
2 Sarika C
3 Saja C
4 Balaram C

Normalization

Definition :

Normalization or data normalization is a process to organize the data into tabular format (database tables). A good database design includes the normalization, without normalization a database system may slow, inefficient and might not produce the expected result. Normalization reduces the data redundancy and inconsistent data dependency.

There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored. There are several benefits for using Normalization in Database.

Normalization is a process of minimizing redundant data from database by decomposing the columns of a table into multiple tables. This process increases the number of tables in the database but it minimizes the redundant (duplicate), inaccurate and inconsistent data in database. Decomposed tables are connected using relationships (primary keys and foreign keys) to reduce the insert, update and delete anomalies. During normalization process, database designer decides the actual structure of the tables and their relationships. It is easy to find and fix any design problem at this early stage. During normalization process, we have various normal forms with some conditions. A table must satisfy all conditions of that normal form to qualify be in that normal form. Lets discuss what is a normal form in next sub section.

Benefits :

Eliminate data redundancy
Improve performance
Query optimization
Faster update due to less number of columns in one table
Index improvement

Normal Forms:

We organize the data into database tables by using normal forms rules or conditions. Normal form defines a set of standard which must be followed for a good database design. During normalization process database designer converts the database tables in normal forms and check its functional dependency. Each normal form has some predefined standards which must be followed to qualify that normal form. To qualify each normal form, a table must qualify the previous normal form conditions first i.e. a table in 3NF must satisfy all conditions of 2NF which in turn must satisfy all conditions of 1NF. Generally we organize the data up to third normal form. We rarely use the fourth and fifth normal form.

Normal forms at a glance

Its time to summarize our reading. We have below image to summarize the reading on normal forms:

To understand normal forms consider the folowing unnormalized database table. Now we will normalize the data of below table using normal forms.

First Normal Form (1NF)

A database table is said to be in 1NF if it contains no repeating fields/columns. The process of converting the UNF table into 1NF is as follows:

Separate the repeating fields into new database tables along with the key from unnormalized database table.

The primary key of new database tables may be a composite key

1NF of above UNF table is as follows:

Second Normal Form (2NF)

A database table is said to be in 2NF if it is in 1NF and contains only those fields/columns that are functionally dependent(means the value of field is determined by the value of another field(s)) on the primary key. In 2NF we remove the partial dependencies of any non-key field. The process of converting the database table into 2NF is as follows:

Remove the partial dependencies(A type of functional dependency where a field is only functionally dependent on the part of primary key) of any non-key field.

If field B depends on field A and vice versa. Also for a given value of B, we have only one possible value of A and vice versa, Then we put the field B in to new database table where B will be primary key and also marked as foreign key in parent table.

2NF of above 1NF tables is as follows:

Third Normal Form (3NF) :

A database table is said to be in 3NF if it is in 2NF and all non keys fields should be dependent on primary key or We can also said a table to be in 3NF if it is in 2NF and no fields of the table is transitively functionally dependent on the primary key.The process of converting the table into 3NF is as follows:

Remove the transitive dependecies(A type of functional dependency where a field is functionally dependent on the Field that is not the primary key.Hence its value is determined, indirectly by the primary key )

Make separate table for transitive dependent Field.

3NF of above 2NF tables is as follows:

Boyce Code Normal Form (BCNF)

A database table is said to be in BCNF if it is in 3NF and contains each and every determinant as a candidate key.The process of converting the table into BCNF is as follows:

Remove the non trival functional dependency.

Make separate table for the determinants.

BCNF of below table is as follows:

Fourth Normal Form (4NF)

A database table is said to be in 4NF if it is in BCNF and primary key has one-to-one relationship to all non keys fields or We can also said a table to be in 4NF if it is in BCNF and contains no multi-valued dependencies.The process of converting the table into 4NF is as follows:

Remove the multivalued dependency.

Make separate table for multivalued Fields.

4NF of below table is as follows:

Fifth Normal Form (5NF)

A database table is said to be in 5NF if it is in 4NF and contains no redundant values or We can also said a table to be in 5NF if it is in 4NF and contains no join dependencies.The process of converting the table into 5NF is as follows:

Remove the join dependency.

Break the database table into smaller and smaller tables to remove all data redundancy.

5NF of below table is as follows:

Summary

In this article I try to explain the Normalization with example. I hope after reading this article you will be able to understand Normal Forms. I would like to have feedback from my blog readers. Please post your feedback, question, or comments about this article.

#############################################################################

Example - 2

#############################################################################

1st NF:

Removes repetitive groups
Create a PK

2nd NF:

Should be in 1st NF
Remove columns which create duplicate data in a table and creates a new table with Primary Key – Foreign Key relationship

3rd NF:

Should be in 2nd NF
Remove those non-key attributes (which are not PK) which can be derived from other non-key attributes.

Country can be derived from State also… so removing country

BCNF NF:

Should be in 3rd NF
If PK is composed of multiple columns then all non-key attribute should be derived from FULL PK only. If some non-key attribute can be derived from partial PK then remove it

#############################################################################

Example - 3

#############################################################################

First Normal Form – 1NF

A table can be in First Normal Form (1NF), when it satisfies below conditions:

All columns of a table must have a single atomic value in each row of a table i.e. a cell (intersection of rows and columns) in a table must contain a single atomic value.
Each row of a table should have an unique identifier to uniquely identify the rows of the table i.e. no two rows of a table can be identical in 1NF. A primary key or composite key can be used to uniquely identify records of the table.

To understand the concept of First Normal Form (1NF), consider this employee table for an example:

To bring the employee Table in First Normal Form; 1) Organize the ProjectId and ProjectName attribute values in single atomic values and then 2) EmpId and ProjectId attributes can uniquely identify each row of the employee table.

Have a look on employee table after bringing it in First Normal Form:

Note: EmpId and ProjectId uniquely identifies each row of employee table.

Second Normal Form – 2NF

A table can be in Second Normal Form (2NF), if satisfies below conditions:

The table must satisfy all the conditions of 1NF.
All non-key attributes (columns) are dependent on key columns i.e. Each column of the table must be fully functionally dependent on key column (or set of columns). Partial dependencies must be removed from the table in case table has a composite primary key.

Note: A table with a single column primary key is automatically in 2NF and does not need to be tested for partial dependency.

In continuation of employee table as an example, to bring this table in 2NF, we have to find and remove any partial dependency of non key columns (EmpName, ProjectName, Zip and City) on key columns (EmpId and ProjectId). We can see that column ProjectName functionally depends on ProjectId but not on EmpId attribute which shows a partial dependency in the table. To bring this table in Second Normal Form, we need to break this table in two tables, employee table and project table as below:

Employee Table in 2NF:

Third Normal Form – 3NF

A table must satisfy below conditions to be in Third Normal Form (3NF):

The table must satisfy all conditions of 2NF.
Transitive dependency of non-key attributes on key column must be removed i.e. if column A depends on column B and column B depends on column C, column A is transitively depends on column C. Any non-key column must not dependent on another non-key column.

To bring employee table in Third Normal Form (3NF), we have to find and remove any transitive dependency from this table. We can see that attribute City depends on Zip, and attribute Zip depends on EmpId which shows a transitive dependency of city attribute on EmpId. To bring this table in 3NF, split the employee table as below:

Employee table in 3NF:

Boyce-Codd Normal Form

Boyce-Codd Normal Form (BCNF) is an extension of 3NF with strict condition. A table must satisfy below conditions to be in BCNF:

Table must be in 3NF.
For any functional dependency X -> Y, X should be a super key.

In employee table, in dependency EmpId -> EmpName, Zip, EmpId is the super key and in Zip – City table, in dependency Zip -> City, Zip is the super key for city.

We might need to test the above BCNF functional dependency in tables which have overlapping composite candidate keys.

Microsoft Technologies

Pages

Wednesday, July 24, 2013

Temporary Tables in SQL Server

Ranking Functions in SQL Server

Normalization

About Me