What is normalization? Why normalization is required? Explain 1NF, 2NF, and 3NF with example.

This answer is restricted. Please login to view the answer of this question.

Login Now

Normalization is a database design technique that reduces data redundancy and eliminates undesirable characteristics like Insertion, Update, and Deletion Anomalies. Normalization rules divide larger tables into smaller tables and link them using relationships. The purpose of Normalisation in SQL is to eliminate redundant (repetitive) data and ensure data is stored logically.

Normalization is required because of the following reasons:

  • By using normalization redundancy of database or data duplication can be resolved.
  • We can minimize null values by using normalization.
  • Results in a more compact database (due to less data redundancy/zero).
  • Minimize/avoid data modification problems.
  • It simplifies the query.is
  • The database structure is clearer and easier to understand.
  • The database can be expanded without affecting existing data.
  • Finding, sorting, and indexing can be faster because the table is small and more rows can be accommodated on the data page.

Here are the most commonly used normal forms:

  • First normal form(1NF)
  • Second normal form(2NF)
  • Third normal form(3NF)
  • Boyce & Codd normal form (BCNF)
  • Fourth normal form(4NF)

1. First Normal Form (1NF):

For a table to be in the First Normal Form, it should follow the following 4 rules:

  1. It should only have single(atomic) valued attributes/columns.
  2. Values stored in a column should be of the same domain
  3. All the columns in a table should have unique names.
  4. And the order in which data is stored does not matter.

Example: Let’s say a company wants to store the names and contact details of its employees. It creates a table in the database that looks like this:

Emp_Id Emp_Name Emp_Address Emp_Mobile
101 Herschel New Delhi 8912312390
102 Jon Kanpur 8812121212,
9900012222
103 Ron Chennai 7778881212
104 Lester Bangalore 9990000123,
8123450987

This table is not in 1NFas the rule says “each attribute of a table must have atomic (single) values”, the Emp_Mobile values for employees Jon & Lester violates that rule.

To make the table complies with 1NF we need to create separate rows for each mobile number in such a way so that none of the attributes contains multiple values.

Emp_Id Emp_Name Emp_Address Emp_Mobile
101 Herschel New Delhi 8912312390
102 Jon Kanpur 8812121212
102 Jon Kanpur 9900012222
103 Ron Chennai 7778881212
104 Lester Bangalore 9990000123
104 Lester Bangalore 8123450987

2. Second Normal form (2NF):

For a table to be in the Second Normal Form,

  1. It should be in the First Normal form.
  2. And, it should not have Partial Dependency.

Example: Let’s say a school wants to store the data of teachers and the subjects they teach. They create a table Teacher that looks like this: Since a teacher can teach more than one subject, the table can have multiple rows for the same teacher.

Teacher_Id Subject Teacher_Age
111 Maths 38
111 Physics 38
222 Biology 38
333 Physics 40
333 Chemistry 40

Candidate Keys: {Teacher_IdSubject}
Non-prime attribute: Teacher_Age

This table is in 1 NF because each attribute has atomic values. However, it is not in 2NF because the non-prime attribute Teacher_Age is dependent on Teacher_Id alone which is a proper subset of the candidate key. This violates the rule for 2NF as the rule says “no non-prime attribute is dependent on the proper subset of any candidate key of the table”.

To make the table complies with 2NF we can disintegrate it in two tables like this:

Teacher_Details table:

Teacher_Id Teacher_Age
111 38
222 38
333 40

Teacher_Subject table:

Teacher_Id Subject
111 Maths
111 Physics
222 Biology
333 Physics
333 Chemistry

Now the tables are in the Second normal form (2NF).

3. Third Normal Form (3NF):

A table is said to be in the Third Normal Form when,

  1. It is in the Second Normal form.
  2. And, it doesn’t have Transitive Dependency.

Example: Let’s say a company wants to store the complete address of each employee, they create a table named Employee_Details that looks like this:

Emp_Id Emp_Name Emp_Zip Emp_State Emp_City Emp_District
1001 John 282005 UP Agra Dayal Bagh
1002 Ajeet 222008 TN Chennai M-City
1006 Lora 282007 TN Chennai Urrapakkam
1101 Lilly 292008 UK Pauri Bhagwan
1201 Steve 222999 MP Gwalior Ratan

Super keys: {Emp_Id}, {Emp_IdEmp_Name}, {Emp_IdEmp_NameEmp_Zip}…so on
Candidate Keys: {Emp_Id}

Non-prime attributes: all attributes except Emp_Id are non-prime as they are not part of any candidate keys.

Here, Emp_StateEmp_City & Emp_District dependent on Emp_Zip. Further Emp_zip is dependent on Emp_Id what makes non-prime attributes (Emp_State, Emp_City & Emp_District) transitively dependent on super key (Emp_Id). This violates the rule of 3NF.

To make this table complies with 3NF we have to disintegrate the table into two tables to remove the transitive dependency:

Employee Table:

Emp_Id Emp_Name Emp_Zip
1001 John 282005
1002 Ajeet 222008
1006 Lora 282007
1101 Lilly 292008
1201 Steve 222999

Employee_Zip table:

Emp_Zip Emp_State Emp_City Emp_District
282005 UP Agra Dayal Bagh
222008 TN Chennai M-City
282007 TN Chennai Urrapakkam
292008 UK Pauri Bhagwan
222999 MP Gwalior Ratan
If you found any type of error on the answer then please mention on the comment or report an answer or submit your new answer.
Leave your Answer:

Click here to submit your answer.

Discussion
0 Comments
  Loading . . .