Normalization entails organizing the
columns (attributes) and
tables (relations) of a database to ensure that their
dependencies are properly enforced by database integrity constraints. It is accomplished by applying some formal rules either by a process of
synthesis (creating a new database design) or
decomposition (improving an existing database design).
=> quá trình chuẩn hóa DB sẽ áp dụng 1 số luật (các dạng chuẩn) để thiết kế các bảng cơ sở dữ liệu sao cho có thể giảm thiếu tối đa dư thừa dữ liệu, sự chính xác của truy vấn dữ liệu, tăng performance truy xuất DB,..
FIRST NORMAL FORM
To be in first normal form (1NF), a table must have the following qualities:
- The table is two-dimensional with rows and columns.
- Each row contains data that pertains to some thing or portion of a thing.
- Each column contains data for a single attribute of the thing it’s describing.
- Each cell (intersection of a row and a column) of the table must have only a single value.
- Entries in any column must all be of the same kind. If, for example, the entry in one row of a column contains an employee name, all the other rows must contain employee names in that column, too.
- Each column must have a unique name.
- No two rows may be identical (that is, each row must be unique).
- The order of the columns and the order of the rows are not significant.
A table (relation) in first normal form is immune to some kinds of modification anomalies but is still subject to others. The SALES table is in first normal form, and the table is subject to deletion and insertion anomalies. First normal form may prove useful in some applications but unreliable in others.
=> Dạng chuẩn 1 đảm bảo mỗi dòng trong bảng DB có 1 khóa chính (khóa đơn) và giá trị các cột còn lại chứa đựng giá trị đơn, tuy nhiên vẫn có hiện tượng dữ liệu bị lặp lại trong bảng csdl.
SECOND NORMAL FORM
To appreciate second normal form, you must understand the idea of functional dependency. A functional dependency is a relationship between or among attributes. One attribute is functionally dependent on another if the value of the second attribute determines the value of the first attribute. If you know the value of the second attribute, you can determine the value of the first attribute.
Suppose, for example, that a table has attributes (columns) StandardCharge, NumberOfTests, and TotalCharge that relate through the following equation:
TotalCharge = StandardCharge * NumberOfTests
TotalCharge is functionally dependent on both StandardCharge and NumberOfTests. If you know the values of StandardCharge and NumberOfTests, you can determine the value of TotalCharge.
Every table in first normal form must have a unique primary key. That key may consist of one or more than one column. A key consisting of more than one column is called a composite key. To be in second normal form (2NF), all non-key attributes must depend on the entire key. Thus, every relation that is in 1NF with a single attribute key is automatically in second normal form.
If a relation has a composite key, all non-key attributes must depend on all components of the key. If you have a table where some non-key attributes don’t depend on all components of the key, break the table up into two or more tables so that — in each of the new tables — all non-key attributes depend on all components of the primary key.
Sound confusing? Look at an example to clarify matters. Consider the SALES table. Instead of recording only a single purchase for each customer, you add a row every time a customer buys an item for the first time. An additional difference is that charter customers (those with Customer_ID values of 1001 to 1007) get a discount off the normal price.
Customer_ID does not uniquely identify a row. In two rows, Customer_ID is 1001. In two other rows, Customer_ID is 1010. The combination of the Customer_ID column and the Product column uniquely identifies a row. These two columns together are a composite key.
If not for the fact that some customers qualify for a discount and others don’t, the table wouldn’t be in second normal form, because Price (a non-key attribute) would depend only on part of the key (Product). Because some customers do qualify for a discount, Price depends on both CustomerID and Product, and the table is in second normal form.
=> Dạng chuẩn này cho biết, trong trường hợp có khóa phức thì tất cả các thuộc tính còn lại chỉ phụ thuộc vào khóa phức đó. Do đó, nếu có các cột chỉ phục thuộc vào 1 phần của khóa chính phức thì cần lấy ra các cột này để đưa vào một bảng DB mới.
This table has a composite primary key [Customer ID, Store ID]. The non-key attribute is [Purchase Location]. In this case, [Purchase Location] only depends on [Store ID], which is only part of the primary key. Therefore, this table does not satisfy second normal form.
To bring this table to second normal form, we break the table into two tables, and now we have the following:
What we have done is to remove the partial functional dependency that we initially had. Now, in the table [TABLE_STORE], the column [Purchase Location] is fully dependent on the primary key of that table, which is [Store ID].
THIRD NORMAL FORM
Tables in second normal form are especially vulnerable to some types of modification anomalies — in particular, those that come from transitive dependencies.
A transitive dependency occurs when one attribute depends on a second attribute, which depends on a third attribute. Deletions in a table with such a dependency can cause unwanted information loss. A relation in third normal form is a relation in second normal form with no transitive dependencies.
Look again at the SALES table, which you know is in first normal form. As long as you constrain entries to permit only one row for each Customer_ID, you have a single-attribute primary key, and the table is in second normal form. However, the table is still subject to anomalies. What if customer 1010 is unhappy with the chlorine bleach, for example, and returns the item for a refund?
You want to remove the third row from the table, which records the fact that customer 1010 bought chlorine bleach. You have a problem: If you remove that row, you also lose the fact that chlorine bleach has a price of $4. This situation is an example of a transitive dependency. Price depends on Product, which, in turn, depends on the primary key Customer_ID.
Breaking the SALES table into two tables solves the transitive dependency problem. The two tables make up a database that’s in third normal form.
=> nếu table có 1 số cột có tính phụ thuộc bắc cầu ví dụ a=b*c thì các thuộc tính b và c cần bị loại bỏ bảng cũ và tách ra thành 1 bảng mới.
0 comments:
Post a Comment