Applying Referential Integrity to a Database

Correct Table Linking at Start is Essential Cost-Effectiveness

© Harry P. Schlanger

Nov 29, 2008
One-To-Many Relationship, Harry P. Schlanger
When not properly linked, badly connected tables can cause inefficiency to data retrieval. Eventually the database needs to be fixed by hand, at cost.

Data referential integrity is often not well understood in database design. When integrity is not properly applied, it can cause a lot of inefficiency to data retrieval. Eventually the table data needs to be fixed. Duplicate records must be located manually by using queries, and removed. “Orphan” records (due to deleted records) must be repaired.

All this effort requires time and energy, and costs money to fix. It is much more cost-effective to properly plan the design and implementation of the database in the first place.

This article reviews table relationships, focuses on correct table linking and identifies the consequences for failing to do so.

Implement in Correct Sequence

A database design to correctly plan and implement table relationships with referential integrity has previously been described by this author. The design begins with identifying entities from business data; tables are designed following conventions such as normalization, and then linked with each other. Only then are tables populated with data.

Table Relationship Types

In theory, there are three types of relationships:

  • One-to-many
  • Many-to many
  • One-to-one

In practice, the many-to-many relationship breaks down into a number of one-to-many relationships. The one-to-one relationship is not useful and is usually absorbed into another table. Therefore, we are left with just the one-to-many relationship to contemplate.

Fortunately, this table arrangement is easy to conceptualize. Figure 1 shows a simple one-to-many example that anyone can relate to. Using the Access 2007 database, a table named tblSong is used to record unique songs (i.e. one), while songs that are being played one or more times (i.e. many) are recorded in the linked table named tblSongPlayed.

Enabling Referential Integrity

The link in Figure 1 is achieved by dragging the primary key SongID onto the corresponding "foreign key" in tblSongPlayed. This causes a panel to show giving the option to check the "Enforce Referential integrity" checkbox.

If this is not checked, the link is still made but a thin line is shown. If checked, a heavy line results (see Figure 1). If there is existing data already in the tables, Access 2007 checks that there are matching IDs in both tables, otherwise a dialog message will be returned advising of the inability to make the connection with referential integrity.

How to "Clean up" Data

When data exists in tables and a sound link cannot be made, the data has to be corrected. There are often two types of problems to consider:

  • Duplicate records have been allowed
  • "Orphan" records exist due to deleted records

A query needs to be run to check for and locate duplicate records. When duplicates are found, they need to be removed manually or with a delete query. Careful judgement must be exercised and decisions to remove records made. The degree of difficulty for these tasks depend on

  • complexity of the query (number of tables involved)
  • number of records in participating tables

Similarly a check must be made to locate orphan records. This requires a different query to be run. It should be obvious by now that it is advisable to start any new database by leaving tables empty, then connecting tables with referential integrity, then lastly, populating the tables.

Summary and Conclusion

In an Access 2007 database, the heavy line between two tables indicates that referential integrity is in place. This is achieved by clicking the appropriate checkbox in the panel that opens up when the link is made.

Using a query, one may detect the soundness of such table connections, by checking whether the line is a thin or heavy line. In the case of an unsound link, the data in tables should be "cleaned up", and tables connected with proper referential integrity. The latter prevents data corruption in the database, ensuring no attendant, costly exercise to clean up data. Therefore, it makes sense to implement an initial database that has proper data integrity before any of the tables are populated.


The copyright of the article Applying Referential Integrity to a Database in Database Programming is owned by Harry P. Schlanger. Permission to republish Applying Referential Integrity to a Database in print or online must be granted by the author in writing.


Figure 1. One-To-Many Relationship, Harry P. Schlanger
One-To-Many Relationship, Harry P. Schlanger
     


Post this Article to facebook Add this Article to del.icio.us! Digg this Article furl this Article Add this Article to Reddit Add this Article to Technorati Add this Article to Newsvine Add this Article to Windows Live Add this Article to Yahoo Add this Article to StumbleUpon Add this Article to BlinkLists Add this Article to Spurl Add this Article to Google Add this Article to Ask Add this Article to Squidoo