Entity Relationship Model

What is the ER model?

The ER model is a high-level conceptual data model used to describe the structure of a database in terms of entities (things of interest), attributes (properties of those things), and relationships (how those things are associated). It helps designers capture data requirements visually before implementation.

Core concepts

1. Entity

An entity is a real-world object or concept that is distinguishable and relevant to the system (e.g., Student, Course, Employee).
Represented as a rectangle in diagrams.
Entity types: the class (e.g., Student). Entity instances (tuples/records) are members of that class (e.g., a specific student).

2. Attribute

A property of an entity (or relationship) — e.g., Student has StudentID, Name, DOB.
Types:
- Simple (atomic): cannot be divided (e.g., Age).
- Composite: composed of sub-parts (e.g., Address → Street, City, Zip).
- Derived: computed from other attributes (e.g., Age from DOB).
- Multi-valued: can have multiple values (e.g., PhoneNumbers).
Notation: oval attached to entity rectangle.

3. Key attribute

An attribute (or set) that uniquely identifies an entity instance (e.g., StudentID).
Shown by underlining attribute name.

4. Relationship

Association between two or more entities (e.g., ENROLLED_IN between Student and Course).
Shown as a diamond (Chen notation) or a line (crow’s foot).
Relationship attributes: attributes that belong to the relationship (e.g., Grade for ENROLLED_IN).

5. Cardinality / Participation constraints

Cardinality: numerical mapping between instances of participating entity sets.
- One-to-one (1:1) — each instance in A maps to at most one in B.
- One-to-many (1:N) — instance in A can map to many in B, but B maps to at most one in A.
- Many-to-many (M:N) — A maps to many B and B maps to many A.
Participation (total/partial):
- Total (mandatory): every entity instance must participate in the relationship.
- Partial (optional): participation is optional.
Notation: min..max or symbols (crow’s foot: one, many, optional).

6. Weak entity

An entity that cannot be uniquely identified by its own attributes alone; depends on a strong (owner) entity.
Has a partial key and an identifying relationship (double rectangle/diamond in diagrams).
Example: Dependent identified by (EmployeeID, DependentName).

7. Generalization / Specialization (IS-A)

Generalization: union of common attributes into a superclass.
Specialization: splitting into subclasses with additional attributes.
Types of inheritance:
- Total vs Partial (are all super instances members of subclasses?)
- Disjoint vs Overlapping (can an instance belong to multiple subclasses?)
Notation: triangle linking superclass and subclasses.

8. Aggregation

Treating a relationship as an abstract entity to participate in another relationship (useful for n-ary relationships).
Not widely used but handy for complex models.

9. Ternary / n-ary relationships

Relationships that involve three or more entities (e.g., Supplier–Part–Project).
Important to model explicitly; cannot always be decomposed into binary relationships without losing semantics.

ER diagram notations (brief)

Chen classical: rectangle = entity, oval = attribute, diamond = relationship.
Crow’s Foot / IE: entities = boxes, relationships = lines with symbols showing cardinality (crow’s foot = many).
UML class diagrams can also represent ER concepts for object-relational modeling.

Steps to design an ER model

Gather requirements: interview stakeholders, read documents.
Identify entities: nouns in the requirements often map to entities.
Identify attributes and keys for each entity.
Identify relationships between entities, and whether relationships have attributes.
Set cardinalities and participation constraints for each relationship.
Identify weak entities, specialization/generalization needs.
Refine: check for redundancy, normalize attributes, ensure clarity.
Map to relational schema (next section).

Mapping ER → Relational schema (rules & example)

Common mapping rules (Chen notation → relations):

Strong entity → create a table with entity’s attributes; primary key = key attribute.
- Student(StudentID PK, Name, DOB, Email)
Weak entity → table includes owner’s PK as foreign key + weak entity’s partial key as part of composite primary key.
- Dependent(EmployeeID FK, DependentName, DOB, PRIMARY KEY(EmployeeID, DependentName))
1:N relationship → include PK of the “one” side as foreign key in the table of the “many” side (if no attributes on relationship).
- Course(CourseID, ...), Student—if one student has many courses? Typically Enrollment table covers M:N; for 1:N apply the FK.
M:N relationship → create a junction table with FKs to both entities; include relationship attributes and composite PK (or surrogate).
- Entities: Student(StudentID), Course(CourseID)
- Relationship ENROLLED(StudentID, CourseID, EnrollDate, Grade), PK = (StudentID, CourseID)
1:1 relationship → put FK in either table, or create separate table if relationship has attributes or for optionality.
- If mandatory on both sides, could merge entities.
Ternary (n-ary) relationship → create a relation that includes FKs to all participating entities + relationship attributes. PK can be composite of the FKs if it uniquely identifies rows.
ISA (specialization) → options:
- Single table for superclass + subclass attributes nullable (single-table).
- Separate table per subclass with PK = superclass PK (joined tables).
- Table per class (redundant data) — less common.

Concrete example — University (ER → relations)

ER components (short):

Entities: Student(StudentID, Name, DOB), Course(CourseID, Title, Credits), Instructor(InstID, Name)
Relationship: ENROLL between Student and Course with attribute Grade (M:N)
Relationship: TEACH between Instructor and Course (1:N: one instructor can teach many courses)

Mapped relational schema:

Student(StudentID PK, Name, DOB)
Course(CourseID PK, Title, Credits, InstID FK) — if course has one instructor (1:N)
Instructor(InstID PK, Name)
Enroll(StudentID FK, CourseID FK, EnrollDate, Grade, PRIMARY KEY(StudentID, CourseID))

Constraints & integrity

Entity integrity: primary keys cannot be NULL.
Referential integrity: foreign keys must match existing primary key values or be NULL if optional.
Uniqueness: enforce unique constraints where needed.
Domain constraints: attribute value ranges/types.
Business rules: encoded as constraints/triggers or enforced in application logic (e.g., “a student can enroll in at most 6 courses per semester”).

Common modeling pitfalls

Overuse of multi-valued attributes (prefer separate entity/table).
Mistaking attributes for entities (e.g., making Address an attribute when it should be an entity if reused).
Ignoring relationship attributes (they sometimes become a separate relation).
Forcing binary relationships when a ternary relationship better represents semantics.
Not capturing optionality/participation — leads to wrong FK placement.

Tips & best practices

Always identify natural primary keys; if none, use surrogate keys (e.g., ID).
Normalize attributes (avoid repeating groups).
Use relationship attributes as a signal to implement a junction table.
Validate the model with real sample transactions/use-cases.
Keep diagrams readable — split complex models into modules or domains.
Document assumptions and business rules alongside the diagram.

Quick reference: when to create a separate table

Relationship has attributes → make a separate relation/junction table.
Many-to-many relationship → always require a junction table.
Multi-valued attribute → create a separate table (e.g., StudentPhone(StudentID FK, Phone)).

Computer Science

Search This Blog