1st Cut Data Model

The Entities/Attribues List is input to the data modeling process.

Before we begin modeling, we need to assign the "Candidate Identifiers" to the entities in the list. It is almost a universal standard the identifiers take the form "Id." Note that the identifier name "Id" is assigned to the Purchase Order entity. The reason for this is identiers should alsways take the form of your CPU registers. We want the relational math processing of the database management system (DBMS) to occur in the very fast CPU memory, not the slow-assed auxuliary storage. "PurchaseOrder.Id" becomes the primary identifier (computer friendly) and "PurchaseOrder.Number," the "legacy key," becomes the secondary primary identifier (people friendly). The same applies to the Item.Code.

We resolve the Compay-Office question with the business user by creating an "Office" candidate entity because the business user tells us the company has six Office with more planned.

The 1st cut data model is produced by placing an object in the model for each of the data groups identified on the Purchase Order. Don't struggle with "getting it right." At this point, we can't. We don't know enough. Just trust the process and begin.

    Purchase Order -- data occurs once
            Id; Number; Date; Status; Approved By; Requested By; DeliveryDate; Notes;
            Total

    Company -- data occurs once
            Id; Name

    Office -- data occurs once
            Id; Name; Street; Suite; City; State; ZipCode; Country; Phone; Attention

    Supplier -- data occurs once
            Id; Name; Street; Suite; City; State; ZipCode; Country; Terms; Phone; Email

    Employee -- data occurs once
            Id; Name; Phone

    Department -- data occurs once
           Id; Name

    Item -- data repeats and entity has no relationship to its parent.
           Id; Name; Code; Quantity; Price; Discount; Total

3. Produce the initial data model:

I'm using a freebee data modeling tool, so the notation for the one-to-one, one-to-many and many-to-many relationships are not my first choice, but they will work:

    The one-to-many connector has an arrowhead showing the "many end" of the
    relationship. The arrowhead shows the location of a foreign identifier.

    The many-to-many relationship has arrowheads at either end. Do we have foreign
    identifiers in both entities? Absolutely not! A many-to-many relationship is resolved
    by an associative entity (AKA junction, intersection, bridge, edge entity,
    depending on whose book you read.) we haven't seen yet. Be patient.

Note: There are no one-to-one relationships in a database. If you think you are looking at a one-to-one, you're looking at two parts of the same entity. Collapse all the attributes into a single entity and name the resulting entity appropriately.

Begin the data model by placing the business object entity in the middle of the page and the other entities around it. Now we have to draw in the realtionships by asking a series of questions about the relationships of the entities.

While creating our 1st cut data model we can start applying the tests for Normalization.

Normalized data is data "free of relational math errors." There are three test we apply to each entity as we evolve the data model. An entity is said to be in the 1st Normal Form when it contains "No repeating data," also stated as "no repeating groups." If all of the attributes in the entity can contain one and only one value, the entity is said to be in the 1st Normal Form. If the identifier changes, the data changes. Repeating data are attributes that can have more than one value in an entity and don't belong there. They belong in some other entity.

    Question: "Over time, will the Company have one, or more than one Office?"
    Answer: The Company will have, over time, have many Office(s).
    Result: The relationship, over time, between the Company entity and the Office entity
    is one-to-many, from the Customer to the Office.

Then we ask the same series of questions from the other end of the realtionship.

    Question: "Over time, will an Office repot to one, or more than one Company?"
    Answer: An Office will, over time, report to one Company.
    Result: The relationship between the Office and the Customer is many-to-one,
    from the Office to the Company.

Let's do one more.

    Question: "Over time, will the Department have one, or more than one Employee?"
    Answer: The Department will have, over time, many Employee(s).
    Result: The relationship, over time, between the Department entity and the Employee
    entity is one-to-many, from the Department to the Employee.

Then we ask the same series of questions from the other end of the realtionship.

    Question: "Over time, will an Employee be assigned to one, or more than one
    Department?"
    Answer: An Employee may, over time, be assigned to to many Departments.
    Result: The relationship over time, between the Employee entity and the Department
    entity is one-to-many, from the Department to the Employee.

We have a one-to-many relationship from each entity. That's a many-to-many relationshiop and we have to adjust the data model. This relationship is resolved by building a candidate entity between the Department and the Employee and named the DepartmentEmployee or EmployeeDepartment. Because the name implies the structure of the foreign identifiers, select one of the two names. When we investigate and calculate "Loads" later in this process we will be able to correctly structure the proper identifier (key).

4. Produce the "initial" 2nt cut data model by resolving all of the many-to-many relationships.

We have produced our 2nd cut data model with the many-to-many relationships resolved. Now we have to update our Entity-Attribute list with the three new candidate entities and create the relationships (identifiers). Remember, the arrowheads tell us wher the foreign identifiers are located.

    Purchase Order -- data occurs once
            Id; Number; Date; Status; Approved By; Requested By; DeliveryDate; Notes;
            Office.Id; Supplier.Id; Total

    Company -- data occurs once
            Id; Name

    Office -- data occurs once
            Id; Name; Street; Suite; City; State; ZipCode; Country; Phone; Attention;
            Company.Id

    OfficeDepartment -- data occurs once
            Office.Id; Department.Id;

    Supplier -- data occurs once
            Id; Name; Street; Suite; City; State; ZipCode; Country; Terms; Phone; Email

    PurchaseOrderEmployee -- data occurs once
            PurchaseOrder.Id; Employee.Id; Role

    Employee -- data occurs once
            Id; Name; Phone

    DepartmentEmployee -- data occurs once
            Department.Id; Employee.Id; Title

    Department -- data occurs once
           Id; Name

    Item -- Does have repeating data.
           PurchaseOrder.Id; Item.Id; Item.Name; Item.Code; Quantity; Item.Price; Discount;
           Total;

All of our entities now pass the test for 1st Normal Form with the exception of the Item. In each of the other entities, all of the attributes are dependant on the identifier(s). If the identifier(s) change, the data changes.

Now, we have to test those entities that compound candidate identifiers (compound keys) and we do that on the nest page, 2nd Normal Form.