10 things to avoid in data model 09242010

20
Ten Things to Avoid in a Data Model Dr. Michael Blaha Modelsoft Consulting Corp www.modelsoftcorp.com E-mail: [email protected]

Upload: erwin-modeling

Post on 24-Jan-2015

557 views

Category:

Technology


0 download

DESCRIPTION

 

TRANSCRIPT

Page 1: 10 things to avoid in data model 09242010

Ten Things to Avoid in a Data Model

Dr. Michael BlahaModelsoft Consulting Corpwww.modelsoftcorp.com

E-mail: [email protected]

Page 2: 10 things to avoid in data model 09242010

PAGE 2

Introduction

• A model is an abstraction of some aspect of a problem.

• A data model is a model that describes how data is represented and accessed, usually for a database.

– Data modeling can be a difficult task and is often pivotal to the success or failure of a project.

• There are many pitfalls to data modeling as we will explain...

– Strategic pitfalls.

– Detailed pitfalls.

• We do not discuss detailed modeling constructs such as keys, data types, nullability, and referential integrity.

Page 3: 10 things to avoid in data model 09242010

PAGE 3

Strategic Pitfalls…

Page 4: 10 things to avoid in data model 09242010

PAGE 4

Strategic Pitfall: Vague Purpose

• Don’t build a model without understanding the business rationale.

• The purpose for a model dictates the level of detail.

– Just entities and relationships.

– Fully attributed.

– With data types and constraints.

• The purpose also dictates the level of polish, the degree of completeness, and the amount of time justified.

• Different kinds of data models.

– Detailed application model for development.

– Rough application for a purchase spec.

– Enterprise model for integration.

• This pitfall might seem obvious, but I’ve seen modeling efforts with little business purpose and no clear definition of deliverables.

Page 5: 10 things to avoid in data model 09242010

PAGE 5

Strategic Pitfall: Literal Modeling

• Your job is not to do what the customer says. Your job is to solve the problem that the customer is imperfectly describing.

• You must pay attention to the hidden true requirements.

• You must interpret and abstract what the customer tells you.

– You must recognize arbitrary business decisions that could easily change.

• You can raise abstraction by thinking in terms of patterns.

• The use case mentality really misses this point.

Page 6: 10 things to avoid in data model 09242010

PAGE 6

Strategic Pitfall: Literal Modeling Example

• The original model is correct, but has problems. What happens if a person gets promoted to a supervisor and then to a manager? Are there multiple records? Movement of a record? Or???

• The improved model is more abstract and softcodes the management hierarchy.

Originalliteralmodel

Improvedabstractmodel

Page 7: 10 things to avoid in data model 09242010

PAGE 7

Strategic Pitfall: Large Size

• Avoid large models. Limit a model to no more than 200 tables.

• Large models involve more work.

• Is the large size really justified or can you simplify the model with abstraction?

• I rarely encounter a large model with a compelling justification.

• I don’t see this step in software development methodologies, but it is certainly needed.

Page 8: 10 things to avoid in data model 09242010

PAGE 8

Strategic Pitfall: Speculative Content

• Do not include content that is not needed now and “might be helpful” in the future..

• All this does is to make a model larger, increase development time, and raise cost.

• A model must fully address the requirements, but not greatly exceed them.

• A quality model should be readily extended, so there is no need to add content in advance of need.

• Speculative content runs counter to the philosophy of agile development.

Page 9: 10 things to avoid in data model 09242010

PAGE 9

Strategic Pitfall: Lack of Clarity

• A relational database is declarative. Declare data in your models.

• A domain is the set of possible values for an attribute.

– ERwin lets you define domains and then assign them to the pertinent attributes.

• An enumeration is a domain that has a finite set of values.

– Declare enumerations in your databases.

• Don’t store data structures with a binary encoding.

• Don’t use cryptic names.

• Don’t use anonymous fields that application code must interpret.

• Obfuscation can happen through sloppy development practices.

Page 10: 10 things to avoid in data model 09242010

PAGE 10

Strategic Pitfall: Lack of Clarity Example

Enumerationstored inplace

Enumerationstoredseparately

Car table

1

2

3

carID year color weight

2001

1989

2000

red

red

blue

2000

1500

2500

Car table

1

2

3

carID year colorID weight

2001

1989

2000

1

1

3

2000

1500

2500

Color table

1

2

3

colorID color

red

green

blue

Enumerationencoded

Car table

1

2

3

carID year color weight

2001

1989

2000

1

1

3

2000

1500

2500

Page 11: 10 things to avoid in data model 09242010

PAGE 11

Detailed Pitfalls…

Page 12: 10 things to avoid in data model 09242010

PAGE 12

Detailed Pitfall: Reckless Violation of Normal Forms

• Do not accidentally violate normal forms.

• A normal form is a guideline that increases data consistency.

• As tables satisfy higher levels of normal forms, they are less likely to store redundant or contradictory data.

• Denormalization is only justified when there is a major performance bottleneck, such as for data warehouses.

• Be suspicious of large tables (30 attributes or more).

• Be suspicious of any entity type that is difficult to define.

• It is acceptable to violate normal forms deliberately, when there is a good reason to do so.

Page 13: 10 things to avoid in data model 09242010

PAGE 13

Detailed Pitfall: Normal Forms Example

• The contact position and contact phone depend on the contact name.

• The contact name depends on customerPK.

Violatesnormalform

Satisfiesnormalform

Page 14: 10 things to avoid in data model 09242010

PAGE 14

Detailed Pitfall: Needless Redundancy

• Be careful with redundancy.

– Redundancy across applications.

– Redundancy within an application.

• Normal forms are one aspect of redundancy.

• Ideally there should be a single recording of each data item. (Rarely is this completely feasible.)

• Organizations are rife with applications that overlap in awkward and loosely controlled ways.

– This is a major justification for data warehouses.

• Don’t include redundant data to compensate for a poorly conceived application.

• Redundant data is acceptable if you use built-in database features to keep redundant data consistent (such as materialized views).

Page 15: 10 things to avoid in data model 09242010

PAGE 15

Detailed Pitfall: Parallel Attributes

• Avoid parallel attributes for non-data-warehouse applications.

• Parallel attributes often codify arbitrary business decisions, reducing information system flexibility.

• Widespread use of parallel attributes often indicates a poor model.

Parallel attributes Parameterized model

Page 16: 10 things to avoid in data model 09242010

PAGE 16

Detailed Pitfall: Symmetric Relationships

• Avoid symmetric relationships for relational databases.

• Promote a symmetric relationship to an entity type.

• Otherwise double entry or double search.

• Symmetric relationships can be acceptable for programming.

Symmetric relationship Promotion to an entity type

Page 17: 10 things to avoid in data model 09242010

PAGE 17

Detailed Pitfall: Anonymous Fields

• As much as possible, clearly describe the data being stored and avoid anonymous fields.

• How to distinguish the city of Chicago from Chicago street?

• May need to parse a field to separate city, state, and postal code.

• A few incidental user-defined fields are OK.

fragment of Location table

456 Chicago Street

198 Broadway Dr.

123 Main Street

locationAddress1 locationAddress2 locationAddress3

Decatur, IL xxxxx

Suite 201

Cairo, IL xxxxx

Chicago, IL xxxxx

Chicago, IL xxxxx

Page 18: 10 things to avoid in data model 09242010

PAGE 18

Summary

• Data modeling is often a pivotal task in building a database application.

• A data model determines an application’s data quality, extensibility, and performance — and influences whether the application has a chance at business success.

• You can improve your data models if you pay attention to the pitfalls we have covered.

Page 19: 10 things to avoid in data model 09242010

PAGE 19

Speaker Bio

• Since 1994 Dr. Michael Blaha has been a consultant and trainer in conceiving, architecting, modeling, designing, and tuning databases for dozens of organizations throughout the world.

• He has authored six U.S. patents, five widely used books, and many papers.

• His most recent book, Patterns of Data Modeling, was published in June 2010.

• Blaha received his doctorate from Washington University in St. Louis and is an alumnus of GE Global Research in Schenectady, NY.

• You can contact him at [email protected] and www.modelsoftcorp.com.

Page 20: 10 things to avoid in data model 09242010

PAGE 20

Questions?