As Codd states, the value of NULL by itself lacks sufficient specificity of meaning. Does it imply that a field value is simply missing for a particular record, but might yet take on value or might have previously had value; or does it imply that the field value is not applicable for that record and never will be. (This lack of clarity creates that open/closed debate.) For the sake of clarity in a data warehouse, where data values are expected to provide a clear, complete, and accurate representation of business information (closed world), allowing null values create inappropriate ambiguity.
Four different types of default values should be considered in place of a null value:
- A logically accurate value
- Well-defined "unknown" (Codd's A-value)
- Well-defined "not applicable" (Codd's I-value)
- Source specific default
The biggest challenges in applying this rule will be to understand the underlying business operations and make a determination of what will be the most appropriate and most intuitive values for users to interact with. Be very careful to consider what impact those choices will have on how users will access information in the data warehouse. If it makes simple questions more complciated to answer, then it is probably a poor choice.
Word Count: 9,401