According to William H. Inmon, a leading architect in the construction of data warehouse
systems, “A data warehouse is a subject-oriented, integrated, time-variant, and
nonvolatile collection of data in support of management’s decision making process.”
Subject-oriented: A data warehouse is organized around major subjects, such as customer,
supplier, product, and sales. Rather than concentrating on the day-to-day operations and
transaction processing of an organization, a data warehouse focuses on the modeling
and analysis of data for decision makers. Hence, data warehouses typically provide
a simple and concise view around particular subject issues by excluding data that
are not useful in the decision support process.
Integrated: A data warehouse is usually constructed by integrating multiple heterogeneous
sources, such as relational databases, flat files, and on-line transaction records. Data
cleaning and data integration techniques are applied to ensure consistency in naming
conventions, encoding structures, attribute measures, and so on.
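
To make the integration step concrete, the following minimal sketch maps records from two sources onto one warehouse schema. Everything in it is a hypothetical illustration rather than part of any particular system: the source names (`crm_rows`, `billing_rows`), their field names, the gender codes, and the target schema are all assumptions chosen to show a naming conflict, an encoding conflict, and a unit-of-measure conflict.

```python
# Hypothetical rows from two operational sources that describe the same
# kind of entity (a customer) with different conventions.
crm_rows = [{"cust_id": 1, "sex": "M", "balance_usd": 120.50}]
billing_rows = [{"CustomerID": 2, "gender": 0, "balance_cents": 9900}]

# Assumed encodings: the CRM uses letters, billing uses 1/0.
CRM_GENDER = {"M": "male", "F": "female"}
BILLING_GENDER = {1: "male", 0: "female"}


def from_crm(row):
    """Map a CRM row onto the unified (assumed) warehouse schema."""
    return {
        "customer_id": row["cust_id"],            # naming convention
        "gender": CRM_GENDER[row["sex"]],         # encoding structure
        "balance_usd": float(row["balance_usd"]),
    }


def from_billing(row):
    """Map a billing row onto the same schema, converting cents to dollars."""
    return {
        "customer_id": row["CustomerID"],          # naming convention
        "gender": BILLING_GENDER[row["gender"]],   # encoding structure
        "balance_usd": row["balance_cents"] / 100, # attribute measure
    }


# The integrated collection uses one name, one encoding, and one unit per attribute.
warehouse_rows = [from_crm(r) for r in crm_rows] + [from_billing(r) for r in billing_rows]
```

After this step every row in `warehouse_rows` has the same keys and the same value conventions, regardless of which source it came from.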
Time-variant: Data are stored to provide information from a historical perspective
(e.g., the past 5–10 years). Every key structure in the data warehouse contains,
either implicitly or explicitly, an element of time.
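
One way to picture a key that carries an explicit element of time is a table keyed by (subject, date). The sketch below is only an illustration under that assumption; the `sales` table, its values, and the `history` helper are hypothetical.

```python
from datetime import date

# Hypothetical fact table: each key pairs a product with the date the
# figure refers to, so the key itself contains an element of time.
sales = {
    ("widget", date(2021, 1, 31)): 100,
    ("widget", date(2022, 1, 31)): 140,
    ("gadget", date(2022, 1, 31)): 75,
}


def history(product, start, end):
    """Return (date, amount) pairs for one product within [start, end], oldest first."""
    return sorted(
        (day, amount)
        for (name, day), amount in sales.items()
        if name == product and start <= day <= end
    )


# Usage: look at one product from a historical perspective.
widget_history = history("widget", date(2021, 1, 1), date(2022, 12, 31))
```

Because older entries are kept alongside newer ones instead of being overwritten, the same product can be compared across periods.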
Nonvolatile: A data warehouse is always a physically separate store of data transformed
from the application data found in the operational environment. Due to this
separation, a data warehouse does not require transaction processing, recovery,
and concurrency control mechanisms. It usually requires only two operations in data
accessing: initial loading of data and access of data.
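
The two operations named above can be sketched as a store that exposes only loading and read access. This is a toy illustration, not how a real warehouse is implemented; the class name `NonvolatileStore` and its methods `load` and `query` are hypothetical.

```python
class NonvolatileStore:
    """Toy store offering only the two operations: load and access."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        """Loading: append a batch of already-transformed rows."""
        # Copy each row so later changes in the source do not leak in.
        self._rows.extend(dict(row) for row in rows)

    def query(self, predicate):
        """Access: return copies of the rows that satisfy the predicate."""
        # Copies keep callers from modifying stored data in place.
        return [dict(row) for row in self._rows if predicate(row)]


# Usage: load once, then read as often as needed.
store = NonvolatileStore()
store.load([
    {"product": "widget", "units": 3},
    {"product": "gadget", "units": 5},
])
large_orders = store.query(lambda row: row["units"] > 3)
```

The class deliberately has no update or delete method, which mirrors the point that the stored data are read but not modified in place.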