The IQ Exchange For Intelligence Systems, AI, Expert System, Virtual Reality


Building QA Into Business Intelligence

Quality Data Means Quality Business Intelligence

Data warehouse quality is generally defined not in terms of the data itself, but in terms of the warehouse's broader ability to satisfy its customer base.

The best time to begin building in quality is before the warehouse is first developed.

Developing a data warehouse is an iterative process: measure what has been done, see where you are, make adjustments, and plan the next iteration using the measurement data.

Quality control means meeting customer expectations without exceeding them. Exceeding needs is very costly and yields little, if any, additional value or return.

Measuring is the only way to determine if you are improving over time. Data warehousing is a process, hence process-oriented measures should be used.

Measure the level of activity and how long it lasts, rather than relying on product measures such as volumes of data or instances of access to the data warehouse.
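As a minimal sketch of a process-oriented measure, the snippet below summarizes activity level and duration from a session log. The log layout, the user names, and the `activity_summary` function are assumptions made for illustration; they do not come from any particular warehouse tool.

```python
from datetime import datetime

# Hypothetical session log: (user, login time, logout time).
SESSIONS = [
    ("ana", datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 9, 45)),
    ("ben", datetime(2024, 1, 8, 10, 0), datetime(2024, 1, 8, 10, 15)),
    ("ana", datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 15, 0)),
]


def activity_summary(sessions):
    """Summarize who was active and for how long (in minutes)."""
    minutes = [(end - start).total_seconds() / 60 for _, start, end in sessions]
    return {
        # Level of activity: how many distinct users worked with the warehouse.
        "active_users": len({user for user, _, _ in sessions}),
        # Duration: total and average time spent per session.
        "total_minutes": sum(minutes),
        "avg_minutes": sum(minutes) / len(minutes) if minutes else 0.0,
    }


summary = activity_summary(SESSIONS)
print(summary)
```

Tracking these figures per iteration, rather than raw data volumes, gives a baseline to compare against after each adjustment.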

Individual measures feed into larger sets of metrics, encompassed by an overall data warehouse quality program.


Quality, Measurement, and Data Warehousing

Quality is not free; however, measurement does not cost as much as bad quality.

The cost of a quality program includes:

  • planning
  • implementation
  • ongoing measurement
  • re-planning, to accommodate changing business needs

The business value in data warehousing is in the right decisions being taken and the right action being performed.

The primary DW measurement is therefore in terms of the business impact as a result of the warehouse.


Data Warehouse Success Measures

To understand quality [what you did right and wrong], you need “meta data” about what you're doing. Successful measurement is the key to warehouse quality.

There are three types of success as they relate to data warehousing:

  1. Economic success - the data warehouse has a positive impact on the bottom line.
  2. Political success - people like what you've done, and they use it.
  3. Technical success - this is the easiest to accomplish. It means the chosen technologies are appropriate for the task and are applied correctly.


Data Warehouse Quality Measures

Quality, defined in terms of degrees of excellence, is a very subjective measure. The overall quality of a data warehouse is best measured in terms of:

  1. Business Quality
  2. Information Quality
  3. Technical Quality

Business Quality

Business quality is directly related to economic success: the ability of the data warehouse to provide information to those who need it, so that it has a positive impact on the business.

Business quality is made up of business drivers that directly correlate to items in the company's strategic plans. How well the data warehouse helps accomplish these drivers is a key measure of its success.

For instance, does the data warehouse align with business strategy, and how well does it support the process of strengthening core competencies and improving competitive position? Does it enable business tactics, such that it makes a positive day-to-day difference?

Information Quality

Information is only of value if it is used. Its value is therefore based on how well it is integrated into business processes, not on data quality itself.

Information quality is the key to political success, that is, people actually using the data warehouse. In turn, this success depends on promoting awareness of the existence of the DW and its access tools, and on building the knowledge and skills to use information outputs.

Adopting BI tools has a large change management component: users must move from 2D reports to reports built on a multidimensional data model.

Information quality measures will also include:

  • how well users understand the warehouse
  • ease of getting data required
  • user access to data - in office, from home, third party partners
  • frequency of data access
  • how and when data is used

Information quality also includes data quality and performance. Expectations must be closely managed in this area in accordance with technical capability.

Technical Quality

Technical quality is the ability of the data warehouse to satisfy users' information needs. There are four important technical quality factors.

Reach - whether the data warehouse can be used by those who are best served by its existence. This is typically beyond the base of suppliers, customers, and a few managers.

Range - the range of services provided by the data warehouse, including what data is available and what is accessible. For instance, web services make data widely available for extraction from multiple locations as well as accessible by users in multiple locations.

Maneuverability - the ability of the data warehouse to respond to changes in the business environment. The data warehouse must continually evolve to conform with changes in:

  • users and their expectations
  • upper management
  • the overall business
  • technology
  • data sources
  • technical platform

Capability - an organization's technical capability to build, operate, maintain, and use a data warehouse.


Measurement Process

A good way to measure data warehouse quality is the goal-question-metric (GQM) approach. This is achieved by:

  1. Identifying the type of desired impact with the data warehouse - business, information, or technical quality.
  2. Defining quality goals specific to the business - specific statements that relate to the type of impact.
  3. Developing questions to ask to identify if goals have been achieved in terms of usage, response time, meeting the needs of users, errors, and on-time delivery of cubes.
  4. Identifying quality areas.
  5. Creating goals.
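One way to keep the goal-question-metric chain explicit is to record it as a small data structure. The sketch below is only an illustration: the class names, the example goal, and the metric names are hypothetical and would be replaced by ones specific to the business.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Question:
    text: str
    metrics: List[str] = field(default_factory=list)


@dataclass
class Goal:
    impact: str       # "business", "information", or "technical"
    statement: str    # quality goal specific to the business
    questions: List[Question] = field(default_factory=list)

    def all_metrics(self):
        """Return every metric needed to answer this goal's questions."""
        return sorted({m for q in self.questions for m in q.metrics})


# Hypothetical information-quality goal with its questions and metrics.
goal = Goal(
    impact="information",
    statement="Analysts rely on the warehouse for their weekly reporting.",
    questions=[
        Question("Are users actually using the warehouse?",
                 ["weekly active users", "queries per user"]),
        Question("Is response time acceptable?",
                 ["median query response time"]),
        Question("Are cubes delivered on time?",
                 ["on-time cube deliveries (%)"]),
    ],
)

print(goal.all_metrics())
```

Writing the chain down this way makes it easy to see which metrics exist only because a question needs them, and which questions have no metric yet.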


Quality Characteristics

Distinguishing characteristics that help define quality of the data warehouse:

Business quality - focus on business drivers—those things that help a business achieve its overall goals.

Information quality - users know when and how the data warehouse can help them make business decisions.

Technical quality - this relates to “reach,” or the ability to access the necessary information in the warehouse.

Tip: Don't try to achieve all goals at once - focus on the things that make the most sense.


Metrics and Measures

The terms 'measures' and 'metrics' are often confused, and confusing!

Measures are the specific pieces of data you need to collect.

A metric is a set of measures, or a methodology used to measure.

In data warehousing, a metric would be the overall number of accesses to the data warehouse. Measures would be the number of specific accesses, such as accesses through SQL or accesses to certain data tables.
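The relationship between a metric and its measures can be sketched in a few lines. The access log below, its field names, and the `access_metric` function are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical access log: one entry per access to the warehouse.
ACCESS_LOG = [
    {"channel": "sql", "table": "sales"},
    {"channel": "sql", "table": "customers"},
    {"channel": "report", "table": "sales"},
    {"channel": "sql", "table": "sales"},
]


def access_metric(log):
    """Combine individual measures into one 'warehouse access' metric."""
    return {
        # The metric's headline figure: overall number of accesses.
        "total_accesses": len(log),
        # The measures behind it: counts per access channel and per table.
        "by_channel": Counter(entry["channel"] for entry in log),
        "by_table": Counter(entry["table"] for entry in log),
    }


metric = access_metric(ACCESS_LOG)
print(metric["total_accesses"], dict(metric["by_channel"]), dict(metric["by_table"]))
```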

Objective measures and subjective measures should be defined:

Objective measures can only measure those things which are tangible, and as such 'countable' in the data warehouse process.

Subjective measures are people's perceptions, usually collected using surveys or user interviews. They are not as 'countable' as objective measures.

It is easy to get misleading data using subjective measures. It is therefore best to integrate subjective measurement into the daily user experience, without being overly intrusive to your users. For example, gather responses during user login or logout. It's also important to provide feedback to the participants. Make sure you are surveying the right people, at the right time.

In addition to different types of measures, there are also different levels:

Existence - does the warehouse exist or doesn't it? This sounds overly simple, but it's important: Have users accessed the database or not?

Quantity - this refers to “how much,” or how many times the warehouse was used.

Quality - the most difficult level, assessing “How good did we do?” Thomann warns this third level is the fuzziest until you understand the first two levels.

Metrics also have a number of components, and for data warehousing can be broken down in the following manner:

Objects - the “themes” in the data warehouse environment which need to be assessed. Objects can include business drivers, warehouse contents, refresh processes, accesses, and tools.

Subjects - things in the data warehouse to which we assign numbers, or a quantity. For example, subjects include the cost or value of a specific warehouse activity, access frequency, duration, and utilization.

Strata - a criterion for manipulating metric information. This might include day of the week, specific tables accessed, location, time, or accesses by department.

These metric components may be combined to define an “application,” which states how the information will be applied. For example: “When actual monthly refresh cost exceeds targeted monthly refresh cost, the value of each data collection in the warehouse must be re-established.”
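The example application above can be expressed as a simple rule over stratified data. In this sketch the object is the refresh process, the subject is its cost, and the stratum is the month; the cost figures and the `months_to_review` function are hypothetical.

```python
# Hypothetical refresh costs per month: (actual cost, targeted cost).
REFRESH_COSTS = {
    "2024-01": (9500.0, 10000.0),
    "2024-02": (10400.0, 10000.0),
    "2024-03": (10000.0, 10000.0),
}


def months_to_review(costs):
    """Return the months in which actual refresh cost exceeded the target.

    For each such month, the rule says the value of each data collection
    in the warehouse must be re-established.
    """
    return [
        month
        for month, (actual, target) in sorted(costs.items())
        if actual > target  # strictly "exceeds"; meeting the target is fine
    ]


print(months_to_review(REFRESH_COSTS))
```

The same pattern works for other strata, such as department or location, by changing the dictionary key.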


The Data Warehouse and Change

An important characteristic of data warehousing is that it is a process: the warehouse will constantly change. Wells suggests that organizations anticipate and expect change in data warehousing. We're surprised by change, he says, but we should just accept it and manage it.

Growth is a form of change, but it's more predictable and thus more manageable. For example, in data warehousing, growth can be defined by the following:

  • the number of users
  • how they use the warehouse
  • the addition of new data
  • the addition of different types of data

Wells suggests using a chart to help manage data warehouse growth. The chart can state expectations of the warehouse (which can be defined in measurable terms), and for each expectation, list the goals, metrics, and measures to be used to manage those expectations. Don't forget to add what you'll do to monitor the growth (is reality matching the data?), and plan to update the goals, metrics, and measures as the warehouse changes over time.
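Such a chart could be kept as structured data so that the monitoring step is easy to automate. The following is a rough sketch under stated assumptions: the `Expectation` fields, the 10% tolerance, and all figures are hypothetical, not part of Wells's description.

```python
from dataclasses import dataclass


@dataclass
class Expectation:
    description: str   # what is expected of the warehouse
    goal: str
    metric: str
    measure: str
    expected: float    # planned value for the current period
    observed: float    # value actually measured

    def deviation(self):
        """Relative gap between what was observed and what was expected."""
        return (self.observed - self.expected) / self.expected


def out_of_line(chart, tolerance=0.10):
    """List expectations where reality differs from plan by more than tolerance."""
    return [e.description for e in chart if abs(e.deviation()) > tolerance]


# Hypothetical growth chart with two rows.
chart = [
    Expectation("number of users", "support all regional analysts",
                "user growth", "distinct logins per month", 200, 260),
    Expectation("new data added", "load new sources without delays",
                "data growth", "GB loaded per month", 500, 520),
]

print(out_of_line(chart))
```

Rows flagged by `out_of_line` are the ones whose goals, metrics, or measures may need updating as the warehouse changes.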

Thomann adds that change in data warehousing is desirable, because it needs to grow. Otherwise you'll have tomorrow's legacy system. But to keep the warehouse valuable you have to strive for continuous improvement. To illustrate, he describes a “typical” data warehousing curve: When the warehouse is first implemented, after a week or so the usage level is very high because news about the warehouse has spread and users are exploring. In a month, usage drops off significantly as users learn what the data warehouse cannot do. After the next release or feature addition usage goes up slightly, although there isn't as much interest as with the initial release. Then, a few days later it dips again. This is a typical pattern, but what you're ideally looking for is a curve with definite increases and no dips. “You can't stop entropy,” he says, “but you can delay it by being proactive in your management. So use data—like you give your users—only it's for you.”
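To spot the dips in such a usage curve, it is enough to compare each period with the one before it. The weekly figures and the `find_dips` function below are invented for illustration and only mimic the shape described above.

```python
def find_dips(weekly_usage):
    """Return the indices of weeks in which usage fell below the previous week."""
    return [
        week
        for week in range(1, len(weekly_usage))
        if weekly_usage[week] < weekly_usage[week - 1]
    ]


# Hypothetical accesses per week: launch spike, drop-off, small bump after a
# new release, then another dip.
usage = [120, 300, 180, 90, 140, 110]

print(find_dips(usage))
```

Each flagged week is a prompt to look at what changed, for example whether users hit a limit of the warehouse or a release went unnoticed.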

Further Steps

Of course, the objective of measuring is to take the measurement data and brainstorm possible future improvements. For example, once you have measurement data you can do cost/benefit analyses for new data warehouse projects. Thomann and Wells suggest building a set of priorities, because you can't do everything. Then plan future projects, packaging the good ideas together if they are compatible. In addition, what you don't want to do with the data warehouse is important, so put a boundary around your projects based on the organization's specific needs. In the end, however, a measurement program isn't about just getting data—you have to apply the knowledge and take action to make it work.

