TUTORIALS • 29 October 2001

META DATA TUTORIAL
09:00-17:30
FULL DAY
Meta Data Fundamentals
Peter Aiken, Institute for Data Research
 
INFORMATION QUALITY TUTORIALS

09:00-17:30
FULL DAY

ABCs of Information Quality
Larry P. English, Information Impact International

09:00-17:30
FULL DAY

Data Stewardship: A Framework for Achieving & Maintaining Data Integrity in the DW
Mark Johnson, InER-G Solutions Ltd

   
DAMA TUTORIALS
09:00-17:30
FULL DAY

Enterprise Architecture
John Zachman, Zachman International

09:00-12:30
HALF DAY

Value-Added Attribute Modelling
Graham Witt, Simsion Bowles & Associates

09:00-12:30
HALF DAY
The Ten Best Ways to Achieve Data Resource Quality
Michael Brackett, DAMA International & Data Resource Design and Remodelling
14:00-17:30
HALF DAY

Data Modelling Contentious Issues
Karen Lopez, InfoAdvisors

14:00-17:30
HALF DAY

The Semantic Integration of Data from Multiple Sources into a Common Database
Michael Scofield, Experian

   
17:50-18:20 Perspective Session
Track 1 - MetaMatrix:
Enterprise Information Integration: Model Driven Data Integration in Java Using MOF, XMI and UML, David Penney, CTO, MetaMatrix Europe
17:50-18:20 Perspective Session
Track 2 - Reuters:
Adding value to Financial Services with XML Standards, Mark Hunt, Director of XML Strategy, Reuters Group
   

META DATA TUTORIAL

Full Day Tutorial
9:00-17:30

META DATA FUNDAMENTALS

Peter Aiken
Founding Director
Institute for Data Research

What is meta data? IT staff in most enterprises share a common problem: how can they convince managers to plan, budget, and apply resources for meta data management? What is meta data and why is it important? What technologies are involved? Internet and intranet technologies are part of the answer and will get the immediate attention of management. XML is the other. The current popularity of XML and e-business has rekindled interest in meta data management. Both require business data and jargon to be reconciled so that the correct information reaches the intended knowledge workers and they can speak the same language. This tutorial will discuss these and related meta data issues.

Metadata

  • Definition
  • Examples
  • Motivation
  • Business Case

Metadata Engineering

  • Defined
  • Importance of Modelling
  • Recovery techniques
  • Objects able to be recovered

Metadata Management/ToolKit

  • Meta-Meta Model
  • Office-Components

Metadata Recovery Examples

  • ERP
  • Reference Metadata

INFORMATION QUALITY TUTORIALS
Full Day Tutorial
9:00-17:30

THE ABCs OF INFORMATION QUALITY: A Primer for Information Quality Improvement

Larry P. English
President
INFORMATION IMPACT International, Inc.

While we have long recognised that quality products and services are required to be competitive, organisations are only now becoming aware of problems in information quality and of how poor information quality hurts both competitiveness and the bottom line. Information quality improvement is not an academic exercise; it is a required tool for business performance excellence in the Information Age.

In this tutorial Larry describes the fundamental principles of information quality. He describes how an organisation can improve the quality and value of its information resources. He describes metrics for measuring information quality and management principles for implementing an effective information quality environment. Larry describes how organisations have successfully implemented information quality processes to improve the effectiveness of their business and information system processes.

A. Assessment: Information Quality Inspection

  • What information quality is and why it is essential to business survival
  • Information customers and information producers
  • The information supply chain
  • Metrics for information quality
  • Processes for assessing business information quality

B. Betterment: Information Quality Improvement

  • Applying quality management principles to the information products
  • Quick wins and systemic improvements for information quality improvement and business effectiveness
  • Process for information quality improvement

C. Culture: Creating an Environment for Sustainable Information Quality

  • Information quality maturity assessment
  • How to start an information quality initiative
  • Creating and sustaining change for business effectiveness through quality information

Full Day Tutorial
9:00-17:30

DATA STEWARDSHIP – A FRAMEWORK FOR ACHIEVING AND MAINTAINING DATA INTEGRITY IN THE DATA WAREHOUSE

Mark Johnson
President & CEO
InER-G Solutions, Ltd.

Data integrity can be defined as "The Accuracy, Timeliness, Consistency and Completeness of Data." Achieving and maintaining high integrity in the Data Warehouse is crucial to the overall value of the Data Warehouse itself.

Contrary to popular belief, the process of achieving a high level of data integrity is heavily dependent on the involvement of the business across multiple fronts. What is needed to achieve this objective is a framework that optimises the collaboration between the business and I.T. to build a foundation for sustained and measurable data quality improvement. Defining and implementing this framework, "Data Stewardship", is the focus of this tutorial.

During this session we will develop a firm understanding of the base data quality issues that impact the integrity of data in the Data Warehouse. Next, we will develop an understanding of the Data Stewardship approach for defining and implementing a comprehensive data quality improvement program that will result in sustained improvements in data quality, not only in the Data Warehouse, but in the source systems that feed it as well. This tutorial will conclude with a case study that will help participants tie the concepts together in order to accelerate application of the learned principles in their organisation. Topics addressed include:

  • Data Warehouse Data Model
  • Data Sourcing
  • Data Quality Assessment
  • Business Rules
  • Remediation
  • Data Stewardship Program Structure
  • Roles & Responsibilities
  • Implementation

DAMA TUTORIALS
Full Day Tutorial
9:00-17:30

ENTERPRISE ARCHITECTURE

John Zachman
President
Zachman International

Enterprise Architecture is fundamental for enabling an enterprise to assimilate internal changes in response to the external dynamics and uncertainties of the information age environment. It not only constitutes a baseline for managing change, but also provides the mechanism by which the reality of the enterprise and its systems can be aligned with management intentions. The objective of this seminar is to build an understanding of the concepts of Enterprise Architecture and develop a sense of urgency for implementing those concepts in a modern enterprise.

Introduction to Enterprise Architecture

  • The Framework for Enterprise Architecture
  • Basic enterprise physics

Industrial Age Break-Down

  • Enterprise frustrations
  • Architectural explanations

Information Age Build-Up

  • The long term trade-off
  • The short term trade-off

Reducing Time-To-Market

  • Process evolution
  • Mass customisation

Implementation practicalities

  • Issues
  • Enterprise engineering design objectives

Conclusions

  • Cheaper and faster
  • Framework resources

Half Day Tutorial
9:00-12:30

VALUE-ADDED ATTRIBUTE MODELLING

Graham Witt
Senior Consultant
Simsion Bowles and Associates

Attributes are often the "poor relation" in the modelling process. A number of bad practices are to be found in many system acquisition projects (whether building or buying). These include confusing requirement and solution (failure to treat attributes differently at different levels of the Zachman framework), disintegrating complex attributes at too early a stage of analysis and leaving the definition of derived attributes and constraints on attributes to process modellers. The result is often a system that fails to meet business requirements. This tutorial provides some practical techniques for the thorough modelling of attributes that brings benefits to the whole system acquisition project, including process definition.

  • Business data types
  • Complex attributes
  • Derived attributes
  • Attributes of relationships
  • Attributes at different levels of the Zachman framework
  • Attribute implementation options (including XML)
  • Constraints on attributes

Half Day Tutorial
9:00-12:30

THE TEN BEST WAYS TO ACHIEVE DATA RESOURCE QUALITY

Michael Brackett
President
DAMA International

Why is the data resource in public and private sector organisations failing to adequately support their information needs? What do organisations consistently do, or not do, to ruin one of their most critical resources? Why have organisations allowed this situation to happen and to continue for so long? What bad habits should be avoided and what good practices should be followed to ensure a high-quality data resource that supports business activities? People are asking these questions with increasing regularity. The underlying theme of most of these questions is what can organisations do to prevent any further data disparity. The answer begins by identifying the bad habits that organisations have and the impacts of those bad habits.

Next, the bad habits must be turned into good practices that directly benefit the organisation. The good practices that produce early benefits become the best practices for quick-starting an initiative to improve meta-data quality. This tutorial covers the bad habits that lead to a low-quality data resource, their impacts, and the good practices that result in a high-quality data resource, their benefits, and the best practices for achieving early successes. Delegates will get an overview of:

  • How to identify the bad habits
  • How to understand the impacts of the bad habits
  • The architectural good practices to follow
  • The cultural and management good practices to follow
  • The benefits of the good practices
  • The best practices to implement for early successes
  • Evaluation criteria to determine the state of the data resource
  • How to slow and stop the cycle of increasing data disparity

Half Day Tutorial
14:00-17:30

DATA MODELLING CONTENTIOUS ISSUES

Karen Lopez
Principal Consultant
InfoAdvisors, Inc

A highly interactive session where attendees will evaluate the options and best practices for common and advanced data modelling issues, such as:

  • Party/party role
  • Natural vs. surrogate keys
  • Abstraction vs. specification
  • Conceptual, logical, physical
  • UML vs. ERD
  • Privacy vs. fair use and data mining
  • Derived/calculated data in logical models
  • Logical & physical data model separation

Delegates in this session will be presented with an issue along with a range of responses or possible solutions. Attendees will vote on their preferred response, and then the group as a whole will discuss the results, along with the merits of each possible response. If the specific issue has been discussed in other presentations, a summary of the responses of the other groups will be presented.

Half Day Tutorial
14:00-17:30

THE SEMANTIC INTEGRATION OF DATA FROM MULTIPLE SOURCES INTO A COMMON DATABASE

Michael Scofield
Director of Data Quality
Experian

More and more, data stewards (DBAs, data architects, data warehouse designers, etc.) are being asked to integrate data from multiple, dissimilar sources into a common database. This can result from companies merging or acquiring each other, or from an effort to integrate customer data from various applications to support a more aggressive CRM. Or it can arise when a data warehouse seeks to integrate cause and effect data from disparate source applications. Successful mapping of source data to target fields depends upon a comprehensive understanding of the business meaning and data architecture of each source, and of the target. By semantic, we mean ensuring that each source data field has a meaning, scope, and normal behaviour (not merely field name and format) comparable with those of its peer source field(s). Merging two sources is exciting enough. Merging three or more can be terrifying.

This tutorial will cover a wide range of techniques showing many practical examples of actual data. It is not enough to use documentation (file descriptions, etc.) of sources (which may be obsolete). One must look at the actual data, all of it.

We will discuss step-by-step techniques for uncovering data anomalies, data quality problems, and semantic discontinuities in how a field is used. We start by creating an inventory of the data, its architecture, and its behaviour at each source, from the high-level view down to the specific, detailed behaviour of each field and column, and their inter-dependencies. Techniques in data profiling and domain studies will be shown in detail, with examples of surprise findings. For example, a field may be used in one way for one entity subtype, and in a different way for another subtype. Never underestimate the creativity of application owners in using a field for a purpose different from its original intent. Even the treatment of negative values (such as total invoice amount) may differ between sources.
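As a rough illustration of the kind of field-level profiling described above, the following Python sketch counts missing, distinct, and negative values for one field, broken down by entity subtype. It is not taken from the tutorial itself: the function name `profile_field`, the field names `doc_type` and `total`, and the sample rows are all hypothetical.

```python
from collections import Counter, defaultdict


def profile_field(rows, field, subtype_field=None):
    """Summarise how one field is actually populated, optionally per subtype."""
    groups = defaultdict(list)
    for row in rows:
        # Group values by subtype so differing usage per subtype becomes visible.
        key = row.get(subtype_field) if subtype_field else "all"
        groups[key].append(row.get(field))

    report = {}
    for key, values in groups.items():
        present = [v for v in values if v not in (None, "")]
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        report[key] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "negative": sum(1 for v in numeric if v < 0),
            "top_values": Counter(present).most_common(3),
        }
    return report


# Hypothetical sample: invoice totals where one subtype carries negative amounts.
invoices = [
    {"doc_type": "INVOICE", "total": 120.0},
    {"doc_type": "INVOICE", "total": 80.5},
    {"doc_type": "CREDIT", "total": -40.0},
    {"doc_type": "CREDIT", "total": None},
]

report = profile_field(invoices, "total", subtype_field="doc_type")
for subtype, stats in report.items():
    print(subtype, stats)
```

Running the same profile against each source, and comparing the results side by side, is one simple way to spot a field that behaves differently from its apparent peer before any mapping is decided.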

Then, the task of evaluating the commonality of any pair of source fields, and determining the appropriate target field in the target database is not for the naive. We will review some mistakes of wimp analysts who made unwarranted assumptions about source data, without even looking at the actual business data (gasp!). In contrast, we will review sound analytical techniques for getting the correct mapping and translation to the target database. Also, data quality issues such as validity, completeness, richness, and accuracy will be discussed.

Finally, we will survey techniques of establishing an on-going data surveillance program to ensure that later production-ized loads of data will not be caught by surprise when a source changes definitions or scope of the data it supplies.