In this term paper the
author first introduces the concepts of Master Data Management (MDM), Master
Data, Data Domain, Customer Data Integration (CDI), Product Data Management
(PDM) and One Master Data. Next, a business case in support of MDM is presented.
The business case examines several industry scenarios that would require or
benefit from an MDM initiative. Implementing Master Data Management
requires both a business initiative and an IT initiative. The paper therefore
explains the implementation architectures and management frameworks for MDM
that have been published in journals and books. The most important part of MDM
is data synchronisation, which is required to maintain the integrity of
master data in a steady-state scenario, and the paper explains the data
synchronisation techniques that could be used. In conclusion, the paper presents
an MDM implementation solution for a case study that applies the concepts
explained in the paper. The problem statement in the case study is derived from
the author's work experience. To complete the term paper, multiple
articles from various management and technology journals and books were
reviewed. These articles are listed in the references.
The increasing amount of
data challenges companies' data management practices and causes
data quality problems, which are very common in today's companies. Additionally,
today's technology allows more data to be stored than a company can manage, and
different enterprise solutions often lead to further data confusion [8].
Disparate systems create
potential for data errors. Data errors are inconsistencies in data that
cause data quality issues, which could result in lost consumer cross-selling
opportunities, invoicing problems, or even failed products. It is estimated
that incorrect data in the retail industry leads to a loss of approximately $40
billion annually [9].
Master Data Management, also
called Reference Data Management, is an integrated business and IT function that
focuses on the management and interlinking of reference or master data that is
shared by different systems and used by different groups within an organization
[4].
Gartner defines Master Data
Management as follows:
“Master Data Management is a
technology enabled business discipline that helps organisation achieve a
“single version of truth” in such important areas as customers, product,
accounts etc.
In MDM, the business and IT
organisation work together to ensure the uniformity, accuracy, semantic persistence,
stewardship and accountability of enterprise’s official, shared master data.
Organisation apply MDM to eliminate the costly debates on “whose data is right”
which can lead to poor decision making and business performance” [6]
Master data provides a
foundation and a connecting function for Business Intelligence (BI) through the
way in which it interacts and connects with transactional data from multiple
business areas such as sales, service, order management, purchasing,
manufacturing, billing, accounts receivable, and accounts payable (AP) [1].
According to [4], master data
(also called reference data) is any information that is considered to play a
key role in the core operation of a business. It is typically shared by multiple
users and groups across an organization and stored on different systems.
Master Data is complementary
to BI and can provide an excellent source of dimensional data [11].
Master Data consists of
information critical to a company's operations. The data is usually categorised
into master data entities such as customers, products, vendors, partners,
employees and inventory. These categories are called Data Domains [1, 9]. The
concepts of Master Data Management apply to each of the domains in general, but
each domain has different implementation challenges.
In an article on Enterprise
Data Management (EDM), Cohen [5] describes MDM as one of the six main components
of an effective EDM program. Figure 1 shows these components, which, when
managed well together, help companies to take
advantage of the latest technological innovations and manage
their information more effectively.
Figure 1: Enterprise Data Management [5]
MDM is the process of
helping a company to standardize the definition and attributes of all of its
critical data elements (customer, vendor, product, etc.) to create a common
point of reference enterprise-wide. MDM can facilitate the sharing of data
among all of a company's disparate business functions, departments and even
divisions, as well as across all information systems, platforms and
applications. Without an effective enterprise-wide MDM implementation the other
components of EDM will not be as effective. The business cases defined in the
next section provide some examples to support this statement.
An MDM solution therefore creates
a single view of data in any targeted data domain. This is also referred to as
the golden record. For example, if the master data management is for customer
data, then any record refers to the "single truth" or "single customer
view", which is an authoritative customer record that has usually been generated
by extracting and cleansing the data from multiple channels of the enterprise.
This process is called Customer Data Integration (CDI). CDI is a subset of MDM
and encompasses every customer touch point in the organisation. CDI is
the most widely used implementation of MDM [1], and [8] mentions that customer
master data is a common starting point for an organization's MDM. An effective
CDI means that any customer attribute is uniquely identifiable and that no
customer attribute exists in multiple versions in any of the company's
enterprise IT systems.
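The consolidation step behind a golden record can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the source records have already been matched to the same customer, that each record carries an `updated` date, and that the survivorship rule is "the most recent non-empty value wins". The field names, the sample data and the rule itself are assumptions and are not taken from the cited sources.

```python
from datetime import date

# Hypothetical records for the same customer, extracted from three channels.
source_records = [
    {"source": "billing", "updated": date(2011, 3, 1),
     "name": "Michael Smith", "phone": "", "email": "msmith@example.com"},
    {"source": "call_center", "updated": date(2012, 6, 15),
     "name": "Mike Smith", "phone": "555-0100", "email": ""},
    {"source": "web", "updated": date(2010, 1, 20),
     "name": "M. Smith", "phone": "555-0199", "email": "mike@example.com"},
]


def build_golden_record(records, attributes):
    """Merge matched records: the most recently updated non-empty value wins."""
    newest_first = sorted(records, key=lambda r: r["updated"], reverse=True)
    golden = {}
    for attr in attributes:
        # Take the first non-empty value when scanning from newest to oldest.
        golden[attr] = next((r[attr] for r in newest_first if r[attr]), None)
    return golden


golden = build_golden_record(source_records, ["name", "phone", "email"])
print(golden)
```

In practice the survivorship rule would be defined by the business owners of the data; recency is only one of several possible choices, such as trusting a particular source system over the others.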
Product data
management (PDM) systems are used to manage all product-related data and also
product master data. Product master data is far more complex than customer
master data [8].
In this section four
business case scenarios are provided. Each business case presents a problem
statement whose solution is an implementation of MDM.
1. Business Case #1: Merger of two companies
In the U.S.A., a major telecom
service provider, Company A, bought another telecom service provider, Company B.
The two companies merged in 2005. Each company
provided mobile phone services to approximately 20 million
subscribers; one used CDMA technology and the other used GSM technology.
The challenges for the newly
merged company were many, chief among them being the consolidation of
the customer service departments so that the customer saw a single brand.
This was in addition to the normal integration
problems in areas such as HR, finance and regulatory compliance.
The challenges resulting from
this merger that MDM could address are:
a. How can all customer bases be consolidated so that there is a single source
of truth for all customer attributes? This is the CDI part of MDM.
b. Because the two companies had different price plans, devices and products,
these need to be consolidated into one product reference. This is the PDM part
of MDM.
2. Business Case #2: Replacement of an ERP application
In 2008, a major
crown corporation that manages a social housing portfolio replaced its legacy
ERP system with a commercial off-the-shelf (COTS) implementation. This had an
unintended impact on the downstream applications once the migration to the new
ERP was completed. The downstream applications, which used the original ERP's
unique identifier (UiD) as a cross-reference, were now out of sync with the
corporate property master data in the corporate ERP because the new application
did not use the same unique identifier. Additionally, the new ERP did not
integrate with the downstream applications: whenever an attribute is
modified in the new ERP, that modification is not communicated to the downstream
applications. Being a COTS product, the new ERP cannot be modified without
incurring huge cost.
Thus the challenge here is
to synchronise the data in the downstream applications whenever data
in the main ERP is modified. This is a classic case for an MDM implementation.
3. Business Case #3: New application introduced
In 2012 a new
application was introduced in an organisation that manages building repairs.
This was a web-based application used to order jobs. The
application used the job costing information from the ERP system and
communicated back to the ERP system when the order was completed. This required
that the job costing information, which is actually maintained in the corporate
ERP, be correctly communicated to the new web-based application. This problem
can be tackled using MDM.
4. Business Case #4: The Homeless Shelter Network
A big city has many homeless
shelters; a large urban centre could have hundreds of them.
Most are funded directly by the provincial
government, by a provincial government funding agency and/or by individual city
councils. However, each shelter is a largely independent
not-for-profit organisation or charity-run entity. Each shelter
has its own distinct business processes and data collection methods,
with varying degrees of sophistication.
Each shelter would be able
to report the number of clients it served in a particular time period. However,
if the funding agency or the government wants to know how many unique homeless
individuals were served by all the shelters it funds, there is no way of
knowing unless every shelter identifies the individuals it serves uniquely and
uniformly, i.e. unless all shelters run the same software application
or at least use the same proof of identity. In practice, not
every shelter uses the same software or the same method of identifying its
clients.
MDM can be used in a scenario
like this to overcome the data duplication problem.
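The counting problem in this business case can be sketched in a few lines of Python. The sketch assumes, hypothetically, that every shelter records the number of a government-issued ID for each client but formats it differently; the field name `id_number` and the sample data are illustrative assumptions, not part of the case.

```python
import re

# Hypothetical client lists reported by three independent shelters.
shelter_records = {
    "shelter_a": [{"name": "Michael Jones", "id_number": "123-456-789"},
                  {"name": "Robert Lee", "id_number": "987 654 321"}],
    "shelter_b": [{"name": "Mike Jones", "id_number": "123456789"}],
    "shelter_c": [{"name": "R. Lee", "id_number": "987-654-321"},
                  {"name": "Ann Clark", "id_number": "555 000 111"}],
}


def normalise_id(raw_id):
    """Strip everything except digits so formatting differences disappear."""
    return re.sub(r"\D", "", raw_id)


def count_unique_clients(records_by_shelter):
    """Count individuals across all shelters using the normalised ID as key."""
    unique_ids = set()
    for clients in records_by_shelter.values():
        for client in clients:
            unique_ids.add(normalise_id(client["id_number"]))
    return len(unique_ids)


total_records = sum(len(clients) for clients in shelter_records.values())
unique_clients = count_unique_clients(shelter_records)
print(total_records, unique_clients)
```

Adding up the per-shelter totals counts the same person more than once, whereas matching on a uniform identifier gives the number of distinct individuals. The sketch only works when such an identifier is collected everywhere, which is exactly the precondition the business case says is missing in practice.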
Before proceeding with MDM
architecture it is important to review the types of data and tables in a modern
database application. Enterprise systems deal with and generate different types
of data. These data are classified into data domains such as customers,
products, accounts and vendors. Additionally, the data can be classified as
transactional data and non-transactional data. Transaction data are generally
stored in transaction tables. Examples of transaction data include the call
detail records (CDRs) of a subscriber, the line items in a purchase order, or a
bank transaction at an ATM. Transaction tables normally hold a large number of
records. The data in transaction tables is dynamic and the tables are frequently
updated with new rows. The data in transaction tables is generally critical for
regulatory reporting. However, before the advent of virtual servers and cheap
storage, transaction data used to be archived to tape or sometimes
simply deleted after a certain time period. Transaction tables provide
point-in-time information and are therefore at the heart of any Business
Intelligence initiative.
Non-transaction data, also
called reference data, is stored in reference tables. A
reference table contains information such as customer details (unique
identifier, name, address, account number etc.), vendor details (vendor name,
vendor number, vendor address etc.), company employees and company addresses.
This information is critical to the organisation. The data in reference tables
is used to enforce referential integrity in transaction tables. Reference tables
are normally never archived or deleted.
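The relationship between a reference table and a transaction table can be shown with a small sketch that uses Python's built-in `sqlite3` module. The table and column names (`customer`, `call_record` and so on) are hypothetical and chosen only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Reference (master) table: one row per customer, rarely changed.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT
    )
""")

# Transaction table: many rows, each pointing to a reference row.
conn.execute("""
    CREATE TABLE call_record (
        call_id      INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
        duration_sec INTEGER NOT NULL
    )
""")

conn.execute("INSERT INTO customer VALUES (1, 'A. Subscriber', '1 Main St')")
conn.execute("INSERT INTO call_record (customer_id, duration_sec) VALUES (1, 120)")

# A transaction row that refers to an unknown customer is rejected.
try:
    conn.execute("INSERT INTO call_record (customer_id, duration_sec) VALUES (99, 30)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

call_count = conn.execute("SELECT COUNT(*) FROM call_record").fetchone()[0]
print(rejected, call_count)
```

The foreign key is what ties the fast-growing transaction rows to the stable reference rows, which is why errors in the reference data spread to every transaction that points to it.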
Another way of categorising
data is as operational and non-operational. Operational data is the real-time
collection of data in support of a company's daily activities.
Non-operational data is normally captured in a data warehouse on a less frequent
basis and used for business intelligence (BI) [1].
Accordingly, this
classification of data is used to divide MDM into two categories, operational
MDM and analytical MDM [1, 3, 4, 8, 13]. A third category combines
operational and analytical MDM and is called enterprise MDM [1, 4]. Operational
MDM integrates operational applications such as enterprise resource planning
(ERP), customer relationship management (CRM), and supply-chain management
(SCM) in the upstream data flow [8]. Analytical MDM is seen in practices that
resemble data warehousing (DW), such as customer data integration and financial
performance management. The enterprise MDM system is used for maintaining and
publishing all of the organisation's master data.
The architecture of
enterprise MDM is shown in figure 2. The main components of MDM system are MDM
applications, a master data store, a metadata store and a set of master data
integration services [14]. This is shown in figure 3.
Figure 2: Enterprise MDM Architecture [3]
Figure 3: MDM components [14]
Enterprise MDM is the most
intrusive implementation, while analytical MDM is the least intrusive, because
enterprise MDM encompasses both operational and non-operational data. As a
result, the gain is highest in an enterprise MDM implementation. Additionally,
when implementing MDM, it makes
sense to break the initiative down into phases and target just a few
applications at a time to avoid disruption.
It is important to
understand how master data is created, used, maintained and integrated with
multiple applications. The MDM frameworks mentioned in this section describe
various ways to store, process and synchronise master data. The main components
of MDM are:
1. Composite applications
These are the IT applications
that collect, use and maintain master data. Examples include
customer service software used in a call centre, an ERP
system, a downstream application or a front-end web application. Each
composite application may have its own database, or two applications may
share a database.
2. Business Process Orchestration
This is the most critical part of an MDM
initiative. Business Process Orchestration is a set of rules, guidelines,
workflows or regulations created by the business owners and leaders so that
the data entered in the applications is consistent and accurate. An example
of such a rule is as follows. To eliminate discrepancies in the name of the
same person (Michael vs. Mike, Robert vs. Bob), a homeless shelter clerk would
verify the name of the client against his MCP card or any valid
government-issued ID. This is an example of a simple business process, and its
correct implementation is critical for the success of MDM. A more complex
example of a business process could be a set of Microsoft SharePoint workflow
steps that a clerk is required to complete before a new vendor or supplier is
added to the SCP application.
3. Enterprise Service Bus
This is the technology component of MDM.
It could be a complex middleware product such as Software AG's webMethods, IBM
WebSphere or Tuxedo, or it could be a solution as simple as a network share
with XML reader products.
4. MDM data synchronisation services
These are services that synchronise master data
between the applications or between an application and the master data store.
They could be trigger-based or message-based services.
The Single Central Repository Architecture is shown in
figure 4. In this architecture, the master data is stored in a single central
repository which will be updated by the MDM services, and the applications. The
applications will not hold a copy of any master data. Applications refer to
master data from the central repository.
There are no local versions of Master Data anywhere.
Figure 4: Single Central Repository
Architecture (SCRA) [1]
The advantages of SCRA are that it guarantees data consistency
[1] and that some of the applications may become redundant once the SCRA
repository is up and running, which allows legacy applications to be retired.
However, the disadvantage is the massive upfront cost. The
upfront implementation of and migration to SCRA is costly because it requires
a massive data conversion effort and the migration of data from multiple
disparate systems. This can also be disruptive to the business.
The prevalence of COTS products could make the implementation
of SCRA difficult, if not impossible. However, once SCRA is implemented, the
cost of maintenance would be minimal [1].
The Central Hub and Spoke Architecture is a variation of
SCRA [1]. It contains a central repository (the central hub) while also allowing
individual applications to maintain an extension of the data.
Therefore some applications would access master data from the central hub and
not keep a local copy, while others might only use the central hub as a
reference [15].
Figure 5: Central Hub and Spoke
Architecture (CHSA) [1]
The biggest advantage of
CHSA is its flexibility: spoke systems can remain relatively decoupled from the
central hub. This flexibility is important when there are COTS applications that
cannot be tightly coupled with the central hub [1]. The flaw of CHSA is that it
does not address issues of timeliness and latency [15]. Additionally, the data
conversion effort is still required.
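The division of responsibility in a hub-and-spoke arrangement can be sketched as follows. This is a minimal in-memory illustration in Python, not an implementation from [1] or [15]; the class names `CentralHub` and `SpokeApplication` and the attribute names are hypothetical.

```python
class CentralHub:
    """Holds the shared master attributes for each customer."""

    def __init__(self):
        self._master = {}

    def upsert(self, customer_id, **attributes):
        self._master.setdefault(customer_id, {}).update(attributes)

    def get(self, customer_id):
        # Return a copy so that spokes cannot change the hub by accident.
        return dict(self._master[customer_id])


class SpokeApplication:
    """Reads master data from the hub and keeps only its own extension locally."""

    def __init__(self, hub):
        self._hub = hub
        self._extension = {}

    def set_local(self, customer_id, **attributes):
        self._extension.setdefault(customer_id, {}).update(attributes)

    def view(self, customer_id):
        # Hub attributes are authoritative; local attributes only extend them.
        record = dict(self._extension.get(customer_id, {}))
        record.update(self._hub.get(customer_id))
        return record


hub = CentralHub()
hub.upsert(42, name="Michael Smith", address="1 Main St")

billing = SpokeApplication(hub)
billing.set_local(42, credit_limit=500)

hub.upsert(42, address="7 Oak Ave")  # a change made once, in the hub
print(billing.view(42))
```

Because the spoke reads the shared attributes from the hub on every request, the address change is visible immediately, while the application-specific `credit_limit` never has to be modelled in the hub. A spoke that kept its own copy of the shared attributes would instead need one of the synchronisation techniques described later in the paper.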
The virtual MDM pattern uses data virtualization to provide one or
more on-demand integrated views of master data entities such as customer,
product, asset and employee, even though the master data is fractured across
multiple underlying systems. Applications, processes, portals, reporting tools
and data integration workflows that need master data can acquire it on demand
via a web service interface or via a query interface such as SQL [16].
Figure 6: The Virtual Master Data
Management pattern
Data Service Federation is a common Virtual Integration architecture.
The virtual integration pattern aggregates data from multiple sources into a
single view by maintaining metadata definition for all the sources [1].
Figure 6: Data Service Federation
(DSF) [1]
The advantage of DSF is that it is
less costly than SCRA and CHSA, because the data does not have to be
physically copied from one location to another and no additional storage space
is required. However, the biggest disadvantage is that data improvements are
not propagated back to the source applications [1].
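The idea of building a single view from metadata rather than from copied data can be illustrated with a short Python sketch. The two source systems, their field names and the mapping in `metadata` are all hypothetical; the sketch shows the general principle only and is not the design of any particular federation product.

```python
# Two hypothetical source systems that store customer data under different names.
crm_system = {101: {"cust_name": "Michael Smith", "tel": "555-0100"}}
billing_system = {101: {"customer": "Michael Smith", "balance_due": 75.5}}

# Metadata: for each attribute of the federated view, the source and field it maps to.
metadata = {
    "name": (crm_system, "cust_name"),
    "phone": (crm_system, "tel"),
    "balance": (billing_system, "balance_due"),
}


def federated_view(customer_id):
    """Assemble the customer view at query time; nothing is copied or stored."""
    view = {"customer_id": customer_id}
    for attribute, (source, field) in metadata.items():
        view[attribute] = source.get(customer_id, {}).get(field)
    return view


first_view = federated_view(101)
print(first_view)

# A change in a source is visible in the next query because no copy exists.
crm_system[101]["tel"] = "555-0199"
print(federated_view(101)["phone"])
```

The sketch also makes the stated disadvantage visible: the view is read-only in spirit, so a correction made to a value in the assembled view would not find its way back into `crm_system` or `billing_system`.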
6. Data Synchronisation Techniques
Data synchronisation is the critical step in maintaining
the consistency of master data. A synchronisation step is required in most
cases, regardless of which type of MDM framework is implemented. This paper
describes three data synchronisation techniques. However, it must be
noted that there can be many other ways of synchronising data and that
database or data synchronisation is not unique to MDM.
6.1. Trigger based
The trigger-based approach is illustrated in the figure below. In
this approach the trigger for data synchronisation is an update or insert event
on a record in the source database table.
When a candidate record is modified in the source
database, a service polls for the event and propagates the modified data to the
other database tables.
Figure 8: Trigger based [13]
This kind of synchronisation ensures that the data in
the multiple tables is always synchronised. However, while it is easy to
implement in a small-scale setting, the process is extremely
computation-intensive at large scale. It also depends on high availability of
the network.
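The trigger-based flow can be sketched with Python's built-in `sqlite3` module. The sketch assumes one possible realisation: database triggers write each insert or update to a `change_log` table, and a small service function polls that log and applies the changes to a second database. The table names and the `propagate_changes` function are hypothetical and are not taken from [13].

```python
import sqlite3

source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")

source.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE change_log (
        change_id   INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id INTEGER,
        name        TEXT
    );
    -- The triggers record every insert or update as an event.
    CREATE TRIGGER customer_inserted AFTER INSERT ON customer
    BEGIN
        INSERT INTO change_log (customer_id, name) VALUES (NEW.customer_id, NEW.name);
    END;
    CREATE TRIGGER customer_updated AFTER UPDATE ON customer
    BEGIN
        INSERT INTO change_log (customer_id, name) VALUES (NEW.customer_id, NEW.name);
    END;
""")
target.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")


def propagate_changes():
    """Poll the logged events and apply them, in order, to the target table."""
    events = source.execute(
        "SELECT change_id, customer_id, name FROM change_log ORDER BY change_id"
    ).fetchall()
    for change_id, customer_id, name in events:
        target.execute(
            "INSERT OR REPLACE INTO customer (customer_id, name) VALUES (?, ?)",
            (customer_id, name),
        )
        source.execute("DELETE FROM change_log WHERE change_id = ?", (change_id,))
    source.commit()
    target.commit()
    return len(events)


source.execute("INSERT INTO customer VALUES (1, 'Mike Smith')")
source.execute("UPDATE customer SET name = 'Michael Smith' WHERE customer_id = 1")
propagated = propagate_changes()
synced = target.execute("SELECT customer_id, name FROM customer").fetchall()
print(propagated, synced)
```

Every change to the source produces one event and one write per target, which hints at why the approach becomes expensive when many tables and many targets are involved.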
6.2. Message-based Data Synchronisation and Integration Framework (MDSIF)
The Message-based Data Synchronisation and Integration
Framework is detailed in [13]. In this process, message-oriented
middleware (MOM) is used to propagate the data from multiple data
Figure 7: Message-based Data Synchronisation and Integration
Framework (MDSIF)
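As a generic illustration of message-based propagation, and not a reproduction of the MDSIF described in [13], the following Python sketch uses the standard-library `queue.Queue` as a stand-in for message-oriented middleware. The system names, the message layout and both functions are assumptions made for this example.

```python
import queue

# Stand-in for a message-oriented middleware topic (an assumption for illustration).
subscriber_queues = {"crm": queue.Queue(), "billing": queue.Queue()}
local_stores = {"crm": {}, "billing": {}}


def publish_change(customer_id, attributes):
    """Send one change message to every subscribed system."""
    message = {"customer_id": customer_id, "attributes": attributes}
    for q in subscriber_queues.values():
        q.put(message)


def consume_pending(system_name):
    """Apply all queued messages to the system's local copy; return how many."""
    q = subscriber_queues[system_name]
    store = local_stores[system_name]
    applied = 0
    while not q.empty():
        message = q.get()
        store.setdefault(message["customer_id"], {}).update(message["attributes"])
        applied += 1
    return applied


publish_change(7, {"name": "Michael Smith"})
publish_change(7, {"address": "7 Oak Ave"})

# Each system consumes at its own pace; message order is preserved per queue.
crm_applied = consume_pending("crm")
billing_was_behind = local_stores["billing"] == {}
billing_applied = consume_pending("billing")
print(local_stores["crm"] == local_stores["billing"])
```

Unlike the trigger-based approach, the publisher does not wait for the receivers: a system that is temporarily unavailable simply finds its messages waiting in the queue, at the price of its copy being out of date in the meantime.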
6.3. Confidence Tables Approach
Figure 9: Confidence Tables Approach [13]
In this section we revisit a
business case from the earlier section and apply MDM principles to
achieve One Master Data. We select the case study on multiple homeless
shelters. The business case is restated below for easy reference.
It can be concluded
from the findings in this paper that MDM cannot be classified as only an IT
problem; it is a managerial challenge that requires structural changes to
the management of business processes and to managerial decision making.
1. Master Data Management in Practice: Achieving True Customer MDM. Cervo,
Dalton and Allen, Mark. ISBNs: 9780470910559, 9781118085660. [Wiley Corporate
F&A]. Hoboken, N.J.: Wiley, 2011.
2. Management of the master data lifecycle: a framework for analysis. Ofner,
Straub, Otto and Oesterle.
3. Practical Approach for Master Data Management. Chandra Sekhar Bhagi. World of
Computer Science and Information Technology Journal (WCSIT), ISSN: 2221-0741,
Vol. 1, No. 5, 213-216, 2011.
4. Enterprise Master Data Management Trends and Solutions. Apostol,
Constantin-Gelu. http://revistaie.ase.ro, Vol. XI, no. 3/2007.
5. Understanding the Scope of Data Management: The Components of a Robust
Enterprise Program. Cohen, Rich.
6. Gartner Inc., "Hype Cycle for Master Data Management, 2010". Andrew White and
John Radcliffe.
7. The transverse information system: new solutions for IS and business
performance. Rivard, François. ISBNs: 1-84821-108-2, 978-1-84821-108-7. 2009,
pages 49-84.
8. Managing one master data: challenges and preconditions. Risto Silvola, Olli
Jaaskelainen, Hanna Kropsu-Vehkapera, Harri Haapasalo. Industrial Management
& Data Systems, ISSN: 0263-5577, Volume 111, issue 1.
9. Methodologies for data quality assessment and improvement. Batini, C.,
Cappiello, C., Francalanci, C., Maurino, A. ACM Computing Surveys, Vol. 41,
No. 3.
11. Introduction to Master Data Management. Mark Rittman.
13. Message-Based Approach to Master Data Synchronization among Autonomous
Information Systems. Dongjin Yu and Hangzhou Dianzi. International Journal of
Enterprise Information Systems, 6(3), 33-47, July-September 2010.
14. Using Master Data in Business Intelligence. Colin White, BI Research.
Available at www.fm.sap.com and www.broadstreetdata.com
15. The Flaw of the Hub-and-Spoke Architecture. Evan Levy. Information
Management Journal.
16. Data Federation - Master Data Patterns - The Virtual MDM Pattern. Mike
Ferguson. Available at
http://www.b-eye-network.co.uk/blogs/ferguson/archives/2009/12/data_federation-_master_data_p.php