For a majority of business organisation activities, information systems (IS) are the key enablers for operations and decision making. Regarded as the ultimate system, Enterprise Resource Planning (ERP) systems provide organisations with large application functionality by supporting a major part of business activities. Their aim is to solve the problem of fragmented information systems in large organisations, thereby bringing together various types of data from different business activities in a consistent model of the business (Knolmayer and Rothlin, 2006). Unfortunately, these systems have been plagued with data quality issues, rendering them inefficient for their main aim of providing the accurate and essential information required for making critical business decisions.
This chapter brings into focus ERP systems and their functionalities, as well as the kind of data that exists within such systems. It then goes on to explain data quality and the data quality issues that affect the functionality of ERP systems. Finally, it presents the use of ontologies as a solution to such issues.
This research therefore aims
In quite a large percentage of existing organisations, information systems are regarded as the key enablers of business operations and decision-making. This is especially true of ERP systems, whose main function is to support and integrate all business functions, processes and units of an organisation and to create a system that is capable of providing up-to-date and relevant information to the decision makers, the employees and business partners (Fotini et al., 2008). They not only embrace the “enterprise” and focus on “resources” but also facilitate tasks beyond planning, which include financial control, operational management, analysis and reporting, and routine decision support (Botta-Genoulaz and Millet, 2006).
Generally, ERP systems in business enterprises mainly include the following functional modules: planning, production, inventory, marketing and sales, financial management, material management, etc. (Zhang and Liang, 2006). Accordingly, Scott (2002) defined ERP systems as “a suite of integrated corporate-wide software applications that drive manufacturing, financial, distribution, Human Resources, and other business functions in a real-time environment”. These modules play critical roles in the various departments of a business enterprise and are basically used for everyday business transactions. This is because they provide reference models that, according to their makers, embody the current best business practices by supporting organisational business processes (Botta-Genoulaz and Millet, 2006).
Their design promises to eliminate the problem and cost of running disparate legacy systems by providing a single software system made up of a number of separate but integrated modules. These modules represent the functionalities of the various departments within the organisation. The implementation of an ERP system is quite expensive, requiring a multi-million dollar budget and large project teams (Xu et al., 2002). Despite the costs, many organisations the world over have deployed them. This research will focus on SAP’s ERP package because SAP is one of the best-known ERP packages on the market, with strengths in finance and accounting (Xu et al., 2002).
SAP is an integrated business system, which evolved from a concept first developed by five former IBM systems engineers in 1972. It is a software package designed to enable businesses to effectively and efficiently run a variety of business processes within a single integrated system. SAP stands for Systems, Applications and Products in Data Processing. It is produced by SAP AG, based in Walldorf, Germany, which employs more than 22,000 people in more than 50 countries. SAP AG is the third-largest software company in the world and the largest ERP provider. SAP software is deployed at more than 22,000 business installations in more than 100 countries and is currently used by companies of all sizes, including more than half of the world’s 500 top companies (SAP AG Corporate Overview, 2000). Therefore, the mySAP ERP system is an excellent system to study in an effort to assess ERP environments.
The data in an ERP system is seen from two different perspectives: a business perspective and a technical perspective (Wieczorek et al., 2008). From the business perspective, data is divided into master data and transactional data. Master data refers to the core business entities a company uses repeatedly across many business processes and systems, e.g. hierarchies of customers, suppliers, accounts, products or organisational units (Brunner et al., 2007). It is “the data that has been cleansed, rationalized, and integrated into an enterprise-wide ‘system of record’ for core business activities” (Berson and Dubov, 2007, p. 8). In other words, it is data that is created once and re-used many times. Transactional data, on the other hand, has a short life-span and is used for a specific transaction. It is always related to master data (Wieczorek et al., 2008). For example, an order for a product requires specific data such as the quantity of the product or the delivery deadline; such data is known as transactional data.
From a technical perspective, the difference between master data and transactional data is almost insignificant, as transactional data must also be stored in a database and therefore references the master data tables.
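The relationship between the two kinds of data can be illustrated with a minimal sketch. The table names, columns and sample values below are hypothetical and are not taken from any SAP schema; the sketch only assumes a relational store in which transactional rows reference master data rows through a foreign key.

```python
import sqlite3

# Hypothetical schema: one master data table and one transactional table.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE sales_order (
        order_id INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES product(product_id),
        quantity INTEGER NOT NULL,
        delivery_deadline TEXT NOT NULL
    )
    """
)

# Master data: created once and re-used by many transactions.
conn.execute("INSERT INTO product VALUES (1, 'Pump')")

# Transactional data: short-lived rows that always point at a master record.
conn.execute("INSERT INTO sales_order VALUES (100, 1, 5, '2009-08-01')")
conn.execute("INSERT INTO sales_order VALUES (101, 1, 2, '2009-08-15')")


def order_is_accepted(order_row):
    """Try to store an order; reject it if it references unknown master data."""
    try:
        conn.execute("INSERT INTO sales_order VALUES (?, ?, ?, ?)", order_row)
        return True
    except sqlite3.IntegrityError:
        return False


orders_for_pump = conn.execute(
    "SELECT COUNT(*) FROM sales_order WHERE product_id = 1"
).fetchone()[0]
```

In this sketch an order that names a product missing from the master table is rejected, which is one way a database can keep transactional data tied to master data.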
Over the years, ERP systems have generally been advertised as a panacea, with the ability to eliminate the data quality issues that most legacy systems tend to have. They make use of relational database technology, which integrates data from the various functional modules in the system. One of the strongest arguments for ERP systems, as earlier stated, is that master data is entered only once and can be used multiple times in different contexts on an enterprise-wide system (Knolmayer and Rothlin, ), thereby eliminating the occurrence of redundancies and inaccuracies.
However, in reality, errors are most likely to occur during the capture of the master data, and such errors find their way around the system, eventually affecting the transactional data in the ERP database. If such data is used to make business decisions, it may have an adverse effect on the organisation.
To overcome such problems, many organisations are currently developing and implementing data warehouses in order to reduce the costs associated with the provision of data to support business processes, and to achieve high estimated returns on investment (McFadden, 1996). Developers of ERP systems have also adopted this approach by integrating data warehousing technology into their systems. Fig 2 below shows a diagram of a data warehouse based on an ERP system.
Data warehouse architecture of an ERP system in the coal mining industry
Source: Zhang and Liang (2006)
The diagram above shows the flow of data from the functional modules of an ERP system to the data warehouse, where it is cleaned and stored. The data can then be queried and used as a source of knowledge for making informed business decisions.
“A data warehouse is not a product but a concept to support an integrated and systematic data architecture to deliver high quality, decision relevant (data) structures” (Lehman and Jaszewski, 1999, p. 1). A simple definition by Delvin (1997) describes a data warehouse as “a single, complete and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use in a business context.” Chaudhuri and Dayal (1997) similarly defined data warehousing as a collection of decision support technologies, aimed at enabling the knowledge worker (executive, manager, and analyst) to make better and faster decisions.
A data warehouse is a “subject-oriented, integrated, time-varying, non-volatile collection of data that is used primarily in organizational decision making” (Inmon, 1992). It supports online analytical processing (OLAP), which differs from the online transaction processing (OLTP) applications supported by traditional operational databases (Chaudhuri and Dayal, 1997). Data warehouses extract, cleanse, integrate and store vast amounts of data from ERP systems, thereby providing the relevant support for timely and accurate responses to users’ queries (Zhang and Liang, 2006). They provide the information that is required for supporting executive decision making.
As earlier stated, data warehouses were integrated with ERP systems to solve data quality issues such as data duplication and redundancy. This is because they are designed to enable businesses to effectively and efficiently run a variety of business processes based on the master data from a single repository system, i.e. the data warehouse (Xu et al., 2002). Although considered a good solution, data warehouses have over the years proved to be only a temporary solution, as the information contained within these systems is still plagued with inaccurate and incomplete data, which adversely affects the competitive success of the organisation (Redman, 1992). The following section provides an overview of data quality and the data quality issues that affect ERP systems.
Although many definitions of data quality have emerged in the literature, the most widely accepted and used is “fitness for use” (Wand and Wang, 1996; Strong et al., 1997; Tayi and Ballou, 1998; Wang, 1998; Orr, 1998). This means that any concept of quality can only be applied at the moment when the data is used for some purpose (Dalcin, 2005). Simply put, the quality of data cannot be assessed without taking into consideration the people who use the data, in other words, the data consumers (Strong et al., 1997; Chrisman, 1991). In support of this concept, English (1999) stated that data within a database has no actual value or quality, but only possesses potential value. Its value is only realised when someone uses it for something.
Since data quality does not have a fully encompassing definition, it has been proposed as a multi-dimensional concept (Scannapieco and Missier, 2005; Strong et al., 1997; Lee and Strong, 2003). This is because data quality is defined across various dimensions in the literature. Some of the most typical dimensions are accuracy, reliability, importance, consistency, precision, timeliness, understandability, conciseness and usefulness (see Wang and Strong (1996) for an extensive list of data quality dimensions in the literature). Wang and Strong define a data quality dimension as “a set of data quality attributes that represent a single aspect or construct of data quality”.
Source: Moody et al. (1998).
For simplicity, Wang and Strong (1996) propose four categories of data quality dimensions: the intrinsic, accessibility, contextual, and representational categories, into which the dimensions have been grouped. Table 1 below shows the categories with their associated dimensions.
Intrinsic: accuracy, objectivity, believability, reputation
Contextual: relevance, value-added, timeliness, completeness, amount of information
Representational: interpretability, ease of understanding, concise representation, consistent representation
Table 1: Example of data quality dimension categories
Redman (2001) suggested that for data to be fit for use, it must satisfy the various dimensions, provide a proper level of detail, and be easy to read and easy to interpret. However, data quality does not only involve the achievement of the various dimensions; it also involves data management, modelling and analysis, quality control and assurance, storage and presentation.
Wand and Wang (1996), on the other hand, take an alternative approach by defining the data quality dimensions using Bunge’s ontology. They identify five intrinsic data quality problems, which occur when data is incomplete, meaningless, ambiguous, redundant or incorrect. Following this train of thought, Shanks and Drake (1998) also developed a framework that defines data quality goals for the intrinsic and contextual data quality categories, which they base on semiotic theory.
Semiotic theory is concerned with the use of symbols to convey knowledge (Shanks and Corbitt, 1999). Of the six levels proposed by Stamper (1992), only four are significant within the context of data quality: the syntactic, semantic, pragmatic and social levels.
Semiotic Levels in Understanding Data Quality in a Data Warehouse
Syntactic data quality is concerned with how data is structured (Shanks and Corbitt, 1999). The main goal of syntactic data quality is consistency, where data attributes have a consistent symbolic representation (Ballou et al., 1996). Pragmatic data quality refers to the usage of data. Its main goals are usability and usefulness (Kahn et al., 1997). Social data quality is concerned with the shared understanding of the meaning of symbols. Its goals are an understanding of different stakeholder points of view and awareness of any biases (Shanks and Corbitt, 1999). Semantic data quality refers to the meaning of data and will be the main focus of this research. This dimension is discussed in more depth in the following section.
The derivation of semantic quality criteria is based on the work of Wand and Wang (1996) because, as opposed to other literature, it provides a unique theoretical and precise approach to the definition of data quality criteria (Price and Shanks, 2004). As earlier stated, semantic data quality is concerned with the meaning of data (Shanks and Corbitt, 1999), with the main goal of achieving the highest level of data completeness and accuracy (Tayi and Ballou, 1998; Wang et al., 1995).
Accuracy refers to how well symbols represent states of the real world (Shanks and Corbitt, 1999). Pipino et al. (2002) define it as “the extent to which data is correct and reliable”, while Motro and Rakov (1998) present it as whether the data available are the true values.
Batini and Scannapieco (2006) describe semantic accuracy in terms of cases in which v is a syntactically correct value but different from the true value v’. Simply put, accuracy is the closeness between a value v and a value v’, where v’ is considered the correct representation of the real-life phenomenon.
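One simple way to make this notion measurable is sketched below. The function name, the sample values and the choice of exact equality as the test of closeness are all assumptions made for illustration; the sketch also presumes that a trusted reference value v’ is available for every stored value v, which is often not the case in practice.

```python
def semantic_accuracy(stored, reference):
    """Fraction of stored values v that equal their reference value v'."""
    if len(stored) != len(reference):
        raise ValueError("stored and reference must have the same length")
    if not stored:
        return 1.0  # nothing stored, so nothing can be wrong
    matches = sum(1 for v, v_true in zip(stored, reference) if v == v_true)
    return matches / len(stored)


# Hypothetical example: the last stored city is a valid city name
# (syntactically correct) but is not the true value.
stored_cities = ["Paris", "Rome", "Berlin", "Rome"]
true_cities = ["Paris", "Rome", "Berlin", "Madrid"]
score = semantic_accuracy(stored_cities, true_cities)
```

Here three of the four stored values agree with their reference values, so the score is 0.75, even though every stored value would pass a purely syntactic check.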
Completeness is defined as “the extent to which data are of sufficient breadth, depth, and scope for the task at hand”. The completeness dimension can be viewed from many perspectives. At the most abstract level, one can define the concept of schema completeness, which is the degree to which entities and attributes are not missing from the schema. At the data level, one can define column completeness as a function of the missing values in a column of a table (Pipino et al., 2002). For Wand and Wang (1996), completeness is the ability of an information system to represent every meaningful state of the represented real-world system.
Veregin (1998) defines completeness as “a lack of errors of omission in a database” and describes two kinds of completeness: data completeness, as a measurable error of omission observed between the database and the specification; and model completeness, as the agreement between the database specification and the “abstract universe”, that is, the part of the real world for which data are required for a particular database application. Motro and Rakov’s (1998) definition of completeness is “whether all the data are available”, following database terminology, where data completeness refers both to the completeness of files (no records are missing) and to the completeness of records (all fields are known for each record).
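Column completeness and record completeness, as described above, can be sketched as follows. The function names, the sample customer rows and the decision to treat both None and the empty string as missing are assumptions made for illustration.

```python
def is_present(value):
    """Treat None and the empty string as missing (an assumption of this sketch)."""
    return value not in (None, "")


def column_completeness(rows, column):
    """Share of rows whose value in `column` is present."""
    if not rows:
        return 1.0
    present = sum(1 for row in rows if is_present(row.get(column)))
    return present / len(rows)


def complete_records(rows, columns):
    """Rows in which every listed field is known."""
    return [row for row in rows if all(is_present(row.get(c)) for c in columns)]


# Hypothetical customer master data with two gaps.
customers = [
    {"id": 1, "name": "Acme Ltd", "city": "Leeds"},
    {"id": 2, "name": "Bolt plc", "city": None},
    {"id": 3, "name": "", "city": "York"},
    {"id": 4, "name": "Dyer & Co", "city": "Hull"},
]
city_completeness = column_completeness(customers, "city")
full_rows = complete_records(customers, ["id", "name", "city"])
```

The first function measures completeness per column, while the second corresponds to the completeness of records, where all fields must be known for a record to count.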
As earlier stated, ERP systems depend largely on master data representing their customers, suppliers and products to carry out the various business activities. Master data is typically created once and re-used many times, and does not change frequently. It is distributed through a controlled process which is supposed to ensure that all data is entered and approved with respect to business rules, and that every user and every system receives new or updated master data on demand (Knolmayer and Rothlin, 2006).
The major problem affecting master data is that its capture and processing are error-prone activities. This can be due either to human error when capturing the data or to the integration of data with different semantic rules from multiple data sources into the data warehouse. For instance, duplicated or missing data will produce incorrect or misleading statistics; simply put, garbage in, garbage out (Rahm and Do, 2000). The data quality problems plaguing these systems can be classified into two categories: single-source problems and multi-source problems.
From the above diagram, both the single-source and multi-source problems are categorised into schema-level problems and instance-level problems. Schema-level problems are those that are reflected in the design of the data store, while instance-level problems refer to the errors and inconsistencies in the actual data; these are not visible at the schema level, although problems at the schema level affect the instance level.
Single-source problems can be grouped into those that occur in a single relation (a database or a file) and those that arise from existing relationships among the various relations (Oliviera et al, ). The data quality of a source is based on the schema and integrity constraints controlling permissible data (Rahm and Do, 2000). A schema is an internal representation of the world; an organisation of concepts and actions that can be revised by new information about the world (Wordnet, ). For data sources that do not have schemas, such as files, there are no restrictions on what is allowable. Therefore there is a very high probability of errors occurring. Databases, on the other hand, enforce restrictions based on the pre-defined data model and application-specific integrity constraints (Rahm and Do, 2000). This gives rise to schema-related problems if the data model or application-specific integrity constraints are inappropriate. In this case, semantic integrity refers to the “preservation and consistency of database semantics across different applications”.
At the instance level, the errors that occur are due to misspellings, duplicated data, meaningless data, and contradictory data. These errors cannot be prevented at the schema level but can be influenced by errors at that level (Rahm and Do, 2000). Most of these errors occur due to human error when data is being entered into the system.
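Two of these instance-level errors, duplicated data and meaningless data, lend themselves to simple automated checks. The sketch below is illustrative only: the placeholder list, the normalisation rule and the sample supplier names are assumptions, and real cleansing tools apply far richer rules.

```python
from collections import Counter

# Assumed list of values that carry no meaning when found in a name field.
PLACEHOLDERS = {"", "n/a", "unknown", "xxx", "test"}


def normalise(value):
    """Lower-case the value and collapse runs of whitespace."""
    return " ".join(value.lower().split())


def find_duplicates(names):
    """Normalised names that occur more than once."""
    counts = Counter(normalise(name) for name in names)
    return sorted(name for name, count in counts.items() if count > 1)


def find_meaningless(names):
    """Original values that normalise to a known placeholder."""
    return [name for name in names if normalise(name) in PLACEHOLDERS]


supplier_names = ["Acme Ltd", "acme  ltd", "Bolt plc", "N/A", "xxx", "Dyer & Co"]
duplicates = find_duplicates(supplier_names)
meaningless = find_meaningless(supplier_names)
```

Neither check could be expressed as a schema constraint in this form, since each value is individually a valid string; the errors only show up when the actual data is inspected.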
Multi-source problems are aggravated single-source problems. Each source may contain dirty data, and the data in the sources may be represented differently, overlap or contradict one another (Rahm and Do, 2000). They occur at the schema level because the different data sources are governed by different schema and integrity rules. The different sources have different data models, different naming conventions, etc. At the instance level, data from the various systems might mean the same thing but be represented differently. For example, gender in one source may be represented as Female and Male, while in another it may be represented as 0 and 1. This therefore leads to data inconsistencies and duplication.
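The gender example can be sketched as a mapping from each source's representation onto one canonical code list. The source names and the canonical codes "F" and "M" are hypothetical, and which of 0 and 1 stands for which gender is an assumption, since the example above does not specify it.

```python
# Hypothetical per-source mappings onto one canonical code list.
# Assumption: in the second source, 0 means female and 1 means male.
GENDER_MAPPINGS = {
    "source_a": {"Female": "F", "Male": "M"},
    "source_b": {0: "F", 1: "M"},
}


def to_canonical_gender(source, value):
    """Translate a source-specific gender value into the canonical code."""
    mapping = GENDER_MAPPINGS[source]
    if value not in mapping:
        raise ValueError(f"unmapped gender value {value!r} from {source}")
    return mapping[value]


records = [("source_a", "Female"), ("source_b", 1), ("source_b", 0)]
canonical = [to_canonical_gender(source, value) for source, value in records]
```

Raising an error on an unmapped value, rather than passing it through, is a deliberate choice in this sketch: it keeps an unknown representation from silently entering the integrated data.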
During data integration, there are two levels at which integration may occur: the extensional and intensional levels.
The implications of such data quality problems for an enterprise cannot be overstated. The following section gives a brief description of the importance of data quality in a typical enterprise.
Redman (1996) stated that poor data quality impacts a typical enterprise at various levels. At an operational level, poor data quality leads to customer dissatisfaction, increased cost and reduced employee job satisfaction. It leads to an increase in operational cost because time and other financial and non-financial resources are dedicated to detecting and fixing errors. Data quality also has an impact at the tactical level, where poor data quality presents difficulty in the reengineering process. At a strategic level, poor data quality makes it increasingly difficult to set and execute business strategy. It also contributes to issues of data ownership and diverts management’s attention.
Therefore, this research aims at improving the data models of ERP systems, because if the quality of the data model that defines the entities and attributes relevant to the user is improved, then data quality will also be improved (Reigner and Gregory, 1994; Fox et al., 1994). Although the data modelling phase represents only a small part of the total development effort, its impact on the final result is probably greater than that of any other phase. The data model forms the foundation for all later design work, and is a major determinant of the quality of the overall system design. This is because data models focus on structuring the entity and attribute parts of user requirements (Yoon et al., 2000). The data model is one of the most critical components in the entire system’s development. To this end, this research will adopt the use of ontologies to improve the data model of the ERP system. The following section gives a brief introduction to ontologies and their uses. It also highlights how they will be used to improve the data model.
Ontologies have gained popularity within the Information Technology community because they serve as a means of establishing an explicit formal vocabulary to share between applications (Noy, 2004). Aristotle defined ontology as the “science of being” (cited in Guarino and Giaretta ). Traditionally studied in philosophy, ontology is “the metaphysical study of the nature of being and existence”. Smith and Welty (2001) defined it in a different manner when they presented ontology as “the science of what is, of the kinds and structures of objects, properties, events, processes, and relations in every reality”.
Ontology is a well-established theoretical domain within philosophy dealing with identifying and understanding elements of the real world and their meaning. As one of the many borrowed terms within the context of computer science (Antoniou and Harmelen, 2004), its most commonly used definition is given by Gruber (1993), who defines an ontology as “an explicit specification of a conceptualisation”. A conceptualisation is simply the way the world or a particular domain is viewed; hence an ontology describes a domain in terms of its concepts and relationships (Horridge et al., 2004). While Guarino and Giaretta (1995) have been critical in their definition, it has mostly been philosophical in nature. Gruber, on the other hand, communicates the idea of an ontology in a simple and precise manner, which makes his the most generally accepted definition of ontology.
Wang et al. ( ), on the other hand, define an ontology as “a formal explicit specification of a shared conceptualisation of a domain. It represents the concepts and their relations that are relevant for a given domain of discourse. It consists of a representational vocabulary with precise definitions of the meanings of the terms of this vocabulary plus a set of axioms.” Providing a simpler definition, Fensel (2001) said it “provides a shared and common understanding of a domain that can be communicated between people and heterogeneous, widely spread application systems.”
The essence of information systems is that they are designed to be a faithful representation of the world in the same way humans perceive it. Therefore, theories of ontology provide the basis for understanding and documenting the real-world semantics of the data (Daga et al., 2005). Data models have been used in the context of information systems for many decades with the main aim of creating representations of reality. They are used in organisations to represent reality at three different levels: to establish the highest level of description of an organisation’s reality, to construct a description of the reality surrounding a proposed information system and, finally, to model the parts of an organisation’s reality leading to implementation in an operational database (Kazimerczak and Milton, 2005). Ontologies are therefore proposed as a method of improving data models, as they present a shared understanding of a domain. They enable shared communication and understanding between people with different needs and viewpoints arising from their different contexts. This allows for the creation of normative models which are extendable for future purposes and creates the semantics of the system (Uschold and Gruniger, 1996).
Ontologies also promote consistency and reduce the ambiguity of data that exists in different systems. They provide an easily re-usable library of objects, attributes, relationships, etc.; hence integrating data from different systems becomes easier.
A wide range of research has been carried out on the issue of data quality, with numerous methods proposed for overcoming such issues. This research will focus on work carried out with regard to data integration and data warehouses. One such method is proposed by Goh et al. (1999) and Mandick and Zhu (2006), which focuses on a flexible query answering system named COntext INterchange (COIN). The system allows users to query data in multiple sources without worrying about most of the syntactic and semantic differences in those sources (Mandick et al., 2009). COIN understands the context of the data sources and the data consumers and attempts to overcome data misinterpretation problems by converting data into forms users prefer and can understand (Mandick et al., 2009). Although the COIN method has proved to be useful in solving some aspects of data quality, it completely ignores the presence of data quality issues within the systems.
There has also been significant research in entity resolution and schema matching. Schema matching (Rahm and Bernstein, 2001; Doan and Halevy, 2005) entails the development of techniques to automatically or semi-automatically match the various data schemas, the result of which can be used to construct a global schema for the data warehouse. Entity resolution (Wang and Mandick, 1989; Talburt et al., 2005), also known as record linkage (Winkler, 2006) and object identification (Tejada et al., 2001), provides the techniques that are used to improve completeness, resolve inconsistencies and eliminate redundancies during the data integration process (Mandick et al., 2009). Other conventional solutions include data cleaning (cleansing), data monitoring, etc.
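The idea behind entity resolution can be shown with a deliberately small sketch. It is not the method of any of the works cited above: it simply pairs records from two hypothetical sources when a generic string similarity score from Python's standard `difflib` module passes an assumed threshold of 0.85. The customer names are invented for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1] between two strings, ignoring case and extra spaces."""
    a_norm = " ".join(a.lower().split())
    b_norm = " ".join(b.lower().split())
    return SequenceMatcher(None, a_norm, b_norm).ratio()


def link_records(source_a, source_b, threshold=0.85):
    """Pair each record in source_a with its best match in source_b, if close enough."""
    links = []
    for a in source_a:
        best = max(source_b, key=lambda b: similarity(a, b), default=None)
        if best is not None and similarity(a, best) >= threshold:
            links.append((a, best))
    return links


# Hypothetical customer names held in two systems.
erp_customers = ["Acme Limited", "Bolt Engineering plc", "Dyer and Company"]
crm_customers = ["ACME Limitd", "Bolt Engineering PLC", "Northern Rail"]
links = link_records(erp_customers, crm_customers)
```

In this example the misspelt and differently capitalised names are linked, while "Dyer and Company" and "Northern Rail" remain unmatched. The threshold controls the trade-off between missed links and false links and would need tuning on real data.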
Although all research is significant in its own right, as there is no single perfect approach to solving a problem, the following section provides a comparison between ontologies as a solution for data quality issues and the conventional approaches, and argues why ontologies are considered the best solution in this research.
Botta-Genoulaz and Millet (2006) An investigation into the use of ERP in the service sector. International Journal of Production Economics, Vol. 99 (1-2), pp. 202-221
Fontini, M., Anthi-Maria, S. and Euripidis, L. (2008) ERP Systems Business Value: A Critical Review of Empirical Literature. Panhellenic Conference on Informatics, pp. 186-190
Knolmayer, G. F. and Rothlin, M. (2006) Quality of Material Master Data and Its Effect on the Usefulness of Distributed ERP Systems. Lecture Notes in Computer Science, Vol. 4231, pp. 362-371
Scott, T. (2002) Aligning your data collection and ERP implementation decisions. IT Papers, available at: http://www.autoscan.biz/images/PDF/resource/aligning%20your%20data%20collection%20and%20ERP%20implementation%20decisions.pdf. Accessed 22nd July, 2009
Uschold, M. and Gruniger, M. (1996) Ontologies: Principles, methods and applications. Knowledge Engineering Review, Vol. 11 (2), pp. 93-155
Wieczorek, S., Stefanescu, A. and Schieferdecker, I. (2008) Test Data Provision for ERP Systems. International Conference on Software Testing, Verification, and Validation, pp. 396-403
http://wordnetweb.princeton.edu/perl/webwn?s=schema
Xu, H., Nord H. J., Brow, N. and Nord, D. G. (2002) Data quality issues in implementing an ERP. Journal of Industrial Management & Data Systems, Vol. 102 (1), pp. 47-58
Zhang, H. and Liang Y. (2006) A Knowledge warehouse system for enterprise resource planning systems. Systems and Behavioural Science, Vol. 23 (2), pp. 169-176