developing a patient data mining system for the …

University of Ghana Business School, Legon

DEVELOPING A PATIENT DATA MINING

SYSTEM FOR THE UNIVERSITY OF GHANA

HOSPITAL

By

AFARI-KUMAH, EBENEZER

A Dissertation presented to the University of Ghana Business School, Legon in PartialFulfillment for the award of a Master of Business Administration (Management

Information Systems Option) Degree.

July 2005UGBSUniversity of GhanaLegon.

DECLARATION

I hereby declare that this study is my original piece of research conducted between August 2004 and

May 2005 under the supervision of Mr. John Effah, University of Ghana Business School (UGBS),

Legon

In places where references of other people’s work have been cited or their view adopted, full

acknowledgements have been given. I therefore accept full responsibility for any mistake contained

therein. No part of this project work has either been presented in whole or in part to any other

institution for any award.

……………………………… ……………………………..

AFARI-KUMAH, EBENEZER MR. JOHN EFFAH

(STUDENT) (SUPERVISOR)

DEDICATION

To

My parents,

Emmanuel Frank Afari-Kumah and Comfort Yawa Afari-Kumah (1932-1992)

and Siblings,

whose love is the bedrock upon which my life is built.

ABSTRACT

Patient medical data are collected at the University of Ghana Hospital but they are not processed

electronically. In the current medical record management system, the majority of out-patients do not

have a full medical record. Thus, physician’s time is wasted by having to collect all the information

again. In addition, it becomes very difficult to keep track of the patients and to review a patient’s

medical record.

A Data Mart was designed and built using Microsoft Access 2000 Database Management Systems

(DBMS) to collect, store, organize and retrieve medical information of patients at the Medical Records

Unit of the University of Ghana Hospital. The Data Mart was mined to provide timely, accurate and

reliable reports adequate for clinical research and improving health care continuity.

The Data Mart interfaces are intuitive and easy to use. The queries are flexible and the reports well

labeled.

ACKNOWLEDGEMENTS

The successful completion of this study would not have been accomplished without the guidance, co-

operation and support of some people.

First and foremost, I wish to thank the Almighty God for seeing me through this project successfully.

My profound gratitude goes to my supervisor, Mr. John Effah for availing his expertise, suggestions,

advice, comments and constructive criticisms to the completion of this write-up, and also to the Head

of Management Information Systems Department, UGBS, Dr. F. K Banuro for his constructive

criticisms and comments.

A special note of appreciation also goes to the Director-General of Council for Scientific and Industrial

Research (CSIR), and the Management and Staff of the Institute for Scientific and Technological

Information (INSTI) for granting me the study leave to pursue the post-graduate course. To the

management and staff of the University of Ghana Hospital especially Messrs Gaisie and Offei, I say

thank you for granting me opportunity to use the data at your outfit for this study.

Mention must also be made of these people for their support, encouragement and pieces of advice that

spurred me on to go through this master’s programme successfully, Mr. C. Entsua-Mensah, Director,

INSTI-CSIR, Mrs. Marjorie Ethel Danso-Manu, Head, Department of Statistics, UG and Mr. Kwabena

Sarpong-Boateng, Tennessee USA.

I owe special gratitude to Mr. Martin Egblewogbe, Physics Department for reading the manuscript and

making useful suggestions towards this work, Messrs J.O Tackie-Yarboi and George Edward Neizer,

all of AQ Solutions for their Technical input and Prof. Jeffrey D. Ullman, Department of Computer

Science, Stanford University for his thought provoking questions and suggestions.

To the Church Board and the Youth Ministry of Global House of Help (Ebenezer Assembly of God

Church, Kotobaabi) for their prayers; my siblings for being there for me; Prisca, Power, Herty, and my

course mates for their encouragements.

Finally, to all who assisted me in diverse ways, I say thanks and may God bless you.

TABLE OF CONTENTS

DECLARATION i

DEDICATION ii

ABSTRACT iii

ACKNOWLEDGEMENT iv

TABLE OF CONTENTS vi

LIST OF FIGURES ix

CHAPTER ONE INTRODUCTION TO THE STUDY 1

1.0 Background to the study 1

1.1 Brief History of the University of Ghana Hospital 6

1.1.1 Medical Records Unit 7

1.2 Statement of the problem 8

1.3 Objectives of the Study 10

1.4 Scope 10

1.5 Justification 10

1.6 Methodology 12

1.7 Significance of the Study 13

1.8 Conclusion 14

CHAPTER TWO DATA WAREHOUSING, DATA MART AND 15

DATA MINING CONCEPTS

2.0 Introduction 15

2.1 Information systems 142.2 Data Warehousing 17

2.2.1 Data warehousing and OLTP systems 202.2.2 Problems with datawarehousing 202.2.3 Benefits of Data Warehousing 21

2.3 Data Marts 21

2.3.3 Users of Data Warehouse and Marts 22

2.4 Data Mining 22

2.4.1 Data Mining Process 25

2.4.2 Data Mining Techniques and Models 25

2.4.3 Data Mining Functions 26

2.4.4 Data mining problems/issues 26

2.5 Current Data Mining Research Efforts 27

2.5.1 Data Warehousing and Data Mining In Healthcare 30

2.5.2 Benefits of data warehousing and data mining in Healthcare 30

2.5.3 Problems of data warehousing and data mining in Healthcare 34

2.5.4 Health Data Mining Software 36

2.6 Conclusion 37

CHAPTER THREE THE EXISTING PATIENTS’ RECORDS 38 MANAGEMENT SYSTEM 3.0 Introduction 38

3.1 Description of the existing system 38

3.1.1 The Process 38

3.1.2 Databases and Records Storage 41

3.1.3 Hardware 41

3.1.4 Software 41

3.1.5 Human ware 41

3.1.6 SWOT Analysis 42

3.1.6.1 Strengths 42

3.1.6.2 Weaknesses 42

3.1.6.3 Opportunities 43

3.1.6.4 Threats 43

3.2 Conclusion 44

CHAPTER FOUR INFORMATION REQUIREMENTS 45

OF DECISION MAKERS

4.0 Introduction 45

4.1 Monthly Statistical Report 45

4.2 Financial Statistical Report 46

4.3 Patient Status Report 46

4.4 Pharmaceutical report 47

4.5 Conclusion 47

CHAPTER FIVE DESIGN AND DEVELOPMENT OF THE PATIENTS’ DATA MINING SYSTEM 48

5.0 Introduction 48

5.1 Database Design for Data Mart 48

5.1.1 Data Dictionary of the Database of the Data Mart 50

5.2 The Star Schema Model 59

5.2.1 Data Dictionary for the Star Schema of Data Mart 61

5.3 Development Tool 66

5.4 Input design 66

5.5 Preparing the Data Mart for Mining 77

5.6 Queries 81

5.7 Reports 83

5.8 Conclusion 86

CHAPTER SIX PATIENT DATA MINING AND RESULTS 87

6.0 Introduction 87

6.1 Reports 87

6.2 Monthly Statistical report 87

6.3 Financial Statistical Report 97

6.4 Patient Status Report 98

6.5 Pharmaceutical Report 108

6.6 Conclusion 117

CHAPTER SEVEN RECOMMENDATIONS AND CONCLUSIONS 118

7.0 Recommendations 118

7.1 Conclusion 120

REFERENCES 122

LIST OF FIGURES

Fig. 1.1 Organisational chart of records unit

Fig. 3.1 Data Flow Diagram of Patient Record Management System

Fig. 5-1 Entity Relationship (ER) Diagram

Fig. 5-2 Logical Data Model of Data Mart (Star Schema)

Fig. 5.3 The Main switchboard

Fig. 5.4 The Patient Information Form

Fig. 5.5 The Treatment Form

Fig. 5.6 The Diagnosis Form

Fig. 5.7 The Visit Form

Fig. 5.8 The Outcome Form

Fig. 5.9 The Drug Form

Fig. 5.10 The Admission Form

Fig. 5.11 The Purpose of Visit form

Fig. 5.12 The Physician Form

Fig. 5.13 Extract and Load Data Mart form

Fig 5.14 Extract and Load Data Mart Message Box One

Fig. 5.15 Extract and Load Data Mart Message Box Two

Fig. 5.16 Extract and Load Data Mart Message Box Three

Fig. 5.17 Queries Interface

Fig. 5.18 Reports Interface

Fig. 5.19 Patients’ Data Mining System

Fig. 6.1 Top ten causes of Hospital Attendance

Fig. 6.2 Total Treatment and Maintenance Incomes

Fig. 6.3 Main Causes of Hospital Attendance

Fig. 6.4 Number of Patients Admitted By Status

Fig. 6.5 Outcomes of Patients Attendance

Fig. 6.6 Gender of Admitted Patients

Fig. 6.7 Wards of Admitted Patients

Fig. 6.8 Status of Admitted Patients

Fig. 6.9 Patient Diagnosis and Prescription

Fig. 6.10 Total Treatment and Maintenance Incomes

Fig. 6.11 Number of Patients by status

Fig. 6.12 Patient Attendance by Gender

Fig. 6.13 Number of Patients According to Diagnosis

Fig. 6.14 Number of Patients According Address

Fig. 6.15 Patient Attendance by Age

Fig. 6.16 Patients Diagnosis

Fig. 6.17 Prescriptions by Gender of Patients

Fig. 6.18 Drugs Consumed By Patients

Fig. 6.19 Prescribed Drugs and Age of Patients

Fig. 6.20 Diagnosis and Drugs Prescribed Per Patients

Fig. 6.21 Top Ten Causes of Hospital Attendance

Fig. 6.22 Status of Patients and Drugs Prescribed

CHAPTER ONEINTRODUCTION TO THE STUDY

1.0 Background

The past two decades have seen a dramatic increase in the amount of data being stored in electronic

format. This accumulation of data has taken place at an explosive rate. It has been estimated that the

amount of information in the world doubles every 20 months and the size and number of databases are

increasing even faster (Dilly 1995:1). According to Marakas (1998:5), during the 1980s, businesses

and industries all over the globe participated in a “frenzy” of automation. Almost literally, “if it moved,

it was computerised”. In this regard, the office became the new frontier for analysts and software

engineers much as the factory floor had been in the prior decade. This overwhelming concentration on

the automation of business processes appeared to offer organisations the opportunity to improve from

within and realise benefits such as increased profits and reduced costs.

Although many mission-critical business processes were improved during this period, the support of

decision making throughout the organisation remained focused on the operational and functional levels

of the organisation. Basic stock reports gushed out of the organisation’s information systems at an

alarming rate on a daily, weekly, and monthly basis.

Often, the data used to construct these recurrent emanations were too old, too detailed, or too

aggregated. In some cases, the use of such reports led to negative consequences seemingly greater than

if no information had been used.

By the end of the 1980s, it became clear to most organisations that if they were to identify a single key

to survival in the 1990s and beyond, it would be the ability to analyse, plan, and react to changing

business conditions in a much more rapid fashion. To accomplish this, management at all levels of the

organisation would need more and better information.

Database Management Systems gave access to the data stored but this was only a small part of what

could be gained from the data. Traditional On-Line Transaction Processing Systems (OLTPs), are good

at putting data into databases quickly, safely and efficiently but are not good at delivering meaningful

analysis in return.

In the view of Marakas (1999), for most part, these data were literally “locked up” in a thousand

computer systems and were, metaphorically speaking, “data in jail”. It is estimated that only a small

fraction of the data that are captured, processed, and stored in the enterprise is ever actually made

available to executives and decision makers.

The concept of a data warehouse or a data mart is part of the response by information technology to

meet this identified need. It is an elegantly simple concept that, over time, has the potential to evolve

into a significant contributor to the success and stability of an organisation in the global marketplace.

The data mart provides a facility for integrating the data generated in a world of unintegrated

information systems. A functional data mart organizes and stores all of the available data needed for

informational and analytical processing over a historical time perspective.

Data storage became easier as the availability of large amounts of computing power at low cost became

available i.e. the cost of processing power and storage has fallen and has made data cheap. There was

also the introduction of new machine learning methods for knowledge representation based on logic

programming in addition to traditional statistical analysis of data.

The new methods tend to be computationally intensive hence a demand for more processing power.

Improved data and information handling capabilities have contributed to the rapid development of new

opportunities for knowledge discovery. Interdisciplinary research on Knowledge Discovery in

Databases (KDD) has emerged in the present decade (Dilly, 1995:4).

Data mining as automated pattern recognition, is a set of methods applied to KDD that attempts to

uncover patterns that are difficult to detect with traditional statistical methods. Patterns are evaluated

for how well they hold on unseen cases. Databases, data warehouses, and data repositories are

becoming ubiquitous, but the knowledge and skills required to capitalize on these collections of data

are not yet widespread. The use of traditional verification-based approaches to analysis is difficult

when the data is massive, highly dimensional, distributed, and uncertain. Data Mining is the process of

discovering unexpected patterns and trends in data. Yet this process is often misunderstood and

undervalued (Abbott et al, 1999:2).

Data mining claims a novel kind of data exploitation - it is not simply the hypothesis confirmation of

statistics; nor is it simply the data visualization of graphs and plots. Data mining is becoming a force to

be reckoned with, because of the way it can generate new ideas.

It is one of the most efficient and effective ways to add value to data - and the kind of investments

many companies are making in data accumulation and storage right now may be hard to justify without

it.

Theus (2003: 298) identified two factors which basically triggered the momentum of data mining in the

last 10 years. One factor is the fact that more and more organisations were able to gather large amounts

of data on their customers and their transactions.

With modern database systems, these data were accessible to analysts in Research and Development

(R&D) and Marketing in an unprecedented easy and convenient way. The second factor is that software

companies started to develop “out of the box, off the shelf” applications, which apparently would serve

the data analysis needs of those companies.

The health sector in recent times has seen an increasing use of information and communication

technology in almost all facets of health service delivery. These developments are taking place largely

in an environment in which the infrastructure for the deployment and use of Information Technology

remain either weak or uncertain. Among the many weaknesses in infrastructure are the unreliable

domestic electricity supply and the lack of well-developed national technology infrastructures; an

inadequate level of computer literacy, computer density and an underdeveloped telecommunications

infrastructure. These limitations are especially pronounced in the rural areas where the need for health

services is strongest (Ghana Health Policy Document, 2000:1).

It is an acknowledged fact that information and communications technology can provide direct benefits

to health care delivery primarily by increasing access to medical and health care. This role of

technology in health care has been summed up in the Alma-Ata declaration of 1978 which state that

“Primary health care is essential health care based on practical, scientifically sound and socially

acceptable methods and technology made universally accessible and at a cost that the community and

country can afford”.

According to Hersh (2003), we are in an era of uncontrolled health care costs, compromised patient

safety, and a lack of utilization of Information Technology applications that have the potential to

improve the situation. We are also on threshold of great new scientific advances in genomics and

related disciplines. The discipline that focuses on the acquisition, storage, and use of information in

health and biomedicine is called Medical Informatics. Hersh defined Medical Informatics as the

integrative discipline that arises from the synergistic application of computational, informational,

cognitive, organisational, and other sciences whose primary focus is the acquisition, storage, and use of

information in the health/biomedical domain.

1.1 Brief History of the University of Ghana Hospital

The University Hospital was commissioned in 1957 as a department of the University of Ghana (UG)

to provide medical care to the university community comprising staff and their dependents, and

students. In 1976, the services of the hospital were extended to the general public to cater for the rapid

growth in population of the surrounding communities. In recognition of this extended responsibility,

the Ministry of Health (MOH) provided assistance annually in the form of drug supplies, equipment

and instruments until 1986 when the cash and carry system was introduced. Currently classified as a

district hospital by the MOH, it provides general and specialist care.

The hospital is governed by the statutes of the University and is funded from the university’s annual

subvention allocation (PriceWaterHouseCoopers, 2000:19).

The hospital is an 80 bed hospital located behind the Legon Police station opposite the UG main

entrance. The Hospital also receives referral cases from the Student Clinic located on the UG campus,

private hospitals and clinics within its catchment area (Annual Report, 2000:8).

1.1.1 Medical Records Unit

Of particular interest in this study is the records unit. The records unit is controlled by a Biostatistician

who is in charge of the general administration and management of the unit.

He reports directly to the Director of University Health services as well as the Administrator of the

Hospital. There are other staff that report directly to the Biostatistician and assist in data collection from

patients and management of records.

The unit has two divisions under it. One handles records of Senior members and staff, as well as their

dependents while the other department deals with the records of Junior staff, and their dependents,

private patients, public institutions, students, and guests of the University.

Figure 1.1

Organisational Chart of the records unit

1.2 Statement of the Problem

The Medical Records Unit of the University Hospital keeps records on patients. Such records include

in-patients, out-patients, out-patients’ and in-patients’ morbidities, and mortality returns, monthly

returns on deliveries, clinical procedures, outcomes and costs, diagnosis, medical practitioners, and bed

occupancy. This record keeping is not automated. Due to the current medical record management

system, the majority of out-patients do not have a full medical record. With this situation, physician’s

time is wasted by having to collect all this information again.

In addition, it becomes difficult to keep track of the patients and also almost impossible for physician to

review this kind of medical record. This reduces the ability to carry out high quality clinical research.

Poor medical record management not only compromises the continuity of healthcare, but also the

quality of healthcare delivery in the hospital.

The patient’s age instead of date of birth is captured. Thus, when the current age of a patient who

visited the hospital say, five years ago is required, it has to be calculated. In addition, the same medical

record number of a staff is assigned to his or her dependents and this situation does not help in further

clinical research on an individual patient. Therefore, it is necessary to find a tool for managing and

organizing patients’ medical records. In the current records management system, it is difficult for the

staff to determine the relationships in these data with traditional statistical methods to provide

information, which are critical for a wide variety of problems that confront management of the

hospital. Calculators as well as manual methods – pen and paper – are used to calculate frequencies,

means, as part of the data analysis at the unit. Moreover, for a hospital in an academic environment,

collecting and organizing patients’ medical information is very important for clinical research and high

quality health care delivery.

A more comprehensive way of organising the data collected such that information can be retrieved on

time to satisfy decision makers and for the effective management of the hospital is important. This is

why this patients’ data mining system is being developed for the University of Ghana hospital.

1.3 Objectives of the Study

The main objective of the study is to develop a Patient’s Data Mart and use data mining techniques to

analyze it for relevant information. The project work specifically attempts to:

i. Capture comprehensive data on patient for clinical research

ii. Provide an insight into the disease patterns among patients of thehospital

iii. Review patients’ overall medical information

iv. Provide the Hospital Authority with data mining solutions andservices to allow them understand health related behaviours and the pathologiesencountered.

1.4 Scope

The study covered the Medical Records Unit of the University Hospital where records on patients and

the various clinical procedures used; costs and the outcomes are kept. A three stage Data Mart

Development Life Cycle was adopted. These were Planning, Development and Deployment. Planning

resulted in the delivery of a project definition document and a detailed project plan. The two documents

detail a clearly defined scope, an overall schedule of the project activities, the project organisation and

clearly defined project control. Development delivers a Data Mart prototype and a revised plan for

executing the Deployment stage. The Data Mart deliverables include a data base loaded with live

operational data, extract and load scripts; installations and configuration of the query environment; and

design documents.

Deployment delivered a production Data Mart and documentation. Deliverables for the production Data

Mart include a fully populated database optimised for performance, production extract and load scripts,

a production end-user query environment and production reports. The documentation deliverables

include system documentation and initial training of users and support staff. Clustering and

classification data mining tools were used to mine the data mart.

1.5 Justifications

Lack of access to relevant and timely information poses a threat to health care delivery in Ghana. Some

of the data collected is of poor quality due to low capacity, lack of appreciation of the information

generated from them and inappropriate data collection tools.

Although the Hospital has been managed over the years with the available data, it could have been

better managed if the large operational database had been analyzed to reveal some of the trends in it.

With the passage of the Health Insurance bill into law, the University Hospital will be required to

organize its internal systems to provide services as prescribed by the law. This would require extensive

restructuring of both outpatients and inpatient services.

The results of the study would help to analyze the available dataset and provide management with

timely and accurate information for effective decision-making.

1.6 Methodology

Primary data was collected using personal observation methods as well as interviewing decision makers

in the hospital. The decision makers were the Hospital Administrator, Director of Health Services,

Pharmacist, Chief Nursing Officer, and Accountant. Secondary data was also obtained from

administrative documents and the patient’s files available at the records unit. Other secondary data was

obtained from books, journals, papers and the Internet on the subject.

A Strengths, Weaknesses, Opportunities and Threats (SWOT) analysis was carried on the existing

system to determine why the current methods are used and points out its shortcomings. It also

determined whether it was feasible to improve upon the current system or to develop a new one. A Data

Flow Diagram (DFD) was used to model the day-to-day transactional processes within the existing

system.

An Entity-Relationship Diagram (ERD) was used to develop a conceptual model for an on-line

transaction processing system (OLTP) database management system (DBMS) to serve as a source

system for the Data Mart. Microsoft Access, a Relational Database Management System was used to

develop this database system and it captured the day-to-day transactional processes of the Records Unit

since there was no computerised database currently available at the Unit.

Afterwards, a Star Schema Model was created for the data mart to be used for the data mining process.

Microsoft Access was again used to build the data mart based on the star schema model. Queries were

developed using Microsoft Access to Extract, Transform and Load (ETL) data into the data mart.

Microsoft Access Queries and Reports were developed for the data mining process. The programme

was tested and an overview of the necessary control and security measures for implementation was

considered.

1.7 Significance of the Study

For those in Academia and Research, the study would among others (i) Provide good grounding in data

mining and its related concepts, like data warehousing, data marts and so on. (ii) Enable research by

organisations involved in Marketing, Telecoms, Insurance and Banks into using data mining techniques

to increase Returns on Investment (ROI).

In the case of practitioners, the results of the study would (i) Enable medical practitioners discharge

their duties efficiently and effectively, (ii) Assist the hospital authorities to better manage their facilities

and Staff (iii) Assist the hospital authorities to contribute effectively to the efficient and smooth

running of the various insurance schemes that are being implemented, (iv) Provide an easy to use

interface with the capability of quickly drilling down to increasingly detailed patient information, (v)

Improve quality of patient care through the use of data, trend and comparative analysis of clinical data,

(vi) Provide detailed clinical information to any Health Research Data warehouse, which will be used

to analyze large amounts of data in long-term research projects, (vii) Provide health services research

through data management and analysis for better Policy Support.

The result will be useful to the government health managers, nongovernmental healthcare providers and

their managers, and policy makers in health and related sectors of the economy.

Finally, the Ministry of Health would benefit immensely since as part of its management information

and performance monitoring, it hopes to improve upon the application of appropriate technology for

data collection, storage, analysis and dissemination of health information.

1.8 Conclusion

This chapter examined the background of the study and the methodologies adopted in the study as well

as the advantages to be derived from the study. The next chapter reviews the available literature on the

concepts used in the study.

CHAPTER TWO

LITERATURE REVIEW

2.0 Introduction

This chapter reviews the concepts and literature relevant to the study. Section 2.1 opens with review on

Information Systems, followed by section 2.2 on Data warehousing and section 2.3 on Data Marts and

users of it. Section 2.4 deals with data mining, and related researches and Section 2.5 focuses on

current research work on data mining , benefits, problems and software being used. Section 2.6

concludes the chapter.

2.1 Information Systems

An information system according to Laudon and Laudon (2000:7), is a set of interrelated components

working together to collect, process, store, and disseminate information to support decision making,

coordination, control, analysis, and visualization in an organisation. Four main kinds of information

systems were identified that serve different organisational levels. These are operational-level systems,

knowledge-level systems, management–level systems and strategic level systems. Information systems

are built to serve each of the four levels of an organisation. According to Laudon and Laudon (2000),

there are six major types of information systems in contemporary organisations that are designed for

different purposes and different audiences. Operational level systems are Transaction Processing

Systems (TPS), such as payroll or order processing, that track the flow of the daily routine transactions

necessary to conduct business. Knowledge-level systems support clerical, managerial, and professional

workers.

They consist of office automation systems (OAS) for increasing the productivity of data workers and

knowledge work systems (KWS) for enhancing the productivity of knowledge workers. Management

level systems i.e. Decision support systems (DSS) and Management Information systems (MIS) provide

the management control level with reports and access to the organisation’s current performance and

historical records. Most MIS reports condense information from a TPS and are not highly analytical.

Decision support systems (DSS) support management decisions when these decisions are unique,

rapidly changing, and not specified easily in advance. DSS have more advanced analytical models and

data analysis capabilities than MIS and often draw on information from external as well as internal

sources.

Executive support systems (ESS) support the strategic level by providing a generalized computing and

communications environment to assist senior management’s decision making. Laudon and Laudon

(2000:37) indicated that the various types of systems in the organisation exchange data with one

another. Transaction processing systems are a major source of data for other systems, especially MIS

and DSS. The information needs of the various functional areas and organisational levels are too

specialized to be served by a single system. With the increasing amount of Data from these systems,

there is the need to organize it in such a way that Information and Knowledge can readily be made

available for decision making purposes.

2.2 Data Warehousing

A data warehouse is a relational database management system (RDMS) designed specifically to meet

the needs of transaction processing systems. It can be loosely defined as any centralized data repository,

which can be queried for business benefit. Data mining potential can be enhanced if the appropriate

data has been collected and stored in a data warehouse.

Data warehousing is a new powerful technique making it possible to extract archived operational data

and overcome inconsistencies between different legacy data formats. As well as integrating data

throughout an organisation, regardless of location, format, or communication requirements, it is

possible to incorporate additional or expert information.

According to John McIntyre of SAS Institute Inc, it is the logical link between what the managers see in

their decision support i.e. ESS applications and the company's operational activities. In other words the

data warehouse provides data that is already transformed and summarized, therefore making it an

appropriate environment for more efficient DSS and ESS applications.

In the view of Inmon (1993:48), data warehousing is a subject-oriented, integrated, time-variant, and

non-volatile collection of data in support of management’s decision making process.

The first phase in data warehousing is to "insulate" an organisation’s current operational information,

i.e. to preserve the security and integrity of mission-critical On-line Transaction Processing (OLTP)

applications, while giving one access to the broadest possible base of data.

The resulting database or data warehouse may consume hundreds of gigabytes - or even terabytes - of

disk space. What is required then are efficient techniques for storing and retrieving massive amounts of

information. Increasingly, large organisations have found that only parallel processing systems offer

sufficient bandwidth.

The data warehouse thus retrieves data from a variety of heterogeneous operational databases. The data

is then transformed and delivered to the data warehouse/store based on a selected model (or mapping

definition).

The data transformation and movement processes are executed whenever an update to the warehouse

data is required so there should some form of automation to manage and execute these functions. The

information that describes the model and definition of the source data elements is called "metadata".

The metadata is the means by which the end-user finds and understands the data in the warehouse and

is an important part of the warehouse.

Data cleansing is an important aspect of creating an efficient data warehouse in that it is the removal of

certain aspects of operational data, such as low-level transaction information, which slow down the

query times.

The cleansing stage has to be as dynamic as possible to accommodate all types of queries even those,

which may require low-level information. Data should be extracted from production sources at regular

intervals and pooled centrally but the cleansing process has to remove duplication and reconcile

differences between various styles of data collection.

A data warehouse can be used in different ways for example it can be used as a central store against

which the queries are run or it can be used to like a data mart. Data marts, which are small warehouses,

can be established to provide subsets of the main store and summarized information depending on the

requirements of a specific group/department. The central store approach generally uses very simple data

structures with very little assumptions about the relationships between data whereas marts often use

multidimensional databases which can speed up query processing as they can have data structures

which are reflect the most likely questions. Another approach to data warehousing is the Parsaye's

Sandwich Paradigm (Dilly, 1995:5).

This paradigm encourages acceptance of the probability that the first iteration of a data warehousing

effort will require considerable revision. The Sandwich Paradigm advocates the following approach: (i)

Pre-mine the data to determine what formats and data are needed to support a data-mining application;

(ii) Build a prototype mini-data warehouse i.e. the meat of the sandwich, with most of the features

envisaged for the end product; (iii) Revise the strategies as necessary; and (iv) Build the final

warehouse.

2.2.1 Data Warehousing and OLTP Systems

A database which is built for OLTP is generally regarded as unsuitable for data warehousing as they

have been designed with a different set of needs in mind i.e. maximizing transaction capacity and

typically having hundreds of tables in order not to lock out users etc. Data warehouses are interested in

query processing as opposed to transaction processing.

OLTP systems cannot be repositories of facts and historical data for business analysis. They cannot

quickly answer ad hoc queries and rapid retrieval is almost impossible. The data is inconsistent and

changing, duplicate entries exist, entries can be missing and there is an absence of historical data, which

is necessary to analyze trends. Basically OLTP offers large amounts of raw data, which is not easily

understood. The data warehouse offers the potential to retrieve and analyse information quickly and

easily. The data warehouse serves a different purpose from that of OLTP systems by allowing business

analysis queries to be answered as opposed to "simple aggregations" such as `what is the current

account balance for this customer?’.

2.2.2 Problems with Data Warehousing

The problems associated with developing and managing a data warehouse as identified by

(Greenfield, 1996:55) include underestimation of resources for data loading, hidden problems with

source systems, required data not captured, increased end-user demands, data homogenization, high

demand for resources, Data Ownership, high maintenance, long-duration projects, and complexity of

integration.

2.2.3 Benefits of Data Warehousing

The primary purpose of data warehousing is to provide information to business users for strategic

decision making. According to Connolly and Begg (2005:1152), the successful implementation of a

data ware house can bring major benefits to an organisation including potential high returns on

investment, competitive advantage and increased productivity of corporate decision-makers.

2.3 Data Marts

A related concept to a data warehouse is a Data Mart. Mallach (2000:469) indicated that it is a smaller

version of a data warehouse, typically containing data related to a single functional area of an

organisation or having limited scope in some other way. A data mart holds a subset of the data in a

data warehouse normally in the form of a summary data relating to a particular department or business

function. The data mart can be a standalone or linked centrally to the corporate data warehouse. As a

data warehouse grows larger, the ability to serve the various needs of the organisation may be

compromised.

In the view of Connolly and Begg (2005:1173), the popularity of data marts stems from the fact that

corporate-wide data ware houses are proving difficult to build and use. Other reasons identified for the

increasing popularity include (i) to give users access to the data they need to analyze most often, (ii) to

provide data in a form that matches the collective view of the data by a group of users in a department

or business function,

(iii) to improve end-user response time due to the reduction in the volume of data to be accessed and

(iv) to provide appropriately structured data as dictated by the requirements of end-user access tools

such as OLAP and data mining tools, which may require their own internal data base structures. In

practice these tools often create their own data mart designed to support their specific functionality.

2.3.1 Users of Data Warehouse and Marts

Mallach (2000:470) identified the ideal data warehouse or data mart user as (i) A person whose job

involves drawing conclusions from and making decisions based on, large masses of data , (ii) A person

who does not want to (or cannot) get involved with finding and organizing the data for this purpose,

(iii) The person who does not want to (or cannot) access a database in a highly technical fashion and

(iv) Any person or group of people whose decisions have enough value to the organisation to justify the

data warehousing effort. This is because building a data warehouse is at a cost.

2.4 Data Mining

The term data mining has been stretched beyond its limits to apply to any form of data analysis. Some

of the numerous definitions of Data Mining, or Knowledge Discovery in Databases are: Data Mining,

or Knowledge Discovery in Databases (KDD) as it is also known, is the nontrivial extraction of

implicit, previously unknown, and potentially useful information from data.

This encompasses a number of different technical approaches, such as clustering, data summarization,

learning classification rules, finding dependency net works, analysing changes, and detecting anomalies

(Frawley et al, 1995:1).

Data mining is the search for relationships and global patterns that exist in large databases but are

`hidden' among the vast amount of data, such as a relationship between patient data and their medical

diagnosis. These relationships represent valuable knowledge about the database and the objects in the

database and, if the database is a faithful mirror, of the real world registered by the database.

(Holshemier and Siebes, 1994:4). Other definitions for data mining are given by (Theus, 2003:298),

(Raghu, and Gehrke, 2000), (Berry and Linoff, 2000:136), (Gϋrsakal and Acar, 1999:136), (Bicen and

Oktay Firat, 2003:136), (Marakas, 1999:356), (Frand, 2004:1), (Grossman et al, 1998:2) and (Bicen,

2002:15).

The analogy with the mining process is described as Data mining refers to "using a variety of

techniques to identify nuggets of information or decision-making knowledge in bodies of data, and

extracting these in such a way that they can be put to use in the areas such as decision support,

prediction, forecasting and estimation. The data is often voluminous, but as it stands of low value as no

direct use can be made of it; it is the hidden information in the data that is useful"(Clementine User

Guide, 2004:1).

Basically data mining is concerned with the analysis of data and the use of software techniques for

finding patterns and regularities in sets of data. The computer is responsible for finding the patterns by

identifying the underlying rules and features in the data.

The idea is that it is possible to strike gold in unexpected places as the data mining software extracts

patterns not previously discernable or so obvious that no one has noticed them before. Data mining

analysis tends to work from the data up and the best techniques are those developed with an orientation

towards large volumes of data, making use of as much of the collected data as possible to arrive at

reliable conclusions and decisions. The analysis process starts with a set of data, uses a methodology to

develop an optimal representation of the structure of the data during which time knowledge is

acquired.

Once knowledge has been acquired this can be extended to larger sets of data working on the

assumption that the larger data set has a structure similar to the sample data. Again this is analogous to

a mining operation where large amounts of low-grade materials are sifted through in order to find

something of value. While Data mining and knowledge discovery in databases (or KDD) are frequently

treated as synonyms, data mining is actually part of the knowledge discovery process.

2.4.1 Data Mining Process

An approach which could be used is the Cross Industry Standard Process for Data Mining

(CRISP-DM) data mining process model, which is not specific to any particular industry or tool

(Connolly and Begg, 2005:1239).

Zaïane (1999:4) however stated that the knowledge discovery in Databases process comprises of a

few steps leading from raw data collections to some form of new knowledge. The iterative process,

Zaïane (1999:3) indicated consists of these steps: Data cleaning, Data integration, Data Selection, Data

Transformation, Data Mining, Pattern evaluation, and Knowledge representation. He indicated that it is

common to combine some of these steps together. For instance, data cleaning and data integration can

be performed together as pre-processing phase to generate a data warehouse or data mart. Data

selection and data transformation can also be combined where the consolidation of the data is the result

of the selection, or, as for the case of data warehouses, the selection is done on transformed data.

2.4.2 Data Mining Techniques and Models

Different levels of analysis available include artificial neural networks, Genetic algorithms, Decision

trees, nearest neighbour method, Rule induction, and Data visualization. IBM has identified two types

of model or modes of operation, which may be used to unearth information of interest to the user.

These are Verification Model and Discovery Model.

2.4.3 Data Mining Functions

Data mining methods may be classified by the function they perform or according to the class of

application they can be used in. Karuna (1997:2) identified three algorithms, namely, Clustering or

Classification, Association rule, and Sequential analysis while in addition to these three; Edelstein

(1996:469) added forecasting. According to Karuna (1997:2), in clustering/classification, a set of data

is analyzed and a set of grouping rules are generated which can be used to classify future data.

2.4.4 Data Mining Problems/Issues

Data mining systems rely on databases to supply the raw data for input and this raises problems in that

databases tend be dynamic, incomplete, noisy, and large. Other problems arise as a result of the

adequacy and relevance of the information stored.

In the view of Zaïane (1999:13), while data mining is still in its infancy, it is becoming a trend and

ubiquitous. He believes that before data mining develops into a conventional, mature and trusted

discipline, many still pending issues have to be addressed. Some of the issues he identified are Limited

Information, Noise and missing values, uncertainty, size, updates, and irrelevant fields, confidentiality,

cost of software and Technological infrastructure requirement such as size of the database and query

complexity.

2.5 Current Data Mining Research Efforts

The comments by Makulowich (1999:1), Erdelstein (1999:2), ANGOSS Systems as well as by the

personalities mentioned in their articles indicate that most researchers are still not clear as to what data

mining is.

Writing for the Washington Technology Online Journal, Makulowich (1999:1) indicated, “In the best of

all possible digital worlds, information is king. The universe, nothing but an expanding data

warehouse, is completely catalogued by omniscient reference librarians and continuously updated by IT

professionals who require no sleep”.

The ultimate goal is mass customization. Every meaningful relationship in the data is uncovered with

sophisticated analytical tools. Every end user interaction is captured. Every transaction between buyer

and seller is completely transparent. This grand vision of data mining suddenly narrows. The data is

dirty, that is, it can't be used without extensive "cleaning." The end user lacks the numeracy to work

intelligently with the tools, the IT department wars with statisticians over access to the data, and the

field lacks international standards.

According to Makulowich (1999:1), it is all part and parcel of any growing computer area, whether IP

telephony, wireless telecommunication or data mining. In the case of data mining, according to experts,

“it is a field that holds great promise”.

Erdelstein (1999:1) believes “Data mining is the application of these and other [artificial intelligence]

and statistical techniques to common business problems in a fashion that makes these techniques

available to the skilled knowledge worker, as well as the trained statistics professional.

What one really needs to know to do data mining effectively is the line of business in which they are in

and their data. More important is data and not mining. The standing joke is that “Data mining is what

IS people call statistics".

The difficulty in defining the field is echoed by another expert, Christopher Westphal, president and

founder of Bethesda, Md.-based Visual Analytics and co-author of "Data Mining Solution: Methods

and Tools for Solving Real-World Problems. “He admits early on that even though data mining has

been part of the IT vernacular for several years, people still view the process as magic. "This is partly

because no established road maps or procedures have been formally identified for guiding analysts to

profitable outcomes” (Erdelstein ,1999:2).

For Erick Brethenoux, vice president of Lazard Freres & Co. LLC, New York, and a data warehouse

and data mining analyst, the field is moving in at least two major directions: one aligned with

knowledge management and the other with greater data visualization techniques.

He divided the data mining aspect of knowledge management into categorization and filtering. In the

first case, it amounts to finding interesting trends and business intelligence in business data. For

example, in launching a new product, the organisation might want to determine if there are existing

patents on the product or process and, if there are, whether or not to buy them. Filtering can be combing

through information for use in knowledge management (Erdelstein, 1999:2).

Erdelstein (1999:1) again quoted Brethenoux to have said that , "With the trend of increasing use of the

World Wide Web, data mining opens the whole area of such things as pattern analysis of traffic,

allowing you to see who has access to what information”.

Makulowich (1999:2) concluded that the critical issues for data mining on which all agree is to

correctly define the business problem.

Another area that Data mining is applied is agriculture where a lot of activities have taken place.

Christensen and Di Cook (1998:1) describe simple numerical methods to establish the relationship

between 10 soil characteristics variables and corn yield. Nunyen et al (2001:1) demonstrates use of

data mining techniques on images to identify trash in ginned corn while Abdullah et al (2003:1)

studied about learning dynamics of pesticide abuse through data mining. The Australia’s

Commonwealth Scientific and Industrial Research Organisation (CSIRO) is involved in both research

and consulting in Data Mining for all sectors, especially Health.

2.5.1 Data Warehousing and Data Mining in Healthcare

Data warehousing and data mining applications have found their way into health research and a lot of

activities and documented reports can be seen in this area: - Bramson (2000:1-2), Hagland (2004:1-2),

Baylis (1999:1-10), Berndt et al (2000:2), and Sokol et al. (2001:1).

2.5.2 Benefits of Data Warehousing and Data Mining in Healthcare

Baylis (1999:1) indicated, “Health care generates mountains of administrative data about patients,

hospitals, bed costs, claims, etc. Clinical trials, electronic patient records and computer supported

disease management will increasingly produce mountains of clinical data. This data is a strategic

resource for health care institutions.

With the advent of data warehousing techniques, Baylis (1999:2) noted that specific areas of interest

may be investigated more thoroughly. Products such as INFoCOM from SMS, which is a clinically

based data warehouse product designed for use throughout a hospital, bring the potential for specialized

information production to the clinicians and managers desktop through the use of clinical workstations

and Executive Information Systems (EIS).

He continued by saying that “Data mining products are designed to take this one stage further. It brings

the facility to discover patterns and correlation hidden within the data repository and assists

professionals to uncover these patterns and put them to work. Therefore decisions rest with healthcare

professionals, not the Information systems experts.”

Schonlau et al (2003:294) used the techniques of data mining to explore the characteristics of diabetes

patients from the U.S medical expenditure panel survey (MEPS) by constructing patient profiles. The

profile consists of the variables total expenditure, number of ambulatory (office based visits, number of

visits to the emergency room, number of hospital (inpatient)) visits, number of home health visits and

physical and mental health status. They found out that hospital and home health visits are negatively

correlated; self –reported mental and physical health correlate well.

Eric Bremer, PhD., a Director of the Pediatric Brain Tumor Research Program at the Children’s

Memorial Hospital in Chicago is reported by Haglard (2004:2) as one doctor who knows well the

tremendous potential of data mining, which is using software programs for pattern recognition and

predictive analysis. He is steering a project that demonstrates the future of data mining in medical

research.

Bramson (2000:1), notes that “historically, healthcare providers have received a considerable amount of

data from third-party payors, but we have not always known what use to make of the information. Data

mining is an attempt to extract from the available data valuable nuggets of information that can create

change in health delivery. It also represents an opportunity to shape the way in which medicine will be

practiced in the 21st century. As physicians, we should be excited about being able to participate”.

Kantor (2001:2) noted that “the need to turn massive amounts of data into meaningful information is

especially profound within the healthcare industry, where the well-being, recovery and even lives of

patients may depend upon the result.

Data mining helps healthcare organisations recognize certain patterns within their patient data that are

often difficult to detect with traditional statistical methods. It can help a hospital or clinic answer such

daily business questions as what are our top ten paid procedures?”, can we identify the diagnosis-

related groups (DRGs) that receive the lowest amounts of reimbursement from Medicare or Medicaid? ,

how can we determine if we are receiving all the timely and accurate reports we need to make our

analyses, and how can we use our legacy system more effectively to identify patterns that will

maximize the output and efficiency of our operations.” Kantor (2001:2) continued by saying that,

having centralized and consolidated its data warehouse, implementing a Microsoft database,

middleware, and applications on Intel® Xeon™ and Itanium® servers, Clalit Health Services, today

can answer these questions and does so on a daily basis.

Baylis (1999:2) again indicated that findings from any data mining exercise could bring patterns to

surface (that might otherwise remain undiscovered), which may suggest alternative ways for treating

patients making better use of resources.

He concluded by noting that, the use of data mining has focused on evidence-based patterns from

previous patients treatment. In all likelihood, the absence of automated discovery of patterns would

leave many questions unasked. These questions, if asked, would benefit not only the resource

utilization for patient treatment, but also the health of the patient. Data mining according to him helps

professionals discover there patterns and put them to work. As models are based directly on history,

they represent the ultimate in evidence-based care. But technology is no panacea; professional, ethical

and practical issues must be addressed.

Oracle’s Berger has been quoted (Hagland, 2004:1) as saying that the potential of data mining is great

for health plans and providers to use data mining tools to improve patient care and plan member

service. Public health, medical research, patient care, health plan member segmentation, population

health management and individual customer services can all benefit from data mining.

Biritwum et al (2000:198) conducted a study to determine the pattern of disease or conditions leading

to hospitalization at Korle Bu Teaching Hospital, Ghana in 1996. They concluded among others that the

hospital managed a wide range of diseases over 1560 different (ICD-10) disease/conditions/injuries.

Pregnancy and child birth conditions formed 43%, infectious and parasitic diseases formed 8%, injuries

and poisoning formed 7% and perinatal period conditions formed 6 %. Malaria was still the major

cause of hospitalization and there was a high prevalence of HIV/AIDS, Tuberculosis and non-

communicable disease such as injuries, circulatory disease and malignancies.

A dbase IV programme was used for data capture and SPSS package was used in data analysis. From

their findings, they concluded that there is the need for institutionalization of the management system

and further extension of this approach to other facilities in Ghana. Bamgboye, and Familusi

(1990:128) carried out studies to examine pattern of pediatric deaths in the children’s emergency room

(CHER) of the University College Hospital, Ibadan between 1978 and 1986, and relates this to the

admission diagnosis. Data was collected from admission registers and case fatality and proportional

death rates calculated.

In another study by Bamgboye, and Familusi (1990:50) they looked at the pattern of diseases of

emergency admissions at the children’s emergency room of the University College Hospital.

The information extracted from the admission registers includes age, sex, residence, principal

diagnosis and outcome of admission. The extracted data were later processed, with an IBM PC micro-

computer using statistical software SURVEY MATE, for data entry, frequencies, and cross-

tabulations. Their studies may be classified as data mining based on the definition given earlier of data

mining.

According to Bresnahan (1997:3), one way data mining is helping health care providers cut costs and

improves care is by showing which treatments statistically have been most effective. Other areas she

identified include, helps health care management organisations to evaluate their doctors and facilities,

hospitals benefit from knowing how they compare with their peers, lets care givers identify people

statistically at risk for certain ailments so that they can be treated before the condition escalates into

something expensive and potentially fatal and fraud detection.

2.5.3 Problems of Data Warehousing and Data Mining in Healthcare

Hagland (2004:1) quoted Fran Turisco, a consultant in the Boston Office of first Consulting Group to

have said that, “A gradual but fundamental shift is taking place. Organisations used to mine claims

information and financial data, now they need to mine clinical information”.

Hospitals have lab and medication centres but sometimes do not have a lot of electronic information

with clinical documentation. “So they are putting in some of the advanced clinical information systems

needed for data mining”.

He again quoted Charles Berger, a senior director of product management for life sciences and data

mining at Boston-based Oracle Corporation as saying that “Health plans that already use data mining

have reasons to employ better data mining.

Deeper analysis of member populations and better assessment of at risk populations can reduce risk and

better serve their member’s needs, particularly as consumer-driven health plans become a force, experts

say, Soaring medical costs are driving such improvements”. Berger described a typical first-generation

use of data mining in which a payer client used Oracle’s data mining software to uncover billing

anomalies. “Certain patients repeatedly would go from doctor to doctor for procedures and certain

Doctors were routinely running up the biggest bills with those procedures.

According to Berger the next frontier for Oracle Corporation is Medical Research and thus it is

currently working on complex data mining in genomics, including projects at the national Institutes of

Health, Bethesda, Md. With data mining, he notes “we can handle hundreds, even thousands, of

variables.”

Bramson (2000:2) lamented in his paper about the fact that most of the data they obtain from third-

party payors are not adjusted beyond the fact that there is a commercial population and a Medicare

population. Since they knew that a difference exists between the two groups, they have adjusted the

data and classified patient’s age within 1-year, 5-year, or 10-year spans, as well as by gender. Such data

he noted comes largely from the insurance industry and are based on insurance claims.

Data-collection systems on which this information is based were not inept, but systems were designed

for bill payment, not medical management. He believes data mining is an iterative process that takes a

great deal of time with the intent to change physician behaviour to what we want it to become, through

the use of data.

Biritwum et al (2000:204) identified the lack of comprehensive map of data storage for Hospitals in the

past which meant that most reports have been made on incomplete data.

2.5.4 Health Data Mining Software

The potential for patient care and public health has encouraged a few pioneering organisations to

develop their own software programs. Medical researchers at Children’s Hospital, Boston developed an

analytical system called AEGIS (Automated Epidemiologic Geotemporal Integrated Surveillance). It

gathers and analyses data from various emergency departments, looking for patterns such as potential

infectious disease outbreaks, unexplained rapid increases in paediatric asthma and even lead poisoning

in children. (Hagland, 2004:2).

Health Data Desk is also another software tools to help government regulatory agencies, insurance

companies, HMOs, and healthcare providers find and explore patterns and exceptions in hospital

electronic billing claims record. Using both public and private databases, the top-level displays identify

hospitals with anomalous medical records, coding/billing patterns, inefficient and/or clinically

inappropriate treatment patterns by clinical product/service lines. According to (Hagland, 2004:1), Data

Description Inc, reported that analysis of Medicare public data using Health Data Desk is leading to the

recovery of hundreds of millions of dollars by HCFA and the Department of Justice from selected

hospitals throughout the USA.

Some Software developed for use in mining any data include SPSS Clementine, SAS Enterprise Miner,

S-Plus, MATLAB, Business Objects Business Miner, Cognos Scenario, Oracle Darwin, SGI MineSet,

WEKA and IBM Intelligent Miner.

2.6 Conclusion

The available literature on data mining, data warehousing and its related concepts and their application

in health care have been reviewed. It was realised that there is difficulty among researchers in defining

the term data mining. Although data mining is still in its infancy, it is becoming a trend and ubiquitous.

The problems of data mining as well as current data mining researches, especially in the health sector

were identified. The next chapter will take a critical look at the system under study.

CHAPTER THREE

THE EXISTING PATIENT’S RECORDS MANAGEMENT SYSTEM

3.0 Introduction

This chapter provides an overview and spells out the aims and objectives of the Medical Records Unit.

It also describes the existing patient records management system using Data Flow Diagram (DFD) and

SWOT analysis.

3.1 Description of the Existing System

To be able to know the requirements of an organisation, it is essential to describe the existing system.

The existing system of keeping medical records is outlined in this section.

3.1.1 The Process

The purpose of a patient’s visit to the hospital may be for review, or for treatment. When a patient

comes to the hospital for treatment for the first time, the medical records unit assigns a unique

Identification Number and a patient ID card labelled with the patient’s name and patient ID given. In

the meantime, general information such as the name of patient, address, age, gender, and status are

recorded on the card, which is given to the patient. When the patient sees the Physician, he or she

shows this card to the Physician. The patient may next go for a laboratory test or an x-ray and then go

straight to pharmacy with the prescription.

In the meantime, the Physician writes detailed information about the patient’s visit on his or her

treatment card and the card sent to the records unit at the end of the day.

The prescription is the Physician’s written instruction stating what drugs a patient should be issued

with. The pharmacist dispenses the drug and the patient makes payment at the accounts office. The

Pharmacist packages the drugs and gives them to the patients once they show receipt of payment from

the Accounts Office.

The patient may be discharged, admitted or referred after the visit. The medical record cards of all those

who are either discharged or referred are sent to the records unit for storage. For those who are

admitted, the medical record cards are kept with the patient in the ward. It is after the outcome of

admission that their medical records cards are returned to the records unit. The outcomes of admission

include discharged, referred, absconded, or died.

Return patients come to the hospital with their patient ID cards and their medical record cards are sent

to the Physician. The procedure is the same procedure as for a new patient. No new information is

recorded for return patients except data related to that particular visit.

For patients reporting at the emergency unit, they are issued with a new treatment card. Next, their

cards are sent to the medical records unit for ID numbers to be assigned and data recorded. Review

patients are treated as return patients.

Figure 3.1 Data Flow Diagram of Patient Record Management System

3.1.2 Databases and Records Storage

The Medical Records Unit of the University Hospital has no computerised medical records database.

Records of all consultancies, diagnoses and outcome of visit at the time of discharge are kept on

physical cards, which have unique identification numbers.

Each patient’s cards is kept at the records unit of the Hospital. Although a great deal of effort is

invested in maintaining this system, it has several draw backs. First, the patient‘s information is not

readily accessible to medical staff and researchers. The folders and all the information in them are

vulnerable to considerable loss or damage as a result of exposure to dust and from shortage of space.

The integrity of files and folders cannot be assured over time because there is no simple and effective

check of folders borrowed or taken away by patients.

3.1.3 Hardware

There are no computers and their accessories in the Records Management Unit.

3.1.4 Software

No computer software is being used currently at the Records Management Unit.

3.1.5 Human ware

The categories of staff at the Records unit are: One Research Grade Staff, Two Administrative

Assistants, Three clerks and a cleaner.

3.1.6 Strengths, Weaknesses , Opportunities and Threats ( SWOT) Analysis

There is the need to review the internal environment of the Medical records unit in order to access its

strengths and weaknesses, and also review the external environment to find out what opportunities

exist, and to identify the existing threats that are likely to impinge upon the operations of the Unit.

3.1.6.1 Strengths

§ High caliber of staff in the unit

§ Large volumes of patient data are available at the Unit.

3.1.6.2 Weaknesses

§ The patient‘s information is not readily accessible to medical

staff and researchers.

§ Patient’s age instead of Date of Birth is captured

§ Each University of Ghana member of staff and his/her

dependents use the same Medical Records Number.

§ The files are vulnerable to considerable loss or damage as a

result of exposure to dust and from shortage of space.

§ The integrity of files cannot be assured over time because there

is no simple and effective check of files borrowed or taken away by Doctors,

Researchers and some patients.

§ Retrieving patient’s records at times is very cumbersome due tooccasional mis-filing and the increasing number of patients.

§ Lack of appreciation of the data collected and the absence of data

collection tools such as computers and hand gloves.

§ Due to the current medical record management system, the

majority of out-patients do not have full medical records. With this situation, physician’s

time is wasted by collecting all information again.

§ It becomes difficult to keep track of the patients and also almost

impossible for the physician to review this kind of medical record. This threatens the

ability of clinicians to carry out high quality clinical research.

§ Inadequate staff and lack of motivation of the few staff at post.

§ Lack of computers (Every thing in the department is being donemanually)

3.1.6.3 Opportunities

§ To use Data Mining techniques to process the large volume

of patient

data at the unit.

§ To computerise the activities of the unit to generate timely

and reliable

information which would assist in the process of arriving at decisions by management at

the Hospital. This to a larger extent would have a positive impact on the Health

Insurance Scheme, which has been introduced. This is because information is valuable

in health care delivery.

3.1.6.4 Threats

§ Competition from adjoining Hospitals and clinics that have

been

registered under the health insurance scheme and are operating their health

facilities on purely business lines would require that accurate, timely,

relevant and detailed information is provided to the management of the

University Hospital by the Medical Record Unit.

3.2 Conclusion

It has been deduced from the analysis that the Medical Records Unit has problems, which requires a

new system to solve them. The next chapter identifies the information requirements of the decision

makers in the hospital.

CHAPTER FOUR

INFORMATION REQUIREMENTS OF DECISION MAKERS

4.0 Introduction

This section looks at the information required by the various decision makers in the hospital. The

records unit is supposed to provide the required information as reports to the Administrator and/

Director and the various decision makers i.e. Heads of Nursing, Pharmacy, and Finance Departments.

These reports are Monthly Statistical Report, Financial Statistical Report, Patient Status Report, and

Pharmaceutical report

4.1 Monthly Statistical Report

They are supposed to submit this report to the administrator at the end of each month. The main

components of this report include Top Ten Causes of Out-patient department attendance, Main causes

of hospital attendance, State of the wards - Number Admitted, Status of patients, Outcomes of

Admission, Drugs consumed per month, and Incomes for the month by general services.

The purpose of this report is to enable management identify pattern of diagnosis, determine patterns in

drug consumption, determine total cost per diagnosis, determine total income for the month, plan for

the wards and plan for staff and day-to-day administrative duties.

4.2 Financial Statistical Report

The report for the finance department is titled the Financial Statistical Report. The main components of

this report include the number of people who visited the Hospital and Income for the month by general

services. At the moment, this report is not being generated for the Head of Finance.

This report would assist management to reconcile the data collected by accounts clerks, be able to bill

clients especially institutions at the end of the month on time, to prepare annual and monthly budgets

for the hospital, to determine monthly revenue and expenditure, and to determine charges per patient in

the various units.

4.3 Patient Status Report

A report on patient’s status is supposed to be generated for the Chief Nursing officer. The Patient

Status Report must contain the following information on number of patient by breakdown of diagnosis

by month, by age, gender, address and status. Currently, the Chief Nursing Officer is not receiving

these reports.

The purpose of patient status report is to enable the chief nursing officer allocate Nurses to wards and

Know the localities where communicable diseases are endemic and inform the public health nurses to

carry out their outreach programme in those localities.

4.4 Pharmaceutical report

The Pharmaceutical report must contain information on hospital attendance of patient. It should be

according to the number of patient by status, age, gender, drugs prescribed, and diagnosis as well as

report on Top-ten diseases per month. The Head of Pharmacy currently, does not receive this report

from the records unit.

This report would enable the chief pharmacist to determine the types of drugs to order, to ascertain the

re-order level of drugs for specific diagnosis, to determine the amount of drugs dispensed to old-age

patients as well as children under age 5 years and to plan for suppliers of drugs.

The only report which is currently generated by the unit and submitted is the monthly statistical report

but not all the required information is included. The components of the incomplete monthly statistical

report include, top ten causes of out-patient department attendance, main causes of admission to the

hospital, and State of the wards - Number Admitted, Status of patients, Outcomes of Admissions.

4.5 Conclusion

It has been deduced from the information requirement analysis that the medical records unit requires a

system that would assist them to generate the required reports for decision makers. The next chapter

takes us through the necessary steps to follow when developing a system to solve these problems.

CHAPTER FIVE

DESIGN AND DEVELOPMENT OF THE PATIENTS’ DATA MINING SYSTEM

5.0 Introduction

This chapter looks at the steps that must be taken to develop a new system to solve the problems in the

existing system as enumerated in Chapters Three and Four.

5.1 Database Design for Data Mart

The processes identified in the existing system include records processing, issuing prescription,

receiving laboratory request, receiving prescription and receiving payments. The data requirements

associated with these processes are shown in the normalised Entity Relationship (ER) diagram of

Figure 5.1. The ER diagram forms part of the design documentation, which describes the Online

Transaction Processing (OLTP) system required to support the business processes of the hospital since

there is no computerised database available currently at the records unit. The entities identified in the

existing system which are being used to develop the database for the data mart include: “Physician”,

“Outcome”,” Diagnosis”, “Treatment”, “Drug”, “Visit”, “Admission”, “Patient” and “Purpose

”.

Figure 5-1: Entity Relationship(ER) Diagram

5.1.1 Data Dictionary of the Database of the Data Mart

To understand the entities in the database, it is necessary to describe the data in it so that the user is

aware of the content and make-up of the data fields. The data dictionary provides description of the

entity tables in the data base. The data dictionary for the relational database is shown below.

Physician Entity

Description: Details of Doctors working at the Hospital

Primary key: Physician_ID

Attributes Description Data type(length)Physician ID

Physician Name

Physician Type

A unique number for eachphysician

The name of the physician

The status of physician,either a Consultant orResident

Text (8)

Text(20)

Text (12)

Patient Entity

Description: Contains details of patients who visited the hospital

Primary Key: Patient_ID

Foreign Key: Treatment_ID references Treatment (Treatment_ID)

Foreign Key: Physician_ID references Physician (Physician_ID)

Foreign Key: Date_of_Visit references Physician_Diagnosis (Date_of_Visit)

Attributes Description Data type ( length)

Medical RecordNumberPatient_ID

Patient Name

Address

Date of Birth

Gender

Status

Treatment_ID

Physician_ID

Date_of_Visit

A number used to identify each patient

A unique alphanumeric value for eachpatient assigned by the researcher to ensureconfidentiality.

The name of each of the patients that visitedthe Hospital

Department which patient belongs to /or where they reside

Date of birth of a patient

Gender of patient

Status of patient (Senior, Junior and staffdependents, or private Patient)

A unique number for each treatment

A unique number for each physician

Date patient visited the hospital

Text (8)

Text(5)

Text(25)

Memo(35)

Date (12)

Text(2)

Text (15)

Text (8)

Text (8)

Date (12)

Drug Entity

Description: Describes the drugs prescribed for a patient

Primary Key: Drug_ID

Foreign Key: Patient_ID references Patient (Patient_ID)




Attributes Description Data type(length)

Drug_ID

Drug Name

Patient_ID

Treatment_ID

Date_of_Visit

Physician_ID

A unique number for each drug

The name of the drugs prescribed

A unique alphanumeric value for eachpatient assigned by the researcher toensure confidentiality.




Text (8)

Text (50)

Text(5)

Text(8)

Date(12)

Text(8)

Treatment_Notes Entity

Description: Describes the treatment Notes given to a patient

Primary Key: Treatment_ID




Treatment_ID

Date_of_Visit

Physician_ID

Treatment Notes

A unique number for eachtreatment


A unique number for eachphysician

Notes about the treatment

Text (8)

Date(12)

Text(5)

Text(150)

Admission Entity

Description: Records details of the patient when admitted

Primary Key: Start Date





Patient_ID

Ward Name

End Date

Maintenance Cost

Start Date

Date_of_Visit

Physician_ID

A unique alphanumeric value foreach patient assigned by theresearcher to ensure confidentiality.

Ward to which patient was admitted

Date the outcome occurred

Total amount spent for eachadmission

Date patient was admitted



Text (5)

Text (15)

Date (12)

Currency (10)

Date (12)

Date(12)

Text(5)

Diagnosis Entity

Description: Describes the diagnosis for a patient

Primary Key: Diagnosis_ID






Diagnosis_ID

Diagnosis Name

Patient_ID

Date_of_Visit

Physician_ID

Treatment_ID

A unique number for each diagnosis

The diagnosis that caused the physician toprescribe the treatment.

A unique alphanumeric value for eachpatient




Text (8)

Text (50)

Text(5)

Date(12)

Text(8)

Text(8)

Visit Entity

Description: Describes the visit made by each patient


Foreign Key: Purpose_ID references Purpose (Purpose_ID)



Foreign Key: Outcome_ID references Outcome (Outcome_ID)


Patient_ID

Date_of_Visit

Treatment Cost

Outcome_ID

Purpose_ID

Physician_ID

A unique alphanumeric value for eachpatient assigned by the researcher toensure confidentiality.

Date on which patient made the visit

Total cost of each treatment

A unique number for each outcome ofvisit

A unique number for each purpose ofvisit


Text (5)

Date (12)

Currency(10)

Text (8)

Text (8)

Text(8)

Outcome Entity

Description: Describes the outcome of each visit by a patient

Primary Key: Outcome_ID





Foreign Key: Start Date references Admission (Start Date)


Outcome_ID

Patient_ID

Date_of_Visit

Physician_ID

Treatment_ID

Start Date

Outcome Name

Outcome Type

A unique number for each outcome

A unique alphanumeric value foreach patient




Date patient was admitted

The outcome of the treatment-admitted, discharged, referred ordied.

Specifies the type of outcome,whether it is the result of a visit oradmission

Text (8)

Text(5)

Date(12)

Text(8)

Text(8)

Date (12)

Text (10)

Text (15)

Purpose Entity

Description: Describes the purpose of visit by patient to the hospital

Primary Key: Purpose_ID





Purpose_ID

Purpose_of_Visit

Physician_ID

Patient_ID

Date_of_Visit

A unique number for each purpose ofvisit

Purpose of each visit by patient


A unique alphanumeric value foreach patient assigned by theresearcher to ensure confidentiality.


Text (8)

Text(15)

Text (8)

Text (5)

Date(12)

5.2 The Star Schema Model

It is necessary to identify the grains of the fact table which means deciding exactly what a fact table

record represents. For this study, the fact table has been identified as “Summary” and the grain is

Income. Dimensions set the context for asking questions about the facts in the fact table. Dimension

tables in the study are “DM_Physician”, “DM_Diagnosis”, “DM_Treatment”, “DM_Outcome”,

“DM_Patient”, DM_Purpose, DM_Drug, and “DM_Time”. Each of these dimensions and fact

tables has attributes. An attribute is a property of an entity that is of interest to the organisation. For

example, attributes of the entity --- DM_Diagnosis --- are Diagnosis_ID and Diagnosis Name.

The star schema model for the study is shown as Fig. 5-2. A star schema model gives a detailed logical

representation of the data for an organisation or for a business area. The Star schema model is

expressed in terms of dimensions tables and a fact table, the relationship among them and their

attributes. This schema has been optimized for reporting and analysis.

Figure 5-2: Logical Data Model of Data Mart (Star Schema)

5.2.1 Data Dictionary for the Star Schema of Data Mart

Often a large number of data and information may be involved in a data mart. It is possible to have

hundred or more data fields or attributes in any given data mart. No one is capable of keeping track or

remembering the names and definitions of all these fields.

Additionally, there can be numerous dimensions and a fact table, each with its own attributes. To use

all these data prudentially, it is necessary to describe the data so that the user is aware of the content

and make-up of the data fields. The data dictionary provides description of the fact and every

dimension tables in the star schema of the data mart as well as their attributes and any special features

associated with them. The data dictionary for the Patient star schema is shown below.

Physician Dimension

Description: Details of Doctors working at the Hospital

Primary Key: Physician_ID

Attributes Description Data type(length)Physician_ID

Physician Name

Physician Type


The name of the physician

The status of physician, either aConsultant or Resident

Text (8)

Text(20)

Text (12)

Patient Dimension

Description: Contains details of patients who visited the hospital

Primary Key: Patient_ID

Attributes Description Data type ( length)

Medical Record Number

Patient_ID

Address

Date of Birth

Gender

Status

A number used to identify eachpatient

A unique alphanumeric valuefor each patient assigned by theresearcher to ensureconfidentiality.

Department which patientbelongs to /or where theyreside

Date of birth of a patient

Gender of patient

Status of patient (Senior, Juniorand staff dependents, or privatePatient)

Text (8)

Text(5)

Memo(35)

Date (12)

Text(2)

Text (15)

Diagnosis Dimension

Description: Describes the diagnosis for a patient

Primary Key: Diagnosis_ID

Attributes Description Data type(length)Diagnosis_ID

Diagnosis Name


The diagnosis that caused the physicianto prescribe the treatment.

Text (8)

Text (50)

Treatment Dimension

Description: Describes the treatment notes given to a patient

Primary Key: Treatment_ID

Attributes Description Data type(length)Treatment_ID

Treatment Notes

A unique number for eachtreatment

Notes about the treatment

Text (8)

Text(150)

Drug Dimension

Description: Describes of the drugs given to a patient

Primary Key: Drug_ID

Attributes Description Data type(length)Drug_ID

Drug Name

A unique number for eachdrug

The name of the drugsprescribed

Text (8)

Text (50)

Purpose Dimension

Description: Describes the purpose of visit by patient to the hospital

Primary Key: Purpose_ID


Purpose_ID

Purpose_of_Visit

A unique number for eachpurpose of visit

Purpose of each visit bypatient

Text (8)

Text(15)

Outcome Dimension

Description: Describes the outcome of each visit by a patient

Primary Key: Outcome_ID

Attributes Description Data type(length)Outcome_ID

Outcome Name

Outcome Type


The outcome of the treatment-admitted, discharged, referred ordied.

Specifies the type of outcome,whether it is the result of a visit oradmission

Text (8)

Text (15)

Text (10)

Time Dimension

Description: Describes the time each visit was made by patient

Primary Key: Time_ID


Time_ID

DayWeek

Month

Quarter

Year

A unique identifier for the timeof visit

The day patient made the visit

The week patient made the visit

The month patient made thevisit

The quarter patient made thevisit

The year patient made the visit

Number (8)

Number (2)Number (2)

Number (2)

Number (2)

Number (4)

Summary Fact Table

Description: Describes the fact in the data mart

Foreign Key: Physician_ID references DM_Physician (Physician_ID)

Foreign Key: Patient_ID references DM_Patient (Patient_ID)

Foreign Key: Diagnosis_ID references DM_Diagnosis (Diagnosis_ID)

Foreign Key: Treatment_ID references DM_Treatment (Treatment_ID)

Foreign Key: Outcome_ID references DM_Outcome (Outcome_ID)

Foreign Key: Purpose_ID references DM_Purpose (Purpose_ID)

Foreign Key: Time_ID references DM_Time (Time_ID)

Foreign Key: Drug_ID references DM_Drug (Drug_ID)


Physician_ID

Patient_ID

Diagnosis_ID

Treatment_ID

Outcome_ID

Purpose_ID

Time_ID

Drug_ID

Treatment_Cost

Maintenance_Cost


A unique identifier for each patient


A unique number for each treatmentnotes


A unique number for the purpose of each visit made by patient

A unique number for the time the patient made the visit


Cost of treating a patient per visit

Cost of admission incurred by eachpatient

Text (8)

Text (5)

Text (8)

Text (8)

Text (8)

Text (8)

Text (8)

Text (8)

Currency (15)

Currency (15)

5.3 Development Tool

Microsoft Access 2000 was used to develop the new system. Microsoft Access is a database

management system (DBMS) that provides a means of storing and managing data and information. MS

Access can also support the size of the data mart and it is also cheaper when compared to other

software products such as Oracle, Sybase and Microsoft SQL.

5.4 Input Design

Once the data fields have been adequately described, data entry can commence. Two different formats

for data entry are generally available, namely Default format and customized format.

Default data entry screens are adequate if the person inputting data is the same person who generated

the data description. The customized format is used for the patient data mining system application. This

was to achieve the following:

§ Offer a more attractive appearance

§ Assist the data entry persons with hard to remember details

§ Use more descriptive phrase for fields

§ Provide instructions on what to enter for several of the fields

A means of catching inadvertent errors has been devised. Edit checks have been built into the screens

as a way of alerting the data entry person that an error has occurred.

The main switch board shown in Figure 5.3 lists the functions the data mining system has and is the

main guide of the data mart navigation. When the data mining system application is opened, the main

switchboard is shown on the screen. From it, the user can go to the” data entry” form to enter data about

each Patient, Treatment, Diagnosis, Visit, Admission, Outcome and Drug. The data mart can be

searched to get the needed information by selecting “queries” or “reports” which summarise medical

information by patients.

Figure 5-3: The Main switchboard form

The “Main switchboard” form opens first when the patients’ data mining application is launched. It

enables users to locate quickly the forms they will need for data entry or the reports they need to view

or print. It can also be used to control the objects in a database you will allow users to access. Some of

the menus on it include: Patient Data Entry, Extract and Load Data Mart, Queries, Reports and Exit.

Figure 5-4: The Patient Information form

This form is used by the data entry clerk to enter patient data into the data mart. Patients data captured

include Medical record number, Patient Identification Number (Patient_ID), Address, Date of Birth,

Gender, Treatment Identification Number (Treatment_ID), Physician Identification Number

(Physician_ID), Date of Visit, and Status. The Patients table would be updated. It also provides menus

for capturing data on Diagnosis, Treatment, Outcome, Visit, Admission, Purpose of Visit, and Drug for

each patient.

Figure 5-5: The Treatment Notes Form

This form serves as an interface for entering patient’s Treatment notes data into the data mart by Data

Entry clerk. The data goes to update the Treatment table. The treatment data captured include

Treatment Identification number (Treatment_ID), Treatment notes given by the Physician, Physician

Identification Number (Physician_ID), and Date of Visit.

Figure 5-6: The Diagnosis Form

This form serves as a user interface for entering patient’s diagnosis data into the data mart by the Data

Entry clerk. Diagnosis data captured for each patient include Diagnosis Identification number

(Diagnosis_ID), Physician Identification Number (Physician_ID), Date of Visit, Treatment

Identification Number (Treatment_ID), Diagnosis Name and Patient Identification Number

(Patient_ID). The diagnosis table is updated.

Figure 5-7: The Visit Form

The visit form serves as an interface for capturing visit and its related data into the data mart by the

Data Entry clerk. Visit data captured to update the visit table include Date of visit, Patient

Identification Number (Patient_ID), Treatment Cost, Outcome Identification Number (Outcome_ID),

Purpose Identification Number (Purpose_ID), and Physician Identification Number (Physician_ID).

Figure 5-8: The Outcome Form

This form serves as an interface for capturing the outcome data of each patient treatment into the data

mart by the Data Entry clerk. The outcome data captured to update the outcome table include Outcome

Identification Number (Outcome_ID), Physician Identification Number (Physician_ID), Date of Visit,

Treatment Identification Number (Treatment_ID), Patient Identification Number (Patient_ID), Start

Date, Outcome name, and Outcome type. Outcome type captures the type of outcome, i.e. either a

Visit Outcome or Admission Outcome.

Figure 5-9: The Drug Form

This form serves as an interface for capturing drug data into the data mart by Data Entry clerk The drug

data captured to update the Drug table include Drug Identification Number (Drug_ID), Patient

Identification Number (Patient_ID), and Drug Name(s). Physician Identification Number

(Physician_ID), Date of Visit, and Treatment Identification Number (Treatment_ID) are also captured.

Figure 5-10: The Admission form

This form serves as an interface for capturing the admission data of each patient admitted into the data

mart by Data Entry clerk. The data captured to update the Admissions table are Patient Identification

Number (Patient_ID), Date of Admission (Start Date), Date of Outcome of Admission (End Date),

Maintenance Cost, Ward Name and Outcome Identification Number (Outcome_ID). Treatment

Identification Number (Treatment_ID), Physician Identification Number (Physician_ID), and Date of

Visit were also captured.

Figure 5-11: Purpose of Visit

This form captures data to update the purpose of visit table. Some of the data it captures include

Physician Identification Number (Physician_ID), Date of Visit, Purpose Identification Number

(Purpose_ID), Patient Identification Number (Patient_ID) and purpose of visit. The purpose of Visit

can be for review, see a consultant or for treatment.

Figure 5-12: Physician Form

This form captures data to update the physician table of the database. Data captured on each physician

include Physician Identification Number (Physician_ID), Physician Type and Physician Name.

5.5 Preparing the Data Mart for Mining

The transactional data which has just been entered must be loaded into the data mart. The user must

select Extract and Load Data Mart from the main switchboard.

Figure 5-13: Extract and Load Data Mart form

Step 1: Double click on the Extract and Load Data button to load data from the

transactional data base tables into the fact table and dimensions of the data mart.

As shown in Figure 5-14, a message box come up to notify you that you are about to run an append

query that will modify data in your table.

Step 2: Click yes

Figure 5-14: Extract and Load Data Mart Message Box One

The message box below tells you that once you click yes, you cannot use Undo command to reverse the

changes.

Step 3: Click yes.

Figure 5-15: Extract and Load Data Mart Message Box Two

Steps 1 and 2 are repeated for all the dimensions and fact table of your data mart.

You would receive the message above, to confirm that you have successfully loaded the data mart.

Step 4: click Ok.

Figure 5-16: Extract and Load Data Mart Message Box Three

The code for executing the extraction and loading of data mart is shown below:

Private Sub Extract_and_Load_DM_Click ()

DoCmd.OpenQuery "Load_DM_Diagnosis", acViewNormal, acEdit DoCmd.OpenQuery "Load_DM_Outcome", acViewNormal, acEdit

DoCmd.OpenQuery "Load_DM_Patient", acViewNormal, acEdit

DoCmd.OpenQuery "Load_DM_Physician", acViewNormal, acEdit

DoCmd.OpenQuery "Load_Time_Dim", acViewNormal, acEdit

DoCmd.OpenQuery "load_DM_purpose", acViewNormal, acEdit DoCmd.OpenQuery "Load_DM_Treatment", acViewNormal, acEdit

DoCmd.OpenQuery "Load_DM_Drug", acViewNormal, acEdit DoCmd.OpenQuery "Load_DM_Summary", acViewNormal, acEdit

MsgBox "You have successfully loaded the fact table"

End Sub

5.6 Queries

The “Queries” button on the main switchboard leads the user to the queries built into the data mart (Fig.

5-17, page 83). These queries were developed trying to provide information for the report needed by

decision makers in the hospital and the most frequently asked questions from clinical researchers.

Some of the queries of the queries are (The italic part can be replaced):

i. Patients with a diagnosis

For example: “How many patients were diagnosed with malaria”?

ii. Patient with a diagnosis (detailed information that physicians can use to

recruit patients into a clinical trail or to contact with patient if necessary)

For example: “What is the general information of patients with the diagnosis malaria”?

iii. Patient have a diagnosis in a certain period

For example: How many patients were diagnosed with Malaria in December, 2004”?

iv. Patients with a diagnosis in a certain age

For example: “How many patients who were diagnosed with Malaria are older than 30”?

v. How many patients were attended to by a Physician

For example: “How many patients were attended to by Physician D004”?

vi. How many patients were admitted in December, 2004 and their diagnosis

vii. How many Patients were Admitted and what were outcomes of admission

viii. How much did each admitted patient pay?

ix. “What medications were given to each malaria patient by a physician”?

x. “Based upon what treatment notes did each physician diagnose a patient of malaria”?

Figure 5-17: Queries Interface

5.7 Reports

Reports are other important components of the data mart. The “Reports” button on the main

switchboard leads the user to the reports built into the data mart (Figure 5-18, page 84). This data mart

not only contains reports that summarise some of the medical information such as diagnosis, treatments

notes, outcomes but the information such as gender of patients, status of patients, address of patients,

treatment and maintenance costs for those who have been admitted.

Figure 5-18: Reports Interface

A VIEW OF THE PATIENT DATA MINING SYSTEM

Figure 5-19: Patients’ Data Mining System

The operational data A, B and C i.e. Patient data, physicians data, Drug data, etc has been summarized

into a Patient data mart using Microsoft Access 2000, a relational database software. The data mart is

now ready to be mined by the End-users.

5.8 Conclusion

This chapter was devoted to the designing of the data mart and in particular, its input forms and

description of what data are captured by each of the forms. The queries and reports needed by decision

makers as well as researchers were also developed in this chapter. The next chapter presents the results

of the patient data mining as reports.

CHAPTER SIXPATIENT DATA MINING RESULTS

6.0 Introduction

This chapter presents the results of the patient data mining. The data mining was based on the queries

developed in the preceding chapter.

6.1 Reports

The reports have been generated from the data mart to satisfy the information needs of the decision

makers at the University Hospital.

6.2 Monthly Statistical Report6.2.1 Top ten causes of Hospital Attendance

UNIVERSITY OF GHANA HOSPITALTOP TEN CAUSES OF ATTENDANCE

MONTH: DecemberYEAR: 2004

Diagnosis Name Frequency

Malaria 9

Upper Respiratory Tract Infection 4

Tonsillitis 2

Road Traffic Accident 2

conjunctivitis 1

menopausal syndrome 1

Hypertension, Rhinitis 1

Gastroenteritis, second degree malaria 1

Ententis , Upper Respiratory TractInfection

1

Dyspepsia 1

Figure 6-1: Top ten causes of Hospital Attendance

This component of the report shows the top ten main causes of hospital attendant by patients in 2004.

Most of the patients, 9 were diagnose of Malaria. Other results are Upper respiratory tract infection (4),

Tonsillitis (2), and Road traffic Accident (2).

One each of the patient were diagnoses of Conjunctivitis, menopausal syndrome, Hypertension &

Rhinitis, Gastroenteritis & Second degree Malaria, Ententis and Upper respiratory tract infection and

Dyspepsia respectively. The syntax of the query used to generate this report is as shown below:

SELECT TOP 10 Top_Ten_Best.[Diagnosis Name],

Main_causes_of_admission.Frequency, DM_TIME.Month,DM_TIME.Year

FROM DM_TIME, Top_Ten_Best INNER JOIN Main_causes_of_admissionON Top_Ten_Best.[Diagnosis Name] = Main_causes_of_admission.[Diagnosis Name];

6.2.2 Total Treatment and Maintenance Incomes

UNIVERSITY OF GHANA HOSPITALTOTAL INCOME


Number ofPatients

Total TreatmentIncome

Total MaintenanceIncome

TotalIncome

48 ¢1,770,930.00 ¢1,504,000.00 ¢3,274,930.00

Figure 6-2: Total Treatment and Maintenance Incomes

The total treatment and maintenance incomes for the month of December, 2004 are shown in the figure

2. A total of 48 patients attended the hospital and an amount of ¢1,770,930.00 and ¢ 1,504,000.00

respectively were realize as treatment and maintenance incomes. The total amount realised was ¢

3,274,930.00. The syntax for the query used to generate this report is shown below:

SELECT Count (DM_Summary.Patient_ID) AS [Number of Patients], Sum (DM_Summary.Treatment_Cost) AS [Total Treatment Income], Sum

(DM_Summary.Maintenance_Cost) AS [Total Maintenance Income], SumDM_Summary.Treatment_Cost+DM_Summary.Maintenance_Cost) AS [Total Income],DM_TIME.Month, DM_TIME.Year

FROM DM_TIME INNER JOIN DM_Summary ON DM_TIME.Time_ID =DM_Summary.Time_ID;

6.2.3 Main Causes of Hospital Attendance

Figure 6-3: Main Causesof Hospital Attendance

This component of the report

shows the top ten main causes

of hospital attendant by

patients in 2004. Most of the

patients, 9 were diagnose of

Malaria. Other results are

Upper respiratory tract

infection (4), Tonsillitis (2),

and Road traffic Accident (2).

One each of the patient was

diagnosis of the disease as

shown. The syntax for the

query used to generate this

UNIVERSITY OF GHANA HOSPITALMAIN CAUSES OF HOSPITAL ATTENDANCE


Diagnosis Name Number of Patients

Malaria 9


Allergy Conjuctivitis 2

Tonsilitis 2


conjunctivitis 1

Hypetension, Rhinitis 1

Gastroenterotis, second degree malaria 1

Ententis , Upper Respiratory Tract Infection 1

Dyspepsia 1

menopausal syndrome 1

conjuvititis 1

Old age 1

Candidiasis 1

Assaulted 1

Assault 1

Acute Tonsillitis 1

Cut 1

See eye specialist 1

Urinary Tract Infection, Candidiasis 1

Urinary Tract Infection 1

Upper Respiratory Tract Infecton, No malaria 1


Swelling 1

Mild Bronchititis 1

Sore, bacterial Infection and scalp 1

Accident 1

Pregnacy 1

Phanguvitis 1

Pain 1

report is:

SELECT DM_Diagnosis. [Diagnosis Name], Count

(DM_Summary.Diagnosis_ID) AS Number of Patients

FROM DM_Diagnosis INNER JOIN DM_Summary ON

DM_Diagnosis.Diagnosis_ID=DM_Summary.Diagnosis_ID

GROUP BY DM_Diagnosis.[Diagnosis Name]

ORDER BY Count (DM_Summary.Diagnosis_ID) DESC;

6.2.4 Number of Patients Admitted By Status

UNIVERSITY OF GHANA HOSPITALNUMBER OF PATIENTS ADMITTED BY STATUS


Status Number of Patients

Private patient 6

Senior member dependent 2

Junior staff 1

Figure 6-4: Number of Patients Admitted By Status

This figure shows the status of patients who were admitted in December, 2004. Whilst 6 of them were

private patients, 2 were senior member dependents.

Only one of them was a junior staff. The syntax of the query for generating this report is as shown on

the next page:

SELECT DM_Patient.Status, Count (DM_Patient.Status) AS [Number of

Patients],

FROM DM_Patient INNER JOIN (DM_TIME INNER JOIN

DM_Summary ON

DM_TIME.Time_ID=DM_Summary.Time_ID) ON

DM_Patient.Patient_ID=DM_Summary.Patient_ID

WHERE (((DM_Summary.Maintenance_Cost)>0))

GROUP BY DM_Patient.Status;

6.2.5 Outcomes of Patients Attendance

UNIVERSITY OF GHANA HOSPITALOUTCOMES OF PATIENTS ATTENDANCE


Number ofPatients Outcome Name

31 Discharged after visit

9 Admitted

6 Refereed after visit

1 Discharged against medicaladvice

1 Detained for Observation

Figure 6-5: Outcomes of Patients Attendance

This component of the report shows the outcomes of hospital attendance in December, 2004. 31 of the

patients were discharged after visiting the hospital whilst 9 patients were admitted after their visit to the

hospital.

Whereas six patients were refereed after their visit to the hospital, one each of the patients was

discharged against medical advice and Detained for observation respectively. The syntax of the query

used to generate this report is as shown on the next page.

SELECT Count (DM_Patient.Patient_ID) AS [Number of Patients], DM_Outcome.[Outcome

Name], DM_TIME.Month, DM_TIME.Year

FROM DM_TIME INNER JOIN (DM_Patient INNER JOIN (DM_Outcome INNER JOIN

DM_Summary ON DM_Outcome.Outcome_ID = DM_Summary.Outcome_ID) ON

DM_Patient.Patient_ID = DM_Summary.Patient_ID) ON DM_TIME.Time_ID =

DM_Summary.Time_ID

GROUP BY DM_Outcome.[Outcome Name], DM_TIME.Month, DM_TIME.Year;

6.2.6 Gender of Admitted Patients

UNIVERSITY OF GHANA HOSPITALGENDER OF ADMITTED PATIENTS


Gender Number of Patients

Female 6

Male 4

Figure 6-6: Gender of Admitted Patients

This report shows the gender of the patients who were admitted to the hospital in December, 2004.

Whilst 6 of them were females, only 4 were males. The syntax of the query used to generate this report

is as shown below:

SELECT DM_Patient.Gender, Count (DM_Patient.Gender) AS [Number of Patients]

FROM DM_Patient INNER JOIN (DM_TIME INNER JOIN DM_Summary ON

DM_TIME.Time_ID = DM_Summary.Time_ID) ON DM_Patient.Patient_ID =

DM_Summary.Patient_ID


GROUP BY DM_Patient.Gender;

6.2.7 Wards of Admitted Patients

UNIVERSITY OF GHANA HOSPITAL WARDS OF ADMITTED PATIENTS

MONTH: December YEAR: 2004

Ward Name Number of Patients

Children 5

Female Ward B 2

Male Ward A 1

Female Ward A 1

Figure 6-7: Wards of Admitted Patients

This component of the report shows the wards in which patients were admitted into at the hospital.

Majority of the patients, 5 were in the Children’s ward whilst 2 patients were admitted into the Female

ward B. One each of the patients was in the Male ward A and Female Ward A respectively. The

syntax for the query used to generate this report is as shown below:

SELECT Count (DM_Patient.Patient_ID) AS [Number of Patients], DM_Summary. [Ward

Name]

FROM DM_Patient INNER JOIN ((DM_TIME INNER JOIN DM_Summary ON

DM_TIME.Time_ID=DM_Summary.Time_ID))


GROUP BY DM_Summary. [Ward Name];

6.2.8 Status of Admitted Patients

UNIVERSITY OF GHANA HOSPITALSTATUS OF ADMITTED PATIENTS


Status Number of Patients

Private patient 6

Senior staff dependent 2

Senior member dependent 1

Junior staff 1

Figure 6-8: Status of Admitted Patients

This report shows the status of patients who were admitted to the hospital in December, 2004. Whilst 6

of them were Private patients, only 2 were senior staff dependents. One each of the patients was a

senior member dependent and junior staff respectively. The syntax of the query used to generate this

report is as shown below:

SELECT DM_Patient.Status, Count (DM_Patient.Status) AS [Number of Patients]

FROM DM_Patient INNER JOIN (DM_TIME INNER JOIN DM_Summary ONDM_TIME.Time_ID=DM_Summary.Time_ID) ONDM_Patient.Patient_ID=DM_Summary.Patient_ID


GROUP BY DM_Patient.Status;

6.2.9 Patient Diagnosis and Prescription

UNIVERSITY OF GHANA HOSPITALPATIENT DIAGNOSIS AND PRESCRIPTION


Patient Name Drug Name Diagnosis Name

BenjaminBourkum

Urine catteter Dyspepsia

Bernand Don-Bortey

Alaxin, Paracetamol Accident

Cedric Dorlu Malox Pain

Chien Tin Gum Artesunuate course, Paracetamol, WidalSTO, STH, Cipro 500mg

Swelling

Cynthia Nzolezo Dil saline IVF , Chloroquine, Phenegen Cut

Eric Ramaphosa Gvitter, Maxitrol, Gvitter Naphcos A Upper Respiratory TractInfecton, No malaria

Ezekiel Alammu Paracetamol, Amoxyl, cough Mixture,Alaxin

Malaria

Faustina Sonnel liberal oral fluids, steam inhalation,paracetamol syrup

Mild Bronchititis

Felix Hayna Chloroquine, Vit C, paracetamol,Workadine Antiseptic mouth wash

Acute Tonsillitis

Gladys Pains Tetanol injection vaginal candidiasis

Grace D'Or Paracetamol, Diaraprine 10mg Upper Respiratory TractInfection

Harriet Tagoe Clotrimazole, vaginal creame, apply 2times daily

refered to surgical specialist


MONTH: DecemberYEAR: 2004HarrisonKoomson

Alaxin tablets, Vit B'Co, Diclofenac Gastroenterotis, seconddegree malaria

Isaac Odoom Chloroquine Road Traffic Accident

Jeffery Archer Paracetamol Malaria

Kenneth Haggin Camoquine, Paracetamol, MistMagnessium, Trisilicate

Malaria

Kesiwaa Titus Suspension Amoxyl 7.5l, SyrupParacetamol 7.5l, Antibiotic eye drops

Allergy Conjuctivitis

Kinsley Adiepena Amosikklav 625mg, Decatylen Lozenges,paracetamol, Vit C, Histacet

Malaria

Krofa Okronipa Amoxyl 125mg, Cough Mixture Upper Respiratory TractInfection

Leornard Kuffuor Gvitter 160mg,Alaxin 60mg, Paracetamol Urinary Tract Infection

Mary-Anne Issifu Soluble Aspirin, Celinfine T Upper Respiratory TractInfection

Micheal Farady Beverage eye ointment conjuvititis

Monica Coffie cough mixture, paracetamol See eye specialist

Monica Coffie Ephedine nasal drops(0.5%), Sys. Actified2.5mls tabs, suspension Zitronax 80mg

Candidiasis

Nana Yaw Akyiah Amoxyl, Diclofenac, Mercurochromepaintings, leopard balme

Malaria

Panford Andrews Amosiklav 625mg, wokadine anticepticsyrup,Decatylen lozenges

Tonsilitis

Panie Agnes Paracetamol Stomach ulcer

Parnorld Dogbe Phenegen, gvitter, artesunuate,paracetamol, flagyl,cough mixture

Malaria

Patience Amos Chloramphenicol, ORS, Breastfeedfrequently, Baby cough mixture

Ententis , Upper RespiratoryTract Infection

Paulina Amah Flucloxaccilkine, acyclovine cream,tetanol, Dichlophenac, flagyl

Assaulted

Penni Angela refered to gynaecologist Myomas

Penni Angela Tylenol,Paracetamol, IVF, D/L, Phenegan Urinary Tract Infection,Candidiasis

Pernest Nguiema Diclofenac, Leopard Ointment malaria


MONTH: DecemberYEAR: 2004PhilominaKpitinge

Gvitter 80mg, Alaxin 60mg andParacetamol

Road Traffic Accident

Phylis Zuma Plasmodium course, Paracetamol, Vit C,Folic Acid

Upper Respiratory TractInfection

Porshia Ansah Penicillium V, wokadine anticeptic gargle,decatylen lozenges

Tonsilitis

Regina BinLadden

paracetamol, Valium, Vit C menopausal syndrome

Richard Asuma Inj. Diclofenac 75mg, Inj. Tetanus 0.5mg,Paracetamol tables

Phanguvitis

Rosemary Sonko Flueconazole tablets Assault

Ruby Bienta Syrup paracetamol 5ml, Ephedine (0.5%),Syrup enythromycin, cough mixture


Salifu Mumuni Aspirin tabs, Amoxyl 500mg Hypetension, Rhinitis

Siaw Nsafoa Diclofenac, Vit C, Valium, Crepe Bomdage Malaria

Stella Dugan Chloranphenicol eye drop, Amoxyl 500mg,Paracetamol

conjunctivitis

Victoriayaoborteley

Chloroquine 3cc, phenegen, Amoxyl250mg

Malaria

Figure 6-9: Patient Diagnosis and Prescription

This figure shows the diagnosis and the prescriptions for each patient that visited the hospital inDecember, 2004. The syntax of the query used to generate this report is shown below:

SELECT DISTINCT DM_Patient.Patient_Name, DM_Drug.Drug_Name, DM_Diagnosis.

[Diagnosis Name]

FROM DM_TIME INNER JOIN (DM_Patient INNER JOIN (DM_Drug INNER JOIN

(DM_Diagnosis INNER JOIN DM_Summary ON DM_Diagnosis.Diagnosis_ID =

DM_Summary.Diagnosis_ID) ON DM_Drug.Drug_ID = DM_Summary.Drug_ID) ON


DM_Summary.Time_ID

GROUP BY DM_Patient.Patient_Name, DM_Drug.Drug_Name,

DM_Diagnosis.[Diagnosis Name];

6.3 Financial Statistical Report

6.3.1 Total Treatment and Maintenance Incomes

UNIVERSITY OF GHANA HOSPITALTOTAL INCOME FROM TREATMENT AND MAINTENACE

MONTH: December

YEAR: 2004Number of Patients Total Treatment Income Total Maintenance Income Total Income

48 ¢1,770,930.00 ¢1,504,000.00 ¢3,274,930.00

Figure 6-10: Total Treatment and Maintenance Incomes

The total treatment and maintenance incomes for the month of December, 2004 are shown in the figure

6-10. A total of 48 patients attended the hospital and an amount of ¢1,770,930.00 and ¢ 1,504,000.00

respectively were realise as treatment and maintenance incomes. The total amount realised was ¢

3,274,930.00. The syntax for the query used to generate this report is shown below:

SELECT Count (DM_Summary.Patient_ID) AS [Number of Patients], Sum

(DM_Summary.Treatment_Cost) AS [Total Treatment Income], Sum

(DM_Summary.Maintenance_Cost) AS [Total Maintenance Income], Sum

DM_Summary.Treatment_Cost+DM_Summary.Maintenance_Cost) AS [Total Income]

FROM DM_TIME INNER JOIN DM_Summary ON DM_TIME.Time_ID =

DM_Summary.Time_ID;

6.4 Patient Status Report

6.4.1 Patient Attendance by Status

UNIVERSITY OF GHANA HOSPITALPATIENT ATTENDANCE BY STATUS

MONTH: December

YEAR: 2004

Number ofPatients Status

16 Private patient

15 Senior member

10 Senior member dependent

3 Senior staff dependent

2 Junior staff dependent

1 Senior staff

1 Junior staffFigure 6-11: Number of Patients by status

This component of the report specifies the number of patients, who attended the hospital in December,

2004 by status. The report shows that 16 of the patients were private patients. Whilst 15 of them were

senior members, 10 patients were senior member dependents.

Three patients were senior staff dependents while two were junior staff dependents. One each of them,

were senior staff and junior staff respectively. This report is based on the query with syntax:

SELECT Count (DM_Summary.Diagnosis_ID) AS [Number of Patients], DM_Patient.Status

FROM DM_TIME INNER JOIN (DM_Patient INNER JOIN DM_Summary ONDM_Patient.Patient_ID = DM_Summary.Patient_ID) ON DM_TIME.Time_ID =

DM_Summary.Time_ID

GROUP BY DM_Patient.Status


6.4.2 Patient Attendance by Gender

UNIVERSITY OF GHANA HOSPITALPATIENT ATTENDANCE BY GENDER

MONTH: December

YEAR: 2004

Gender Number of Patients

Male 24

Female 24

Figure 6-12: Patient Attendance by Gender

Figure 6-12 depicts the number of patients by gender. Twenty-four each of the patients are males and

females respectively. The syntax of the query used to generate this component of the patient status

report is:

SELECT DM_Patient.Gender, Count (DM_Summary.Diagnosis_ID) AS [Number of Patients]

FROM DM_Patient INNER JOIN (DM_Diagnosis INNER JOIN (DM_TIME INNER JOIN

DM_Summary ON DM_TIME.Time_ID = DM_Summary.Time_ID) ON

DM_Diagnosis.Diagnosis_ID = DM_Summary.Diagnosis_ID) ON

DM_Patient.Patient_ID = DM_Summary.Patient_ID

GROUP BY DM_Patient.Gender


6.4.3 Number of Patients According To Diagnosis

UNIVERSITY OF GHANA HOSPITAL

NUMBER OF PATIENTS ACCORDING TO DIAGNOSIS

MONTH: December

YEAR: 2004

Number of Patients Diagnosis Name

9 Malaria

4 Upper Respiratory Tract Infection



MONTH: December

YEAR: 2004

2 Tonsillitis

2 Road Traffic Accident

1 Candidiasis

1 menopausal syndrome

1 Hypertension, Rhinitis

1 Gastroenteritis, second degree malaria

1 Ententis , Upper Respiratory Tract Infection

1 Dyspepsia

1 Cut

1 Mild Bronchitis’s

1 Conjunctivitis

1 Phanguvitis

1 Assaulted

1 Assault

1 Allergy Conjuctivitis

1 Acute Tonsillitis

1 conjuvititis

1 Sore, bacterial Infection and scalp

1 vaginal candidiasis

1 Urinary Tract Infection, Candidiasis

1 Urinary Tract Infection

1 Upper Respiratory Tract Infecton, No malaria

1 Upper Respiratory Tract Infection

1 Old age

1 Stomach ulcer

1 Myomas



MONTH: December

YEAR: 2004

1 See eye specialist

1 refered to surgical specialist

1 Refered

1 Pregnancy

1 Pain

1 Accident

1 SwellingFigure 6-13: Number of Patients According to Diagnosis

Figure 6-13 shows the diagnosis for all the patients that attended the hospital in December 2004. It

shows that majority of the patients, 9 were diagnosed of Malaria. Upper Respiratory Tract Infection was

recorded 4 patients. Two patients each were diagnoses of Tonsillitis and Road Traffic Accident

respectively. The rest of the diagnosis is for a patient each. The syntax of the query used to generate

this report is:

SELECT DISTINCT Count (DM_Patient.Patient_ID) AS [Number of patients],

DM_Diagnosis.[Diagnosis Name]

FROM DM_TIME INNER JOIN (DM_Patient INNER JOIN (DM_Drug

INNER JOIN (DM_Diagnosis INNER JOIN DM_Summary ON

DM_Diagnosis.Diagnosis_ID = DM_Summary.Diagnosis_ID)

ON DM_Drug.Drug_ID = DM_Summary.Drug_ID) ON

DM_Patient.Patient_ID = DM_Summary.Patient_ID) ON

DM_TIME.Time_ID = DM_Summary.Time_ID

GROUP BY DM_Diagnosis.[Diagnosis Name];

6.4.4 Number of Patients According Address

UNIVERSITY OF GHANA HOSPITALNUMBER OF PATIENTS ACCORDING ADDRESS

MONTH: December

YEAR: 2004

Address Number of Patients

Home Science 5

Adenta 4

east legon 2

Isser 2

Music Department 2

Nogouchi 2

Economics 2

Dansoman 1

Kwabenya 1

Crop Science 1

Achimota 1

computer science 1

Botany 1

Frafraha 1

Haatso 1

Bawalashie 1

Islamic University College 1

ARS 1

English 1

Parakuo Estates 1

UGMS 1

Statistics 1

Sociology 1

Social studies 1

UNIVERSITY OF GHANA HOSPITALNUMBER OF PATIENTS ACCORDING ADDRESS

MONTH: December

YEAR: 2004

Shiashie 1

Religion 1

KIA 1

Psychology 1

Kisseman 1

Nutrition and food science 1

Madina Estates 1

Linguistics 1

Language centre 1

La-bawaleshie 1

Univ. Hospital 1

Registrar's Office 1

Figure 6-14: Number of Patients According Address

The number of patients by address is depicted in Figure 19. Most of the patients who visited the

hospital, 5 are from the Home science department or related to a staff at that department.

4 patients are from Adenta whereas 2 each of the patients are from or related to someone from East

Legon, ISSER, Music, Nugouchi and Economics Departments. One each of the patients is from

localities or departments as shown in the report. The syntax of the query used to generate the report is:

SELECT DM_Patient.Address, Count (DM_Summary.Diagnosis_ID) AS [Number of Patients]

FROM DM_Patient INNER JOIN (DM_TIME INNER JOIN DM_Summary ON

DM_TIME.Time_ID=DM_Summary.Time_ID) ON

DM_Patient.Patient_ID=DM_Summary.Patient_ID

GROUP BY DM_Patient.Address


6.4.5 Patient Attendance by Age


PATIENT ATTENDANCE BY AGE MONTH: December

YEAR: 2004

Patient Name AGE

Benjamin Bourkum 45

Bernand Don-Bortey 33

Cedric Dorlu 51

Chien Tin Gum 4

Cynthia Nzolezo 35

Eric Ramaphosa 30

Ezekiel Alammu 38

Faustina Sonnel 27

Felix Hayna 31

Gladys Pains 7

Grace D'Or 39

Harriet Tagoe 5

Harrison Koomson 45

Isaac Odoom 37

Jeffery Archer 3

Kenneth Haggin 3

Kesiwaa Titus 33

Kinsley Adiepena 12

Krofa Okronipa 65

Leornard Kuffuor 35

Mary-Anne Issifu 54

Micheal Farady 39

UNIVERSITY OF GHANA HOSPITAL PATIENT ATTENDANCE BY AGE

MONTH: December

YEAR: 2004

Monica Coffie 16

Nana Yaw Akyiah 2

Panford Andrews 5

Panie Agnes 32

Parnorld Dogbe 11

Patience Amos 28

Paulina Amah 15

Penni Angela 35

Pernest Nguiema 45

Philomina Kpitinge 24

Phylis Zuma 24

Porshia Ansah 49

Regina Bin Ladden 55

Reginald Valery King 67

Rexford Asiama 62

Richard Asuma 11

Rosemary Sonko 22

Stella Dugan 3

Victoria yaoborteley 37

Yvette Listowels 27

Figure 6-15: Patient Attendance by Age

Figure 6-15 gives the number of patients by age, who attended the hospital in

December, 2004. Three each of the patients are aged 3, 35 and 45 years respectively whilst two each of

them are aged 5, 11, 22, 24, 27, 33, 37, and 39 years respectively. However each of the remaining

patients is aged between 2 and 67 years. It was generated based on the query with syntax shown below:

SELECT DISTINCT DM_Patient.Patient_Name, DateDiff ('yyyy', DM_Patient![Date

of Birth],Now()) AS AGE, Month, Year

FROM DM_TIME INNER JOIN (DM_Patient INNER JOIN

DM_Summary ON DM_Patient.Patient_ID =

DM_Summary.Patient_ID) ON DM_TIME.Time_ID =

DM_Summary.Time_ID

GROUP BY DM_Patient.Patient_Name;

6.4.6 Patients Diagnosis

UNIVERSITY OF GHANA HOSPITALPATIENTS DIAGNOSIS

MONTH: DECEMBER

YEAR: 2004

PatientName

MedicalRecordNumber

Address Age

Gender Status Diagnosis Name

PatienceAmos

6519 Adenta 28 Male privatepatient

Ententis , UpperRespiratory TractInfection

EzekielAlammu

5588 Statistics 38 Male Seniormember

Malaria

RosemarySonko

6492 La-bawaleshie

22 Female privatepatient

Assault

Victoriayaoborteley

6516 MadinaEstates

37 Female privatepatient

Malaria

PanieAgnes

5557 Nutritionand foodscience

32 Female seniormemberdependent

Stomach ulcer

PanfordAndrews

5327 CropScience

5 Male seniormemberdependent

Tonsilitis

UNIVERSITY OF GHANA HOSPITALPATIENTS DIAGNOSIS

MONTH: DECEMBER

YEAR: 2004

PenniAngela

5561 Homescience

35 Female Seniormember

Myomas

PenniAngela

5561 Homescience


Pregnacy

BenjaminBourkum

5578 Nogouchi 45 Male Seniormember

Dyspepsia

CynthiaNzolezo

5553 Homescience


Cut

FelixHayna

6517 Kwabenya 31 Male privatepatient

Acute Tonsillitis

Figure 6-16: Patients Diagnosis

This component of the report shows the diagnosis for all patients that attended the Hospital in

December, 2004. The syntax of the query used to generate this report is shown below:

SELECT DM_Patient.Patient_Name, DM_Patient.MedRecordNo, DM_Patient.Address, DateDiff

("yyyy", DM_Patient! [Date of Birth], Date ()) AS Age, DM_Patient.Gender,

DM_Patient.Status, DM_Diagnosis.[Diagnosis Name]

FROM DM_TIME INNER JOIN (DM_Patient INNER JOIN (DM_Diagnosis INNER JOIN

DM_Summary ON DM_Diagnosis.Diagnosis_ID = DM_Summary.Diagnosis_ID) ON


DM_Summary.Time_ID;

6.5 Pharmaceutical Report

6.5.1 Prescriptions by Gender of Patients

UNIVERSITY OF GHANA HOSPITALPRESCRIPTIONS BY GENDER OF PATIENTS

MONTH: December

YEAR: 2004

Gender Drug Name

Female Artesunuate course, Paracetamol, Widal STO, STH, Cipro 500mg

Female Chloranphenicol eye drop, Amoxyl 500mg, Paracetamol

Female Chloroquine 3cc, phenegen, Amoxyl 250mg

Female Clotrimazole, vaginal creame, apply 2 times daily

Female Dichlorophenac, Tetanus, Amoxyl, flagyl

Female Dil saline IVF , Chloroquine, Phenegen

Female Ephedine nasal drops(0.5%), Sys. Actified 2.5mls tabs, suspensionZitronax 80mg

Female Flucloxaccilkine, acyclovine cream, tetanol, Dichlophenac, flagyl

female liberal oral fluids, steam inhalation, paracetamol syrup

Female Tylenol,Paracetamol, IVF, D/L, Phenegan

Male Alaxin tablets, Vit B'Co, Diclofenac

Male Alaxin, Paracetamol

Male Amosikklav 625mg, Decatylen Lozenges, paracetamol, Vit C, Histacet

Male Amosiklav 625mg, wokadine anticeptic syrup,Decatylen lozenges

Male Amoxyl 125mg, Cough Mixture

Male Amoxyl, Diclofenac, Mercurochrome paintings, leopard balme

Male Aspirin tabs, Amoxyl 500mg

Male Beverage eye ointment

Male Camoquine, Paracetamol, Mist Magnessium, Trisilicate

Male Chloramphenicol, ORS, Breastfeed frequently, Baby cough mixture

Figure 6-17: Prescriptions by Gender of Patients

This report shows the gender of patients and the prescriptions they were each given. The syntax of the

query used to generate this report is as shown below:

SELECT DISTINCT DM_Patient.Gender, DM_Drug.Drug_Name


DM_Summary ON DM_Drug.Drug_ID = DM_Summary.Drug_ID) ON


DM_Summary.Time_ID

GROUP BY DM_Patient.Gender, DM_Drug.Drug_Name;

6.5.2 Drugs Consumed By Patients

UNIVERSITY OF GHANA HOSPITALDRUGS CONSUMED BY PATIENTS

MONTH: December

YEAR: 2004

Drug Name Number of Patient

Alaxin tablets, Vit B'Co, Diclofenac 1

Alaxin, Paracetamol 1

Amosikklav 625mg, Decatylen Lozenges, paracetamol,Vit C, Histacet

1

Amosiklav 625mg, wokadine anticeptic syrup,Decatylenlozenges

1

Amoxyl 125mg, Cough Mixture 1

Amoxyl, Diclofenac, Mercurochrome paintings, leopardbalme

1

Artesunuate course, Paracetamol, Widal STO, STH,Cipro 500mg

1

Beverage eye ointment 1

Camoquine, Paracetamol, Mist Magnessium, Trisilicate 1

Chloramphenicol, ORS, Breastfeed frequently, Babycough mixture

1

UNIVERSITY OF GHANA HOSPITALDRUGS CONSUMED BY PATIENTS

MONTH: December

YEAR: 2004

Chloranphenicol eye drop, Amoxyl 500mg, Paracetamol 1

Chloroquine 1

Chloroquine 3cc, phenegen, Amoxyl 250mg 1

Chloroquine, Vit C, paracetamol, Workadine Antisepticmouth wash

1

Clotrimazole, vaginal creame, apply 2 times daily 1

cough mixture, paracetamol 1

Dichlorophenac, Tetanus, Amoxyl, flagyl 1

Diclofenac, Leopard Ointment 1

Diclofenac, Vit C, Valium, Crepe Bomdage 1

Diclophenac, Buscopan, Onizoral creame, Puriton 1

Dil saline IVF , Chloroquine, Phenegen 1

Ephedine nasal drops(0.5%), Sys. Actified 2.5mls tabs,suspension Zitronax 80mg

1

Figure 6-18: Drugs Consumed By Patients

This component of the report shows the number of patients by the prescription they were given. The

drugs were prescribed to one patient each. The syntax of the query used to generate this report is shown

below:

SELECT DM_Drug.Drug_Name, Count (DM_Summary.Patient_ID) AS Number of Patients

FROM DM_TIME INNER JOIN (DM_Drug INNER JOIN (DM_Diagnosis INNER JOIN

DM_Summary ON DM_Diagnosis.Diagnosis_ID = DM_Summary.Diagnosis_ID) ON

DM_Drug.Drug_ID = DM_Summary.Drug_ID) ON DM_TIME.Time_ID =

DM_Summary.Time_ID

GROUP BY DM_Drug.Drug_Name;

6.5.3 Prescribed Drugs and Age of Patients

UNIVERSITY OF GHANA HOSPITALPRESCRIBED DRUGS AND AGE OF PATIENTS

MONTH: December

YEAR: 2004

Patient Name Drug Name AGE

Benjamin Bourkum Urine catteter 45

Bernand Don-Bortey Alaxin, Paracetamol 33

Cedric Dorlu Malox 51

Chien Tin Gum Artesunuate course, Paracetamol, Widal STO, STH,Cipro 500mg

4

Cynthia Nzolezo Dil saline IVF , Chloroquine, Phenegen 35

Eric Ramaphosa Gvitter, Maxitrol, Gvitter Naphcos A 30

Ezekiel Alammu Paracetamol, Amoxyl, cough Mixture, Alaxin 38

Faustina Sonnel liberal oral fluids, steam inhalation, paracetamol syrup 27

Felix Hayna Chloroquine, Vit C, paracetamol, Workadine Antisepticmouth wash

31

Harriet Tagoe Clotrimazole, vaginal creame, apply 2 times daily 5

Harrison Koomson Alaxin tablets, Vit B'Co, Diclofenac 45

Kenneth Haggin Camoquine, Paracetamol, Mist Magnessium, Trisilicate 3

Kesiwaa Titus Suspension Amoxyl 7.5l, Syrup Paracetamol 7.5l,Antibiotic eye drops

33

Kinsley Adiepena Amosikklav 625mg, Decatylen Lozenges, paracetamol,Vit C, Histacet

12

Krofa Okronipa Amoxyl 125mg, Cough Mixture 65

Leornard Kuffuor Gvitter 160mg,Alaxin 60mg, Paracetamol 35

Mary-Anne Issifu Soluble Aspirin, Celinfine T 54

Micheal Farady Beverage eye ointment 39

Monica Coffie cough mixture, paracetamol 16

Figure 6-19: Prescribed Drugs and Age of Patients

The specific drugs that were given to each patient and their ages are displayed by this component of the

report and the syntax of the query used to generate this report is shown below:

SELECT DISTINCT DM_Patient.Patient_Name, DM_Drug.Drug_Name, DateDiff

('yyyy',DM_Patient![Date of Birth],Now()) AS AGE

FROM DM_TIME INNER JOIN (DM_Patient INNER JOIN (DM_Drug INNER

JOIN DM_Summary ON DM_Drug.Drug_ID = DM_Summary.Drug_ID)

ON DM_Patient.Patient_ID = DM_Summary.Patient_ID) ON

DM_TIME.Time_ID = DM_Summary.Time_ID;

6.5.4 Diagnosis and Drugs Prescribed Per Patients

UNIVERSITY OF GHANA HOSPITALDIAGNOSIS AND DRUGS PRESCRIBED PER PATIENTS

MONTH: December

YEAR: 2004

Patient_Name Diagnosis Name Drug Name

PatienceAmos

Ententis , Upper RespiratoryTract Infection

Chloramphenicol, ORS, Breastfeed frequently,Baby cough mixture

EzekielAlammu

Malaria Paracetamol, Amoxyl, cough Mixture, Alaxin

FaustinaSonnel

Mild Bronchititis liberal oral fluids, steam inhalation, paracetamolsyrup

RosemarySonko

Assault Flueconazole tablets

Victoriayaoborteley

Malaria Chloroquine 3cc, phenegen, Amoxyl 250mg

Panie Agnes Stomach ulcer Paracetamol

PanfordAndrews

Tonsilitis Amosiklav 625mg, wokadine antisepticssyrup,Decatylen lozenges

Penni Angela Myomas refered to gynaecologist

Chien TinGum

Swelling Artesunuate course, Paracetamol, Widal STO, STH,Cipro 500mg

UNIVERSITY OF GHANA HOSPITALDIAGNOSIS AND DRUGS PRESCRIBED PER PATIENTS

MONTH: December

YEAR: 2004

CynthiaNzolezo

Cut Dil saline IVF , Chloroquine, Phenegen

Felix Hayna Acute Tonsillitis Chloroquine, Vit C, paracetamol, WorkadineAntiseptic mouth wash

HarrisonKoomson

Gastroenterotis, second degreemalaria

Alaxin tablets, Vit B'Co, Diclofenac

KrofaOkronipa


Amoxyl 125mg, Cough Mixture

Ruby Bienta Upper Respiratory TractInfection

Syrup paracetamol 5ml, Ephedine (0.5%), Syrupenythromycin, cough mixture

Siaw Nsafoa Malaria Diclofenac, Vit C, Valium, Crepe Bomdage

Figure 6-20: Diagnosis and Drugs Prescribed Per Patients

The diagnosis for each patient as well as the prescriptions given to them is displayed by this component

of the report. The syntax of the query used to generate this report is as shown below:

SELECT DM_Patient.Patient_Name, DM_Diagnosis.[Diagnosis Name],

DM_Drug.Drug_Name


(DM_Diagnosis INNER JOIN DM_Summary ON DM_Diagnosis.Diagnosis_ID

= DM_Summary.Diagnosis_ID) ON DM_Drug.Drug_ID =

DM_Summary.Drug_ID) ON DM_Patient.Patient_ID =

DM_Summary.Patient_ID) ON DM_TIME.Time_ID = DM_Summary.Time_ID;

6.5.5 Top Ten Causes of Hospital Attendance

UNIVERSITY OF GHANA HOSPITALTOP TEN CAUSES OF HOSPITAL ATTENDANCE

MONTH: December

YEAR: 2004

Diagnosis Name Number of Patients

Malaria 9


Tonsilitis 2


Conjunctivitis 1

Menopausal syndrome 1

Hypetension, Rhinitis 1

Gastroenterotis, second degree malaria 1

Ententis , Upper Respiratory Tract Infection 1

Dyspepsia 1

Figure 6-21: Top Ten Causes of Hospital Attendance

This component of the report shows the top ten main causes of hospital attendant by patients in 2004.

Most of the patients, 9 were diagnose of Malaria. Other results are Upper respiratory tract infection (4),

Tonsillitis (2), and Road traffic Accident (2). One each of the patient were diagnoses of Conjunctivitis,

menopausal syndrome, Hypertension & Rhinitis, Gastroenteritis & Second degree Malaria, Ententis and

Upper respiratory tract infection and Dyspepsia respectively. The syntax of the query used to generate

this report is as shown below:

SELECT TOP 10 Top_Ten_Best.[Diagnosis Name],

Main_causes_of_admission.Frequency

FROM DM_TIME, Top_Ten_Best INNER JOIN Main_causes_of_admission ON

Top_Ten_Best.[Diagnosis Name] = Main_causes_of_admission.[Diagnosis Name];

6.5.6 Status of Patients and Drugs Prescribed

UNIVERSITY OF GHANA HOSPITALSTATUS OF PATIENTS AND DRUGS PRESCRIBED

MONTH: December

YEAR: 2004

Status Drug Name

Junior staff Diclofenac, Vit C, Valium, Crepe Bondages

Junior staff dependent Ephedine nasal drops (0.5%), Sys. Acetified 2.5mls tabs,suspension Zitronax 80mg

Private patient Amoxyl, Diclofenac, Mercurochrome paintings, leopardbalme

Private patient Artesunuate course, Paracetamol, Widal STO, STH, Cipro500mg

Private patient Camoquine, Paracetamol, Mist Magnessium, Trisilicate

Private patient Chloramphenicol, ORS, Breastfeed frequently, Baby coughmixture

Private patient Chloroquine 3cc, phenegen, Amoxyl 250mg

Private patient Chloroquine, Vit C, paracetamol, Workadine Antisepticmouth wash

Private patient Gvitter 80mg, Alaxin 60mg and Paracetamol

Private patient Inj. Diclofenac 75mg, Inj. Tetanus 0.5mg, Paracetamoltables

Senior member Penicillium V, wokadine anticeptic gargle, decatylenlozenges

Senior member Tylenol,Paracetamol, IVF, D/L, Phenegan

Senior member Urine catteter

senior member dependent Amosiklav 625mg, wokadine antiseptics syrup,Decatylenlozenges

senior member dependent Chloranphenicol eye drop, Amoxyl 500mg, Paracetamol

senior member dependent Clotrimazole, vaginal cream, apply 2 times daily

senior member dependent Flucloxaccilkine, acyclovine cream, tetanol, Dichlophenac,flagyl

UNIVERSITY OF GHANA HOSPITALSTATUS OF PATIENTS AND DRUGS PRESCRIBED

MONTH: December

YEAR: 2004

senior staff Gvitter, Maxitrol, Gvitter Naphcos A

senior staff dependent Amosikklav 625mg, Decatylen Lozenges, paracetamol, VitC, Histacet

Figure 6-22: Status of Patients and Drugs Prescribed

Figure 6-22 shows the status of patients and the drugs that were prescribed for each of them in

December, 2004. The query used to generate this report is as shown below:

SELECT DISTINCT DM_Patient.Status, DM_Drug.Drug_Name

FROM DM_TIME INNER JOIN (DM_Patient INNER JOIN (DM_Drug INNER

JOIN DM_Summary ON DM_Drug.Drug_ID = DM_Summary.Drug_ID)

ON DM_Patient.Patient_ID = DM_Summary.Patient_ID) ON

DM_TIME.Time_ID = DM_Summary.Time_ID

GROUP BY DM_Patient.Status, DM_Drug.Drug_Name;

6.6 Conclusion

A new system has thus been developed in line with the three stage Data Mart Development Life Cycle

adopted and shall be released for use by the Records Unit to generate reports that summarise the

medical information of patients for decision-making.

CHAPTER SEVEN

RECOMMENDATIONS AND CONCLUSIONS

7.0 Recommendations

For hospitals and other institutions to benefit from data mining, the following recommendations are

required to establish and administer a Data Mart:

A dedicated personal computer with at least Pentium IV, Microsoft Windows 98, 128 MB

RAM, 40 HDD, and 8-bit graphics adapter and display should be made available solely for use

as a server for the Data Mart.

Date of Birth rather than the age of patients should be collected

A Patient Identification Number should be used in addition to the Medical Record number

currently being used at the Hospital. This will enable clinical research work to be carried out

easily and also ensure the confidentiality of each patient. Currently, staff and their dependents

are given the same number.

Archiving and backing-up of data in the data mart must be routinely carried out to ensure that

the Data Mart can be rebuilt following data loss or software /hardware failures. Pen Drives and

CDRW are recommended for this purpose

The staff of the Biostatistics Unit should be encouraged to collect returns and feed them into the

data mart. The incoming data should be scrutinized carefully.

A member of staff should be trained to handle the data entry and updating services. This should

ensure that the data mart is updated continuously.

Staff of the Medical Records Department and Decision Makers who will operate and manage

the new system should be trained to do so. User manuals will be provided. Senior management

will also be trained in less structured manner using video demonstrations to give them an

overview of the system.

Appropriate File, Hardware and Systems conversion methods, timely updating as well as

upgrading the system to keep pace with new product, services, customers, government

regulations and other requirements should be adopted.

With the passage of the Health Insurance Bill into Law, the Ghana Health Service will be

required to organize its internal systems to provide services as prescribed by law. This will

require extensive restructuring of both outpatient and inpatient services and in particular the

billing system to be operated. Improvements should be made to the system so that data mining

can be used to detect fraud in the Health Insurance scheme.

To operate the system, these procedures must be followed:

i. Key data into the various tables and save

ii. Double click on the “Load and Extract data mart” on the switch board

iii. Run the appropriate queries by clicking on the query buttons on the

switch board

iv. Print the appropriate reports by clicking on the report buttons on the

switch board

v. Close and exit the data mart system

7.1 Conclusion

The University of Ghana Hospital, like any other hospital in Ghana, has been collecting the medical

data of patients but the data is not processed electronically. The volume of patient medical data at the

various hospitals has been increasing over the years. However, the data is not properly managed. As a

result of this, majority of out-patients do not have full medical record. With this situation, the

physician’s time is wasted since they have to collect this information again and in addition, it becomes

very difficult for them to keep track of the patients. This reduces the ability to carry out high quality

clinical research in the hospitals, and compromises the continuity of healthcare as well as the quality of

healthcare delivery in the hospital.

A Data Mart can be designed to collect, store, organize and retrieve the medical information of patients.

In this dissertation, a Data Mart was designed and built using Microsoft Access 2000 DBMS for the

Medical Records Unit of the University of Ghana Hospital. The Data Mart was mined using MS Access

to provide timely, accurate and reliable reports to meet the information needs of clinical research,

improving health care continuity and research at the University Hospital.

Although the Data Mart was developed particularly for the Medical Records Department of the

University of Ghana hospital, it can be adapted for use in the Medical Records Unit of other hospitals.

Data mining tools such as the association rules, sequential rules and neural networks can be used to

mine the patient data mart to identify previously undiscovered patterns and trends important to the

hospital.

The Data Mart can be built upon and deployed in Hospitals currently operating under the National

Health Insurance Scheme for easy detection of false claims and disease management.

REFERENCES

1. Abbott, P. A., Goodwin, L., Cullen, P. and Delaney, C. (1999). ‘The ABC’s of Data Mining: A

primer for Health care Professionals” AMIA 1999 annual Seminar. Workshop/Session W11.

http://medicine.ucsd.edu/f99/index.html 18th June 2004.

2. Bamgboye, E.A and Familusi (1990). “Morbidity trends at the children’s emergency room, University

College Hospital, Ibadan, Nigeria”. Afr. J. Med. Sci. Vol 19. Pp 49-56

3. Bamgboye, E.A and Familusi (1990). “Mortality pattern at the children’s emergency ward, University

College Hospital, Ibadan, Nigeria”. Afr. J. Med. Sci. Vol 19. Pp 127-132

4. Baylis, Philip. (1999) “Better health care with data mining, Clementine-Working with health care”:

An SPSS white Paper. Pp. 2-9

5. Berdt, D.J., Hevner, A.R. and Studnicki , J (2000). “Hospital Discharge Transacations: A data

warehouse component”. Proceedings of the 33 rd Hawaii conference on system sciences. IEEE. Pp 1-10

6. Berry, M. and Linoff, G. (2000). Data mining Techniques. John Wesley and Sons, Inc. USA: In

Bicen, Pelin and Oktay Firat, S.U. (2003). “Knowledge Discovery in databases (KDD) and Data Mining:

An application of customer segmentation analysis in banking sector”. Bulletin of the International

Statistical Institute, 54th Session, Volume LX, Invited Papers, August 2003, Berlin, , Germany. Pp.136

7. Bicen, Pelin and Oktay Firat, S.U. (2003). “Knowledge Discovery in databases (KDD) and Data

Mining: An application of customer segmentation analysis in banking sector”. Bulletin of the International

Statistical Institute, 54th Session, Volume LX, Invited Papers, August 2003, Berlin, Germany. Pp.136

8. Biritwum, R. B, Gulaid, J and Amaning, A. O.(2000) “Pattern of diseases or conditions leading to

hospitalization at Korle Bu Teaching Hospital, Ghana in 1996”. Ghana Medical Journal , Vol. 34, Number

4, Dec 2000, pp 197 -205

9. Bramson, Robert T. (2000). “Data mining for medical management: Data mining represents an

opportunity to shape the way in which medicine will be practiced in the 21st century”. An article excerpted

from mining the data, presented at Radiology in the 21st Century: the Digital Department on February 25,

2000, in Maui, Hawaii retrieved from www.imaginingeconomics.com on June 5th, 2004.

10. Bresnahan, J (1997).” Lives may not hang in the balance of your company’s data mining efforts. But

the experiences of those in the business of curing patients could help you come up with a prescription for a

healthier business”. CIO magazine. July 15, 1997 Issue.

11. Christensen, W. F., and Di Cook (1998). “Data mining soil characteristics affecting corn yield”.

citeseer.nj.nec.com/christensen98data.html: In Abdullah et al (2003): “Learning Dynamics of Pesticide

abuse through data mining”. A paper which appeared at the Australasian Workshop on Data Mining and

Web Intelligence (AWDM&WI, 2003) Dunedin, New Zealand. Conference in Research and Practice in

Information Technology Journal. Vol 32. http://crpit.com/confpapers/CRPITV32Abdullah.pdf on 14th

October 2004

12. Computer Science Innovations, Inc. (2001): “Data mining in Health Care”. September, 2001. A

Company report. http:// www.csi-inc.com

13. Connolly, T. and Begg, C. (2005) Database systems. 3rd Ed. Addison-Wesley. Pp. 1239-1240

14. Carbone, P(2000): Data mart Logo Courtesy www.mitre.org/news/the_edge/august_00/carbone.html

(August 2000, Volume 4 No. 2) retrieved on 26th June 2004

15. Degruy, B. Kristin. “Health care applications of Knowledge Discovery in Databases”. Published in the

Online Journal of Healthcare Information Management (JHIM) http://www.himss.org November 13, 2004

16. Dilly, R (1995). Data mining: An Introduction. Lecture notes for students of the Queens University,

Belfast, UK. http://www.pcc.qub.ac.uk/tec/courses/datamining/stu_notes/dm_book_1.html 28th June

2004. Pp.1 and 4

17. Edelstein, Herbert (1996). “Technology How To: Mining Data Warehouses.” Information Week

(January 8, 1996): In Laudon, Kenneth C and Laudon, Jane Price(2000): Management Information systems:

Organisation and Technology in Networked Enterprise, 6th ed. Prentice-Hall, Upper Saddle River, New

Jersey, pp 469-470

18. Edelstein, Herbert (1999). Introduction to Data Mining and Knowledge Discovery, 3rd Ed. A

Publication of Two Crows Corporation, Pp 5-34 retrieved from http://www.twocrows.com on 23rd August

2003

19. Epstein, S and Denius, H (2001): “Health Care’s Rocky Road: Data Mining for Better Quality of

Care Is a Path Worn Well into the Future. Published in ADVANCE for Health Information Executive.

20. Frand Jason (2004). W hat is data mining?

http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/index.html 10th June 2004

21. Frawley, W.J., Piatetsky-Shapiro, G., Smyth, P, and Matheus, C.J.(1995). Knowledge Discovery in

Databases: An Overview: In Zaïane, R. O. (1999). CMPUT690 Principles of Knowledge Discovery in

Databases lecture Notes. Department of Computing Science, University of Alberta.

22. Grossman, R., Kasif, S., Moore, R., Rocke, D., and Ullman, J. (1998). Data Mining Research:

Opportunities and Challenges. A report of three National Science Foundation Workshops on Mining Large,

Massive, and Distributed Data retrieved from http://www.nsf.gov/ on June 28th, 2004

23. Gϋrsakal, Necmi and Acar, F. (1999). “Statistics, Data Analysis and Data mining”. IV National

Econometry and Statistics Symposium. Antalya, Turkey. Pg 2: In Bicen, Pelin and Oktay Firat, S.U. (2003).

“Knowledge Discovery in databases (KDD) and Data Mining: An application of customer segmentation

analysis in banking sector”. Bulletin of the International Statistical Institute, 54th Session, Volume LX,

Invited Papers, August 2003, Berlin, , Germany. Pp.136

24. Hagland, Mark (2004). “Stronger Computer tools allow deeper analysis of medical research, patient

care and insurance data in Health care Informatics”, April 2004. www.healthcare-

informatics.com/datamining.htm June 28th, 2004.

25. http://www.angoss.com

26. Http://www.billinmon.com

27. Http://www.datawarehousing.com

28. Http://www.dwinfocenter.org

29. Http://www.dw-institute.com

30. Http://www.ralphkimball.com

31. Inmon, W. H., Welch, J. D., Glassey, L. Katherine (1997). Managing the Data Warehouse: Practical

Techniques for monitoring Operations and Performances, Administering Data and tools and managing

Change and Growth. John Wiley & sons, Inc. USA. Pp 65-74.

32. Jones, Greg (2002). “Building Effective Data Marts: A practical, step-by-step guide to developing and

deploying the data mart that meets your business needs. An Article Published at www.oramag.com by

Fortune City. Http://www.fortunecity.com/skyscrapper/oracle/699/orahtml/oramag/38data2.htm 30th

October, 2004.

33. Kantor, J (2001): “A Case Study for ROI: With Intel® Processor-Based Data Mining “, In Company

newsletter of Clalit Health Services, Israel's second largest health service provider (HSP).

34. Krippendorf, M and Song, I (2002): The Translation of Star Schema into Entity-Relationship

diagrams”. College of Information Science and Technology Journal, Drexel University.

35. Laudon, Kenneth C and Laudon, Jane Price (2000). Management Information systems: Organisation

and Technology in Networked Enterprise, 6th ed. Prentice-Hall, Upper Saddle River, New Jersey.

36. Lin, W., Orgun, M.A, and Graham, J.W (2001): Mining Temporal Patterns from Health Care Data

retrieved www.csiro.org.au

37. Karuna, P. J (1997): Analysis of Data Mining Algorithms

38. Makulowich, John (1999). Government Data Mining Systems Defy Definition . Washington

Technology Online, Vol. 13 No. 22

http://www.washingtontechnology.com/news/13_22/tech_features/393-2.html on 15th October 2004

39. Mallach, G Efrem (2000). Decision Support Systems and Data Warehouse Systems. Irwin McGraw-

Hill. Pp 3-23, 197-228, and 465-552.

40. Marakas, M. George (1999). Decision Support Systems in the twenty-first century: DSS and data

mining technologies for tomorrow’s manager. Prentice-Hall, Inc. Upper saddle River, New Jersey 07458.

pp 1-11 and 321-382.

41. Marcel Holshemier & Arno Siebes (1994): In Jones, A.E (2001). Lecture Notes of CM35645-5 lecture

for Level 3 students of the Faculty of Computing, Engineering and Technology, Staffordshire University, UK.

Pg 4

42. Murphy, S. N., Morgan, M.M., Barnett, G.O and Chueh, H.C (2000): “Optimising Healthcare

Research Data Warehouse Design through Past COSTAR Query Analysis”. Laboratory of Computer

Science Report, Massachusetts General Hospital, Boston, MA.

43. Perner, P (2000):”Mining Knowledge in Medical Image Databases: In Data Mining and Knowledge

Discovery: Theory, Tools, and Technology, Belur V. Dasarathy (eds.), Proceedings of SPIE, Vol. 4057

(2000). Pp 359-369.

44. PriceWaterHouseCoopers (2000): “ Enhancing the Revenue Generation Capacity and Improving the

Efficiency and Cost Effectiveness of the University of Ghana including all institutes and schools- Final

Business Plan: University Hospital.” A Report submitted to the University of Ghana. November 2000. Pp 5-

9.

45. Raghu, R. and Gehrke, J. (2000): Database Management Systems. 2nd Edition, The McGraw Hill

Companies, Inc., USA. Pp 707-732

46. Schonlau, M., Bigelow, J.H. and Meili, R. C. (2003). “Exploratory analysis of patient level databases:

Building patient profiles for the MEPS data.” Bulletin of the International Statistical Institute, 54th Session,

Volume LX, Invited Papers, August 2003, Berlin, , Germany. Pp.294-296

47. Sokol, L., Garcia, B., West, M., Rodriguez, J and Kirk, J. (2001). “Precursory Steps to Mining HCFA

Health care claims”. Proceedings of the 34 th Annual Hawaii International Conference on System Science,

HICSS –34 Vol 6, January 03-06, 2001, Maui, Hawaii. Electronic Edition of IEEE Journal of Information

Systems.

48. Theus, Martin (2003). “A day in the life of a Data Miner”. Bulletin of the International Statistical

Institute, 54th Session, Volume LX, Invited Papers, August 2003, Berlin, , Germany. Pp.298-301

49. Verma, R. and Harper, J. (2003):“Lifecycle of a Data warehousing Project in Healthcare”. Published

in the Online Journal of Healthcare Information Management (JHIM) and http://www.himss.org and

Retrieved on November 13, 2004

50. Vice-Chancellor’s Annual Report, University of Ghana, 2000

51. Hersh, W (2003). What is Health/Medical/biomedical Informatics? A paper presented to the America

Medical Informatics Association. www.billhersh.info. Retrieved on 15/05/2005

52. Xinping, T. (2003). Building a database for Peking Union Medical College Hospital outpatients with

rheumatic diseases. A master’s thesis presented to the department of Medical Informatics and Outcome

Research and the Oregon Health & Sciences University School of Medicine. May 2003. unpublished

53. Zaïane, R. O. (1999). CMPUT690 Principles of Knowledge Discovery in Databases lecture Notes.

Department of Computing Science, University of Alberta.

developing a patient data mining system for the …

Documents