http://www.iaeme.com/IJCET/index.asp

International Journal of Computer Engineering & Technology (IJCET) Volume 8, Issue 3, May-June 2017, pp. 76–81, Article ID: IJCET_08_03_008

Available online at http://www.iaeme.com/ijcet/issues.asp?JType=IJCET&VType=8&IType=3

Journal Impact Factor (2016): 9.3590 (Calculated by GISI) www.jifactor.com ISSN Print: 0976-6367 and ISSN Online: 0976–6375

© IAEME Publication

ASSOCIATION RULE MINING BASED ANALYSIS ON HOROSCOPE DATA – A PERSPECTIVE STUDY

Rahul Shajan

Research Scholar, Mahatma Gandhi University, Kottayam, Kerala, India

Gladston Raj S

Head, Department of CS, Govt. College, Nedumangadu, Thiruvananthapuram, Kerala, India

ABSTRACT

Extracting information efficiently from huge data repositories is an art. Extracting similar patterns from a large number of transactions is an important task in data mining, and many data mining techniques exist for extracting information. Here we adopt the Apriori algorithm, one of the most popular algorithms for association rule mining. The purpose of association rule mining is to find all sets of items whose support meets the minimum specified by the user. Such sets are called frequent item sets, and the association rules are generated from them. In astrology, the main database used for prediction is the horoscope of a person. The horoscope is a record containing data such as the planetary positions at the time of birth of an individual. For any incident related to a human being, the horoscope of that person is consulted to assess the chance of good or bad things happening in connection with that incident. In some situations we also need to find a horoscope that is similar to a given one: believers in astrology select a life partner who has a similar horoscope (horoscope matching). In such situations, pattern recognition techniques can be used to find similar horoscopes. In this work, we study whether the existing association rule mining algorithm Apriori is sufficient to find similar patterns in the horoscopes of different individuals.

Key words: Association rule mining, Apriori algorithm, Pattern recognition, Horoscope.

Cite this Article: Rahul Shajan and Gladston Raj S, Association Rule Mining Based Analysis on Horoscope Data – A Perspective Study. International Journal of Computer Engineering & Technology, 8(3), 2017, pp. 76–81.

http://www.iaeme.com/ijcet/issues.asp?JType=IJCET&VType=8&IType=3

1. INTRODUCTION

1.1. Basic Concepts

Data mining is the exploration and analysis of large data sets in order to discover meaningful patterns and rules. The key idea is to find effective ways to combine the computer's power to process data with the human eye's ability to detect patterns [1]. Put simply, data mining is the process of retrieving user-requested information from a huge and complex data repository. Before information is extracted, the data source goes through stages such as data cleaning, integration and pre-processing, which remove errors and ensure consistency. There are two types of mining. Descriptive mining characterises the general properties of the data in the database. Predictive mining uses the current data to make predictions. Data mining techniques are widely used to find similar patterns in large data sets. The scope for pattern recognition was first identified in business enterprises, which are beginning to realize that information on customers and their buying patterns is among the most valuable information they hold. Association rule mining is one of the popular tools for pattern recognition.

1.2. Association Rule Mining

The problem of finding associations in data was formulated in 1993 by Agrawal et al. and is often referred to as the market-basket problem [2]. In this problem there is a set of items and a large number of transactions, each of which is a subset of these items. The purpose is to find relations between the different items within these transactions, otherwise called baskets. Association rule mining is the process of finding interesting relations among the items. It is a two-step process: finding the frequent item sets is the first step and generating the association rules is the second. Association rules may be of different types, such as Boolean and quantitative. Boolean rules show associations between the presence and absence of items. Quantitative association rules show associations between quantitative items [3].

For a given transaction database T, an association rule is an expression of the form X => Y, where X and Y are subsets of A (the set of items). The rule X => Y holds with confidence c if c% of the transactions in T that support X also support Y. The rule X => Y has support s in the transaction set T if s% of the transactions in T support X ∪ Y [1].
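The two measures defined above can be computed directly from a list of transactions. The following is a minimal Python sketch, not code from the study: the function names and the toy baskets are hypothetical, and support and confidence are returned as fractions between 0 and 1 rather than as percentages.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(itemset: Iterable[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[Transaction]) -> float:
    """Fraction of the transactions supporting X that also support Y.

    Assumes X occurs in at least one transaction (otherwise this divides by zero).
    """
    x_items = frozenset(x)
    both = x_items | frozenset(y)
    return support(both, transactions) / support(x_items, transactions)


# Hypothetical toy baskets for illustration only.
T = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

print(support({"bread", "milk"}, T))       # 2 of the 4 transactions
print(confidence({"bread"}, {"milk"}, T))  # 2 of the 3 bread transactions
```

Here the rule bread => milk has support 0.5 and confidence 2/3, because two of the four baskets contain both items and three baskets contain bread.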

1.3. Apriori Algorithm

Apriori is a popular algorithm, also known as the level-wise algorithm [1], for finding the frequent item sets in a set of transactions. From these frequent item sets the algorithm generates the association rules. A frequent item set must have a support equal to or greater than the minimum support specified by the user, and the association rules must also satisfy the minimum confidence. The algorithm relies on the downward closure property [4]: every subset of a frequent item set is itself frequent. Rules with low support are uninteresting and should be avoided, so the minimum support value is the key factor in filtering out uninteresting rules, while the minimum confidence value ensures the reliability of the rules generated. In association rule mining both the minimum support and the minimum confidence are important. In this work, we try to generate association rules by finding similar patterns in the horoscope data of different individuals.
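The level-wise search described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation used in the study: the function name, the item labels and the toy transactions are hypothetical, the transaction list is assumed to be non-empty, and support is counted by rescanning every transaction.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)

    # Pass 1: count single items to build L1.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c / n for s, c in counts.items() if c / n >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            # Prune step (downward closure): every (k-1)-subset must be frequent.
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count step: keep the candidates whose support meets the threshold.
        current = {}
        for c in candidates:
            sup = sum(1 for t in transactions if c <= t) / n
            if sup >= min_support:
                current[c] = sup
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical transactions; the item labels are made up for illustration.
toy = [
    frozenset({"Ra1", "Ke7", "Sa3"}),
    frozenset({"Ra1", "Ke7", "Sa3"}),
    frozenset({"Ra1", "Ke7", "Ju5"}),
    frozenset({"Ra2", "Ke8", "Sa3"}),
]
result = apriori_frequent_itemsets(toy, min_support=0.5)
for itemset, sup in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

The prune step is where downward closure pays off: a candidate is never counted against the data if one of its smaller subsets has already been found infrequent.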

1.4. Horoscope

A horoscope is the birth chart of an individual. A birth chart shows the positions of ten planets in the universe at the time of birth (astrology uses the term planet in the sense of influencing factors in the macrocosm). The ten planets considered in the horoscope are Sun, Moon, Mars, Mercury, Jupiter, Venus, Saturn, Rahu, Kethu (Rahu and Kethu are two specific points in the universe) and Gulikan. Besides these planets, astrologers determine a specific attribute known as 'Lagnam'. It is calculated from the birth time and the sunrise time of the day on which the individual was born. A birth chart is a 360 degree chart divided into twelve houses of 30 degrees each. Aries, Taurus, Gemini, Cancer, Leo, Virgo, Libra, Scorpio, Sagittarius, Capricorn, Aquarius and Pisces are the twelve houses. Each house may hold one or more planets, and some houses may be empty. In this work, the planets in the different houses are treated as the items and the individuals as the transactions, following common association rule mining practice.

2. LITERATURE SURVEY

Abhijit Raorane et al. studied the advantages and disadvantages of data mining techniques. Their focus is on understanding consumer behaviour, the psychological condition of consumers at the time of purchase, and how suitable data mining methods can be applied to improve on conventional methods [5].

Darshan M. Tank made a detailed study of association rule mining and the Apriori algorithm. He identifies two bottlenecks of frequent item set mining: the large number of candidate item sets and the poor efficiency of counting their support. He proposes an algorithm which decreases pruning operations of candidate item sets, thereby saving time and increasing efficiency [4].

Neelam Chaplot et al. studied the influence of the positions of planets and stars at the time of a person's birth in predicting the possibility of that person becoming a doctor [10].

3. METHODOLOGY

To implement the Apriori algorithm and generate association rules, there must be an item set and a number of transactions. For example, in a sales review of a shop, the items sold in that shop are the elements of the item set and the individual sales are the transactions. Each transaction is a subset of the item set, and the same item may appear in many different transactions. In a horoscope we consider eleven planets, including Lagnam, together with their positions in the different houses. Any planet in the horoscope can occupy any one of the twelve houses; which one it occupies is determined by astrological rules. So each planet has twelve possible positional values. We assign a numerical value to each house.

Table 1 Different houses in the horoscope and their corresponding numerical values.

House          Corresponding numerical value
Aries          1
Taurus         2
Gemini         3
Cancer         4
Leo            5
Virgo          6
Libra          7
Scorpio        8
Sagittarius    9
Capricorn      10
Aquarius       11
Pisces         12

If 'X' is the name of a planet, it has twelve possible values depending on the house it occupies: X1, X2, X3, ..., X12.

The value X3 means that planet X occupies the third house, Gemini. The other planets have the same twelve possibilities. So a total of 12*12=144 possible values are there in a horoscope. We consider these 144 positional values as the items. In the horoscope data analysis by the Apriori algorithm, the item set is therefore fixed and its size is always 144.
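The encoding described above, a planet name combined with the house number from Table 1, can be illustrated with a short Python sketch. The two-letter planet codes, the function names and the sample chart are assumptions made for this example; the source does not list the codes it uses for each planet.

```python
# House order follows Table 1 (Aries = 1, ..., Pisces = 12).
HOUSES = [
    "Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
    "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces",
]
HOUSE_NUMBER = {name: i for i, name in enumerate(HOUSES, start=1)}


def encode_item(planet_code: str, house: str) -> str:
    """Combine a planet code with the numerical value of its house, e.g. X + 3 -> 'X3'."""
    return f"{planet_code}{HOUSE_NUMBER[house]}"


def encode_chart(chart: dict) -> frozenset:
    """Turn a {planet_code: house_name} chart into one transaction (a set of items)."""
    return frozenset(encode_item(p, h) for p, h in chart.items())


# Hypothetical partial chart; the planet codes are assumed abbreviations.
chart = {"Mo": "Leo", "Ju": "Gemini", "Me": "Virgo"}
print(sorted(encode_chart(chart)))
```

Each individual's chart becomes one transaction, so a collection of encoded charts can be passed to any frequent item set miner without further conversion.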

Based on this concept, we prepared a database by collecting the name, birth time, birth date, birth place and average mark of each student in the tenth and plus two classes. Using the birth details we generate the birth chart, and the horoscope details are stored in a database as follows.

Figure 1 Database structure.

From the figure, it is clear that the attributes kept in our database are the person id (pid), the positional value of each planet, formed by combining a keyword for the planet name with the numerical value of the house (e.g. Mo5 means Moon in House 5, that is, Moon in Leo), and the average mark. The average mark is included in the database to analyse the educational performance of the students. In astrology, the educational performance of an individual is judged mainly from the planets Jupiter and Mercury. The 'Bhavam' also has a great role in astrological prediction. In a horoscope the Lagnam is always considered as Bhavam 1, so the house immediately following the Lagnam is treated as Bhavam 2, and the twelve bhavams are calculated in the same way. Bhavams 2, 5 and 10 are also considered for the analysis in the context of education. Here we find the similar patterns and association rules in the given database and compare the mined rules with traditional astrological rules. This analysis shows whether students in the same range of marks have similar patterns in their horoscope data.
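The Bhavam numbering described above amounts to counting houses starting from the Lagnam. A small Python sketch of that arithmetic follows. It assumes that "the house next to the Lagnam" means the next house in the numerical order of Table 1, wrapping from Pisces (12) back to Aries (1); the function name is hypothetical.

```python
def bhavam(house: int, lagnam_house: int) -> int:
    """Bhavam number (1-12) of `house`, counting the Lagnam's house as Bhavam 1.

    Both arguments are house numbers from Table 1 (Aries = 1, ..., Pisces = 12).
    Assumption: houses are counted in increasing numerical order, wrapping at 12.
    """
    if not (1 <= house <= 12 and 1 <= lagnam_house <= 12):
        raise ValueError("house numbers must be between 1 and 12")
    return (house - lagnam_house) % 12 + 1


# With the Lagnam in Leo (5): Leo is Bhavam 1, Virgo (6) is Bhavam 2,
# and Cancer (4) is Bhavam 12.
for h in (5, 6, 4):
    print(h, bhavam(h, lagnam_house=5))
```

Under this assumption, a positional item such as Mo5 could be re-expressed relative to the Lagnam, which is one way the Bhavam-oriented view might be fed into the same mining process.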

4. EXPERIMENTAL ANALYSIS

We use the popular data mining application WEKA (Waikato Environment for Knowledge Analysis) for the analysis. Initially the collected data undergoes basic pre-processing activities such as data cleaning, integration and transformation. The database is saved as a CSV file and loaded into WEKA. The Apriori algorithm is then selected and executed; it finds the frequent item sets and generates the association rules that satisfy the minimum confidence.
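The rule generation step can be illustrated independently of WEKA. The sketch below is not WEKA's implementation: it is a plain Python illustration in which the function name, the item labels and the support values are all hypothetical. It assumes that the support of every frequent item set and of all its subsets is already known, which downward closure guarantees after the frequent item set step.

```python
from itertools import combinations


def generate_rules(supports, min_confidence):
    """Return (X, Y, confidence) for each rule X => Y meeting min_confidence.

    `supports` maps each frequent itemset (a frozenset) to its support and
    must also contain every non-empty subset of those itemsets.
    """
    rules = []
    for itemset, sup in supports.items():
        if len(itemset) < 2:
            continue
        # Try every non-empty proper subset of the itemset as the antecedent X.
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                x = frozenset(antecedent)
                y = itemset - x
                conf = sup / supports[x]  # support(X u Y) / support(X)
                if conf >= min_confidence:
                    rules.append((x, y, conf))
    return rules


# Hypothetical supports for illustration only.
supports = {
    frozenset({"Ra1"}): 0.75,
    frozenset({"Ke7"}): 0.75,
    frozenset({"Sa3"}): 0.75,
    frozenset({"Ra1", "Ke7"}): 0.75,
    frozenset({"Ra1", "Sa3"}): 0.5,
    frozenset({"Ke7", "Sa3"}): 0.5,
    frozenset({"Ra1", "Ke7", "Sa3"}): 0.5,
}
rules = generate_rules(supports, min_confidence=0.9)
for x, y, conf in rules:
    print(sorted(x), "=>", sorted(y), conf)
```

Raising the confidence threshold shrinks the rule list without changing the frequent item sets, which is why the two thresholds are usually tuned separately.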

5. RESULTS

The output window shows the minimum support and confidence values, the total number of cycles performed, the size of the set of large item sets and the best rules found. Here the minimum support is 0.6 and the minimum confidence is 0.9. The best rules generated by the execution are shown below.

6. CONCLUSION AND FUTURE WORK

In the execution of the algorithm, the first step is to count the occurrences of each item in order to find the frequent item sets. The frequent item sets Lk-1 found in the (k-1)th pass are used to generate the candidate item sets Ck, and the support of each candidate in Ck is then counted. In our problem L1, L2 and L3 were generated, and we obtained 10 association rules that satisfy the minimum confidence. Several rules show connections among Rahu-Kethu and among Saturn-Rahu-Kethu, but only one rule specifies a relation between Jupiter and other planets. As noted earlier, the key factors in the context of education and knowledge are Jupiter and Mercury, yet there is no rule involving Mercury. The Bhavam-oriented relations are also neglected. This is because the Apriori algorithm directly counts the items in the different transactions and calculates support and confidence from those counts: the presence of the same item in different transactions is what is counted. In astrology, however, counting is performed by checking for the same positional values of planets in the horoscopes of different individuals.

If we can apply some conditions to the algorithm just before the generation of the frequent item sets and candidate item sets, we could obtain more association rules that help to analyse the horoscope of an individual. For that we need to upgrade the Apriori algorithm to a Conditional Apriori Algorithm (CAA). We conclude that the existing Apriori algorithm is not fully capable of analysing horoscope data in the context of astrology, and that a new, upgraded algorithm could perform effectively in horoscope data analysis. This type of horoscope analysis and pattern recognition will help to identify individuals with a specific type of birth chart. There are many practical applications of this work, such as identifying a life partner with a matching horoscope from the huge data repository of a marriage bureau.
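The Conditional Apriori Algorithm is not specified in this work, so the following Python sketch shows only one possible reading of "applying conditions before frequent item set generation": restricting each transaction to the items that satisfy a condition, here the planets of interest for education. All names, the item format (letters followed by a house number) and the sample data are assumptions for illustration, not the proposed CAA.

```python
import re

# Assumed item format: planet code (letters) followed by a house number, e.g. "Ju5".
ITEM_PATTERN = re.compile(r"^([A-Za-z]+)(\d+)$")


def planet_of(item: str) -> str:
    """Extract the planet code from an item such as 'Ju5'."""
    match = ITEM_PATTERN.match(item)
    if match is None:
        raise ValueError(f"unexpected item format: {item!r}")
    return match.group(1)


def apply_condition(transactions, keep):
    """Keep only the items for which `keep(item)` is true, before any counting."""
    return [frozenset(i for i in t if keep(i)) for t in transactions]


# Hypothetical charts and a hypothetical condition: education-related planets only.
charts = [
    frozenset({"Ju5", "Me6", "Ra1"}),
    frozenset({"Ju5", "Ke7"}),
]
education_planets = {"Ju", "Me"}
filtered = apply_condition(charts, lambda item: planet_of(item) in education_planets)
print([sorted(t) for t in filtered])
```

Filtering before counting means that frequent but astrologically irrelevant items can no longer crowd out the items of interest at a given minimum support.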

REFERENCES

[1] Arun K Pujari, "Data Mining Techniques", Hyderabad, India, Universities Press Private Ltd, ISBN 978-81-7371-380-4.

[2] Aggarwal Charu and Yu Philip, "Mining large item sets for association rules", Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, no. 1, March 1998.

[3] Smitha T. V. Sundaram, "Comparative Study of Data Mining Algorithms for High Dimensional Data Analysis", International Journal of Advances in Engineering & Technology, Vol. 4, Issue 2, Sept 2012. ISSN: 2231-1963.

[4] Darshan M. Tank, "Improved Apriori Algorithm for Mining Association Rules", International Journal of Information Technology and Computer Science, 2014, 07, 15-23, June 2014.

[5] Abhijit Raorane and R. V. Kulkarni, "Data Mining Techniques: A Source for Consumer Behaviour Analysis", International Journal of Database Management Systems (IJDMS), Vol. 3, No. 3, August 2011.

[6] Pragya Agarwal, Madan Lal Yadav and Nupur Anand, "Study on Apriori Algorithm and its Application in Grocery Store", International Journal of Computer Applications (0975 – 8887), Volume 74, No. 14, July 2013.

[7] Lin, H., GouminZ., Liu, Q., "Application of Apriori Algorithm to Data Mining of the Wildfire", in the proceedings of the 6th International Conference on Fuzzy Systems and Knowledge Discovery, 2009, pp. 426-429.

[8] Maragatham G and Lakshmi M, "A Recent Review on Association Rule Mining", Indian Journal of Computer Science and Engineering (ISSN: 0976-5166), Vol. 2, No. 6, Dec 2011-Jan 2012.

[9] Merceron, A. and Yacef, K., "Interestingness Measures for Association Rules in Educational Data", in the proceedings of the 1st International Conference on Educational Data Mining, 2008, pp. 1-10.

[10] Neelam Chaplot, Praveen Dhyani and O. P. Rishi, "Astrological Prediction for Profession Doctor using Classification Techniques of Artificial Intelligence", International Journal of Computer Applications (0975 – 8887), Volume 122, No. 15, July 2015.

[11] Aruna J. Chamatkar and Dr. P. K. Butey, "A Study on Association Rule Mining With Neural Based Framework", International Journal of Computer Engineering and Technology (IJCET), Volume 5, Issue 9, September (2014), pp. 172-181.

[12] Nilamadhab Mishra, "Art of Software Defect Association & Correction Using Association Rule Mining", International Journal of Computer Engineering and Technology (IJCET), Volume 1, Number 1, May - June (2010).

[13] M. Venkatesh and Dr. M. Krishnamurthy, "Mining Association Rules For High Utility Item sets Using Up Growth+Algorithm From Transactional Databases", International Journal of Computer Engineering and Technology (IJCET), Volume 5, Issue 3, March (2014), pp. 164-173.