http://www.iaeme.com/IJCET/index.asp
International Journal of Computer Engineering & Technology (IJCET) Volume 8, Issue 3, May-June 2017, pp. 76–81, Article ID: IJCET_08_03_008
Available online at
http://www.iaeme.com/ijcet/issues.asp?JType=IJCET&VType=8&IType=3
Journal Impact Factor (2016): 9.3590 (Calculated by GISI) www.jifactor.com ISSN Print: 0976-6367 and ISSN Online: 0976-6375
© IAEME Publication
ASSOCIATION RULE MINING BASED
ANALYSIS ON HOROSCOPE DATA
– A PERSPECTIVE STUDY
Rahul Shajan
Research Scholar, Mahatma Gandhi University, Kottayam, Kerala, India
Gladston Raj S
Head, Department of CS, Govt. College, Nedumangadu,
Thiruvananthapuram, Kerala, India
ABSTRACT
Extracting information efficiently from huge data repositories is an art.
Extracting similar patterns from a number of transactions is an important
task in data mining, and many data mining techniques exist for it.
Here, we adopt the Apriori algorithm, one of the most popular algorithms
for association rule mining. The first goal of association rule mining is to find all
sets of items that have at least the minimum support specified by the user. These sets are
called frequent item sets, and the association rules are generated from
them. In astrology, the main database used for prediction is the
horoscope of a person. The horoscope is a database that contains data such as the planetary
positions at the time of birth of an individual. For any incident related to a
person, the horoscope of that person is consulted to assess the chance of
good or bad outcomes related to that incident. In some situations we also need to
find horoscopes that are similar to a given horoscope. For example, believers in astrology select a life
partner who has a similar horoscope (horoscope matching). In such situations,
pattern recognition techniques can be used to find similar horoscopes. In
this work, we study whether the existing association rule mining algorithm
Apriori is sufficient to find similar patterns in the horoscopes of different
individuals.
Key words: Association rule mining, Apriori algorithm, Pattern recognition,
Horoscope.
Cite this Article: Rahul Shajan and Gladston Raj S, Association Rule Mining Based
Analysis on Horoscope Data – A Perspective Study. International Journal of
Computer Engineering & Technology, 8(3), 2017, pp. 76–81.
http://www.iaeme.com/ijcet/issues.asp?JType=IJCET&VType=8&IType=3
1. INTRODUCTION
1.1. Basic Concepts
Data mining is the exploration and analysis of large data sets in order to discover meaningful
patterns and rules. The key idea is to find effective ways to combine the computer’s power to
process data with the human eye’s ability to detect patterns [1]. Put simply, data mining is the
process of retrieving user-requested information from a huge and complex data repository. Before
information is extracted, the data source passes through stages such as data
cleaning, integration and pre-processing. These tasks help to remove errors and
ensure consistency. There are two types of mining. The first is descriptive mining, which
characterises the general properties of the data in the database. The second is predictive
mining, in which the current data is used to make predictions. Data mining
techniques are widely used to find similar patterns in large data sets. The scope of pattern
recognition was first identified in business enterprises, which are beginning to
realize that information on customers and their buying patterns is among their most valuable information.
Association rule mining is one of the popular tools for pattern recognition.
1.2. Association Rule Mining
The problem of finding associations in data was formulated in 1993 by Agrawal et al.
and is often referred to as the market-basket problem [2]. In this problem there is a set of
items and a large number of transactions, each of which is a subset of these items.
The purpose is to find relations between the different items within these transactions, which are
also called baskets. Association rule mining is the process of finding
interesting relations among the items. It is a two-step process: finding the frequent item sets is
the first step, and generating the association rules is the second.
Association rules may be of different types, such as Boolean and quantitative.
Boolean rules show associations between the presence and absence of items.
Quantitative association rules show associations between quantitative items [3].
For a given transaction database T, an association rule is an expression of the form X=>Y,
where X and Y are subsets of A (the set of items). The rule X=>Y holds with confidence ‘c’ if c%
of the transactions in T that support X also support Y. The rule X=>Y has support ‘s’ in the
transaction set T if s% of the transactions in T support X ∪ Y [1].
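The two measures defined above can be illustrated with a minimal Python sketch. The function names and the toy shop transactions are hypothetical and chosen only for illustration; support and confidence are returned as fractions rather than percentages.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(X ∪ Y) / support(X); returns 0.0 when X never occurs."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    support_x = support(x, transactions)
    if support_x == 0:
        return 0.0
    return support(xy, transactions) / support_x


# Toy transaction database T (hypothetical data, not from the paper).
T = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support({"bread", "milk"}, T)          # 2 of 4 transactions
rule_confidence = confidence({"bread"}, {"milk"}, T)  # 2 of the 3 with bread
```

Here the rule bread=>milk has support 0.5 and confidence 2/3, because two of the three transactions that contain bread also contain milk.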
1.3. Apriori Algorithm
Apriori is a popular algorithm, also known as the level-wise algorithm [1], for finding the frequent
item sets in a set of transactions. From these frequent item sets the algorithm generates
the association rules. A frequent item set must have a support that
equals or exceeds the minimum support specified by the user, and the association rules must satisfy
the minimum confidence. The algorithm relies on the downward closure property [4]: every
subset of a frequent item set is itself frequent, so a candidate with an infrequent subset can be
discarded without counting it. Rules with low support are uninteresting and should be avoided, so the
minimum support value is the key factor in filtering out uninteresting rules. The minimum confidence
value maintains the reliability of the rules generated. In association rule mining both the
minimum support and the minimum confidence are important. In this work, we try to
generate association rules by finding similar patterns in the horoscope data of
different individuals.
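The level-wise search can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function name, the item labels and the four toy transactions are hypothetical, and support is expressed as a fraction.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every item set meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def supp(candidate):
        return sum(1 for t in transactions if candidate <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        candidate = frozenset([item])
        s = supp(candidate)
        if s >= min_support:
            current[candidate] = s

    frequent = dict(current)
    k = 2
    while current:
        previous = list(current)
        # Join step: unite pairs of frequent (k-1)-sets into k-sets.
        candidates = {a | b for a in previous for b in previous
                      if len(a | b) == k}
        # Prune step (downward closure): every (k-1)-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        # Count step: keep the candidates that reach the minimum support.
        current = {}
        for c in candidates:
            s = supp(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical transactions; labels such as "Ra3" are illustrative only.
toy = [
    {"Ra3", "Ke9", "Sa5"},
    {"Ra3", "Ke9", "Ju1"},
    {"Ra3", "Ke9", "Sa5"},
    {"Ra4", "Ke10", "Sa5"},
]
frequent_sets = apriori_frequent_itemsets(toy, min_support=0.5)
```

With a minimum support of 0.5 this toy input yields three frequent single items, three frequent pairs and one frequent triple; items such as "Ju1" are dropped at the first level and never appear in later candidates.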
1.4. Horoscope
A horoscope is the birth chart of an individual. A birth chart shows the positions of ten
planets (astrology uses the term planet in the sense of influencing factors in the macrocosm)
in the universe at the time of birth. The ten planets considered in the horoscope are
Sun, Moon, Mars, Mercury, Jupiter, Venus, Saturn, Rahu, Kethu (Rahu and Kethu are two
specific points in the universe) and Gulikan. Besides these planets, astrologers
determine a specific attribute known as ‘Lagnam’. It is calculated from the birth time
and the sunrise time of the day on which the individual was born. A birth chart is a 360-degree
chart divided into twelve houses of 30 degrees each: Aries, Taurus, Gemini,
Cancer, Leo, Virgo, Libra, Scorpio, Sagittarius, Capricorn, Aquarius and Pisces.
Each house may hold one or more planets, and some houses may be empty. In
this work, following common association rule mining practice, the planets in the different
houses are treated as the items and the individuals as the transactions.
2. LITERATURE SURVEY
Abhijit Raorane et al. studied the advantages and disadvantages of data mining techniques.
The focus there is to understand consumer behaviour, the consumer's psychological condition at the time
of purchase, and how suitable data mining methods can be applied to improve conventional methods
[5].
Darshan M. Tank made a detailed study of association rule mining and the Apriori
algorithm. He identifies two bottlenecks of frequent item set mining: the large number of
candidate item sets and the poor efficiency of counting their support. He proposes an algorithm
that reduces the pruning operations on candidate item sets, thereby saving time and
increasing efficiency [4].
Neelam Chaplot et al. studied the influence of the positions of planets and stars at the time of
a person's birth in predicting the possibility of that person becoming a doctor [10].
3. METHODOLOGY
To implement the Apriori algorithm and generate association rules, there
must be an item set and a number of transactions. For example, in a sales review of a shop
the items sold in that shop are the elements of the item set, and the different sales
transactions form the transaction set. Each transaction is a subset of the item set, and
the same item may appear in several transactions. In a horoscope
we consider eleven planets, including Lagnam. Here, we consider the planets and their
positions in the different houses. Each planet in the horoscope occupies exactly one of
the twelve houses; which one is determined by the astrological rules. So each planet has twelve
possible positional values. We assign a numerical value to each house.
Table 1 Different houses in the horoscope and their corresponding numerical values.
House         Corresponding numerical value
Aries         1
Taurus        2
Gemini        3
Cancer        4
Leo           5
Virgo         6
Libra         7
Scorpio       8
Sagittarius   9
Capricorn     10
Aquarius      11
Pisces        12
If ‘X’ is the name of a planet, then it has twelve possible values depending on the house
it occupies: X1, X2, X3, ..., X12.
The value X3 means that planet X occupies the third house, Gemini. The
other planets likewise have these twelve possibilities, so a total of 12*12=144 possible values
exist in a horoscope. We treat these 144 positional values as the items. In the horoscope data
analysis by the Apriori algorithm, the item set is therefore fixed and its size is always 144.
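The positional encoding can be sketched in Python as below. The house numbers follow Table 1. The two-letter planet keys are assumptions made for illustration (the database description in this section uses "Mo" for Moon; the other keys are not given in the paper), and the function names are hypothetical.

```python
# House numbers as listed in Table 1.
HOUSES = ["Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo", "Libra",
          "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces"]
HOUSE_NUMBER = {name: number for number, name in enumerate(HOUSES, start=1)}

# Assumed two-letter keys; only "Mo" is taken from the paper.
PLANET_KEYS = {
    "Sun": "Su", "Moon": "Mo", "Mars": "Ma", "Mercury": "Me",
    "Jupiter": "Ju", "Venus": "Ve", "Saturn": "Sa", "Rahu": "Ra",
    "Kethu": "Ke", "Gulikan": "Gu", "Lagnam": "La",
}


def encode_item(planet, house):
    """Combine the planet key with the house number, e.g. Moon in Leo -> 'Mo5'."""
    return f"{PLANET_KEYS[planet]}{HOUSE_NUMBER[house]}"


def encode_chart(chart):
    """Turn one birth chart {planet: house} into a transaction (a set of items)."""
    return frozenset(encode_item(planet, house) for planet, house in chart.items())


# The item universe: every attribute paired with every house number.
ALL_ITEMS = [f"{key}{number}"
             for key in PLANET_KEYS.values()
             for number in range(1, 13)]

example = encode_chart({"Moon": "Leo", "Jupiter": "Gemini"})
```

Each individual then becomes one transaction containing exactly one item per attribute. With the eleven attributes named in this paper, the universe built here has 11 × 12 = 132 items; the figure of 144 stated above would correspond to twelve attributes.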
Based on this concept, we prepared a database by collecting the name, birth time,
birth date, birth place and average mark of students in the tenth and plus two classes. From
the birth details we generate the birth chart, and the horoscope details are stored
in a database as follows.
Figure 1 Database structure.
As the figure shows, the attributes kept in our database are the person id (pid),
the planets' positional values, formed by combining a keyword for the planet name with the numerical
value of the house (e.g. Mo5 means Moon in House 5, that is, Moon in Leo), and the
average mark. The average mark is included in the database to analyse the educational
performance of the students. In astrology, the educational performance of an individual is
assessed mainly from the planets Jupiter and Mercury. The “Bhavam” also plays a major role in
astrological prediction. In a horoscope the Lagnam is always considered Bhavam 1,
so the house immediately after the Lagnam is treated as Bhavam 2, and the
twelve Bhavams are counted in the same way. Bhavams 2, 5 and 10 are also considered for the analysis in the context of
education. Here we find similar patterns and association rules in the given
database and compare the mined rules with traditional astrological rules. Through this analysis we
can find out whether students in the same range of marks have any similar patterns in their
horoscope data.
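The Bhavam numbering described above can be expressed as simple modular arithmetic. The sketch below assumes that "the house immediately after the Lagnam" means the next house in the numerical order of Table 1, wrapping from Pisces (12) back to Aries (1); the function name and the example chart are hypothetical.

```python
def bhavam(house_number, lagnam_house):
    """Bhavam of a house, counted from the Lagnam house (which is Bhavam 1).

    Both arguments are house numbers from Table 1 (1 = Aries ... 12 = Pisces).
    Assumes Bhavams follow the numerical house order and wrap after Pisces.
    """
    for value in (house_number, lagnam_house):
        if not 1 <= value <= 12:
            raise ValueError("house numbers must be between 1 and 12")
    return (house_number - lagnam_house) % 12 + 1


# Hypothetical chart: Lagnam in Aquarius (11), Jupiter in Aries (1).
lagnam_house = 11
jupiter_bhavam = bhavam(1, lagnam_house)  # Aquarius=1, Pisces=2, Aries=3
```

A planet in the same house as the Lagnam is therefore in Bhavam 1, and the house just before the Lagnam is Bhavam 12.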
4. EXPERIMENTAL ANALYSIS
We use the popular data mining application WEKA (Waikato Environment for Knowledge
Analysis) for the analysis. Initially the collected data undergoes basic pre-processing
such as data cleaning, integration and transformation. The database file is kept as a
CSV file, which is loaded into the WEKA application. The Apriori algorithm is then
selected and executed; it finds the frequent item sets and generates the association rules
that meet the minimum confidence.
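The rule generation step can be illustrated independently of WEKA with a short Python sketch. It is not WEKA's implementation: the function name, the item labels and the support values in the example are hypothetical, and the input is assumed to map every frequent item set (including all of its subsets) to its support as a fraction.

```python
from itertools import combinations


def generate_rules(frequent, min_confidence):
    """Derive rules X => Y from a {frozenset: support} mapping.

    Returns (X, Y, support, confidence) tuples, best confidence first.
    """
    rules = []
    for itemset, itemset_support in frequent.items():
        if len(itemset) < 2:
            continue
        # Every non-empty proper subset X is tried as an antecedent.
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                x = frozenset(lhs)
                y = itemset - x
                conf = itemset_support / frequent[x]
                if conf >= min_confidence:
                    rules.append((x, y, itemset_support, conf))
    rules.sort(key=lambda rule: (-rule[3], -rule[2]))
    return rules


# Hypothetical frequent item sets with their supports.
frequent_example = {
    frozenset({"Ra3"}): 0.75,
    frozenset({"Ke9"}): 0.75,
    frozenset({"Sa5"}): 0.75,
    frozenset({"Ra3", "Ke9"}): 0.75,
    frozenset({"Ra3", "Sa5"}): 0.5,
    frozenset({"Ke9", "Sa5"}): 0.5,
    frozenset({"Ra3", "Ke9", "Sa5"}): 0.5,
}
best_rules = generate_rules(frequent_example, min_confidence=0.9)
```

With a minimum confidence of 0.9 only four rules survive in this example, all with confidence 1.0; rules such as Ra3 => Sa5 (confidence 2/3) are filtered out.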
5. RESULTS
The output window shows the minimum support and confidence values, the total
number of cycles performed, the sizes of the sets of large item sets and the best rules found. Here
the minimum support is 0.6 and the minimum confidence is 0.9. The best rules generated
by the execution are shown below.
6. CONCLUSION AND FUTURE WORK
In the execution of the algorithm, the first step is to count the occurrences of
each item to find the frequent item sets. L(k-1), the set of frequent item sets found in the
(k-1)th pass, is used to
generate the candidate item sets Ck, and then the support of each candidate in Ck is counted. In our
problem L1, L2 and L3 were generated, and we obtained 10 association rules that meet the
minimum confidence. Several rules show the connection between Rahu and
Kethu and among Saturn, Rahu and Kethu, but only one rule specifies a relation between
Jupiter and some other planets. As noted earlier, in the context of education and
knowledge the key factors are Jupiter and Mercury, yet there is no rule involving Mercury.
The Bhavam-oriented relations are also neglected. This is because the Apriori algorithm
directly counts the items in the different transactions and calculates the support and
confidence; the presence of the same item in different transactions is all that is counted.
In astrology, by contrast, counting is performed by checking the presence of the same positional values of
planets in different individuals' horoscopes.
If we can apply some conditions in the algorithm just before the generation of the frequent
item sets and candidate item sets, we can obtain more association rules that help to analyse the
horoscope of an individual. For that we need to upgrade the Apriori algorithm to a
Conditional Apriori Algorithm (CAA). We conclude that the existing Apriori algorithm is
not fully capable of analysing horoscope data in the context of astrology, and that a new
upgraded algorithm may perform more effectively in horoscope data analysis. This type of
horoscope analysis and pattern recognition will help to identify individuals with a specific
type of birth chart. This work has many practical applications, such as
identifying a life partner with a matching horoscope from the huge data repository of a
marriage bureau.
REFERENCES
[1] Arun K Pujari, “Data mining techniques”, Hyderabad, India, Universities Press Private
Ltd, ISBN 978-81-7371-380-4.
[2] Aggrawal Charu, and Yu Philip, “Mining large item sets for association rules”, Bulletin of
the IEEE Computer Society Technical Committee on Data Engineering, no.1, March
1998.
[3] Smitha T.V. Sundaram, “Comparative Study of Data Mining Algorithms For High
Dimensional Data Analysis”, International Journal of Advances in Engineering &
Technology, Vol. 4, Issue 2, Sept 2012.ISSN: 2231-1963.
[4] Darshan M. Tank, “Improved Apriori Algorithm for Mining Association Rules”,
International Journal of Information Technology and Computer Science, 2014, 07,
15-23, June 2014.
[5] Abhijit Raorane & R.V. Kulkarni, “Data Mining Techniques: A Source for Consumer
Behaviour Analysis”, International Journal of Database Management Systems (IJDMS), Vol. 3, No. 3, August 2011.
[6] Pragya Agarwal, Madan Lal Yadav, Nupur Anand, “Study on Apriori Algorithm and its
Application in Grocery Store”, International Journal of Computer Applications (0975 –
8887) Volume 74 – No.14, July 2013.
[7] Lin, H., GouminZ., Liu, Q., “Application of Apriori Algorithm to Data Mining of the
Wildfire”, In the proceeding of 6th International Conference on Fuzzy Systems and
Knowledge Discovery, 2009, pp.426-429.
[8] Maragatham G and Lakshmi M, “A Recent Review On Association Rule Mining”, Indian
Journal of Computer Science and Engineering (ISSN: 0976-5166), Vol. 2 No. 6, Dec 2011-Jan 2012.
[9] Merceron, A. and Yacef, K., “Interestingness Measures for Association Rules in
Educational Data”, In the proceeding of 1st International Conference on Educational Data
Mining, 2008, pp. 1-10.
[10] Neelam Chaplot, Praveen Dhyani and O. P. Rishi, “Astrological Prediction for Profession
Doctor using Classification Techniques of Artificial Intelligence”, International Journal of
Computer Applications (0975 – 8887) Volume 122 – No.15, July 2015.
[11] Aruna J. Chamatkar, Dr. P.K. Butey. A Study on Association Rule Mining With Neural
Based Framework. International Journal of Computer Engineering and Technology
(IJCET), Volume 5, Issue 9, September (2014), pp. 172-181
[12] Nilamadhab Mishra, Art of Software Defect Association & Correction Using Association
Rule Mining. International Journal of Computer Engineering and Technology (IJCET),
Volume 1, Number 1, May - June (2010)
[13] M. Venkatesh, Dr. M. Krishnamurthy, Mining Association Rules For High Utility Item
sets Using Up Growth+Algorithm From Transactional Databases. International Journal of
Computer Engineering and Technology (IJCET), Volume 5, Issue 3, March (2014), pp.
164-173