International Journal of Scientific Research and Innovative Technology, ISSN: 2313-3759, Vol. 4 No. 8; August 2017

A Primer for Applying Big Data in Higher Education: A Case for King Faisal University College of Computer Sciences and Information Technology

Abdulrhman Haytham Uthman
King Faisal University CCSIT, Al Hassa, Kingdom of Saudi Arabia
[email protected]
Corresponding Author: [email protected]


1. Abstract

A paradigm of case-supported steps for starting to implement big data in higher education is proposed to help the College of Computer Sciences and Information Technology (CCSIT) of King Faisal University (KFU). Big data applications have been growing in marketing, government, security, and medicine. Although big data has already made its way into higher education, many universities and colleges are at a loss as to how to embrace the technology. KFU’s CCSIT is one of them. Although the college has recently leveled up with regard to international standards, its application of big data is in its infancy. This paper aims to give this college, and other higher education institutions, a framework for getting started with big data. With the prior understanding that this is not a one-day process and that embarking on big data takes maturity, the focus is on the initial steps to start the implementation by preparing the data that must be collected.

Keywords: big data, data mining, analytics, higher education

2. Introduction

Big data is one of the latest rock stars in information technology. And like other rock stars, it had been around for some time before it got the attention it has right now. Its name speaks for itself: it is so “big” that no one has yet given a precise definition of the concept. No institution can box it into one single statement, and everything that is available is a partial description of what big data is (Rocco, 2014). Its use has surged primarily in marketing, security, government, and hospitals. The application of big data in education has already begun. Several education-related studies on big data have been published, and the number keeps growing. But applying big data does not happen at the snap of a finger, by suddenly shifting to big data the next day. It is a long, continuous process that has to start somewhere and that repeats over and over.

This study hopes to serve as a primer for King Faisal University College of Computer Sciences and Information Technology as it starts preparing for big data application. Currently, students’ grades and course schedules are digitized. Registration, attendance, discussions, materials, assignments, classes, and some quizzes are already on the Blackboard online classroom software. But none of these data have been turned into a form that improves teaching and learning. With this in mind, some questions need to be answered before proceeding:

1. Is big data application relevant in higher education?
2. Are there parallelisms between the existing applications of big data and higher education?
3. Are stakeholders ready to share their data?
4. Are the data currently digitized enough to apply big data?
5. What is the ideal state of the college to start its application?
6. What should the college do to begin the process?


3. Big Data, Analytics and Data Mining Defined

Almost every institution and study offers its own definition of big data, but Gartner’s 3 “V”s are at the gist of every one of them. These 3 “V”s are volume, velocity, and variety (Sicular, 2013).

Volume refers to the massive amount of data available for storage and processing. Consider a scenario where your mother asks you to find, in a drawer, the receipt for a television that she bought on a certain date. Upon opening the drawer, you see hundreds of receipts inside. Wouldn’t it be a pain in the neck to look for that particular one? Now imagine that all the receipts of your neighbors are also there, then those of the whole city, the whole country, the whole world. That is a lifelike version of the cliché “finding a needle in a haystack,” and it is caused by the “volume” of data.

To solve this searching problem, “velocity” enters the scene. The power of present-day computing allows velocity, or speed, in searching for an item. Big data permits a search to be done at great speed, and the procedure may be comparable to browsing in Google or Yahoo.

But is fast searching its only use? No, because “variety” is the essence of big data. Beyond finding an exact item, like the one your mother wants to locate, big data gives companies valuable information about her, such as what she prefers to buy, the periods when she spends more, and her attitude towards packaged sales. Knowing these things helps companies send your mother relevant information in the right place at the right time.

For a company, big data starts with the collection of all digitized data available inside and outside its premises, whether the data be on the web or just on its local computers. The data can be in text, sound, or video format. With the advent of the Internet of Things (IoT), data can also come from machines.

Because these raw data can be voluminous and diverse, they have to be cleaned in preparation for the questions that the stakeholders might ask. Answering those questions is the job of analytics. Big data can also provide relevant information that the stakeholders did not know could be extracted from it. That process is data mining.

Big data analytics refers to breaking down and re-grouping data to assess trends and to compare one segment with another. It visualizes data to show areas of development and relationships. For example, an owner can ask for a comparison of the sales of his stores by location, period, or product. He can ask about the performance of his employees with respect to defined factors such as age, home proximity, or benefits. Using predictive analytics, he can also ask which products should be made available in which stores at certain times and which products should be avoided (Junk, 2015).

Data mining provides details that a company did not realize beforehand that it wanted to know. It digs into the data for previously unrecognized patterns (Junk, 2015). It is widely known from marketing’s basket analysis of items that are usually purchased together, the results of which became the reason for the re-arrangement of grocery shelves and for the next-purchase suggestions in online stores.

If big data is useful because of analytics and data mining, which of its existing applications can serve as patterns for useful applications or outputs in higher education?
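The basket analysis mentioned above can be sketched in a few lines. This is only a minimal illustration, not a method taken from the cited sources: the baskets and item names are hypothetical, and the function `pair_rules` is a name introduced here. For every ordered pair of items it computes the support (the share of baskets containing both items) and the confidence (among baskets containing the first item, the share that also contain the second).

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; the item names are illustrative only.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk"},
]


def pair_rules(baskets):
    """Return {(a, b): (support, confidence)} for every rule "a -> b"."""
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many baskets contain each unordered pair of items.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = {}
    for (a, b), together in pair_counts.items():
        rules[(a, b)] = (together / n, together / item_counts[a])
        rules[(b, a)] = (together / n, together / item_counts[b])
    return rules


rules = pair_rules(baskets)
support, confidence = rules[("bread", "milk")]
print(f"bread -> milk: support={support:.2f}, confidence={confidence:.2f}")
```

The same counting would apply unchanged if each “basket” were the set of course outcomes a student struggled with, which is the parallel drawn later in the paper.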


4. What Big Data Can Offer in Higher Education

Table 1 summarizes selected popular applications of big data and their possible likeness when adapted to higher education. Each item in the table is discussed further in the succeeding paragraphs.

Table 1: Parallelism of Big Data Applications and Their Possible Likeness in Higher Education

| No. | Application | Possible Likeness in Higher Education |
|-----|-------------|---------------------------------------|
| 1 | Government’s power investigation and deceit recognition | Identify an idling student and find ways to help him |
| 2 | Medicine uses it for health check analytics and for suggesting possible diets or medicinal procedures depending on given variables | A suggestion of a study plan to a student using variables such as age, habit, learning style, and GPA |
| 3 | Traffic management | Class scheduling |
| 4 | Using association, a customer who buys product A will also buy product B | A student who has difficulty in course outcome one will also have difficulty in course outcome three |
| 5 | Using clustering, a customer who joins group 1 and group 2 | Grouping students as advanced or beginners to identify the type of materials they should work on |
| 6 | Using classification, a customer who chose an item and later cancels it for another item | A student who picks an answer in an exam and later cancels it for another one |
| 7 | Using prediction, a customer who is at least 20 years old, female, and fails to pay her credit card at least two times will not be loyal to a bank | A student who is at least 20 years old, female, and fails the same course at least two times will drop out of the college |

Application No. 1

IBM’s security consultant and former policeman, Shaun Hipgrave, narrated his experience in using big data and analytics to determine trouble spots in some towns and cities. He said that identifying so-called “trouble families” and becoming aware of absences from school can already trigger the police to start an investigation to prevent crimes (Ward, 2014).

Its parallelism in higher education can be the identification of a nonserious student who submits plagiarized assignments, code, or essays. Once recognized, he should be helped and made aware of the effects of his actions on himself.

Application No. 2

In an interview, Dr. Eric Schadt, founding director of the Icahn Institute for Genomics and Multiscale Biology at New York’s Mount Sinai Health System, said that by building better health profiles through big data, models could be generated for better diagnosis and treatment (Chilukuri, 2015). Likewise in education, by building better student profiles with information such as age, habit, learning style, and GPA, a model can prescribe a suggested study plan to a student.


Application No. 3

In the UK, a trillion sensors, the Internet of Things, and big data have helped build smart transport that effectively solved traffic problems (Mcclelland, 2016). In higher education, the same approach could be applied to the traffic around the first and last class periods, but a better parallelism at the outset is class scheduling, which also causes conflicts in classrooms.

Application No. 4

The well-known market basket analysis, which looks for frequent patterns or items that come together, maintains that if a customer buys product A, there is a good chance that he will also buy product B (Han & Kamber, 2006). Its parallelism in higher education can be about a course’s outcomes. For example, we can determine that a student who has difficulty in outcome 1 is likely to have difficulty in outcome 3 as well. Given the generated information, the system or a teacher can come up with a solution to help the student.

Application No. 5

To use clustering in education, students can be divided into two or more groups according to the level of knowledge of each group, so that the appropriate materials and study plan can be applied to each one.

Application No. 6

Using classification, where data are classified based on training sets and values, teachers can identify questions that were confusing or vague to students. It can also recognize whether a student is habitually disorganized and what causes this behavior (Han & Kamber, 2006).

Application No. 7

Using prediction, where models are created to predict unknown or missing values, teachers and administrators can predict whether a student is likely to drop out of the college (Han & Kamber, 2006).
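As a rough illustration of Application No. 5, the sketch below splits students into two groups with a one-dimensional k-means (k = 2) on a single score. It is an assumption-laden example rather than the paper’s method: the scores are made up, the function name `two_means` is introduced here, and a real system would likely cluster on more than one variable.

```python
def two_means(scores, iterations=20):
    """Split scores into a lower and a higher group with 1-D k-means (k=2).

    Assumes `scores` is a non-empty list of numbers.
    """
    low, high = min(scores), max(scores)  # initial centroids
    for _ in range(iterations):
        # Assign every score to the nearer centroid (ties go to the lower group).
        beginners = [s for s in scores if abs(s - low) <= abs(s - high)]
        advanced = [s for s in scores if abs(s - low) > abs(s - high)]
        if not advanced:  # all scores identical: a single group
            break
        new_low = sum(beginners) / len(beginners)
        new_high = sum(advanced) / len(advanced)
        if (new_low, new_high) == (low, high):  # centroids stopped moving
            break
        low, high = new_low, new_high
    return beginners, advanced


# Hypothetical pre-assessment scores out of 100.
scores = [35, 42, 48, 50, 81, 88, 93]
beginners, advanced = two_means(scores)
print("beginners:", beginners)
print("advanced:", advanced)
```

The two returned lists could then be mapped to the “beginners” and “advanced” groups of Table 1, each receiving its own materials.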


5. Stakeholder’s Stand

There are three stakeholders in the process: the students, the teachers, and the administrators (see Figure 1). However, most of the data will be coming from the students. How much are they willing to share their data?

Figure 1: Stakeholders of Big Data Application

If you ask the majority, they will answer with a big “NO.” And even if you tone it down to just sharing a little information such as your searches, many of them will still say “No.” Almost all search engines have a privacy mode and claim that by doing so, everything you do is with anonymity and will never be recorded. But a statement released by Symantec, maker of Norton Antivirus, revealed that some data searches might not be as private as what they claim. An example is the case of an elderly Georgia woman whose supposedly private searches were exposed in an AOL study (Boatman, 2017).

To verify the majority’s earlier response, I asked 48 students from different levels in our college the same two questions, and 42 or 92% and 39 or 81% respectively gave the popular “No” answer. I am talking about students in the College of Computer Sciences and Information Technology who spend half of their life in front of computers and still believe that their searches, readings, and gaming are attached with privacy.

Do they know that when they decided to connect to the internet, data are already being taken from them? When you buy or even just search something in souq.com, the site already records it as part of your interest and will use social media that you open to advertise those products or things that are related to them. When you watch a video on YouTube, this is already stored as your point of interest, and they will soon show videos that are related to those. For example, a football fan would later find that the videos suggested to him are those related to football or even exclusive about his teams.

It’s like going to a market and looking for items that interest you, and once you found one, you purchase it. In the digital world, just by looking (searching) for an item, an entire world of companies are already watching your next move, and that look is already important to them. For example, a search for an Adidas running shoes signifies that you like shoes, you like Adidas brand, and you’re a runner. This simple transaction will lead you to promotion related to shoes, Adidas, and running gears. And companies related to those three things would want to take a hand of your data. Most of the time, they will buy your data.


After the case for big data was explained to each of the 48 students, 40 or 83% were willing (said “Yes”) to share their transactional information in preparation for big data. The reality may not be what many hoped for, but once we decided to digitize our habits and connect to the internet, we became party to an agreement to share our data. Given that we are all sharing our clicks and choices on the web, what advantages and disadvantages does this bring? This study concentrates on the benefits and leaves the other half for another discussion.

6. The Current State of KFU CCSIT

KFU has been using Blackboard for more than half a decade now. Registration and grading systems have also been computerized. With so much data coming from every kind of repository, our generation now holds a vast amount of data loosely termed big data, and its application through analytics is creating a loud roar in almost every industry, whether in marketing, government, security, medicine, or education. So why not bring this big thing to our university and embrace its gains?

The current classroom setup, where one lecture fits all students, is a dogma of the past. The current generation uses technologies that were not available when the founders of education set up the ideal classroom setting. The challenge is now up to the incumbent administrators, with the support of all teachers and students. Those who are hesitant to change the status quo should be given further clarification.

Similar to most universities right now, CCSIT uses the traditional classroom setup where:

1. a student receives a copy of the syllabus on the first day of class;
2. the syllabus contains the description, goals, course outcomes, outlines, materials information, and grading system;
3. the same lecture is presented periodically;
4. the same set of assessments is given to the class cyclically;
5. a teacher evaluation is done at the end of the term; and
6. an indirect outcome assessment is asked of the students.

The setup leads to problems such as:

1. students who are fast learners or have prior knowledge of the course either get bored or may express disinterest in the class;
2. students who are slow learners, missed the lecture, or had difficulty understanding the last lesson could be left behind, which may trigger a domino effect on their future courses; and
3. a teacher may struggle to identify who achieved the outcomes based on the current mode of paper exams.

7. The Ideal State of KFU CCSIT

The ideal state of the college is explained below along three timelines: before registration, during the semester, and after the semester.

Before Registration

1. an assessment is done to qualify the level of knowledge of a student starting a course/subject;
2. at least two sections are available to group the students according to the result of the individual assessment in no. 1; and
3. an assessment is done to qualify the teacher who will handle the course.


During the Semester

There exists a software that:

1. allows a student to study at his own pace;
2. contains a number of support lectures and assessments;
3. can monitor the searches done to identify his concerns;
4. has suggestions (like ads) depending on his searches and the result of evaluations;
5. allows a teacher to monitor the progress of a student, the sites he visited, how long he was studying, etc.;
6. allows a teacher to suggest or create an action plan for the student to follow; and
7. allows a student to rate teacher’s performance.

After the Semester

1. a post assessment of the student is generated suggesting his achievement of the outcomes; and
2. a post assessment of the teacher is generated signifying his performance in the semester.

8. Suggested Initial Plan to Apply Big Data

Given all the discussions on big data, the ideal state mentioned in the section above is achievable. We can assess students’ learning and teachers’ performance by recording their data. Figure 2 shows the suggested steps.

Figure 2: Initial Steps in Big Data Application

Step 1: Digitize the Lectures

By digitizing the lectures, we don’t mean merely putting them in presentation file. Besides that, they have to be placed in a system where teachers can re-arrange and modify the contents based on his knowledge of the groupings of students.

Step 2: Digitize the Assessments

If exams are digitized, we can track how students answer questions when they change their answers. We can record and analyze why students answered one and then chose another. Individualized learning is the core, and so should be the exams. An exam that can lead a student to learn while taking it is no longer an assessment but a continuum of learning.


Step 3: Reinventing the System

The system is the core of the new learning algorithm. It should be able to monitor all the transactions made by its users.

Let’s say for example that a course/subject (see Figure 3) is periodically divided into four quarters (quarter 1, 2, 3, & 4) with each period having a set of lectures and assessments. In the example, outcome one can be fulfilled during quarters 1, 2 & 3; outcome two during quarters 1 and 2; and so on.

Figure 3: Present Course Model

Now if we are going to apply big data to ensure that students will be able to meet all the outcomes, where are we going to put the supporting lectures and assessments? It should be available all the time for students to access as shown in Figure 4.

The lectures are what’s going to appear as advertisements to students’ terminals whenever they’re online. And when the system is assured he has mastered the outcomes, he can move to the next step of progression.

After an exam, links will be sent by analytics to a student on suggested readings and tutorials to help him in the errors he made. This task would have been difficult for a teacher but would be easy for a student.

Figure 4: Proposed Course Model
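The progression rule described above, in which a student moves on only once the outcomes are mastered, might be sketched as follows. Everything in this block is an assumption made for illustration: the dictionary `OUTCOME_QUARTERS` holds only the two outcome-to-quarter mappings given in the example, and `next_step` is a hypothetical helper, not part of any existing system.

```python
# Hypothetical course model: the quarters in which each outcome can be fulfilled.
# Only these two mappings appear in the example; a real course would list more.
OUTCOME_QUARTERS = {
    "outcome 1": [1, 2, 3],
    "outcome 2": [1, 2],
}


def next_step(mastered, outcome_quarters=OUTCOME_QUARTERS):
    """Decide whether a student advances or should review supporting material.

    `mastered` is the set of outcomes the system considers mastered.
    Returns ("advance", []) when nothing is pending, otherwise
    ("review", [(outcome, quarters), ...]) listing where support material lives.
    """
    pending = [
        (outcome, quarters)
        for outcome, quarters in outcome_quarters.items()
        if outcome not in mastered
    ]
    if not pending:
        return ("advance", [])
    return ("review", pending)


print(next_step({"outcome 1"}))
print(next_step({"outcome 1", "outcome 2"}))
```

How the system decides that an outcome is “mastered” is left open here, since the paper does not specify a threshold.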


Behaviors that can be recorded in digitized lectures

Some of the actions that can be recorded to better assess a student are:

1. time spent by a student on every lecture (there can be many lectures for a specified topic);
2. usual time when he studies;
3. keywords he used in searching; and
4. ads or system-generated suggestions that he clicked or opened.

Behaviors that can be recorded in digitized assessments

Some of the actions that can be recorded to better assess a student are:

1. time that a student spent answering every question;
2. questions where a student changed his answer; and
3. whether he is answering or re-reading.

There are two sorts of things that CCSIT will get from the system: (1) the individual student’s behavior and (2) the general behavior of students.

For the individual student’s behavior, the system becomes a tutor and a friend to a student by knowing more about his study habits and how he grasps information, and thereby suggesting lectures that can help him understand the lessons better. Some possible information that can be derived through analytics or data mining:

1. If the student answered the question fast and correctly, then the question is easy for him; otherwise, the question is hard for him.
2. If the student changed his answer to a question several times, then the question confused him.

The general behavior of students can be the basis of a model used to classify and predict scenarios for future students. Some possible information that can be derived through analytics or data mining:

1. identifying easy and hard questions for a student and for the group, and why they are so;
2. identifying a future student who may experience difficulty; and
3. understanding why students are underperforming.

What specific behaviors to record

All student transactions should be logged. Unlike records in traditional databases, which can be modified and deleted, big data logs are append-only: entries are never edited or deleted, only added. Some of the behaviors that can be recorded are:

1. click to choose an answer;
2. click to change the current answer and select a new one;
3. time spent on a question;
4. time allotted to every part of the lecture;
5. frequency of browsing the same instruction;
6. search keywords; and
7. dependence on ads shown.
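The logging idea and the two per-student rules above can be sketched together. This is a minimal illustration under stated assumptions, not the paper’s implementation: the `Event` fields, the `EventLog` class, and the thresholds `FAST_SECONDS` and `MANY_CHANGES` are all hypothetical choices, since the text does not say how fast is “fast” or how many changes count as “several.”

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)  # frozen: a recorded event cannot be modified afterwards
class Event:
    student: str
    question: str
    action: str     # "select" (first choice) or "change" (replaces an answer)
    answer: str
    seconds: float  # seconds elapsed since the question was opened


class EventLog:
    """Append-only log: events can be added and read, never edited or deleted."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def events(self) -> Tuple[Event, ...]:
        return tuple(self._events)  # read-only snapshot


FAST_SECONDS = 30.0  # assumed cut-off for a "fast" answer
MANY_CHANGES = 2     # assumed meaning of "several" changes


def label_question(log: EventLog, student: str, question: str, correct: str) -> str:
    """Label a question for one student as easy, hard, or confusing."""
    rows = [e for e in log.events()
            if e.student == student and e.question == question]
    if not rows:
        return "no data"
    changes = sum(1 for e in rows if e.action == "change")
    if changes >= MANY_CHANGES:  # rule 2, checked first by design choice
        return "confusing"
    last = rows[-1]              # the final answer and when it was given
    if last.answer == correct and last.seconds <= FAST_SECONDS:
        return "easy"            # rule 1: fast and correct
    return "hard"                # rule 1: otherwise


log = EventLog()
log.append(Event("s1", "q1", "select", "B", 12.0))
log.append(Event("s1", "q2", "select", "A", 20.0))
log.append(Event("s1", "q2", "change", "C", 40.0))
log.append(Event("s1", "q2", "change", "A", 65.0))
print(label_question(log, "s1", "q1", correct="B"))
print(label_question(log, "s1", "q2", correct="A"))
```

Checking the “confusing” rule before the “easy/hard” rule is a design choice made here so that a question answered correctly after several changes is still flagged; the paper lists the two rules without ordering them.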


9. Conclusion

Big data can help understand a student’s behavior and can act as an expert that predicts the conditions of future students. Imagine if CCSIT could find patterns in the learning behavior of its students and see them recur in other students. Imagine if CCSIT could distinguish a student’s difficulty and suggest a plan to him. Imagine if engines could tell CCSIT the students’ interests, what they searched for after an assignment was given, or how students get information from a question. Imagine if the ads on students’ computers were those relevant to their current courses in the college. Imagine if CCSIT could identify the factors that separate achievers from non-achievers. Is achievement affected by time devoted to studying, age, or sex?

By venturing into big data, none of these need to be imagined anymore. But if colleges choose not to be part of big data, they miss the chance to find out whether their students will succeed, their college will grow, or their lessons are effective.

CCSIT should start digitizing a plethora of lectures and assessments for students and reinvent its system to apply big data. Furthermore, teachers must be trained in big data theory and practice. CCSIT should soon say goodbye to the traditional classroom, blackboards, and textbooks and say “hello” to big data application.

ACKNOWLEDGMENT

The author thanks King Faisal University College of Computer Sciences and Information Technology for the opportunity to pursue research even as a student.

REFERENCES

Boatman, K. (2017). Are Your Internet Searches Really Anonymous? Norton by Symantec.
Chilukuri, S. (2015). Interview with Dr. Eric Schadt. McKinsey and Company Pharmaceuticals and Medical Products.
Han, J. & Kamber, M. (2006). Data Mining: Concepts and Techniques, second edition. Morgan Kaufmann Publishers.
Junk, D. (2015). Business Intelligence vs Analytics vs Big Data vs Data Mining. Aptera.
Mcclelland, J. (2016). How big data is now busting city traffic jams. Raconteur.
Rocco, F. (2014). Big Data at School. Interview with Kenneth Cukier. The Economist.
Sicular, S. (2013). Gartner's Big Data Definition Consists of Three Parts, Not to Be Confused with Three Vs. Gartner Inc.
Ward, M. (2014). Crime fighting with big data weapons. BBC News.