Evaluation and Measurement


Evaluation and Measurement: What to Measure

Your opencourseware evaluation activities will most likely have two main components. The first looks at the outcomes of the program and is mostly externally focused. The second will be more internally focused and looks at the effectiveness and efficiency of the program.

While every opencourseware program will have different goals, a useful generalized model to consider looks at the outcomes through three distinct lenses:

Access: Who is accessing your opencourseware Web site, what are their profiles (educator, student, self-learner, other), what are their areas of study and interest, where are they located, how are they accessing your Web site and how did they find out about your opencourseware in the first place?

Use: How do these educators and learners use your opencourseware materials, and are your materials delivered appropriately to facilitate that use?  What subject areas are drawing the most interest? To what extent, and in what ways, are the course materials being adopted or adapted for teaching purposes?

Impact: What effects—positive or negative, intended or unintended—are being realized through the use of your opencourseware?

The following sections look in more detail at the Access, Use, and Impact components of the program evaluation. 

Measuring Access

Access-oriented measures can vary widely. The following is a representative set of access-oriented questions that you may wish to answer:

Who is accessing your opencourseware? How many at any given time? When and how often do individuals visit?

Where are users coming from? From what geographic regions and countries, and from which specific organizations and institutions?

What is the educational profile of the users? What is their role (educator, student, self-learner, or other), and where do they study?

What are the technical contexts through which people access opencourseware? What browsers and operating systems are they using; how fast is their connection to the Internet; are they accessing your Web site from home or work?

How well does your Web site's technical architecture perform in enabling people to access desired content and materials? Is the site responding fast enough under current user loads?

What is triggering awareness of and access to your opencourseware site? What prompted users to visit – an article in the media, a referral from a colleague, etc.? What Web site were they using prior to coming to your site? Did they use a search engine, and if so, what were they searching for?
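Many of these access questions can be answered, at least approximately, from standard Web server access logs even before a commercial analytics package is in place. The following is a minimal sketch in Python, assuming logs in the common "combined" format; the access.log file name, the regular expression, and the short list of search-engine domains are illustrative assumptions rather than part of any particular product.

import re
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Rough pattern for the "combined" log format (assumed for illustration).
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'\d+ \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

SEARCH_ENGINE_HINTS = ("google.", "yahoo.", "bing.", "baidu.")  # illustrative list

def classify_referrer(referrer):
    # Empty or "-" referrers usually mean a bookmark, typed URL, or e-mail link.
    if referrer in ("", "-"):
        return "direct / bookmark"
    host = urlparse(referrer).netloc.lower()
    if any(hint in host for hint in SEARCH_ENGINE_HINTS):
        return "search engine"
    return "referral: " + host

sources = Counter()
search_terms = Counter()

with open("access.log", encoding="utf-8", errors="replace") as log:
    for line in log:
        match = LOG_LINE.match(line)
        if not match:
            continue
        referrer = match.group("referrer")
        source = classify_referrer(referrer)
        sources[source] += 1
        if source == "search engine":
            # Many search engines pass the query term in the referrer's query string.
            params = parse_qs(urlparse(referrer).query)
            for key in ("q", "p", "query"):
                for term in params.get(key, []):
                    search_terms[term.lower()] += 1

print("Top traffic sources:", sources.most_common(10))
print("Top search terms:", search_terms.most_common(10))

The same log data can also support the visit-count, timing, and technical-context questions above by counting requests per day or summarizing the user-agent field.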

Measuring use

Some representative questions for evaluation of use include:

What are people attempting to accomplish by interacting with opencourseware? What do people expect from opencourseware?

How are people using opencourseware? What are the general patterns of online use and interaction? What subject areas and aspects of opencourseware draw the most/least interest and use? How do people use/reuse opencourseware content offline or outside of opencourseware?

How effective is the opencourseware site and content for users? How usable is the opencourseware site? How useful are the available course materials in supporting users in achieving their goals and completing their scenarios and tasks? Overall, how useful is opencourseware from the user's perspective?

Specific scenarios of use for your opencourseware materials will depend on a range of factors, primarily driven by the role of the user (educator, student, self-learner or other). The tables below describe typical scenarios of use that are likely to be encountered by your opencourseware program.

Typical Educator Scenarios

Curriculum development. Establishing or revising overall curriculum organization and content; establishing or improving course offerings within disciplines.

Course development. Planning, developing or improving a course; developing or enhancing methods and techniques for teaching particular content; establishing or revising course syllabi, calendars, etc.

Course delivery. Integrating new materials into an existing course; adding elements (demonstrations, problem sets, assignments, etc.) to a course or specific class.

Advising students. Providing feedback to students about courses of study and curriculum options.

Advancing research. Understanding current state of knowledge in a research subject area; connecting with colleagues with shared interests and research agendas.

Subject matter learning or reference. Exploring new areas or gaining new insights; understanding the current state of knowledge in an area of interest; connecting with academics who have similar interests; using opencourseware as a reference tool.

Educational technology development. Planning or developing an educational Web site or related technology initiative using opencourseware content.

Typical Student Scenarios

Subject matter learning in support of current studies. Gaining new and complementary insights and alternative study materials related to a subject currently being studied.

Subject matter learning in support of courses that are not available. Providing access to course materials that are not provided or otherwise available through the current program in which they are enrolled.

Personal interest subject matter learning or reference. Exploring new areas or gaining new insights; relearning or reviewing materials from previous educational interactions; using opencourseware as a reference tool.

Planning courses of study. Exploring the range of subject matter in a particular discipline; making personal decisions about academic path.

Advancing research. Understanding current state of knowledge in a research subject area; finding links to information related to a research topic.

Typical Self-Learner Scenarios

Personal interest or professional related subject matter learning or reference. Exploring new areas or gaining new insights; re-learning or reviewing materials from previous educational interactions; using opencourseware as a reference tool.

Subject matter learning in lieu of courses that are not available. Providing access to course materials that are otherwise not available to the learner.

Planning future courses of study. Exploring the range of subject matter in a particular discipline; making personal decisions about academic paths.

Educational technology development. Planning or developing an educational web site or related technology initiative using opencourseware content.

Your opencourseware Web site should be designed to support these or other scenarios you may identify.  The evaluation should measure the relative importance of these different scenarios to users and the degree to which the site is successful in supporting them.  The evaluation may also identify other unanticipated scenarios.

Measuring Impact

Once people access and use opencourseware, the question becomes: what difference does it make? The heart of any program evaluation is to understand and measure the impact the program has on its audiences. You should attempt to understand how individuals' teaching and learning experiences change (if at all) through the use of the site. Representative questions for evaluation of impact include:

What is the impact of OCW on individual teachers, students and learners for the various scenarios of use?


What is the impact of OCW on your institution's reputation and brand? How could that impact be increased?

Internal Focus

For internal purposes, a portion of your evaluation program can focus on how effectively and efficiently you work in accomplishing your goals.  One approach is to look at five key dimensions: organization, publication, technology, communications, and planning.  In each of these dimensions, you can then measure and assess based on two broad criteria:

Effectiveness (quality).  What is the quality of the processes, content, and organization?  For example, how easy is it for faculty to contribute their materials and how satisfied are they with the experience?

Efficiency (cost).  How much does it cost to operate, and are there ways to reduce costs while maintaining quality?

The following table shows some representative measurements to consider when planning internally focused process evaluation for an opencourseware initiative.

Each data source is assessed against two questions. Efficiency: What resources does it take to deliver opencourseware materials and services? Effectiveness: What is the quality of these outputs, and how satisfied are faculty contributors?

Financial reports
Efficiency: Costs, including costs by function, expense type, and time period.
Effectiveness: N/A

Level of effort tracking data
Efficiency: Level of effort by process step within course, within discipline, and overall; non-labor costs.
Effectiveness: Volume of edits/rework; effectiveness of systems and procedures.

Intellectual Property (IP) operations tracking database
Efficiency: Level of effort by IP object; costs for replacement objects.
Effectiveness: Volume of IP objects; permission process statistics; IP object replacement statistics.

Content audit
Efficiency: N/A
Effectiveness: Richness of opencourseware content by course (numbers and types of sections [course elements]).

Faculty survey
Efficiency: Time required; number of interactions between faculty and opencourseware.
Effectiveness: Time spent; quality of interaction with opencourseware (faculty liaisons, department liaisons); satisfaction with design and quality of published course.
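To make the efficiency measures concrete, the sketch below rolls per-course level-of-effort records up into an estimated cost per published course. This is a minimal sketch in Python; the effort_log.csv file name, its column names, and the blended hourly rate are illustrative assumptions, not a prescribed format.

import csv
from collections import defaultdict

HOURLY_RATE = 40.0  # assumed blended labor rate, for illustration only

hours_by_course = defaultdict(float)
non_labor_by_course = defaultdict(float)

# Assumed columns: course, process_step, hours, non_labor_cost
with open("effort_log.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        course = row["course"]
        hours_by_course[course] += float(row["hours"])
        non_labor_by_course[course] += float(row["non_labor_cost"])

for course in sorted(hours_by_course):
    labor_cost = hours_by_course[course] * HOURLY_RATE
    total_cost = labor_cost + non_labor_by_course[course]
    print(f"{course}: {hours_by_course[course]:.1f} hours, "
          f"estimated total cost ${total_cost:,.0f}")

The same records, grouped by process step or by discipline instead of by course, support the other efficiency comparisons described above.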

Evaluation and Measurement: Tools to Use

In general, the most thorough and comprehensive evaluation approach for a Web-based initiative is to adopt an integrated “portfolio approach” in which you use a variety of evaluation methods, including ongoing user feedback collection, traditional surveys and interviews, online intercept surveys, and Web analytics.  The combination of these methods helps you achieve both breadth and depth in the evaluation.  Each of the methods offers different reach and level of detail, but each also has its own cost and level of implementation complexity associated with it.

You should determine which of these methods makes most sense for your program based on scale, funding and other considerations. The remainder of this section briefly outlines some available evaluation techniques.

Web analytics

Web analytics refers to the direct monitoring and analysis of online user behavior and interactions with your opencourseware Web site.

A large number of commercial Web analytics products exist today, offered on an ASP or self-hosted basis. Some of the products available include Akamai Sitewise, CoreMetrics, NetGenesis, NetIQ WebTrends, Omniture SiteCatalyst and WebSideStory HitBox. These powerful toolkits allow you to capture and analyze Web site traffic volumes and to track visitor behavior in a completely anonymous manner.

Typical features include capturing and reporting of navigation paths through your Web site, referral analysis (where users came from, how they navigate through the site, what content they view etc.), and geographic trends analysis (origin of user access).

Note that these tools will typically require some technical integration work to be able to operate with your Web site.
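Most analytics products can also export raw or summarized data, which lets you build simple supplementary reports of your own. The sketch below assumes a hypothetical CSV export with date, country, and visits columns and computes the monthly geographic distribution of traffic; the file name and column names are assumptions for illustration, not features of the specific products named above.

import csv
from collections import defaultdict

visits = defaultdict(int)  # (month, country) -> visit count

with open("analytics_export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        month = row["date"][:7]  # "YYYY-MM" from a "YYYY-MM-DD" date
        visits[(month, row["country"])] += int(row["visits"])

# Share of traffic by country for each month, to spot geographic shifts over time.
months = sorted({month for month, _ in visits})
for month in months:
    month_total = sum(count for (m, _), count in visits.items() if m == month)
    top_countries = sorted(
        ((count, country) for (m, country), count in visits.items() if m == month),
        reverse=True,
    )[:5]
    print(month, [(country, f"{count / month_total:.1%}")
                  for count, country in top_countries])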

Online intercept surveys

To gather data about your opencourseware users, their intentions, goals, and scenarios of use, you can employ online intercept surveys using tools from companies such as Advanced Online Surveys, Netraker, RaoSoft EZSurvey and Zoomerang. These tools complement the Web analytics tools by providing data that enables a more refined analysis of users’ (self-reported) intentions, goals, and scenarios of use on the site and in some cases, when combined with the Web analytics data, comparisons with overall measured behavior.

Online intercept tools prompt a random sample of people accessing your Web site and invite them (via a pop-up window) to complete an online survey. A small amount of technical integration work may be required in some cases. The survey, which can be customized to your program’s needs, can collect anonymous "profile" information to help characterize the people who are accessing your site as well as the ways they are using the opencourseware materials, both online and offline. 


The sample composition will be determined solely by the array of people who access your Web site and choose to respond.  You will need to determine the appropriate number of surveys to collect and analyze in order to ensure adequate validity, reliability, and confidence levels. In particular, the sample size should be large enough to enable comparisons across key independent variables (e.g., role [educator, student, self-learner], geographic region, etc.). 
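A common starting point for sizing the sample is the margin-of-error formula for a proportion, n = z^2 * p * (1 - p) / e^2, applied separately to each subgroup you intend to compare. The sketch below applies that formula under the usual simple-random-sampling assumption, which intercept samples only approximate; the subgroup names and targets are illustrative.

import math

Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def sample_size(margin_of_error, confidence=0.95, proportion=0.5):
    # p = 0.5 is the conservative (largest-sample) choice when the true
    # proportion is unknown.
    z = Z_SCORES[confidence]
    return math.ceil(z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2)

# Size each comparison group separately, e.g. by user role.
for group in ("educators", "students", "self-learners"):
    needed = sample_size(margin_of_error=0.05, confidence=0.95)
    print(f"{group}: at least {needed} completed surveys")
# Roughly 385 completed surveys per group for +/-5% at 95% confidence.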

Supplemental surveys

Where appropriate, supplemental online (or offline) surveys can be conducted to complement the intercept surveys.  These will often use a different mode of distribution, generally via e-mail invitation, to members of defined target groups who may or may not have previous experience with your site.  Supplemental surveys can enable you to explore specific subjects in more depth and identify problems or barriers to use, whether technical, cultural, or for other reasons.

Interviews

Interviews can be conducted with very defined subsets of people in various target groups to gather the deepest understanding as a complement to the broader surveys and Web analytics methods.

Site feedback

It is a good practice to invite voluntary user feedback via a feedback form on your Web site or via simple e-mail.  While this feedback is anecdotal in nature, it can provide valuable insight into users' reactions to your opencourseware Web site as well as suggestions for improvements to the site. 


Evaluation and Measurement: Evaluation Cycles

Generally, evaluation and measurement of your program should be a permanent, ongoing activity. Even so, it is advisable to develop a high-level plan for implementing your evaluation strategy that determines when the various activities should be undertaken. Both the program and process evaluation components will be ongoing activities. The frequency and relative depth of study should be commensurate with the scale of the overall program, the program’s goals, and any requirements of funding organizations or other stakeholders.

For the program evaluation component, capture of Web analytics data (traffic and access measures) can be a continuous process. Collecting, responding to, and periodically analyzing ad-hoc user feedback gathered via the Web site should likewise be a continuous part of the support process. Periodic reviews of major quantitative indicators should be undertaken; in particular, it is recommended that you begin tracking the variation of key indicators over time.

It is likely that more periodic/episodic evaluation activity will be desirable in support of ongoing program management and goal-setting activities. Again, the size and scale of these periodic evaluations should be appropriate for the undertaking. However, it is likely that you should consider:

Periodic comprehensive studies every 12 – 18 months focusing broadly on use and impact to inform your strategic planning and goal setting activity.  This study would include a complete review of questions and indicators and may use surveys and interviews.

Periodic smaller studies to evaluate the effectiveness of your Web site and the published materials, or to answer specific questions that may arise. These can be tailored and may use the appropriate data gathering method depending on whether it is a qualitative or quantitative issue.

Ongoing collection and analysis of both process and program evaluation data to support overall program goal-setting and more tactical goal and objective setting for the core business areas, including Course Publication, Technology, User Support, Communications and Outreach. Examples of key metrics to consider tracking on a longitudinal basis include detailed level-of-effort data per course and per process step to understand the time and resources required to plan, build, publish, and support courses on opencourseware; detailed usage data at the course level; geographic traffic distribution for understanding outreach and communication effectiveness; and role-based usage to ensure that you are reaching your target audiences appropriately.
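One lightweight way to track the variation of key indicators over time is to keep each indicator as a simple time series and report period-over-period change. The sketch below assumes a hypothetical indicators.csv file with quarter, indicator, and value columns; the format is an illustration only, not a prescribed schema.

import csv
from collections import defaultdict

series = defaultdict(dict)  # indicator -> {quarter: value}

with open("indicators.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        series[row["indicator"]][row["quarter"]] = float(row["value"])

for indicator, values in sorted(series.items()):
    # Quarter labels are assumed to sort chronologically, e.g. "2004-Q1".
    quarters = sorted(values)
    print(indicator)
    for prev, cur in zip(quarters, quarters[1:]):
        if values[prev]:
            change = (values[cur] - values[prev]) / values[prev]
            print(f"  {prev} -> {cur}: {values[cur]:,.0f} ({change:+.1%})")
        else:
            print(f"  {prev} -> {cur}: {values[cur]:,.0f}")

The same series can feed the periodic comprehensive studies and the quarterly reporting described later in this section.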


Evaluation and Measurement: MIT OpenCourseWare's Approach

The need for evaluation and measurement was recognized early in the process of establishing MIT OpenCourseWare (MIT OCW) and as a result, we integrated a substantial evaluation component into the overall program effort. However, in the process of defining the goals and structure of the evaluation, a number of challenges were encountered.

First, the MIT OCW team had to decide why we were undertaking this evaluation process. In the end, the team focused on two major areas:

Tracking the usefulness and usability of MIT OCW, as well as internal efficiency, to help identify improvements to MIT OCW features and services and to set longer term direction to keep MIT OCW relevant over time.

Supporting communication and outreach activities by measuring use and demonstrating the impact of MIT OCW and the course materials MIT offers through it.

Anticipated audience questions that were considered when refining the goals for the MIT OCW evaluation process are identified in the following table:

Constituency Questions

All, including general public

Who is using the materials, and for what?

What is the educational impact of MIT OCW—is it making a difference?

MIT Faculty

How are my materials being received, and what is the perception of quality?

What do my colleagues think about them?

Have my materials been adopted anywhere?

How much time are faculty putting into this?

What is the “return” on “investment”? Is MIT OCW worthwhile for the faculty?

How have the MIT students used the sites?  What has been the impact on MIT students? 

MIT Key Stakeholders

How is MIT OCW being received by the public? 

By academic colleagues?

How is MIT OCW enhancing the image of MIT?


Have there been any changes/impact on the schools/departments that have participated thus far?

Funders

Is MIT OCW’s impact in line with our philanthropic program goals?

Is the output/outcome/value of MIT OCW in line with our expectations and worth the money we have invested so far?

Should we continue to invest?

Other Institutions

What does it take to publish course materials in terms of costs and organization/people?

What is the impact of publishing?

What benefits accrue back to the institution? Is OCW worth doing here?

OCW Staff

How effective/usable is OCW?

How efficient is our publication process—what can we do better?

How effective are our communications—are people aware of OCW, especially potential users?

Where should we focus effort and resources in the future?

 

Second, given the nature of MIT OCW and its diverse constituency groups, it was challenging to pin down what was to be evaluated and what approach to use. The MIT OCW team wrestled with issues such as:

Was it an evaluation of actual academic educational outcomes?

Was it an evaluation of the faculty’s course content quality?

Are we simply evaluating a Web site and its performance and usage?

Should we use a more “academic/research-based” approach to evaluation or a more “business/evaluator-based” approach? The academic/research-based approach suggested publishing the site, then examining the reactions, and from that, learning what the program's goals should be.


The business/evaluator-based approach pointed to the need for a more structured process in which MIT OCW would first define hypotheses and quantifiable goals around MIT OCW use and then determine whether they are valid.

Consistent with the dual mission of MIT OCW, the evaluation strategy was segmented into program evaluation and process evaluation.  Within each category, MIT OCW uses an "evaluation portfolio" approach that comprises a variety of data collection methods in order to achieve both breadth and depth in the evaluation.

Program evaluation

Program evaluation focuses on outputs—course materials, ancillary publications, and services—and the outcomes that result from them. We organized the program evaluation into:

Access. Who is accessing the MIT OCW Web site, what are their profiles (educator, student, self-learner, other), what are their disciplines (or other interests), and where are they located?

Use. How do educators and learners use MIT OCW, and is the Web site designed appropriately to facilitate that use? To what extent, and in what ways, are MIT course materials adopted or adapted for teaching purposes?

Impact. What effects, positive or negative, intended or unintended, are being realized through the use of MIT OCW?

The following table summarizes data being collected by MIT OCW and the tools that are being used to collect it.

Each data collection method addresses three questions. Access: Who is using OCW? Use: How are they using it and does it meet their needs? Impact: What outcomes result from this use?

Web analytics (All site activity)
Access: Traffic volumes, geographic origination, linked referral source, site entry points.
Use: Usage patterns, including frequently visited departments, courses, and sections.
Impact: N/A

Online intercept surveys (Random, representative sampling of users; self-reported)
Access: User profiles (role [educator, student, self-learner, other], institution profiles, country/context of origin, technology context/means of access, reliability, performance, referral source).
Use: User goals/purposes/scenarios/tasks, user expectations, site usability and usefulness/relevance, ability to complete intended tasks, level of adoption of materials, level and nature of adaptation.
Impact: Leads for further follow-up via supplemental surveys or interviews on significant outcomes.

Supplemental surveys (Targeted sampling of users; self-reported)
Complementary to online intercept surveys to obtain richer understanding for targeted groups (e.g., educators in regions with less developed educational infrastructure).

Interviews (Targeted sampling of users; self-reported)
Complementary to surveys to gain in-depth insights for development of case studies of MIT OCW use; also gives opportunity to request syllabi, etc., for content analysis.

Site feedback analysis (Self-selected respondents)
Access: Anecdotal supplement to data about access.
Use: Anecdotal supplement to data about use, especially usability and relevance of MIT OCW for specific purposes.
Impact: Anecdotal information about specific outcomes (may lead to further follow-up).

 

For more detailed information on data collected by MIT OCW, see the Evaluation Strategy Document Appendix containing the Evaluation Indicators Matrix.

Process evaluation 

Process evaluation is more operations-oriented.  We measure efficiency and effectiveness of our work, primarily cost, volume, and quality. This data is then used to ensure that we are reaching production goals, meeting quality expectations, managing our finances, and working efficiently with faculty contributors (minimizing their investment of time and effort).  There are both quantitative and qualitative measures across the various dimensions of process evaluation.  The four main reporting components of the MIT OCW process evaluation are:

Financial reports.  The MIT OCW budget tracks expenditures by function and by expense category.  MIT OCW financial performance is analyzed monthly.


Level of effort tracking.  MIT OCW has developed process management tools that track status and level of effort for each course through all of its production steps, following a defined production tracking and analysis protocol.

IP operations tracking database.  Many courses have embedded third-party materials such as photos, graphs, charts, and video clips not originally authored by the contributing faculty member.  All such “IP objects” must be cleared with their respective copyright owners for publication on MIT OCW. We track numbers of IP objects overall and per course and statistics on the resolution of each object (permission granted/denied, object replaced, object deleted).

Content audit. MIT OCW audits all course content via reports generated by the content management system (CMS) and FileMaker databases. This is a mechanism for measuring and monitoring the richness (not the academic quality or rigor) of courses published on OCW. The audit reveals:

o Number of sections (component types, i.e., syllabus, calendar, lecture notes, assignments, exams, problem/solution sets, labs, projects, hypertextbooks, simulations, demonstration/learning tools, tutorials, and video lectures) per course

o Number of files (e.g., PDF documents, HTML pages, etc.) per course and per section within course

These reports can be sorted by school and academic department (showing differences by discipline).
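A content audit of this kind can be generated from almost any content management system export. The sketch below assumes a hypothetical CSV with one row per published section (course, department, section_type, file_count) and reports richness per course and average richness per department; the file and column names are illustrative and are not the actual MIT OCW CMS or FileMaker schema.

import csv
from collections import defaultdict

section_types = defaultdict(set)   # course -> set of section types present
file_counts = defaultdict(int)     # course -> total number of files
dept_courses = defaultdict(set)    # department -> set of courses

with open("cms_export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        course, dept = row["course"], row["department"]
        section_types[course].add(row["section_type"])
        file_counts[course] += int(row["file_count"])
        dept_courses[dept].add(course)

print("Richness by course:")
for course in sorted(section_types):
    print(f"  {course}: {len(section_types[course])} section types, "
          f"{file_counts[course]} files")

print("Average section types per course, by department:")
for dept, courses in sorted(dept_courses.items()):
    average = sum(len(section_types[c]) for c in courses) / len(courses)
    print(f"  {dept}: {average:.1f}")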

In addition to the evaluation components described above, OCW has adopted a management goal setting and review process. Through this process, specific operational goals are set annually in the areas of organization, publication production process, technology, communications, and planning/evaluation.  Accomplishments and progress toward these goals are then reviewed quarterly. Job performance goals for individual OCW staff are linked to these management goals.

Reporting on evaluations

MIT OCW has adopted a three-tier approach to reporting on evaluation activity.

Annual evaluation report:  This will be the principal report of quantitative and qualitative evaluation findings, analysis, and recommendations, in particular for the program evaluation. 

o The annual report provides summaries and detail of evaluation data, along with optional, more detailed background materials such as synopses of interviews, case studies, and other qualitative material developed through the evaluation process.


Quarterly scorecard: The scorecard offers highlights of the ongoing components of the evaluation process. It focuses on key program indicators (particularly usage) and key process indicators. It includes a brief summary of data collected, including Web analytics measures and, where appropriate, survey data, as well as process measures. 

o The scorecard presents results in a dashboard format to allow them to be efficiently used by the leadership team to evaluate progress against key program goals and metrics and make appropriate decisions based on the progress.

Ad-hoc reports: Evaluation data may be mined from time to time for special analyses as required. Continued use of industry-standard tools provided by companies like Akamai Sitewise and Netraker allows for efficient development of ad-hoc surveys and custom views of usage indicators, and efficient delivery of their results.

For complete details on MIT OCW’s process evaluation strategy, see the OCW Evaluation Strategy Document.