
FLUENT
Fast Learning from Unlabeled Episodes of Next-Generation Tailoring

TLA Design Document
Contract: W911QY-16-C-0019

Prepared for:

Advanced Distributed Learning (ADL)

Prepared by:

SoarTech

3600 Green Ct., Ste. 600, Ann Arbor, MI 48105 • Phone: 734-327-8000 • Fax: 734-913-8537 • Web: www.soartech.com


TABLE OF CONTENTS

Table of Contents
Table of Figures
Revision History
1 Introduction
2 What is the TLA?
3 What Process Governs TLA Development?
  3.1 Design Based Research Progress Summary
    3.1.1 Year 1 Accomplishments
    3.1.2 Year 2 Goals
4 Architecture
  4.1 High Level Concept
  4.2 High Level Goals
  4.3 Data Centric Design Vision
  4.4 2025 System Architecture
  4.5 Use Case-Specific System Diagrams
    4.5.1 Meta-adaptation
    4.5.2 Usage Monitoring
5 Specification Strategy
  5.1 Goals
  5.2 Challenges
  5.3 Recommended Approach
    5.3.1 Data Model
    5.3.2 Conceptual API
    5.3.3 Isolate Protocol Specifics
    5.3.4 Use Component Roles
6 Appendix
  6.1 What is a PAL?
  6.2 API Year 1 to Year 2 API Mapping


TABLE OF FIGURES

Figure 1 TLA Concept
Figure 2 TLA Conceptual Architecture
Figure 3 2025 Architecture
Figure 4 Meta-Adaptation System Diagram
Figure 5 Year 1 Usage Monitoring System Diagram
Figure 6 Year 2 Usage Monitoring System Diagram


REVISION HISTORY

| Name | Date | Reason for Change | Version |
|------|------|-------------------|---------|
| SoarTech | 5/13/2016 | Initial Draft | 0.1.0 |
| SoarTech | 7/5/2016 | FLUENT Base PoP Update | 0.2.0 |
| SoarTech | 11/1/2016 | Removed materials now published via wiki; conformed to versioning pattern | 0.3.0 |
| SoarTech | 11/28/2016 | Formatting standards | 0.3.1 |
| SoarTech | 12/01/2016 | Inserted new flow and pattern diagrams; replaced Learner Experience Facts with Learner Record Store; per ADL's request, changed version number to 0.1.3 | 0.1.3 |
| SoarTech | 12/13/2016 | Corrected "Learner Record Store" to "Learning Record Store"; corrected minor spelling typos | 0.1.4 |
| SoarTech | 7/26/2017 | Extracted Use Cases to a separate document; incorporated Year 1 Design Based Research revisions | 0.1.5 |


1 INTRODUCTION

The TLA Design Document describes the Total Learning Architecture (TLA) architectural design concept and specification strategies. Several appropriate use cases for the TLA are described in the document TLA Use Cases.docx. Additionally, specification requirements elicited during the design-based research are described in the document TLA Specification Requirements.docx. The authors suggest that readers reference those documents as needed. The TLA architecture is an ambitious concept with a timeline that extends out to 2025. This document presents the architectural vision for the 2025 timeframe, as well as interim versions that represent progress toward the 2025 vision.

2 WHAT IS THE TLA?

The TLA is an architecture described by specifications and best practices that enables improved learning through data sharing. TLA specifications are formalized documentation that enables multiple providers to contribute individual components to a TLA ecosystem in such a way that all components interoperate seamlessly. Specifications are aimed at long-term goals, and, thus, typically include enough flexibility to support many instantiations that support different use cases (e.g., consider the wide variety of uses the world has found for the HTTP standard, producing an internet filled with a rich abundance of information). Because of this flexibility, developers will find it helpful to have written best practice guides that convey the useful ways that a specification can be applied to meet the needs of a specific use case. The TLA will utilize best practices documentation to supplement the formal specifications to allow further idea sharing between developers. The expectation is that the TLA specifications and best practices will be publicly available, initially through ADL, and later by community-driven organizations focused on standards curation (e.g., IEEE).

3 WHAT PROCESS GOVERNS TLA DEVELOPMENT?

TLA development is using a design-based research approach (Wang and Hannafin, 2005; https://link.springer.com/journal/11423/53/4/page/1). Iterative analysis, design, development, and implementation cycles will shepherd the initial seed of an idea through to maturation. During this process, several terms are relevant and important to understand:

Requirements describe what needs a software solution should address.

Experimental Prototype describes engineering implementations (software) whose purpose is to test design ideas under real-world conditions, generating data that can be examined to elicit and refine requirements and to refine designs that meet those needs.

Specifications describe the formalization of a design that has been proven to be effective at meeting requirements. Specifications are not software! They are guides for creating software. And, in particular, they are guides that are community-driven and public. They do not represent "secret sauce" that is kept private as the confidential intellectual property of one organization.


Instantiations are software implementations of specifications. The specification indicates "how" to build software, like the blueprint for a building. The instantiation is the fully constructed software that implements the specification, like the real building that is constructed from a blueprint. Many different instantiations are possible, in the same way that many actual buildings can be constructed from the same blueprint.

Conformant describes a software instantiation that fully meets the requirements that a specification dictates. The word compliant is often used interchangeably with the word conformant; however, there is better industry-wide acceptance of a common definition for conformance.

Reference Implementation is a software instantiation whose purpose is to serve as an example for developers to use to fully understand exactly how to implement a specification. Reference implementations are conformant to the standard, but are typically not optimized for production use. Instead, their purpose is to provide an example that is easy for developers to understand to help them get started using a specification.

Production Implementation is a specification-conformant software instantiation that has been hardened and deployed, and is capable of standing up to the rigors of long-term use under real-world conditions. Production implementations are often not straightforward for an outside developer to understand, because the work required to ensure they operate efficiently and flawlessly under heavy usage can add layers of software detail that obscure the basics.

The TLA development cycles will utilize Experimental Prototypes to test and refine requirements and design ideas. The TLA is not a software instantiation; it is defined by specifications. The Experimental Prototypes will be used to test designs and produce draft TLA specifications. The draft specifications will be reviewed and improved through community collaboration and continued Experimental Prototyping. To illustrate the proper use of the TLA specifications, ADL may maintain reference implementations (e.g., the Learning Record Store or LRS). However, the reference implementations themselves are not part of the formal definition of the TLA. Any developer is free to create their own replacement implementation of the specification; all that is necessary for labeling the resulting implementation as a TLA instantiation is that the implementation is conformant to the TLA specifications.

3.1 DESIGN BASED RESEARCH PROGRESS SUMMARY

FLUENT is a five-year effort to produce TLA specifications. Currently, the effort is at the end of Year 1 and embarking on Year 2.

3.1.1 Year 1 Accomplishments

- Preliminary use cases were identified and described in TLA Use Cases.docx
- A preliminary architecture was designed and described in TLA Design Document.docx version 0.1.5
- A preliminary TLA specification draft was produced and documented at https://confluence.soartech.com/display/FLUENTSHARE/TLA+Documentation#TLADocumentation-TLAAPISpecs
- TLA specifications were reviewed and critiqued by the community using a formal Delphi process (https://en.wikipedia.org/wiki/Delphi_method)


- An initial Experimental Prototype implementation of the draft TLA specification (the Fort Bragg prototype) was produced and tested in April 2017 at Fort Bragg, where experimental data was gathered.
- Outcomes from the first experimental prototype were analyzed and used to:
  o Produce an initial specification requirements document (see TLA Specification Requirements.docx)
  o Identify lessons learned and iterate on the TLA architecture design and specification strategy (this document)

3.1.2 Year 2 Goals

- Formalize the TLA specifications
  o Update specifications to meet requirements identified by the Year 1 requirements analysis
  o Leverage existing standardized specifications whenever possible to meet requirements
- Increase the amount of data that TLA participants can share
- Prioritize ease of understanding, extending, and maintaining specifications
- Balance rapid prototyping needs with community-driven specifications
  o Identify a best practice workflow

4 ARCHITECTURE

4.1 HIGH LEVEL CONCEPT

Conceptually, a TLA-enabled learning system is an ecosystem for learning that encourages multiple technology providers to collaborate to produce a better learner experience than one provider could produce in isolation. One of the key lessons learned in Year 1 is that at its core, the TLA is about sharing data, the more the better! Figure 1 below depicts TLA-enabled data sharing.

Figure 1 TLA Concept


All technology providers that have a capability to incorporate into the TLA ecosystem can be thought of, in a very broad sense, as either or both of the following:

- Data Producers: components that output learning-related data and make it available for use by other components in the TLA ecosystem.

- Data Consumers: components that ingest learning-related data produced by another component and use it to improve learning.
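To make these two roles concrete, here is a minimal Python sketch. The class and method names are our own illustration, not interfaces defined by any TLA specification; a real component could play either role, or both at once:

```python
from abc import ABC, abstractmethod
from typing import Iterable


class DataProducer(ABC):
    """A component that outputs learning-related data for the TLA ecosystem."""

    @abstractmethod
    def produce(self) -> Iterable[dict]:
        """Yield learning-related data records for other components to use."""


class DataConsumer(ABC):
    """A component that ingests learning-related data produced elsewhere."""

    @abstractmethod
    def consume(self, record: dict) -> None:
        """Ingest a record and use it to improve learning."""


class EvidenceMapper(DataProducer, DataConsumer):
    """A component can be both: here, one that consumes learner experiences
    and produces mastery estimates (placeholder logic only)."""

    def __init__(self) -> None:
        self._estimates: list[dict] = []

    def consume(self, record: dict) -> None:
        # A real evidence mapper would apply evidence rules here.
        self._estimates.append({"skill": record.get("skill"), "mastery": 0.5})

    def produce(self) -> Iterable[dict]:
        return list(self._estimates)
```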

Another key lesson learned in Year 1 is that to reach its full potential, the TLA ecosystem needs to enable sharing many kinds of data. Some types of data directly relate to recording the experiences of the learner and tracking their learning progress. Other types of data relate more indirectly, helping researchers identify and refine science-of-learning approaches that push the envelope of state-of-the-art learning techniques. And still other data help course curators ensure that the learning materials the learner experiences align with their educational objectives.

4.2 HIGH LEVEL GOALS

The TLA's goals are unique and far-reaching, aiming for a 2025 timeline. Specific high-level goals include the following:

Break the status quo that is prevalent in current learning systems. Rather than building stove-piped learning systems that are brittle and isolated, the TLA aims to foster an environment that encourages collaboration.

Encourage collaboration by fostering inclusivity. Rather than designing a fixed solution that meets a single use case, the TLA should be designed to be flexible. New and unexpected use cases are welcomed. The TLA design and specifications should be robust enough to support new use cases as they arise.

Design with an eye toward the future. It isn't enough to design a system architecture that can meet today's requirements; designing for the long haul means producing a design capable of meeting tomorrow's requirements, too.

4.3 DATA CENTRIC DESIGN VISION

The sheer quantity of data that producers and consumers in the TLA ecosystem would like to be able to share presents some challenges:

- Data spans multiple functional areas.
- It is difficult to clearly convey to TLA participants how and when to use the different types of data.

How can we best wrangle all this data into a design that meets the TLA design goals? Let's be more specific about the challenge in front of us. It is very likely that the set of data relevant to learning will continue to grow over time. To manage the growth in a sustainable fashion, the data must be organized in a way that is both easy to extend and easy to understand. A key lesson learned in Year 1 is that adding a new name (for a service or component) to the architecture each time a new kind of data relevant to learning is added quickly becomes unwieldy and confusing (see the section titled "Many Components" in TLA Specification Requirements.docx). Instead, components should be categorized more generally in a way that is applicable to all use cases:


- TLA Conformant Apps – User-facing tools that use TLA data to meet an end-user need
- Processors – Software components that intake data, transform it in some way, and output another type of data
- Data Stores – Software components that store, but do not modify, data

In this section, we will discuss an architectural approach that is data-centric rather than component-centric. The TLA architecture will establish three data models:

1. Learner: Data that is about a specific, individual learner
2. Science of Learning: Data describing how humans learn
3. Asset: Data describing specific software components

These three powerful data models are simple to understand and applicable to all use cases that are relevant to the TLA. All types of data discovered in the Year 1 requirements analysis to be either required or desired are easy to categorize as belonging to one of these three data models.

All TLA communication between producers and consumers is centered on exchanging data. We have categorized the types of data that can be exchanged (Learner, Science of Learning, Asset). Next, we must classify the types of exchanges in a general way that will be flexible enough to allow additions as needed when applying the TLA to new and unexpected use cases. The TLA will use promises to classify data exchanges. A promise classifies the output that a data producer produces:

1. Raw: Data that represents a direct report on an event or state
2. Inferred: Data deduced from existing data about current or past states
3. Deconflicted: Data disambiguated according to a specified policy governing conflict resolution
4. Predicted: Data indicating future conditions deduced from existing data
5. Stored: Data that is unmodified from when it was received (historical record)

These five powerful promise types are simple to understand and applicable to all types of TLA data exchanges. The promise types are compatible with all three data model types. Any piece of data that belongs to any of the three data model types (Learner, Science of Learning, Asset) can be output by a data producer, and have one or more of the five promise types attached to it.
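As an illustration of attaching a data model category and one or more promise types to a datum, consider the following sketch. The field names and structure are hypothetical, not drawn from the TLA specifications:

```python
from dataclasses import dataclass, field
from enum import Enum


class DataModel(Enum):
    LEARNER = "learner"
    SCIENCE_OF_LEARNING = "science_of_learning"
    ASSET = "asset"


class Promise(Enum):
    RAW = "raw"
    INFERRED = "inferred"
    DECONFLICTED = "deconflicted"
    PREDICTED = "predicted"
    STORED = "stored"


@dataclass
class TlaDatum:
    model: DataModel         # which of the three data models this belongs to
    promises: set[Promise]   # what the producer promises about this output
    payload: dict = field(default_factory=dict)


# A learner's self-reported mastery estimate: Learner data, promised as raw.
self_report = TlaDatum(
    model=DataModel.LEARNER,
    promises={Promise.RAW},
    payload={"skill": "land-navigation", "mastery": 0.4},
)
```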

Let's examine a concrete example, using a piece of data from the Learner data model: the mastery estimate.

Raw: When a new learner joins a TLA enabled learning system, a self-evaluation UI might ask the learner to rate their mastery for a specific skill. The evaluation UI, as a TLA data producer, would provide a piece of raw data indicating the mastery estimate that the learner entered.

Inferred: Next, the learner might experience a learning activity. That learning activity will share the learner's experience with the TLA (e.g., answered question 7 correctly, or completed scenario 5). An evidence mapping component might then observe the stream of student experiences and deduce a mastery estimate based on that experience stream. The evidence mapper as a TLA producer is outputting inferred mastery estimate data.

Deconflicted: Now both a raw mastery estimate (produced by the self-evaluation) and an inferred mastery estimate (produced by an evidence mapper) for the same skill (or competency) exist simultaneously. Which is "right"? Well, that depends on how you wish to resolve the discrepancy. Many different conflict resolution policies are possible. Examples include using the most recent answer, averaging the answers, returning all answers instead of just one, or performing a more complex calculation or filtering. A software component that can resolve the discrepancy according to the policy of choice and output the result is producing deconflicted data (a short code sketch of such policies follows these examples).

Predicted: Just-in-time support, which provides assistance at the moment of need, relies on anticipating that need. If the learner has completed multiple learning activities that teach the same skill (competency), but the raw and inferred mastery estimates corresponding to that timeframe do not show any improvement, a software component observing this trend could deduce that the learner needs instructor assistance in making progress on this specific competency. This is really a prediction; that is, the observational component is predicting that if the learner were to receive instructor assistance in this competency, their mastery estimate would improve.

Stored: Each mastery estimate that has been produced in the preceding examples might be valuable to another software component as input at some point in the future (potentially far into the future). It is not feasible to assume that the component that might wish to consume the data in the future was active at the time when the data was produced. For example, the learner might wish to bring up a dashboard where they can track their progress for the year. Or a researcher might wish to examine long running historical records of many students to observe trends over time. Software components that can output historical data are producing stored data.
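Of these exchanges, deconfliction is the most policy-driven, so it lends itself to a small sketch. The policy names below are illustrative only, not terms from any TLA specification:

```python
from statistics import mean
from typing import Callable, Union

# Each policy maps the competing estimates to a deconflicted result.
Policy = Callable[[list], Union[float, list]]

POLICIES: dict = {
    "most_recent": lambda estimates: estimates[-1],  # assumes chronological order
    "average": lambda estimates: mean(estimates),
    "all": lambda estimates: list(estimates),        # return everything; defer the choice
}


def deconflict(estimates: list, policy: str = "most_recent"):
    """Resolve competing raw/inferred estimates according to the chosen policy."""
    return POLICIES[policy](estimates)


# A raw self-report of 0.4 conflicts with an inferred estimate of 0.7:
print(deconflict([0.4, 0.7], policy="average"))  # the average of the two estimates
```

The point of the sketch is that the deconflicted promise says nothing about which policy was used, only that some specified policy resolved the discrepancy.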

Finally, let's consider the processor components. This was an area that was especially problematic to classify in Year 1. Each new use case examined seemed to require different types of processors. However, the introduction of promise types provides a way to easily generalize the categorization of processors:

Inference Processors: Intake data, apply internal algorithms that make inferences about the input, and output the inferences with the promise type inferred.

Prediction Processors: Intake data, apply internal algorithms that make predictions based on the input, and output the predictions with the promise type predicted.

Deconfliction Processors: Intake data, apply internal algorithms to disambiguate the data according to a conflict resolution policy, then output the result with the promise type deconflicted.
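A minimal sketch of how the three processor categories could share a common shape, with the matching promise type stamped on every output. The class and field names are hypothetical, not prescribed by the TLA:

```python
from abc import ABC, abstractmethod


class Processor(ABC):
    """Common shape of a TLA processor: intake data, transform it with an
    internal algorithm, and output it with the matching promise type."""

    output_promise: str  # "inferred", "predicted", or "deconflicted"

    @abstractmethod
    def transform(self, inputs: list[dict]) -> list[dict]:
        """Apply the processor's internal algorithm to the input data."""

    def run(self, inputs: list[dict]) -> list[dict]:
        # Stamp every output with this category's promise type.
        return [{**out, "promise": self.output_promise}
                for out in self.transform(inputs)]


class InferenceProcessor(Processor):
    output_promise = "inferred"

    def transform(self, inputs: list[dict]) -> list[dict]:
        # Placeholder inference: estimate mastery from the share of correct answers.
        correct = sum(1 for item in inputs if item.get("correct"))
        return [{"mastery": correct / max(len(inputs), 1)}]


# PredictionProcessor ("predicted") and DeconflictionProcessor ("deconflicted")
# would follow the same pattern with their own transform() algorithms.
```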

The three data models, five promise types, and three processor types can be organized into a general, extensible architectural vision, as shown in Figure 2 TLA Conceptual Architecture.


Figure 2 TLA Conceptual Architecture

All TLA communication is centered on exchanging data. The three data models (Learner, Science of Learning, and Asset) form the heart of the TLA Cloud. The three types of processors (Inference, Prediction, and Deconfliction) orbit the data models, ingesting data from them, and outputting data to them as a result of internal calculations. Learner facing TLA Apps connect to the TLA Cloud, sharing data with it and drawing data from it to produce optimal learning experiences.

4.4 2025 SYSTEM ARCHITECTURE

Now that we have defined our data categories (Learner, Science of Learning, and Asset), classified the types of data exchanges (raw, inferred, deconflicted, predicted, stored), and categorized the data processors (Inference, Prediction, and Deconfliction), it is time to understand how this architecture would be instantiated. A specific instantiation will always be focused on the needs of specific use cases. However, as the TLA is designed to meet the needs of multiple use cases, we can show a composite system architecture diagram (Figure 3 2025 Architecture) that highlights the types of components that might be present in one TLA instantiation meeting the needs of multiple use cases simultaneously.


Figure 3 2025 Architecture

As this diagram illustrates, many different types of user-facing Apps and TLA Processors can coexist in harmony in the TLA ecosystem, functioning together using the three data models, five promise types, and three processor types to meet the needs of multiple use cases that serve different user communities.

We can also see from this diagram that the data models are not designed as monolithic data stores. That is, rather than a single data store icon for each of the three logical data models (Learner, Science of Learning, Asset), there are many smaller data store icons that represent sub-sections of the logical data model. The logical data models are intended to be implemented as collections of distributed data stores that collaborate to provide a logical construct such as "Learner Data Model." Another important observation is that the names of specific individual processors, while meaningful in the context of specific use cases, are not universally applicable. However, the logical categorizations of processors (Inference, Prediction, Deconfliction) are universally applicable and directly derived from the data promises. To be flexible enough to accommodate new and unexpected use cases, the names of all specific types of processors cannot be fixed a priori. Instead, the categories of processor (Inference, Prediction, and Deconfliction) are fixed, and new use cases can create new processors that reside in one of the processor categories, naming them in a way that is meaningful to the use case. Next, we will explore some examples of system diagrams for TLA instantiations that are use case-specific.

4.5 USE CASE-SPECIFIC SYSTEM DIAGRAMS

The TLA Architecture has been designed to be very flexible, so that TLA-enabled learning systems can be configured to support the needs of a variety of users and use cases. In this section, we will explore applying the TLA architecture to specific use cases, focusing on understanding the system-level design for instantiating the TLA to meet the needs of the specific use case. This will not be an exhaustive review of use cases; instead, select use cases will illustrate the TLA architectural vision. We will examine what the data flow of information within the TLA architecture looks like, and which specific components would be instantiated in a TLA-conformant manner to achieve the use case goals.

4.5.1 Meta-adaptation

The meta-adaptation use case is primarily focused on the learner. The learner's experiences are shared with the TLA; the data is interpreted, and the learner's experience is adapted to optimize their learning (see TLA Use Cases.docx for a full description of the meta-adaptation use case). Diving into the details of how to construct a TLA instantiation that meets the needs of this use case, we arrive at Figure 4 Meta-Adaptation System Diagram. There's a lot going on here; let's break it down.


Figure 4 Meta-Adaptation System Diagram

First, we can see that all instantiated components (delineated by vertical purple lines) are either

1. TLA Conformant
2. Part of the TLA Cloud

Next, the TLA Cloud components can be identified (delineated by horizontal purple lines) as either

1. Part of one of the three data models (Learner, Science of Learning, Asset)
2. One of the three types of TLA Processor (Inference, Prediction, Deconfliction)

Third, the data exchanges between components (shown as arrows in the diagram) are classified by the nature of the exchange:

1. Which data model the data is a part of (Learner, Science of Learning, Asset). The APIs have been named to reflect the data model the data is a part of (if you are familiar with the Year 1 names for the APIs, refer to the Appendix, API Year 1 to Year 2 API Mapping).

2. The type of promise the data producer makes about the data (raw, inferred, deconflicted, predicted, stored).


The specific components that have been instantiated work together to meet the needs of the use case. Let's trace the data flow to see how that works.

The learner’s experiences in an Activity Provider are shared with the TLA as raw data using the LearnerAPI. They are stored for future use by other components in the LRS.

The experiences stored in the LRS are used as input to the Evidence Mapper (using the LearnerAPI), which examines them for evidence of changes in the learner's mastery estimates. The Evidence Mapper outputs inferred mastery estimates using the LearnerAPI, which are stored for future use by other components in the Mastery & Goals component, a subset of the Learner Data Model.

Similarly, the Context Inference Engine uses the LearnerAPI to examine the experiences stored in the LRS. It also produces inferred data, but the kinds of inferences it makes are different from those of the Evidence Mapper. The Context Inference Engine infers data such as "learner is frustrated," which is stored in the Context component, a subset of the Learner Data Model.

The Learner Prediction Processor uses the LearnerAPI to access information stored in the Learner Data Model, which can include the information produced by the Evidence Mapper and Context Inference Engine. It, in turn, makes predictions about the learner's needs (e.g., because the learner is frustrated, it is predicted that they would benefit from a remediation technique for a specific competency). The predictions are output using the LearnerAPI. These could be stored in the Learner data model if there is a reason to keep an historical record of them. However, it is more likely they will be used immediately by data consumers, and no historical record is necessary.

The Recommender uses the LearnerAPI to receive predictions that are produced by the Learner Prediction Processor. Upon receiving a prediction that the learner would benefit from remediation in a specific competency, the Recommender might use the AssetAPI to query the Activity Index to identify a remedial activity that trains that competency. The Activity Index is one component that is a subset of the Asset Data Model. The Activity Index might return more than one matching activity. The Recommender would then decide if it should present all matching Activities to the learner to select from, or if it will perform some internal logic to narrow the choice to a single activity and then automatically present that remediation activity to the learner.
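The walkthrough above can be condensed into a sketch of one meta-adaptation cycle. The client objects, call signatures, and placeholder algorithms below are hypothetical; the TLA specifications do not define them:

```python
def infer_mastery(experiences):
    # Placeholder evidence mapping: fraction of correct responses.
    correct = sum(1 for e in experiences if e.get("correct"))
    return {"competency": "map-reading", "mastery": correct / max(len(experiences), 1)}


def infer_context(experiences):
    # Placeholder context inference: many retries suggest frustration.
    retries = sum(e.get("retries", 0) for e in experiences)
    return {"frustrated": retries > 3}


def predict_need(mastery, context):
    # Placeholder prediction: low mastery plus frustration -> remediation.
    return {
        "competency": mastery["competency"],
        "needs_remediation": mastery["mastery"] < 0.6 and context["frustrated"],
    }


def meta_adaptation_cycle(learner_api, asset_api, learner_id):
    # 1. Raw experiences shared by the Activity Provider, stored in the LRS.
    experiences = learner_api.get(learner_id, promise="raw")
    # 2. Evidence Mapper outputs inferred mastery estimates.
    mastery = infer_mastery(experiences)
    learner_api.put(learner_id, mastery, promise="inferred")
    # 3. Context Inference Engine outputs inferred learner context.
    context = infer_context(experiences)
    learner_api.put(learner_id, context, promise="inferred")
    # 4. Learner Prediction Processor outputs a predicted need.
    prediction = predict_need(mastery, context)
    learner_api.put(learner_id, prediction, promise="predicted")
    # 5. Recommender queries the Activity Index for a remedial Activity.
    if prediction["needs_remediation"]:
        candidates = asset_api.find_activities(competency=prediction["competency"])
        return candidates[0] if candidates else None
    return None
```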

Through this example we can see how the general architectural constructs of the TLA can be used to create a specific TLA instantiation that meets the needs of the meta-adaptation use case. The result is that the learner receives individualized recommendations to switch from one Activity to another based on the learner's specific needs and progress so that they can maximize their learning. Not only are the needs of the use case met, but they can be met by combining components produced by different technology providers, so that each can focus on bringing its specialty expertise to the overall ecosystem.

4.5.2 Usage Monitoring

To demonstrate the flexibility of the TLA architecture, let's walk through applying it to another use case, usage monitoring. In usage monitoring, data from many learners' experiences can be analyzed for patterns that can help decision makers determine where they need new or updated content and what content is resulting in good training outcomes (see TLA Use Cases.docx for a full description of the use case).


In Year 1's design-based research, the prototyping was focused on meta-adaptation, and the preliminary TLA system architecture was initially based on the needs of meta-adaptation. When we tried to apply Year 1's component-centric architecture design to the usage monitoring use case, it didn't work very well. As Figure 5 Usage Monitoring System Diagram shows, only the barest sketch of how it might work was possible.

Figure 5 Year 1 Usage Monitoring System Diagram

One of the blockers was related to trying to understand how specific software components from the initial meta-adaptation use case like "Evidence Mapper" did or did not apply. We concluded that "Evidence Mapper" did not apply to the Usage Monitoring diagram, but this was unsatisfying because the architecture didn't have a place for the kind of data interpretation that we'd like the TLA to assist with for this use case. With the new data-centric architectural organization, rather than a component-centric organizational paradigm, applying the architecture to a new use case is much more straightforward and satisfying.

Let's take a look at how that works. For usage monitoring, it is desirable to make inferences based on the performance of many individuals. It is also desirable to form predictions about the content to answer questions such as: What new content is needed? What is the anticipated "shelf life" of a particular piece of content? What content is likely to be effective in more than one domain? These desires now fit readily into the new generalized architecture, where inferences and predictions are primary, re-usable constructs. The new architectural paradigm no longer dictates the names of new components. Instead, it provides categorization that is re-usable. This supports the requirement (see TLA Specification Requirements.docx) to foster creativity, enabling adopters to use the TLA in new ways to solve problems not yet considered. Once a use case has been explored in detail (e.g., meta-adaptation), best practice patterns can be published that give names to specific components (e.g., Evidence Mapper) that are useful for collaborators working together to design a specific TLA instantiation to meet the needs of a specific use case.

Figure 6 Year 2 Usage Monitoring System Diagram demonstrates applying the general architecture to another use case, usage monitoring. Let's walk through the details, examining what is similar to the meta-adaptation use case and what is different.

Figure 6 Year 2 Usage Monitoring System Diagram

First, we can see that the basic structure remains the same. All instantiated components (delineated by vertical purple lines) are either TLA Conformant or part of the TLA Cloud. The TLA Cloud components can be identified (delineated by horizontal purple lines) as either part of one of the three data models (Learner, Science of Learning, Asset), or one of the three types of TLA Processor (Inference, Prediction, Deconfliction). The data exchanges between components (shown as arrows in the diagram) are again classified by the nature of the exchange (data model type and promise type).

The difference is in the specific, named components that were instantiated. There is some overlap. For example, both use cases utilize the LRS and Activity Index. But there are also different components, both in the data model and in the processors. The architecture allows adding a new, named component when necessary, but provides a recipe for deciding how to use the different components. Think about the TLA architecture like a kitchen. A kitchen provides all the equipment and ingredients necessary for baking a cake, but it is up to the cook to decide if they want a vanilla or chocolate cake. The basic formula for all cakes is roughly the same, but the specific ingredients are different, and some of the specific steps vary a little (like oven temperature). Think of the data models as ingredients. The specific parts of the data model you need are customized to the use case, just as the choice of chocolate or vanilla flavoring is customized for the cake. For the Usage Monitoring use case, we use the Sequence Models subset of the Science of Learning data model, whereas the meta-adaptation use case uses the Competency Framework subset of the Science of Learning data model. Steps that transform the ingredients, like how long to bake the cake, also vary. A devil's food cake bakes for a different amount of time than an angel food cake. This is similar to how the TLA's data processors transform data. For the Usage Monitoring use case, we need a Content Predictive Processor, an Activity Effectiveness Processor, and an Anonymized Activity Pattern Detection Processor to transform the data, instead of the Evidence Mapper, Learner Prediction Processor, and Context Inference Engine used for meeting the needs of the meta-adaptation use case.

Let's look at how the specific components (ingredients) and processors (transformation steps) work together to meet the needs of this specific use case.

The experience history of many learners is stored in the LRS (part of the Learner data model) as raw data.

The goals and mastery estimate history of those same learners is stored in the Mastery Estimates & Goals component (part of the Learner data model).

The Anonymized Activity Pattern Detection Processor examines the learner history using the LearnerAPI. It makes inferences about the effectiveness of the different sequences learners take through the learning material. For example, if many learners show improved mastery after completing the same sequence of learning Activities, then an inference can be created that the observed sequence is an effective one. The Anonymized Activity Pattern Detection component outputs its inferences to the Sequence Models (using the ScienceOfLearningAPI), where they are stored for future use by other components.

The Activity Effectiveness Calculator uses the LearnerAPI to retrieve information about the experience history of the learners. It uses the AssetAPI to retrieve metadata about the Activities that can be used to help interpret the experience history for each specific Activity. It also uses the ScienceOfLearningAPI to ingest the inferences from the Sequence Models. Using all these data sources, its internal algorithms calculate the inferred effectiveness of individual Activities. It outputs this inferred data to the Activity Index (using the AssetAPI) as supplemental Activity metadata.

The Content Predictive Processor uses the ScienceOfLearningAPI to retrieve inferences about sequencing from the Sequence Models, and the AssetAPI to retrieve effectiveness inferences from the Activity Index. Its internal algorithms use this data to produce predictions. For example, it might predict where more challenging content needs to be provided to support gifted learners, or when an Activity will have outlived its usefulness and is in need of a refresh. It outputs these predictions to the Activity Index using the AssetAPI.

The Decision Maker for an organization (e.g., Georgia CBE) would interact with an App such as the Activity Usage Monitoring Tool to curate the quality of the learning materials in their TLA instantiation. The Activity Usage Monitoring Tool uses the ScienceOfLearningAPI to retrieve Sequence Model inferences, and uses the AssetAPI to retrieve effectiveness inferences and content predictions from the Activity Index. The Activity Usage Monitoring Tool visualizes the data retrieved from the TLA Cloud so the Decision Maker can understand it easily. It conveys effective Activity sequences, effectiveness ratings for individual Activities, and predictions about near-future content needs.
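As with meta-adaptation, this walkthrough can be condensed into a sketch of one usage-monitoring cycle. All client objects, signatures, and placeholder algorithms are hypothetical illustrations, not TLA-defined interfaces:

```python
def detect_effective_sequences(histories):
    # Placeholder for the Anonymized Activity Pattern Detection Processor:
    # infer which Activity sequences precede mastery gains across many learners.
    return [{"sequence": ["A1", "A2"], "effective": True}]


def score_activities(histories, metadata, sequences):
    # Placeholder for the Activity Effectiveness Calculator: combine learner
    # history, Activity metadata, and sequence inferences into ratings.
    return [{"activity": "A1", "effectiveness": 0.8}]


def predict_content_needs(sequences, effectiveness):
    # Placeholder for the Content Predictive Processor: flag content that
    # needs a refresh or a more challenging variant.
    return [{"activity": a["activity"], "needs_refresh": a["effectiveness"] < 0.5}
            for a in effectiveness]


def usage_monitoring_cycle(learner_api, sol_api, asset_api):
    # 1. Raw experience histories of many learners, stored in the LRS.
    histories = learner_api.get_all(promise="raw")
    # 2. Inferred sequence effectiveness -> Sequence Models.
    sequences = detect_effective_sequences(histories)
    sol_api.put(sequences, promise="inferred")
    # 3. Inferred per-Activity effectiveness -> Activity Index metadata.
    metadata = asset_api.get_metadata()
    effectiveness = score_activities(histories, metadata, sequences)
    asset_api.put(effectiveness, promise="inferred")
    # 4. Predicted content needs -> Activity Index.
    predictions = predict_content_needs(sequences, effectiveness)
    asset_api.put(predictions, promise="predicted")
    # 5. Everything the Activity Usage Monitoring Tool would visualize.
    return {"sequences": sequences, "effectiveness": effectiveness,
            "predictions": predictions}
```

Note how the cycle reuses the same API and promise vocabulary as the meta-adaptation sketch; only the named processors and data model subsets differ.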

Through this example we can see how the general architectural constructs of the TLA can be used to meet the needs of a second use case. The architecture provides the "kitchen" where multi-purpose "ingredients" can be used to create customized "cakes." The TLA architecture can be used to create a customized TLA instantiation that is perfect for the needs of each new use case.

5 SPECIFICATION STRATEGY

Now that we've shown how to generalize the design and walked through some example system diagrams that show how to apply the generalized architecture to specific use cases, we will examine how to formalize the specification of the generalized architecture.

5.1 GOALS

First, what are the specification goals?

- Create formal TLA specifications that reflect the requirements identified in TLA Specification Requirements.docx
- Leverage existing standardized specifications whenever possible to meet requirements
  o Foster inclusivity so the TLA is compatible with many existing standards
- Balance rapid prototyping needs with community-driven specifications
  o Support gathering specification refinement input from the community
  o Support rapid prototyping to identify additional requirements
  o Identify a best practice workflow for moving requirements to specification
- Prioritize ease of understanding, extending, and maintaining specifications
- Increase the amount of data that TLA participants can share

5.2 CHALLENGES

What challenges should we expect to overcome to achieve these goals?

- TLA data spans multiple functional areas; each may have one or more relevant existing specifications.
- Existing specifications may not cover all necessary TLA data.
- The size of the specification needed to cover all TLA data is a bit daunting.

One of the takeaways from Year 1 is that there is general agreement in the TLA community that leveraging existing standards is valuable and of primary importance. But there is not yet agreement on how best to leverage existing standards. At a high level, it is desirable to compose a TLA specification by including relevant parts from multiple existing specifications, similar to Metadata Application Profiles. This would cover a large percentage of the data the TLA needs, but not all of it. We would need to augment and fill in the gaps with a TLA-sponsored "proving ground." The proving ground would be an area where new kinds of data or data exchanges could be added, prototyped, and evaluated. If they prove themselves valuable, a workflow would govern promoting proving-ground ideas for incorporation into existing specifications. At this level, there is general agreement that this would be desirable, but then things get trickier.

Why is combining existing specifications hard? Getting down to the details reveals some specific challenges:

- Existing specifications may overlap fully or partially:
  o Fostering inclusivity means creating equivalencies between redundant terms
- Competing specifications in any area will have subtle to major differences to resolve:
  o Terminology definitions
  o Underlying theories
- Existing specifications often include troublesome underlying assumptions:
  o Specific transport protocols
  o Specific security protocols
  o Specific intentions for the type of service that should implement a specification

Terminology discrepancies and overlaps can be resolved by creating equivalencies. This is a significant amount of work, but is straightforward to understand. Competing underlying theories can be addressed by ensuring that ways in which data can be discussed are broad enough to accommodate different theories, again creating equivalencies if needed. Of these challenges, the specific underlying assumptions can be most difficult to wrangle.
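As a trivial illustration of such equivalencies (the specification names and terms below are invented for the example, not drawn from any real standard):

```python
# Map overlapping terms from two hypothetical existing specifications onto
# a single TLA term, so data from either can be discussed uniformly.
EQUIVALENCIES = {
    "tla:competency": {"spec_a": "skill", "spec_b": "competency"},
    "tla:mastery_estimate": {"spec_a": "proficiency", "spec_b": "masteryLevel"},
}


def to_tla_term(source_spec: str, term: str):
    """Translate a term from an existing specification into its TLA equivalent."""
    for tla_term, aliases in EQUIVALENCIES.items():
        if aliases.get(source_spec) == term:
            return tla_term
    return None


print(to_tla_term("spec_b", "masteryLevel"))  # -> tla:mastery_estimate
```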

Why do the assumptions matter? Many current specifications have the luxury of a constrained use case. Constraints may narrow the types of end users the system will need to interact with, the delivery medium (e.g., web only), or the number of places where development teams from different providers need to interact. Each constraint reduces complexity. The less complex a system is, the more straightforward it is to produce a specification that can describe it. The TLA does not have this luxury; the goal of fostering inclusivity means that not only must the TLA address the needs of many use cases, it must also welcome new and unexpected use cases, which means not all use cases will be known in advance. Designing for this challenge is considerably more difficult.

Let's examine the impact of coupling a specification to a specific transport or security protocol. This is limiting in several ways. First, it reduces modularity. If a new security or transport protocol emerges in industry that is better than the choice the specification made, adopting it requires a specification change. Making a choice to be tightly coupled means that the specification can include more implementation-specific details. Specifics help programmers communicate, but they also reflect near-term realities. Specificity is great in the near term, but it can be a hurdle for long-term maintenance. Because the TLA is designing for the long term, it would be preferable not to require specification changes to adopt a newly emerged transport or security protocol. Second, tightly coupling to a specific protocol makes it tempting to "roll your own," inventing or modifying a protocol to suit a specific use case. Not only can the TLA not make assumptions about specific use cases, but a hand-rolled solution is almost always less robust than solutions that have been vetted by multiple industry participants. Third, tightly coupling to the security or transport protocol limits the ability to combine existing specifications, which is contrary to the TLA's goal of fostering inclusivity to be compatible with many standards. Combining parts of existing specifications that use different protocols may be impossible if they have made assumptions that are in direct conflict with one another, or it may prevent combining more than one transport or security protocol. As a further complication, some security protocols are tightly coupled to specific transport protocols; choosing one may dictate the choice of the other. And lastly, tightly coupling the specification to specific protocols means that the specification itself swells to contain the specifics relating to the protocol choice, making it harder to discern what the data is and what can be said about the data.

Next let's examine the impact of service/component-oriented assumptions. When operating under a constrained use case, it is often possible to define a specific type of service/component that will implement a specification (usually an API specification). If this is possible, then the API can be simplified because it is guaranteed to be used in only one way: implemented by a particular kind of service, and accessed by clients who need the capability that the implementing service provides. When the use case is constrained, this is perfectly reasonable. However, the TLA's use case is not this constrained. To enable an ecosystem that allows new services to be added to address emerging use cases, it is not sustainable to orient the specifications around specific components/services. We observed in Year1 that this resulted in a proliferation of individual specifications (see API Year1 to Year2 API Mapping). Not only is this confusing to understand because of the sheer quantity of API specifications, but it also requires a new API specification to be created each time a new type of component/service is added to the ecosystem. Such an approach is unsustainable as the TLA ecosystem grows.

5.3 RECOMMENDED APPROACH

How best to proceed to overcome the challenges and meet our goals? We have a lot to work with:

- We have many existing specifications that overlap our needs.
- We know our data categories (Learner, Science of Learning, Asset).
- We can classify data exchanges (raw, inferred, deconflicted, predicted, stored).
- We can classify how the data will be transformed (Inference, Prediction, Deconfliction).

Let's divide the description of what the TLA is and how to use it into aspects that are governed by a formal specification and aspects that are informally defined by best practices guidelines.

Formal TLA specifications govern the following:

- Data Concept: What data is shared?
- Data Structure: How is the data structured?
- Conceptual API: What can be said about the data?

TLA best practices govern the following:

- Transport protocol specifics: How is data sent over the wire?
- Security protocol specifics: How is the data secured?
- Service definition: What "types" of components exist in a TLA instantiation?

This approach has many benefits that address the challenges above. It is easy to discern the conceptual intent of the data because it is separated from low-level specifics such as protocol or security choices. Long-term maintenance is reduced, because the formal specifications govern slow-moving aspects of data exchange. What the data is, and conceptually what can be said about it, changes over time much more slowly than aspects focused on near-term, implementation-level concerns such as protocol specifics. Separating the protocol specifics out as a best practice facilitates inclusion of new protocol types as they emerge, as well as combining protocol types as needed to meet the needs of new use cases. Lastly, decoupling the formal specification from the specific types of services in a TLA instantiation facilitates inclusion of new kinds of services as needed by individual use cases.

5.3.1 Data Model

The specification for encoding the Data Model (Data Concept plus Data Structure) needs to meet these criteria:

- Modular: "write once, use anywhere"
- Human readable
- Machine understandable
- Decoupled from assumptions about security or transport protocols
- Decoupled from the conceptual API description
- Able to incorporate relevant parts of existing specifications and record provenance

There will be three TLA Data Model specifications, one for each TLA Data Model (Learner, Science of Learning, Asset). The current state of the art has already pioneered decoupling re-usable data model specifications from the API specifications that describe how to use or exchange the data. Each of our Data Model specifications will be composed by including relevant parts of existing specifications using Metadata Application Profiles or a similar approach. We recommend JSON Schema as the formal data model specification language. In addition to meeting these criteria, it has good compatibility with REST and OpenAPI (which we know will be important for initial TLA implementations), and it supports linking, so that a specific datum can be sourced from multiple applicable existing specifications and its provenance can be recorded.
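A minimal illustration of the recommendation (the property names are hypothetical, not a draft of the actual TLA data model) showing a fragment of Learner data expressed in JSON Schema and validated in Python:

```python
# Requires the third-party "jsonschema" package (pip install jsonschema).
from jsonschema import validate

# Hypothetical fragment of a Learner data model expressed as JSON Schema.
# "$ref"-style linking (not shown) is what would let a datum cite the
# existing specification it was sourced from, recording provenance.
MASTERY_ESTIMATE_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "MasteryEstimate",
    "type": "object",
    "properties": {
        "learnerId": {"type": "string"},
        "competency": {"type": "string"},
        "mastery": {"type": "number", "minimum": 0, "maximum": 1},
        "promise": {
            "type": "string",
            "enum": ["raw", "inferred", "deconflicted", "predicted", "stored"],
        },
    },
    "required": ["learnerId", "competency", "mastery", "promise"],
}

validate(
    instance={
        "learnerId": "learner-42",
        "competency": "land-navigation",
        "mastery": 0.55,
        "promise": "inferred",
    },
    schema=MASTERY_ESTIMATE_SCHEMA,
)  # Raises jsonschema.ValidationError if the instance does not conform.
```

The schema is both human readable and machine understandable, and says nothing about how the datum travels over the wire.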

5.3.2 Conceptual API

The specification for encoding the Conceptual API needs to meet these criteria:

- Modular: "write once, use anywhere"
- Human readable
- Machine understandable
- Decoupled from assumptions about security or transport protocols
- Loosely coupled to the Data Model
- Able to incorporate relevant parts of existing specifications and record provenance

There will be three TLA Conceptual API specifications, one that corresponds to each TLA Data Model (Learner, Science of Learning, Asset), notionally named LearnerAPI, ScienceOfLearningAPI and AssetAPI (as shown in Figure 4 Meta-Adaptation System Diagram). Each Conceptual API can communicate about the data in the associated data model using one or more of the promise types. Another way to say this is that the promise types segment each of the APIs into conceptual sub-sections.
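A sketch of what promise-type segmentation might look like, expressed as a transport-agnostic Python interface. The method names are illustrative only, not drawn from any draft specification:

```python
from abc import ABC, abstractmethod


class LearnerAPI(ABC):
    """Hypothetical shape of a Conceptual API: it states what can be said
    about Learner data, grouped into sub-sections by promise type, with no
    transport or security assumptions."""

    # --- raw sub-section ---
    @abstractmethod
    def report_experience(self, learner_id: str, experience: dict) -> None: ...

    # --- inferred sub-section ---
    @abstractmethod
    def share_mastery_estimate(self, learner_id: str, estimate: dict) -> None: ...

    # --- predicted sub-section ---
    @abstractmethod
    def share_prediction(self, learner_id: str, prediction: dict) -> None: ...

    # --- stored sub-section ---
    @abstractmethod
    def query_history(self, learner_id: str, since: str) -> list[dict]: ...
```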

The Conceptual API describes what can be said about the data, independent of what transport or security protocols will be used when the data is transmitted. Most state-of-the-art specifications are tightly coupled to assumptions about services or protocols. However, there are a few forward-thinking examples that show how to decouple them, such as DDS (http://www.omg.org/omg-dds-portal/) and JSON-RPC (http://www.jsonrpc.org/specification). In general, these examples use an Interface Definition Language to describe the Conceptual API. The Interface Definition Language is compatible with adding supplemental specifications that describe how to bind it to a specific protocol choice in an implementation. This is exactly the idea we want. A focus for Year 2 should be identifying an appropriate methodology for formally recording Conceptual APIs that is compatible with current industry trends. It is possible that the TLA's unique goals and vision are not an exact fit for any existing specification language in this area.

5.3.3 Isolate Protocol Specifics

The experimental prototype in Year1 demonstrated the effectiveness of a modular approach to security with the inclusion of OpenID and OAuth. See TLA Specifications Requirement.docx for a more detailed description of the need to decouple protocol specifics from the Conceptual API. Continuing to move toward a decoupled approach is recommended. There are already examples in the community that we can draw on. For example, xAPI provides a set of compatible authentication protocols (OAuth 1.0 (RFC 5849), HTTP Basic Authentication (RFC 7235), Common Access Cards). While decoupling the transport protocol from the Conceptual API is less common, there are well-established examples, as described in the Conceptual API section. Let's examine DDS in more detail. DDS defines a strongly typed, global data space that facilitates data-centric exchange between publishers and subscribers. Its design principles match many of the TLA architectural design goals:

- Modern software systems need to share data in a distributed environment that may or may not have real-time needs.
- To be extensible to future use cases, the conceptual API specification can't be tightly coupled to any one use case. However, searching, sorting, and filtering are common data operations that are important to include in a generalized conceptual API.
- To be flexible, what the data is and what can be said about it should be independent of the transport protocol. However, to get to implementation, transport protocol choices are required.

DDS 1.4 defines a Platform Independent Model: essentially a conceptual API decoupled from the transport protocol. A family of associated specifications (e.g., Web-Enabled DDS 1.0, the Real-time Publish-Subscribe (RTPS) DDS Interoperability Wire Protocol 2.2, and Interface Definition Language (IDL) 4.1) lets the implementer select a specific transport protocol for exchanging data that complies with DDS 1.4. While it is more work to separate the conceptual API from the protocol specifics, it is recommended that the TLA follow this approach because the long-term benefits are substantial. If conceptual APIs are decoupled from transport protocol choices, it becomes feasible to rapidly incorporate new protocols as they emerge. New protocols are emerging at a rapid rate from business areas such as big data processing and social media (e.g., Twitter's Heron or Apache Kafka). It should be a goal of the TLA design-based research process to monitor industry for new protocols that can be incorporated into the TLA.
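Returning to the security example above, the following minimal sketch shows what pluggable authentication looks like once it is isolated from the conceptual API, in the spirit of xAPI's menu of compatible authentication protocols. The handler functions and `send_request` helper are hypothetical illustrations of the pattern, not part of any TLA or xAPI specification.

```python
# Sketch of security decoupled from the conceptual API: authentication is a
# pluggable handler selected at deployment time. Handler names are invented.
import base64
from typing import Callable

# An auth handler takes a dict of HTTP headers and decorates it in place.
AuthHandler = Callable[[dict], None]


def basic_auth(user: str, password: str) -> AuthHandler:
    """HTTP Basic Authentication style handler."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()

    def handler(headers: dict) -> None:
        headers["Authorization"] = f"Basic {token}"

    return handler


def bearer_auth(access_token: str) -> AuthHandler:
    """Bearer token handler, as issued by an OAuth flow."""

    def handler(headers: dict) -> None:
        headers["Authorization"] = f"Bearer {access_token}"

    return handler


def send_request(payload: dict, auth: AuthHandler) -> dict:
    """Transport-level send; the conceptual API never sees the auth choice."""
    headers: dict = {"Content-Type": "application/json"}
    auth(headers)  # protocol-specific concern, injected from outside
    # ... transmit payload with headers over the selected transport ...
    return headers


print(send_request({"verb": "completed"}, basic_auth("tla", "secret")))
```

Swapping `basic_auth` for `bearer_auth` changes the security protocol without touching either the payload or the conceptual API that produced it.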

5.3.4 Use Component Roles

We have defined our data models and APIs so they are data-centric rather than component-centric. How then are services and components defined? Let's consider a specific component as an example. The Evidence Mapper (see Figure 4 Meta-Adaptation System Diagram) needs Competency Framework data (part of the ScienceOfLearning data model), metadata from the Activity Index describing which Activities train which competencies (part of the Asset data model), and experience information from the LRS (part of the Learner data model). The Evidence Mapper outputs mastery estimates (part of the Learner data model). In each input and output case, it does not use the entire data model, but only specific parts of each one.

To allow different technology providers to keep their proprietary secrets and still participate in the TLA, it is not necessary to know what internal algorithms their components use. Instead, it is enough to ask that they describe what input they require and what output they produce. A component role can be loosely defined by the following:

- Which parts of each TLA Data Model the component consumes and produces
- Which parts of each of the TLA Conceptual APIs the component uses
- What promises the component makes about its output

If we encode these three pieces of information into a data file when creating a TLA instantiation and ask the developer to assign a name of their choice to associate with the role, then component types in a TLA instantiation can be data-driven using role definition files. An Evidence Mapper is an example of a named component that would be defined using a role definition file.
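As an illustration, a role definition file for the Evidence Mapper might look like the sketch below. The file format, key names, and dotted data-model paths are assumptions invented for this example; the TLA would need to standardize an actual encoding as part of its best practices.

```python
# Sketch of a role definition file for the Evidence Mapper, encoded as data.
# All keys and dotted paths are hypothetical; the structure simply records the
# three pieces of information described above, plus a developer-assigned name.
EVIDENCE_MAPPER_ROLE = {
    "role_name": "EvidenceMapper",           # developer-assigned name
    "consumes": [                            # parts of TLA Data Models read
        "ScienceOfLearning.CompetencyFramework",
        "Asset.ActivityIndex.competency_metadata",
        "Learner.ExperienceRecords",
    ],
    "produces": [                            # parts of TLA Data Models written
        "Learner.MasteryEstimates",
    ],
    "apis_used": [                           # Conceptual API sub-sections
        "ScienceOfLearningAPI.read",
        "AssetAPI.read",
        "LearnerAPI.read",
        "LearnerAPI.write",
    ],
    "promises": [                            # promises made about its output
        "mastery estimates updated after each new piece of evidence",
    ],
}
```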

The primary benefit of this approach is that the TLA ecosystem can support the creation of new component types to fit new use cases without modifying the specifications. Roles are documented as best practices and can be tailored to individual use cases. Another significant advantage is that data-driven definition of components facilitates automated discovery. A TLA instantiation could include a discovery service component that acts as a central dispatch. When a component wishing to participate in a TLA ecosystem comes online, it contacts the discovery service and registers itself as available and filling a role. When components need to find each other, they ask the discovery service to connect them dynamically by role.

The data-driven nature of roles will also facilitate automated content discovery. Imagine a content metadata elicitation component that has asked the discovery service for a standing notification every time a new component filling the Activity Provider role comes online. When a new Activity Provider registers, the metadata elicitation component could contact it, query it about its capabilities, and automatically populate the Activity Index, removing the need to populate the Activity Index manually before the TLA instantiation is brought online.
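The sketch below illustrates role-based registration, lookup, and standing notifications in miniature. The DiscoveryService class and its methods are hypothetical; a real discovery service would be a networked TLA component, not an in-process object.

```python
# Sketch of role-based registration and lookup in a discovery service.
# Names are hypothetical; this only demonstrates the interaction pattern.
from collections import defaultdict
from typing import Callable


class DiscoveryService:
    def __init__(self):
        self.by_role: dict[str, list[str]] = defaultdict(list)
        self.watchers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def register(self, role_name: str, endpoint: str) -> None:
        """A component announces that it is online and filling a role."""
        self.by_role[role_name].append(endpoint)
        for notify in self.watchers[role_name]:
            notify(endpoint)  # standing notifications, e.g. metadata elicitation

    def find(self, role_name: str) -> list[str]:
        """Connect components by role, not by hard-coded address."""
        return list(self.by_role[role_name])

    def watch(self, role_name: str, callback: Callable[[str], None]) -> None:
        """Ask to be told whenever a new component fills a role."""
        self.watchers[role_name].append(callback)


discovery = DiscoveryService()
# A metadata elicitation component requests a standing notification:
discovery.watch("ActivityProvider", lambda ep: print(f"query {ep} and index it"))
# A new Activity Provider comes online and registers itself:
discovery.register("ActivityProvider", "https://provider.example.org/api")
print(discovery.find("ActivityProvider"))
```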

6 APPENDIX

6.1 WHAT IS A PAL?

Because the term PAL has been so difficult to define, special attention was given to understanding it at the second FLUENT design sprint, held in ADL's office in Alexandria, VA on February 29 and March 1, 2016. The following definition was agreed upon:

The term PAL is a Marketing Construct.

The characteristics of PAL include the following:

- User interface that provides assistance
- Context awareness
- Personal recommendation generation

In the architecture diagram shown in Figure 3 2025 Architecture, no single component is the “PAL” component. Rather, multiple components would contribute to producing the characteristics attributed to the PAL marketing vision. For this reason, to avoid confusion, we will avoid using the term PAL when discussing engineering constructs.

6.2 YEAR1 TO YEAR2 API MAPPING

The draft APIs from Year1 (documented at https://confluence.soartech.com/display/FLUENTSHARE/Draft+API+Specifications#DraftAPISpecifications-TLAAPISpecifications) were intended as notional; their purpose was requirements elicitation. They were structured around the data needs of individual components in the meta-adaptation system architecture diagram. This was useful for eliciting requirements, but is not generally extensible to other design patterns. Their functionality will be redistributed among the new APIs, which address the same needs. The table below shows how the old APIs map to the new ones.

New Year2+ API Names | Old Year1 API Names
LearnerAPI           | xAPI, pAPI, cAPI, aAPI, lAPI, oAPI
ScienceOfLearningAPI | fAPI
AssetAPI             | dAPI, oAPI, iAPI
