TLC 2016 - A search engine for Blackboard Learn: the impossible made possible


Uploaded by: blackboardemea

Posted on 15-Apr-2017


A search engine for Blackboard Learn: the impossible made possible.

Overview
- Intro
- Why
- How
- What
- Future thoughts
- Q&A

Introduction
- KU Leuven Association

Introduction
Toledo ~ Blackboard++
- 32,000 active courses and organizations
- 135,000 active users
- 40,000-50,000 active users per day on average
- 85,000 active users on peak days
- Self-hosted since 2001
- Learn April 2014 release

Introduction
Content related
- 3,529,176 content items in total
- 2,765,371 documents (attachments) in total
- 10.6 TB CMS storage
- 500-1,000 changes/day
- Start of semester: 10,000-20,000 changes/day

Introduction
Team (7+4+1 FTE)
- Strong focus on technology and integration
- Centrally supported ICT tools for education
  - Blackboard
  - Exam setup
  - Streaming video
  - E-portfolio
  - E-bulletin board
  - MOOCs
  - Event subscription

Why
- Students/Staff
  - Find stuff
  - That's the way the internet, smartphones, computers and real life(?) work
- Policy makers: find what you search for
  - Knowledge workers spend from 15% to 35% of their time searching for information
  - Enterprise Search solution
    - Search over all environments (Toledo, SharePoint, website, wikis, blogs)
    - Improve/introduce search functionality in every environment
- Team
  - Any self-respecting site has a search
  - Because we can!

How: Hybrid Solution
Enterprise search: hybrid solution
- Microsoft SharePoint Search for Microsoft repositories
- ElasticSearch for other repositories (Blackboard, SAP, Plone, ...)
  - Scalable: adding nodes automatically redistributes the data over the available nodes
  - Highly available: automatically detects new or failed nodes, and reorganizes and rebalances data
  - Developer-friendly, RESTful API
  - Document-oriented: structured JSON documents
  - Real-time data and analytics
  - Built on top of Apache Lucene: a high-performance, full-featured information retrieval library
  - Apache 2 open source license

CON
- A connector framework has to be set up for access control (Google Connector Manager or ManifoldCF)
- For overarching search, a link between SharePoint and the Lucene solution has to be set up (with a focus on shielding the results)
- 2 search environments to manage

PRO
- Full integration with Microsoft Enterprise content
- Other environments (Toledo, Plone, wikis and vertical search) can be opened up more easily through existing connectors and with the expertise already present in the respective teams
- The search experience within an environment is the best possible (both for Microsoft and for the other environments)
- Flexible

How: Elastic Search

[Figure: Search architecture. Labels: query, repository, content; environments with their repository (Blackboard, SAP, Plone WCM); ElasticSearch.]

Notes:
- Querying and delivering content from the various environments
- Frontend and the connectors

How: complexity

How: evolution
- Indexing/crawling without filtering who can access what
  - Difficult to find what you search for: irrelevant search results
  - Not user-friendly: a search result may not be accessible
  - Not secure: context of a search result is displayed to users who have no access to it
- Periodical indexing/crawling: does not reflect the actual authorization in the repository
- Authorization = show search results depending on
  - Availability of the course/content item
  - Course/community membership
- Near-real-time indexing/crawling reflects changes in
  - Content
  - Roles: access to content items, by adding membership and availability to the content item
- Additional search index which reflects the memberships of each user
- Document-level security enforced by the frontend
  - Only search results the user has access to
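The idea of adding membership and availability to each content item can be sketched as below. This is a minimal illustration only: the field names (`roles`, `course_id`, ...), the `course:role` string format and the rule that staff also see unavailable items are assumptions, not the actual Toledo schema.

```python
def access_roles(course_id, course_available, item_available):
    """Return the role strings allowed to see a content item (hypothetical scheme)."""
    roles = [f"{course_id}:staff"]  # assumption: staff always see the item
    if course_available and item_available:
        roles.append(f"{course_id}:student")  # students only see available content
    return roles


def to_index_document(item):
    """Build the JSON-serializable document that would be sent to the search index."""
    return {
        "id": item["id"],
        "title": item["title"],
        "course_id": item["course_id"],
        "roles": access_roles(
            item["course_id"], item["course_available"], item["item_available"]
        ),
    }


visible = to_index_document({
    "id": "item-1", "title": "Syllabus", "course_id": "BEER101",
    "course_available": True, "item_available": True,
})
hidden = to_index_document({
    "id": "item-2", "title": "Draft exam", "course_id": "BEER101",
    "course_available": True, "item_available": False,
})
```

Because the roles travel with the document, a change in availability only requires re-indexing that one item.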

How: Current search architecture

[Figure: Current search architecture. Labels: query, repository, content, roles; environments with their repository (Blackboard, SAP, Plone WCM); connector; frontend; ElasticSearch.]

How: Architecture for indexing data in ElasticSearch

[Figure: Indexing architecture. Labels: Blackboard, filesystem, BbSync; Plone, PloneSync; SAP, ODataSync; RabbitMQ queues (file, index); queueSync; ElasticSearch. Legend: environments, reusable components, specific connectors.]

Notes:
- 140 person-days in 2015; demo at the Blackboard user conference in April 2016

How: Architecture for indexing data in ElasticSearch
- bbSync: pushes changes in Blackboard to the relevant queue (index, file)
- queueSync:
  - takes JSON documents off the index queue and indexes them in ElasticSearch
  - takes the CMS link out of a file queue document, parses the CMS file from the filesystem and indexes the result in ElasticSearch
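The index-queue half of queueSync could look roughly like the sketch below. It is not the project's code: an in-memory `queue.Queue` stands in for the RabbitMQ index queue, the index name and the `id` field are assumptions, and the function only builds the newline-delimited body that the Elasticsearch bulk API accepts instead of sending it.

```python
import json
import queue


def drain_to_bulk(index_queue, index_name):
    """Turn queued JSON documents into an Elasticsearch bulk request body (NDJSON)."""
    lines = []
    while True:
        try:
            message = index_queue.get_nowait()
        except queue.Empty:
            break  # queue is drained
        doc = json.loads(message)
        # Each document is preceded by an action line naming the index and id.
        action = {"index": {"_index": index_name, "_id": doc["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    return ("\n".join(lines) + "\n") if lines else ""


# Stand-in for messages that a connector such as bbSync would have published.
index_queue = queue.Queue()
index_queue.put(json.dumps({"id": "item-1", "title": "Syllabus"}))
index_queue.put(json.dumps({"id": "item-2", "title": "Week 1 slides"}))

body = drain_to_bulk(index_queue, "toledo")
```

Batching whatever is on the queue into one bulk body keeps the number of HTTP round trips low during peaks such as the start of a semester.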


How: Architecture for indexing data in ElasticSearch
- Every developed component and every RabbitMQ queue runs as a separate Docker container
- Scalable
  - Multiple instances of every component
  - Possibility to add one or more instances of a component
- Highly available
  - Queues
  - If an instance of a component stops working, another instance processes the queue entry
- Metrics for monitoring/alerting/graphs
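The high-availability behaviour above (an entry that one instance fails to process is picked up by another) can be modelled in a few lines. This is a single-process simulation under stated assumptions: `Worker` and `dispatch` are hypothetical names, an in-memory queue replaces RabbitMQ, and putting the entry back on the queue stands in for a message that was never acknowledged.

```python
import queue


class Worker:
    """One instance of a component; healthy=False simulates a crashed instance."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.processed = []

    def handle(self, entry):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        self.processed.append(entry)


def dispatch(entries, workers):
    """Offer each entry to the workers in turn; requeue it when a worker fails."""
    if not any(worker.healthy for worker in workers):
        raise RuntimeError("no healthy worker available")
    pending = queue.Queue()
    for entry in entries:
        pending.put(entry)
    turn = 0
    while not pending.empty():
        entry = pending.get()
        worker = workers[turn % len(workers)]
        turn += 1
        try:
            worker.handle(entry)
        except RuntimeError:
            pending.put(entry)  # not acknowledged: the entry goes back on the queue


crashed = Worker("queueSync-1", healthy=False)
standby = Worker("queueSync-2")
dispatch(["doc-1", "doc-2", "doc-3"], [crashed, standby])
```

After the run, every entry has been processed by the healthy instance, although not necessarily in the original order.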

Architecture for querying ElasticSearch

[Figure: Query architecture. Labels: query, repository, content, roles; connector; frontend; ElasticSearch.]

- Authentication
  - Basic Authentication for service accounts
  - OAuth integration possible
- Document-level security
  - Show only search results the user has access to

Notes:
- Non-JSON documents (e.g. PDF, Word, ...) are analyzed or parsed via fileSync

Architecture for querying ElasticSearch

- The search UI connects to the ElasticSearch frontend via a service account with Basic Authentication
- It hands over the user's search query along with the uid of the user
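A request from the search UI to the frontend might be built as in the sketch below. The URL, the `/search` path and the `q` and `uid` parameter names are assumptions for illustration; only the Basic Authentication header format (base64 of `user:password`) is standard. The request is constructed but not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_search_request(base_url, service_user, service_password, query, uid):
    """Build (but do not send) a search request authenticated as a service account."""
    params = urllib.parse.urlencode({"q": query, "uid": uid})
    credentials = f"{service_user}:{service_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(f"{base_url}/search?{params}")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical host, account and user id.
request = build_search_request(
    "https://search.example.org", "svc-search", "secret", "beer course", "u0123456"
)
```

The end user never authenticates against the frontend directly; the uid is passed along so that the frontend can resolve that user's roles.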

Architecture for querying ElasticSearch

- The ElasticSearch frontend first looks up the roles held by the user with that uid
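The role lookup could be expressed as a term query against the additional membership index, roughly as sketched here. The `uid` and `roles` field names are assumptions; the response shape (`hits.hits[]._source`) is the standard Elasticsearch search response layout.

```python
def role_lookup_query(uid):
    """Query body that fetches the membership document of one user."""
    return {"query": {"term": {"uid": uid}}, "size": 1}


def roles_from_response(response):
    """Extract the role list from an Elasticsearch search response."""
    hits = response.get("hits", {}).get("hits", [])
    if not hits:
        return []  # unknown user: no roles, so no results later on
    return list(hits[0]["_source"].get("roles", []))


# Hand-written response standing in for what the membership index might return.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"uid": "u0123456", "roles": ["BEER101:student", "MATH200:staff"]}}
        ]
    }
}
roles = roles_from_response(sample_response)
```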


Architecture for querying ElasticSearch

- The ElasticSearch frontend expands the user's search query with the user's roles
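One way to expand the query is to wrap it in a `bool` query and add the roles as a `terms` filter, as in this sketch. It assumes an Elasticsearch version that supports `filter` inside `bool`, and a `roles` field on every indexed document; whether the actual frontend builds its queries this way is not stated in the slides.

```python
import json


def expand_query(user_query, roles):
    """Combine the user's search text with a filter on the user's roles."""
    return {
        "query": {
            "bool": {
                # Scored part: what the user typed.
                "must": [{"simple_query_string": {"query": user_query}}],
                # Unscored part: the document must carry at least one of the roles.
                "filter": [{"terms": {"roles": list(roles)}}],
            }
        }
    }


expanded = expand_query("beer course", ["BEER101:student", "MATH200:staff"])
request_body = json.dumps(expanded)
```

Putting the roles in a filter rather than in the scored part keeps access control from influencing relevance ranking.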

Architecture for querying ElasticSearch

- ElasticSearch returns the search results that match the query, restricted to documents that carry one of the user's roles
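The effect of this document-level security can be illustrated with a small in-memory model. This is not how ElasticSearch evaluates queries; the documents, the role strings and the simple substring match are assumptions chosen to show that the same search text yields different results for different users.

```python
def visible_results(documents, text, roles):
    """Return ids of documents that match the text and share a role with the user."""
    wanted = set(roles)
    return [
        doc["id"]
        for doc in documents
        if text.lower() in doc["title"].lower() and wanted & set(doc["roles"])
    ]


documents = [
    {"id": "item-1", "title": "Beer course syllabus",
     "roles": ["BEER101:staff", "BEER101:student"]},
    {"id": "item-2", "title": "Beer course draft exam",
     "roles": ["BEER101:staff"]},  # not yet available to students
    {"id": "item-3", "title": "Statistics notes",
     "roles": ["MATH200:staff", "MATH200:student"]},
]

student_hits = visible_results(documents, "beer", ["BEER101:student"])
staff_hits = visible_results(documents, "beer", ["BEER101:staff"])
outsider_hits = visible_results(documents, "beer", ["MATH200:student"])
```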


How: the people
- Project manager Enterprise Search: 80 MD
  - Project management
  - Technical overview and key decisions
  - Management reporting & advice
  - Contact with software vendors
  - Inter-team facilitator
- 2 Java developers: 250 MD
  - Senior profiles
  - Extremely skilled (gurus)
  - DevOps
- Sysadmin team: 60 MD
  - Bleeding edge in several technologies
    - Docker
    - ElasticSearch
    - Agile deployment with Puppet
  - DevOps

What

What
- Random search
- Speed
- Amount of specific data

What

The Beer course

What
- Search as student vs. staff, respecting availability

What
- Full-text search (inside files)

Future thoughts
- Search UI improvements:
  - filters/facets
  - highlighting (context)
  - implement operators
  - phrase matching
- User testing
- Relevance ranking
- Vertical search applications
  - Specific view on the search index (a certain subset with specific filters/facets and ...)
  - Search announcements, courses, ...
- Invite a big LMS vendor to incorporate it
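Several of the UI improvements listed above map onto existing parts of the Elasticsearch query DSL: `aggs` for facets, `highlight` for context and `match_phrase` for phrase matching. The sketch below shows how one request body could combine them. The `content`, `course_id` and `roles` field names are assumptions, and this is an illustration of what is possible, not something the slides say was built.

```python
import json


def faceted_phrase_search(phrase, roles):
    """Request body combining phrase matching, highlighting and a facet."""
    return {
        "query": {
            "bool": {
                "must": [{"match_phrase": {"content": phrase}}],
                "filter": [{"terms": {"roles": list(roles)}}],
            }
        },
        # Return highlighted snippets of the matching text for context.
        "highlight": {"fields": {"content": {}}},
        # Facet: number of hits per course, usable as a filter in the UI.
        "aggs": {"by_course": {"terms": {"field": "course_id"}}},
    }


search_body = faceted_phrase_search("brewing process", ["BEER101:student"])
serialized = json.dumps(search_body)
```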


Q&A