cloud powered search

Upload: codecampiasi

Post on 16-Jul-2015

TRANSCRIPT

Page 1: Cloud powered search
Page 2: Cloud powered search

Cloud powered search

LIVIU MAZILU, RADU PINTILIE

Page 3: Cloud powered search

April 25, 2015 | Cloud powered search

© EXPERT NETWORK

CODECAMP

Previous subjects

Challenges in distributed applications

SQL Azure Federation

HDInsight

DocumentDB

Page 4: Cloud powered search

Agenda

Azure Search

The need for search

Search explained

Development

Case Scenarios

Page 5: Cloud powered search

The need for search

Why do we search for data?

How do we store it to search efficiently?

What’s important?

Page 6: Cloud powered search

Is this a search engine?

WHERE [field] LIKE '%codecamp%'

Page 7: Cloud powered search

WHAT IS A SEARCH ENGINE?

Efficient indexing of data: on all fields or combinations of fields

Analyzing data

Text search: tokenizing, stemming, filtering

Understanding locations

Relevance scoring
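The text-analysis steps named above (tokenizing, filtering, stemming) can be sketched in a few lines. This is only an illustration: the stop-word list and the suffix rules are hypothetical and far cruder than the analyzers a real engine such as Lucene ships with.

```python
import re

# Hypothetical stop-word list and suffix rules, chosen only for illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "is", "if"}
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token):
    """Strip one common English suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Tokenize, drop stop words, then stem what is left."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]


print(analyze("Check if SIM card exists"))  # ['check', 'sim', 'card', 'exist']
```

The terms returned by `analyze` are what would be stored in the index, so a query for "exist" can match a title that says "exists".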

Page 8: Cloud powered search

Lucene

Document: collection of fields

Field: string based key-value pair

Collection: set of documents

Inverted index: each term lists the documents that contain it

Score: relevance of each document matching the query

Page 9: Cloud powered search

How searching works

Id | Title | UserId | ViewCount | Tags
1 | Controller Action ambiguity even with [HttpPost] decoration? (ASP.NET MVC4) | 5 | 352 | asp.net asp.net-mvc asp.net-mvc-4 f#
2 | Why can't I use a scrollwheel on a webpage? | 6 | 109 | c# javascript asp.net asp.net-mvc-4 twitter-bootstrap-3
3 | Access session variable of one site in another | 7 | 78 | asp.net .net
4 | Check if SIM card exists | 5 | 209 | c# windows-phone-8

Page 10: Cloud powered search

How searching works: inverted index

Title | Id
Access session variable of one site in another | 3
Check if SIM card exists | 4
Controller Action ambiguity even with [HttpPost] decoration? (ASP.NET MVC4) | 1
Why can't I use a scrollwheel on a webpage? | 2

UserID | Ids
5 | 1, 4
6 | 2
7 | 3

ViewCount | Id
78 | 3
109 | 2
209 | 4
352 | 1

Page 11: Cloud powered search

Query: UserID = 5 returns documents 1 and 4
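A minimal sketch of the per-field inverted index used in this example, built in memory from the four sample rows. The function name `build_field_index` is hypothetical; a real engine stores these postings on disk in a far more compact form.

```python
from collections import defaultdict

# The four sample rows from the slides (titles and tags omitted for brevity).
posts = [
    {"Id": 1, "UserId": 5, "ViewCount": 352},
    {"Id": 2, "UserId": 6, "ViewCount": 109},
    {"Id": 3, "UserId": 7, "ViewCount": 78},
    {"Id": 4, "UserId": 5, "ViewCount": 209},
]


def build_field_index(rows, field):
    """Map each distinct value of `field` to the sorted ids of rows holding it."""
    index = defaultdict(list)
    for row in rows:
        index[row[field]].append(row["Id"])
    return {value: sorted(ids) for value, ids in index.items()}


user_index = build_field_index(posts, "UserId")
print(user_index[5])  # [1, 4]
```

Answering the query is a single dictionary lookup; no row is scanned, which is the difference from the `LIKE` approach shown earlier.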

Page 12: Cloud powered search

How searching works: full text search

Id | Tags
1 | asp.net asp.net-mvc asp.net-mvc-4 f#
2 | c# javascript asp.net asp.net-mvc-4 twitter-bootstrap-3
3 | asp.net .net
4 | c# windows-phone-8

Term | Doc
.net | 3
asp.net | 1, 2, 3
asp.net-mvc | 1
asp.net-mvc-4 | 1, 2
c# | 2, 4
f# | 1
javascript | 2
twitter-bootstrap-3 | 2
windows-phone-8 | 4

Page 13: Cloud powered search

Query: “javascript” in Tags returns document 2

Page 14: Cloud powered search

Query: “asp.net” in Tags returns documents 1, 2 and 3
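The term lookups above can be reproduced with a small in-memory term index over the Tags column. This is a sketch that assumes whitespace tokenization and exact term matching; the helper names are hypothetical.

```python
from collections import defaultdict

# Tags column of the four sample posts from the slides.
tags_by_id = {
    1: "asp.net asp.net-mvc asp.net-mvc-4 f#",
    2: "c# javascript asp.net asp.net-mvc-4 twitter-bootstrap-3",
    3: "asp.net .net",
    4: "c# windows-phone-8",
}


def build_term_index(docs):
    """Split each value on whitespace and record which documents hold each term."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def search(index, term):
    """Return the sorted ids of documents containing `term` exactly."""
    return sorted(index.get(term, ()))


def search_all(index, terms):
    """Return the sorted ids of documents containing every term (AND query)."""
    postings = [set(index.get(term, ())) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


term_index = build_term_index(tags_by_id)
print(search(term_index, "javascript"))             # [2]
print(search(term_index, "asp.net"))                # [1, 2, 3]
print(search_all(term_index, ["c#", "asp.net"]))    # [2]
```

Combining terms is a set operation on the posting lists: intersection for AND, union for OR.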

Page 15: Cloud powered search

Uses

Auto-completion

Page 16: Cloud powered search

Uses

Auto-correction

Phrasing: Iframe security / Security in an Iframe

Word-level distance: grey/gray, color/colour
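One common way to measure how close two spellings are is the Levenshtein edit distance. The sketch below assumes that measure for illustration; the slides do not say which distance a given engine uses, and the `suggest` helper is hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character edits."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def suggest(word, vocabulary, max_distance=1):
    """Return vocabulary terms within `max_distance` edits of `word`."""
    return [term for term in vocabulary if edit_distance(word, term) <= max_distance]


print(edit_distance("grey", "gray"))                   # 1
print(edit_distance("color", "colour"))                # 1
print(suggest("colour", ["color", "gray", "camp"]))    # ['color']
```

Both pairs from the slide are one edit apart, so a fuzzy match with a maximum distance of 1 treats them as the same word.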

Page 17: Cloud powered search

Elasticsearch

Distributed: aggregates the results of searches performed on multiple shards/indices

Schemaless: document oriented, supports the JSON format

RESTful: exposes a REST interface

Faceted search: support for navigational search functionality

Replication: supports index replication

Failover: replication and the distributed design provide built-in failover

Near real time: supports near-real-time updates

Page 18: Cloud powered search

Elasticsearch: distributed & highly available

• Multiple servers (nodes) running in a cluster, acting as a single service

• Nodes in the cluster either store data or only help speed up search queries

• Sharding: indices are sharded (the number of shards is configurable)

• Each shard can have zero or more replicas, placed on different servers (server pools) for failover

• One node in the cluster goes down? No problem.
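The idea behind sharding and replica placement can be sketched as follows. This is a simplification under stated assumptions: CRC32 stands in for whatever hash the engine really uses, and the round-robin placement in `place_copies` is a hypothetical policy, not Elasticsearch's allocator.

```python
import zlib


def shard_for(doc_id, shard_count):
    """Route a document to a shard by hashing its id (CRC32 used for illustration)."""
    return zlib.crc32(str(doc_id).encode("utf-8")) % shard_count


def place_copies(shard_count, replica_count, nodes):
    """Assign each shard's primary and replicas to distinct nodes, round-robin."""
    if replica_count + 1 > len(nodes):
        raise ValueError("need at least one node per copy of a shard")
    layout = {}
    for shard in range(shard_count):
        layout[shard] = [nodes[(shard + copy) % len(nodes)]
                         for copy in range(replica_count + 1)]
    return layout


def shards_still_available(layout, failed_node):
    """Shards that keep at least one copy when `failed_node` goes down."""
    return [shard for shard, hosts in layout.items()
            if any(host != failed_node for host in hosts)]


layout = place_copies(shard_count=3, replica_count=1,
                      nodes=["node-a", "node-b", "node-c"])
print(layout[0])                                  # ['node-a', 'node-b']
print(shards_still_available(layout, "node-a"))   # [0, 1, 2]
```

Because every shard has a copy on a second node, losing any single node leaves all three shards searchable.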

Page 19: Cloud powered search

Azure Search

Elasticsearch as a managed service

Platform as a service (PaaS)

Administration through a REST API

Data exchange in JSON

Page 20: Cloud powered search

Where are we at

Service | Ease of use | Scalability | Easy administration
Manual search (SQL) | No | No | Partial
Elasticsearch | Yes | Yes | No
Azure Search | Yes | Yes | Yes

Page 21: Cloud powered search

Azure Search: resource model

Service
    Index (schema type 1)
    Index (schema type 2)
        Document
        Document
            Field1
            Field2
            Field3
            Field4
    Indexers

Page 22: Cloud powered search

Demo: Management Portal

Page 23: Cloud powered search

Azure Search: index creation

POST https://codecamp-en.search.windows.net/indexes

{
    "name": "stackoverflow-posts",
    "fields": [
        {
            "name": "name_of_field",
            "type": "data_type",
            "searchable": true (default where applicable) | false,
            "filterable": true (default) | false,
            "sortable": true (default where applicable) | false,
            "facetable": true (default where applicable) | false,
            "key": true | false (default),
            "retrievable": true (default) | false
        }
    ],
    …
}
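A sketch of how the index-creation request could be prepared from Python. Several details are assumptions that do not appear on the slide: the `api-version` value, the `api-key` header, the `Edm.*` type names and the attributes picked for each column. The key field is declared as a string because, as far as we know, the service requires a string key. The request is only built, not sent.

```python
import json
import urllib.request

# Assumptions not shown on the slide: the api-version value, the api-key header
# and the field types/attributes chosen for each column.
SERVICE_URL = "https://codecamp-en.search.windows.net"
API_VERSION = "2015-02-28"

index_definition = {
    "name": "stackoverflow-posts",
    "fields": [
        {"name": "Id", "type": "Edm.String", "key": True},
        {"name": "Title", "type": "Edm.String", "searchable": True},
        {"name": "Tags", "type": "Edm.String", "searchable": True},
        {"name": "ViewCount", "type": "Edm.Int32", "filterable": True, "sortable": True},
    ],
}


def build_create_index_request(definition, api_key):
    """Prepare (but do not send) the POST request that creates an index."""
    return urllib.request.Request(
        url=f"{SERVICE_URL}/indexes?api-version={API_VERSION}",
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


request = build_create_index_request(index_definition, api_key="<admin-key>")
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a live service or a real key.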

Page 24: Cloud powered search

Azure Search: index documents

Indexers

Data sources: Azure SQL Database, DocumentDB

An indexer connects a data source with a target search index

An indexer can be used in the following ways:

• One-time copy of the data to populate an index

• Sync an index with changes from the data source on a schedule

• Invoke on demand to update an index as needed

Page 25: Cloud powered search

Azure Search: CRUD operations

Add, Update, Delete

POST https://codecamp-en.search.windows.net/indexes/stackoverflow/docs/index

{
    "@search.action": "upload (default) | merge | mergeOrUpload | delete",
    "key_field_name": "unique_key_of_document",    (key/value pair for the key field from the index schema)
    "field_name": field_value                      (key/value pairs matching the index schema)
}
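A sketch of building a batch of document actions in Python. The slide shows a single action object; wrapping the actions in a `value` array is an assumption about the request envelope that should be checked against the REST API reference. The field names come from the sample data set, and the helper names are hypothetical.

```python
import json

# The four actions listed on the slide.
VALID_ACTIONS = {"upload", "merge", "mergeOrUpload", "delete"}


def make_action(action, document):
    """Tag one document with the indexing action to apply to it."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return {"@search.action": action, **document}


def make_batch(actions):
    """Serialize the actions inside a `value` array (assumed request envelope)."""
    return json.dumps({"value": list(actions)})


batch = make_batch([
    make_action("upload", {"Id": "4", "Title": "Check if SIM card exists", "ViewCount": 209}),
    make_action("merge", {"Id": "1", "ViewCount": 353}),
    make_action("delete", {"Id": "3"}),
])
print(batch)
```

A `merge` only needs the key and the fields that change, and a `delete` only needs the key.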

Page 26: Cloud powered search

Azure Search: searching through data

GET https://codecamp-en.search.windows.net/indexes/stackoverflow/docs?

search=[string]

    +      AND operator: "code+camp" matches documents containing both "code" and "camp"
    |      OR operator: "code|camp" matches documents containing "code", "camp", or both
    -      NOT operator: "code -camp" matches documents that contain "code" and/or do not contain "camp", depending on searchMode
    *      Suffix operator: "cod*" matches terms starting with "cod", ignoring case
    " "    Phrase search operator
    ( )    Precedence operator: code+(camp|workshop)

searchMode=any|all

searchFields=[string]
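Operators such as `+`, `|` and parentheses have to be URL-encoded when they travel in a query string. A sketch of building the search URL with the standard library follows; the `api-version` value and the field names `Title` and `Tags` are assumptions, and `build_search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Index URL from the slide; the api-version value is an assumption.
DOCS_URL = "https://codecamp-en.search.windows.net/indexes/stackoverflow/docs"


def build_search_url(search, search_mode="any", search_fields=None,
                     api_version="2015-02-28"):
    """Build the GET URL for a search, percent-encoding the query operators."""
    if search_mode not in ("any", "all"):
        raise ValueError("searchMode must be 'any' or 'all'")
    params = {"api-version": api_version, "search": search, "searchMode": search_mode}
    if search_fields:
        params["searchFields"] = ",".join(search_fields)
    return f"{DOCS_URL}?{urlencode(params)}"


url = build_search_url("code+(camp|workshop)", search_mode="all",
                       search_fields=["Title", "Tags"])
print(url)
```

Without the encoding step a literal `+` in a query string is read as a space, which would silently turn the AND query into a two-term query.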

Page 27: Cloud powered search

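Azure Search: filtering results

$filter=[string] (OData syntax)

$skip=# (number of results to skip)

$top=# (number of results to return)

$count=true|false (include the total number of matches)

$orderby=[string]

$select=[string] (fields to return)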
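Paging with `$skip` and `$top` is simple arithmetic on the page number. The sketch below assumes that `ViewCount` and `OwnerUserId` were declared filterable in the index; the helper name and the page size are hypothetical.

```python
from urllib.parse import urlencode


def build_filter_params(filter_expr, page, page_size, order_by=None, select=None):
    """Translate a 1-based page number into $skip/$top and attach the OData options."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    params = {
        "$filter": filter_expr,
        "$skip": (page - 1) * page_size,
        "$top": page_size,
        "$count": "true",
    }
    if order_by:
        params["$orderby"] = order_by
    if select:
        params["$select"] = ",".join(select)
    return params


params = build_filter_params("ViewCount gt 100 and OwnerUserId eq 5",
                             page=3, page_size=20,
                             order_by="ViewCount desc", select=["Id", "Title"])
print(params["$skip"], params["$top"])  # 40 20
print(urlencode(params))
```

Page 3 with 20 results per page skips the first 40 matches and returns the next 20.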

Page 28: Cloud powered search

Azure Search: emphasizing results

facet=[string] (field names), with optional parameters: count, sort, values, interval

highlight=[string] (field names)

highlightPreTag=[string] (default is <em>)

highlightPostTag=[string] (default is </em>)
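Conceptually, a facet on a field is a count of matching documents per distinct value of that field. The sketch below computes such counts locally over the four sample posts to show the idea; it is not how the service implements faceting, and treating the Tags string as whitespace-separated values is an assumption.

```python
from collections import Counter

# Tags of the four sample posts, standing in for a result set.
results = [
    {"Id": 1, "Tags": "asp.net asp.net-mvc asp.net-mvc-4 f#"},
    {"Id": 2, "Tags": "c# javascript asp.net asp.net-mvc-4 twitter-bootstrap-3"},
    {"Id": 3, "Tags": "asp.net .net"},
    {"Id": 4, "Tags": "c# windows-phone-8"},
]


def facet_counts(docs, field, top=10):
    """Count how many documents carry each value of `field`, most frequent first."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc[field].split()))
    return counts.most_common(top)


print(facet_counts(results, "Tags", top=1))  # [('asp.net', 3)]
```

The `count` option on the slide corresponds to the `top` argument here: it limits how many facet values come back.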

Page 29: Cloud powered search

Azure Search: suggestions

GET https://codecamp-en.search.windows.net/indexes/stackoverflow/docs/suggest

search=[string]

suggesterName=[string]

fuzzy=[boolean]

searchFields=[string]

Page 30: Cloud powered search

Sample data: Stack Overflow posts

5,215,584 records

212 MB in Title column

118 MB in Tags column

10.5 GB in Body column

Column name | Data type
Id | int
CreationDate | datetime
Score | float
ViewCount | int
Body | nvarchar
OwnerUserId | int
Title | nvarchar
Tags | nvarchar

Page 31: Cloud powered search

Demo: Search API

Page 32: Cloud powered search

Scaling

Capacity is measured in search units

1 search unit = 1 partition and 1 replica

Horizontal scaling by increasing the number of partitions and/or replicas

Page 33: Cloud powered search

Storage

Partition limits: 15 million documents, 25 GB of data

Every index is split into 12 shards by default

Each partition can store 1, 2, 3, 4, 6 or 12 shards
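The limits above lend themselves to a quick capacity estimate. The sketch below makes three assumptions beyond the slides: partition counts are restricted to the divisors of 12 so that the shards spread evenly, search units are the product of partitions and replicas, and raw column sizes are a fair proxy for index size.

```python
import math

# Limits quoted on the slides (2015 figures).
MAX_DOCS_PER_PARTITION = 15_000_000
MAX_GB_PER_PARTITION = 25
SHARDS_PER_INDEX = 12
ALLOWED_PARTITION_COUNTS = (1, 2, 3, 4, 6, 12)  # assumed: divisors of 12


def partitions_needed(doc_count, size_gb):
    """Smallest allowed partition count that fits both the document and size limits."""
    required = max(math.ceil(doc_count / MAX_DOCS_PER_PARTITION),
                   math.ceil(size_gb / MAX_GB_PER_PARTITION), 1)
    for count in ALLOWED_PARTITION_COUNTS:
        if count >= required:
            return count
    raise ValueError("data set exceeds the capacity of 12 partitions")


def shards_per_partition(partitions):
    """The 12 shards of an index are spread evenly over the partitions."""
    return SHARDS_PER_INDEX // partitions


def search_units(partitions, replicas):
    """Search units consumed: partitions multiplied by replicas (assumed)."""
    return partitions * replicas


# Sample data set from the talk: 5,215,584 posts and about 10.83 GB of text
# (0.212 GB of titles + 0.118 GB of tags + 10.5 GB of bodies).
p = partitions_needed(5_215_584, 0.212 + 0.118 + 10.5)
print(p, shards_per_partition(p), search_units(p, replicas=2))  # 1 12 2
```

Under these assumptions the sample data set fits in a single partition, and adding a second replica for availability brings the service to two search units.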

Page 34: Cloud powered search

USE CASE SCENARIOS

Online retail/ecommerce

User generated/social content

Not just for the web

Hybrid applications

Page 35: Cloud powered search

Conclusions

The need for search

Search explained

Development

Case Scenarios

Page 36: Cloud powered search

Questions?

Page 37: Cloud powered search

THANK YOU