FAST Search for SharePoint

Post on 25-Jan-2015

Category:

Technology

DESCRIPTION

This month C/D/H, with partners BA Insight and Microsoft, hosted a half-day seminar on SharePoint 2010 & FAST Search for SharePoint – and using it as a single, enterprise-wide search tool. View C/D/H’s FAST SharePoint slide deck to see real-world examples of search-driven information portals. We’ll also show you how FAST can dramatically improve end-user productivity. And for more on Search and other topics, visit our blog at www.cdhtalkstech.com.

TRANSCRIPT

C D H

Transform Enterprise Search with FAST Search for SharePoint

Quick Facts

About Us

• 22nd Year

• Grand Rapids & Royal Oak

• 30 Staff

Approach

• Vendor Independent

• Non-reseller

• Professional Services Only

Partnerships

• Microsoft Gold

• VMware Enterprise

• Citrix Silver

• Novell Gold

• Cisco Premier

Expertise

C/D/H Talks Tech

About me

David Tappan
Consultant
IOAp, MCITP, MCTS: SharePoint
davidt@cdh.com

FAST Search: Better Insight

Agenda: Insight

• How FAST increases insight

• Insight into how FAST is used to solve specific business problems

• Insight into what FAST Search high availability really requires

A question

What is Search, really?

One answer

“Search is the ability to find text strings in documents”

The Problem: Hidden meaning in the searcher’s intent

“What should I know about selling ERP?”

- Alan Brewer, Sales Lead

“What should I know about implementing ERP?”

- Renee Lo, Consultant

Another answer

“Search is the ability to query any document property”

Recommended reading

• http://www.well.com/~doctorow/metacrap.htm

A better answer

Search is a service that matches what you mean with what documents mean.

How FAST Search for SharePoint enables better meaning extraction

Cool FAST solutions

F4SP Architecture Basics

In the box: Dynamic rank algorithms at query time

• Context: query terms in title vs. body

• Query term proximity: «Bill Gates» vs. «Bill saw the gates»

• «Anchors» match query terms: «...a page about Bill Gates...»

• Click history match: others clicked a hit for «Bill Gates»
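To make the four query-time signals concrete, here is a toy scoring sketch. It is not FAST's ranking model: the function names, the weights (3.0, 1.0, 2.0, 1.5) and the way the signals are summed are all assumptions chosen only to show how title context, term proximity, anchor text and click history could each move a score.

```python
import math
import re


def tokens(text):
    """Lower-case word tokens, punctuation dropped."""
    return re.findall(r"\w+", text.lower())


def min_gap(words, a, b):
    """Smallest distance between any occurrence of a and of b, or None."""
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


def dynamic_score(query, title, body, anchors=(), clicks=0):
    """Illustrative query-time score; every weight here is made up."""
    terms = tokens(query)
    if not terms:
        return 0.0
    title_words, body_words = tokens(title), tokens(body)
    score = 0.0
    # Context: a hit in the title counts more than a hit in the body.
    for term in terms:
        if term in title_words:
            score += 3.0
        elif term in body_words:
            score += 1.0
    # Proximity: adjacent query terms that sit close together in the body.
    for a, b in zip(terms, terms[1:]):
        gap = min_gap(body_words, a, b)
        if gap:
            score += 2.0 / gap
    # Anchors: link text pointing at the document contains every query term.
    score += 1.5 * sum(all(t in tokens(anchor) for t in terms) for anchor in anchors)
    # Click history: damped count of earlier clicks for this query.
    score += math.log1p(clicks)
    return score
```

With these made-up weights, a body containing «Bill Gates» outscores one containing «Bill saw the gates» for the query "Bill Gates", because the proximity term is larger when the words are adjacent.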

Customizable Query Processing

What is someone thinking about when they perform a query?

Looking for a knowledge management solution?!?!?

I love SharePoint

It’s the best Knowledge Management Solution in the market

Have you ever built an e-commerce solution on it?

Our focus is knowledge management, and it just works!

We use it as a web content management system, and we’re so happy with it

Great for WCM, Great for KM!

Just deployed for KM… so good, so far… will get back once the pilot is over!

Search and the activity feed

Knowledge Management

Web Content Management

E-Commerce

For the geeks…

fql = xrank(string("fast search"),
            or(department:or(string("services"),
                             string("engineering")),
               keywords:string("knowledge management")),
            boost=10000)
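For anyone assembling such a query in code, a minimal sketch of building the same xrank expression from small helper functions follows. The helper names (fql_string, fql_or, fql_scoped, fql_xrank) are hypothetical, and the backslash escaping of quotes inside string() is an assumption to check against the FQL reference.

```python
def fql_string(text):
    """Wrap text in an FQL string() operand (escaping rule is an assumption)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'string("{escaped}")'


def fql_or(*operands):
    """Combine operands with FQL or()."""
    return "or(" + ", ".join(operands) + ")"


def fql_scoped(prop, operand):
    """Restrict an operand to one managed property."""
    return f"{prop}:{operand}"


def fql_xrank(main, boost_if, boost):
    """Match on `main`; add `boost` rank points where `boost_if` also matches."""
    return f"xrank({main}, {boost_if}, boost={int(boost)})"


query = fql_xrank(
    fql_string("fast search"),
    fql_or(
        fql_scoped("department",
                   fql_or(fql_string("services"), fql_string("engineering"))),
        fql_scoped("keywords", fql_string("knowledge management")),
    ),
    boost=10000,
)
print(query)
```

The point of xrank is that the second operand never filters results; it only raises the rank of hits that also match it, which is how a searcher's department or interests can reshape the ordering.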

In the box: Static rank algorithms at content processing time

• Landing pages: prefer shallow URLs

• Authority: links from other pages

• High quality: boost sites/documents
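The static signals are independent of any query, so they can be computed once per document at content processing time. The sketch below illustrates that idea only; the formula, the weights and the function names are assumptions and do not describe FAST's actual static rank.

```python
import math
from urllib.parse import urlparse


def url_depth(url):
    """Number of non-empty path segments; 0 for a site root."""
    return len([seg for seg in urlparse(url).path.split("/") if seg])


def static_rank(url, inbound_links=0, quality_boost=0.0):
    """Illustrative query-independent score (made-up formula)."""
    landing = 1.0 / (1 + url_depth(url))   # shallow URLs score higher
    authority = math.log1p(inbound_links)  # more links from other pages, higher score
    return landing + authority + quality_boost


root = static_rank("http://intranet/")
deep = static_rank("http://intranet/sites/hr/docs/policy.docx")
```

Here `root` is larger than `deep` because only the URL depth differs; passing `inbound_links` or `quality_boost` would model the authority and high-quality boosts.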

Customizable content processing: How to Index Content by Location?

• Address, intersection, zip code, names, etc.
   – One Microsoft Way, Redmond, WA

• Geodetic coordinates (latitude & longitude)
   – 47.639767, -122.129755
   – Degrees, minutes, seconds
      • 47° 38’ 23.16” N, 122° 7’ 47.1” W

• Universal Transverse Mercator (UTM)
   – 10N 565367 5276630

• Military Grid Reference System (MGRS)
   – 10T ET 65367 76630

Index Schema (Managed Properties)
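Whatever notation the content uses, range queries need a single numeric form in the index. As a small worked example, this sketch converts the degrees, minutes, seconds notation above into decimal degrees. The function name and the regular expression are assumptions for illustration. UTM and MGRS conversion needs a projection library and is not shown.

```python
import re

# Matches e.g. 47° 38’ 23.16” N, with straight or typographic quote marks.
DMS_RE = re.compile(
    r"""(\d+)\s*°\s*(\d+)\s*['’]\s*(\d+(?:\.\d+)?)\s*["”]\s*([NSEW])"""
)


def dms_to_decimal(text):
    """Convert one degrees/minutes/seconds value to signed decimal degrees."""
    match = DMS_RE.search(text)
    if match is None:
        raise ValueError(f"not a DMS coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and West are negative by convention.
    return -value if hemisphere in "SW" else value


latitude = dms_to_decimal("47° 38’ 23.16” N")
longitude = dms_to_decimal("122° 7’ 47.1” W")
```

The two results agree with the decimal pair on the slide (47.639767, -122.129755) to about five decimal places.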

Geographic entity extraction

• Requirement
   – Parse elements from text
   – Tag documents with the individual values

• Solution
   – Custom regular expression extraction
   – Call Bing Maps API
   – Return latitude and longitude and store as crawled property

{ name: 'Microsoft',
  address: 'One Microsoft Way, Redmond, WA 98052',
  phone: '1-800-Microsoft (642-7676)',
  path: 'http://www.microsoft.com',
  latitude: '47.639767',
  longitude: '-122.129755' }
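The extract-then-geocode flow can be sketched roughly as follows. The address pattern is a deliberately narrow assumption (US-style street, city, two-letter state, five-digit ZIP), and `fake_geocode` is a hypothetical stand-in for the Bing Maps call: the real request format is not shown on the slide, so no actual API is called here.

```python
import re

# Narrow, illustrative pattern; real address extraction needs far more cases.
ADDRESS_RE = re.compile(
    r"\b((?:\d+|One|Two|Three)\s+(?:[A-Z][\w.]*\s+)*?"
    r"(?:Way|Street|St|Avenue|Ave|Road|Rd|Boulevard|Blvd)\.?)"
    r",\s*([A-Z][A-Za-z ]+),\s*([A-Z]{2})\s+(\d{5})\b"
)


def extract_locations(text, geocode):
    """Find addresses in text and tag each with coordinates from `geocode`."""
    results = []
    for match in ADDRESS_RE.finditer(text):
        street, city, state, zip_code = match.groups()
        address = f"{street}, {city}, {state} {zip_code}"
        coords = geocode(address)  # returns (lat, lon) or None
        if coords is None:
            continue
        latitude, longitude = coords
        results.append(
            {"address": address, "latitude": latitude, "longitude": longitude}
        )
    return results


def fake_geocode(address):
    """Hypothetical geocoder: a lookup table instead of a web service."""
    known = {"One Microsoft Way, Redmond, WA 98052": (47.639767, -122.129755)}
    return known.get(address)


found = extract_locations(
    "Visit us at One Microsoft Way, Redmond, WA 98052.", fake_geocode
)
```

Passing the geocoder in as a function keeps the extraction step testable without network access. In a real pipeline the returned values would be written to crawled properties and mapped to managed properties.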

How they did it

[Architecture diagram. Labels: End Users; Data Sources; Search Center; Federation; OpenSearch Source; Query Processor; Indexer; Index Partitions; Feeders; Content Processor (Format Conversion, Language Detection, Entity Extraction, Lemmatization, Mapper, …); Geo-coding with Bing Maps API.]

Geographic queries

and(
  YOUR_TERM(s)_HERE,
  maxlatitude:range(LOW_LAT,max),
  minlatitude:range(min,HIGH_LAT),
  maxlongitude:range(LOW_LON,max),
  minlongitude:range(min,HIGH_LON)
)

e.g. and(football,maxlatitude:range(12,max),minlatitude:range(min,34),maxlongitude:range(56,max),minlongitude:range(min,78))
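The template above can be filled in programmatically. This sketch assumes the four managed property names from the slide and integer bounds. The function name is hypothetical, and the way non-integer values must be written in range() should be checked against the FQL reference.

```python
def bounding_box_fql(terms, low_lat, high_lat, low_lon, high_lon):
    """Build an FQL and() that keeps documents overlapping a lat/lon box."""
    if low_lat > high_lat or low_lon > high_lon:
        raise ValueError("low bound must not exceed high bound")
    clauses = [
        terms,
        # A document overlaps the box when its own extent reaches into it:
        f"maxlatitude:range({low_lat},max)",
        f"minlatitude:range(min,{high_lat})",
        f"maxlongitude:range({low_lon},max)",
        f"minlongitude:range(min,{high_lon})",
    ]
    return "and(" + ",".join(clauses) + ")"


query = bounding_box_fql("football", 12, 34, 56, 78)
print(query)
```

Each document carries a min/max pair per axis, so the four range clauses together express an overlap test between the document's extent and the searched box.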

Takeaways

• Search ain’t beanbag

• http://www.well.com/~doctorow/metacrap.htm

• FAST Search for SharePoint provides tools to extract MEANING from content and queries

Scaling FAST Search: What it takes

FAST Search for SharePoint scaleout

Scale-out: multiple “dimensions”

• Query Volume

• Content Volume

• Indexing freshness

Redundancy options

• Search

• Indexing

Performance targets*

• 15M docs/node

• 25 QPS/node

• 50 docs/sec

*Depends on content and hardware specifics

[Diagram: axes Content Volume and Query Volume; layers Search and Indexing, Crawling and Content Processing, Query and Result Processing.]

No theoretical upper bounds!
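The performance targets lend themselves to back-of-the-envelope sizing. The sketch below is one way to do that arithmetic, under stated assumptions: content volume sets the number of index partitions ("columns"), query volume sets the number of replicas ("rows"), and 50 docs/sec is treated as a single overall feed rate. The function name and this multiplication model are illustrative, and, as the slide's footnote says, real figures depend on content and hardware.

```python
import math

DOCS_PER_NODE = 15_000_000   # slide target: 15M docs/node
QPS_PER_NODE = 25            # slide target: 25 QPS/node
FEED_DOCS_PER_SEC = 50       # slide target: 50 docs/sec (assumed overall rate)


def size_farm(total_docs, peak_qps):
    """Rough node count and full-index time from the slide's targets."""
    columns = max(1, math.ceil(total_docs / DOCS_PER_NODE))  # content volume
    rows = max(1, math.ceil(peak_qps / QPS_PER_NODE))        # query volume
    full_index_hours = total_docs / FEED_DOCS_PER_SEC / 3600
    return {
        "index_columns": columns,
        "search_rows": rows,
        "search_nodes": columns * rows,
        "full_index_hours": round(full_index_hours, 1),
    }


estimate = size_farm(total_docs=40_000_000, peak_qps=60)
```

For 40M documents at a peak of 60 QPS this yields 3 columns and 3 rows, 9 search nodes in total, and roughly 222 hours to feed the whole corpus once at 50 docs/sec.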

Don’t forget SharePoint!

[Diagram: crawl flow from the FAST Content SSA to FAST Search. Labels: FAST Content SSA Admin DB; FAST Content SSA Crawl DB; Admin component; Master Crawl component; Crawl components; Crawl data, crawl history, crawl queue additions; Request crawl; Poll request; Log request; Distribute work; Web crawls; Document batches; Content Web Service; Database.]

SharePoint Search components

[Diagram. SharePoint Server, all components on one server: Query, Crawl, Admin, Index P1. Database Server, all databases on one instance: Crawl, Admin, Props.]

Search deployment: Query layer build out

[Diagram. SharePoint Servers, query components on multiple servers, index re-partitioned: Query components over Index P1 and Index P2. Crawl and Admin components remain on the original server. Database Server, all databases on one instance: Crawl, Admin, Props.]

Search deployment: Crawl layer build out

[Diagram. SharePoint Servers, crawl components on multiple servers: Crawl components plus Admin. SharePoint Servers, query components on multiple servers, index re-partitioned: Query components over Index P1 and Index P2. Database Server, all databases on one instance: Crawl, Admin, Props.]

Thank You

Royal Oak
306 S. Washington Ave.
Suite 212
Royal Oak, MI 48067
p: (248) 546-1800

Grand Rapids
15 Ionia SW
Suite 270
Grand Rapids, MI 49503
p: (616) 776-1600

(c) C/D/H 2007. All rights reserved. www.cdh.com