being rdbms free -- alternate approaches to data persistence

Post on 14-Jul-2015

150 Views

Category:

Technology

4 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Being RDBMS FreeAlternative Approaches to Data PersistenceDAVID HOERSTER

About MeC# MVP (Since April 2011)

Sr. Solutions Architect at Confluence

One of the Conference Organizers for Pittsburgh TechFest

Past President of Pittsburgh .NET Users Group and organizer of recent Pittsburgh Code Camps and other Tech Events

Twitter - @DavidHoerster

Blog – http://blog.agileways.com

Email – david@agileways.com

Goals

To allow you to achieve a zen-like state by never having to decide between a left and right outer join

GoalsThat a non-relational solution may be considered an option

What are some areas of a traditional application that could be a non-relational solution

Introduce some non-relational tools

How those tools would be used in a .NET solution (CODE!)

Traditional ArchitectureData persistence is central to application

Generally monolithic

Jack of all trades; master of none

Traditional Architecture

ClientWeb

ServerApp

ServerData

Repository

App DataSessionCache (?)Full Text SearchAudit

Consider…An online employment application

Wizard interface, with 9-12 steps

Most data is 1:1 across steps, but some data is 1:many

How to best structure 1:1 data◦ 6-8 tables, linked by ID?

◦ Or one wide table with lots of nullable columns?

◦ What about joining?

How about 1:many data◦ Several tables with 1:* relationships, which also needs to be joined

Don’t forget searching!!!

Applicant

General

Disclosure

Attestation

Skills

EmplEducat’

n

Database ThawI'm confident to say that if you starting a new strategic enterprise application you should no longer be assuming that your persistence should be relational. The relational option might be the right one - but you should seriously look at other alternatives.

-- Martin Fowler (http://martinfowler.com/bliki/PolyglotPersistence.html)

Monolithic Data PersistenceProvides consistency, but…

Is it always best tool for all jobs?

Is it easy for prototyping / rapid development?

Consider◦ How data will be used

◦ What kinds of data you’ll have

Why Non-RelationalUse Case – Company Intranet / CMS

Overall object is a CMS-like app for a company’s intranet content

Usage is mostly read-only, with pages and attachments◦ Pages, attachments, searching, admin, etc.

Traditional database could be multiple tables with 1:1 relationships and some 1:many relationships

Lots of joins for a page

…or a single document

What if…We could break some pieces out◦ Flatten structures for querying

◦ Highly efficient search services

◦ Pub/sub hubs

◦ Remote caching with excellent performance

◦ Session management outside a DB for load balanced environments

How would app then be architected?

…but consider the costsLearning curve

Distributed systems

Compensating transactions

Consider this with◦ Data

◦ Searching

◦ Caching/Session

◦ Auditing

Data StorageTypically, RDBMS is the de facto standard◦ SQL Server

◦ MySQL

◦ PostgreSQL

◦ Oracle (Yikes!!)

But do you really need it?

Data StorageGet all the orders for user ‘David’ in last 30 days

SELECT c.FirstName, c.MiddleName, c.LastName, soh.SalesOrderID, soh.OrderDate,sod.UnitPrice, sod.OrderQty, sod.LineTotal,p.Name as 'ProductName', p.Color, p.ProductNumber,pm.Name as 'ProductModel',pc.Name as 'ProductCategory',pcParent.Name as 'ProductParentCategory'

FROM SalesLT.Customer c INNER JOIN SalesLT.SalesOrderHeader sohON c.CustomerID = soh.CustomerIDINNER JOIN SalesLT.SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderIDINNER JOIN SalesLT.Product p ON sod.ProductID = p.ProductIDINNER JOIN SalesLT.ProductModel pm ON p.ProductModelID = pm.ProductModelIDINNER JOIN SalesLT.ProductCategory pc ON p.ProductCategoryID = pc.ProductCategoryIDINNER JOIN SalesLT.ProductCategory pcParent ON pc.ParentProductCategoryID = pcParent.ProductCategoryID

WHERE c.FirstName = 'David'AND soh.OrderDate > (GETDATE()-30)

Data Storage

Wouldn’t it be great if it were something like this?

SELECT FirstName, MiddleName, LastName, SalesOrderID, OrderDate,UnitPrice, OrderQty, LineTotal, ProductName, Color, ProductNumber,ProductModel, ProductCategory, ProductParentCategory

FROM CustomerSalesWHERE FirstName = 'David'

AND OrderDate > (GETDATE()-30)

Data StorageMaybe a document database can be of use

Number out there◦ MongoDB

◦ RavenDB

◦ Couchbase

Consolidated structures without relational ties to other collections

Object databases

Why Document DatabaseQuick prototyping

Application usage that lends itself to persisting objects

Consider usage of your data before using

Avoid “cool factor”

Consider performance◦ “NoSQL is so much faster...”

◦ Um, not always…

Looking at MongoDBServer can have databases

Databases contain collections (like a table)

Collections contain documents (like rows)

Documents can be structured, have hierarchies, indexes, primary key

Working with Mongo’s C# Clientpublic class MongoContext<T> : IContext<T> where T : class, new() {

private IDictionary<String, String> _config;private readonly MongoCollection<T> _coll;

public MongoContext(IDictionary<String, String> config) {_config = config;

var client = new MongoClient(config["mongo.serverUrl"]);var server = client.GetServer();var database = server.GetDatabase(config["mongo.database"]);

_coll = database.GetCollection<T>(config["mongo.collection"]);}

public IQueryable<T> Items {get { return _coll.FindAll().AsQueryable(); }

} }

Working with Mongo’s C# Client

Encapsulate my queries and commands

public class FindPageById : ICriteria<Page> {private readonly String _id;public FindPageById(String pageId){

_id = pageId;}

public IEnumerable<Page> Execute(IContext<Page> ctx){

return ctx.Items.Where(p => p.Id == _id);}

}

Working with Mongo’s C# ClientInvoke my query/command

public class TemplateController : MyBaseController {private readonly IContext<Page> _pageCtx;

public TemplateController(IContext<Page> ctx) : base() {_pageCtx = ctx;

}

[HttpGet]public IportalPageMetadata Section(String cat, String page) {

var id = String.Format("{0}/{1}", cat, page);

var thePage = new FindPageById(id).Execute(_pageCtx).FirstOrDefault();

...}

}

Working with Mongo’s C# Client

Writing to Mongo is just as simple...

[HttpPost]public Boolean Post(Page page){

var userId = await GetUserId();

new CreatePage(page, userId).Execute(_pages);

_searchPage.Insert(page);

return true;}

Evolving Architecture

ClientWeb

ServerApp

ServerData

Repository

SearchSome data (?)SessionCache (?)

Document Repository

Write

Query

SearchHow do you search?◦ LIKE ‘%blah%’ ?

◦ Dynamic SQL

◦ Full-Text

LIKE and Dynamic SQL can be quick to create◦ Tough to maintain

Full-Text gives power◦ Limited in search options

SearchNumber of search services out there like◦ Lucene

◦ Solr

Lucene is a search engine◦ Embed in apps

◦ .NET port (Lucene.NET)

Solr is search service◦ Built on Lucene

◦ Connect apps to it

Searching with SolrDisconnected from your application

Search content via HTTP REST calls

Can use SolrNet as a client◦ https://github.com/mausch/SolrNet

Document-based

Searching with Solrprivate readonly ISolrOperations<T> _solr;public SolrSearchProvider(ISolrOperations<T> solr) { _solr = solr; }

public IEnumerable<T> Query(String searchString) {var options = new QueryOptions() {

Fields = new List<String> {"title", "body", "lastModified" }.ToArray(),Highlight = new HighlightingParameters() {

BeforeTerm = "<strong><em>",AfterTerm = "</em></strong>",Fields = new List<String> { "title", "body" }.ToArray(),Fragsize = 100

}};var results = _solr.Query(new SolrQuery(searchString), options);return results;

}

Evolving Architecture

ClientWeb

ServerApp

ServerData

Repository

Some data (?)SessionCache (?)

Search Service

Query

Write

Document Repository

Write

Query

Session and Cache DataGenerally short-lived for users

Fairly static for cached data

Key/value stores can serve us well here◦ Redis

Redis has two good .NET client libraries◦ StackExchange.Redis

◦ ServiceStack.Redis

Using Redispublic class RedisSessionManager : ISessionManager {

private static ConnectionMultiplexer _redis = null;private readonly IDictionary<String, String> _config;

public RedisSessionManager(IDictionary<String, String> config) {if (_redis == null) {

_redis = ConnectionMultiplexer.Connect(config["session.serverUrl"].ToString());}_config = config;

}public async Task<Boolean> CreateSessionAsync(String portalId, String userId, String fullName) {

var time = DateTime.UtcNow.ToString();var timeout = _config.ContainsKey("session.timeout");

var vals = new HashEntry[] {new HashEntry("userid", userId), new HashEntry("login", time),new HashEntry("lastAction", time), new HashEntry("fullName", fullName)

};

await RedisDatabase.HashSetAsync(portalId, vals);return await RedisDatabase.KeyExpireAsync(portalId, TimeSpan.FromMinutes(timeout));

}}

Using Redis

public async Task<Boolean> ExtendSessionAsync(String portalId) {var timeout = _config.ContainsKey("session.timeout");await RedisDatabase.HashSetAsync(portalId, "lastAction",

DateTime.UtcNow.ToString());return await RedisDatabase.KeyExpireAsync(portalId,

TimeSpan.FromMinutes(timeout));}

public async Task<Boolean> ExpireSessionAsync(String portalId) {return await RedisDatabase.KeyDeleteAsync(portalId);

}

Using RedisAt login (to stick session id in a cookie):

await Session.CreateSessionAsync(userId, fullName);

Upon log out:

await Session.ExpireSessionAsync(sessionCookie.Value);

Evolving Architecture

ClientWeb

ServerApp

Server

Data Repository

Some data (?)Search Service

Query

Write

Document Repository

Write

Query

Session/Cache

Service

Why Data StoreWe’re left with a database with not much use◦ Transactional data in document store

◦ Search documents in Solr

◦ Session, caching, etc. in key/value or caching service like Redis

What it probably ends up acting as is…

Evolving Architecture

ClientWeb

ServerApp

Server

Event Store2-3 flat tablesEvent data

Search Service

Query

Write

Document Repository

Write

Query

Session/Cache

Service

Queue?

(D)Evolved Architecture

Client

Web Server

App Server

Event Store

Search Service

Query

Write

Doc Repo

Write

Query

Session/Cache

Service

Queue?

(D)Evolved ArchitecturePick and choose what components work best

Don’t use them just to use them

Proof-of-Concept / Prototype

Why look to be RDBMS freeSearching◦ More than just full-text needs

Data◦ Choose a system that you can model the business

◦ Not the other way around

Caching / Session Values / PubSub◦ Offload necessary?

◦ Ensure performance

Maintenance and support big factors to consider

Consider data usage/architecture before just jumping in

ToolsMongoDB◦ http://mongodb.org

◦ RoboMongo http://robomongo.org

◦ Perf Best Practices http://info.mongodb.com/rs/mongodb/images/MongoDB-Performance-Best-Practices.pdf

◦ Operations Best Practices http://info.mongodb.com/rs/mongodb/images/10gen-MongoDB_Operations_Best_Practices.pdf

Solr◦ http://lucene.apache.org/solr/

Redis◦ http://redis.io/

◦ Redis Manager http://redisdesktop.com/

top related