PiterPy 2016: Parallelization, Aggregation and Validation of API in Python

Parallelisation, aggregation and validation API with Python Max Klymyshyn CTO at CartFresh @maxmaxmaxmax

Upload: max-klymyshyn

Post on 23-Jan-2017



TRANSCRIPT

Page 1: PiterPy 2016: Parallelization, Aggregation and Validation of API in Python

Parallelisation, aggregation and validation API with Python

Max Klymyshyn, CTO at CartFresh

@maxmaxmaxmax

Page 2:

‣ 12+ years of experience, 7 years with Python, 6 with JS

‣ Was part of oDesk, Helios, 42cc.

‣ Co-organizer of PyCon Ukraine, KyivJS, Papers We Love

‣ CTO at CartFresh

‣ Challenging myself with a talk in English. It’s not my first language, so bear with me

About

Page 3:

‣ Grocery Delivery startup

‣ Operating as CartFresh (Boston, US) and ZAKAZ.UA (Kiev, Dnepropetrovsk, Kharkiv, Ukraine)

‣ Apache CouchDB, Apache Solr, Redis

‣ Heavy Python on the back end

CartFresh

Page 4:

‣ Quick overview

‣ Some abstract info about context

‣ Tools for Python

Table of contents

Page 5:

Why API again?

Page 6:

The world is changing very quickly:

‣ Mobile apps

‣ Internet of Things

‣ Microservices

‣ Isomorphic apps

Why API again?

Page 7:

A good API is hard when all your stuff should work together well

Page 8:

‣ Validation

‣ Reusability

‣ Consistency

‣ Maintainability

‣ Scalability

It’s challenging

Page 9:

A good API makes it easier to develop a service

Page 10:

Divide and conquer (D&C)

Page 11:

‣ API expresses a software component in terms of its operations, inputs, outputs, and underlying types

‣ API helps create reusable building blocks and communicate between system components

‣ It opens new opportunities to develop new systems based on your product

Overview

Page 12:

Moving parts

VALIDATION

OUTPUT

BL

INPUT

Page 13:

‣ input – needs to be validated for type correctness

‣ validation – input should be constrained by domain-specific business rules

‣ business logic – obviously the most useful part of the system

‣ output – data model, serialised into a specific format

Moving parts
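
The four moving parts can be read as a small pipeline. The following is only a minimal sketch under assumptions: the function names, the "sku"/"quantity" fields, the 1..100 limit and the hard-coded price are all made up for illustration and do not come from the talk.

```python
import json


def parse_input(raw):
    """INPUT: check types only (sku must be a str, quantity an int)."""
    sku, quantity = raw.get("sku"), raw.get("quantity")
    if not isinstance(sku, str) or not isinstance(quantity, int):
        raise TypeError("sku must be str and quantity must be int")
    return {"sku": sku, "quantity": quantity}


def validate(params):
    """VALIDATION: apply a domain rule (hypothetical limit of 1..100 items)."""
    if not 1 <= params["quantity"] <= 100:
        raise ValueError("quantity must be between 1 and 100")
    return params


def add_to_cart(params):
    """BL: the business logic itself, here a trivial price calculation."""
    unit_price = 250  # price in cents, hard-coded for this sketch
    return {"sku": params["sku"], "total": unit_price * params["quantity"]}


def serialize(model):
    """OUTPUT: serialise the resulting data model into a specific format."""
    return json.dumps(model, sort_keys=True)


def handle(raw):
    # INPUT -> VALIDATION -> BL -> OUTPUT, in that order.
    return serialize(add_to_cart(validate(parse_input(raw))))


response = handle({"sku": "milk-1l", "quantity": 2})
```

Keeping type checks and business rules in separate steps means each one can be replaced or tested on its own.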

Page 14:

API creation becomes trivial with a good understanding and the right tools

The real challenges lie behind it: how to make it simple, how to make it maintainable, how to keep API users updated

Page 15:

Trends during the past few years

Page 16:

‣ RESTification

‣ Data Query Languages

‣ Microservices architecture

Trends

Page 17:

REST

‣ Uniform interface for communication between client and API

‣ Built on top of HTTP

‣ Simple

Page 18:

Data Query Languages

‣ GraphQL

‣ Falcor

‣ Datalog

‣ Datomic

etc.

Page 19:

Data Query Languages

The main point of a DQL is declarative composition of queries to simple data structures, with the result represented as a single data structure
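
As a rough illustration of that idea (not of any particular query language), the sketch below describes a query as plain data and resolves it into one result structure. The SOURCES table, its fields and the query shape are assumptions invented for this example.

```python
# Hypothetical in-memory data sources standing in for real backends.
SOURCES = {
    "user": lambda args: {"id": args["id"], "name": "Alice",
                          "email": "alice@example.com"},
    "cart": lambda args: {"items": 3, "total": 1250, "currency": "USD"},
}


def run_query(query):
    """Resolve every entry of a declarative query into one result dict."""
    result = {}
    for name, spec in query.items():
        record = SOURCES[name](spec.get("args", {}))
        # Keep only the fields the caller asked for.
        result[name] = {field: record[field] for field in spec["fields"]}
    return result


# The query is data, not code: it names sources and the fields wanted.
query = {
    "user": {"args": {"id": 42}, "fields": ["name"]},
    "cart": {"fields": ["items", "total"]},
}
result = run_query(query)
```

The caller gets a single structure back even though two separate sources were consulted.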

Page 20:

Common case

Page 21:

Monolithic service

Monolith

Page 22:

More realistic case

Page 23:

Microservices

Monolith Microservices

Page 24:

Microservices

Monolith Microservices

Difference

Page 25:

‣ New layer of complexity in terms of input validation

‣ New unreliable layer (network)

‣ Additional protocol overhead

‣ Communication latency

Seriously
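
One common way to soften the latency cost is to issue independent service calls in parallel and to bound the wait for each of them. The sketch below assumes three made-up services and simulates the network with time.sleep, so the names and latencies are illustrative only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_service(name, latency):
    """Stand-in for a network call; sleeps instead of doing real I/O."""
    time.sleep(latency)
    return {"service": name, "status": "ok"}


# Hypothetical services with their simulated latency in seconds.
calls = [("products", 0.2), ("facets", 0.2), ("department_tree", 0.2)]

started = time.monotonic()
with ThreadPoolExecutor(max_workers=len(calls)) as pool:
    futures = [pool.submit(call_service, name, latency)
               for name, latency in calls]
    # result(timeout=...) bounds how long we wait for a slow service.
    responses = [future.result(timeout=1.0) for future in futures]
elapsed = time.monotonic() - started
```

With threads the total wait is close to the slowest call rather than the sum of all calls; a real client would also need to decide what to do when a timeout fires.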

Page 26:

‣ You’ll get a chance to improve each piece of code separately without breaking other parts of the system (D&C!)

‣ You can split development of microservices between different dev teams

‣ You’ll have a lot of fun!

But let’s be optimistic

Page 27:

Tools

Page 28:

‣ SWAGGER – a simple representation of your RESTful API (OpenAPI initiative), FLEX for Python

‣ RESTful API Modelling Language – RAML

‣ APIDOC – documentation generated from API annotations in your source code

‣ api-blueprint, RESTUnite, apiary etc.

API Frameworks

Page 29:

paths:
  /products:
    get:
      summary: Product Types
      description: |
        The Products endpoint returns information about the *Uber* products
        offered at a given location. The response includes the display name
        and other details about each product, and lists the products in the
        proper display order.
      parameters:
        - name: latitude
          in: query
          description: Latitude component of location.
          required: true
          type: number
          format: double
        - name: longitude
          in: query
          description: Longitude component of location.
          required: true
          type: number
          format: double
      tags:
        - Products
      responses:
        200:
          description: An array of products
          schema:
            type: array
            items:
              $ref: '#/definitions/Product'

Swagger spec example

Page 30:

/products:
  uriParameters:
  displayName: Products
  description: A collection of products
  post:
    description: Create a product
    #Post body media type support
    #text/xml: !!null # media type text, xml support
    #application/json: !!null #media type json support
    body:
      application/json:
        schema: |
          {
            "$schema": "http://json-schema.org/draft-03/schema",
            "product": {
              "name": {
                "required": true,
                "type": "string"
              },
              "description": {
                "required": true,
                "type": "string"
              }

RAML spec example

Page 31:

        example: |
          {
            "product": {
              "id": "1",
              "name": "Product One",
              ...
            }
          }
  get:
    description: Get a list of products
    queryParameters:
      q:
        description: Search phrase to look for products
        type: string
        required: false
    responses:
      200:
        body:
          application/json:
            #example: !include schema/product-list.json

RAML spec example

Page 32:

To prevent the situation where documentation, client libraries, and source code get out of sync

CLIENT #1 SERVER CLIENT #2

Page 33:

‣ Predefined input parameters + validation

‣ Predefined response schema (model)

‣ Query Language

Aggregation

Page 34:

GraphQL/Graphene

import graphene
import pprint

data = [1, 2, 3, 4]


class Query(graphene.ObjectType):
    hello = graphene.String()
    data = graphene.String()

    # Resolver signature as in the Graphene version used on the slide;
    # Graphene 2 and later pass (root, info) instead.
    def resolve_data(self, args, info):
        return ",".join(map(str, data))

    def resolve_hello(self, args, info):
        return 'World'


schema = graphene.Schema(query=Query)
result = schema.execute('{ hello, data }')
pprint.pprint(result.data)

# OrderedDict([('hello', u'World'), ('data', u'1,2,3,4')])

Page 35:

GraphQL’s power comes from a simple idea: instead of defining the structure of responses on the server, the flexibility is given to the client.

GraphQL vs REST

Page 36:

GraphQL/Graphene allows us to use our beloved language, Python, to declare the Model/API schema

GraphQL vs Swagger

Page 37:

Batching

Page 38:

Tools: django-batch-requests

[
  {
    "method": "get",
    "url": "/sleep/?seconds=3"
  },
  {
    "method": "get",
    "url": "/sleep/?seconds=3"
  }
]

Page 39:

[
  {
    "headers": {
      "Content-Type": "text/html; charset=utf-8",
      "batch_requests.duration": 3
    },
    "status_code": 200,
    "body": "Success!",
    "reason_phrase": "OK"
  },
  {
    "headers": {
      "Content-Type": "text/html; charset=utf-8",
      "batch_requests.duration": 3
    },
    "status_code": 200,
    "body": "Success!",
    "reason_phrase": "OK"
  }
]
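
The request and response shapes above suggest how a batch endpoint could work: unpack the list, run each sub-request, and return one response envelope per item in the same order. The sketch below is not the django-batch-requests implementation; the ROUTES table, sleep_view, run_one and run_batch are hypothetical names, and running the sub-requests in threads is an assumption of this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import parse_qs, urlsplit


def sleep_view(params):
    """Hypothetical handler mirroring the /sleep/ endpoint of the slides."""
    time.sleep(float(params["seconds"][0]))
    return 200, "Success!"


ROUTES = {("get", "/sleep/"): sleep_view}  # assumed routing table


def run_one(request):
    """Execute a single sub-request and wrap it in a response envelope."""
    url = urlsplit(request["url"])
    handler = ROUTES.get((request["method"].lower(), url.path))
    started = time.monotonic()
    if handler is None:
        status, body = 404, "Not found"
    else:
        status, body = handler(parse_qs(url.query))
    return {
        "status_code": status,
        "body": body,
        "headers": {"duration": round(time.monotonic() - started, 2)},
    }


def run_batch(requests):
    """Run all sub-requests in parallel, keeping the order of the input."""
    with ThreadPoolExecutor(max_workers=len(requests) or 1) as pool:
        return list(pool.map(run_one, requests))


batch = [
    {"method": "get", "url": "/sleep/?seconds=0.2"},
    {"method": "get", "url": "/sleep/?seconds=0.2"},
    {"method": "get", "url": "/missing/"},
]
started = time.monotonic()
responses = run_batch(batch)
elapsed = time.monotonic() - started
```

Because each sub-request carries its own status code, one failing item does not have to fail the whole batch.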

Page 40:

Our experience

Page 41:

‣ Ended up with a batched API interface

‣ Declarative input validation with trafaret

‣ No fixed schema (a disadvantage)

‣ Very simple SQL-JOIN-like aggregation

Page 42:

Params, validation, transformation

@validate_args(
    _('Invalid request'),
    store_id=tr.String() >> pipe(unicode, unicode.strip),
    slugs=tr.List(tr.String() >> pipe(unicode, unicode.strip)),
    ean=tr.String | tr.Null,
    extended=tr.Bool | tr.Null,
    query=tr.String | tr.Null,
    facets=tr.List(
        tr.List(tr.String, min_length=2, max_length=2)) | tr.Null,
    sort=tr.String(allow_blank=True) | tr.Null,
    _optional=('extended', 'query', 'facets', 'sort', 'ean'))
def resource_products(store, user, session, limit=None, offset=1,
                      lang='en', args=None, **kwargs):
    pass
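
The slides do not show how validate_args itself is built. As a guess at the general pattern only, the sketch below uses plain callables in place of trafaret checkers and Python 3 str in place of unicode; ValidationError, stripped_string and list_of are hypothetical helpers, and the real decorator may behave differently.

```python
import functools


class ValidationError(Exception):
    """Raised when the request arguments do not pass validation."""


def validate_args(message, _optional=(), **validators):
    """Hypothetical decorator: validate and transform `args` before the call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*positional, args=None, **kwargs):
            args = args or {}
            unknown = set(args) - set(validators)
            if unknown:
                raise ValidationError(f"{message}: unexpected {sorted(unknown)}")
            cleaned = {}
            for name, validator in validators.items():
                if name not in args:
                    if name in _optional:
                        continue
                    raise ValidationError(f"{message}: missing {name}")
                try:
                    # A validator both checks and transforms the value.
                    cleaned[name] = validator(args[name])
                except (TypeError, ValueError) as exc:
                    raise ValidationError(f"{message}: {name}: {exc}") from exc
            return func(*positional, args=cleaned, **kwargs)
        return wrapper
    return decorator


def stripped_string(value):
    """Accept only strings and strip surrounding whitespace."""
    if not isinstance(value, str):
        raise TypeError("expected a string")
    return value.strip()


def list_of(validator):
    """Build a validator for lists whose items pass `validator`."""
    def check(value):
        if not isinstance(value, list):
            raise TypeError("expected a list")
        return [validator(item) for item in value]
    return check


@validate_args(
    "Invalid request",
    store_id=stripped_string,
    slugs=list_of(stripped_string),
    query=stripped_string,
    _optional=("query",),
)
def resource_products(store, args=None, **kwargs):
    return args


cleaned = resource_products("demo", args={"store_id": " 42 ", "slugs": [" milk "]})
```

The resource function only ever sees cleaned arguments, so validation rules stay declarative and sit next to the function signature.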

Page 43:

[
  "store.products",
  {
    store_id: Storage.first("store").id,
    slugs: [options.slug],
    facets: options.facets || [],
    sort: options.sort || ""
  },
  {
    offset: options.offset || 1,
    id: "catalog",
    join: [{
      apply_as: "facets_base",
      on: ["slug", "slug"],
      request: {
        type: "store.facets",
        args: {
          store_id: "$request.[-2].args.store_id",
          slug: "$request.[-2].args.slugs|first"
        }
      }
    }, {
      apply_as: "category_tree",
      on: ["slug", "requested_slug"],
      request: {
        type: "store.department_tree",
        args: {
          store_id: "$request.[-2].args.store_id",
          slug: "$request.[-2].args.slugs|first"
        }
      }
    }]
  }
]
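
The join entries in the request above read like a SQL LEFT JOIN between the main response and a secondary one. The server-side logic is not shown in the slides, so the sketch below is only one plausible reading: it assumes that `on` is a pair (key in the main row, key in the joined row) and that `apply_as` names the field that receives the matches. The sample data is made up.

```python
def join(primary, secondary, apply_as, on):
    """Attach matching `secondary` rows to each `primary` row (LEFT JOIN-like).

    `on` is a pair: (key in the primary row, key in the secondary row).
    """
    left_key, right_key = on
    # Index the secondary rows once so each lookup is a dict access.
    index = {}
    for row in secondary:
        index.setdefault(row[right_key], []).append(row)
    return [
        {**row, apply_as: index.get(row[left_key], [])}
        for row in primary
    ]


# Made-up responses of two sub-requests, for illustration only.
products = [
    {"slug": "dairy", "title": "Dairy"},
    {"slug": "bakery", "title": "Bakery"},
]
tree = [{"requested_slug": "dairy", "children": ["milk", "cheese"]}]

catalog = join(products, tree, apply_as="category_tree",
               on=["slug", "requested_slug"])
```

Rows without a match keep an empty list under the joined field, so the client can rely on the field always being present.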

Page 44:

Thanks.

@maxmaxmaxmax