distributing over the web

https://github.com/naighes distributing over the web @NicolaBaldi

Upload: nicola-baldi

Post on 11-Apr-2017


Page 1: distributing over the web

https://github.com/naighes

distributing over the web

@NicolaBaldi

Page 2: distributing over the web

still no agreement on ReST… :-)

Page 3: distributing over the web

What a Wonderful World

Page 4: distributing over the web


Page 5: distributing over the web

cryptic…

“a resource is a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time”

Page 6: distributing over the web

very cryptic…

Page 7: distributing over the web

coupling

ReST is loosely coupled:

§ there are no «contracts» in ReST! The only contract is represented by the URI.

§ clients do not (MUST NOT) depend on server-side implementations.

Page 8: distributing over the web

constraints I

client-server

§ driven by SoC (separation of concerns).

• UI portability.

• scalability (server components simplification).

• loose coupling (client & server will evolve independently).

Page 9: distributing over the web

constraints II

stateless

§ visibility (just look at «request»).

§ reliability (recovering from partial failures).

§ scalability (no need to store client data).

Page 10: distributing over the web

constraints III

cache

NOTE: caching «just» improves network efficiency.

§ data within a response to a request is implicitly or explicitly labeled as cacheable or non-cacheable.

• efficiency, scalability, and user-perceived performance by reducing the average latency of a series of interactions.

Page 11: distributing over the web

constraints III

“the goal of caching in HTTP/1.1 is to eliminate the need to send requests in many cases (expiration), and to eliminate the need to send full responses in many other cases (validation)”
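The two mechanisms in the quote can be sketched as a single decision made by a cache before it touches the network. This is a minimal illustration, not a complete HTTP cache: `CachedResponse` and `next_step` are hypothetical names, and freshness is reduced to a single max-age value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedResponse:
    age_seconds: int        # time elapsed since the response was stored
    max_age_seconds: int    # freshness lifetime, e.g. from Cache-Control: max-age
    etag: Optional[str]     # validator sent by the origin server, if any


def next_step(entry: Optional[CachedResponse]) -> str:
    """Decide what a cache should do for a new request (simplified)."""
    if entry is None:
        return "full request"            # nothing stored yet
    if entry.age_seconds < entry.max_age_seconds:
        return "serve from cache"        # expiration: no request at all
    if entry.etag is not None:
        return "conditional request"     # validation: a 304 avoids a full response
    return "full request"                # stale and no validator available
```

A fresh entry costs no round trip at all, while a stale entry with a validator costs a round trip but usually not a full response body.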

Page 12: distributing over the web

constraints III

Page 13: distributing over the web

constraints III

Page 14: distributing over the web

constraints IV

uniform interface

the ReST interface is designed to be efficient for large-grain hypermedia data transfer.

§ identification of resources.

§ manipulation of resources through representations.

§ self-descriptive messages.

§ HatEoAS, HatEoAS, HatEoAS and HatEoAS again!

Page 15: distributing over the web

data

ReST components communicate by transferring a representation of a resource.

a representation is a sequence of bytes, plus representation metadata to describe those bytes.

§ type (not format) selected dynamically:

• based on capabilities or desires of recipients.

• based on the nature of the resource.

Page 16: distributing over the web

resources

§ any information that can be named can be a resource.

§ a resource is a conceptual mapping to a set of entities.

§ every resource must provide an identifier.

Page 17: distributing over the web

URI - URL - URN

URI (Uniform Resource Identifier)

a set of characters used to identify a name or a resource on the Internet

URL (Uniform Resource Locator)

where an identified resource is available and how to retrieve it (http://, ftp://, smb://…)

URN (Uniform Resource Name)

The URN defines something's identity

Page 18: distributing over the web

URI - URL - URN

URL: ftp://ftp.is.co.za/rfc/rfc1808.txt

URL: http://www.ietf.org/rfc/rfc2396.txt

URL: ldap://[2001:db8::7]/c=GB?objectClass?one

URL: mailto:John.Doe@example.com

URL: news:comp.infosystems.www.servers.unix

URL: telnet://192.0.2.16:80/

URN: urn:oasis:names:specification:docbook:dtd:xml:4.1.2

URN: urn:isbn:0-486-27557-4

Page 19: distributing over the web

it’s a matter of implementation

HTTP != ReST

HTTP is a ReSTful protocol for exposing resources across

distributed systems.

HTTP doesn’t map 1:1 to ReST, it's an implementation of ReST.

Page 20: distributing over the web

GET

> GET /orders/1772634
< 200 OK
< ETag: "686897696a7c876b7e"
< Last-Modified: Thu, 05 Jul 2012 15:31:30 GMT

GOOD

> GET /GetOrder?id=1772634
< 200 OK

BAD

include Last-Modified header whenever feasible!

Page 21: distributing over the web

localization

> GET /entries/1772634
> Accept-Language: it, en-gb;q=0.8, en;q=0.7
< 200 OK
< ...
< Content-Language: en

GOOD

> GET /GetEntry?id=1772634&languageId=4
< 200 OK

BAD

language is a matter of representation
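How a server might pick the representation language from an Accept-Language header can be sketched as follows. This is a simplified illustration: `parse_accept_language` and `pick_language` are hypothetical helpers, matching is limited to exact tags and prefix matches, and malformed headers are not handled.

```python
from typing import List, Optional, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Return (language range, quality) pairs, highest quality first."""
    ranges = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                quality = float(value.strip())
        ranges.append((tag, quality))
    # sorted() is stable, so equal weights keep the client's order
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


def pick_language(header: str, available: List[str]) -> Optional[str]:
    """Pick the first available language matching the client's preferences."""
    for language_range, quality in parse_accept_language(header):
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        for tag in available:
            lowered = tag.lower()
            if (language_range == "*" or lowered == language_range
                    or lowered.startswith(language_range + "-")):
                return tag
    return None
```

With the header from the slide and a server that only has an English representation, the sketch selects `en`, which matches the `Content-Language: en` response shown in the GOOD example.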

Page 22: distributing over the web

aliasing

> GET /weather/tomorrow
< 302 Found
< Location: /weather/2015-03-21T12%3A24%3A26Z
< Link: </weather/2015-03-21T12%3A24%3A26Z>; rel="canonical"

GOOD

> GET /GetWeatherForecastForTomorrow
< 200 OK

BAD

Page 23: distributing over the web

POST

> POST /entries/188273/comments
< 201 Created
< Location: /entries/188273/comments/2

GOOD

> POST /AddComment?entryId=188273
< 200 OK

BAD

[…] is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI

Page 24: distributing over the web

PUT

> PUT /entries/188274
< 204 No Content (when resource is updated)
< 201 Created (when resource is created)

GOOD

> PUT /AddEntry
< 200 OK

BAD

PUT is (MUST be) idempotent, while POST is not.

Page 25: distributing over the web

idempotency

Page 26: distributing over the web

do not PATCH like an idiot!

PATCH is not about sending just the updated value instead of the entire resource.

Stop doing this right now!

> PATCH /entries/188273
> {"email": "mario@rossi.com"}

BAD

> PATCH /entries/188273?email=mario%40rossi.com

BAD

Page 27: distributing over the web

PATCH

the PATCH method requests that a set of changes be applied to the resource. This set contains instructions describing how the resource should be modified to produce a new version.

> PATCH /entries/188273
> [description of changes]

GOOD

the entire set of changes must be applied atomically.

Page 28: distributing over the web

PATCH

you have to use a media type that defines semantics for PATCH (RFC 6902, application/json-patch+json).

> PATCH /entries/188273
> [{"op": "replace", "path": "/email", "value": "mario@rossi.com"}]
< 200 OK

GOOD
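A minimal sketch of how a server could apply such a change set atomically. It only supports the `replace` operation on nested objects; `apply_patch` is a hypothetical helper, and JSON Pointer escaping, arrays and the other RFC 6902 operations are deliberately left out.

```python
import copy
from typing import Any, Dict, List


def apply_patch(document: Dict[str, Any], operations: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Apply "replace" operations to a copy; fail without side effects."""
    working = copy.deepcopy(document)  # the original is never touched
    for operation in operations:
        if operation["op"] != "replace":
            raise ValueError("unsupported operation: " + operation["op"])
        keys = operation["path"].split("/")[1:]  # "/email" -> ["email"]
        target = working
        for key in keys[:-1]:
            target = target[key]
        if keys[-1] not in target:
            # "replace" requires the target location to exist
            raise KeyError(operation["path"])
        target[keys[-1]] = operation["value"]
    return working  # only returned if every operation succeeded
```

Because every operation runs against a deep copy, a failure halfway through leaves the stored resource untouched, which is one simple way to meet the "applied atomically" requirement.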

Page 29: distributing over the web

PATCH: aside notes

• Fielding's dissertation does not define any way to partially modify resources.

• PATCH does not transfer a complete representation, but ReST doesn't require representations to be complete anyway.

Page 30: distributing over the web

check if a resource exists

> HEAD /orders/1772634
< 404 Not Found

GOOD

> GET /ExistsOrder?id=1772634
< 200 OK

BAD

save bandwidth!

Page 31: distributing over the web

long running processes

> POST /entries
< 202 Accepted
< Location: /queue/982773

> GET /queue/982773
< 200 OK
< Link: </queue/982773>; rel="cancel"
< {"status": "pending", "eta": "00:01:23"}

> GET /queue/982773
< 303 See Other
< Location: /entries/188275

Page 32: distributing over the web

optimistic concurrency

> GET /orders/123
< 200 OK
< ETag: "686897696a7c876b7e"

> PUT /orders/123
> If-Match: "686897696a7c876b7e"
< 412 Precondition Failed

always, always and always rely on ETag!

did I say “always”? :-)
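The server side of this exchange can be sketched with an in-memory store. Everything here is an assumption made for illustration: `OrderStore` is a hypothetical class, and the ETag is derived from a hash of the JSON body, which is only one of several possible strategies.

```python
import hashlib
import json
from typing import Any, Dict, Optional, Tuple


class OrderStore:
    """In-memory resource store that honours If-Match on PUT (simplified)."""

    def __init__(self) -> None:
        self._orders: Dict[str, Dict[str, Any]] = {}

    @staticmethod
    def _etag(body: Dict[str, Any]) -> str:
        serialized = json.dumps(body, sort_keys=True).encode("utf-8")
        return '"' + hashlib.sha256(serialized).hexdigest()[:18] + '"'

    def get(self, order_id: str) -> Tuple[int, str, Dict[str, Any]]:
        body = self._orders[order_id]
        return 200, self._etag(body), body

    def put(self, order_id: str, body: Dict[str, Any], if_match: Optional[str]) -> int:
        current = self._orders.get(order_id)
        if current is not None and if_match != self._etag(current):
            return 412  # Precondition Failed: the client saw an older version
        self._orders[order_id] = body
        return 204 if current is not None else 201
```

A client that read the order before someone else changed it sends a stale ETag and gets 412 instead of silently overwriting the newer version.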

Page 33: distributing over the web

409 Conflict

> POST /users
< 409 Conflict
< {"error": "username_already_taken"}

the request could not be completed due to a conflict with the current state of the resource.

Page 34: distributing over the web

400 Bad Request

> POST /entries/188273/comments
< 400 Bad Request
< {"message": "problems parsing payload."}

the request could not be understood by the server due to malformed syntax.

Page 35: distributing over the web

422 Unprocessable Entity

> POST /entries/188273/comments
< 422 Unprocessable Entity
< {"message": "validation failed.", "errors": [{"path": "/title", "code": "missing_field"}]}

the server was unable to process the contained instructions (e.g. semantically erroneous instructions).

Page 36: distributing over the web

custom error codes

> POST /entries/188273/comments
< 489 Entry Does Not Allow New Comments

don’t do that

BAD

Page 37: distributing over the web

503 Service Unavailable

> GET /entries/188273
< 503 Service Unavailable
< Retry-After: Mon, 20 Apr 2015 23:59:59 GMT

the server is currently unable to handle the request due to a temporary overloading or maintenance of the server.
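On the client side, the Retry-After value can be turned into a waiting time. A minimal sketch, assuming a hypothetical `seconds_to_wait` helper and a caller that passes in the current time explicitly; it accepts both forms HTTP allows for this header, an HTTP date or a number of seconds.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def seconds_to_wait(retry_after: str, now: datetime) -> float:
    """Translate a Retry-After header value into seconds to wait."""
    value = retry_after.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form, e.g. "120"
    retry_at = parsedate_to_datetime(value)  # HTTP-date form
    return max(0.0, (retry_at - now).total_seconds())


# Usage: one minute before the date on the slide, 59 seconds remain.
now = datetime(2015, 4, 20, 23, 59, 0, tzinfo=timezone.utc)
wait = seconds_to_wait("Mon, 20 Apr 2015 23:59:59 GMT", now)
```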

Page 38: distributing over the web

pagination

> GET /user/7364/orders?page=3&page_size=10
< 200 OK
< Link: </user/7364/orders?page=2&page_size=10>; rel="prev", </user/7364/orders?page=4&page_size=10>; rel="next", </user/7364/orders?page=11&page_size=10>; rel="last"

include Link header, embrace HatEoAS!
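A minimal sketch of how a server could build such a Link header. `pagination_links` is a hypothetical helper, and the `page` / `page_size` parameter names are taken from the example request.

```python
def pagination_links(base_path: str, page: int, page_size: int, total_items: int) -> str:
    """Build a Link header value with prev/next/last relations."""
    last_page = max(1, -(-total_items // page_size))  # ceiling division

    def target(number: int) -> str:
        return "<{}?page={}&page_size={}>".format(base_path, number, page_size)

    links = []
    if page > 1:
        links.append(target(page - 1) + '; rel="prev"')
    if page < last_page:
        links.append(target(page + 1) + '; rel="next"')
    links.append(target(last_page) + '; rel="last"')
    return ", ".join(links)
```

Calling it with `("/user/7364/orders", 3, 10, 110)`, where the total of 110 orders is an assumed figure, yields prev, next and last links for pages 2, 4 and 11.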

Page 39: distributing over the web

authorization

> GET /user/7364/orders
< 401 Unauthorized
< WWW-Authenticate: Bearer realm="example"

on a 401 status code you MUST include a WWW-Authenticate header field containing a challenge applicable to the requested resource.

Page 40: distributing over the web

conditional requests

> GET /orders/123
> If-None-Match: "644b5b0155e6404a9cc4bd9d8b1ae730"
< 304 Not Modified

body will be (MUST be) empty on a 304 response

save bandwidth!

> GET /orders/123
> If-Modified-Since: Thu, 05 Jul 2012 15:31:30 GMT
< 304 Not Modified

Page 41: distributing over the web

versioning

by URL: it sucks!

§ URI design should carry as few constraints as possible, and URIs should be preserved over time.

§ it disrupts the concept of HatEoAS.

§ resource URIs that API users can depend on should be permalinks.

by Accept header: application/vnd.contoso.cart-v2+json

§ the only drawback is that it could give you a few headaches when it comes to testing / debugging.

Page 42: distributing over the web

versioning

“No, HTTP doesn’t version the interface names — there are no numbers on the methods or URIs. That doesn’t mean other aspects of the communication aren’t versioned. We do want change, since otherwise we would not be able to improve over time, and part of change is being able to declare in what language the data is spoken. We just don’t want breaking change. Hence, versioning is used in ways that are informative rather than contractual”

Page 43: distributing over the web

false myth

ReST is suitable for CRUD…

BULLSHIT

ReST != ODBC => transaction boundaries are defined by resources themselves.

ReST is far away from CRUD.

Page 44: distributing over the web

HatEoAS

Hypermedia as the Engine of Application State

§ API clients should not construct URLs on their own.

§ decoupling: it makes future upgrades of the API easier for developers.

> GET /users/mariorossi
< Link: <https://api.contoso.com/users/mariorossi>; rel="related"...
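On the client side, following links instead of building URLs could look like the sketch below. `parse_link_header` is a hypothetical helper that handles only the simple `<url>; rel="name"` form used in these slides, not the full Link header grammar.

```python
import re
from typing import Dict

# Matches one <target>; rel="relation" pair (simplified: rel must come first
# among the parameters and must be quoted).
LINK_PATTERN = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header(value: str) -> Dict[str, str]:
    """Map each link relation to its target URL."""
    return {relation: url for url, relation in LINK_PATTERN.findall(value)}


# Usage: the client asks for the "next" relation, never for "page=4".
links = parse_link_header(
    '</user/7364/orders?page=2&page_size=10>; rel="prev", '
    '</user/7364/orders?page=4&page_size=10>; rel="next"'
)
next_url = links["next"]
```

The client depends only on the relation names, so the server stays free to change its URL layout.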

Page 45: distributing over the web

cookies

“An example of where an inappropriate extension has been made to the protocol to support features that contradict the desired properties of the generic interface is the introduction of site-wide state information in the form of HTTP cookies”

cookie-based applications on the web will never be reliable

Page 46: distributing over the web

what’s «wrong» with 1.X

§ clients achieve concurrency by using multiple connections.

§ HTTP headers cause a lot of network traffic on req/rsp.

Page 47: distributing over the web

«mosaic»

Page 48: distributing over the web

bandwidth as the primary metric

for most web-browsing use cases, an internet connection over several Mbps offers but a tiny improvement in performance

Page 49: distributing over the web

TCP Slow-Start

• client sends a SYN packet which advertises its maximum buffer size (receive window)

• sender replies by sending several packets back

• then, for each ACK it receives from the client, it increases the number of packets that can be "on the wire" (cwnd) while unacknowledged, which doubles cwnd every round trip -> that allows exponential growth

• …

avoid sending more data than the network is capable of transmitting

HTTP traffic tends to make use of short and bursty connections - in these cases we often never even reach the full capacity of our pipes

Page 50: distributing over the web

why do we need header compression

§ page with 80 assets, each request has 1400 bytes of headers.

• 7-8 roundtrips to get the headers out “on the wire”… without counting response time.
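The round-trip figure can be reproduced with a small slow-start model. The numbers that are not on the slide are assumptions: a 1460-byte segment size, an initial congestion window of one segment, no packet loss, and all headers sent back to back on a single connection.

```python
import math


def round_trips(total_bytes: int, segment_size: int = 1460, initial_cwnd: int = 1) -> int:
    """Count round trips needed to send total_bytes under idealised slow start."""
    segments = math.ceil(total_bytes / segment_size)
    cwnd, sent, trips = initial_cwnd, 0, 0
    while sent < segments:
        sent += cwnd   # send a full window, then wait for the ACKs
        cwnd *= 2      # the window doubles every round trip
        trips += 1
    return trips


header_bytes = 80 * 1400                      # 80 requests, 1400 bytes of headers each
trips_small_window = round_trips(header_bytes)                 # initial cwnd of 1 segment
trips_large_window = round_trips(header_bytes, initial_cwnd=10)
```

Under these assumptions the 112,000 bytes of headers need 7 round trips with an initial window of one segment and 4 with an initial window of ten, which is why shrinking the headers matters more than adding bandwidth.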

Page 51: distributing over the web

pipelining

client server

does not scale very well… :-(

Page 52: distributing over the web

pipelining

client server

much better, but it suffers from “head-of-line” blocking

Page 53: distributing over the web

TCP Slow-Start

focus on cutting down the round-trip time between the client and server, not necessarily just investing in bigger pipes

• re-use your TCP connections

• support HTTP keep-alive and pipelining

• think about end-2-end latency

Page 54: distributing over the web

specs

HTTP/2 - RFC7540

HPACK - RFC7541

Page 55: distributing over the web

terminology

Page 56: distributing over the web

frame anatomy

 +-----------------------------------------------+
 |                 Length (24)                   |
 +---------------+---------------+---------------+
 |   Type (8)    |   Flags (8)   |
 +-+-------------+---------------+-------------------------------+
 |R|                 Stream Identifier (31)                      |
 +=+=============================================================+
 |                   Frame Payload (0...)                      ...
 +---------------------------------------------------------------+
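The fixed 9-byte frame header in this layout can be decoded with a few lines of code. A minimal sketch using a hypothetical `parse_frame_header` helper; it reads the header fields only and does not validate frame types or payloads.

```python
import struct
from typing import Dict


def parse_frame_header(header: bytes) -> Dict[str, int]:
    """Decode the 9-byte HTTP/2 frame header into its fields."""
    if len(header) != 9:
        raise ValueError("an HTTP/2 frame header is exactly 9 bytes")
    # 24-bit length (as 8 + 16 bits), 8-bit type, 8-bit flags, 32-bit stream word
    length_high, length_low, frame_type, flags, stream_word = struct.unpack(">BHBBI", header)
    return {
        "length": (length_high << 16) | length_low,
        "type": frame_type,
        "flags": flags,
        "stream_id": stream_word & 0x7FFFFFFF,  # drop the reserved R bit
    }


# Usage: a HEADERS frame (type 0x1) with END_STREAM (0x1) and END_HEADERS (0x4)
# set, carrying 5 payload bytes on stream 3.
example = parse_frame_header(bytes([0, 0, 5, 0x1, 0x5, 0, 0, 0, 3]))
```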

Page 57: distributing over the web

frame types

- HEADERS

- DATA

- SETTINGS

- WINDOW_UPDATE

- PUSH_PROMISE

- PRIORITY

- RST_STREAM

- GOAWAY

- PING

- CONTINUATION

HEADERS and DATA form the basis of an HTTP request

Page 58: distributing over the web

DATA frame

 +---------------+
 |Pad Length? (8)|
 +---------------+-----------------------------------------------+
 |                            Data (*)                         ...
 +---------------------------------------------------------------+
 |                           Padding (*)                       ...
 +---------------------------------------------------------------+

Page 59: distributing over the web

HEADERS frame

 +---------------+
 |Pad Length? (8)|
 +-+-------------+-----------------------------------------------+
 |E|                 Stream Dependency? (31)                     |
 +-+-------------+-----------------------------------------------+
 |  Weight? (8)  |
 +-+-------------+-----------------------------------------------+
 |                   Header Block Fragment (*)                 ...
 +---------------------------------------------------------------+
 |                           Padding (*)                       ...
 +---------------------------------------------------------------+

Page 60: distributing over the web

GET request example

  GET /resource HTTP/1.1           HEADERS
  Host: example.org          ==>     + END_STREAM
  Accept: image/jpeg                 + END_HEADERS
                                       :method = GET
                                       :scheme = https
                                       :path = /resource
                                       host = example.org
                                       accept = image/jpeg

Page 61: distributing over the web

GET response example

  HTTP/1.1 304 Not Modified        HEADERS
  ETag: "xyzzy"              ==>     + END_STREAM
  Expires: Thu, 23 Jan ...           + END_HEADERS
                                       :status = 304
                                       etag = "xyzzy"
                                       expires = Thu, 23 Jan ...

Page 62: distributing over the web

POST request example

  POST /resource HTTP/1.1          HEADERS
  Host: example.org          ==>     - END_STREAM
  Content-Type: image/jpeg           - END_HEADERS
  Content-Length: 123                  :method = POST
                                       :path = /resource
  {binary data}                        :scheme = https

                                   CONTINUATION
                                     + END_HEADERS
                                       content-type = image/jpeg
                                       host = example.org
                                       content-length = 123

                                   DATA
                                     + END_STREAM
                                   {binary data}

Page 63: distributing over the web

server push

in addition to the response to the original request, the server can push additional resources to the client without the client having to request each one explicitly

Page 64: distributing over the web

server push

• server receives HEADERS frame asking for index.html in stream 3, and it can forecast the need for styles.css and script.js

• server sends a PUSH_PROMISE for styles.css and a PUSH_PROMISE for script.js in stream 3

• server sends a HEADERS frame in stream 3 for responding to the request for index.html

• server sends DATA frame(s) with the content of index.html in stream 3

• server sends HEADERS frame for the response to styles.css in stream 4 and then HEADERS for the response to script.js in stream 6

• server sends DATA frames for the contents of styles.css in stream 4 and DATA frames for the contents of script.js in stream 6

Page 65: distributing over the web

PUSH_PROMISE frame

 +---------------+
 |Pad Length? (8)|
 +-+-------------+-----------------------------------------------+
 |R|                  Promised Stream ID (31)                    |
 +-+-----------------------------+-------------------------------+
 |                   Header Block Fragment (*)                 ...
 +---------------------------------------------------------------+
 |                           Padding (*)                       ...
 +---------------------------------------------------------------+

Page 66: distributing over the web

what’s next

• hosting and distributing petabyte datasets

• computing on large data across organizations

• high-volume high-definition on-demand or real-time media streams

• versioning and linking of massive datasets

• preventing accidental disappearance of important files

• more

Page 67: distributing over the web


Page 68: distributing over the web

do not power it down!

Page 69: distributing over the web

do not power it down!

Page 70: distributing over the web

HTTP encourages hypercentralization

centrally managed web servers inevitably shut down

Page 71: distributing over the web

HTTP is inefficient

2,576,067,779 views, clocking in at 117 Megabytes each = 301.4 Petabytes

Page 72: distributing over the web

HTTP is inefficient

2,576,067,779 views, clocking in at 117 Megabytes each = 301.4 Petabytes

assuming 1 cent per gigabyte, that means about 3,000,000…
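The figures on this slide can be checked with a few lines of arithmetic. The sketch assumes decimal units (1 GB = 1,000 MB, 1 PB = 1,000,000,000 MB) and expresses the price in whole currency units, so 1 cent is 0.01; the function names are illustrative.

```python
def total_transfer_petabytes(views: int, megabytes_per_view: int) -> float:
    """Total data transferred, in decimal petabytes."""
    return views * megabytes_per_view / 1_000_000_000


def transfer_cost(views: int, megabytes_per_view: int, price_per_gigabyte: float) -> float:
    """Total transfer cost for a given price per decimal gigabyte."""
    gigabytes = views * megabytes_per_view / 1_000
    return gigabytes * price_per_gigabyte


petabytes = total_transfer_petabytes(2_576_067_779, 117)
cost = transfer_cost(2_576_067_779, 117, 0.01)
```

The total comes to roughly 301.4 petabytes, and at 0.01 per gigabyte the cost lands slightly above 3,000,000, in line with the slide.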

Page 73: distributing over the web

overdependence on the Internet backbone

Page 74: distributing over the web

IPFS

instead of looking for a centrally-controlled location and asking it what it thinks /img/neocitieslogo.svg is, what if we instead asked a distributed network of millions of computers not for the name of a file, but for the content that is supposed to be in the file?

This is precisely what IPFS does.
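The idea of asking for content rather than for a location can be sketched with a toy content-addressed store. This is not the IPFS API or its real identifier format: `ContentStore` is a hypothetical in-memory class that uses a plain SHA-256 hex digest as the address.

```python
import hashlib
from typing import Dict


class ContentStore:
    """Toy content-addressed store: the key is the hash of the content."""

    def __init__(self) -> None:
        self._blocks: Dict[str, bytes] = {}

    def put(self, content: bytes) -> str:
        address = hashlib.sha256(content).hexdigest()
        self._blocks[address] = content
        return address

    def get(self, address: str) -> bytes:
        content = self._blocks[address]
        # Any node could have served these bytes, so verify them against the address.
        if hashlib.sha256(content).hexdigest() != address:
            raise ValueError("content does not match its address")
        return content


# Usage: identical content always maps to the same address.
store = ContentStore()
address = store.put(b"<svg>logo</svg>")
```

Because the address is derived from the bytes themselves, it does not matter which machine answers the request, and tampered content is detectable by the requester.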

https://ipfs.io/

Page 75: distributing over the web

Q&A time