Post on 15-Jul-2020

App Engine: Datastore Introduction

Part 1

Another very useful course: https://www.udacity.com/course/developing-scalable-apps-in-java--ud859

Topics covered in this lesson

• What is Datastore?
  – Datastore and relational databases
  – Scalability, reliability and performance
• Datastore Internals
  – Bigtable
• Datastore Basic Operations
  – Entities, Properties and Keys
  – Properties and Value Types
  – Datastore APIs

What is Datastore?

• Datastore is a database (persistent storage) for AppEngine

                             AppEngine                           Traditional Web Apps
Web application framework    AppEngine (Java, Python, PHP, Go)   Perl/CGI, PHP, Ruby on Rails
Persistent storage           Datastore                           RDBMS: MySQL, MS SQL, Oracle

What is Datastore? (continued)

• Persistent storage for AppEngine
• AppEngine is very scalable
  => Many instances
  => Central server to store data from all instances
• Why not RDB? Scalability!

Datastore and RDBMS

                              Datastore                             RDBMS
Query language flexibility    SQL-like query language:              Full support of SQL:
                              limited to simple filter and sort     - Table JOIN
                                                                    - Flexible filtering
                                                                    - Subquery
Reliability and scalability   Highly scalable and reliable,         Hard to scale
                              with performance

Datastore offers Google-level scalability

Problems of Scalability and Reliability

• Single instance
  – Performance limited by machine resources
  – Single point of failure
• Replication (copies) increases reliability
  – Consistency among instances
• Sharding (split among machines)
  – Lock control (transaction)

[Shard = split server into multiple machines]

Strong Consistency and Eventual Consistency

Strong Consistency     Data is always consistent among all database instances
                       - Just after a write operation
                       - Even after a crash in the middle of a write operation
                       -> All servers return the same results.
Eventual Consistency   It takes time until all data becomes consistent after a write
                       (think of DNS as an example)

DNS is a distributed database system. An updated configuration for a domain is not reflected on all DNS servers immediately. For a certain period of time, some DNS servers return old data.
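The difference between the two models can be sketched with a toy replicated store. This is only an illustration under simple assumptions: the class `ReplicatedStore`, its methods, and the IP addresses are hypothetical names invented for this example, and real replication protocols are far more involved.

```python
class ReplicatedStore:
    """Toy key-value store with several replicas (hypothetical model)."""

    def __init__(self, n_replicas=3):
        self.replicas = [{} for _ in range(n_replicas)]
        self.pending = []  # (replica_index, key, value) updates not yet delivered

    def write_strong(self, key, value):
        # Strong consistency: update every replica before the write returns.
        for replica in self.replicas:
            replica[key] = value

    def write_eventual(self, key, value):
        # Eventual consistency: update one replica now, queue the rest.
        self.replicas[0][key] = value
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, value))

    def propagate(self):
        # Deliver queued updates in order, as background replication would.
        for i, key, value in self.pending:
            self.replicas[i][key] = value
        self.pending.clear()

    def read(self, key, replica_index):
        return self.replicas[replica_index].get(key)


store = ReplicatedStore()
store.write_strong("example.com", "10.0.0.1")    # all replicas agree at once
store.write_eventual("example.com", "10.0.0.2")  # only replica 0 is updated

fresh = store.read("example.com", 0)      # new value
stale = store.read("example.com", 2)      # old value, like a lagging DNS server
store.propagate()
converged = store.read("example.com", 2)  # new value after propagation
```

Until `propagate()` runs, a reader that happens to hit replica 2 sees the old address, which is the DNS behaviour described above.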

Scalability, Reliability and Performance on RDB

• Replication and/or sharding for scalability
• But…
  – Strong consistency on RDB slows write operations due to locking.
  – Join operation is a bottleneck due to data shuffling.

RDB ensures strong consistency -> Hard to ensure scalability.

Datastore for AppEngine

Datastore Internals

• Based on Bigtable, which offers super high scalability.
• High availability by High Replication Datastore (HRD)
  – Synchronous write on multiple datacenters.
• Supports strong consistency among multiple rows

What is Bigtable?

• Scalable, distributed, highly available and structured storage
  – Bigtable is not a database itself (it doesn't support queries)
• Consistency
  – Strong consistency for a single row
  – Eventual consistency at the multi-row level
• Google usage
  – In production since April 2005
  – Web search, YouTube…

Automatic Scale-out of Bigtable table server

Bigtable Data Model

• Key-value data storage
• A row has a Key and Columns
• Sorted by Key
  – In lexical order
  – Enables range queries by the application
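A minimal sketch of why sorted keys make range queries cheap, assuming an in-memory stand-in for a table. `SortedRowStore` and the key layout (`user/…`, `post/…`) are hypothetical and are not Bigtable's API; the point is only that a range scan becomes two binary searches over the sorted keys.

```python
import bisect


class SortedRowStore:
    """In-memory stand-in for a table whose rows are sorted by key."""

    def __init__(self):
        self._keys = []  # row keys, kept in lexical order
        self._rows = {}  # key -> dict of columns

    def put(self, key, columns):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = dict(columns)

    def get(self, key):
        return self._rows.get(key)

    def scan(self, start_key, end_key):
        """Return rows with start_key <= key < end_key, in key order."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, end_key)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]


table = SortedRowStore()
table.put("user/carol", {"age": 41})
table.put("user/alice", {"age": 30})
table.put("post/001", {"title": "hello"})
table.put("user/bob", {"age": 25})

# "0" is the character right after "/", so this range covers every "user/" key.
users = table.scan("user/", "user0")
user_keys = [key for key, _ in users]
```

Here `user_keys` comes back in lexical order regardless of insertion order, and the `post/001` row is never touched by the scan.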

Bigtable Operations

• CRUD on a row
  – Create, Read, Update and Delete operations
  – Preserves single-row strong consistency (not multiple rows).
• Scan by range of keys
  – But cannot search by column values

Scalability is based on Bigtable automated sharding.
Megastore supports transactions (strong consistency).
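The slide does not say how the automated sharding works, so the following is only a rough sketch of one common approach: each shard owns a contiguous key range and is split in two when it grows too large. The class `ShardedTable` and the tiny limit `MAX_ROWS_PER_SHARD` are assumptions made for illustration.

```python
import bisect

MAX_ROWS_PER_SHARD = 4  # hypothetical limit, tiny so that splits are visible


class ShardedTable:
    """Keys are spread over shards; each shard owns a contiguous key range."""

    def __init__(self):
        self.shards = [[]]      # each shard is a sorted list of keys
        self.split_points = []  # split_points[i] is the first key of shard i + 1

    def _shard_index(self, key):
        # Count how many split points are <= key to find the owning shard.
        return bisect.bisect_right(self.split_points, key)

    def put(self, key):
        i = self._shard_index(key)
        shard = self.shards[i]
        pos = bisect.bisect_left(shard, key)
        if pos < len(shard) and shard[pos] == key:
            return  # key already stored
        shard.insert(pos, key)
        if len(shard) > MAX_ROWS_PER_SHARD:
            self._split(i)

    def _split(self, i):
        # Cut an oversized shard in half; the right half becomes a new shard.
        shard = self.shards[i]
        mid = len(shard) // 2
        left, right = shard[:mid], shard[mid:]
        self.shards[i:i + 1] = [left, right]
        self.split_points.insert(i, right[0])


table = ShardedTable()
for n in range(10):
    table.put(f"k{n:02d}")

shard_count = len(table.shards)
boundaries = list(table.split_points)
```

In a real system each shard would live on a different machine, so adding data adds shards and the load spreads out without the application changing its keys.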

Property = actual data you want to store

Property can have multiple values. (Multiple data for one property)
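As an illustration of these two notes, an entity can be pictured as a key plus a dictionary of properties, where a property may hold a list of values. The `Entity` class, its `kind` field, the helper `property_values`, and the sample data below are all hypothetical; this is not the Datastore API.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical picture of an entity: a key plus named properties."""
    kind: str
    key: str
    properties: dict = field(default_factory=dict)


def property_values(entity, name):
    """Return every value stored under a property name as a list."""
    value = entity.properties.get(name)
    if value is None:
        return []
    return list(value) if isinstance(value, list) else [value]


book = Entity(
    kind="Book",
    key="Book/1001",
    properties={
        "title": "Datastore Basics",   # single-valued property
        "tags": ["cloud", "database"]  # multi-valued property
    },
)

title_values = property_values(book, "title")
tag_values = property_values(book, "tags")
```

Treating every property as a list of values, as `property_values` does, lets single-valued and multi-valued properties be handled the same way.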

App Engine: Datastore Query, Index and Transactions

Part 2

Bigtable can scan on a key, not on a value!

Index table on Bigtable: property name and value.
Implements queries on Bigtable (without reading the actual value).
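The idea can be sketched as follows, assuming index rows of the form (property name, value, entity key) kept in sorted order. The row layout, the function names, and the sample entities are assumptions for illustration and are not the real Datastore index format. A query on a property value then becomes a key-range scan over the index, and the entities themselves are never read.

```python
import bisect

# Hypothetical entities: entity key -> properties.
entities = {
    "Person/alice": {"city": "Hanoi", "age": 30},
    "Person/bob": {"city": "Tokyo", "age": 25},
    "Person/carol": {"city": "Hanoi", "age": 41},
}


def build_index(entities):
    """Build sorted index rows of (property name, value, entity key)."""
    rows = [
        (name, value, key)
        for key, props in entities.items()
        for name, value in props.items()
    ]
    return sorted(rows)


def query_equal(index_rows, name, value):
    """Find entity keys where property == value by scanning index rows only."""
    start = bisect.bisect_left(index_rows, (name, value))
    keys = []
    for row_name, row_value, entity_key in index_rows[start:]:
        if (row_name, row_value) != (name, value):
            break  # left the matching key range
        keys.append(entity_key)
    return keys


index_rows = build_index(entities)
in_hanoi = query_equal(index_rows, "city", "Hanoi")
aged_25 = query_equal(index_rows, "age", 25)
in_paris = query_equal(index_rows, "city", "Paris")
```

The sketch assumes all values of one property share a type, so that the index rows can be sorted; the matching entity keys come straight out of the index, and a separate lookup by key would fetch the full entities.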

Caveats = limitations
