i like trains - mrmcd15

Post on 22-Mar-2017


I like trains

Exploiting undocumented German railway APIs

API Calls very servertime

curl -i -s -k -X 'GET' 'http://www.apps-bahn.de/bin/livemap/query-livemap.exe/dny?L=vs_livefahrplan&tpl=time2json&performLocating=512&look_nv=type|servertime&'

[ { "date":"05.09.15","time":"02:55","hours" : "2","minutes" : "55","seconds" : "12","year" : "2015","month" : "9","day" : "5","weekday" : "5","HHMMSS" : "025512","DDMMYYYY" : "05092015" }]

API Calls much stations

curl -i -s -k -X 'GET' 'http://www.apps-bahn.de/bin/livemap/query-livemap.exe/dny?L=vs_livefahrplan&performLocating=2&tpl=stop2json&look_maxno=150&look_nv=get_stopweight|yes|nur_hauptmast|yes|minId|8000000|maxId|8099999&look_stopclass=8&look_maxx=8120171&look_maxy=51000923&look_minx=6822411&look_miny=50392017&'

{ "prods":"8", "stops": [{ "x" : "6832952", "y" : "50807094" ,"name" : "Erftstadt" ,"urlname" : "Erftstadt" ,"prodclass" : "8" ,"extId":"8003671" ,"puic":"80" ,"planId":"1440019499" ,"stopweight":"5240" },{...}] ,"error" : "0" ,"numberofstops": "191"}
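A rough sketch of turning the station response into usable records. Two assumptions that the slides do not state: x/y are longitude/latitude in millionths of a degree (the Erftstadt values fit that reading), and "error" : "0" means success. The sample is trimmed to the one stop shown above; parse_stops is a hypothetical helper name.

```python
import json

# Trimmed copy of the response above: only the first stop is kept.
RAW = (
    '{"prods":"8","stops":[{"x":"6832952","y":"50807094","name":"Erftstadt",'
    '"urlname":"Erftstadt","prodclass":"8","extId":"8003671","puic":"80",'
    '"planId":"1440019499","stopweight":"5240"}],"error":"0","numberofstops":"191"}'
)


def parse_stops(body: str) -> list:
    """Return one dict per stop with numeric coordinates."""
    data = json.loads(body)
    if data.get("error") != "0":  # assumption: "0" signals success
        raise ValueError("API reported error " + str(data.get("error")))
    stops = []
    for stop in data["stops"]:
        stops.append({
            "name": stop["name"],
            "ext_id": stop["extId"],
            # Assumption: coordinates are degrees multiplied by 1,000,000.
            "lon": int(stop["x"]) / 1_000_000,
            "lat": int(stop["y"]) / 1_000_000,
            "weight": int(stop["stopweight"]),
        })
    return stops


stops = parse_stops(RAW)
print(stops[0])
```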

API Calls so trains

curl -i -s -k -X 'GET' 'http://www.apps-bahn.de/bin/livemap/query-livemap.exe/dny?L=vs_livefahrplan&look_trainid=84/84935/18/19/80&tpl=chain2json3&performLocating=16&format_xy_n'

[ [ [14547207,52335925], [...] [12566416,52400756] ], [ [0,"Frankfurt(Oder)"], [71,"Frankfurt(Oder)-Rosengarten"], [120,"Pillgram"], [137,"Jacobsdorf(Mark)"], [175,"Briesen(Mark)"], [..] [1510,"Potsdam Park Sanssouci"], [1580,"Werder(Havel)"], [1662,"Groß Kreutz"], [1691,"Götz"], [1768,"Brandenburg Hbf"] ]]

curl -i -s -k -X 'GET' 'http://www.apps-bahn.de/bin/livemap/query-livemap.exe/dny?L=vs_livefahrplan&tpl=singletrain2json&performLocating=8&look_nv=get_rtmsgstatus|yes|get_rtfreitextmn|yes|get_rtstoptimes|yes|get_fstop|yes|get_pstop|yes|get_nstop|yes|get_lstop|yes|zugposmode|3|&look_trainid=84/84935/18/19/80'

{ "look":{ "singletrain":[ { "trainid":"84/84935/18/19/80", "x":"12566137", "y":"52400558", "name":"RE 18150", "pstopname":"Brandenburg Hbf", "pstopno":"8010060", "parr":"2:35", "fstopname":"Frankfurt(Oder)", "fstopno":"8010113", "fdep":"0:29", "lstopname":"Brandenburg Hbf", "lstopno":"8010060", "larr":"2:35", "pass":"19", "edgeid":"20", "passproc":"0" } ] }}
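The single-train request can also be assembled programmatically. This is only a sketch: the parameter names and values are copied from the curl call on the slide, while the helper name single_train_url and the option dict are illustrative. It builds the URL and does not send anything.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.apps-bahn.de/bin/livemap/query-livemap.exe/dny"

# Option pairs as they appear in the slide's look_nv parameter.
LOOK_NV_OPTIONS = {
    "get_rtmsgstatus": "yes",
    "get_rtfreitextmn": "yes",
    "get_rtstoptimes": "yes",
    "get_fstop": "yes",
    "get_pstop": "yes",
    "get_nstop": "yes",
    "get_lstop": "yes",
    "zugposmode": "3",
}


def single_train_url(train_id: str) -> str:
    """Build the singletrain2json query URL for one train id."""
    # look_nv is a pipe-separated key|value list with a trailing pipe.
    look_nv = "|".join(k + "|" + v for k, v in LOOK_NV_OPTIONS.items()) + "|"
    params = {
        "L": "vs_livefahrplan",
        "tpl": "singletrain2json",
        "performLocating": "8",
        "look_nv": look_nv,
        "look_trainid": train_id,
    }
    # Keep "|" and "/" literal, as in the original curl call.
    return BASE_URL + "?" + urlencode(params, safe="|/")


url = single_train_url("84/84935/18/19/80")
print(url)
```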

221113501, 212867344, "84/183337/18/19/80", "28", 2, "27", "Basel SBB", [[16585, -35714, -8070, "29", "88", "0", "1"], [18958, -36649, 0, "4", "0", null, null], [18967, -36649, 59, "29", "61", null, null], [20144, -37126, 5900, "29", "61", null, null], [21349, -37584, 11742, "", "0", null, null], [21349, -37584, 11801, "29", "60", null, null], [21960, -37800, 14722, "29", "60", null, null], [22581, -37998, 17643, "30", "61", null, null], [23228, -38169, 20564, "30", "60", null, null], [23884, -38321, 23485, "30", "60", null, null], [25098, -38591, 28860, "30", "53", null, null], [25205, -38609, 29327, "30", "61", null, null], [26518, -38906, 35169, "", "0", null, null]], "Koblenz Hbf", "8000206", "Mainz Hbf", "8000240", "04.09.15", "0", null, "1:36", "0:40", "27", "33", "0", null, null], ["IC 60457", 1155930774, 1591832938, "84/183259/18/19/80", "0", 2, "0", "Praha hl.n.", [[136330, 59796, -32737, "7", "98", null, null], [136438, 75959, 32664, "7", "98", null, null], [136447, 77748, 39903, "", "0", null, null]], "Bielefeld Hbf", "8000036", "Berlin Hbf (tief)", "8098160", "04.09.15", "0", null, "4:22", "0:43", "0", "18", "0", null, null], ["CNL40419", 1552797704, 1293572134, "84/183116/18/19/80", "28", 2, "31", "Zürich HB", [[68767, -64794, -674, "29", "109", null, null], [69388, -64992, 940, "30", "111", null, null], [70035, -65163, 2553, "30", "109", null, null], [70691, -65315, 4167, "30", "110", null, null], [71905, -65585, 7135, "30", "97", null, null], [72012, -65603, 7393, "30", "110", null, null], [73325, -65900, 10620, "30", "111", null, null], [75923, -66556, 17074, "29", "110", null, null], [77208, -66906, 20301, "30", "110", null, null], [78512, -67221, 23528, "29", "111", null, null], [78566, -67239, 23657, "30", "109", null, null], [79168, -67374, 25141, "30", "109", null, null], [79833, -67500, 26755, "31", "111", null, null], [80525, -67581, 28368, "31", "109", null, null], [81217, -67608, 29982, "0", "106", null, null], [81496, -67608, 30627, "0", 
"111", null, null], [81874, -67599, 31499, "0", "109", null, null], [82602, -67527, 33209, "1", "109", null, null], [83267, -67437, 34790, "1", "111", null, null], [83968, -67347, 36435, "", "0", null, null]], "Koblenz Hbf", "8000206", "Mainz Hbf", "8000240", "04.09.15", "0", null, "1:36", "0:40", "31", "31", "0", null, null], ["RB 11840", 1366314646, 1180032903, "84/94656/18/19/80", "6", 8, "", "Neuss Hbf", [[242, -827, -1115, "9", "40", "1", "1"], [89, -386, 3368, "8", "39", "1", "2"], [-18, 54, 7850, "8", "39", null, null], [-198, 944, 16815, "8", "40", "1", "3"], [-324, 1843, 25780, "8", "40", null, null], [-378, 2293, 30262, "8", "40", "1", "4"], [-405, 2742, 34744, "7", "40", "1", "6"], [-396, 3192, 39227, "", "0", "1", "7"]], "Holzheim(b Neuss)", "8002979", "Neuss Hbf", "8000274", "05.09.15", "-1", null, "1:51", "1:44", null, null, "4", null, null], ["RE 10892", 631912656, 1369918931, "84/93747/18/19/80", "20", 8, "", "Bielefeld Hbf", [[351, -18, -316, "15", "273", "3", "0"], [0, 0, 0, "", "0", null, null], [0, 0, 60000, "", "0", null, null]], "Herford", "8000162", "Brake(b Bielefeld)", "8001118", "05.09.15", "-1", null, "1:55", "1:50", null, null, "4", null, null], ["RE 10246", 1120099420, 908435482, "84/92655/18/19/80", "18", 8, "", "Essen Hbf", [], "Münster(Westf)Hbf", "8000263", "Münster-Albachten", "8000462", "05.09.15", "0", null, "2:16", "2:10", null, null, "0", null, null], ["RE 10174", 1326439793, 732721842, "84/92521/18/19/80", "14", 8, "8192", "Aachen Hbf", [], "Köln Hbf", "8000207", "Köln-Ehrenfeld", "8000208", "05.09.15", "0", null, "2:19", "2:15", "13", "15", "0", null, null], ["CNL 457", 1155930774, 1591832938, "84/82286/18/19/80", "0", 2, "8", "Praha hl.n.", [[136330, 59796, -32738, "7", "98", null, null], [136438, 75959, 32663, "7", "98", null, null], [136447, 77748, 39902, "", "0", null, null]], "Bielefeld Hbf", "8000036", "Berlin Hbf (tief)", "8098160", "04.09.15", "0", null, "4:22", "0:43", "8", "40", "0", null, null], ["EN 447", 
1155930774, 1591832938, "84/82269/18/19/80", "0", 4, "0", "Warszawa Wschodnia", [[129283, 46205, -3804, "6", "80", null, null], [129822, 47032, 602, "6", "80", null, null], [130388, 47877, 5142, "6", "79", null, null], [130676, 48308, 7456, "6", "80", null, null], [130928, 48695, 9503, "6", "80", null, null], [131476, 49531, 13955, "6", "80", null, null], [132024, 50367, 18406, "6", "80", null, null], [133130, 52021, 27262, "6", "80", null, null], [133669, 52848, 31669, "6", "80", null, null], [134227, 53693, 36165, "", "0", null, null]], "Bielefeld Hbf", "8000036", "Berlin Hbf (tief)", "8098160", "04.09.15", "0", null, "4:22", "0:43", "0", "40", "0", null, null], ["CNL 419", 221113501, 212867344, "84/82252/1

curl -i -s -k -X 'GET' 'http://www.apps-bahn.de/bin/livemap/query-livemap.exe/dny?L=vs_livefahrplan&performLocating=1&performFixedLocating=9'

[...],[ "IC 60457", 1155930774, 1591832938, "84/183259/18/19/80", "0", 2, "0", "Praha hl.n.", [ [136330, 59796, -32737, "7", "98", null, null], [136438, 75959, 32664, "7", "98", null, null], [136447, 77748, 39903, "", "0", null, null] ], "Bielefeld Hbf", "8000036", "Berlin Hbf (tief)", "8098160", "04.09.15", "0", null, "4:22", "0:43", "0", "18", "0", null, null],[...]

name, x, y, id

direction, productclass, delay, lstopname, pstopname, pstopno, nstopname, nstopno, dateRef, ageofreport, lastreporting, nstoparrival, pstopdeparture, zpathflags, additionaltype, hideMoments
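A small sketch of pulling the leading fields out of one such record. Only the first four positions (name, x, y, id) are mapped here, because those line up with the x/y values quoted on the next slide; the positions of the remaining labels are not assumed. head_of is a hypothetical helper.

```python
import json

# One record copied from the slide; the surrounding "[...]" elisions are left out.
RAW = (
    '["IC 60457", 1155930774, 1591832938, "84/183259/18/19/80", "0", 2, "0", '
    '"Praha hl.n.", [[136330, 59796, -32737, "7", "98", null, null], '
    '[136438, 75959, 32664, "7", "98", null, null], '
    '[136447, 77748, 39903, "", "0", null, null]], '
    '"Bielefeld Hbf", "8000036", "Berlin Hbf (tief)", "8098160", "04.09.15", '
    '"0", null, "4:22", "0:43", "0", "18", "0", null, null]'
)

# Assumption: these four labels correspond to positions 0 to 3.
HEAD_FIELDS = ("name", "x", "y", "id")


def head_of(record: list) -> dict:
    """Name the first four entries; zip stops after the last label."""
    return dict(zip(HEAD_FIELDS, record))


head = head_of(json.loads(RAW))
print(head)
```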

Problem

these are not lat/longs

x = 1155930774

y = 1591832938

Encrypted?

“Encryption/Decryption”

Calc CKV

22222 + ((date+n) % 22222)

“Decrypt”

ckv * (x % ckv) + int(x / ckv)
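The two formulas above, written out as a Python sketch. The slides do not say how date is turned into a number or what n is, so both are plain integer arguments here and any concrete values are placeholders. Integer division (//) stands in for int(x / ckv), which gives the same result for non-negative integers.

```python
def calc_ckv(date_value: int, n: int) -> int:
    """Slide formula: 22222 + ((date + n) % 22222)."""
    return 22222 + ((date_value + n) % 22222)


def decode(value: int, ckv: int) -> int:
    """Slide formula: ckv * (x % ckv) + int(x / ckv); assumes value >= 0."""
    return ckv * (value % ckv) + value // ckv


# Tiny worked example with made-up numbers, not real coordinates:
# 7 = 1 * 5 + 2, so quotient 1 and remainder 2 swap places: 5 * 2 + 1 = 11.
print(decode(7, 5))
print(decode(decode(7, 5), 5))
```

The operation swaps quotient and remainder, so applying it twice returns the input whenever the quotient is smaller than ckv.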

Profit

So much coordinates.

How much we talkin’?

~160 million datasets per year

...stored as JSON… yikes!

What to do...

Full search over 80 GB worth of JSON?

Nope.

SomeSQL?

NOPE NOPE NOPE

What to do?

no budget, high expectations

Elasticsearch

→ performs well with large datasets

→ easy clustering

How it works

Collect / Normalize

request all that data

fix formats

convert location

Store

save everything to a file

Import

import to ES

import everything again because you forgot something
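The collect / store / import steps could be sketched roughly like this. It is not the project's actual code: the document field names, the ckv value 22222, the timestamp, and the index name trains-2015.09.05 are all placeholders. The bulk payload follows the Elasticsearch bulk format (one action line, one document line, newline-terminated); the _type entry matches the older Elasticsearch versions the deck's mapping targets.

```python
import io
import json


def decode(value, ckv):
    """Slide formula, with // in place of int(x / ckv); assumes value >= 0."""
    return ckv * (value % ckv) + value // ckv


def normalize(record, ckv, timestamp):
    """Collect / Normalize: flatten one raw record (positions 0-3 assumed)."""
    name, x, y, train_id = record[:4]
    return {
        "name": name,
        "train_id": train_id,
        "timestamp": timestamp,
        "x": decode(x, ckv),
        "y": decode(y, ckv),
    }


def store(docs, handle):
    """Store: write one JSON document per line."""
    for doc in docs:
        handle.write(json.dumps(doc, ensure_ascii=False) + "\n")


def bulk_body(handle, index, doc_type="train"):
    """Import: turn stored lines into a bulk request body."""
    out = []
    for line in handle:
        line = line.strip()
        if line:
            out.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
            out.append(line)
    return "\n".join(out) + "\n"


record = ["IC 60457", 1155930774, 1591832938, "84/183259/18/19/80"]
buffer = io.StringIO()  # stands in for the file on disk
store([normalize(record, ckv=22222, timestamp="2015-09-05T02:55:12")], buffer)
buffer.seek(0)
payload = bulk_body(buffer, index="trains-2015.09.05")
print(payload)
```

Keeping the stored file as the source of truth is what makes "import everything again" cheap: only the last step has to be rerun.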

Current stack

3 ES servers

~3.4 GB res/srv

~40 GB disk/srv

~2 CPUs/srv

1 nginx + kibana

Indexes

1 index per day

2 shards

2 replicas
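One way the per-day layout might look as index settings. The trains- prefix and the date pattern are invented for illustration; number_of_shards and number_of_replicas are the standard Elasticsearch index settings, and reading "2 shards, 2 replicas" as exactly these two values is an assumption.

```python
import json
from datetime import date


def daily_index(day: date, prefix: str = "trains") -> str:
    """Index name for one day, e.g. trains-2015.09.05 (naming is hypothetical)."""
    return f"{prefix}-{day:%Y.%m.%d}"


def index_settings(shards: int = 2, replicas: int = 2) -> dict:
    """Body for creating one daily index."""
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        }
    }


index_name = daily_index(date(2015, 9, 5))
print(index_name)
print(json.dumps(index_settings()))
```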

Mappings!

less data (in memory)

mapping = {
    'train': {
        '_all': {'enabled': False},
        'properties': {
            'cid': {
                'type': 'string',
                'index': 'not_analyzed',
                'norms': {'enabled': False}},
            'timestamp': {
                'type': 'date',
                'norms': {'enabled': False}},
            'location': {
                'type': 'geo_point',
                'fielddata': {
                    'lat_lon': True,
                    'format': 'compressed',
                    'precision': '3m'
                },
                'norms': {'enabled': False}
            },
            'name': {
                'type': 'string',
                'analyzer': 'keyword',
                'norms': {'enabled': False},
                'fields': {
                    'raw': {'type': 'string', 'index': 'not_analyzed'}
                }
            }
            […]

What we get

with (almost) no budget, still high expectations

Fast search

sub-second searches for <2 weeks

<30 seconds for whole year

strange things happening

WTF

negative delays

~3000 docs / ~800 unique trains

delays

trains with >30000 min delay
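Counts like these could come from a range query plus a cardinality aggregation. The sketch below only builds the request body; the field names delay and train_id are guesses, since the visible part of the mapping is cut off before any delay field.

```python
import json


def negative_delay_query(delay_field: str = "delay", train_field: str = "train_id") -> dict:
    """Count documents with a negative delay and the distinct trains among them."""
    return {
        "size": 0,  # only totals are needed, no hits
        "query": {"range": {delay_field: {"lt": 0}}},
        "aggs": {
            # cardinality is an approximate distinct count
            "unique_trains": {"cardinality": {"field": train_field}},
        },
    }


body = negative_delay_query()
print(json.dumps(body, indent=2))
```

Swapping the range clause for {"gt": 30000} would give the matching query for the extreme delays.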

what’s next?

Delays - averaged per day

Delays - summed up

Uniq train_ids per day

Delays - average by type
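Charts like the four above map naturally onto Elasticsearch aggregations. A hedged sketch of the request bodies follows; delay, train_id, and type are assumed field names, and only timestamp appears in the mapping shown earlier. The "interval" key matches the older Elasticsearch versions the deck's mapping targets; newer releases use "calendar_interval" instead.

```python
import json


def delays_per_day(delay_field: str = "delay", train_field: str = "train_id") -> dict:
    """Average delay, summed delay, and distinct trains for each day."""
    return {
        "size": 0,
        "aggs": {
            "per_day": {
                "date_histogram": {"field": "timestamp", "interval": "day"},
                "aggs": {
                    "avg_delay": {"avg": {"field": delay_field}},
                    "sum_delay": {"sum": {"field": delay_field}},
                    "unique_trains": {"cardinality": {"field": train_field}},
                },
            }
        },
    }


def delays_by_type(delay_field: str = "delay", type_field: str = "type") -> dict:
    """Average delay grouped by train type."""
    return {
        "size": 0,
        "aggs": {
            "by_type": {
                "terms": {"field": type_field},
                "aggs": {"avg_delay": {"avg": {"field": delay_field}}},
            }
        },
    }


per_day = delays_per_day()
by_type = delays_by_type()
print(json.dumps(per_day, indent=2))
```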

kibana.iliketrains.de

elastic.iliketrains.de

Twitter

@nv1t

@lung_yean

Github

makujaho/trainspotter
