Update on Tripview Live Bus Data

[Continuing on from my last post about Tripview bus data]

So I woke up this morning and what did I find? Something had changed.

Requesting data from URLs of the form http://realtime.grofsoft.com/tripview/delays?routes=SB_???_u always returned no tuples, even for a known active route. There had to be something more.

I pulled out the sniffer AP again – and I did a query for M50. I found two things –

GET /tripview/delays?routes=SB_M50_d HTTP/1.1
Host: realtime.grofsoft.com
Connection: keep-alive
Accept-Encoding: gzip, deflate
User-Agent: TripViewLite/212 CFNetwork/609 Darwin/13.0.0
Accept-Language: en-us
Accept: */*

The first is that the letter on the end can be a u or a d, likely depending on the direction of the bus.

The second is that the iPad was still receiving data on buses – a quick look revealed that USER AGENT SPOOFING is now required to get data from the server. If the server finds an unsupported user-agent string in the HTTP header, you will only get empty results!

So on Firefox, I got Override User Agent 0.4.0 and used the custom user agent of “TripViewLite/212 CFNetwork/609 Darwin/13.0.0” and success! The data came out just as expected. From the data, it appears that data is only available for the few buses which are either in run or about to commence their run – the server pushes out only limited historical data.
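For those who would rather script it than fiddle with a browser add-on, here is a minimal sketch of the same request in Python. The function names (build_request, fetch_delays) are my own invention, the UTF-8 decoding is an assumption, and of course the server may change its behaviour again at any time.

```python
import urllib.request

# User agent string copied from the sniffed TripView Lite request above.
TRIPVIEW_UA = "TripViewLite/212 CFNetwork/609 Darwin/13.0.0"
BASE_URL = "http://realtime.grofsoft.com/tripview/delays"


def build_request(route, direction):
    """Build a delays request for one route; direction is 'u' or 'd'."""
    if direction not in ("u", "d"):
        raise ValueError("direction must be 'u' or 'd'")
    url = f"{BASE_URL}?routes=SB_{route}_{direction}"
    # Without this header the server was seen to return an empty delays list.
    return urllib.request.Request(url, headers={"User-Agent": TRIPVIEW_UA})


def fetch_delays(route, direction, timeout=10):
    """Send the request and return the response body as text (assumes UTF-8)."""
    req = build_request(route, direction)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling fetch_delays("M50", "d") would issue the same GET as the trace above, minus the other headers, which did not appear to matter in my browser test.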

So it looks like someone at Grofsoft isn’t happy that their data may be being co-opted, stolen or scraped, or that server resources are being tied up by malformed queries (from experimentation). But still, at least they haven’t implemented something sinister like SSL/TLS yet. There’s still time to explore for those who wish to do so …

Using default Firefox User Agent:

{
 "pollInterval" : 30,
 "timestamp" : [redacted],
 "delays" : [

]}

With User Agent Switch:

{
 "pollInterval" : 30,
 "timestamp" : [redacted],
 "delays" : [
 {"route":"SB_M50_u","tripId":"80202618","start":"11:00","offsets":"11:22,4,11:24,5,11:29,3,11:33,4,11:36,5,11:38,6,11:43,5,11:45,6,11:46,5,11:46,6,11:49,4,11:49,5,11:50,4,11:51,5"},
 {"route":"SB_M50_u","tripId":"80202619","start":"11:15","offsets":"11:23,4,11:25,5,11:26,6,11:27,7,11:27,6,11:28,7,11:30,6,11:34,5,11:34,6,11:39,7,11:44,5,11:48,6,11:51,7,11:53,8,11:58,7,12:01,8,12:04,6,12:04,7,12:05,6,12:06,7"},
 {"route":"SB_M50_u","tripId":"80202617","start":"10:45","offsets":"11:28,-1,11:29,-2,11:31,-1,11:34,-3,11:34,-2,11:35,-3"},
 {"route":"SB_M50_u","tripId":"80202622","start":"12:00","offsets":"12:00,0"},
 {"route":"SB_M50_u","tripId":"80202623","start":"12:15","offsets":"12:15,0"},
 {"route":"SB_M50_u","tripId":"80202620","start":"11:30","offsets":"11:30,0"},
 {"route":"SB_M50_u","tripId":"80202621","start":"11:45"} 
]}
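The offsets field looks like a flat list of alternating time and offset values, so it splits quite naturally into pairs. Below is a rough sketch of how that might be parsed. The function names are mine, the timestamp of 0 is a placeholder for the redacted value, and I am only guessing that the second number of each pair is a delay (the units aren't stated anywhere in the response).

```python
import json


def parse_offsets(offsets):
    """Split 'HH:MM,n,HH:MM,n,...' into a list of (time, offset) pairs."""
    if not offsets:
        return []
    parts = offsets.split(",")
    if len(parts) % 2 != 0:
        raise ValueError("expected alternating time,offset values")
    return [(parts[i], int(parts[i + 1])) for i in range(0, len(parts), 2)]


def parse_delays(body):
    """Return {tripId: [(time, offset), ...]} from a delays response body."""
    data = json.loads(body)
    return {
        entry["tripId"]: parse_offsets(entry.get("offsets", ""))
        for entry in data.get("delays", [])
    }


# Two tuples copied from the response above; timestamp 0 is a placeholder.
sample = """{
 "pollInterval" : 30,
 "timestamp" : 0,
 "delays" : [
  {"route":"SB_M50_u","tripId":"80202617","start":"10:45","offsets":"11:28,-1,11:29,-2,11:31,-1,11:34,-3,11:34,-2,11:35,-3"},
  {"route":"SB_M50_u","tripId":"80202621","start":"11:45"}
 ]}"""

print(parse_delays(sample))
```

Note that a trip with no offsets key at all (like the 11:45 one) simply comes back with an empty list.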

UPDATE:

Some further digging around led me to the NSW Transport Data Exchange program, which provides some data under very liberal licenses. It allows for derivative work and redistribution as well. However, all the data I can find for buses is static (there is live data for RTA traffic, but not for buses). Still, static data is useful.

First you must sign up, and realize no support is given for the data. Big woop.

The interesting thing is the sydney_buses_gtfs_static.zip file linked to from the download page. This zip file contains a relational database as a set of text files – I haven’t imported them into a database system, I’m only examining them raw.

Say we look up M50 – and we have the raw route tuple as follows:

{"route":"SB_M50_d","tripId":"80203589","start":"12:19","offsets":"12:19,0,12:21,1,12:22,4,12:27,3,12:28,5,12:37,2,12:38,4,12:39,5,12:46,4,12:47,5,12:49,4,12:49,5,12:50,4,12:53,5,12:54,4,12:56,5,12:57,4,13:00,5,13:01,4,13:02,5,13:03,6,13:05,7,13:07,6,13:10,5,13:11,6,13:11,5"},

How can we make sense of this? Well, using the zip data – it’s not so hard.

The tripId corresponds to the trip_id column of the trips.txt file, giving the following line (snipped) in the database:

route_id,service_id,trip_id,trip_headsign,shape_id,wheelchair_accessible,direction_id
11954_M50,0,80203587,"Coogee",126735,1,1

Okay – still a bit cryptic – but we can see the route ID is 11954_M50 (we could guess that from the route tuple) – it’s an M50 service to Coogee. The shape ID is where it gets interesting – 126735. Opening the shapes.txt file, we get the following (snipped) lines:

shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence
126735,-33.849419975713296,151.14976911600547,0
126735,-33.848937344812754,151.14945530248184,1
126735,-33.84808208139271,151.15074900860714,2
126735,-33.85122980978339,151.15319935004314,3
126735,-33.85169375553925,151.15347036386797,4
126735,-33.85106041412643,151.15453233966835,5
126735,-33.85188611035863,151.15547646919885,6
126735,-33.85248963361669,151.1560363086877,7
126735,-33.85271822805495,151.15624755262075,8
126735,-33.85474851965012,151.1581493788482,9
126735,-33.85500463855153,151.1583924673651,10
126735,-33.85614791161225,151.159470369635,11
126735,-33.85644974587037,151.15975571853622,12
126735,-33.856779429569784,151.16009451400404,13
126735,-33.860746995930896,151.16432215061255,14
126735,-33.86163554853329,151.16646493607237,15
126735,-33.86177591071739,151.16680782453207,16
126735,-33.86266524551315,151.16839937801967,17
126735,-33.86285904714447,151.1686978958594,18
126735,-33.86383641764064,151.17014708957817,19
126735,-33.864988414992496,151.1718195913621,20
126735,-33.865291825707125,151.17221307391347,21
126735,-33.86590813881675,151.1730322738261,22
126735,-33.866150801439716,151.17419454174265,23
126735,-33.86669642633347,151.17511255598194,24
126735,-33.867006921315394,151.1753761905406,25
126735,-33.86774733835521,151.1760522619114,26
126735,-33.868760964871086,151.17630094635686,27
126735,-33.86892658532153,151.1765244327204,28
126735,-33.8688972437648,151.1769790755246,29
126735,-33.86869790317177,151.17752381157035,30
126735,-33.86789166091743,151.17971372917972,31
126735,-33.86782521413884,151.18070964541812,32
126735,-33.86987025039067,151.18853600869124,33
126735,-33.87081407082218,151.19078621598715,34
126735,-33.87094236703419,151.1915402266015,35
126735,-33.871080307558124,151.19233727733706,36
126735,-33.871744260091425,151.19333945103375,37
126735,-33.87240993256253,151.19384432319902,38
126735,-33.8743976074458,151.19533756093034,39
126735,-33.8746912989866,151.19568813200894,40
126735,-33.87493818045648,151.19653698637165,41
126735,-33.874835225324915,151.19689590782514,42
126735,-33.87390867971972,151.20198559498857,43
126735,-33.873686310052655,151.2021956614123,44
126735,-33.87358270731053,151.20251134226768,45
126735,-33.873011538302514,151.20415571226187,46
126735,-33.87282994827172,151.2046892324689,47
126735,-33.872720984852364,151.20463746198274,48
126735,-33.87273661321408,151.20509117640572,49
126735,-33.87275475833802,151.20571780670036,50
126735,-33.87287300571536,151.2064072059941,51
126735,-33.87288197222752,151.20702321917716,52
126735,-33.87300225617025,151.20785311614583,53
126735,-33.873033230119745,151.20812273263132,54
126735,-33.8731659776734,151.20919020516232,55
126735,-33.874048945663525,151.20913932622446,56
126735,-33.874598213048,151.20908460733392,57
126735,-33.87466156880587,151.2097211190759,58
126735,-33.876858953196056,151.20952386643734,59
126735,-33.87799370359549,151.20942448199003,60
126735,-33.87841676522412,151.20937239810198,61
126735,-33.87946073987518,151.20923166017033,62
126735,-33.879811369607026,151.20915946485465,63
126735,-33.88045854077525,151.20901620265533,64
126735,-33.881159642533945,151.2088609992488,65
126735,-33.8816807648275,151.20873117861277,66
126735,-33.88345193931532,151.20836979685467,67
126735,-33.88406446508646,151.20832455341136,68
126735,-33.8848376825093,151.20816782832026,69
126735,-33.885126623472054,151.20819422204409,70
126735,-33.88634564706295,151.20830928760904,71
126735,-33.88931781207381,151.20806331422276,72
126735,-33.889910892294225,151.20792115604914,73
126735,-33.89024008438452,151.2088549775225,74
126735,-33.89037147641232,151.2092090511525,75
126735,-33.89091490696477,151.21061417076007,76
126735,-33.890989536811624,151.21078561720626,77
126735,-33.891373266723036,151.21175076194208,78
126735,-33.89142728417925,151.21236596926238,79
126735,-33.891559136983645,151.21275248249023,80
126735,-33.891658301021934,151.21399390004973,81
126735,-33.89172814298179,151.214457400152,82
126735,-33.891921821222674,151.2159996168472,83
126735,-33.89205059587514,151.2167970956205,84
126735,-33.892080300342435,151.216980298039,85
126735,-33.89303305873431,151.21863649762616,86
126735,-33.89318026437276,151.21883888569877,87
126735,-33.89497870697896,151.22098578513467,88
126735,-33.89521406622588,151.2216729536826,89
126735,-33.896181762429755,151.22186913125324,90
126735,-33.89949019921858,151.22242764587202,91
126735,-33.899950885821895,151.2224829670075,92
126735,-33.901514019757144,151.2226992458014,93
126735,-33.90161333761096,151.2227079985642,94
126735,-33.90216807130013,151.22303172694328,95
126735,-33.90425705078192,151.22406982712238,96
126735,-33.90724712845317,151.2238131334842,97
126735,-33.90778754312767,151.22376947614657,98
126735,-33.908930539379895,151.2236159785654,99
126735,-33.90958752102151,151.2235266387639,100
126735,-33.91005519372196,151.22344122629212,101
126735,-33.91119756485231,151.22324447015976,102
126735,-33.91156033697124,151.22338835646786,103
126735,-33.91175073767554,151.22346011295477,104
126735,-33.91203190930511,151.22357324794046,105
126735,-33.912857267712184,151.22452952482917,106
126735,-33.91312243564154,151.22478360025522,107
126735,-33.91367997000846,151.22530200753104,108
126735,-33.91483805995074,151.22557002261922,109
126735,-33.91510561129038,151.22786826656687,110
126735,-33.915176183085926,151.22838597005733,111
126735,-33.91590182877029,151.23370325197874,112
126735,-33.916076345198576,151.2345541189053,113
126735,-33.91627792529563,151.23477709834657,114
126735,-33.9164147970155,151.23614792056597,115
126735,-33.91652854990136,151.23653495644885,116
126735,-33.9168453030953,151.23913512517723,117
126735,-33.91719904511058,151.24117211120165,118
126735,-33.917100037000054,151.24118496224327,119
126735,-33.91800531749144,151.2420641110287,120
126735,-33.91808814873818,151.2421813891155,121
126735,-33.9186590130818,151.2423751744125,122
126735,-33.92014222232642,151.2426800235188,123
126735,-33.91999829448961,151.24270461249327,124
126735,-33.92024308263576,151.24279693481552,125
126735,-33.92073876584809,151.2465725909532,126
126735,-33.920706094461515,151.24681122725477,127
126735,-33.921491374381176,151.24749822756127,128
126735,-33.92153789824099,151.24823281576525,129
126735,-33.92185774000834,151.25043289974374,130
126735,-33.92207741678134,151.25193195087056,131
126735,-33.923053861041964,151.25338311108732,132
126735,-33.92397279534464,151.25395929210168,133
126735,-33.92409783794835,151.25514662333845,134
126735,-33.924149195112456,151.25558907817532,135
126735,-33.923841488851295,151.25614702353968,136
126735,-33.923004333833056,151.25625062220442,137
126735,-33.92279760506359,151.25629810324003,138
126735,-33.92114346751532,151.25665631521798,139
126735,-33.920855757940586,151.2567162606148,140
126735,-33.919237371428885,151.25705209349377,141
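The two lookups so far (trip to shape ID, then shape ID to an ordered list of points) are easy enough to script with the csv module. This is only a sketch on small excerpts pasted in from above, with the shape rows deliberately shuffled to show the sort by sequence number. The function names are made up, and with the real zip you would open the extracted files instead of using StringIO.

```python
import csv
import io

# Excerpts copied from the post; the shape rows are deliberately out of order.
TRIPS_TXT = """route_id,service_id,trip_id,trip_headsign,shape_id,wheelchair_accessible,direction_id
11954_M50,0,80203587,"Coogee",126735,1,1
"""
SHAPES_TXT = """shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence
126735,-33.848937344812754,151.14945530248184,1
126735,-33.849419975713296,151.14976911600547,0
126735,-33.84808208139271,151.15074900860714,2
"""


def shape_id_for_trip(trips_file, trip_id):
    """Return the shape_id of the first row matching trip_id, or None."""
    for row in csv.DictReader(trips_file):
        if row["trip_id"] == trip_id:
            return row["shape_id"]
    return None


def shape_points(shapes_file, shape_id):
    """Return [(lat, lon), ...] for one shape, ordered by sequence number."""
    rows = [r for r in csv.DictReader(shapes_file) if r["shape_id"] == shape_id]
    rows.sort(key=lambda r: int(r["shape_pt_sequence"]))
    return [(float(r["shape_pt_lat"]), float(r["shape_pt_lon"])) for r in rows]


shape_id = shape_id_for_trip(io.StringIO(TRIPS_TXT), "80203587")
path = shape_points(io.StringIO(SHAPES_TXT), shape_id)
print(shape_id, path)
```

The resulting list of (lat, lon) pairs is in a form that most plotting or mapping tools should accept with little massaging.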

So we know what path the bus is taking – this is quite cryptic, but it could be handy to plot out. Let’s take a look at stop_times.txt (snipped as well):

trip_id,arrival_time,departure_time,stop_id,stop_sequence
80203587,11:49:00,11:49:00,204759,0
80203587,11:51:00,11:51:00,204731,1
80203587,11:52:00,11:52:00,204714,2
80203587,11:53:00,11:53:00,204716,3
80203587,11:54:00,11:54:00,204717,4
80203587,11:54:15,11:54:15,204760,5
80203587,11:57:00,11:57:00,203911,6
80203587,11:57:15,11:57:15,203912,7
80203587,11:58:00,11:58:00,203913,8
80203587,11:58:15,11:58:15,203914,9
80203587,12:00:00,12:00:00,203916,10
80203587,12:07:00,12:07:00,2000282,11
80203587,12:08:00,12:08:00,2000249,12
80203587,12:09:00,12:09:00,200069,13
80203587,12:12:00,12:12:00,200074,14
80203587,12:13:00,12:13:00,200075,15
80203587,12:14:00,12:14:00,201059,16
80203587,12:16:00,12:16:00,201080,17
80203587,12:17:00,12:17:00,201081,18
80203587,12:19:00,12:19:00,201082,19
80203587,12:19:15,12:19:15,201045,20
80203587,12:20:00,12:20:00,201046,21
80203587,12:21:00,12:21:00,201047,22
80203587,12:22:00,12:22:00,201048,23
80203587,12:23:00,12:23:00,201049,24
80203587,12:24:00,12:24:00,202133,25
80203587,12:26:00,12:26:00,202128,26
80203587,12:27:00,12:27:00,202129,27
80203587,12:28:00,12:28:00,203317,28
80203587,12:28:15,12:28:15,203318,29
80203587,12:29:00,12:29:00,203319,30
80203587,12:30:00,12:30:00,203320,31
80203587,12:30:15,12:30:15,203324,32
80203587,12:31:00,12:31:00,203327,33
80203587,12:32:00,12:32:00,2031163,34
80203587,12:33:00,12:33:00,2031164,35
80203587,12:33:15,12:33:15,2031165,36
80203587,12:34:00,12:34:00,2031166,37
80203587,12:35:00,12:35:00,203119,38
80203587,12:36:00,12:36:00,203122,39
80203587,12:37:00,12:37:00,2031154,40
80203587,12:38:00,12:38:00,203461,41
80203587,12:38:15,12:38:15,203462,42
80203587,12:39:00,12:39:00,203466,43
80203587,12:40:00,12:40:00,203467,44
80203587,12:41:00,12:41:00,203468,45
80203587,12:41:15,12:41:15,203469,46
80203587,12:42:00,12:42:00,203470,47

Okay, so this thing doesn’t correspond with the route tuple data exactly. I’m not sure what we can do about that one – so I guess it’s a bit disappointing there. The request doesn’t start at the initial stop either – I am beginning to suspect the query itself with the “letter” at the end of the route may indicate which stop is being requested as the “start” point.

However, given the “time + offset” values in the route response, I am assuming that, while the bus has this many stops, most of them are not timing stops – and the time + offset values are given for timing stops only. So with the data above, we can match the times to work out the stop number, and use the stop number to work out the co-ordinates, and the name of the stop (e.g. from stops.txt – short excerpt below).

stop_id,stop_code,stop_name,stop_lat,stop_lon
20002,20002,"Circular Quay Stand E",-33.861602783203125,151.2119140625
20652,20652,"Serpentine Rd nr Greenwich Wharf",-33.8417854309082,151.18101501464844
21051,21051,"McCarrs Creek Rd nr Cargo Wharf",-33.645896911621094,151.28147888183594
21373,21373,"Burwood Rd nr Bayview Park",-33.85786437988281,151.12069702148438
200011,200011,"Circular Quay Stand A",-33.86150360107422,151.20944213867188
200012,200012,"Circular Quay Stand C",-33.861549377441406,151.20997619628906
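The time-matching idea could be sketched roughly as follows. Treat it with suspicion: it assumes the times in the offsets string are scheduled times that line up with arrival_time in stop_times.txt to the minute, which is only my guess. Also note that the stop_times.txt excerpt above is for trip_id 80203587, whereas the tuple I started with carries tripId 80203589, so the sample pairs here are purely illustrative. Function names are invented for the example.

```python
import csv
import io

# A few rows copied from the stop_times.txt excerpt above.
STOP_TIMES_TXT = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
80203587,12:19:00,12:19:00,201082,19
80203587,12:19:15,12:19:15,201045,20
80203587,12:20:00,12:20:00,201046,21
80203587,12:21:00,12:21:00,201047,22
"""


def stops_by_minute(stop_times_file, trip_id):
    """Map 'HH:MM' -> list of stop_ids scheduled in that minute for one trip."""
    index = {}
    for row in csv.DictReader(stop_times_file):
        if row["trip_id"] == trip_id:
            # Assumes zero-padded HH:MM:SS, as in the excerpt.
            index.setdefault(row["arrival_time"][:5], []).append(row["stop_id"])
    return index


def match_offsets(pairs, index):
    """Attach candidate stop_ids to each (time, offset) pair; may be empty."""
    return [(t, offset, index.get(t, [])) for t, offset in pairs]


index = stops_by_minute(io.StringIO(STOP_TIMES_TXT), "80203587")
pairs = [("12:19", 0), ("12:21", 1), ("12:22", 4)]  # first pairs of the M50_d tuple
for row in match_offsets(pairs, index):
    print(row)
```

A minute can map to more than one stop (12:19 does here), and some times find no stop at all, so this gives candidates rather than answers. Each candidate stop_id could then be looked up in stops.txt in the same way to get its name and co-ordinates.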

Maybe I should really get on with my work now …

About lui_gough

I'm a bit of a nut for electronics, computing, photography, radio, satellite and other technical hobbies.