
Project: GTFS feed creation

answerquest edited this page Aug 10, 2017 · 17 revisions

Aim of this project:

Create a program that automates generating a static GTFS feed for a public transit system. The input is the regular daily routes of a simple public transit system, as seen in Indian cities like Pune.

Skills needed :

Python, Java, PHP-MySQL, JavaScript or similar programming; reading and writing spreadsheets/CSV; an understanding of static GTFS.

Project Status

The template mentioned in step 2 is prepared, with some starter data to work with. The stops and routes info mentioned in step 3 is yet to be prepared: it will be the output of the stops and routes projects. All programming is still to be done, and even the framework/language to use isn't fixed yet. This project is OPEN to join in / take up.

Team

Lead: Nikhil VJ, nikhil.js (at) gmail.com, +919665831250

Links

Steps

1] GTFS reference (static):

Keep this page open : https://developers.google.com/transit/gtfs/reference/

2] Template

I (Nikhil) have prepared a "template for gtfs conversion" spreadsheet here:

https://docs.google.com/spreadsheets/d/1JL5ClgiB1VFY54hg8KTkQm0T_8xfNUwK1y4RnMWnXcg/edit?usp=sharing

Download a copy and open it up. Check out the worksheets in it.

3] Our internal database:

stops.txt : the stops database, which is also used as the stops.txt file in the GTFS feed.

routes-db : the routes database, which holds each route's name and timing information keyed to its unique id. Timing is given either as a list of timings (pipe-separated) or as first trip, frequency and last trip. (I have kept one route with timings and the other with a frequency.)

sequence-db : the sequence of stops in each route, using the unique ids of both stops and routes. Up and down directions are defined separately.

These tables will be filled with data by the other projects on stops and routes data.
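To illustrate the two timing styles in routes-db, here is a minimal Python sketch that turns one route row into a list of departure times. The column names (`timings`, `first_trip`, `last_trip`, `frequency_min`), the `HH:MM` time format and the frequency being in minutes are all assumptions made for this example; the actual template may use different names and formats.

```python
def parse_hhmm(text):
    """Convert an 'HH:MM' string into minutes after midnight."""
    hours, minutes = text.strip().split(":")
    return int(hours) * 60 + int(minutes)


def departure_minutes(route_row):
    """Return the departure times of a route as minutes after midnight.

    route_row is a dict for one routes-db row. The keys used here are
    hypothetical column names, not taken from the real template.
    """
    timings = route_row.get("timings", "").strip()
    if timings:
        # Style 1: explicit, pipe-separated timings such as "06:00|07:30".
        return [parse_hhmm(t) for t in timings.split("|") if t.strip()]
    # Style 2: first trip, frequency (assumed to be in minutes), last trip.
    first = parse_hhmm(route_row["first_trip"])
    last = parse_hhmm(route_row["last_trip"])
    step = int(route_row["frequency_min"])
    return list(range(first, last + 1, step))
```

Working in minutes after midnight keeps the arithmetic simple, and it also copes with GTFS times that run past 24:00:00 for trips crossing midnight.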

4] GTFS tables/files:

From the above three tables (which would be our internal db), the GTFS feed is created. It consists of a set of CSV files with a .txt extension.

stops.txt is as-is

routes.txt - each route's id and name

trips.txt - if a route has multiple timings instead of a frequency, it is expanded into one trip per timing here; otherwise it gets a single trip. The reverse direction gets its own separate trips.

frequencies.txt - if a route operates on a frequency, it is defined here.

stop_times.txt - where a trip expands into sequence of stops. This is where the main computation takes place.

calendar.txt - static for our purposes.

agency.txt - static for our purposes.

Please study the GTFS reference site to learn more about these files/tables.
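As a rough sketch of how one direction of a route could expand into rows for trips.txt, frequencies.txt and stop_times.txt, consider the Python function below. The output field names follow the GTFS reference, but everything else is an assumption for illustration: the function name, the trip_id pattern, the `DAILY` service_id, and above all the fixed `minutes_per_hop` travel time between consecutive stops, since the internal db described above does not yet say how travel times are stored.

```python
def hhmmss(total_minutes):
    """Format minutes after midnight as a GTFS 'HH:MM:SS' time."""
    return f"{total_minutes // 60:02d}:{total_minutes % 60:02d}:00"


def expand_route(route_id, direction_id, stop_ids, start_minutes,
                 headway_min=None, end_minutes=None, minutes_per_hop=3):
    """Return (trips, frequencies, stop_times) rows for one route direction.

    start_minutes: one entry per trip for a timings-based route, or a
    single entry (the first trip) for a frequency-based route.
    minutes_per_hop: assumed constant travel time between stops.
    """
    trips, frequencies, stop_times = [], [], []
    for number, start in enumerate(start_minutes, 1):
        trip_id = f"{route_id}_{direction_id}_{number}"  # hypothetical pattern
        trips.append({
            "route_id": route_id,
            "service_id": "DAILY",  # assumed single entry in calendar.txt
            "trip_id": trip_id,
            "direction_id": direction_id,
        })
        # One stop_times row per stop, in sequence order.
        for sequence, stop_id in enumerate(stop_ids, 1):
            time = hhmmss(start + (sequence - 1) * minutes_per_hop)
            stop_times.append({
                "trip_id": trip_id,
                "arrival_time": time,
                "departure_time": time,
                "stop_id": stop_id,
                "stop_sequence": sequence,
            })
        # Frequency-based routes also get one frequencies.txt row.
        if headway_min is not None:
            frequencies.append({
                "trip_id": trip_id,
                "start_time": hhmmss(start),
                "end_time": hhmmss(end_minutes),
                "headway_secs": headway_min * 60,
            })
    return trips, frequencies, stop_times
```

For a route with explicit timings, one would call it with the full list of departure times; for a frequency-based route, with only the first trip's time plus `headway_min` and `end_minutes`. The reverse direction would be a second call with the other direction's stop sequence.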

5] Programming

After understanding the above (gulp!), program it so that, given an input with stops.txt, routes-db and sequence-db filled in, the program generates the remaining sheets. They need not be sheets in an Excel file (that was just for my convenience): the output will be one text file in CSV format for each of these sheets.
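Writing each generated table out as a CSV text file could look something like the Python sketch below, which uses only the standard library. The function name and its parameters are made up for this example, and the language choice itself is still open as noted in the project status.

```python
import csv
from pathlib import Path


def write_gtfs_table(out_dir, file_name, fieldnames, rows):
    """Write a list of row dicts to out_dir/file_name as CSV with a header.

    Hypothetical helper: fieldnames fixes the column order, rows is a
    list of dicts keyed by those field names.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / file_name
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

A call such as `write_gtfs_table("gtfs_out", "routes.txt", ["route_id", "route_short_name"], rows)` would then produce one .txt file per table, with the header row taken from `fieldnames`.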

6] Clarification

As initial dummy data I've copied two routes from the PMPML database (also attached), so the ids are from there. Please ignore the -D suffix: in GTFS, a single route has trips defined under it going up or down. For the sake of simplicity I have not defined the return (up) journeys here yet. We'll probably have to add columns in the routes-db sheet to define timings for return trips.
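One possible way to handle the -D suffix in code is to split it off into a GTFS direction_id, as in the small Python sketch below. The function name is hypothetical, and the mapping is an assumption: GTFS only requires direction_id to be 0 or 1, so treating -D (down) as 1 and everything else as 0 is an arbitrary choice made for this example.

```python
def split_route_id(raw_id, down_suffix="-D"):
    """Return (route_id, direction_id) with the down suffix removed.

    Assumption: ids ending in '-D' are 'down' trips (direction_id 1);
    any other id is returned unchanged with direction_id 0.
    """
    if raw_id.endswith(down_suffix):
        return raw_id[: -len(down_suffix)], 1
    return raw_id, 0
```

This way both directions end up under the same route_id in routes.txt, while trips.txt carries the direction.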

7] Bigger Picture:

This task / project ties into a long-term process of improving PMPML through increased transparency and systematization. The global standard data format for public transit is GTFS, which is used by Google Transit and most transit-related apps. It critically needs a stop-centric database and routes info laid out in a systematic way, and from there we need a program that processes this data and generates GTFS.

8] Open for design inputs

If you feel that the internal db can be structured in a better way, please put the better structure forward and let's adapt to it. This has to be done in consultation with the other projects if they've started, since their output is the internal db.

Taking this forward

A follow-on project would be creating a GUI system that operates on the three internal db tables, enabling a user to change the route info comfortably; the program would then take the updated internal db and generate a fresh GTFS feed. Here is an album of some mockups created by Nikhil, and here is a full presentation on it he had made earlier, before the Pune Open Data Portal had released this bus data.