Project: GTFS feed creation
Create a program that automates generating the static GTFS feed of a public transit system. The input is a simple public transit system's regular daily routes, as seen in Indian cities like Pune.
Python, Java, PHP-MySQL, JavaScript or similar programming; reading and outputting spreadsheets/CSV; an understanding of static GTFS
The template mentioned in step 2 is prepared, with some starter data to work with. The stops and routes info mentioned in step 3 is yet to be prepared: it will be the output of the stops and routes projects. All programming is still to be done, and even the framework/language to use isn't fixed yet. This project is OPEN to join in / take up.
- Lead: Nikhil VJ, nikhil.js (at) gmail.com, +919665831250
- Gaurav Sitlani
- Static GTFS reference : https://developers.google.com/transit/gtfs/reference/
- A forum thread on open source tools for producing GTFS : https://groups.google.com/forum/#!topic/opentripplanner-users/_N5zXJUc7gE
- CowboyGTFS : https://github.com/jamesku/cowboyGTFS Attempt at GTFS creation; has done some work on map-based route creation.
- TransitDataFeeder : open source web-based GTFS creation and maintenance tool made in Java; See description here.
- Listing of GTFS editors on GIS StackExchange : https://gis.stackexchange.com/a/124120/44746
- XLSTools by Bob Heitzman: https://sites.google.com/site/rheitzman/ This was probably the first and last tool that took spreadsheet data and converted it to GTFS. Since it was built on Excel macros, it had its limitations (including some cumbersome things for the user) and could not be extended further by others. We hope to change that.
- http://transloc.com/architect-free-gtfs-builder/ 20.9.17: Found this service does pretty much the same things we're trying to achieve. Applied a week ago to have an account made; haven't heard back from them since.
- https://github.com/search?utf8=%E2%9C%93&q=gtfs+builder&type= Search on github for gtfs projects : keep an eye out for similar work already going on, let's merge if possible.
Keep this page open : https://developers.google.com/transit/gtfs/reference/
I (Nikhil) have prepared a "template for gtfs conversion" spreadsheet here:
https://docs.google.com/spreadsheets/d/1JL5ClgiB1VFY54hg8KTkQm0T_8xfNUwK1y4RnMWnXcg/edit?usp=sharing
Download a copy and open it up. Check out the worksheets in it.
stops.txt : the stops database, and also the stops.txt file in the GTFS feed.
routes-db : routes database that has each route's name and timing information keyed to its unique ID. Timing is either in the form of timings (pipe-separated), or first trip, frequency, last trip. (I have kept one route having timings and the other having frequency.)
sequence-db : the sequence of stops in each route. The unique IDs of both routes and stops are used. Up and down directions are defined separately.
These would get data filled in through the other projects on stops and routes data.
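The two timing formats described for routes-db could be normalised along these lines. This is only a minimal sketch: the column names `timings`, `first_trip`, `frequency_mins` and `last_trip` are hypothetical stand-ins, and the real headers in the template spreadsheet may differ.

```python
def parse_hhmm(text):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = text.strip().split(":")
    return int(hours) * 60 + int(minutes)


def read_timing(row):
    """Classify one routes-db record (a dict) by its timing format.

    Returns ("timings", [start minutes, ...]) for a pipe-separated list,
    or ("frequency", (first, headway_mins, last)) otherwise.
    The dict keys used here are assumed names, not the template's own.
    """
    timings = (row.get("timings") or "").strip()
    if timings:
        # Pipe-separated start times, e.g. "06:00|07:30|09:00"
        starts = [parse_hhmm(t) for t in timings.split("|") if t.strip()]
        return "timings", sorted(starts)
    return "frequency", (
        parse_hhmm(row["first_trip"]),
        int(row["frequency_mins"]),
        parse_hhmm(row["last_trip"]),
    )
```

Keeping times as minutes since midnight makes the later arithmetic on stop timings simpler than working with strings.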
From the above three tables (which would be our internal db), the GTFS feed is created. It comprises a bunch of CSV files with a .txt extension.
- stops.txt is as-is
- routes.txt - each route's id and name
- trips.txt - if a route has multiple timings instead of a frequency, then it is multiplied into multiple trips here; else one trip. Oh, and separate trips for the reverse direction.
- frequencies.txt - if a route operates on a frequency, then it's listed here
- stop_times.txt - where a trip expands into sequence of stops. This is where the main computation takes place.
- calendar.txt - static for our purposes.
- agency.txt - static for our purposes.
Please study the GTFS reference site for knowing more about these files/tables.
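The expansion of one route into trips.txt and frequencies.txt rows, as described in the list above, might look roughly like this. It is a sketch under assumptions: the trip_id naming scheme and the `DAILY` service_id are made up for illustration, both directions are assumed to share the same timings (the template does not define return timings yet), and using the last trip's time as `end_time` should be checked against the GTFS reference.

```python
def fmt_time(minutes):
    """Minutes since midnight -> GTFS 'HH:MM:SS'."""
    return f"{minutes // 60:02d}:{minutes % 60:02d}:00"


def expand_route(route_id, kind, value, service_id="DAILY", directions=(0, 1)):
    """Build trips.txt and frequencies.txt rows for one route.

    kind == "frequency": value is (first, headway_mins, last) in minutes.
    kind == "timings":   value is a list of start times in minutes.
    """
    trips, frequencies = [], []
    for direction in directions:
        if kind == "frequency":
            first, headway_mins, last = value
            trip_id = f"{route_id}-{direction}-F"  # hypothetical naming scheme
            trips.append({"route_id": route_id, "service_id": service_id,
                          "trip_id": trip_id, "direction_id": direction})
            frequencies.append({"trip_id": trip_id,
                                "start_time": fmt_time(first),
                                "end_time": fmt_time(last),
                                "headway_secs": headway_mins * 60})
        else:
            # One trip per listed start time, per direction
            for n, _start in enumerate(value, start=1):
                trips.append({"route_id": route_id, "service_id": service_id,
                              "trip_id": f"{route_id}-{direction}-{n}",
                              "direction_id": direction})
    return trips, frequencies
```

For example, a route with two start times yields four trips (two per direction) and no frequencies.txt rows, while a frequency-based route yields one trip and one frequencies.txt row per direction.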
After understanding the above (gulp!), program it so that given an input with stops.txt, routes-db and sequence-db filled in, the program generates the remaining sheets. (They need not be sheets in an Excel workbook; that was just for my convenience. The output will be one text file in CSV format for each of these sheets.)
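Writing each generated table out as a CSV text file could be as simple as the following sketch, which uses only Python's standard `csv` and `pathlib` modules. The function name `write_feed` and the shape of the `tables` argument are assumptions made for illustration.

```python
import csv
from pathlib import Path


def write_feed(out_dir, tables):
    """Write each table as a CSV file with a .txt name inside out_dir.

    `tables` maps a file name such as "routes.txt" to a list of dict rows.
    Empty tables are skipped. Returns the names of the files written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, rows in tables.items():
        if not rows:
            continue
        path = out / name
        with path.open("w", newline="", encoding="utf-8") as handle:
            # Column order follows the keys of the first row
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        written.append(path.name)
    return written
```

Skipping empty tables means a feed with only fixed-timing routes simply has no frequencies.txt, which is an optional file in GTFS.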
Here's a page on understanding the GTFS format, by studying a real GTFS data snippet.
1. Pick one route from the routes-db sheet.
2. Load timings values from the routes-db sheet for that route.
3. From the sequence-db sheet, load stopcode sequences for that route.
4. Create an entry in routes.txt.
5. Based on timings values, calculate the number of trips to provision.
   5.1. If timing is in frequency format, then just one trip per direction.
   5.2. Else as many trips as starting times in either direction.
6. Create trip entries for the chosen route in trips.txt.
7. If it is a frequency-based route, create an entry in frequencies.txt.
8. For each trip, the sequence of stops is to be defined in stop_times.txt, i.e. if 30 stops then 30 rows for that trip, with stopcodes.
9. Timing values are counted up from 00:00 hrs at the starting stop in case it's a frequency-based route, or from the given start times in case it's a fixed-timings route.
10. Estimate the timings of subsequent stops by choosing some methodology. Some options:
   10.1. Assume some time interval between each stop, like 3 mins for example.
   10.2. Calculate distance between stops using lat-long values and assume some average speed of buses.
11. Remember to set timepoint as 0 to indicate that time values are approximate and not exact.
12. Repeat for all trips under the route.
13. Repeat for all routes.
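The stop_times.txt step, with both estimation options from the list above, can be sketched as follows. This is an illustration rather than a settled design: the haversine formula gives straight-line distance, which understates road distance, and the function names, the `stops` lookup shape (stop_id mapped to a lat-long pair) and the default values are all assumptions.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat-long points, in km."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def fmt_time(seconds):
    """Seconds since midnight -> GTFS 'HH:MM:SS'."""
    hours, rest = divmod(int(round(seconds)), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def build_stop_times(trip_id, stop_ids, stops, start_secs=0,
                     speed_kmph=None, gap_secs=180):
    """Build stop_times.txt rows for one trip.

    start_secs is 0 for a frequency-based trip, or the trip's start time.
    If speed_kmph is given, travel time comes from distance / speed;
    otherwise a fixed gap_secs (3 minutes by default) is added per stop.
    """
    rows = []
    clock = start_secs
    for seq, stop_id in enumerate(stop_ids, start=1):
        if seq > 1:
            if speed_kmph:
                prev_lat, prev_lon = stops[stop_ids[seq - 2]]
                lat, lon = stops[stop_id]
                clock += haversine_km(prev_lat, prev_lon, lat, lon) / speed_kmph * 3600
            else:
                clock += gap_secs
        stamp = fmt_time(clock)
        rows.append({"trip_id": trip_id, "arrival_time": stamp,
                     "departure_time": stamp, "stop_id": stop_id,
                     "stop_sequence": seq,
                     "timepoint": 0})  # 0 = times are approximate
    return rows
```

Calling `build_stop_times` once per trip, inside a loop over routes, covers the two "repeat" steps at the end of the list.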
As initial dummy data I've copied two routes from the PMPML database (also attached), so the IDs are from there. Please ignore the -D suffix, since in GTFS one route has trips defined under it going both up and down. For the sake of simplicity I have not defined the return (up) journeys here yet. We'll probably have to add columns in the routes-db sheet to define timings for return trips.
This task / project ties in to a long-term process of improving PMPML through increased transparency and systematization. The global standard data format for public transit is GTFS, which is used by Google Transit and most transit-related apps. It critically needs a stop-centric database and routes info laid out in a systematic way, and from there we need a program that churns through this data and generates GTFS.
If you feel that the internal db can be structured in a better way, please put the better structure forward and let's adapt to that. We will have to do this in consultation with the other projects if they've started, since their output is the internal db.
A follow-on project would be creating a GUI system that operates on the three internal db tables. It would enable a user to change the route info comfortably, after which the program takes the updated internal db and generates a fresh GTFS feed. Here is an album of some mockups created by Nikhil, and here is a full presentation on it he had made earlier, before the Pune Open Data Portal had released this bus data.