Mapping GTFS routes (without shapes)

This tutorial and its repository show how to draw public transport routes from a GTFS dataset on a web map. Here's a demo, and a screenshot:

image

Let's get started!

0. Prerequisites

First, make sure you have the prerequisites listed here.

Next let's pick things up from the previous project: a map of GTFS stops. Either follow that tutorial to build your own copy from scratch, or fork repository for a quick start.

The only thing missing now is some sample data, namely routes, trips, and stop_times. Fill in the gaps by downloading the corresponding SQL files from our starter app, then save them in your repository's ./sample-data/ directory.

That's it! The app is ready to be launched with docker-compose up --build.

You might have a conflict error if you already ran other projects using our starter app. You can resolve this by flushing any previous containers and volumes with docker stop $(docker ps -q), docker system prune -a, and docker volume rm $(docker volume ls -q). Then run docker-compose up --build again.

If everything went well, you'll be able to view a map with the location of GTFS stops in your browser (at http://localhost:4000/) that looks like this:

image

All good? Great, now we can replace the stop markers with route lines.

1. Line display

Let's start simple by drawing a single line on the map.

  1. Replace the stop markers from ./views/index.pug with Leaflet's polyline example:
script.
  // Replace the "markerStyle" and "L.geoJSON" with this:
  var latlngs = [
    [45.51, -122.68],
    [37.77, -122.43],
    [34.04, -118.2]
  ];
  var polyline = L.polyline(latlngs, {color: 'red'}).addTo(map);
  map.fitBounds(polyline.getBounds());
  1. While you're at it, change the page title in ./views/index.pug:
html(lang="en")
  head
    title GTFS Routes Map
  1. Temporarily disable the database query from ./src/index.ts:
app.get('/', async (request: Request, response: Response) => {
  try {
    // Comment out the query here,
    // then send back null routes below:
    response.render('index', { routes: null });
  }
});

Refresh the page and admire your very first line:

image

The line links three points: Portland, San Fransico, and Los Angeles. We're going to draw routes by using stops as the connection points.

2. Routes query

Some GTFS datasets contain an optional shapes table, which is the obvious starting point to draw routes.

Shapes describe the path that a vehicle travels along a route alignment, and are defined in the file shapes.txt. Shapes are associated with Trips, and consist of a sequence of points through which the vehicle passes in order.Source

Unfortunately, our dataset doesn't have this file. So we have two options: either generate it, or circumvent it. Let's go with the latter.

This involves a somewhat complex query to generate the list of stop sequences that make up each route, using a jointure of the following tables:

  • routes containing each route's description (name, id, colour, etc.)
  • stops containing each stop's description (name, id, geolocation, etc.)
  • trips listing routes actually travelled (at a given time)
  • stop_times combining a trip, stop, sequence, and time, which together describe when and where a vehicle stops along a route.

Go ahead and build this query in pgAdmin first, so you can view the output at different stages.

  1. Connect to the database in pgAdmin.

  2. Right click Tables then select Query Tool. This opens a window where you can run SQL queries:

image

  1. Start with a Common Table Expression (CTE) that returns all the information needed from all the tables:
WITH routes_with_stops AS
(
  SELECT DISTINCT
    routes.agency_id,
    routes.route_id,
    routes.route_short_name,
    trips.trip_headsign,
    trips.trip_id,
    stop_times.stop_sequence::INTEGER,
    CONCAT(stops.stop_lon, ',', stops.stop_lat) AS stop_coordinates
  FROM routes
  JOIN trips ON routes.route_id = trips.route_id
  JOIN stop_times ON trips.trip_id = stop_times.trip_id
  JOIN stops ON stop_times.stop_id = stops.stop_id
  ORDER BY
    routes.agency_id,
    routes.route_id,
    trips.trip_headsign,
    trips.trip_id,
    stop_times.stop_sequence::INTEGER
)
  SELECT * FROM routes_with_stops;

The results shows all variants of each route (or just "S24" in the sample illustrated below), per stops, with the latter's sequence and coordinates:

image

Note that the query took over a minute to run. This isn't optimal, but it's to be expected considering the sample database doesn't use indexes or any other form of optimisation. We'll get back to this in another project.

  1. Use a second CTE to aggregate the stops' sequence and coordinates as a single value which describes the path followed by each trip:
WITH routes_with_stops AS
(
  -- ...
),
routes_with_stops_path AS
(
  SELECT
    route_id,
    route_short_name,
    trip_headsign,
    trip_id,
    JSON_AGG(stop_coordinates ORDER BY stop_sequence) AS stops_path,
    ROW_NUMBER() OVER(PARTITION BY route_id) AS occurrence
  FROM routes_with_stops
  GROUP BY
    route_id,
    route_short_name,
    trip_headsign,
    trip_id
  ORDER BY route_id
)
  SELECT * FROM routes_with_stops_path;

This produces just one row per tripId, paired with an ordered list of stop coordinates that composes the route:

image

  1. Apply one last selection to keep only the relevant columns of the first occurrence of many duplicates:
WITH routes_with_stops AS
(
  -- ...
),
routes_with_stops_path AS
(
  -- ...
)
  SELECT
    route_id,
    route_short_name,
    stops_path
  FROM routes_with_stops_path
  WHERE occurrence = 1

The whole thing runs in 2 minutes now, which is far from great. But once more, optimisation is its own project. Let's focus on the mapping for now.

  1. Add the query in ./src/index.ts, using the code previously commented out:
app.get('/', async (request: Request, response: Response) => {
  try {
    const routes = await pool.query(`
      WITH routes_with_stops AS
      (
        SELECT DISTINCT
          routes.agency_id,
          routes.route_id,
          routes.route_short_name,
          trips.trip_headsign,
          trips.trip_id,
          stop_times.stop_sequence::INTEGER,
          CONCAT(stops.stop_lon, ',', stops.stop_lat) AS stop_coordinates
        FROM routes
        JOIN trips ON routes.route_id = trips.route_id
        JOIN stop_times ON trips.trip_id = stop_times.trip_id
        JOIN stops ON stop_times.stop_id = stops.stop_id
        ORDER BY
          routes.agency_id,
          routes.route_id,
          trips.trip_headsign,
          trips.trip_id,
          stop_times.stop_sequence::INTEGER
      ),
      routes_with_stops_path AS
      (
        SELECT
          route_id,
          route_short_name,
          trip_headsign,
          trip_id,
          JSON_AGG(stop_coordinates ORDER BY stop_sequence) AS stops_path,
          ROW_NUMBER() OVER(PARTITION BY route_id) AS occurrence
        FROM routes_with_stops
        GROUP BY
          route_id,
          route_short_name,
          trip_headsign,
          trip_id
        ORDER BY route_id
      )
        SELECT
          route_id,
          route_short_name,
          stops_path
        FROM routes_with_stops_path
        WHERE occurrence = 1
    ;`);

    // ...

    response.render('index', { routes: null });
  }
});

3. Routes display

Now that we have the data, let's convert it to geoJson format for a nice display on the map.

  1. Edit ./src/index.ts to convert the query's response into an array of geoJson objects, then send it to the frontend:
app.get('/', async (request: Request, response: Response) => {
  try {
    const routes = await pool.query(`
      -- ...
    ;`);

    const geoJson = routes.rows.map((route) => ({
      type: 'Feature',
      properties: {
        id: route.route_id,
        name: route.route_short_name,
        // This generates a random colour for each route to improve the map's readability:
        color: `#${Math.floor(Math.random() * 16777215).toString(16)}`
      },
      geometry: {
        type: 'LineString',
        coordinates: route.stops_path.map((point: string) => {
          const [lon, lat] = point.split(',');
          return [parseFloat(lon), parseFloat(lat)];
        })
      }
    }));

    response.render('index', { routes: JSON.stringify(geoJson) });
  }
});
  1. Replace the sample code in ./views/index.pug with the list of routes obtained from the backend:
script.
  // ...

  L.geoJSON(!{routes}, {
    style: function(route) {
      return {
        color: route.properties.color
      };
    },
    onEachFeature: function (route, layer) {
      const {id, name} = route.properties;
      return layer.bindPopup(`<h1>${id}</h1><p>${name}</p>`);
    }
  }).addTo(map);

Refresh the browser one last time. Wait for an eternity (remember that the query alone takes over 2 minutes to run)... Finally, here's the result:

image

That's a lot of routes! If you zoom out, you'll see that the straight lines on the left are international routes:

image

The fact that they look as straight as the crow flies is the result of drawing lines by connecting stops directly, instead of following roads and rail lines (we need shapes for that). Zooming in shows this clearly:

image

Still, not too bad for a first shot at drawing routes!

What's next?

There's definitely room for improvement in this project, such as:

  • optimising the database to enable much faster query results

  • adding filters to the map to view only a subset of routes, like the international ones, for smaller map loading times

  • generating and displaying the contents of a shape.txt file for more realistic routes.