
Usage

Import the package using

import gtfs_segments

Get GTFS Files

Fetch all sources

from gtfs_segments import fetch_gtfs_source
sources_df = fetch_gtfs_source()
sources_df.head()

Fetch source by name/provider/state

from gtfs_segments import fetch_gtfs_source
sources_df = fetch_gtfs_source(place='Chicago')
sources_df

Automated Download

from gtfs_segments import download_latest_data
download_latest_data(sources_df, "output_folder")

Manual Download

Download the GTFS .zip files from TransitFeeds or Mobility Data.

Get GTFS Segments

from gtfs_segments import get_gtfs_segments
segments_df = get_gtfs_segments("path_to_gtfs_zip_file")

Alternatively, filter to a specific agency by passing agency_id as a string, or to several agencies by passing a list such as ["SFMTA"].

segments_df = get_gtfs_segments("path_to_gtfs_zip_file", agency_id="SFMTA")
segments_df

The table is generated by gtfs-segments using data from San Francisco's Muni system. Each row contains the following columns:

  1. segment_id: The segment's identifier, produced by gtfs-segments.
  2. stop_id1: the identifier of the segment's beginning stop. The identifier is the same one the agency has chosen in the stops.txt file of its GTFS package.
  3. stop_id2: The identifier of the segment's ending stop.
  4. route_id: The same route ID listed in the agency's routes.txt file.
  5. direction_id: The route's direction identifier.
  6. traversals: The number of times the indicated route traverses the segment during the "measurement interval." The "measurement interval" chosen is the busiest day in the GTFS schedule: the day which has the most bus services running.
  7. distance: The length of the segment in meters.
  8. geometry: The segment's LINESTRING (a format for encoding geographic paths). All geometries are re-projected to WGS84 (EPSG:4326), i.e. longitude/latitude coordinates, to maintain consistency.
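
Since EPSG:4326 coordinates are longitude/latitude in degrees, a segment's length in meters cannot be read off the geometry with plain Euclidean arithmetic. The following is a minimal sketch of one common approximation (the haversine formula on a spherical Earth) for checking a LINESTRING's length against the distance column. The helper names and the coordinates are hypothetical, and this is not necessarily how gtfs-segments computes distance.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def linestring_length_m(coords):
    """Sum the distances between consecutive (lon, lat) vertices."""
    return sum(haversine_m(*p, *q) for p, q in zip(coords, coords[1:]))


# Hypothetical vertices, not taken from any real feed.
coords = [(-122.4194, 37.7749), (-122.4180, 37.7755), (-122.4165, 37.7760)]
length_m = linestring_length_m(coords)
```

With a real segments_df, the vertex list could be obtained from a row's geometry (for a shapely LineString, `list(geom.coords)`), and small differences from the distance column are expected because of the spherical approximation.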

A row does not represent a single segment. Rather, each row maps to a combination of a segment, a route that includes that segment, and a direction. For instance, a segment included in eight routes will appear as eight rows, which will have the same information except for route_id and traversals (since some routes might traverse the segment more often than others). This choice enables filtering by route and preserves how many times each route traverses each segment during the measurement interval. The direction identifier matters only in very rare cases (mostly loops) in which a route visits the same two stops, in the same order, but in different directions.
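
Because of this row layout, per-segment figures need an explicit aggregation step. The sketch below uses a small hand-made DataFrame with the column names listed above (the values are hypothetical) to show how rows might be collapsed to one row per segment, and how a single route can be selected.

```python
import pandas as pd

# Hypothetical rows mimicking the table layout described above.
segments_df = pd.DataFrame({
    "segment_id": ["100-200-1", "100-200-1", "200-300-1"],
    "route_id": ["A", "B", "A"],
    "direction_id": [0, 0, 0],
    "traversals": [40, 25, 40],
    "distance": [310.0, 310.0, 455.0],
})

# One row per physical segment: sum traversals over all routes and directions.
per_segment = segments_df.groupby("segment_id", as_index=False).agg(
    traversals=("traversals", "sum"),
    distance=("distance", "first"),   # identical across rows of one segment
    n_routes=("route_id", "nunique"),
)

# Rows belonging to a single route only.
route_a = segments_df[segments_df["route_id"] == "A"]
```

Here segment 100-200-1 is shared by routes A and B, so it contributes two rows to segments_df but a single row, with the traversals summed, to per_segment.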

Visualize Spacings

from gtfs_segments import view_spacings
view_spacings(segments_df, route='18131', segment='6294-6290-1', basemap=True)

Plot Distributions

from gtfs_segments import plot_hist
plot_hist(segments_df, max_spacing=1200)

Optionally, save the figure using

plot_hist(segments_df, file_path="spacings_hist.png", save_fig=True)

Get Summary Stats

from gtfs_segments import summary_stats
summary_stats(segments_df, max_spacing=3000, export=True, file_path="summary.csv")
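
This section does not say which statistics summary_stats reports or exactly how max_spacing is applied. As a rough illustration only, the sketch below assumes max_spacing acts as an upper cutoff on distance (in meters) and contrasts a plain mean spacing with a traversal-weighted one, where segments used by more trips count more. The data and the inclusive cutoff are assumptions, not the library's documented behavior.

```python
import pandas as pd

# Hypothetical segment rows; only the two columns needed here.
segments_df = pd.DataFrame({
    "distance": [200.0, 400.0, 5000.0],   # meters
    "traversals": [30, 10, 2],
})

max_spacing = 3000  # assumed to be an upper cutoff in meters
kept = segments_df[segments_df["distance"] <= max_spacing]

# Every kept row counts equally.
unweighted_mean = kept["distance"].mean()

# Rows count in proportion to how often they are traversed.
weighted_mean = (kept["distance"] * kept["traversals"]).sum() / kept["traversals"].sum()
```

With these values the 5000 m segment is dropped, the unweighted mean is 300 m, and the traversal-weighted mean is 250 m, because the shorter segment carries three times as many traversals.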

Get Route Summary Stats

from gtfs_segments import get_route_stats, get_bus_feed
_, feed = get_bus_feed('path_to_gtfs.zip')
get_route_stats(feed)

Here each row contains the following columns:

  1. route: The route_id for the route of interest
  2. direction: The direction_id of the route
  3. route_length: The total length of the route. Units: kilometers (km)
  4. total time: The total scheduled time to travel the whole route. Units: hours (hr)
  5. headway: The average headway between consecutive buses for the route. A NaN indicates only 1 trip. Units: hours (hr)
  6. peak_buses: The 15-minute interval where the route has the maximum number of buses concurrently running.
  7. average_speed: The average speed of the bus along the route. Units: km/h
  8. n_bus_avg: The average number of buses concurrently running.
  9. bus_spacing: The average spacing (in distance) between consecutive buses. Units: kilometers (km)
  10. stop_spacing: The average distance between two consecutive stops. Units: kilometers (km)
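
To make the headway column and its NaN case concrete, here is a minimal sketch that averages the gaps between consecutive departures and returns NaN when a route has fewer than two trips. The function name and the departure times are hypothetical, and get_route_stats may derive headway differently.

```python
import math


def mean_headway_hours(departures_hr):
    """Mean gap between consecutive departures, in hours.

    Returns NaN when there are fewer than two trips, since no gap exists.
    """
    times = sorted(departures_hr)
    if len(times) < 2:
        return math.nan
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical departure times, in hours after midnight.
print(mean_headway_hours([6.0, 6.5, 7.5]))  # 0.75
print(mean_headway_hours([9.0]))            # nan
```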

Download Segments Data

Download the data as either .csv or .geojson

from gtfs_segments import export_segments
export_segments(segments_df, 'filename', output_format='geojson')
# Get csv without geometry
export_segments(segments_df, 'filename', output_format='csv', geometry=False)
