Extracting Fire Station Locations from OpenStreetMap
Want to share your content on python-bloggers? click here.
OpenStreetMap (OSM) is a collaborative project that creates a free, editable map of the world. It’s built and maintained by a global community of contributors who collect and upload geographic data including roads, buildings, parks and more, using GPS devices other public sources. OSM allows anyone to use its data freely under the Open Database License (ODbL), which makes it popular for research and geospatial analysis.
The geographic data that constitutes OpenStreetMap is a valuable resource for enriching modeling datasets, which can help insurers better understand the context of insured assets. For example, OSM includes the locations of buildings, roads, fire stations, hospitals and other critical infrastructure, which can be used to model proximity to emergency services or accessibility in case of disaster. It also includes land use classifications, natural features like rivers or forests, and tags for amenities or hazard-prone areas, all of which can serve to inform risk assessment.
OpenStreetMap extracts are updated frequently, and are available at the Geofabrik North America download site. State extracts are available for download as shapefiles (.shp.zip) or as protocol buffer files (.pbf). .pbf files are binary files that store OSM data in a compressed format, and are used to efficiently distribute and process large amounts of geographic data. .pbf files contain the same OSM data as the original XML format, but the files are much smaller and allow for faster I/O.
OSM data is structured into nodes (points), ways (lines and shapes), and relations (groupings of elements). The elements are tagged with key-value pairs to describe their features. Amenities are a core category of features that describe useful facilities and services for the public. They are tagged with the key amenity=*
and cover a wide range of entities, including schools, churches, hospitals, banks, fire stations and many more. Refer to the OSM amenity page for the full list of amenity values.
My initial attempt to extract data from OpenStreetMap relied on Pyrosm. While convenient, it proved inefficient in terms of memory usage. I later switched to Osmium, which delivered significantly better performance with much lower memory overhead. My understanding is that Osmium is optimized for streaming and filtering large files quickly, selecting specific tags, bounding boxes or time slices without loading the full dataset into memory. This is exactly what I was looking to do, so the accompanying code sticks with Osmium.
A quick word of caution: It is important to note that OSM does not have 100% coverage of U.S. amenities. Coverage depends on community contributions and varies a lot by region and over time. You should treat OSM data as a strong starting layer, but expect gaps, duplicates (points + polygons) and tagging inconsistencies when you need comprehensive U.S. coverage.
In the remainder of the article, our focus will be on extracting fire station locations from the most recent State of Iowa .pbf extract. The provided code can easily be adapted to retrieve any amenity of interest.
The FireStationHandler
Class
In the next cell the FireStationHandler
class definition is provided, which is a custom osmium.SimpleHandler
subclass used to extract information about fire stations from OSM data. Before describing FireStationHandler
’s methods, a little clarification on OSM vernacular, specifically nodes, ways and relations:
A node is a single point defined by a longitude and latitude. Nodes are used to represent standalone features such as trees, fire hydrants, or POIs like restaurants. They also serve as building blocks for more complex features.
A way is an ordered list of nodes that forms a polyline or polygon. Ways can represent linear features like roads, rivers, and railways or area features like building footprints, parks, or lakes when the way forms a closed loop. Each way refers to multiple nodes, and the shape and geometry of the way are defined by the sequence of those nodes.
A relation is a higher-level structure that groups together multiple nodes, ways, or even other relations into a single logical feature. Relations are used when a feature cannot be represented by a single way. For example, a multipolygon building with holes or a bus route made up of many road segments. Each member in a relation can have a role (like “outer” or “inner” in a multipolygon), and the relation itself carries tags that describe the whole feature.
The next cell defines the FireStationHandler
class with node
and way
methods (I chose to skip the relation
method since it wasn’t required for this exercise):
import folium import geopandas as gpd import numpy as np import osmium import pandas as pd from shapely.geometry import Point, LineString, Polygon np.set_printoptions(suppress=True, precision=8, linewidth=1000) pd.options.mode.chained_assignment = None pd.set_option("display.max_columns", None) pd.set_option("display.width", None) class FireStationHandler(osmium.SimpleHandler): """ Collects OSM elements with amenity=fire_station. - Nodes: Point - Ways: Polygon if closed (>=4 nodes and first==last), else LineString """ def __init__(self): super().__init__() self.features = [] def node(self, n): if n.tags.get("amenity") == "fire_station" and n.location and n.location.valid(): self.features.append({ "osm_id": n.id, "osm_type": "node", "name": n.tags.get("name"), "lon": float(n.location.lon), "lat": float(n.location.lat), "tags": {t.k: t.v for t in n.tags}, }) def way(self, w): if w.tags.get("amenity") != "fire_station": return coords = [] for nd in w.nodes: if not nd.location or not nd.location.valid(): coords = [] break coords.append((float(nd.location.lon), float(nd.location.lat))) if not coords: return lon = sum(c[0] for c in coords) / len(coords) lat = sum(c[1] for c in coords) / len(coords) self.features.append({ "osm_id": w.id, "osm_type": "way", "name": w.tags.get("name"), "lon": lon, "lat": lat, "tags": {t.k: t.v for t in w.tags}, })
Within FireStationHandler
, the node
method is automatically called for each node in the extract. It checks whether the node has a tag “amenity” having value “fire_station”. If so, it collects relevant data from the node including its ID, type (marked as “node”), longitude, latitude, name and any additional tags. All tags associated with the node are contained within a dictionary. The information is then appended to the self.features
list.
Within the way
method, it also checks whether the node has a tag “amenity” having value “fire_station”. The handler attempts to compute a representative point by averaging the longitudes and latitudes of its nodes (remember that ways are lists of nodes). This is a simple way to reduce a building footprint to a single coordinate pair.
A FireStationHandler
instance is created, and the inherited apply_file
method is called. Note that locations=True
is included: This tells Osmium to build an in-memory index of node locations. When the handler processes a way, it uses this index to look up and attach the coordinate data to each node within that way. Without locations=True
, the nd.location
attribute in the way
method would be invalid, and you would not be able to get the coordinates associated with that feature.
# Path to the OSM PBF file. pbf_path = "iowa-latest.osm.pbf" # Apply the handler to the PBF file. handler = FireStationHandler() handler.apply_file(pbf_path, locations=True) # Create DataFrame from the extracted data. df = pd.DataFrame(handler.features) # Convert DataFrame to a GeoDataFrame. df = gpd.GeoDataFrame( df, geometry=gpd.points_from_xy(df.lon, df.lat), crs="EPSG:4326" ) print(f"\nExtracted {len(df)} fire stations from {pbf_path}.") df.head()
Extracted 626 fire stations from iowa-latest.osm.pbf.
osm_id | osm_type | name | lon | lat | tags | geometry | |
---|---|---|---|---|---|---|---|
0 | 354356402 | node | Seymour Fire Station | -93.121311 | 40.681718 | {‘amenity’: ‘fire_station’, ‘ele’: ‘323’, ‘gni… | POINT (-93.12131 40.68172) |
1 | 367080032 | node | Adel Fire Volunteer Department | -94.020270 | 41.616938 | {‘addr:state’: ‘IA’, ‘amenity’: ‘fire_station’… | POINT (-94.02027 41.61694) |
2 | 367081259 | node | Woden Fire Department | -93.910333 | 43.231583 | {‘addr:state’: ‘IA’, ‘amenity’: ‘fire_station’… | POINT (-93.91033 43.23158) |
3 | 367081264 | node | Crystal Lake Fire Department | -93.792224 | 43.223448 | {‘addr:state’: ‘IA’, ‘amenity’: ‘fire_station’… | POINT (-93.79222 43.22345) |
4 | 367081268 | node | Albert City Fire Department | -94.948768 | 42.781856 | {‘addr:state’: ‘IA’, ‘amenity’: ‘fire_station’… | POINT (-94.94877 42.78186) |
An independent source puts the total number of fire stations in Iowa at 872, so our initial attempt captured ~70% of known fire stations.
The current FireStationHandler
only extracts features where the tag amenity=“fire_station” is present on a node or way. However, in OSM, fire stations may be identified using different tags, for example office=“fire_service” or building=“fire_station”, sometimes without the amenity tag. In the next cell, we define is_fire_station
, which aims to broaden the search beyond just the amenity tag. FireStationHandlerV2
updates the node
and way
methods by calling is_fire_station
as the initial filtering step:
def is_fire_station(tags): """ Determines if the given OSM tags indicate a fire station. """ g = lambda k: (tags.get(k) or "").lower() return ( g("amenity") == "fire_station" or g("office") == "fire_service" or g("emergency") in {"fire_service", "rescue_station"} or g("building") == "fire_station" ) class FireStationHandlerV2(osmium.SimpleHandler): """ Collects OSM elements that are fire stations based on multiple tags. """ def __init__(self): super().__init__() self.features = [] def node(self, n): """ Processes OSM nodes to identify fire stations. """ if is_fire_station(n.tags) and n.location and n.location.valid(): self.features.append({ "osm_id": n.id, "osm_type": "node", "name": n.tags.get("name"), "lon": float(n.location.lon), "lat": float(n.location.lat), "tags": {t.k: t.v for t in n.tags}, }) def way(self, w): """ Processes OSM ways to identify fire stations. """ if not is_fire_station(w.tags): return coords = [] for nd in w.nodes: if not nd.location or not nd.location.valid(): coords = [] break coords.append((float(nd.location.lon), float(nd.location.lat))) if not coords: return lon = sum(c[0] for c in coords) / len(coords) lat = sum(c[1] for c in coords) / len(coords) self.features.append({ "osm_id": w.id, "osm_type": "way", "name": w.tags.get("name"), "lon": lon, "lat": lat, "tags": {t.k: t.v for t in w.tags}, })
# Path to the OSM PBF file. pbf_path = "iowa-latest.osm.pbf" # Apply the handler to the PBF file. handler = FireStationHandlerV2() handler.apply_file(pbf_path, locations=True) # Create DataFrame from the extracted data. df = pd.DataFrame(handler.features) # Convert DataFrame to a GeoDataFrame. df = gpd.GeoDataFrame( df, geometry=gpd.points_from_xy(df.lon, df.lat), crs="EPSG:4326" ) print(f"\nExtracted {len(df)} fire stations from {pbf_path}.") df.head()
Extracted 641 fire stations from iowa-latest.osm.pbf.
osm_id | osm_type | name | lon | lat | tags | geometry | |
---|---|---|---|---|---|---|---|
0 | 354356402 | node | Seymour Fire Station | -93.121311 | 40.681718 | {‘amenity’: ‘fire_station’, ‘ele’: ‘323’, ‘gni… | POINT (-93.12131 40.68172) |
1 | 367080032 | node | Adel Fire Volunteer Department | -94.020270 | 41.616938 | {‘addr:state’: ‘IA’, ‘amenity’: ‘fire_station’… | POINT (-94.02027 41.61694) |
2 | 367081259 | node | Woden Fire Department | -93.910333 | 43.231583 | {‘addr:state’: ‘IA’, ‘amenity’: ‘fire_station’… | POINT (-93.91033 43.23158) |
3 | 367081264 | node | Crystal Lake Fire Department | -93.792224 | 43.223448 | {‘addr:state’: ‘IA’, ‘amenity’: ‘fire_station’… | POINT (-93.79222 43.22345) |
4 | 367081268 | node | Albert City Fire Department | -94.948768 | 42.781856 | {‘addr:state’: ‘IA’, ‘amenity’: ‘fire_station’… | POINT (-94.94877 42.78186) |
By broadening the search beyond the amenity tag, we captured 15 additional fire stations, for ~72% coverage.
As noted in the introduction, OpenStreetMap does not provide full coverage of U.S. fire stations. Because it’s community-contributed, completeness varies by region and over time. For context, the U.S. Fire Administration reports roughly 27,000 registered fire departments, while OSM lists fewer than 20,000 fire stations, evidence of uneven coverage. Still, OSM is an excellent starting layer for geospatial analysis. For example, when estimating the distance from a property to the nearest fire station, OSM-based distances are an upper bound: adding missing stations can only shorten the distance or leave it unchanged.
Using osmium-tool
Osmium-tool is a fast command-line utility used for processing OpenStreetMap data. We can use it to pre-filter pdf extracts, allowing us to work with much smaller files containing only the features we are interested in. To start, install the Osmium command line tool:
$ sudo apt update $ sudo apt install osmium-tool
Then pass the path to the pbf file, along with the tags of interest:
$ osmium tags-filter iowa-latest.osm.pbf \ nwr/amenity=fire_station,building=fire_station,emergency=fire_service \ -o iowa-firestations-filtered.osm.pbf \ --overwrite
nwr
represents nodes, ways and relations. Running the above command will create a new, smaller .pbf file iowa-firestations-filtered.osm.pbf, consisting of those elements meeting the specified filter criteria (amenity=fire_station,building=fire_station,emergency=fire_service
). We can load the extracted locations using FireStationHandlerv2
as before, but this time it should run much quicker (this approach returns only the original 626 fire stations. It would require additional research to figure out how to identify the full 641 using osmium-tool):
# Path to the OSM PBF file. pbf_path = "iowa-firestations-filtered.osm.pbf" # Apply the handler to the PBF file. handler = FireStationHandlerV2() handler.apply_file(pbf_path, locations=True) # Create DataFrame from the extracted data. df = pd.DataFrame(handler.features) # Convert DataFrame to a GeoDataFrame. df = gpd.GeoDataFrame( df, geometry=gpd.points_from_xy(df.lon, df.lat), crs="EPSG:4326" ) print(f"\nExtracted {len(df)} fire stations from {pbf_path}.") df.head()
Extracted 626 fire stations from iowa-firestations-filtered.osm.pbf.
osm_id | osm_type | name | lon | lat | tags | geometry | |
---|---|---|---|---|---|---|---|
0 | 354356402 | node | Seymour Fire Station | -93.121311 | 40.681718 | {‘amenity’: ‘fire_station’, ‘ele’: ‘323’, ‘gni… | POINT (-93.12131 40.68172) |
1 | 367080032 | node | Adel Fire Volunteer Department | -94.020270 | 41.616938 | {‘addr:state’: ‘IA’, ‘amenity’: ‘fire_station’… | POINT (-94.02027 41.61694) |
2 | 367081259 | node | Woden Fire Department | -93.910333 | 43.231583 | {‘addr:state’: ‘IA’, ‘amenity’: ‘fire_station’… | POINT (-93.91033 43.23158) |
3 | 367081264 | node | Crystal Lake Fire Department | -93.792224 | 43.223448 | {‘addr:state’: ‘IA’, ‘amenity’: ‘fire_station’… | POINT (-93.79222 43.22345) |
4 | 367081268 | node | Albert City Fire Department | -94.948768 | 42.781856 | {‘addr:state’: ‘IA’, ‘amenity’: ‘fire_station’… | POINT (-94.94877 42.78186) |
Once the desired amenities have been extracted, we can visualize the distribution across the state using folium:
# Unpack the coordinates and names. lats, lons, names = df.lat.tolist(), df.lon.tolist(), df.name.tolist() # Center map on central Iowa. mid_lon, mid_lat = -93.0977, 41.8780 m = folium.Map( location=[mid_lat, mid_lon], zoom_start=8 ) for name, lon, lat in zip(names, lons, lats): folium.CircleMarker( location=[lat, lon], radius=3, color="red", fill_color="red", popup=name, fill=True, fill_opacity=1 ).add_to(m) m