Determining Distance to Coastline for Policy Locations Using GeoPandas
Want to share your content on python-bloggers? click here.
Knowing the distance to coastline for an exposure is crucial for insurance rating applications because it helps insurers assess the risk of hazards like hurricanes, storm surges, and flooding, which are much more prevalent in coastal areas. This information allows insurers to make informed decisions about pricing, underwriting and reinsurance. Properties closer to the coast are generally at higher risk, leading to higher premiums for these properties. Insurance rating plans may use distance to coastline directly as an explanatory variable, with factors inversely proportional to distance to coastline.
This article walks through how GeoPandas can be used to calculate distance to coastline for a collection of simulated latitude-longitude pairs in the Florida region, and how these exposure locations can be assigned to different risk levels based on the distance calculation.
Coastal Shapefiles
The United States Census Bureau provides shapefiles for state, county and ZCTA boundaries as well as roads, rails an coastlines (see full list here). Shapefiles are a widely-used geospatial vector data format that store the geometric location and attribute information of geographic features, which can be represented as points, lines, or polygons.
We being by downloading the COASTLINE zip archive available on the Census Bureau’s FTP site. The COASTLINE shapefile is loaded into GeoPandas (the STATE shapefile is also loaded for later use). We limit our analysis to the continental United States and filter out the Great Lakes. Inspecting the first few records:
import numpy as np import pandas as pd import geopandas as gpd np.set_printoptions(suppress=True, precision=5) pd.options.mode.chained_assignment = None pd.set_option('display.max_columns', None) pd.set_option('display.width', None) coastline_shp = "tl_2024_us_coastline.zip" us_shp = "tl_2024_us_state.zip" # Bounding box of lower 48 states. Remove Great Lakes. xmin, ymin, xmax, ymax = -125, 24.6, -65, 50 coast = gpd.read_file(coastline_shp) coast = coast.cx[xmin:xmax, ymin:ymax] coast = coast[coast.NAME!="Great Lakes"].reset_index(drop=True) # State boundaries. states = gpd.read_file(us_shp)[["NAME", "geometry"]] states = states.cx[xmin:xmax, ymin:ymax].reset_index(drop=True) print(f"coast.shape : {coast.shape}") print(f"states.shape: {states.shape}") coast.head(10)
coast.shape : (2916, 3) states.shape: (49, 2)
NAME | MTFCC | geometry | |
---|---|---|---|
0 | Atlantic | L4150 | LINESTRING (-80.88368 32.03912, -80.88365 32.0… |
1 | Atlantic | L4150 | LINESTRING (-70.66800 41.51199, -70.65663 41.5… |
2 | Atlantic | L4150 | LINESTRING (-76.58108 38.09572, -76.58184 38.0… |
3 | Atlantic | L4150 | LINESTRING (-73.75518 40.58565, -73.75517 40.5… |
4 | Atlantic | L4150 | LINESTRING (-76.15615 38.63324, -76.15070 38.6… |
5 | Atlantic | L4150 | LINESTRING (-76.53289 39.20776, -76.53298 39.2… |
6 | Atlantic | L4150 | LINESTRING (-73.93653 40.56644, -73.93594 40.5… |
7 | Atlantic | L4150 | LINESTRING (-81.10208 29.42706, -81.10215 29.4… |
8 | Atlantic | L4150 | LINESTRING (-71.89236 41.32922, -71.89293 41.3… |
9 | Atlantic | L4150 | LINESTRING (-75.31239 38.94595, -75.31239 38.9… |
The coastline shapefile is comprised of ~3,000 LINESTRING objects. Let’s get a count of geometries by NAME:
coast["NAME"].value_counts().sort_index()
NAME Atlantic 941 Gulf 647 Pacific 1328 Name: count, dtype: int64
We can visualize the coastline by calling the coast
GeoDataFrame’s plot
method:
import matplotlib.pyplot as plt fig, ax = plt.subplots(1, 1, figsize=(8, 6), tight_layout=True) ax.set_title("Lower 48 Coastline", fontsize=9) coast.plot(ax=ax, edgecolor="red", linewidth=1.0) ax.axis("off") plt.show()
To overlay the coastline along with state boundaries, download the STATE shapefile from the Census Bureau’s FTP site and plot them together:
fig, ax = plt.subplots(1, 1, figsize=(8, 6), tight_layout=True) ax.set_title("Lower 48 States with Coastline", fontsize=9) coast.plot(ax=ax, edgecolor="red", linewidth=1.50, linestyle="--") states.boundary.plot(ax=ax, edgecolor="black", linewidth=0.50) ax.axis("off") plt.show()
Let’s next generate synthetic latitude-longitude pairs from within the Florida bounding envelope. The envelope bounds can be obtained from Florida’s geometry as follows:
# Get bounding box for each state. states["bbox"] = states.geometry.map(lambda gg: gg.envelope.bounds) # Put coordinates in separate columns. states[["lon0", "lat0", "lon1", "lat1"]] = pd.DataFrame(states.bbox.tolist(), index=states.index) states.head()
NAME | geometry | bbox | lon0 | lat0 | lon1 | lat1 | |
---|---|---|---|---|---|---|---|
0 | West Virginia | POLYGON ((-77.75438 39.33346, -77.75422 39.333… | (-82.644591, 37.20154, -77.719519, 40.638801) | -82.644591 | 37.201540 | -77.719519 | 40.638801 |
1 | Florida | MULTIPOLYGON (((-83.10874 24.62949, -83.10711 … | (-87.634896, 24.396308, -79.974306, 31.000968) | -87.634896 | 24.396308 | -79.974306 | 31.000968 |
2 | Illinois | POLYGON ((-87.89243 38.28285, -87.89334 38.282… | (-91.513079, 36.970298, -87.019935, 42.508481) | -91.513079 | 36.970298 | -87.019935 | 42.508481 |
3 | Minnesota | POLYGON ((-95.31991 48.99892, -95.31778 48.998… | (-97.239093, 43.499361, -89.483385, 49.384479) | -97.239093 | 43.499361 | -89.483385 | 49.384479 |
4 | Maryland | POLYGON ((-75.75600 39.24607, -75.75579 39.243… | (-79.487651, 37.886605, -74.986282, 39.723037) | -79.487651 | 37.886605 | -74.986282 | 39.723037 |
Let’s draw the bounding region using folium:
import folium # Florida bounding box. lon0, lat0, lon1, lat1 = states[states.NAME=="Florida"].bbox.item() mlat, mlon = (lat0 + lat1) / 2, (lon0 + lon1) / 2 m = folium.Map( location=[mlat, mlon], zoom_start=6, zoom_control=True, no_touch=True, tiles="OpenStreetMap" ) folium.Rectangle( [(lat0, lon0), (lat1, lon1)], fill_color="blue", fill_opacity=.05 ).add_to(m) m