Python-bloggers

Determining Distance to Coastline for Policy Locations Using GeoPandas

This article was first published on The Pleasure of Finding Things Out: A blog by James Triveri , and kindly contributed to python-bloggers. (You can report issue about the content on this page here)
Want to share your content on python-bloggers? click here.

Knowing the distance to coastline for an exposure is crucial for insurance rating applications because it helps insurers assess the risk of hazards like hurricanes, storm surges, and flooding, which are much more prevalent in coastal areas. This information allows insurers to make informed decisions about pricing, underwriting and reinsurance. Properties closer to the coast are generally at higher risk, leading to higher premiums for these properties. Insurance rating plans may use distance to coastline directly as an explanatory variable, with factors inversely proportional to distance to coastline.

This article walks through how GeoPandas can be used to calculate distance to coastline for a collection of simulated latitude-longitude pairs in the Florida region, and how these exposure locations can be assigned to different risk levels based on the distance calculation.

Coastal Shapefiles

The United States Census Bureau provides shapefiles for state, county and ZCTA boundaries as well as roads, rails an coastlines (see full list here). Shapefiles are a widely-used geospatial vector data format that store the geometric location and attribute information of geographic features, which can be represented as points, lines, or polygons.

We being by downloading the COASTLINE zip archive available on the Census Bureau’s FTP site. The COASTLINE shapefile is loaded into GeoPandas (the STATE shapefile is also loaded for later use). We limit our analysis to the continental United States and filter out the Great Lakes. Inspecting the first few records:

import numpy as np
import pandas as pd
import geopandas as gpd

np.set_printoptions(suppress=True, precision=5)
pd.options.mode.chained_assignment = None
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)


coastline_shp = "tl_2024_us_coastline.zip"
us_shp = "tl_2024_us_state.zip"


# Bounding box of lower 48 states. Remove Great Lakes.
xmin, ymin, xmax, ymax = -125, 24.6, -65, 50
coast = gpd.read_file(coastline_shp)
coast = coast.cx[xmin:xmax, ymin:ymax]
coast = coast[coast.NAME!="Great Lakes"].reset_index(drop=True)

# State boundaries.
states = gpd.read_file(us_shp)[["NAME", "geometry"]]
states = states.cx[xmin:xmax, ymin:ymax].reset_index(drop=True)

print(f"coast.shape : {coast.shape}")
print(f"states.shape: {states.shape}")

coast.head(10)
coast.shape : (2916, 3)
states.shape: (49, 2)
NAME MTFCC geometry
0 Atlantic L4150 LINESTRING (-80.88368 32.03912, -80.88365 32.0…
1 Atlantic L4150 LINESTRING (-70.66800 41.51199, -70.65663 41.5…
2 Atlantic L4150 LINESTRING (-76.58108 38.09572, -76.58184 38.0…
3 Atlantic L4150 LINESTRING (-73.75518 40.58565, -73.75517 40.5…
4 Atlantic L4150 LINESTRING (-76.15615 38.63324, -76.15070 38.6…
5 Atlantic L4150 LINESTRING (-76.53289 39.20776, -76.53298 39.2…
6 Atlantic L4150 LINESTRING (-73.93653 40.56644, -73.93594 40.5…
7 Atlantic L4150 LINESTRING (-81.10208 29.42706, -81.10215 29.4…
8 Atlantic L4150 LINESTRING (-71.89236 41.32922, -71.89293 41.3…
9 Atlantic L4150 LINESTRING (-75.31239 38.94595, -75.31239 38.9…

The coastline shapefile is comprised of ~3,000 LINESTRING objects. Let’s get a count of geometries by NAME:

coast["NAME"].value_counts().sort_index()
NAME
Atlantic     941
Gulf         647
Pacific     1328
Name: count, dtype: int64

We can visualize the coastline by calling the coast GeoDataFrame’s plot method:

import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 1, figsize=(8, 6), tight_layout=True)
ax.set_title("Lower 48 Coastline", fontsize=9)
coast.plot(ax=ax, edgecolor="red", linewidth=1.0)
ax.axis("off")
plt.show()

To overlay the coastline along with state boundaries, download the STATE shapefile from the Census Bureau’s FTP site and plot them together:

fig, ax = plt.subplots(1, 1, figsize=(8, 6), tight_layout=True)
ax.set_title("Lower 48 States with Coastline", fontsize=9)
coast.plot(ax=ax, edgecolor="red", linewidth=1.50, linestyle="--")
states.boundary.plot(ax=ax, edgecolor="black", linewidth=0.50)
ax.axis("off")
plt.show()

Let’s next generate synthetic latitude-longitude pairs from within the Florida bounding envelope. The envelope bounds can be obtained from Florida’s geometry as follows:


# Get bounding box for each state.
states["bbox"] = states.geometry.map(lambda gg: gg.envelope.bounds)

# Put coordinates in separate columns.
states[["lon0", "lat0", "lon1", "lat1"]] = pd.DataFrame(states.bbox.tolist(), index=states.index)

states.head()
NAME geometry bbox lon0 lat0 lon1 lat1
0 West Virginia POLYGON ((-77.75438 39.33346, -77.75422 39.333… (-82.644591, 37.20154, -77.719519, 40.638801) -82.644591 37.201540 -77.719519 40.638801
1 Florida MULTIPOLYGON (((-83.10874 24.62949, -83.10711 … (-87.634896, 24.396308, -79.974306, 31.000968) -87.634896 24.396308 -79.974306 31.000968
2 Illinois POLYGON ((-87.89243 38.28285, -87.89334 38.282… (-91.513079, 36.970298, -87.019935, 42.508481) -91.513079 36.970298 -87.019935 42.508481
3 Minnesota POLYGON ((-95.31991 48.99892, -95.31778 48.998… (-97.239093, 43.499361, -89.483385, 49.384479) -97.239093 43.499361 -89.483385 49.384479
4 Maryland POLYGON ((-75.75600 39.24607, -75.75579 39.243… (-79.487651, 37.886605, -74.986282, 39.723037) -79.487651 37.886605 -74.986282 39.723037

Let’s draw the bounding region using folium:

import folium 

# Florida bounding box. 
lon0, lat0, lon1, lat1 = states[states.NAME=="Florida"].bbox.item()

mlat, mlon = (lat0 + lat1) / 2, (lon0 + lon1) / 2

m = folium.Map(
    location=[mlat, mlon], 
    zoom_start=6, 
    zoom_control=True, 
    no_touch=True,
    tiles="OpenStreetMap"
    )

folium.Rectangle(
    [(lat0, lon0), (lat1, lon1)], 
    fill_color="blue", fill_opacity=.05
    ).add_to(m)

m
Make this Notebook Trusted to load map: File -> Trust Notebook