DataDrought: Bridging Digital Deserts in Climate Crisis

Unveiling North Korea’s Flood Impact Through Satellite Analytics

In a world where climate crises are increasingly prevalent, the urgency to combat climate injustice leads us to the digital deserts of North Korea. This nation, recently hit by severe floods, serves as a stark example of how a lack of data compounds disaster vulnerability. Our initiative, DataDrought, aligns with the UN's Sustainable Development Goals (SDGs) on digital connectivity, local action, and climate mitigation.

DataDrought seeks to close the digital divide with satellite analytics, transforming imagery into actionable insights for community-level resilience in support of the SDGs. This project spotlights North Korea's flooding as a blueprint for digital intervention in climate-vulnerable regions worldwide.

DataDrought is more than just a project—it's a movement for digital empowerment and climate resilience, advocating for solutions that bring hidden climate impacts to light and drive local, informed action against global climate challenges.

Hoeryong

Satellite Imagery of Hoeryong, 2017

The damage is visible to the eye, yet do we possess the data to comprehend the full extent of the village's suffering? No. Many regions in North Korea remain cut off, not only physically but also in the digital realm.

Then, how can we decipher the damage, render it digitally perceptible, and swiftly coordinate humanitarian interventions when they are needed?

The process begins with what is observable – capturing images, converting these to numerical data, and integrating them to form a holistic data environment. Simplified, this is: Capture, Vectorize, and Integrate.

The following is a detailed, step-by-step guide to "Capture, Vectorize, and Integrate."

Capture

Ideally, we would assess the damage with our own eyes. When the damaged site is not accessible, we can turn to another set of eyes: satellites. Google Earth Engine gives easy access to open satellite datasets.

In this step, we use Python and the geemap library (a Python library for working with Google Earth Engine) to analyze Sentinel-2 satellite imagery and perform a basic land cover classification. The area of interest is Hoeryong.

The workflow has five steps:

Step 1. Setting up the environment:

  • We import the necessary libraries: geemap for interactive mapping and ee for Google Earth Engine functionalities.
  • We initialize the map using geemap.Map().

Step 2. Defining the area of interest (aoi):

  • We create a point object representing the coordinates of Hoeryong using ee.Geometry.Point.
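  • We filter the Sentinel-2 image collection to 2014-2015 within the defined area, sort the images by cloud cover percentage, and select the one with the least cloud cover using .first(). Because Sentinel-2A was launched in June 2015, the window in practice returns images from the second half of 2015.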

Step 3. Visualizing the image:

  • We define visualization parameters like minimum and maximum values, and the bands to be used (red, green, blue) for displaying the image in RGB format.
  • We center the map on the chosen image and set the zoom level to 8.
  • We add the image as a layer to the map with a descriptive title: Map.addLayer(image, vis_params, "Sentinel-2 RGB").

Step 4. Defining and sampling training data:

  • We define coordinates for a polygon that acts as a bounding box around Hoeryong.
  • We create a polygon geometry object using these coordinates.
  • We use the sample method to extract data points from the image within the polygon, specifying the scale, number of samples, and including geometry information.
  • We add this layer of training samples to the map for visualization.

Step 5. Clustering and displaying results:

  • We define the number of clusters (9) for our land cover classification.
  • We create a KMeans clusterer object and train it using the training data.
  • We classify the entire image using the trained clusterer and store the results.
  • We add a layer representing the clusters with a random color visualization and display it on the map.

```python
import geemap
import ee

# Step 1: Setting Up the Environment
# Initialize the map
Map = geemap.Map()


# Step 2: Define the Area of Interest
# Define the point of interest, in this case the coordinates of Hoeryong
point = ee.Geometry.Point([129.72635505491021, 42.44534333217912])
# Get the least cloudy image from the Sentinel-2 collection
image = (
    ee.ImageCollection('COPERNICUS/S2')
    .filterBounds(point)
    .filterDate('2014-01-01', '2015-12-31')
    .sort('CLOUDY_PIXEL_PERCENTAGE')
    .first()
)
region = image.geometry()


# Step 3: Visualizing the Image
# Define the visualization parameters (B4, B3, B2 = red, green, blue)
vis_params = {
    'min': 0,
    'max': 3000,
    'bands': ['B4', 'B3', 'B2']
}
# Center the map on the region and set the zoom level
Map.centerObject(region, 8)
# Add the image layer to the map
Map.addLayer(image, vis_params, "Sentinel-2 RGB")


# Step 4: Define and Sample Training Data
# Define the coordinates for the polygon. This is a bounding box that wraps around Hoeryong
coordinates = [
    [129.696351, 42.407487],
    [129.696351, 42.450693],
    [129.797631, 42.450693],
    [129.797631, 42.407487],
    [129.696351, 42.407487]
]
# Create the polygon object
polygon = ee.Geometry.Polygon(coordinates, proj='EPSG:4326', geodesic=False)
# Draw random sample points from the image inside the polygon
training = image.sample(
    region=polygon,
    scale=1,          # sampling scale in meters (Sentinel-2 RGB bands are natively 10 m)
    numPixels=2000,   # approximate number of samples
    seed=0,           # fixed seed for reproducible sampling
    geometries=True,  # keep point geometries so the samples can be mapped
)
Map.addLayer(training, {}, 'Training samples')


# Step 5: Clustering Using Machine Learning
n_clusters = 9
# Train a k-means clusterer on the samples, then apply it to the whole image
clusterer = ee.Clusterer.wekaKMeans(n_clusters).train(training)
result = image.cluster(clusterer)
Map.addLayer(result.randomVisualizer(), {}, 'clusters')
Map
```

The analysis yielded a segmented raster dataset with nine unique pixel classifications, where the red pixels specifically denote areas of human settlement. Analysis of imagery prior to the flooding highlights densely populated regions along the riverbanks, illustrating the areas most at risk from flood damage.

Segmentation using unsupervised classification, pre-flooding

If we focus solely on the red cluster, it becomes apparent that the settlements within the circle have been utterly eradicated.

Unsupervised classification in pre and post-flooding
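The pre- and post-flood comparison of a single cluster can be sketched on toy grids. This is a minimal illustration, not the analysis behind the figure: the grids, the cluster id `SETTLEMENT`, and the helper names are hypothetical. It also assumes the same cluster id marks settlements in both images, which only holds when one trained clusterer is applied to both dates, since k-means assigns ids arbitrarily on each training run.

```python
# Toy class maps standing in for the pre- and post-flood cluster rasters.
# SETTLEMENT is a hypothetical cluster id; the real id has to be read off the map.
SETTLEMENT = 3

pre_flood = [
    [3, 3, 1, 0],
    [3, 3, 1, 0],
    [2, 3, 1, 0],
]
post_flood = [
    [3, 1, 1, 0],
    [1, 1, 1, 0],
    [2, 3, 1, 0],
]


def class_mask(grid, class_id):
    """Boolean mask that is True where the grid equals class_id."""
    return [[cell == class_id for cell in row] for row in grid]


def count_true(mask):
    """Number of True cells in a boolean mask."""
    return sum(cell for row in mask for cell in row)


def lost_cells(pre_mask, post_mask):
    """Cells that belonged to the class before the flood but not after."""
    return [
        [before and not after for before, after in zip(pre_row, post_row)]
        for pre_row, post_row in zip(pre_mask, post_mask)
    ]


pre_mask = class_mask(pre_flood, SETTLEMENT)
post_mask = class_mask(post_flood, SETTLEMENT)
lost = lost_cells(pre_mask, post_mask)

# Share of pre-flood settlement cells that no longer classify as settlement
loss_share = count_true(lost) / count_true(pre_mask)
```

On real rasters the same comparison would run per pixel, and the share of lost cells gives a first numeric estimate of what the maps show visually.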

This disaster was, in a way, foreseeable. We built a Relative Elevation Model (REM) to trace the river's meandering path, and the damaged zone lies where the model shows significant vulnerability to flooding.

River dynamics simulated based on REM
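A relative elevation model expresses each cell's height above the nearby river surface rather than above sea level. The sketch below shows the idea on a toy grid, assuming the simplest possible detrending: subtract the elevation of the nearest river cell. The DEM values, the river cell list, and the function name are hypothetical, and the actual REM in this project may have used a different interpolation.

```python
import math

# Toy digital elevation model in meters (hypothetical values)
dem = [
    [12.0, 11.0, 10.0, 11.5],
    [11.0,  9.5,  9.0, 10.5],
    [10.0,  8.5,  8.0,  9.5],
]
# (row, col) positions of the river channel, flowing down column 2
river_cells = [(0, 2), (1, 2), (2, 2)]


def relative_elevation(dem, river_cells):
    """Height of each cell above its nearest river cell."""
    rem = []
    for r, row in enumerate(dem):
        rem_row = []
        for c, elevation in enumerate(row):
            # Nearest river cell by straight-line distance in grid units
            near_r, near_c = min(
                river_cells,
                key=lambda rc: math.hypot(rc[0] - r, rc[1] - c),
            )
            rem_row.append(elevation - dem[near_r][near_c])
        rem.append(rem_row)
    return rem


rem = relative_elevation(dem, river_cells)
```

Cells with values near zero sit at river level and are the first to be inundated when the channel shifts or overflows, even if their absolute elevation is not the lowest in the scene.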

By overlaying the meandering river's path with pre-flood segmentation, we pinpointed areas where red pixels, representing human settlements, intersect with potential flood zones. The blue pixels against a green backdrop highlight the locations most vulnerable to flooding.

Areas exposed to meandering river
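The overlay itself is a cell-by-cell intersection of two masks. A minimal sketch, assuming a flood-prone zone defined as cells within a fixed height of the river surface: the cutoff `FLOOD_PRONE_MAX_M`, both grids, and the function name are hypothetical and chosen only to illustrate the operation.

```python
# Hypothetical cutoff: cells at most this many meters above the river count as flood-prone
FLOOD_PRONE_MAX_M = 1.0

# Toy relative elevation grid in meters above the nearest river cell
rem = [
    [2.0, 1.0, 0.0, 1.5],
    [2.0, 0.5, 0.0, 1.5],
    [2.0, 0.5, 0.0, 1.5],
]
# Toy settlement mask, for example the settlement cluster from the segmentation
settlement = [
    [True,  True,  False, False],
    [False, True,  False, True],
    [False, False, False, False],
]


def exposed_settlement(rem, settlement, max_height):
    """True where a settlement cell also lies in the flood-prone zone."""
    return [
        [is_settlement and height <= max_height
         for height, is_settlement in zip(rem_row, settlement_row)]
        for rem_row, settlement_row in zip(rem, settlement)
    ]


exposed = exposed_settlement(rem, settlement, FLOOD_PRONE_MAX_M)
exposed_cells = [
    (r, c)
    for r, row in enumerate(exposed)
    for c, flag in enumerate(row)
    if flag
]
```

Here two of the four settlement cells are flagged; the other two sit too high above the river to meet the cutoff.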

Scaling Up

Transitioning from our detailed analysis of Hoeryong, which showcases the "Capture" phase of our methodology, we now broaden our scope to the critical steps of "Vectorize" and "Integrate." Raster data cannot be integrated easily with other readily available vector or tabular data. This is why we first vectorize and then integrate with other forms of data.

We begin with satellite imagery from Google Earth Engine (GEE), which provides Sentinel-1 GRD Data for flood detection and analysis.

Flood extent was calculated by:

  1. Difference Calculation:
    • A difference image is created by dividing the after-flood image by the before-flood image to highlight changes, presumably increases in water coverage.
  2. Thresholding:
    • This difference image is then thresholded (using a variable difference_threshold) to create a binary mask indicating potential flood water.
  3. Refining the Flood Extent:
    • Permanent water bodies are masked out using the JRC Global Surface Water dataset.
    • Pixel connectivity is analyzed to reduce noise.
    • Slope data from a digital elevation model (DEM) is used to mask out areas with a slope greater than 5%, as these are less likely to be flooded.
  4. Area Calculation:
    • The flooded area is calculated by multiplying the flood extent mask by pixel area, and statistics are gathered over the AOI to estimate the total flooded area in hectares.
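The four steps can be traced on a toy grid in plain Python. This is a sketch under stated assumptions, not the Earth Engine script itself: backscatter is assumed to be in dB (negative values, as Sentinel-1 GRD is stored in Earth Engine), so a cell that gets darker after the flood yields a ratio above 1. The value of `DIFFERENCE_THRESHOLD`, the minimum patch size, the 10 m pixel size, and all grid values are hypothetical; only the 5% slope limit comes from the text.

```python
DIFFERENCE_THRESHOLD = 1.25  # hypothetical value for difference_threshold
MAX_SLOPE_PERCENT = 5.0      # slope limit stated in the text
MIN_CONNECTED_CELLS = 2      # hypothetical minimum patch size for the noise filter
PIXEL_AREA_M2 = 10.0 * 10.0  # assumes 10 m pixels

F, T = False, True
before_db = [[-12.0, -11.0, -12.5, -13.0], [-12.0, -11.5, -12.0, -21.0],
             [-11.0, -12.0, -12.5, -16.0], [-12.0, -11.0, -12.0, -12.0]]
after_db = [[-19.0, -18.5, -12.0, -13.0], [-18.0, -12.0, -12.5, -21.5],
            [-11.5, -12.5, -19.5, -21.0], [-12.0, -11.5, -12.0, -12.0]]
permanent_water = [[F, F, F, F], [F, F, F, T], [F, F, F, T], [F, F, F, F]]
slope_percent = [[1.0, 2.0, 3.0, 2.0], [8.0, 2.0, 2.0, 1.0],
                 [2.0, 3.0, 2.0, 1.0], [3.0, 2.0, 2.0, 2.0]]


def ratio_mask(before, after, threshold):
    """Steps 1 and 2: after/before ratio, then threshold to a binary mask."""
    return [[a / b > threshold for b, a in zip(row_b, row_a)]
            for row_b, row_a in zip(before, after)]


def keep_where(mask, condition):
    """Keep mask cells only where the condition grid is also True."""
    return [[m and k for m, k in zip(row_m, row_k)]
            for row_m, row_k in zip(mask, condition)]


def component_sizes(mask):
    """Size of the 8-connected patch each True cell belongs to (0 elsewhere)."""
    rows, cols = len(mask), len(mask[0])
    sizes = [[0] * cols for _ in range(rows)]
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or (r, c) in seen:
                continue
            stack, patch = [(r, c)], []
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                patch.append((y, x))
                for ny in (y - 1, y, y + 1):
                    for nx in (x - 1, x, x + 1):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
            for y, x in patch:
                sizes[y][x] = len(patch)
    return sizes


# Steps 1-2: candidate flood cells
candidate = ratio_mask(before_db, after_db, DIFFERENCE_THRESHOLD)
# Step 3a: drop permanent water
extent = keep_where(candidate, [[not p for p in row] for row in permanent_water])
# Step 3b: drop isolated patches
sizes = component_sizes(extent)
extent = keep_where(extent, [[s >= MIN_CONNECTED_CELLS for s in row] for row in sizes])
# Step 3c: drop steep cells
extent = keep_where(extent, [[s <= MAX_SLOPE_PERCENT for s in row] for row in slope_percent])
# Step 4: flooded area in hectares (1 ha = 10,000 m2)
flooded_cells = [(r, c) for r, row in enumerate(extent) for c, v in enumerate(row) if v]
flooded_area_ha = len(flooded_cells) * PIXEL_AREA_M2 / 10_000
```

Five cells pass the threshold in this toy scene, but only two survive refinement: one is permanent water, one is an isolated patch, and one is too steep. The order of the refinement steps matters, because masking permanent water first can leave a neighboring cell isolated.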

Extracted flood extent calculated from Sentinel-1 GRD Data

And here's the product: the map of flooded areas in North Korea from May 2020 to October 2022.

North Korea's varied topography strongly shapes where flooding occurs. The country consists of low-lying coastal regions along the Yellow Sea and the East Sea, plateaus and plains in the west, and mountainous areas in the north and east.

Snowmelt and heavy rainfall are common in mountainous areas like the Taebaek and Hamgyong Mountains. Flash floods frequently occur in these areas, and the accumulated water flows into valleys downhill, overflowing riverbanks in nearby lower-lying areas. Especially vulnerable to flooding are the valleys that separate mountain ranges.

Likewise, in the western region, broad areas of flat plateaus and plains allow water to spread over greater distances during periods of intense precipitation or snowmelt, leading to extensive flooding. Storm surges, high tides, and intense rainfall during typhoons and tropical storms can cause coastal areas to flood. Low-lying areas such as river floodplains and coastal plains are also susceptible because they sit at low elevation relative to nearby bodies of water. In some areas, human activities like urbanization and deforestation can increase the risk of flooding.

Now, let's come back to our mantra: capture, vectorize, and integrate. Having captured the image, we vectorize it so that it can easily be tied to other social, economic, and demographic data.

Vectorized flood extent image
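Vectorizing means turning cells of a raster mask into polygon features. The sketch below is a deliberately simple stand-in that converts each horizontal run of flooded cells into one rectangle and emits GeoJSON; the grid, origin, cell size, and function name are hypothetical. In Earth Engine the built-in counterpart is `ee.Image.reduceToVectors`, though the text does not say which tool produced the figure above.

```python
def vectorize_mask(mask, origin_x, origin_y, cell_size):
    """Convert runs of True cells in each row into rectangular GeoJSON polygons.

    origin_x, origin_y is the top-left corner of the grid; y decreases downward.
    """
    features = []
    for r, row in enumerate(mask):
        c = 0
        while c < len(row):
            if not row[c]:
                c += 1
                continue
            start = c
            while c < len(row) and row[c]:
                c += 1
            west = origin_x + start * cell_size
            east = origin_x + c * cell_size
            north = origin_y - r * cell_size
            south = origin_y - (r + 1) * cell_size
            # Closed exterior ring, counterclockwise as GeoJSON recommends
            ring = [[west, south], [east, south], [east, north],
                    [west, north], [west, south]]
            features.append({
                "type": "Feature",
                "properties": {"row": r, "cells": c - start},
                "geometry": {"type": "Polygon", "coordinates": [ring]},
            })
    return {"type": "FeatureCollection", "features": features}


# Toy flood mask in abstract grid units (top-left corner at x=0, y=3)
flood_mask = [
    [True,  True,  False, False],
    [False, True,  True,  True],
    [False, False, False, True],
]
flood_vectors = vectorize_mask(flood_mask, origin_x=0.0, origin_y=3.0, cell_size=1.0)
```

A production workflow would merge touching rectangles into single polygons, but even this form can already be joined with other vector layers.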

One of the great advantages of vectorizing raster data is that we can easily analyze it together with other vector data that is already available, e.g., roads, railways, and buildings.
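Once the flood extent is a set of polygons, integration with other vector layers reduces to geometric tests. As a minimal sketch, the code below checks which facilities fall inside a flood polygon using ray casting. The polygon, the facility names and coordinates, and the function names are hypothetical, and facilities are simplified to points, whereas the maps in this article show them as polygons.

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: is (x, y) inside the closed ring of (x, y) vertices?"""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def exposed_facilities(facilities, flood_rings):
    """Names of facilities that lie inside at least one flood polygon."""
    return [
        name
        for name, (x, y) in facilities.items()
        if any(point_in_ring(x, y, ring) for ring in flood_rings)
    ]


# Hypothetical flood polygon (closed ring) in abstract map units
flood_rings = [
    [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)],
]
# Hypothetical facility locations
facilities = {
    "school_a": (1.0, 1.0),
    "school_b": (5.0, 1.0),
    "clinic_c": (3.5, 2.5),
}

exposed = exposed_facilities(facilities, flood_rings)
```

The same join applied to road, rail, school, and hospital layers yields the kind of exposure counts described for the cities below.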

Kowon city, South Hamgyong Province

See Kowon city in South Hamgyong Province. Flooding caused damage to the majority of roads, railways, and almost every settlement. The extensive damage to transportation networks and settlements highlights the severe consequences of the flooding, emphasizing the urgent need for disaster response and recovery efforts.

Samho City, South Hamgyong Province

Samho, another city in South Hamgyong Province, shows that all six schools (pink polygons) are exposed to flooding. Flooding can have devastating consequences for the education system and the future of students in the affected areas. Addressing the damage to educational institutions is not only about repairing physical structures but also about safeguarding the educational future of the affected students.

Hamhung, South Hamgyong Province

Health systems are at risk as well. In North Korea, the lack of hospitals is a structural problem that is made worse by disasters like flooding. The flooding has put additional strain on the country's already limited healthcare infrastructure. The population depends on hospitals to provide healthcare, and these facilities must remain operational, particularly during and after natural disasters.

Concluding Remarks

Our response to disasters is limited by data availability. While we may not have complete data for disaster-struck areas like North Korea, we can observe, capture, vectorize, and extrapolate. Our case study on North Korea's flooding is just the start; this framework can apply to any geographical area facing climate threats, from heatwaves to fires. By evaluating satellite imagery before and after disasters, we prepare for necessary interventions and can inform local efforts to build resilience against future climate hazards.
