
Downloading Climate Data from ERA5

This notebook demonstrates how to use Climate Tools and dhis2eo to retrieve climate data for a set of DHIS2 organisation units. The data comes from the ERA5 dataset, hosted at the ECMWF Climate Data Store (CDS).


Requirements

1. Create an ECMWF User

Before you can download the dataset programmatically, you need to create an ECMWF user.

2. Authenticate with your ECMWF user

Next, you need to authenticate using this user’s credentials:

  • Go to the CDSAPI Setup page and make sure you are logged in.

  • Once logged in, scroll down to the section “Setup the CDS API personal access token”.

    • This section shows your API credentials, which look something like this:

      url: https://cds.climate.copernicus.eu/api
      key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  • Copy those two lines into a file named .cdsapirc in your $HOME directory.
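
If you prefer to create the file from Python rather than by hand, the following is a minimal sketch. The helper name write_cdsapirc and its home parameter are our own for illustration and are not part of dhis2eo or the CDS tooling; the two lines it writes are the ones shown above.

```python
from pathlib import Path
from typing import Optional


def write_cdsapirc(key: str, home: Optional[Path] = None) -> Path:
    """Write a .cdsapirc file containing the CDS API url and personal access token."""
    # Hypothetical helper: defaults to the current user's home directory.
    target = (home or Path.home()) / ".cdsapirc"
    target.write_text(
        "url: https://cds.climate.copernicus.eu/api\n"
        f"key: {key}\n",
        encoding="utf-8",
    )
    # The token is a secret, so restrict the file to the current user
    # (on Windows, chmod only toggles the read-only flag).
    target.chmod(0o600)
    return target


# Example (replace the placeholder with your own token):
# write_cdsapirc("xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")
```

Note that calling the helper overwrites any existing .cdsapirc file in the target directory.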

3. Accept the dataset license

Lastly, ECMWF requires that you manually accept the user license for each dataset that you download.


Step-by-step Workflow

We start by importing the necessary libraries:

import dhis2eo
import dhis2eo.org_units
import dhis2eo.data.cds

Step 1: Connect to DHIS2

The first step is to connect to a DHIS2 instance. In this example we connect to a local instance containing the standard Sierra Leone demo database, but you can replace the instance URL and credentials to work directly with your own database.

from dhis2_client import DHIS2Client
from dhis2_client.settings import ClientSettings

# Create DHIS2 client connection
cfg = ClientSettings(
  base_url="http://localhost:8080",
  username="admin",
  password="district"
)
client = DHIS2Client(settings=cfg)

# Verify connection
info = client.get_system_info()
print("Current DHIS2 version:", info["version"])
Current DHIS2 version: 2.42.3
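
When switching to your own instance, you may not want to hardcode credentials in the notebook. One possible approach, sketched below, is to read them from environment variables. The variable names (DHIS2_BASE_URL, DHIS2_USERNAME, DHIS2_PASSWORD) and the helper function are assumptions made for this example, not something dhis2_client defines; the defaults fall back to the local demo instance used above.

```python
import os


def dhis2_settings_from_env(environ=os.environ):
    """Collect DHIS2 connection settings from environment variables.

    The variable names are hypothetical; the defaults match the local demo instance.
    """
    return {
        "base_url": environ.get("DHIS2_BASE_URL", "http://localhost:8080"),
        "username": environ.get("DHIS2_USERNAME", "admin"),
        "password": environ.get("DHIS2_PASSWORD", "district"),
    }


settings = dhis2_settings_from_env()
# The keys match the keyword arguments of the ClientSettings call shown above:
# cfg = ClientSettings(**settings)
```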

Step 2: Retrieve organisation units

Next, we get the organisation units from our DHIS2 instance and load them into a format we can work with:

org_units_geojson = client.get_org_units_geojson(level=2)
org_units = dhis2eo.org_units.from_dhis2_geojson(org_units_geojson)
org_units

Step 3: Download data for multiple months

The dhis2eo.data.cds module provides a convenience function for downloading daily climate data from the ERA5 post-processed daily statistics on single levels from 1940 to present. This function downloads the most commonly requested ERA5 climate variables: 2m temperature and total precipitation. Since downloads from the Climate Data Store (CDS) can be slow, the function also caches the downloaded results and reuses them if the file has already been downloaded.
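
To illustrate the general idea of caching downloads by file name, here is a minimal sketch. It is not the dhis2eo implementation: the functions cache_path and fetch_with_cache, the file naming scheme, and the download callable are all hypothetical. The sketch hashes the request parameters into the file name, and only triggers a download when no matching file exists yet.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cache_path(year, month, params, cache_dir=None):
    """Build a deterministic file path for one month of data and a set of request parameters."""
    cache_dir = Path(cache_dir or tempfile.gettempdir())
    # Hash the request parameters so that different requests never share a file.
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(encoded).hexdigest()[:6]
    return cache_dir / f"example_params-{digest}_{year}-{month:02d}.nc"


def fetch_with_cache(year, month, params, download, cache_dir=None):
    """Return the cached file if present; otherwise call download(path) to create it."""
    path = cache_path(year, month, params, cache_dir)
    if not path.exists():
        download(path)  # hypothetical callable that writes the file to `path`
    return path
```

With this pattern, rerunning a notebook only pays the download cost for months that have not been fetched before.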

Since the data can only be downloaded one month at a time, we download a range of months by looping through the years and months we are interested in. Inside the loop, we pass the year, the month, and the org_units we want data for. The region to download is calculated automatically from the provided organisation units.
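
To give a feel for how a download region can be derived from organisation units, the sketch below computes a bounding box from a GeoJSON FeatureCollection using plain Python. This is an illustration only and not necessarily how dhis2eo calculates the region; the function names are our own, and geometries without a coordinates member (such as GeometryCollection) are simply skipped.

```python
def iter_positions(coords):
    """Yield (x, y) positions from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        # A single position: [longitude, latitude, ...]
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def bounding_box(feature_collection):
    """Return (min_lon, min_lat, max_lon, max_lat) covering all feature geometries."""
    xs, ys = [], []
    for feature in feature_collection["features"]:
        geometry = feature.get("geometry")
        if not geometry:
            continue  # skip features without a geometry
        for x, y in iter_positions(geometry.get("coordinates", [])):
            xs.append(x)
            ys.append(y)
    if not xs:
        raise ValueError("no coordinates found in the feature collection")
    return min(xs), min(ys), max(xs), max(ys)
```

Calling bounding_box(org_units_geojson) on the GeoJSON retrieved in Step 2 should give the extent of all organisation units, assuming that object is a standard FeatureCollection dictionary.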

For this notebook, let’s download data for three months, July through September 2025:

start_year = 2025
start_month = 7
end_year = 2025
end_month = 9

for year in range(start_year, end_year + 1):
    for month in range(1, 12 + 1):

        # skip months before or after our defined time range
        if (year, month) < (start_year, start_month):
            continue
        if (year, month) > (end_year, end_month):
            continue

        # download the climate data
        # (loaded from the cache if the file was already downloaded)
        print(f'Month: {year}-{month}')
        data = dhis2eo.data.cds.get_daily_era5_data(year, month, org_units)

        # do something with the data
        # e.g. save to disk, aggregate, or import to DHIS2
        # ...
Month: 2025-7
dhis2eo.data.cds - INFO - Loading from cache: /tmp/cds_daily-era5_params-ca5bab_region-37098a_2025-07.nc
Month: 2025-8
dhis2eo.data.cds - INFO - Loading from cache: /tmp/cds_daily-era5_params-ca5bab_region-37098a_2025-08.nc
Month: 2025-9
dhis2eo.data.cds - INFO - Loading from cache: /tmp/cds_daily-era5_params-ca5bab_region-37098a_2025-09.nc
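
As an alternative to the nested loop with skip conditions above, the month range can be generated directly. The helper iter_months below is a small sketch of our own and is not part of dhis2eo; it yields every (year, month) pair from the start month through the end month, inclusive, and handles ranges that cross a year boundary.

```python
def iter_months(start_year, start_month, end_year, end_month):
    """Yield (year, month) pairs from the start month through the end month, inclusive."""
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        yield year, month
        month += 1
        if month > 12:
            # roll over to January of the next year
            year, month = year + 1, 1


for year, month in iter_months(2025, 7, 2025, 9):
    print(f"Month: {year}-{month}")
    # data = dhis2eo.data.cds.get_daily_era5_data(year, month, org_units)
```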

To inspect the contents of the downloaded data, let’s view the data from the last iteration of the loop:

data.to_xarray()

We see that this is data for September 2025, and that it contains the data variables t2m (2m temperature) and tp (total precipitation).
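
ERA5 conventionally reports t2m in kelvin and tp in metres, so a common first processing step is converting to degrees Celsius and millimetres. The sketch below assumes those units; check the units attribute of each variable in the output above before applying it, since we have not confirmed whether dhis2eo converts units for you. The function names are our own.

```python
def kelvin_to_celsius(kelvin):
    """Convert a temperature from kelvin to degrees Celsius."""
    return kelvin - 273.15


def metres_to_millimetres(metres):
    """Convert a precipitation depth from metres to millimetres."""
    return metres * 1000.0


# Plain numbers are used here for illustration.
print(kelvin_to_celsius(300.15))
print(metres_to_millimetres(0.0125))
```

Because the functions only use basic arithmetic, they should also work element-wise on array-like objects such as the t2m and tp variables of an xarray dataset.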

Step 4: Process the data

The loop in the previous step only downloaded the data, but didn’t actually do anything with it. For guidance on how to further process the downloaded data, see:

Next steps

This notebook has shown how to download the ERA5 climate data and where further processing would fit in. Because the data is updated daily, the script should be run regularly to keep feeding DHIS2 with the latest data. This can be done by triggering the script automatically at regular intervals, e.g. via a cron job (guide to be added in the future).
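
When the script runs on a schedule, the start and end months should not be hardcoded. As a minimal sketch, the hypothetical helper recent_months below works out the most recent months relative to the current date; the number of months to refresh (3 here) is an arbitrary choice for the example.

```python
from datetime import date


def recent_months(today, count):
    """Return the last `count` (year, month) pairs up to and including today's month, oldest first."""
    months = []
    year, month = today.year, today.month
    for _ in range(count):
        months.append((year, month))
        month -= 1
        if month < 1:
            # step back to December of the previous year
            year, month = year - 1, 12
    return list(reversed(months))


for year, month in recent_months(date.today(), 3):
    print(f"Would download: {year}-{month:02d}")
    # data = dhis2eo.data.cds.get_daily_era5_data(year, month, org_units)
```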