
Full workflow for importing ERA5-Land into DHIS2

In this example workflow, we demonstrate how to use DHIS2 Climate Tools to keep DHIS2 continuously updated with the latest climate data. Specifically, we show step by step how to download and import the latest available daily precipitation data from the ERA5-Land hourly dataset hosted at the Climate Data Store.

The notebook fetches and imports only data that is not yet present in DHIS2, making it safe to run on a recurring basis and ensuring that DHIS2 stays up to date with the latest available climate data.

By updating the input parameters at the top of the notebook, it’s possible to also run the import for other ERA5-Land climate variables.

If you’re only interested in downloading ERA5-Land data, see this detailed step-by-step guide.

Important: This notebook only aggregates to daily periods according to the Gregorian calendar. Other calendar systems, such as those used in Nepal or Ethiopia, are not yet supported.

Prerequisites

Before proceeding with the notebook, make sure the following are in place:

1. CDS API access

Make sure you have followed these instructions to authenticate and enable API access to the CDS portal.

2. Required DHIS2 data element

Your DHIS2 instance must contain a data element that can receive the imported data.

For daily precipitation, the data element must have:

  • valueType = NUMBER

  • aggregationType = SUM

  • It must belong to a data set with periodType = DAILY

If this data element does not already exist, you will need to create it first (for example manually in the Maintenance app, or by importing metadata).

Once the data element exists, copy its UID and set it as DHIS2_DATA_ELEMENT_ID in the Input Parameters section further down.
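As a sketch of what such a data element might look like, here is a hypothetical metadata payload following the DHIS2 metadata API conventions. The name, short name, and code are placeholders; adapt them to your own instance, and remember the element must also be added to a data set with periodType DAILY:

```python
# Hypothetical metadata for a daily precipitation data element.
# Field values for valueType and aggregationType match the
# requirements listed above; everything else is a placeholder.
data_element = {
    "name": "Precipitation (daily)",      # placeholder name
    "shortName": "Precip daily",          # placeholder short name
    "code": "PRECIP_DAILY",               # placeholder code
    "valueType": "NUMBER",
    "aggregationType": "SUM",
    "domainType": "AGGREGATE",
}

metadata_payload = {"dataElements": [data_element]}
```

Such a payload could be posted to the metadata endpoint of your instance; creating the element through the Maintenance app user interface works equally well.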

Library imports

We start by importing the necessary libraries:

from datetime import date
import json

import geopandas as gpd
import xarray as xr
from earthkit import transforms
from metpy.units import units

from dhis2_client import DHIS2Client
from dhis2_client.settings import ClientSettings

from dhis2eo.data.cds import era5_land
from dhis2eo.integrations.pandas import dataframe_to_dhis2_json

Input parameters

Let’s first define all the input parameters so they are clearly stated at the top of the notebook.

For this example we will connect to a public DHIS2 instance, so it’s important that you create the precipitation data element (as described previously) and update the DHIS2_DATA_ELEMENT_ID below. Note also that the public instance resets every night, so this process will have to be repeated for each new day.

Since we are importing from hourly to daily data, setting the DHIS2_TIMEZONE_OFFSET parameter is needed to aggregate the hourly data to the correct days in the local timezone.
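To illustrate what the offset does, here is a small standalone sketch (not part of the notebook itself) that assigns a UTC timestamp to a local calendar day by shifting it before taking the date:

```python
from datetime import datetime, timedelta

def local_day(ts_utc: datetime, offset_hours: int) -> str:
    """Return the local calendar day a UTC timestamp falls on."""
    return (ts_utc + timedelta(hours=offset_hours)).date().isoformat()

# 23:00 UTC on 1 January is still 1 January in UTC...
print(local_day(datetime(2025, 1, 1, 23), offset_hours=0))  # 2025-01-01
# ...but belongs to 2 January in a UTC+3 timezone
print(local_day(datetime(2025, 1, 1, 23), offset_hours=3))  # 2025-01-02
```

This is the same shifting that `earthkit.transforms` applies internally via the `time_shift` argument used later in the notebook.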

Note that IMPORT_FROM_UNITS and IMPORT_TO_UNITS define the units of the source data and the desired units for import, and should match the IMPORT_VARIABLE being imported. For example, to convert temperature data from Kelvin to degrees Celsius, set them to K and degC respectively. For other unit conversions, see the MetPy documentation on working with units.

# DHIS2 connection
DHIS2_BASE_URL = "https://play.im.dhis2.org/stable-2-42-3-1"
DHIS2_USERNAME = "admin"
DHIS2_PASSWORD = "district"

# DHIS2 import settings
DHIS2_DATA_ELEMENT_ID = '<INSERT-DATA-ELEMENT-ID>'
DHIS2_ORG_UNIT_LEVEL = 2
DHIS2_DRY_RUN = True                            # default to safe dry-run mode; set to False for actual import
DHIS2_TIMEZONE_OFFSET = 0                       # set to UTC timezone offset for your country

# ERA5 import configuration
IMPORT_VARIABLE = "total_precipitation"         # ERA5 variable to download (as named in the CDS catalogue)
IMPORT_VALUE_COL = "tp"                         # variable name in the downloaded xarray dataset
IMPORT_IS_CUMULATIVE = True                     # indicates whether the input data is cumulative over time (e.g. ERA5 precipitation)
IMPORT_FROM_UNITS = "m"                         # units of the original data values
IMPORT_TO_UNITS = "mm"                          # convert to these units before import
IMPORT_START_DATE = "2025-01-01"                # how far back in time to start import
IMPORT_END_DATE = date.today().isoformat()      # automatically tries to import the latest data

# Download settings
DOWNLOAD_FOLDER = "../../guides/data/local"
DOWNLOAD_PREFIX = "era5-hourly-precip"          # prefix for caching downloads; existing files are reused

# Aggregation settings
TEMPORAL_AGGREGATION = "sum"
SPATIAL_AGGREGATION = "mean"

Connect to DHIS2

First, we connect the python-client to the DHIS2 instance we want to import into. You can point this to your own instance, but for the purposes of this example we will use one of the public access DHIS2 instances, since these are continuously reset:

# Client configuration
cfg = ClientSettings(
  base_url=DHIS2_BASE_URL,
  username=DHIS2_USERNAME,
  password=DHIS2_PASSWORD
)

client = DHIS2Client(settings=cfg)
info = client.get_system_info()

# Check if everything is working.
# You should see your current DHIS2 version info.
print("Current DHIS2 version:", info["version"])
Current DHIS2 version: 2.42.3.1

Get the DHIS2 organisation units

In order to download and aggregate the data to our DHIS2 organisation units, we also use the python-client to get the level 2 organisation units from our DHIS2 instance:

# Get org units GeoJSON from DHIS2
org_units_geojson = client.get_org_units_geojson(level=DHIS2_ORG_UNIT_LEVEL)

# Convert GeoJSON to geopandas
org_units = gpd.read_file(json.dumps(org_units_geojson))
org_units
Skipping field groups: unsupported OGR type: 5

Check when the data was last imported

Since we want to run this script on a regular interval, we want to avoid importing data that has already been imported. We therefore first want to check the last date for which data was imported for the data element we want to import into. This can be done using the convenience function analytics_latest_period_for_level() provided by the python-client:

last_imported_response = client.analytics_latest_period_for_level(de_uid=DHIS2_DATA_ELEMENT_ID, level=DHIS2_ORG_UNIT_LEVEL)
last_imported_response
{'meta': {'dataElement': 'PGDgmnWmXT6', 'level': 2, 'periodType': 'DAILY', 'calendar': 'iso8601', 'years_checked': 31}, 'existing': None, 'next': None}

Let’s extract and report the last imported month:

last_imported_period = last_imported_response["existing"]
last_imported_month_string = last_imported_period["id"][:6] if last_imported_period else None

if last_imported_month_string:
    print(f"Last imported period: {last_imported_month_string}")
else:
    print("No existing data found")
No existing data found

We then use this information to decide when the data download should start, ensuring that we only download data from the last imported month onwards:

if last_imported_month_string:
    IMPORT_START_DATE_OVERRIDE = max(last_imported_month_string, IMPORT_START_DATE)
else:
    IMPORT_START_DATE_OVERRIDE = IMPORT_START_DATE

print(f'Import will start at {IMPORT_START_DATE_OVERRIDE}')
Import will start at 2025-01-01
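DHIS2 DAILY period ids have the form YYYYMMDD. If you need the full last imported day (rather than just its month) as an ISO date, a small stdlib helper can convert it; the day-after variant is a natural download start date. This is a sketch, not part of the workflow above:

```python
from datetime import datetime, timedelta

def daily_period_to_iso(period_id: str) -> str:
    """Convert a DHIS2 DAILY period id (YYYYMMDD) to an ISO date string."""
    return datetime.strptime(period_id, "%Y%m%d").date().isoformat()

def day_after(period_id: str) -> str:
    """The day after a DAILY period id, e.g. for resuming a download."""
    d = datetime.strptime(period_id, "%Y%m%d").date() + timedelta(days=1)
    return d.isoformat()

print(daily_period_to_iso("20250114"))  # 2025-01-14
print(day_after("20250114"))            # 2025-01-15
```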

Download the necessary data

In the next step we download all the requested data to the local file system, using convenience functionality from the dhis2eo.data.cds.era5_land module.

Running this step may take some time depending on how many months of data are requested.

Note that after the initial data download, subsequent runs of this notebook will re-use the previously downloaded files to avoid repeated downloads of the same data.

For more details on this step, see our guide for Downloading ERA5-Land data.

print(f'Downloading data for the period: {IMPORT_START_DATE_OVERRIDE} to {IMPORT_END_DATE}...')
files = era5_land.hourly.download(
    start=IMPORT_START_DATE_OVERRIDE, 
    end=IMPORT_END_DATE, 
    bbox=org_units.total_bounds, 
    dirname=DOWNLOAD_FOLDER, 
    prefix=DOWNLOAD_PREFIX, 
    variables=[IMPORT_VARIABLE]
)
files
Downloading data for the period: 2025-01-01 to 2026-01-21...
INFO - 2026-01-21 22:52:17,068 - dhis2eo.data.cds.era5_land.hourly - Month 2025-1
INFO - 2026-01-21 22:52:17,071 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-01.nc
INFO - 2026-01-21 22:52:17,073 - dhis2eo.data.cds.era5_land.hourly - Month 2025-2
INFO - 2026-01-21 22:52:17,076 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-02.nc
INFO - 2026-01-21 22:52:17,078 - dhis2eo.data.cds.era5_land.hourly - Month 2025-3
INFO - 2026-01-21 22:52:17,080 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-03.nc
INFO - 2026-01-21 22:52:17,082 - dhis2eo.data.cds.era5_land.hourly - Month 2025-4
INFO - 2026-01-21 22:52:17,083 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-04.nc
INFO - 2026-01-21 22:52:17,085 - dhis2eo.data.cds.era5_land.hourly - Month 2025-5
INFO - 2026-01-21 22:52:17,088 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-05.nc
INFO - 2026-01-21 22:52:17,091 - dhis2eo.data.cds.era5_land.hourly - Month 2025-6
INFO - 2026-01-21 22:52:17,094 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-06.nc
INFO - 2026-01-21 22:52:17,095 - dhis2eo.data.cds.era5_land.hourly - Month 2025-7
INFO - 2026-01-21 22:52:17,099 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-07.nc
INFO - 2026-01-21 22:52:17,101 - dhis2eo.data.cds.era5_land.hourly - Month 2025-8
INFO - 2026-01-21 22:52:17,103 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-08.nc
INFO - 2026-01-21 22:52:17,105 - dhis2eo.data.cds.era5_land.hourly - Month 2025-9
INFO - 2026-01-21 22:52:17,109 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-09.nc
INFO - 2026-01-21 22:52:17,111 - dhis2eo.data.cds.era5_land.hourly - Month 2025-10
INFO - 2026-01-21 22:52:17,115 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-10.nc
INFO - 2026-01-21 22:52:17,117 - dhis2eo.data.cds.era5_land.hourly - Month 2025-11
INFO - 2026-01-21 22:52:17,120 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-11.nc
INFO - 2026-01-21 22:52:17,121 - dhis2eo.data.cds.era5_land.hourly - Month 2025-12
INFO - 2026-01-21 22:52:17,124 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-12.nc
INFO - 2026-01-21 22:52:17,125 - dhis2eo.data.cds.era5_land.hourly - Month 2026-1
WARNING - 2026-01-21 22:52:17,127 - dhis2eo.data.cds.era5_land.hourly - Skipping downloads for months that are expected to be incomplete (~7 days of lag).Latest available date expected in ERA5-Land: 2026-01-14
[WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-01.nc'), WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-02.nc'), WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-03.nc'), WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-04.nc'), WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-05.nc'), WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-06.nc'), WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-07.nc'), WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-08.nc'), WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-09.nc'), WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-10.nc'), WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-11.nc'), WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-12.nc')]

Open the downloaded data

Once the data has been downloaded, we can then pass the list of files to xr.open_mfdataset(). This allows us to open and work with the data as if it were a single xarray dataset:

ds_hourly = xr.open_mfdataset(files)
ds_hourly = ds_hourly.drop_vars(['number', 'expver'])
ds_hourly

Handle cumulative variables

Some variables in ERA5-Land, such as total precipitation, are stored as cumulative (running total) values. These must be de-accumulated prior to aggregation, so that each hour only represents the precipitation that occurred during the preceding hour:

if IMPORT_IS_CUMULATIVE:
    print('Converting cumulative to incremental variable...')
    # convert cumulative to diffs
    ds_diffs = ds_hourly.diff(dim='valid_time')
    # replace negative diffs with original cumulative (the hours where accumulation resets)
    ds_diffs = xr.where(ds_diffs < 0, ds_hourly.isel(valid_time=slice(1, None)), ds_diffs)
    ds_hourly = ds_diffs
Converting cumulative to incremental variable...
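The same logic on plain numbers, to make the reset handling concrete (illustration only; the notebook applies it to the full xarray dataset):

```python
def deaccumulate(cumulative):
    """Turn a cumulative series into per-step increments.

    A negative difference means the accumulation was reset; in that
    case the cumulative value itself is the increment for that step.
    """
    out = []
    for prev, curr in zip(cumulative, cumulative[1:]):
        diff = curr - prev
        out.append(curr if diff < 0 else diff)
    return out

# accumulation resets to 1 at the fourth step
print(deaccumulate([2, 5, 9, 1, 4]))  # [3, 4, 1, 3]
```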

Aggregate from hours to days

In this example we want to import daily data to DHIS2. We therefore aggregate the data from hourly to daily using earthkit.transforms. We also pass the DHIS2_TIMEZONE_OFFSET input to correctly match the hours with what should be considered a day in the local timezone of your country:

print("Aggregating temporally...")
ds_daily = transforms.temporal.daily_reduce(
    ds_hourly[IMPORT_VALUE_COL],
    how=TEMPORAL_AGGREGATION,
    time_shift={"hours": DHIS2_TIMEZONE_OFFSET},
    remove_partial_periods=False,
)
ds_daily
Aggregating temporally...

Aggregate to organisation units

Once the data have been aggregated to the correct daily level, we can then aggregate the gridded data to the organisation units from your DHIS2 instance:

print("Aggregating to organisation units...")
ds_org_units = transforms.spatial.reduce(
    ds_daily,
    org_units,
    mask_dim="id",
    how=SPATIAL_AGGREGATION,
)
ds_org_units
Aggregating to organisation units...

Post-processing

After aggregating the precipitation data to the desired organisation units, we convert the xarray Dataset to a pandas DataFrame. This makes it easier to inspect the data and prepare it for subsequent post-processing:

dataframe = ds_org_units.to_dataframe().reset_index()
dataframe

As seen above, the ERA5-Land precipitation is reported in meters, but the desired import unit is millimeters. To support converting between the user-defined source and target units, we use MetPy’s units module to convert the values before importing the data:

if IMPORT_TO_UNITS != IMPORT_FROM_UNITS:
    print(f"Applying unit conversion from {IMPORT_FROM_UNITS} to {IMPORT_TO_UNITS}...")
    # values with source units
    values_with_units = dataframe[IMPORT_VALUE_COL].values * units(IMPORT_FROM_UNITS)
    # convert to target units
    converted = values_with_units.to(IMPORT_TO_UNITS).magnitude
    # update the dataframe
    dataframe[IMPORT_VALUE_COL] = converted
    print(dataframe)
else:
    print("No unit conversion needed")
Applying unit conversion from m to mm...
               id valid_time        tp
0     O6uvpzGd5pu 2025-01-01  0.001031
1     O6uvpzGd5pu 2025-01-02  0.000849
2     O6uvpzGd5pu 2025-01-03  0.008860
3     O6uvpzGd5pu 2025-01-04  0.000000
4     O6uvpzGd5pu 2025-01-05  0.000034
...           ...        ...       ...
4740  at6UHUQatSo 2025-12-27  3.374695
4741  at6UHUQatSo 2025-12-28  2.068645
4742  at6UHUQatSo 2025-12-29  0.872920
4743  at6UHUQatSo 2025-12-30  0.855703
4744  at6UHUQatSo 2025-12-31  0.954094

[4745 rows x 3 columns]

Create DHIS2 payload

At this point we have the final data that we want to import into DHIS2. In order to submit the data to DHIS2, we first have to convert it to a standardized JSON format, which can be done with the help of the dhis2eo library:

print(f"Creating payload with {len(dataframe)} values...")
payload = dataframe_to_dhis2_json(
    df=dataframe,
    org_unit_col="id",
    period_col="valid_time",
    value_col=IMPORT_VALUE_COL,
    data_element_id=DHIS2_DATA_ELEMENT_ID,
)
payload['dataValues'][:3]
Creating payload with 4745 values...
[{'orgUnit': 'O6uvpzGd5pu', 'period': '20250101', 'value': '0.0010307069', 'dataElement': '<INSERT-DATA-ELEMENT-ID>'}, {'orgUnit': 'O6uvpzGd5pu', 'period': '20250102', 'value': '0.0008490666', 'dataElement': '<INSERT-DATA-ELEMENT-ID>'}, {'orgUnit': 'O6uvpzGd5pu', 'period': '20250103', 'value': '0.0088600907', 'dataElement': '<INSERT-DATA-ELEMENT-ID>'}]

Import to DHIS2

print(f"Importing payload into DHIS2 (dryrun={DHIS2_DRY_RUN})...")
res = client.post("/api/dataValueSets", json=payload, params={"dryRun": str(DHIS2_DRY_RUN).lower()})
print(f'Result: {res["response"]["importCount"]}')
Importing payload into DHIS2 (dryrun=True)...
Result: {'imported': 4745, 'updated': 0, 'ignored': 0, 'deleted': 0}

We have now successfully completed a full workflow for downloading, post-processing, aggregating, and importing daily ERA5-Land precipitation data into DHIS2.