In this example workflow, we demonstrate how to use DHIS2 Climate Tools to keep DHIS2 continuously updated with the latest climate data. Specifically, we show step by step how to download and import the latest available daily precipitation data from the ERA5-Land hourly dataset hosted at the Climate Data Store.
The notebook fetches and imports only data that is not yet present in DHIS2, making it safe to run on a recurring basis and ensuring that DHIS2 stays up to date with the latest available climate data.
By updating the input parameters at the top of the notebook, it’s possible to also run the import for other ERA5-Land climate variables.
If you’re only interested in downloading ERA5-Land data, see this detailed step-by-step guide.
Important: This notebook only aggregates to daily periods according to the Gregorian calendar. Other calendar systems, like those used in Nepal or Ethiopia, are not yet supported.
Prerequisites¶
Before proceeding with the notebook, make sure the following are in place:
1. CDS API access¶
Make sure you have followed these instructions to authenticate and enable API access to the CDS portal.
2. Required DHIS2 data element¶
Your DHIS2 instance must contain a data element that can receive the imported data.
For daily precipitation, the data element must have:
valueType = NUMBER
aggregationType = SUM
It must belong to a data set with periodType = DAILY.
If this data element does not already exist, create it manually in DHIS2.
Once the data element exists, copy its UID and set it as DHIS2_DATA_ELEMENT_ID in the Input Parameters section further down.
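If you prefer scripting the setup, the required metadata can be sketched as a payload for the DHIS2 Metadata API. This is a minimal sketch: the name and shortName below are made-up examples, and posting is left commented out so you can adapt it to your own instance first.

```python
# Sketch: minimal metadata payload for creating the precipitation data
# element via the DHIS2 Metadata API (POST /api/metadata).
# name/shortName are illustrative placeholders; adjust to your conventions.
payload = {
    "dataElements": [
        {
            "name": "ERA5-Land daily precipitation",   # example name
            "shortName": "ERA5 precip (daily)",        # example short name
            "valueType": "NUMBER",       # required: numeric climate values
            "aggregationType": "SUM",    # required: daily totals sum over time
            "domainType": "AGGREGATE",   # aggregate (not tracker) data
        }
    ]
}

# To actually create it, post the payload to your own instance, e.g.:
# import requests
# requests.post(f"{DHIS2_BASE_URL}/api/metadata", json=payload,
#               auth=(DHIS2_USERNAME, DHIS2_PASSWORD))
```

Remember that the data element must also be added to a data set with periodType = DAILY before values can be imported against it.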
Library imports¶
We start by importing the necessary libraries:
from datetime import date
import json
import geopandas as gpd
import xarray as xr
from earthkit import transforms
from metpy.units import units
from dhis2_client import DHIS2Client
from dhis2_client.settings import ClientSettings
from dhis2eo.data.cds import era5_land
from dhis2eo.integrations.pandas import dataframe_to_dhis2_json
Input parameters¶
Let’s first define all the input parameters so they are clearly stated at the top of the notebook.
For this example we will connect to a public DHIS2 instance, so it’s important that you create the precipitation data element (as described previously) and update the DHIS2_DATA_ELEMENT_ID below. Note also that the public instance resets every night, so this process will have to be repeated for each new day.
Since we are aggregating from hourly to daily data, setting the DHIS2_TIMEZONE_OFFSET parameter is needed so that the hourly values are assigned to the correct days in the local timezone.
Note that IMPORT_FROM_UNITS and IMPORT_TO_UNITS define the units of the source data and the desired units for import, and should match the IMPORT_VARIABLE being imported. For example, to convert temperature data from Kelvin to degrees Celsius, set these to K and degC. For other unit conversions, see the MetPy documentation for working with units.
# DHIS2 connection
DHIS2_BASE_URL = "https://play.im.dhis2.org/stable-2-42-3-1"
DHIS2_USERNAME = "admin"
DHIS2_PASSWORD = "district"
# DHIS2 import settings
DHIS2_DATA_ELEMENT_ID = '<INSERT-DATA-ELEMENT-ID>'
DHIS2_ORG_UNIT_LEVEL = 2
DHIS2_DRY_RUN = True # default to safe dry-run mode; set to False for actual import
DHIS2_TIMEZONE_OFFSET = 0 # set to UTC timezone offset for your country
# ERA5 import configuration
IMPORT_VARIABLE = "total_precipitation" # ERA5 variable to download (as named in the CDS catalogue)
IMPORT_VALUE_COL = "tp" # variable name in the downloaded xarray dataset
IMPORT_IS_CUMULATIVE = True # indicates whether the input data is cumulative over time (e.g. ERA5 precipitation)
IMPORT_FROM_UNITS = "m" # units of the original data values
IMPORT_TO_UNITS = "mm" # convert to these units before import
IMPORT_START_DATE = "2025-01-01" # how far back in time to start import
IMPORT_END_DATE = date.today().isoformat() # automatically tries to import the latest data
# Download settings
DOWNLOAD_FOLDER = "../../guides/data/local"
DOWNLOAD_PREFIX = "era5-hourly-precip" # prefix for caching downloads; existing files are reused
# Aggregation settings
TEMPORAL_AGGREGATION = "sum"
SPATIAL_AGGREGATION = "mean"
Connect to DHIS2¶
First, we connect the python-client to the DHIS2 instance we want to import into. You can point this to your own instance, but for the purposes of this example we will use one of the public access DHIS2 instances, since these are continuously reset:
# Client configuration
cfg = ClientSettings(
base_url=DHIS2_BASE_URL,
username=DHIS2_USERNAME,
password=DHIS2_PASSWORD
)
client = DHIS2Client(settings=cfg)
info = client.get_system_info()
# Check if everything is working.
# You should see your current DHIS2 version info.
print("Current DHIS2 version:", info["version"])
Current DHIS2 version: 2.42.3.1
Get the DHIS2 organisation units¶
In order to download and aggregate the data to our DHIS2 organisation units, we also use the python-client to get the level 2 organisation units from our DHIS2 instance:
# Get org units GeoJSON from DHIS2
org_units_geojson = client.get_org_units_geojson(level=DHIS2_ORG_UNIT_LEVEL)
# Convert GeoJSON to geopandas
org_units = gpd.read_file(json.dumps(org_units_geojson))
org_units
Skipping field groups: unsupported OGR type: 5
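The resulting GeoDataFrame's total_bounds attribute, which we use later to define the download bounding box, returns the overall extent as (minx, miny, maxx, maxy). A minimal sketch with made-up points:

```python
import geopandas as gpd
from shapely.geometry import Point

# Toy GeoDataFrame with two points; real org units are polygons
gdf = gpd.GeoDataFrame(geometry=[Point(0, 0), Point(2, 3)])

# total_bounds -> [minx, miny, maxx, maxy] covering all geometries
print(gdf.total_bounds)  # [0. 0. 2. 3.]
```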
Check when the data was last imported¶
Since we want to run this script on a regular interval, we should avoid re-importing data that is already present. We therefore first check the last date for which data was imported for the target data element. This can be done using the convenience function analytics_latest_period_for_level() provided by the python-client:
last_imported_response = client.analytics_latest_period_for_level(de_uid=DHIS2_DATA_ELEMENT_ID, level=DHIS2_ORG_UNIT_LEVEL)
last_imported_response
{'meta': {'dataElement': 'PGDgmnWmXT6',
'level': 2,
'periodType': 'DAILY',
'calendar': 'iso8601',
'years_checked': 31},
'existing': None,
 'next': None}
Let’s extract and report the last imported month:
last_imported_period = last_imported_response["existing"]
last_imported_month_string = last_imported_period["id"][:6] if last_imported_period else None
if last_imported_month_string:
print(f"Last imported period: {last_imported_month_string}")
else:
    print("No existing data found")
No existing data found
We then use this information to define when we will start the data download, and ensure that we only download data after last_imported_month_string:
if last_imported_month_string:
IMPORT_START_DATE_OVERRIDE = max(last_imported_month_string, IMPORT_START_DATE)
else:
IMPORT_START_DATE_OVERRIDE = IMPORT_START_DATE
print(f'Import will start at {IMPORT_START_DATE_OVERRIDE}')
Import will start at 2025-01-01
Download the necessary data¶
In the next step we download all the requested data to the local file system, using convenience functionality from the dhis2eo.data.cds.era5_land module.
Running this step may take some time depending on how many months of data are requested.
Note that after the initial data download, subsequent runs of this notebook will re-use the previously downloaded files to avoid fetching the same data again.
For more details on this step, see our guide for Downloading ERA5-Land data.
print(f'Downloading data for the period: {IMPORT_START_DATE_OVERRIDE} to {IMPORT_END_DATE}...')
files = era5_land.hourly.download(
start=IMPORT_START_DATE_OVERRIDE,
end=IMPORT_END_DATE,
bbox=org_units.total_bounds,
dirname=DOWNLOAD_FOLDER,
prefix=DOWNLOAD_PREFIX,
variables=[IMPORT_VARIABLE]
)
files
Downloading data for the period: 2025-01-01 to 2026-01-21...
INFO - 2026-01-21 22:52:17,068 - dhis2eo.data.cds.era5_land.hourly - Month 2025-1
INFO - 2026-01-21 22:52:17,071 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-01.nc
INFO - 2026-01-21 22:52:17,073 - dhis2eo.data.cds.era5_land.hourly - Month 2025-2
INFO - 2026-01-21 22:52:17,076 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-02.nc
INFO - 2026-01-21 22:52:17,078 - dhis2eo.data.cds.era5_land.hourly - Month 2025-3
INFO - 2026-01-21 22:52:17,080 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-03.nc
INFO - 2026-01-21 22:52:17,082 - dhis2eo.data.cds.era5_land.hourly - Month 2025-4
INFO - 2026-01-21 22:52:17,083 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-04.nc
INFO - 2026-01-21 22:52:17,085 - dhis2eo.data.cds.era5_land.hourly - Month 2025-5
INFO - 2026-01-21 22:52:17,088 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-05.nc
INFO - 2026-01-21 22:52:17,091 - dhis2eo.data.cds.era5_land.hourly - Month 2025-6
INFO - 2026-01-21 22:52:17,094 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-06.nc
INFO - 2026-01-21 22:52:17,095 - dhis2eo.data.cds.era5_land.hourly - Month 2025-7
INFO - 2026-01-21 22:52:17,099 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-07.nc
INFO - 2026-01-21 22:52:17,101 - dhis2eo.data.cds.era5_land.hourly - Month 2025-8
INFO - 2026-01-21 22:52:17,103 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-08.nc
INFO - 2026-01-21 22:52:17,105 - dhis2eo.data.cds.era5_land.hourly - Month 2025-9
INFO - 2026-01-21 22:52:17,109 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-09.nc
INFO - 2026-01-21 22:52:17,111 - dhis2eo.data.cds.era5_land.hourly - Month 2025-10
INFO - 2026-01-21 22:52:17,115 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-10.nc
INFO - 2026-01-21 22:52:17,117 - dhis2eo.data.cds.era5_land.hourly - Month 2025-11
INFO - 2026-01-21 22:52:17,120 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-11.nc
INFO - 2026-01-21 22:52:17,121 - dhis2eo.data.cds.era5_land.hourly - Month 2025-12
INFO - 2026-01-21 22:52:17,124 - dhis2eo.data.cds.era5_land.hourly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\era5-hourly-precip_2025-12.nc
INFO - 2026-01-21 22:52:17,125 - dhis2eo.data.cds.era5_land.hourly - Month 2026-1
WARNING - 2026-01-21 22:52:17,127 - dhis2eo.data.cds.era5_land.hourly - Skipping downloads for months that are expected to be incomplete (~7 days of lag). Latest available date expected in ERA5-Land: 2026-01-14
[WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-01.nc'),
WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-02.nc'),
WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-03.nc'),
WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-04.nc'),
WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-05.nc'),
WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-06.nc'),
WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-07.nc'),
WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-08.nc'),
WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-09.nc'),
WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-10.nc'),
WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-11.nc'),
 WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/era5-hourly-precip_2025-12.nc')]
Open the downloaded data¶
Once the data has been downloaded, we can then pass the list of files to xr.open_mfdataset(). This allows us to open and work with the data as if it were a single xarray dataset:
ds_hourly = xr.open_mfdataset(files)
ds_hourly = ds_hourly.drop_vars(['number', 'expver'])
ds_hourly
Handle cumulative variables¶
Some variables in ERA5-Land, such as total precipitation, are stored as cumulative (running total) values. These must be de-accumulated prior to aggregation, so that each hour only represents the precipitation that occurred during the preceding hour:
if IMPORT_IS_CUMULATIVE:
print('Converting cumulative to incremental variable...')
# convert cumulative to diffs
ds_diffs = ds_hourly.diff(dim='valid_time')
# replace negative diffs with original cumulative (the hours where accumulation resets)
ds_diffs = xr.where(ds_diffs < 0, ds_hourly.isel(valid_time=slice(1, None)), ds_diffs)
    ds_hourly = ds_diffs
Converting cumulative to incremental variable...
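The de-accumulation logic above can be illustrated on a small made-up series with plain NumPy: taking differences recovers the hourly amounts, and a negative difference marks an hour where the accumulation reset, so the cumulative value itself is the correct hourly amount there:

```python
import numpy as np

# Made-up hourly cumulative precipitation; the accumulation resets after hour 3
cumulative = np.array([1.0, 3.0, 6.0, 2.0, 5.0])

# successive differences give the per-hour amounts
diffs = np.diff(cumulative)  # [2., 3., -4., 3.]

# a negative diff marks a reset hour; there the cumulative value itself
# is the amount since the reset, mirroring the xr.where() call above
incremental = np.where(diffs < 0, cumulative[1:], diffs)
print(incremental)  # [2. 3. 2. 3.]
```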
Aggregate from hours to days¶
In this example we want to import daily data to DHIS2. We therefore aggregate the data from hourly to daily using earthkit.transforms. We also pass the DHIS2_TIMEZONE_OFFSET input to correctly match the hours with what should be considered a day in the local timezone of your country:
print("Aggregating temporally...")
ds_daily = transforms.temporal.daily_reduce(
ds_hourly[IMPORT_VALUE_COL],
how=TEMPORAL_AGGREGATION,
time_shift={"hours": DHIS2_TIMEZONE_OFFSET},
remove_partial_periods=False,
)
ds_daily
Aggregating temporally...
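earthkit.transforms applies the time shift internally; the same idea can be sketched with plain pandas. Timestamps are shifted by the UTC offset before grouping into calendar days (the offset of 3 hours below is a made-up example):

```python
import pandas as pd

# 48 hourly values of 1.0 over two UTC days
idx = pd.date_range("2025-01-01 00:00", periods=48, freq="h")
hourly = pd.Series(1.0, index=idx)

offset = 3  # hypothetical UTC+3 country

# shift timestamps into local time, then sum per local calendar day
local = hourly.set_axis(hourly.index + pd.Timedelta(hours=offset))
daily = local.resample("1D").sum()
print(daily)  # 21.0, 24.0 and 3.0 for the three local days
```

Note that the first and last local days only cover part of a day (21 and 3 hours), which is the partial-period situation the remove_partial_periods argument controls.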
Aggregate to organisation units¶
Once the data has been aggregated to daily periods, we can aggregate the gridded values to the organisation units from your DHIS2 instance:
print("Aggregating to organisation units...")
ds_org_units = transforms.spatial.reduce(
ds_daily,
org_units,
mask_dim="id",
how=SPATIAL_AGGREGATION,
)
ds_org_units
Aggregating to organisation units...
Post-processing¶
After aggregating the precipitation data to the desired organisation units, we convert the xarray Dataset to a pandas DataFrame. This makes it easier to inspect the data and prepare it for subsequent post-processing:
dataframe = ds_org_units.to_dataframe().reset_index()
dataframe
As seen above, the ERA5-Land precipitation is reported in meters, but the desired import unit is millimeters. To support user-defined source and target units, we leverage MetPy’s units module to convert the values before importing the data:
if IMPORT_TO_UNITS != IMPORT_FROM_UNITS:
print(f"Applying unit conversion from {IMPORT_FROM_UNITS} to {IMPORT_TO_UNITS}...")
# values with source units
values_with_units = dataframe[IMPORT_VALUE_COL].values * units(IMPORT_FROM_UNITS)
# convert to target units
converted = values_with_units.to(IMPORT_TO_UNITS).magnitude
# update the dataframe
dataframe[IMPORT_VALUE_COL] = converted
print(dataframe)
else:
    print("No unit conversion needed")
Applying unit conversion from m to mm...
id valid_time tp
0 O6uvpzGd5pu 2025-01-01 0.001031
1 O6uvpzGd5pu 2025-01-02 0.000849
2 O6uvpzGd5pu 2025-01-03 0.008860
3 O6uvpzGd5pu 2025-01-04 0.000000
4 O6uvpzGd5pu 2025-01-05 0.000034
... ... ... ...
4740 at6UHUQatSo 2025-12-27 3.374695
4741 at6UHUQatSo 2025-12-28 2.068645
4742 at6UHUQatSo 2025-12-29 0.872920
4743 at6UHUQatSo 2025-12-30 0.855703
4744 at6UHUQatSo 2025-12-31 0.954094
[4745 rows x 3 columns]
Create DHIS2 payload¶
At this point we have the final data that we want to import into DHIS2. In order to submit the data, we first have to convert it to a standardized JSON format, which can be done with the help of the dhis2eo library:
print(f"Creating payload with {len(dataframe)} values...")
payload = dataframe_to_dhis2_json(
df=dataframe,
org_unit_col="id",
period_col="valid_time",
value_col=IMPORT_VALUE_COL,
data_element_id=DHIS2_DATA_ELEMENT_ID,
)
payload['dataValues'][:3]
Creating payload with 4745 values...
[{'orgUnit': 'O6uvpzGd5pu',
'period': '20250101',
'value': '0.0010307069',
'dataElement': '<INSERT-DATA-ELEMENT-ID>'},
{'orgUnit': 'O6uvpzGd5pu',
'period': '20250102',
'value': '0.0008490666',
'dataElement': '<INSERT-DATA-ELEMENT-ID>'},
{'orgUnit': 'O6uvpzGd5pu',
'period': '20250103',
'value': '0.0088600907',
  'dataElement': '<INSERT-DATA-ELEMENT-ID>'}]
Import to DHIS2¶
print(f"Importing payload into DHIS2 (dryrun={DHIS2_DRY_RUN})...")
res = client.post("/api/dataValueSets", json=payload, params={"dryRun": str(DHIS2_DRY_RUN).lower()})
print(f'Result: {res["response"]["importCount"]}')
Importing payload into DHIS2 (dryrun=True)...
Result: {'imported': 4745, 'updated': 0, 'ignored': 0, 'deleted': 0}
We have now successfully completed a full workflow for downloading, post-processing, aggregating, and importing daily ERA5-Land precipitation data into DHIS2.