Performance
pywgrib2 uses wgrib2 code, so it is interesting to compare the performance of the two on common tasks.
The first task is converting a GRIB2 file to netCDF. The GRIB2 file contains GDAS 2 m temperature data for a whole month and can be obtained here. The file size is about 20 MB. We compare the run times of wgrib2 and of the Python scripts cfgrib and pywgrib2. The tests were run several times, with no significant differences in elapsed times:
time wgrib2 tmp2m.gdas.l.201912.grib2 -inv /dev/null -netcdf x.nc
real 0m9.814s
user 0m9.696s
sys 0m0.100s
time cfgrib to_netcdf -o y.nc tmp2m.l.gdas.202002.grib2
real 0m16.451s
user 0m15.146s
sys 0m0.714s
time pywgrib2 template -t '2019-12-01T00' -o tmp2m.tmpl tmp2m.l.gdas.201912.grib2
real 0m1.160s
user 0m1.210s
sys 0m0.417s
time pywgrib2 to_nc -T tmp2m.tmpl -o tmp2m-pywgrib.nc tmp2m.l.gdas.201912.grib2
real 0m12.577s
user 0m12.401s
sys 0m0.868s
wgrib2 is the fastest, followed by pywgrib2 and cfgrib. Note, however, that wgrib2 does not handle this dataset correctly: it uses the forecast valid time as the time coordinate. The data file contains the analysis and 1 to 6 hour forecasts, every 6 hours:
wgrib2 tmp2m.gdas.l.202002.grib2
1:0:d=2020020100:TMP:2 m above ground:anl:
2:26025:d=2020020100:TMP:2 m above ground:1 hour fcst:
3:52040:d=2020020100:TMP:2 m above ground:2 hour fcst:
4:77996:d=2020020100:TMP:2 m above ground:3 hour fcst:
5:103976:d=2020020100:TMP:2 m above ground:4 hour fcst:
6:129840:d=2020020100:TMP:2 m above ground:5 hour fcst:
7:155730:d=2020020100:TMP:2 m above ground:6 hour fcst:
8:181608:d=2020020106:TMP:2 m above ground:anl:
9:207509:d=2020020106:TMP:2 m above ground:1 hour fcst:
10:233451:d=2020020106:TMP:2 m above ground:2 hour fcst:
11:259410:d=2020020106:TMP:2 m above ground:3 hour fcst:
. . .
This means that the 6 hour forecast is overwritten by the analysis of the next cycle, since both have the same valid time.
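The collision is easy to see by computing the valid times explicitly. The short sketch below is plain Python (no GRIB2 libraries involved) and uses the two cycles shown in the listing above:

from datetime import datetime, timedelta

# Valid time = reference time + forecast hour.  For 6-hourly cycles with
# forecast hours 0..6, the 6 hour forecast of one cycle has the same valid
# time as the analysis of the next cycle.
for reftime in (datetime(2020, 2, 1, 0), datetime(2020, 2, 1, 6)):
    for fhour in range(7):
        print(reftime, '+', fhour, 'h ->', reftime + timedelta(hours=fhour))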
In the following example we compute the mean temperature over two consecutive months. The calculation is done twice, first with pywgrib2_xr.open_dataset(), then with xarray.open_mfdataset() and engine='cfgrib':
import sys
import time
from dask.distributed import Client
import numpy as np
import xarray as xr
import pywgrib2_xr as pywgrib2
f1 = '/tmp/tmp2m.l.gdas.201912.grib2'
f2 = '/tmp/tmp2m.l.gdas.202001.grib2'
def pywgrib2_():
    tmpl = pywgrib2.make_template(f2, reftime='2020-01-01T00:00:00')
    ds = pywgrib2.open_dataset([f1, f2], tmpl)
    tmp = ds['TMP.2_m_above_ground'][:,:-1,...]
    ds.close()
    return tmp.mean(['reftime', 'time1']).compute()

def cfgrib():
    ds = xr.open_mfdataset([f1, f2], engine='cfgrib')  # , chunks={'time': 1})
    tmp = ds['t2m'][:,:-1,...]
    ds.close()
    return tmp.mean(['time', 'step']).compute()

if __name__ == '__main__':
    if sys.argv[1] == 'dask':
        client = Client()
        print(client)
    t = time.time()
    tmp1 = pywgrib2_()
    print('pywgrib2:', time.time() - t)
    t = time.time()
    tmp2 = cfgrib()
    print('cfgrib:', time.time() - t)
    assert np.allclose(tmp1.values[::-1,:], tmp2.values)
The last line compares the results. Since pywgrib2 always converts the grid orientation to WE:SN, the y-axis has to be flipped. The first run is single-threaded, the second uses the dask distributed scheduler. The pywgrib2_xr inventory and cfgrib index files already exist. The timing was done on an 8-core AMD FX-8350 processor:
python example1.py single
pywgrib2: 30.7
cfgrib: 20.8
cfgrib is substantially faster. This is mostly due to the default chunking in pywgrib2_xr, which is one chunk per model reference time. When analogous chunks are set in the call to xr.open_mfdataset, the cfgrib run time increases to 30 s.
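The analogous chunking is the chunks argument that is commented out in the cfgrib call of the listing above; a minimal sketch of the modified call (one dask chunk per model reference time, matching the pywgrib2_xr default) looks like this:

import xarray as xr

f1 = '/tmp/tmp2m.l.gdas.201912.grib2'
f2 = '/tmp/tmp2m.l.gdas.202001.grib2'

# One dask chunk per reference time; 'time' is the reference-time dimension
# in the cfgrib-backed dataset.
ds = xr.open_mfdataset([f1, f2], engine='cfgrib', chunks={'time': 1})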
python example1.py dask
<Client: 'tcp://127.0.0.1:38535' processes=4 threads=8, memory=33.56 GB>
pywgrib2: 8.3
cfgrib: 11.1
Here the situation is reversed. There are only two files, so cfgrib uses only two processes. However, with chunking cfgrib is faster: the run time is about 7 s.
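For reference, the client line printed above reflects dask's default worker layout on this machine. An equivalent explicit call is sketched below; the worker and thread counts are simply read off that line and are not required by either package:

from dask.distributed import Client

# 4 worker processes with 2 threads each, as reported by the default Client()
# on the 8-core machine used for these timings.
client = Client(n_workers=4, threads_per_worker=2)
print(client)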
The next example illustrates performance with a typical archive, where each data file contains weather elements for one model run and one forecast time. We will calculate the average minimum temperature in the atmosphere over a period of one month. The input files are GFS model output on a latitude-longitude grid at 0.5 degree resolution. Each file is about 60 MB. We select model runs at 00Z and 12Z and forecast hours 0 (i.e. the analysis), 3, 6 and 9. This gives valid times every 3 hours. There are 31 * 2 * 4 = 248 files. The timing code calculates the minimum temperature in a vertical column from 1000 to 100 hPa, then averages it over time:
import glob
import sys
import time
from dask.distributed import Client
import numpy as np
import xarray as xr
import pywgrib2_xr as pywgrib2
files = sorted(glob.glob('gfs_4_201801??_?[02]*_00[0369].grb2'))
def pywgrib2_():
    p = lambda x: x.varname == 'TMP' and x.level_code == 100
    tmpl = pywgrib2.make_template(files[:4], p, vertlevels='isobaric')
    ds = pywgrib2.open_dataset(files, tmpl)  # , chunks={'time1': 1})
    tmp = ds['TMP.isobaric'][:,:,:21,:,:]
    ds.close()
    return tmp.min('isobaric1').mean(['reftime', 'time1']).compute()

def cfgrib():
    args = {'filter_by_keys': {'typeOfLevel': 'isobaricInhPa', 'shortName': 't'}}
    nested = [files[::4], files[1::4], files[2::4], files[3::4]]
    ds = xr.open_mfdataset(nested, engine='cfgrib', backend_kwargs=args,
                           combine='nested', concat_dim=['step', 'time'])
    tmp = ds['t'][:,:,:21,:,:]
    ds.close()
    return tmp.min('isobaricInhPa').mean(['time', 'step']).compute()

if __name__ == '__main__':
    if sys.argv[1] == 'dask':
        client = Client()
        print(client)
    t = time.time()
    tmp1 = pywgrib2_()
    print('pywgrib2: {:.1f} s'.format(time.time() - t))
    t = time.time()
    tmp2 = cfgrib()
    print('cfgrib: {:.1f} s'.format(time.time() - t))
    assert np.allclose(tmp1.values[::-1,:], tmp2.values)
    ny = tmp1.shape[0]
    print('South Pole: {:.2f} degC'.format(tmp1[0,:].mean().values - 273.15))
    print('Equator: {:.2f} degC'.format(tmp1[ny//2+1,:].mean().values - 273.15))
    print('North Pole: {:.2f} degC'.format(tmp1[ny-1,:].mean().values - 273.15))
The most time consuming part is the creation of the inventory/index files. The reported times are for runs where the inventory/index files already exist.
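A simple way to see how much of the elapsed time goes into building these side files is to time a cold and a warm call of the same function; a sketch, assuming the definitions from example2.py above:

import time

# Assumes pywgrib2_() from example2.py above.  The first call pays for
# inventory creation; the second reuses the existing inventory files.
for label in ('cold', 'warm'):
    t = time.time()
    pywgrib2_()
    print('{}: {:.1f} s'.format(label, time.time() - t))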
python example2.py single
pywgrib2: 48.0 s
cfgrib: 254.3 s
python example2.py dask
pywgrib2: 23.4 s
cfgrib: 98.5 s
South Pole: -52.90 degC
Equator: -81.05 degC
North Pole: -69.65 degC
In this case, the cfgrib default step chunk is 1. If the equivalent pywgrib2 chunk time1 is also set to 1, the run time for a single thread increases to about 48 s, which is still much faster than cfgrib's 4 minutes. The dask run time stays at about 23 s.
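The pywgrib2 chunking mentioned here is the chunks argument commented out in example2.py; a sketch of the modified call, assuming files and tmpl from that script:

import pywgrib2_xr as pywgrib2

# Assumes files and tmpl from example2.py above.  One dask chunk per forecast
# time, matching cfgrib's one-chunk-per-step layout.
ds = pywgrib2.open_dataset(files, tmpl, chunks={'time1': 1})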