Performance
===========

**pywgrib2** uses **wgrib2** code. It is interesting to compare the performance
of the two on common tasks.

The first task is converting a GRIB2 file to netCDF. The GRIB2 file is GDAS 2 m
temperature data for a whole month that can be obtained `here `_. The file size
is about 20 MB. We compare the run times of **wgrib2** and of the Python scripts
**cfgrib** and **pywgrib2**. The tests were run several times, with no
significant differences in elapsed times:

.. code-block:: console

    time wgrib2 tmp2m.l.gdas.201912.grib2 -inv /dev/null -netcdf x.nc

.. parsed-literal::

    real    0m9.814s
    user    0m9.696s
    sys     0m0.100s

.. code-block:: console

    time cfgrib to_netcdf -o y.nc tmp2m.l.gdas.201912.grib2

.. parsed-literal::

    real    0m16.451s
    user    0m15.146s
    sys     0m0.714s

.. code-block:: console

    time pywgrib2 template -t '2019-12-01T00' -o tmp2m.tmpl tmp2m.l.gdas.201912.grib2

.. parsed-literal::

    real    0m1.160s
    user    0m1.210s
    sys     0m0.417s

.. code-block:: console

    time pywgrib2 to_nc -T tmp2m.tmpl -o tmp2m-pywgrib.nc tmp2m.l.gdas.201912.grib2

.. parsed-literal::

    real    0m12.577s
    user    0m12.401s
    sys     0m0.868s

**wgrib2** is the fastest, followed by **pywgrib2** and **cfgrib**. One has to
note that **wgrib2** does not handle this dataset correctly: it uses the
forecast valid time as the time coordinate. The data file contains the analysis
and 1 to 6 hour forecasts, every 6 hours:

.. code-block:: console

    wgrib2 tmp2m.l.gdas.202002.grib2

.. parsed-literal::

    1:0:d=2020020100:TMP:2 m above ground:anl:
    2:26025:d=2020020100:TMP:2 m above ground:1 hour fcst:
    3:52040:d=2020020100:TMP:2 m above ground:2 hour fcst:
    4:77996:d=2020020100:TMP:2 m above ground:3 hour fcst:
    5:103976:d=2020020100:TMP:2 m above ground:4 hour fcst:
    6:129840:d=2020020100:TMP:2 m above ground:5 hour fcst:
    7:155730:d=2020020100:TMP:2 m above ground:6 hour fcst:
    8:181608:d=2020020106:TMP:2 m above ground:anl:
    9:207509:d=2020020106:TMP:2 m above ground:1 hour fcst:
    10:233451:d=2020020106:TMP:2 m above ground:2 hour fcst:
    11:259410:d=2020020106:TMP:2 m above ground:3 hour fcst:
    ...

This means that the 6 hour forecast, which has the same valid time as the
analysis of the next model run, is overwritten by that analysis.

In the following example we compute the mean temperature over two consecutive
months. The calculation is done twice, first with
:py:func:`pywgrib2_xr.open_dataset`, then with :py:func:`xarray.open_mfdataset`
and ``engine='cfgrib'``:

.. code-block:: python

    import sys
    import time

    from dask.distributed import Client
    import numpy as np
    import xarray as xr
    import pywgrib2_xr as pywgrib2

    f1 = '/tmp/tmp2m.l.gdas.201912.grib2'
    f2 = '/tmp/tmp2m.l.gdas.202001.grib2'

    def pywgrib2_():
        tmpl = pywgrib2.make_template(f2, reftime='2020-01-01T00:00:00')
        ds = pywgrib2.open_dataset([f1, f2], tmpl)
        # Drop the 6 hour forecast: its valid time coincides with the
        # analysis of the next model run.
        tmp = ds['TMP.2_m_above_ground'][:, :-1, ...]
        ds.close()
        return tmp.mean(['reftime', 'time1']).compute()

    def cfgrib():
        ds = xr.open_mfdataset([f1, f2], engine='cfgrib')  # , chunks={'time': 1})
        # Drop the last step, as above.
        tmp = ds['t2m'][:, :-1, ...]
        ds.close()
        return tmp.mean(['time', 'step']).compute()

    if __name__ == '__main__':
        if sys.argv[1] == 'dask':
            client = Client()
            print(client)
        t = time.time()
        tmp1 = pywgrib2_()
        print('pywgrib2:', time.time() - t)
        t = time.time()
        tmp2 = cfgrib()
        print('cfgrib:', time.time() - t)
        assert np.allclose(tmp1.values[::-1, :], tmp2.values)

The last line compares the results. Since **pywgrib2** always converts the grid
orientation to WE:SN, the y-axis has to be flipped. The first run is
single-threaded, the second uses the dask distributed scheduler. The
**pywgrib2_xr** inventory and **cfgrib** index files already exist. The timings
were done on an 8-core AMD FX-8350 processor:

.. code-block:: console

    python example1.py single

.. parsed-literal::

    pywgrib2: 30.7
    cfgrib: 20.8

**cfgrib** is substantially faster here. This is mostly due to the default
chunking used by **pywgrib2_xr**, which is one chunk per model reference time.
When analogous chunks are set in the call to ``xr.open_mfdataset``, the
**cfgrib** run time increases to about 30 s.
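The "analogous chunks" setting corresponds to the commented-out ``chunks``
argument in the ``cfgrib()`` function above. A minimal sketch of that variant,
reusing the file paths from the example:

.. code-block:: python

    import xarray as xr

    f1 = '/tmp/tmp2m.l.gdas.201912.grib2'
    f2 = '/tmp/tmp2m.l.gdas.202001.grib2'

    # One dask chunk per model reference time, matching the default
    # chunking of pywgrib2_xr.open_dataset.
    ds = xr.open_mfdataset([f1, f2], engine='cfgrib', chunks={'time': 1})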
.. code-block:: console

    python example1.py dask

.. parsed-literal::

    pywgrib2: 8.3
    cfgrib: 11.1

Here the situation is reversed. There are only two files, so only two processes
are used by **cfgrib**. However, with chunking, **cfgrib** is faster: the run
time is about 7 s.

The next example illustrates performance with a typical archive, where each
data file contains weather elements for one model run and one forecast time.
We will calculate the average minimum temperature in the atmosphere over a
period of one month. The input files are from the GFS model on a
latitude-longitude grid at 0.5 deg resolution. Each file is about 60 MB. We
select model runs at 00Z and 12Z and forecast hours 0 (i.e. analysis), 3, 6
and 9. This gives valid times every 3 hours. There are 31 * 2 * 4 = 248 files.
The timing code calculates the minimum temperature in a vertical column from
1000 to 100 hPa, then averages it over time:

.. code-block:: python

    import glob
    import sys
    import time

    from dask.distributed import Client
    import numpy as np
    import xarray as xr
    import pywgrib2_xr as pywgrib2

    files = sorted(glob.glob('gfs_4_201801??_?[02]*_00[0369].grb2'))

    def pywgrib2_():
        # Select temperature on isobaric levels (GRIB2 level code 100).
        p = lambda x: x.varname == 'TMP' and x.level_code == 100
        tmpl = pywgrib2.make_template(files[:4], p, vertlevels='isobaric')
        ds = pywgrib2.open_dataset(files, tmpl)  # , chunks={'time1': 1})
        # The first 21 levels span 1000 to 100 hPa.
        tmp = ds['TMP.isobaric'][:, :, :21, :, :]
        ds.close()
        return tmp.min('isobaric1').mean(['reftime', 'time1']).compute()

    def cfgrib():
        args = {'filter_by_keys': {'typeOfLevel': 'isobaricInhPa', 'shortName': 't'}}
        # Group the files by forecast hour (0, 3, 6, 9).
        nested = [files[::4], files[1::4], files[2::4], files[3::4]]
        ds = xr.open_mfdataset(nested, engine='cfgrib', backend_kwargs=args,
                               combine='nested', concat_dim=['step', 'time'])
        tmp = ds['t'][:, :, :21, :, :]
        ds.close()
        return tmp.min('isobaricInhPa').mean(['time', 'step']).compute()

    if __name__ == '__main__':
        if sys.argv[1] == 'dask':
            client = Client()
            print(client)
        t = time.time()
        tmp1 = pywgrib2_()
        print('pywgrib2: {:.1f} s'.format(time.time() - t))
        t = time.time()
        tmp2 = cfgrib()
        print('cfgrib: {:.1f} s'.format(time.time() - t))
        assert np.allclose(tmp1.values[::-1, :], tmp2.values)
        ny = tmp1.shape[0]
        print('South Pole: {:.2f} degC'.format(tmp1[0, :].mean().values - 273.15))
        print('Equator: {:.2f} degC'.format(tmp1[ny//2+1, :].mean().values - 273.15))
        print('North Pole: {:.2f} degC'.format(tmp1[ny-1, :].mean().values - 273.15))

The most time-consuming part is the creation of the inventory/index files. The
reported times are for runs where the inventory/index files have already been
created.

.. code-block:: console

    python example2.py single

.. parsed-literal::

    pywgrib2: 48.0 s
    cfgrib: 254.3 s

.. code-block:: console

    python example2.py dask

.. parsed-literal::

    pywgrib2: 23.4 s
    cfgrib: 98.5 s
    South Pole: -52.90 degC
    Equator: -81.05 degC
    North Pole: -69.65 degC

In this case, the **cfgrib** default ``step`` chunk is 1. If the equivalent
**pywgrib2** chunk ``time1`` is also set to 1, the single-thread run time
increases to about 48 s, which is still much faster than the 4 minutes taken by
**cfgrib**. The dask run time stays at about 23 s.
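The ``time1`` chunking mentioned above corresponds to the commented-out
``chunks`` argument in ``example2.py``. A minimal sketch of that variant,
reusing the file list, predicate and template from the script:

.. code-block:: python

    import glob

    import pywgrib2_xr as pywgrib2

    files = sorted(glob.glob('gfs_4_201801??_?[02]*_00[0369].grb2'))

    # Same predicate and template as in example2.py.
    p = lambda x: x.varname == 'TMP' and x.level_code == 100
    tmpl = pywgrib2.make_template(files[:4], p, vertlevels='isobaric')

    # One dask chunk per forecast time, the analogue of cfgrib's default
    # 'step' chunk of 1.
    ds = pywgrib2.open_dataset(files, tmpl, chunks={'time1': 1})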