Performance
pywgrib2 uses wgrib2 code, so it is interesting to compare the performance of the two on common tasks.
The first task is converting a GRIB2 file to netCDF. The GRIB2 file contains GDAS 2 m temperature data for a whole month and can be obtained here. The file size is about 20 MB. We compare the run times of wgrib2 and of the Python scripts cfgrib and pywgrib2. The tests were run several times, with no significant differences in elapsed times:
time wgrib2 tmp2m.gdas.l.201912.grib2 -inv /dev/null -netcdf x.nc
real 0m9.814s
user 0m9.696s
sys 0m0.100s
time cfgrib to_netcdf -o y.nc tmp2m.l.gdas.202002.grib2
real 0m16.451s
user 0m15.146s
sys 0m0.714s
time pywgrib2 template -t '2019-12-01T00' -o tmp2m.tmpl tmp2m.l.gdas.201912.grib2
real 0m1.160s
user 0m1.210s
sys 0m0.417s
time pywgrib2 to_nc -T tmp2m.tmpl -o tmp2m-pywgrib.nc tmp2m.l.gdas.201912.grib2
real 0m12.577s
user 0m12.401s
sys 0m0.868s
wgrib2 is the fastest, followed by pywgrib2 and cfgrib. Note, however, that wgrib2 does not handle this dataset correctly: it uses the forecast valid time as the time coordinate. The data file contains the analysis and 1 to 6 hour forecasts, every 6 hours:
wgrib2 tmp2m.gdas.l.202002.grib2
1:0:d=2020020100:TMP:2 m above ground:anl:
2:26025:d=2020020100:TMP:2 m above ground:1 hour fcst:
3:52040:d=2020020100:TMP:2 m above ground:2 hour fcst:
4:77996:d=2020020100:TMP:2 m above ground:3 hour fcst:
5:103976:d=2020020100:TMP:2 m above ground:4 hour fcst:
6:129840:d=2020020100:TMP:2 m above ground:5 hour fcst:
7:155730:d=2020020100:TMP:2 m above ground:6 hour fcst:
8:181608:d=2020020106:TMP:2 m above ground:anl:
9:207509:d=2020020106:TMP:2 m above ground:1 hour fcst:
10:233451:d=2020020106:TMP:2 m above ground:2 hour fcst:
11:259410:d=2020020106:TMP:2 m above ground:3 hour fcst:
. . .
This means that the 6 hour forecast is overwritten by the analysis of the next cycle, since both have the same valid time.
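The collision is easy to see by computing the valid times explicitly. The short sketch below is plain Python (no GRIB2 libraries involved) and uses the two cycles shown in the listing above:

from datetime import datetime, timedelta

# Valid time = reference time + forecast hour.  For 6-hourly cycles with
# forecast hours 0..6, the 6 hour forecast of one cycle has the same valid
# time as the analysis of the next cycle.
for reftime in (datetime(2020, 2, 1, 0), datetime(2020, 2, 1, 6)):
    for fhour in range(7):
        print(reftime, '+', fhour, 'h ->', reftime + timedelta(hours=fhour))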
In the following example we compute the mean temperature over two consecutive months. The calculation is done twice, first with pywgrib2_xr.open_dataset(), then with xarray.open_mfdataset() and engine='cfgrib':
import sys
import time
from dask.distributed import Client
import numpy as np
import xarray as xr
import pywgrib2_xr as pywgrib2
f1 = '/tmp/tmp2m.l.gdas.201912.grib2'
f2 = '/tmp/tmp2m.l.gdas.202001.grib2'
def pywgrib2_():
    tmpl = pywgrib2.make_template(f2, reftime='2020-01-01T00:00:00')
    ds = pywgrib2.open_dataset([f1, f2], tmpl)
    tmp = ds['TMP.2_m_above_ground'][:,:-1,...]
    ds.close()
    return tmp.mean(['reftime', 'time1']).compute()

def cfgrib():
    ds = xr.open_mfdataset([f1, f2], engine='cfgrib')  # , chunks={'time': 1})
    tmp = ds['t2m'][:,:-1,...]
    ds.close()
    return tmp.mean(['time', 'step']).compute()

if __name__ == '__main__':
    if sys.argv[1] == 'dask':
        client = Client()
        print(client)
    t = time.time()
    tmp1 = pywgrib2_()
    print('pywgrib2:', time.time() - t)
    t = time.time()
    tmp2 = cfgrib()
    print('cfgrib:', time.time() - t)
    assert np.allclose(tmp1.values[::-1,:], tmp2.values)
The last line compares the results. Since pywgrib2 always converts the grid orientation to WE:SN, the y-axis has to be flipped. The first run is single-threaded, the second uses the dask distributed scheduler. The pywgrib2_xr inventory and cfgrib index files already exist. The timing was done on an 8-core AMD FX-8350 processor:
python example1.py single
pywgrib2: 30.7
cfgrib: 20.8
cfgrib is substantially faster. This is mostly due to the default chunking in pywgrib2_xr, which is one chunk per model reference time. When analogous chunks are set in the call to xr.open_mfdataset, the cfgrib run time increases to 30 s.
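The analogous chunking is the chunks argument that is commented out in the cfgrib call of the listing above; a minimal sketch of the modified call (one dask chunk per model reference time, matching the pywgrib2_xr default) looks like this:

import xarray as xr

f1 = '/tmp/tmp2m.l.gdas.201912.grib2'
f2 = '/tmp/tmp2m.l.gdas.202001.grib2'

# One dask chunk per reference time; 'time' is the reference-time dimension
# in the cfgrib-backed dataset.
ds = xr.open_mfdataset([f1, f2], engine='cfgrib', chunks={'time': 1})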
python example1.py dask
<Client: 'tcp://127.0.0.1:38535' processes=4 threads=8, memory=33.56 GB>
pywgrib2: 8.3
cfgrib: 11.1
Here the situation is reversed. There are only two files, so cfgrib uses only two processes. However, with chunking cfgrib is faster: the run time is about 7 s.
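For reference, the client line printed above reflects dask's default worker layout on this machine. An equivalent explicit call is sketched below; the worker and thread counts are simply read off that line and are not required by either package:

from dask.distributed import Client

# 4 worker processes with 2 threads each, as reported by the default Client()
# on the 8-core machine used for these timings.
client = Client(n_workers=4, threads_per_worker=2)
print(client)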
The next example illustrates performance with a typical archive, where each data file contains weather elements for one model run and one forecast time. We will calculate the average minimum temperature in the atmosphere over a period of one month. The input files are GFS model output on a latitude-longitude grid at 0.5 degree resolution. Each file is about 60 MB. We select model runs at 00Z and 12Z and forecast hours 0 (i.e. the analysis), 3, 6 and 9. This gives valid times every 3 hours. There are 31 * 2 * 4 = 248 files. The timing code calculates the minimum temperature in a vertical column from 1000 to 100 hPa, then averages it over time:
import glob
import sys
import time
from dask.distributed import Client
import numpy as np
import xarray as xr
import pywgrib2_xr as pywgrib2
files = sorted(glob.glob('gfs_4_201801??_?[02]*_00[0369].grb2'))
def pywgrib2_():
    p = lambda x: x.varname == 'TMP' and x.level_code == 100
    tmpl = pywgrib2.make_template(files[:4], p, vertlevels='isobaric')
    ds = pywgrib2.open_dataset(files, tmpl)  # , chunks={'time1': 1})
    tmp = ds['TMP.isobaric'][:,:,:21,:,:]
    ds.close()
    return tmp.min('isobaric1').mean(['reftime', 'time1']).compute()

def cfgrib():
    args = {'filter_by_keys': {'typeOfLevel': 'isobaricInhPa', 'shortName': 't'}}
    nested = [files[::4], files[1::4], files[2::4], files[3::4]]
    ds = xr.open_mfdataset(nested, engine='cfgrib', backend_kwargs=args,
                           combine='nested', concat_dim=['step', 'time'])
    tmp = ds['t'][:,:,:21,:,:]
    ds.close()
    return tmp.min('isobaricInhPa').mean(['time', 'step']).compute()

if __name__ == '__main__':
    if sys.argv[1] == 'dask':
        client = Client()
        print(client)
    t = time.time()
    tmp1 = pywgrib2_()
    print('pywgrib2: {:.1f} s'.format(time.time() - t))
    t = time.time()
    tmp2 = cfgrib()
    print('cfgrib: {:.1f} s'.format(time.time() - t))
    assert np.allclose(tmp1.values[::-1,:], tmp2.values)
    ny = tmp1.shape[0]
    print('South Pole: {:.2f} degC'.format(tmp1[0,:].mean().values - 273.15))
    print('Equator: {:.2f} degC'.format(tmp1[ny//2+1,:].mean().values - 273.15))
    print('North Pole: {:.2f} degC'.format(tmp1[ny-1,:].mean().values - 273.15))
The most time consuming part is the creation of the inventory/index files. The reported times are for runs where the inventory/index files already exist.
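A simple way to see how much of the elapsed time goes into building these side files is to time a cold and a warm call of the same function; a sketch, assuming the definitions from example2.py above:

import time

# Assumes pywgrib2_() from example2.py above.  The first call pays for
# inventory creation; the second reuses the existing inventory files.
for label in ('cold', 'warm'):
    t = time.time()
    pywgrib2_()
    print('{}: {:.1f} s'.format(label, time.time() - t))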
python example2.py single
pywgrib2: 48.0 s
cfgrib: 254.3 s
python example2.py dask
pywgrib2: 23.4 s
cfgrib: 98.5 s
South Pole: -52.90 degC
Equator: -81.05 degC
North Pole: -69.65 degC
In this case, the cfgrib default step chunk is 1. If the equivalent pywgrib2 chunk time1 is also set to 1, the run time for a single thread increases to about 48 s, which is still much faster than cfgrib's 4 minutes. The dask run time stays at about 23 s.
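The pywgrib2 chunking mentioned here is the chunks argument commented out in example2.py; a sketch of the modified call, assuming files and tmpl from that script:

import pywgrib2_xr as pywgrib2

# Assumes files and tmpl from example2.py above.  One dask chunk per forecast
# time, matching cfgrib's one-chunk-per-step layout.
ds = pywgrib2.open_dataset(files, tmpl, chunks={'time1': 1})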