Tutorial

xcollection extends xarray’s data model to handle a dictionary of xarray Datasets. An xcollection.main.Collection behaves like a regular dictionary, but it also provides a few extra methods that make a group of Datasets easier to work with.

Let’s start by importing the necessary packages.

import xarray as xr
import xcollection as xc
import typing

Accessing keys and values in a collection

To access the keys and values of a collection, we can use the xcollection.main.Collection.keys() and xcollection.main.Collection.values() methods.

col.keys()

dict_keys(['foo', 'bar'])

col.values()

dict_values([<xarray.Dataset>
Dimensions:  (lat: 25, time: 2920, lon: 53)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    air      (time, lat, lon) float32 ..., <xarray.Dataset>
Dimensions:  (time: 36, y: 205, x: 275)
Coordinates:
  * time     (time) object 1980-09-16 12:00:00 ... 1983-08-17 00:00:00
    xc       (y, x) float64 189.2 189.4 189.6 189.7 ... 17.65 17.4 17.15 16.91
    yc       (y, x) float64 16.53 16.78 17.02 17.27 ... 28.26 28.01 27.76 27.51
Dimensions without coordinates: y, x
Data variables:
    Tair     (time, y, x) float64 ...])

In addition, we can use the xcollection.main.Collection.items() method to iterate over the collection as (key, value) pairs.

for key, value in col.items():
    print(key, value)

foo <xarray.Dataset>
Dimensions:  (lat: 25, time: 2920, lon: 53)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    air      (time, lat, lon) float32 ...
bar <xarray.Dataset>
Dimensions:  (time: 36, y: 205, x: 275)
Coordinates:
  * time     (time) object 1980-09-16 12:00:00 ... 1983-08-17 00:00:00
    xc       (y, x) float64 189.2 189.4 189.6 189.7 ... 17.65 17.4 17.15 16.91
    yc       (y, x) float64 16.53 16.78 17.02 17.27 ... 28.26 28.01 27.76 27.51
Dimensions without coordinates: y, x
Data variables:
    Tair     (time, y, x) float64 ...
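
The dictionary-style access used in this section follows Python's standard mapping protocol. The sketch below illustrates that protocol with a hypothetical stand-in class, `TinyCollection`, built on `collections.abc.MutableMapping`. It is not xcollection's implementation, and plain strings stand in for xarray Datasets so that the example has no dependencies.

```python
from collections.abc import MutableMapping


class TinyCollection(MutableMapping):
    """Hypothetical stand-in for a dict-like collection (not xcollection's code)."""

    def __init__(self, datasets=None):
        # Copy the input so later changes to the caller's dict do not leak in.
        self._datasets = dict(datasets or {})

    def __getitem__(self, key):
        return self._datasets[key]

    def __setitem__(self, key, value):
        self._datasets[key] = value

    def __delitem__(self, key):
        del self._datasets[key]

    def __iter__(self):
        return iter(self._datasets)

    def __len__(self):
        return len(self._datasets)


# Plain strings stand in for xarray Datasets (an assumption of this sketch).
demo = TinyCollection({"foo": "air dataset", "bar": "Tair dataset"})

print(list(demo.keys()))    # keys, in insertion order
print(list(demo.values()))  # values, in the same order
for key, value in demo.items():
    print(key, value)
```

Implementing only the five special methods is enough here: `MutableMapping` supplies `keys()`, `values()`, and `items()` on top of them.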

Saving a collection to disk

To save a collection to disk, we can use the xcollection.main.Collection.to_zarr() method. This method takes a path to a local directory or a cloud storage bucket and writes the collection as a Zarr store. Each key in the collection is saved as a Zarr group with the same name as the key.

col.to_zarr('/tmp/my_collection.zarr', consolidated=True, mode='w')

/home/docs/checkouts/readthedocs.org/user_builds/xcollection/conda/latest/lib/python3.10/site-packages/xarray/core/dataset.py:2060: SerializationWarning: saving variable None with floating point data as an integer dtype without any _FillValue to use for NaNs
  return to_zarr(  # type: ignore

[<xarray.backends.zarr.ZarrStore at 0x7f1022bda030>,
 <xarray.backends.zarr.ZarrStore at 0x7f10207cce40>]

!ls -ltrha /tmp/my_collection.zarr

total 28K
-rw-r--r-- 1 docs docs   24 Sep  6 03:23 .zgroup
drwxrwxrwt 1 root root 4.0K Sep  6 03:23 ..
drwxr-xr-x 6 docs docs 4.0K Sep  6 03:23 foo
drwxr-xr-x 6 docs docs 4.0K Sep  6 03:23 bar
-rw-r--r-- 1 docs docs 7.0K Sep  6 03:23 .zmetadata
drwxr-xr-x 4 docs docs 4.0K Sep  6 03:23 .
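
Because each key is written as its own group, the keys of a saved collection can be recovered from the store's directory layout. The sketch below is a minimal illustration that uses only the standard library. `list_groups` is a hypothetical helper, not part of xcollection or zarr, and it assumes a local Zarr v2 directory store in which every group directory contains a `.zgroup` file, as the root directory does in the listing above.

```python
import tempfile
from pathlib import Path


def list_groups(store_path):
    """Return the names of sub-directories that look like Zarr v2 groups."""
    root = Path(store_path)
    return sorted(
        child.name
        for child in root.iterdir()
        # Assumption: a group directory is marked by a `.zgroup` file.
        if child.is_dir() and (child / ".zgroup").is_file()
    )


# Build a throwaway directory tree that mimics the layout shown above,
# so the sketch runs without zarr or xarray installed.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "my_collection.zarr"
    for name in ("foo", "bar"):
        (store / name).mkdir(parents=True)
        (store / name / ".zgroup").write_text('{"zarr_format": 2}')
    (store / ".zgroup").write_text('{"zarr_format": 2}')

    groups = list_groups(store)
    print(groups)
```

If the assumption about `.zgroup` files holds, calling `list_groups('/tmp/my_collection.zarr')` on the store written above should return the collection's keys. A single group can then be opened with xarray, for example `xr.open_zarr('/tmp/my_collection.zarr', group='foo')`.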