Tutorial

xcollection extends xarray’s data model to handle a dictionary of xarray Datasets. An xcollection.main.Collection behaves like a regular dictionary, but it also has a few extra methods that make a group of Datasets easier to work with.

Let’s start by importing the necessary packages.

import xarray as xr
import xcollection as xc
import typing

Accessing keys and values in a collection

To access the keys and values of a collection, we can use the xcollection.main.Collection.keys() and xcollection.main.Collection.values() methods.

col.keys()

dict_keys(['foo', 'bar'])

col.values()

dict_values([<xarray.Dataset>
Dimensions:  (lat: 25, time: 2920, lon: 53)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    air      (time, lat, lon) float32 ..., <xarray.Dataset>
Dimensions:  (time: 36, y: 205, x: 275)
Coordinates:
  * time     (time) object 1980-09-16 12:00:00 ... 1983-08-17 00:00:00
    xc       (y, x) float64 189.2 189.4 189.6 189.7 ... 17.65 17.4 17.15 16.91
    yc       (y, x) float64 16.53 16.78 17.02 17.27 ... 28.26 28.01 27.76 27.51
Dimensions without coordinates: y, x
Data variables:
    Tair     (time, y, x) float64 ...])

In addition, we can use the xcollection.main.Collection.items() method to iterate over (key, value) pairs, just as with a regular dictionary.

for key, value in col.items():
    print(key, value)

foo <xarray.Dataset>
Dimensions:  (lat: 25, time: 2920, lon: 53)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    air      (time, lat, lon) float32 ...
bar <xarray.Dataset>
Dimensions:  (time: 36, y: 205, x: 275)
Coordinates:
  * time     (time) object 1980-09-16 12:00:00 ... 1983-08-17 00:00:00
    xc       (y, x) float64 189.2 189.4 189.6 189.7 ... 17.65 17.4 17.15 16.91
    yc       (y, x) float64 16.53 16.78 17.02 17.27 ... 28.26 28.01 27.76 27.51
Dimensions without coordinates: y, x
Data variables:
    Tair     (time, y, x) float64 ...
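Because items() yields (key, Dataset) pairs in the same way a dictionary does, helper code can be written against any mapping of names to Datasets. The sketch below uses a plain dict as a stand-in so that it depends only on xarray; the helper name summarize and the synthetic data are hypothetical, and passing a Collection instead is assumed to work the same way.

```python
from typing import Dict, Mapping

import numpy as np
import xarray as xr


def summarize(datasets: Mapping[str, xr.Dataset]) -> Dict[str, Dict[str, int]]:
    """Map each key to the dimension sizes of its dataset."""
    return {
        key: {str(dim): int(size) for dim, size in ds.sizes.items()}
        for key, ds in datasets.items()
    }


# A plain dict standing in for a collection (hypothetical data).
datasets = {
    "foo": xr.Dataset({"air": (("time", "lat"), np.zeros((4, 3)))}),
    "bar": xr.Dataset({"Tair": (("time", "y", "x"), np.ones((2, 3, 4)))}),
}

summary = summarize(datasets)
for key, sizes in summary.items():
    print(key, sizes)
```

A compact summary like this is often easier to scan than the full Dataset repr printed for every entry.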

Saving a collection to disk

To save a collection to disk, we can use the xcollection.main.Collection.to_zarr() method. This method takes a path to a local directory or a cloud storage bucket and writes the collection as a Zarr store. Each key in the collection is saved as a Zarr group with the same name as the key.

col.to_zarr('/tmp/my_collection.zarr', consolidated=True, mode='w')

/home/docs/checkouts/readthedocs.org/user_builds/xcollection/conda/stable/lib/python3.10/site-packages/xarray/core/dataset.py:2037: SerializationWarning: saving variable None with floating point data as an integer dtype without any _FillValue to use for NaNs
  return to_zarr(

[<xarray.backends.zarr.ZarrStore at 0x7f7977a2a570>,
 <xarray.backends.zarr.ZarrStore at 0x7f797784c580>]

!ls -ltrha /tmp/my_collection.zarr

total 28K
-rw-r--r-- 1 docs docs   24 Dec 23 16:57 .zgroup
drwxrwxrwt 1 root root 4.0K Dec 23 16:57 ..
drwxr-xr-x 6 docs docs 4.0K Dec 23 16:57 foo
drwxr-xr-x 6 docs docs 4.0K Dec 23 16:57 bar
-rw-r--r-- 1 docs docs 7.0K Dec 23 16:57 .zmetadata
drwxr-xr-x 4 docs docs 4.0K Dec 23 16:57 .