Collection

class xcollection.main.Collection(datasets=None)[source]

A collection of datasets. The keys are the dataset names and the values are the datasets.

Parameters

datasets (dict, optional) – A dictionary of datasets to initialize the collection with.

Examples

>>> import xcollection as xc
>>> import xarray as xr
>>> ds = xr.tutorial.open_dataset('rasm')
>>> c = xc.Collection({'foo': ds.isel(time=0), 'bar': ds.isel(y=0)})
>>> c
<Collection (2 keys)>
🔑 foo
<xarray.Dataset>
Dimensions:  (y: 205, x: 275)
Coordinates:
    time     object 1980-09-16 12:00:00
    xc       (y, x) float64 ...
    yc       (y, x) float64 ...
Dimensions without coordinates: y, x
Data variables:
    Tair     (y, x) float64 ...
🔑 bar
<xarray.Dataset>
Dimensions:  (time: 36, x: 275)
Coordinates:
  * time     (time) object 1980-09-16 12:00:00 ... 1983-08-17 00:00:00
    xc       (x) float64 ...
    yc       (x) float64 ...
Dimensions without coordinates: x
Data variables:
    Tair     (time, x) float64 ...
choose(data_vars, *, mode='any')[source]

Return a collection with datasets containing all or any of the specified data variables.

Parameters
  • data_vars (str or list of str) – The data variables to select on.

  • mode (str, optional) – The selection mode. Must be one of ‘all’ or ‘any’. Defaults to ‘any’.

Returns

Collection – A new collection containing only the selected datasets.

Examples

>>> c
<Collection (3 keys)>
🔑 foo
<xarray.Dataset>
Dimensions:  (y: 205, x: 275)
Coordinates:
    time     object 1980-09-16 12:00:00
    xc       (y, x) float64 ...
    yc       (y, x) float64 ...
Dimensions without coordinates: y, x
Data variables:
    Tair     (y, x) float64 ...
🔑 bar
<xarray.Dataset>
Dimensions:  (time: 36, x: 275)
Coordinates:
  * time     (time) object 1980-09-16 12:00:00 ... 1983-08-17 00:00:00
    xc       (x) float64 ...
    yc       (x) float64 ...
Dimensions without coordinates: x
Data variables:
    Tair     (time, x) float64 ...
🔑 baz
<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
>>> len(c)
3
>>> c.keys()
dict_keys(['foo', 'bar', 'baz'])
>>> d = c.choose(data_vars=['Tair'], mode='any')
>>> len(d)
2
>>> d.keys()
dict_keys(['foo', 'bar'])
>>> d = c.choose(data_vars=['Tair'], mode='all')
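
With a single variable name, ‘any’ and ‘all’ select the same datasets, so the difference only shows up when several names are given. The following is a minimal plain-Python sketch of the selection rule as described above, not the library's implementation. Each dataset is modelled as a set of variable names; the helper `choose_keys` and the variable `Prec` are hypothetical and exist only for illustration. How the library reports an invalid mode is not stated, so the `ValueError` is also an assumption.

```python
# Plain-Python stand-in: each "dataset" is just the set of its variable names.
# `choose_keys` is a hypothetical helper and is not part of xcollection.
def choose_keys(datasets, data_vars, mode="any"):
    # Accept a single name as well as a list of names.
    if isinstance(data_vars, str):
        data_vars = [data_vars]
    # Assumption: invalid modes are rejected; the real error type is not documented here.
    if mode not in ("any", "all"):
        raise ValueError(f"mode must be 'any' or 'all', got {mode!r}")
    test = any if mode == "any" else all
    return [
        key
        for key, names in datasets.items()
        if test(var in names for var in data_vars)
    ]


datasets = {
    "foo": {"Tair"},
    "bar": {"Tair", "Prec"},  # 'Prec' is a made-up second variable
    "baz": set(),             # empty dataset, as in the example above
}

print(choose_keys(datasets, ["Tair", "Prec"], mode="any"))  # ['foo', 'bar']
print(choose_keys(datasets, ["Tair", "Prec"], mode="all"))  # ['bar']
```

In this sketch ‘any’ keeps a dataset that has at least one of the requested variables, while ‘all’ keeps only datasets that have every one of them.
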
items()[source]

Return the items of the collection.

keymap(func)[source]

Apply a function to each key in the collection.

Parameters

func (callable) – The function to apply to each key.

Returns

Collection – A new collection whose keys are the results of applying func to the original keys.

Examples

>>> c
<Collection (2 keys)>
🔑 foo
<xarray.Dataset>
Dimensions:  (y: 205, x: 275)
Coordinates:
    time     object 1980-09-16 12:00:00
    xc       (y, x) float64 ...
    yc       (y, x) float64 ...
Dimensions without coordinates: y, x
Data variables:
    Tair     (y, x) float64 ...
🔑 bar
<xarray.Dataset>
Dimensions:  (time: 36, x: 275)
Coordinates:
  * time     (time) object 1980-09-16 12:00:00 ... 1983-08-17 00:00:00
    xc       (x) float64 ...
    yc       (x) float64 ...
Dimensions without coordinates: x
Data variables:
    Tair     (time, x) float64 ...
>>> c.keys()
dict_keys(['foo', 'bar'])
>>> d = c.keymap(lambda x: x.upper())
>>> d.keys()
dict_keys(['FOO', 'BAR'])
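
The key renaming shown above can be sketched with a plain dictionary. This is an illustration under stated assumptions, not the library's code: the helper `keymap_sketch` is hypothetical, and the string values are placeholders standing in for datasets.

```python
# A plain dict stands in for a collection; the values are placeholders, not real datasets.
datasets = {"foo": "dataset-1", "bar": "dataset-2"}


def keymap_sketch(mapping, func):
    # Hypothetical helper: build a new mapping with func applied to every key.
    return {func(key): value for key, value in mapping.items()}


renamed = keymap_sketch(datasets, str.upper)

print(list(renamed))   # ['FOO', 'BAR']
print(list(datasets))  # ['foo', 'bar'] (the original mapping is left untouched)
```

In this sketch, two keys that map to the same new key would collapse into a single entry. Whether the library guards against such collisions is not stated here, so it is safest to pass a function that keeps keys distinct.
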
keys()[source]

Return the keys of the collection.

map(func, args=(), **kwargs)[source]

Apply a function to each dataset in the collection.

Parameters
  • func (callable) – The function to apply to each dataset.

  • args (tuple, optional) – Positional arguments to pass to func in addition to the dataset.

  • kwargs – Additional keyword arguments to pass to func.

Returns

Collection – A new collection containing the results of the function.

Examples

>>> c
<Collection (2 keys)>
🔑 foo
<xarray.Dataset>
Dimensions:  (y: 205, x: 275)
Coordinates:
    time     object 1980-09-16 12:00:00
    xc       (y, x) float64 ...
    yc       (y, x) float64 ...
Dimensions without coordinates: y, x
Data variables:
    Tair     (y, x) float64 ...
🔑 bar
<xarray.Dataset>
Dimensions:  (time: 36, x: 275)
Coordinates:
  * time     (time) object 1980-09-16 12:00:00 ... 1983-08-17 00:00:00
    xc       (x) float64 ...
    yc       (x) float64 ...
Dimensions without coordinates: x
Data variables:
    Tair     (time, x) float64 ...
>>> c.map(func=lambda x: x.isel(x=slice(0, 10)))
<Collection (2 keys)>
🔑 foo
<xarray.Dataset>
Dimensions:  (y: 205, x: 10)
Coordinates:
    time     object 1980-09-16 12:00:00
    xc       (y, x) float64 ...
    yc       (y, x) float64 ...
Dimensions without coordinates: y, x
Data variables:
    Tair     (y, x) float64 ...
🔑 bar
<xarray.Dataset>
Dimensions:  (time: 36, x: 10)
Coordinates:
  * time     (time) object 1980-09-16 12:00:00 ... 1983-08-17 00:00:00
    xc       (x) float64 ...
    yc       (x) float64 ...
Dimensions without coordinates: x
Data variables:
    Tair     (time, x) float64 ...
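
The example above uses only func. The sketch below illustrates how args and kwargs might be forwarded, assuming each call has the form `func(dataset, *args, **kwargs)` as the parameter descriptions suggest. It uses plain lists in place of datasets; `map_sketch` and `scale` are hypothetical names introduced for illustration only.

```python
def map_sketch(mapping, func, args=(), **kwargs):
    # Assumption: each value is passed first, followed by args and kwargs.
    return {key: func(value, *args, **kwargs) for key, value in mapping.items()}


def scale(values, factor, offset=0.0):
    # Toy function standing in for a real per-dataset operation.
    return [v * factor + offset for v in values]


# Lists of numbers stand in for datasets.
datasets = {"foo": [1.0, 2.0], "bar": [3.0]}

result = map_sketch(datasets, scale, args=(2.0,), offset=1.0)
print(result)  # {'foo': [3.0, 5.0], 'bar': [7.0]}
```

Note that args is a single tuple, while extra keyword arguments are given directly in the call.
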
to_zarr(store, mode='w', **kwargs)[source]

Write the collection to a Zarr store.

Parameters
  • store (str or pathlib.Path) – Store or path to directory in local or remote file system.

  • mode ({"w", "w-", "a", "r+", None}, optional) – Persistence mode: “w” means create (overwrite if exists); “w-” means create (fail if exists); “a” means override existing variables (create if does not exist); “r+” means modify existing array values only (raise an error if any metadata or shapes would change). This method's signature defaults to “w”. When mode is None, the default is “a” if append_dim is set; otherwise it is “r+” if region is set and “w-” otherwise.

  • kwargs – Additional keyword arguments to pass to the to_zarr() function.

Examples

>>> c.to_zarr(store='/tmp/foo.zarr', mode='w')
values()[source]

Return the values of the collection.

weighted(weights, **kwargs)[source]

Return a collection with datasets weighted by the given weights.

xcollection.main.open_collection(store, **kwargs)[source]

Open a collection stored in a Zarr store.

Parameters
  • store (str or pathlib.Path) – Store or path to directory in local or remote file system.

  • kwargs – Additional keyword arguments to pass to the open_dataset() function.

Returns

Collection – A collection containing the datasets in the Zarr store.

Examples

>>> import xcollection as xc
>>> c = xc.open_collection('/tmp/foo.zarr', decode_times=True, use_cftime=True)