Dataset
Dataset is a plain dataclass that bundles a Station reference, a Variable descriptor,
and the fetched time-series data. It is returned by station.fetch(key) or station[key].
Access the measurements directly via dataset.data, which is a pandas.DataFrame with
timestamp (datetime) and value (float64) columns. Plotting helpers are
available through the dataset.plot property.
colombia_hydrodata.dataset.Dataset
dataclass
Holds time-series data for a single variable measured at a station.
Attributes:
| Name | Type | Description |
|---|---|---|
| station | Station | The station at which the variable was measured. |
| variable | Variable | The hydrological or meteorological variable being recorded. |
| data | DataFrame | A DataFrame containing the time-series observations for the variable, as retrieved from the Aquarius data source. |
plot
property
Return a plotting helper bound to this dataset.
Provides convenient access to the plotting API via
dataset.plot.<method>() without storing plotting logic directly on
the dataset class itself.
Returns:
| Type | Description |
|---|---|
| DatasetPlot | A DatasetPlot instance bound to the current dataset. |
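The accessor pattern described above can be sketched in isolation. This is a minimal illustration, not the library's code: SeriesPlot, Series, and describe() are hypothetical stand-ins for DatasetPlot, Dataset, and a plotting method.

```python
from dataclasses import dataclass


class SeriesPlot:
    """Hypothetical stand-in for DatasetPlot: holds a reference to its dataset."""

    def __init__(self, dataset):
        self.dataset = dataset

    def describe(self) -> str:
        # Hypothetical method; a real helper would draw a figure instead.
        return f"plot of {len(self.dataset.values)} observations"


@dataclass
class Series:
    """Hypothetical stand-in for Dataset: carries data, no plotting logic."""

    values: list

    @property
    def plot(self) -> SeriesPlot:
        # A new helper is built on each access and bound to this instance.
        return SeriesPlot(self)


series = Series([1.0, 2.0, 3.0])
summary = series.plot.describe()
```

Keeping the helper in its own class means the data container stays small, while callers still reach plotting through a single attribute.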
from_variable(station, variable)
classmethod
Construct a Dataset by fetching data for the given variable from Aquarius.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| station | Station | The station associated with the variable. | required |
| variable | Variable | The variable whose time-series data should be fetched. | required |
Returns:
| Type | Description |
|---|---|
| Self | A new Dataset instance populated with the fetched data. |
__str__()
Return a human-readable summary of the dataset.
Returns:
| Type | Description |
|---|---|
| str | A comma-separated string containing the station name, station ID, municipality, department, and variable description. |
sight_level(level)
Adjust observed stage values by subtracting the sight level reference.
The sight level is the difference between the observable (staff gauge) reading and the absolute sea-level elevation of the zero mark. Applying it converts raw gauge readings into elevation-referenced stage values, enabling meaningful comparison of water levels across stations along the same river reach.
Note
This method is intended exclusively for stage datasets, i.e.
variables whose key starts with 'NIVEL'. Applying it to other
variable types produces meaningless results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| level | float | The sight level offset (in the same units as the observed values, typically metres) to subtract from every observation. | required |
Returns:
| Type | Description |
|---|---|
| Self | A new Dataset instance with adjusted values, leaving the original unchanged. |
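The subtract-and-copy behaviour can be sketched as follows. StageRecord is a hypothetical stand-in for Dataset that models only the data field, and the readings and the offset are made-up illustration values; the library's own implementation may differ.

```python
from dataclasses import dataclass, replace

import pandas as pd


@dataclass
class StageRecord:
    """Hypothetical stand-in for Dataset; only the data field is modelled."""

    data: pd.DataFrame

    def sight_level(self, level: float) -> "StageRecord":
        adjusted = self.data.copy()  # work on a copy so the original stays intact
        adjusted["value"] = adjusted["value"] - level
        return replace(self, data=adjusted)  # new instance, same other fields


raw = StageRecord(
    pd.DataFrame(
        {
            "timestamp": pd.to_datetime(["2022-05-01", "2022-05-02"]),
            "value": [3.5, 4.0],  # illustrative gauge readings in metres
        }
    )
)
referenced = raw.sight_level(1.5)  # illustrative offset in metres
```

After the call, referenced.data holds the shifted values while raw.data still holds the original readings.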
rescale(scale)
Convert observed values from one measurement unit to another.
Multiplies every value in the series by a conversion factor, allowing unit transformations without altering the underlying data source.
Example
To convert stage readings from centimetres to metres:

    dataset.rescale(1 / 100)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scale | float | The multiplicative conversion factor to apply to all observed values. Must be set by the caller according to the desired unit transformation (e.g. 1 / 100 to convert centimetres to metres). | required |
Returns:
| Type | Description |
|---|---|
| Self | A new Dataset instance with rescaled values, leaving the original unchanged. |
interpolate(time_precision=None, **kwargs)
Resample the time series to a regular frequency and interpolate missing values.
Resamples the dataset to a uniform time grid, introducing NaN at
any timestamps where no measurement was recorded, then fills those gaps
using pandas.DataFrame.interpolate.
The target frequency can be supplied explicitly or derived automatically
from the variable label. Variable labels follow the convention
"<PARAM>_<FREQ>", where <FREQ> is a single-character code
('A' annual, 'M' monthly, 'D' daily, 'H' hourly) that
is mapped to the corresponding pandas offset alias via
time_precision_options.
Parameters:
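| Name | Type | Description | Default |
|---|---|---|---|
| time_precision | str \| None | A pandas offset alias for the target frequency. When None, the frequency is derived from the variable label. | None |
| **kwargs | Any | Additional keyword arguments forwarded to pandas.DataFrame.interpolate. | {} |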
Returns:
| Type | Description |
|---|---|
| Self | A new Dataset instance with regularly spaced timestamps and interpolated values. |
Raises:
| Type | Description |
|---|---|
| ValueError | If … |
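The resample-then-fill procedure can be sketched with plain pandas. Everything named here is an assumption for illustration: FREQ_BY_CODE is a hypothetical stand-in for time_precision_options (whose real contents are not shown on this page), regularize is not a library function, "NIVEL_D" is an invented label, and raising ValueError for an unrecognised code is a guess at the documented error condition.

```python
import pandas as pd

# Hypothetical stand-in for time_precision_options; the real mapping may differ.
FREQ_BY_CODE = {"D": "D", "M": "MS"}


def regularize(df, label, time_precision=None, **kwargs):
    """Resample df to a uniform grid and fill the gaps by interpolation."""
    if time_precision is None:
        # Labels are assumed to look like "<PARAM>_<FREQ>".
        code = label.rsplit("_", 1)[-1]
        if code not in FREQ_BY_CODE:
            raise ValueError(f"cannot derive a frequency from label {label!r}")
        time_precision = FREQ_BY_CODE[code]
    series = df.set_index("timestamp")["value"]
    regular = series.resample(time_precision).mean()  # empty bins become NaN
    filled = regular.interpolate(**kwargs)  # linear interpolation by default
    return filled.rename("value").reset_index()


raw = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-04"]),
        "value": [1.0, 2.0, 4.0],
    }
)
daily = regularize(raw, "NIVEL_D")  # 2020-01-03 is missing in raw and gets filled
```

Passing, for example, method="nearest" through **kwargs would change how the gaps are filled without touching the resampling step.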
detrend(**kwargs)
Remove the trend component from the dataset's value series.
Delegates to colombia_hydrodata.utils.tsa.detrend. The
resulting trend and detrended columns are added to a copy of the
underlying DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| **kwargs | Any | Keyword arguments forwarded to colombia_hydrodata.utils.tsa.detrend. | {} |
Returns:
| Type | Description |
|---|---|
| Self | A new Dataset instance with trend and detrended columns appended to the data, leaving the original unchanged. |
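This page does not say which trend model colombia_hydrodata.utils.tsa.detrend uses. As a rough sketch only, the example below assumes a straight-line least-squares fit and assumes that detrended equals value minus trend; add_linear_trend is a hypothetical helper, not part of the library.

```python
import numpy as np
import pandas as pd


def add_linear_trend(df):
    """Return a copy of df with trend and detrended columns (linear fit assumed)."""
    out = df.copy()
    x = np.arange(len(out), dtype=float)  # observation index as the regressor
    slope, intercept = np.polyfit(x, out["value"].to_numpy(), deg=1)
    out["trend"] = intercept + slope * x
    out["detrended"] = out["value"] - out["trend"]  # assumed definition
    return out


frame = pd.DataFrame(
    {
        "timestamp": pd.date_range("2020-01-01", periods=5, freq="D"),
        "value": [1.0, 3.0, 5.0, 7.0, 9.0],  # perfectly linear, for illustration
    }
)
result = add_linear_trend(frame)
```

Because the input here is exactly linear, the detrended column is zero up to floating-point error, and frame itself gains no new columns.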
seasonal()
Compute the seasonal component from the detrended series.
Delegates to colombia_hydrodata.utils.tsa.seasonal_series.
Must be called after detrend.
Returns:
| Type | Description |
|---|---|
| Self | A new Dataset instance with a seasonal component column appended to the data, leaving the original unchanged. |
Raises:
| Type | Description |
|---|---|
| KeyError | If the detrended column is missing from the data. |
anomalies()
Compute anomalies by removing the seasonal component.
Delegates to colombia_hydrodata.utils.tsa.anomalies_series.
Must be called after seasonal.
Returns:
| Type | Description |
|---|---|
| Self | A new Dataset instance with an anomalies column appended to the data, leaving the original unchanged. |
Raises:
| Type | Description |
|---|---|
| KeyError | If the seasonal component column is missing from the data. |
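The ordering requirement (detrend, then seasonal, then anomalies) follows from each step reading a column written by the previous one. The sketch below illustrates that dependency under loud assumptions: the seasonal component is taken to be the calendar-month mean of the detrended series, anomalies are taken to be detrended minus seasonal, and the column names seasonal and anomalies are guesses. The helper functions are hypothetical and are not the library's seasonal_series or anomalies_series.

```python
import pandas as pd


def add_seasonal(df):
    """Append a seasonal column: mean detrended value per calendar month (assumed)."""
    out = df.copy()
    month = out["timestamp"].dt.month
    # out["detrended"] raises KeyError when the detrend step was skipped.
    out["seasonal"] = out["detrended"].groupby(month).transform("mean")
    return out


def add_anomalies(df):
    """Append an anomalies column: detrended minus seasonal (assumed)."""
    out = df.copy()
    out["anomalies"] = out["detrended"] - out["seasonal"]
    return out


# Two years of illustrative monthly data; the second year sits 2 units higher.
first_year = [float(m) for m in range(1, 13)]
second_year = [value + 2.0 for value in first_year]
frame = pd.DataFrame(
    {
        "timestamp": pd.date_range("2019-01-01", periods=24, freq="MS"),
        "detrended": first_year + second_year,
    }
)
result = add_anomalies(add_seasonal(frame))
```

In this toy series every month's seasonal value is the average of its two years, so the first year ends up one unit below it and the second year one unit above.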
deconstruction(**kwargs)
Fully decompose the value series in a single step.
Delegates to colombia_hydrodata.utils.tsa.deconstruction,
running detrending, seasonal estimation, and anomaly extraction at
once. Replaces the entire DataFrame with the decomposition result.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| **kwargs | Any | Keyword arguments forwarded to colombia_hydrodata.utils.tsa.deconstruction. | {} |
Returns:
| Type | Description |
|---|---|
| Self | A new Dataset instance whose data contains the columns produced by the decomposition. |