peakweather.utils
- df_add_missing_columns(df: DataFrame, col0=None, col1=None, fill_value=nan) → DataFrame
Add missing columns to a MultiIndex DataFrame with NaN values.
- Parameters:
df (pd.DataFrame) – The input DataFrame.
col0 (list, optional) – The first level of the MultiIndex columns. If None, the existing columns are used.
col1 (list, optional) – The second level of the MultiIndex columns. If None, the existing columns are used.
fill_value (scalar) – The value to use for missing columns. Default is np.nan.
- Returns:
The DataFrame with missing columns added.
- Return type:
pd.DataFrame
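The behaviour described above can be approximated with plain pandas. The following is a minimal sketch, not peakweather's implementation: the helper name `add_missing_columns_sketch` is hypothetical, and it assumes the target columns are the Cartesian product of the two levels.

```python
import numpy as np
import pandas as pd


def add_missing_columns_sketch(df, col0=None, col1=None, fill_value=np.nan):
    """Hypothetical stand-in for the documented behaviour (assumption)."""
    # Fall back to the level values already present when a level is not given.
    if col0 is None:
        col0 = df.columns.get_level_values(0).unique()
    if col1 is None:
        col1 = df.columns.get_level_values(1).unique()
    # Assumption: the full column set is the product of both levels.
    full = pd.MultiIndex.from_product([col0, col1])
    # Columns absent from df are created and filled with fill_value.
    return df.reindex(columns=full, fill_value=fill_value)


columns = pd.MultiIndex.from_tuples([("A", "temp"), ("B", "wind")])
df = pd.DataFrame([[1.0, 2.0], [3.0, 4.0]], columns=columns)

out = add_missing_columns_sketch(df)
print(out.columns.tolist())
```

Here the pairs `("A", "wind")` and `("B", "temp")` are absent from the input, so they are added and filled with NaN, while the existing columns keep their values.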
- sliding_window_view(data: ndarray, window_size: int) → ndarray
Creates a sliding window view of the input data.
- Parameters:
data (np.ndarray) – The input data with shape (num_time_steps, *).
window_size (int) – The size of the sliding window.
- Returns:
The sliding window view of the input data with shape (num_windows, window_size, *).
- Return type:
np.ndarray
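To illustrate the documented output shape, the sketch below uses NumPy's own `numpy.lib.stride_tricks.sliding_window_view` and moves the window axis next to the leading axis. Whether peakweather builds on this NumPy function is an assumption, as is the stride of one time step between consecutive windows.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as np_sliding_window_view

# Toy input with shape (num_time_steps=6, 2 features).
data = np.arange(12).reshape(6, 2)
window_size = 3

# NumPy appends the window axis last: shape (num_windows, 2, window_size).
windows = np_sliding_window_view(data, window_size, axis=0)

# Move the window axis to position 1 to match (num_windows, window_size, *).
windows = np.moveaxis(windows, -1, 1)

print(windows.shape)
```

With a stride of one, `num_windows` equals `num_time_steps - window_size + 1`, and window `i` holds rows `i` through `i + window_size - 1` of the input.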
- timestamps_from_xr(ds: xr.Dataset, delta: str, tz: str | None = 'UTC') → ndarray
Compute a 2D array of timezone-aware timestamps by combining a reference time coordinate with a time-delta coordinate.
- Parameters:
- Returns:
A 2D array of shape (num_reftime, num_deltas) containing pandas.Timestamp objects localized to UTC, where each entry is reftime[i] + offset[j].
- Return type:
np.ndarray
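The combination of reference times and offsets can be pictured as a broadcast sum. The sketch below reproduces only that arithmetic with pandas and NumPy. It does not call `timestamps_from_xr`, and the names `reftime` and `lead` are hypothetical stand-ins for the two coordinates that the function would read from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical coordinate values (assumptions for illustration only).
reftime = pd.to_datetime(["2024-01-01 00:00", "2024-01-01 12:00"])
lead = pd.to_timedelta([0, 1, 2], unit="h")

# Broadcast to a (num_reftime, num_deltas) grid of datetime64 values.
grid = reftime.values[:, None] + lead.values[None, :]

# Convert each entry to a timezone-aware pandas.Timestamp in UTC.
stamps = np.array(
    [[pd.Timestamp(t).tz_localize("UTC") for t in row] for row in grid],
    dtype=object,
)

print(stamps.shape)
```

Entry `stamps[i, j]` is `reftime[i] + lead[j]`, so the second row, third column here is 14:00 UTC on 2024-01-01.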
- to_pandas_freq(freq: str)
Convert a frequency string to a pandas frequency object.
- Parameters:
freq (str) – The frequency string.
- Returns:
The pandas frequency object.
- Return type:
pd.DateOffset
- Raises:
ValueError – If the frequency string is not valid.
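For comparison, pandas itself exposes a similar conversion through `pandas.tseries.frequencies.to_offset`, which also raises `ValueError` on an invalid string. The sketch below uses that pandas function directly. Whether `to_pandas_freq` wraps it, or accepts additional aliases, is an assumption not confirmed by this page.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Parse a frequency string into a pandas offset object.
offset = to_offset("10min")
print(offset)

# An unparseable string raises ValueError.
try:
    to_offset("not-a-frequency")
    invalid_raised = False
except ValueError:
    invalid_raised = True
```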
- xr_to_np(a: xr.Dataset, pars: list | None = None, sample_dim: int | None = None, stack_dim: int = -1) → ndarray
Extract variables from an xarray.Dataset and return them as a stacked ndarray.
- Parameters:
a (xarray.Dataset) – The input dataset containing one or more data variables.
pars (list[str], optional) – The names of the variables to extract. If None, all data variables in a are used.
sample_dim (int, optional) – The dimension containing the samples in a, if present. If sample_dim is an int, that dimension is moved to the front, giving shape (samples, a.shape[~sample_dim]); None indicates that there is no sample dimension to move.
stack_dim (int) – The dimension along which the arrays are stacked. Default is -1.
- Returns:
A NumPy array where the selected variables are stacked along stack_dim (the last axis by default). If each variable has shape (*dims), the returned array has shape (*dims, num_vars).
- Return type:
np.ndarray
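The shape rule in the Returns entry can be checked with NumPy alone. In the sketch below a plain dict of arrays stands in for the dataset's data variables. This is a simplification for illustration, and the variable names are hypothetical. It shows only the stacking step, not the handling of `sample_dim`.

```python
import numpy as np

# Stand-in for a dataset with two variables of shape (4, 3) (assumption).
variables = {
    "temperature": np.zeros((4, 3)),
    "wind_speed": np.ones((4, 3)),
}
pars = ["temperature", "wind_speed"]

# Stack the selected variables along a new last axis (stack_dim=-1).
stacked = np.stack([variables[p] for p in pars], axis=-1)

print(stacked.shape)
```

Each variable has shape `(4, 3)`, so the result has shape `(4, 3, 2)`, with `stacked[..., k]` holding the `k`-th entry of `pars`.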