Data wrangling: imputation, interpolation, and resampling
Five functions cover the regridding/cleaning surface. Each is a thin dispatch layer in core; the implementations live in extensions, so you must load the matching backend package before calling one.
| Function | What it does | Loads from | Aliases? |
|---|---|---|---|
| interpolate | Fit an interpolant to a series | DataInterpolations, or DataInterpolationsND for joint multi-axis | n/a |
| upsample | Increase sampling density (factor) | DataInterpolations (thin resample wrapper) | never |
| resample | Evaluate an interpolant on an arbitrary new grid | DataInterpolations; ND variant via DataInterpolationsND | only if target is coarser than source |
| downsample | Reduce the sampling rate by an integer factor | DSP | no, when antialias = true (default); yes, when false |
| impute | Fill NaN/missing/flagged entries by interpolation | DataInterpolations | n/a |
1-D interpolate/upsample/resample
using TimeseriesTools, DataInterpolations
x = Timeseries(sinc.(-π:0.1:π), -π:0.1:π)
itp = interpolate(x, AkimaInterpolation) # any DataInterpolations type works
y1 = upsample(x, 4) # 4× the sampling density
y2 = resample(x, 0.05) # explicit new period
y3 = resample(x, [0.0, 0.5, 1.2]) # arbitrary target points
resample accepts an AbstractVector, a Dimension, or a Number (interpreted as a sampling period). Multivariate input is handled per-slice along dims = 1 (the time axis); units flow through both the values and the time lookup.
downsample
downsample lives in DSPExt (loaded with using DSP) and is the only function here that reduces a sampling rate without aliasing. By default it applies a polyphase anti-aliasing FIR filter, then decimates by an integer factor.
using DSP
y = downsample(x, 4) # antialias = true (default)
z = downsample(x, 4; antialias=false) # plain x[1:4:end]; aliases, see below
Two distinct intents are supported via the antialias keyword:
antialias = true (default): filter then decimate. Use for a faithful low-rate representation of the kept band; this is the honest "lower the rate" operation.
antialias = false: plain decimation. Use to simulate having physically sampled the process at a lower rate (aliasing and all). The fold-back is the real acquisition behaviour you're reproducing.
Neither resample to a coarser grid nor x[1:N:end] includes anti-aliasing: they silently fold high-frequency content into your kept band.
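The fold-back can be reproduced in plain Julia without loading any backend. The sketch below is illustrative only: the rates and tone frequency are made-up numbers, and the length-4 moving average is a crude stand-in for an anti-aliasing filter, not the polyphase FIR filter that downsample applies.

```julia
fs = 100.0                       # original sampling rate in Hz (illustrative)
f  = 30.0                        # tone above the new Nyquist of 12.5 Hz
n  = 0:399
x  = sin.(2π * f .* n ./ fs)

factor = 4
plain = x[1:factor:end]          # plain decimation, no filter

# A 5 Hz tone sampled at 25 Hz produces the same samples: 30 Hz has folded to 5 Hz.
m = 0:length(plain)-1
alias = sin.(2π * 5.0 .* m ./ (fs / factor))
fold_error = maximum(abs.(plain .- alias))   # floating-point noise only

# Crude stand-in filter (assumption): average each run of `factor` samples first.
smooth = [sum(x[i:i+factor-1]) / factor for i in 1:length(x)-factor+1]
filtered = smooth[1:factor:end]

peak_plain    = maximum(abs.(plain))      # the alias keeps almost full amplitude
peak_filtered = maximum(abs.(filtered))   # the tone is strongly attenuated
```

Even this rough filter removes most of the out-of-band tone before decimation; a properly designed FIR filter suppresses it far more thoroughly.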
impute
Fill flagged entries (default [NaN, Nothing, Missing]) by interpolation.
v = sin.(0.0:0.1:9.9); v[10:15] .= NaN
x = Timeseries(v, 0.0:0.1:9.9)
y = impute(x) # NaNs filled by Akima fit
y = impute(x; replace = [NaN, Nothing, Missing, Complex]) # also flag complex values
replace accepts sentinel values (matched by isequal, with NaN matched by isnan) and types (matched by isa). The mechanism is mask → missing → DataInterpolations.munge_data drops the masked pairs → interpolant fit to survivors → evaluated at every original time point. For arrays of more than one dimension, each slice along dims = 1 is imputed independently.
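The mask, drop, fit, evaluate sequence can be sketched in plain Julia. This is an assumption-laden illustration rather than the package's code: lininterp is a hypothetical helper, and it uses linear interpolation in place of the default Akima fit so that the example needs no dependencies.

```julia
# Hypothetical helper (not part of TimeseriesTools): linear interpolation at t0
# from sorted knots ts with values vs.
function lininterp(ts, vs, t0)
    i = clamp(searchsortedlast(ts, t0), 1, length(ts) - 1)
    w = (t0 - ts[i]) / (ts[i+1] - ts[i])
    return (1 - w) * vs[i] + w * vs[i+1]
end

t = collect(0.0:0.1:9.9)
truth = 2 .* t .+ 1                # a straight line, so a linear fill is exact
v = copy(truth)
v[10:15] .= NaN

mask = isnan.(v)                   # 1. flag the entries to replace
tk, vk = t[.!mask], v[.!mask]      # 2. drop the masked (time, value) pairs
filled = [lininterp(tk, vk, ti) for ti in t]  # 3. fit to survivors, evaluate at every original time
```

Because the result is evaluated at every original time point, the unflagged samples are also passed through the interpolant; an interpolating fit returns them unchanged.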
Joint N-dimensional interpolation
For grid data where you want a single interpolant over multiple axes (rather than per-axis), load DataInterpolationsND:
using DataInterpolationsND
X = Timeseries(rand(11, 9), 𝑡(0:0.5:5.0), Var(0:0.5:4.0))
itp = interpolate(X, LinearInterpolationDimension) # joint over both axes
Y = upsample(X, 2, LinearInterpolationDimension) # joint, not separable
LinearInterpolationDimension and ConstantInterpolationDimension pass through the samples. BSplineInterpolationDimension uses the data as B-spline control points (not samples) and so smooths rather than interpolates for degree > 1; see the docstring. The ND path requires a complete (gap-free) grid: use impute first if needed, then ND-regrid.
Reference
TimeseriesTools.interpolate Function
interpolate(x, args...; kwargs...)
Fit an interpolant to a time series. Method-only function: the implementation is provided by extensions, and the available signatures depend on which interpolation backend is loaded.
using DataInterpolations enables a per-axis 1-D method on AbstractDimVector, parameterised by a DataInterpolations.AbstractInterpolation type.
using DataInterpolationsND enables a joint N-D method on AbstractDimArray, parameterised by DataInterpolationsND.AbstractInterpolationDimension types (or instances), one per axis.
Without either loaded, calling this function throws a MethodError. See resample, upsample, downsample, and impute for the higher-level operations built on top.
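The method-only pattern can be illustrated in plain Julia. The function name regrid below is hypothetical and not part of TimeseriesTools; the sketch only shows why a call fails with a MethodError until some method exists, which is the role a loaded extension plays.

```julia
# Declare a function with no methods, as a package core might.
function regrid end

# With no methods defined, any call throws a MethodError.
err = try
    regrid([1, 2, 3])
catch e
    e
end

# Defining a method afterwards (what an extension would do) makes the call succeed.
regrid(x::AbstractVector) = copy(x)
result = regrid([1, 2, 3])
```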
TimeseriesTools.upsample Function
upsample(x::AbstractDimArray, factor, args...; dims = 1, kwargs...)
Increase the sampling density of x by factor along dims by interpolation. A thin wrapper over resample: for each dimension in dims it builds a denser target grid (step / factor, or mean(diff) / factor for irregular lookups) and resamples that axis onto it. args.../kwargs... (e.g. the interpolation type) are forwarded to resample.
To reduce the sampling rate, use downsample instead, which anti-alias filters before decimating.
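The target-grid rule described for upsample can be sketched as follows. dense_grid is a hypothetical helper written for this illustration, and the assumption that the new grid spans the first to the last original point is mine; the package's actual construction may differ.

```julia
using Statistics: mean

# Hypothetical helper: regular lookups use step / factor.
dense_grid(t::AbstractRange, factor) = first(t):(step(t) / factor):last(t)

# Hypothetical helper: irregular lookups use the mean spacing / factor.
dense_grid(t::AbstractVector, factor) = first(t):(mean(diff(t)) / factor):last(t)

regular   = dense_grid(0.0:0.5:5.0, 2)           # spacing 0.25
irregular = dense_grid([0.0, 1.0, 3.0, 6.0], 2)  # mean spacing 2.0, so spacing 1.0
```

For an irregular lookup the result is a regular grid, so the original sample times are not generally among the new points.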
TimeseriesTools.resample Function
resample(x, target, args...; dims = 1, kwargs...)
Evaluate an interpolant of x on target, returning a new series with the same non-interpolated dimensions. Method-only function: provided by extensions.
using DataInterpolations: per-axis resampling along dims = 1. target may be an AbstractVector of new sample points, a DimensionalData.Dimension, or a Number (interpreted as a sampling period). Pass an AbstractInterpolation subtype as args[1] to select the interpolation method (default AkimaInterpolation).
using DataInterpolationsND: joint multi-axis resampling. target is a tuple of new lookups (one per dimension of x) or a Number (common period along every axis), and args[1] is a DataInterpolationsND.AbstractInterpolationDimension type or tuple.
For reducing a regular sampling rate prefer downsample (filter-then-decimate); plain resample onto a coarser grid does not anti-alias.
TimeseriesTools.downsample Function
downsample(x::RegularTimeseries, factor::Integer; antialias = true)
Reduce the sampling rate of x by an integer factor. Method-only function: provided by DSPExt (loaded with using DSP).
antialias = true (default): apply an anti-aliasing FIR filter before decimating, so content above the new Nyquist is removed rather than aliased back into the kept band.
antialias = false: plain x[1:factor:end], no filter. Use to simulate having physically sampled the process at the lower rate.
TimeseriesTools.impute Function
impute(x, interp = AkimaInterpolation, args...; dims = 1, replace = [NaN, Nothing, Missing], kwargs...)
Fill flagged entries of x by interpolation. Method-only function: provided by DataInterpolationsExt (loaded with using DataInterpolations).
Entries matching any element of replace (sentinel values by isequal/isnan, types by isa) are set to missing, an interpolant is fit to the survivors, and the result is evaluated at every original time point. For arrays of more than one dimension, each slice along dims = 1 is imputed independently.
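The matching rules for replace can be sketched as a small predicate. isflagged is a hypothetical name, and the body is only one plausible reading of the rules stated above (types by isa, NaN by isnan, other sentinels by isequal), not the package's implementation.

```julia
# Hypothetical predicate: does entry x match any element of `replace`?
function isflagged(x, replace)
    return any(replace) do r
        if r isa Type
            x isa r                          # types match by isa
        elseif r isa Number && isnan(r)
            x isa Number && isnan(x)         # a NaN sentinel matches by isnan
        else
            isequal(x, r)                    # other sentinels match by isequal
        end
    end
end

default = [NaN, Nothing, Missing]
flags = [isflagged(x, default) for x in Any[1.0, NaN, missing, nothing]]
```

Adding a type such as Complex to the list flags every complex entry, whatever its value.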