ruckus.base.RKHS class
- class ruckus.base.RKHS(*, take=None, filter=None, copy_X=True)
Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator
Base instance of a Reproducing Kernel Hilbert Space [1]. An RKHS consists of a Hilbert space \(H\), a feature mapping \(\phi:X \rightarrow H\) from the data space \(X\) into \(H\), and a kernel \(k(x,y)\) on \(X^2\) defined by \(k(x,y) = \left<\phi(x),\phi(y)\right>_H\). This base RKHS sets \(H=X\) by default, with \(\phi(x)=x\) and \(k(x,y)=x^T y\).
Certain functions \(f\) may be represented in \(H\) with a vector \(F\) satisfying \(\left<F,\phi(x)\right>_H=f(x)\) for all \(x \in X\). This representation can be discovered using ridge regression [2]. The set of valid functions depends on \(H\) and \(k\). This base RKHS class can only represent linear functions.
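As a minimal sketch of the ridge-regression representation for the base (linear) case, the snippet below uses plain NumPy rather than the ruckus API. The data, the weight vector w_true, and the value of alpha are illustrative assumptions; the closed form is the standard ridge solution without an intercept.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data in X = R^3 and a linear target f(x) = w_true . x (illustrative choice).
X = rng.normal(size=(50, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true

# Base RKHS: phi(x) = x, so the feature matrix is X itself.
Phi = X

# Ridge solution without intercept: F = (Phi^T Phi + alpha I)^{-1} Phi^T y.
alpha = 1e-8  # tiny ridge parameter, so F is close to the exact weights
F = np.linalg.solve(Phi.T @ Phi + alpha * np.eye(Phi.shape[1]), Phi.T @ y)

# The inner product <F, phi(x)> reproduces f(x) at a new point.
x_new = np.array([0.2, -0.1, 0.7])
f_hat = F @ x_new
```

Because \(\phi(x)=x\) here, only functions of the form \(f(x)=w^T x\) can be recovered this way; a nonlinear target would need a richer feature mapping.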
The fit() method will typically determine the dimensions and shapes of \(H\) and \(X\), as well as any other necessary parameters for determining the feature mapping \(\phi\). The transform() method will implement the feature mapping \(\phi\). The kernel() method will evaluate the kernel \(k\). The fit_function() method will find the representation of a function \(f\) given the vector \(y_i=f(x_i)\) of its values on the predictor variables.
RKHS instances can be combined with one another via composition, direct sum, and tensor product. These produce compound RKHS classes, CompositeRKHS, DirectSumRKHS, and ProductRKHS. These combinations can be instantiated with the corresponding class, or generated from arbitrary RKHS instances using the operations @ for composition, + for direct sum, and * for tensor product. See the corresponding classes for further details.
[1] Aronszajn, N. “Theory of reproducing kernels.” Trans. Amer. Math. Soc. 68 (1950), 337-404.
[2] Murphy, K. P. “Machine Learning: A Probabilistic Perspective.” The MIT Press, chapter 14.4.3, pp. 492-493.
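The mathematics behind the three combinations can be sketched in plain NumPy. This is not the ruckus implementation: phi2 is a hypothetical second feature map, both maps are applied to the same datapoint as a simplifying assumption, and the ordering convention of the @ operator in ruckus is not assumed here.

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=3)

def phi1(v):
    # Linear feature map (the base RKHS default, phi(x) = x).
    return v

def phi2(v):
    # Hypothetical second feature map: elementwise square.
    return v ** 2

def k1(a, b):
    return phi1(a) @ phi1(b)

def k2(a, b):
    return phi2(a) @ phi2(b)

def phi_sum(v):
    # Direct sum: concatenate the features; the kernel becomes k1 + k2.
    return np.concatenate([phi1(v), phi2(v)])

def phi_prod(v):
    # Tensor product: outer product of the features; the kernel becomes k1 * k2.
    return np.outer(phi1(v), phi2(v))

def phi_comp(v):
    # Composition: feed the output of one feature map into the other.
    return phi2(phi1(v))

k_sum = phi_sum(x) @ phi_sum(y)
k_prod = np.sum(phi_prod(x) * phi_prod(y))
k_comp = phi_comp(x) @ phi_comp(y)
```

In each case the combined kernel is still the inner product of the combined feature vectors, which is what makes the result an RKHS again.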
- Parameters
take (numpy.ndarray of dtype int or bool, or tuple of numpy.ndarray instances of type int, or None) – Default = None. Specifies which values to take from the datapoint for transformation. If None, the entire datapoint will be taken in its original shape. If a bool array, acts as a mask, setting values marked False to 0 and leaving values marked True unchanged. If an int array, the integers specify the indices (along the first feature dimension) which are to be taken, in the order/shape of the desired input. If a tuple of int arrays, allows for drawing indices across multiple dimensions, similar to passing a tuple to a numpy array.
filter (numpy.ndarray of dtype float, or None) – Default = None. Specifies a linear preprocessing of the data. Applied after take. If None, no changes are made to the input data. If filter has the same shape as the input datapoints, filter and the datapoint are multiplied elementwise. If filter has a larger dimension than the datapoint, then its first dimensions will be contracted with the datapoint via numpy.tensordot(). The final shape is determined by the remaining dimensions of filter.
copy_X (bool) – Default = True. If True, input X is copied and stored by the model in the X_fit_ attribute. If no further changes will be made to X, setting copy_X=False saves memory by storing a reference.
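The take and filter behaviours described above can be illustrated on a single datapoint with plain NumPy. This is a sketch of the described semantics, not the ruckus code path; in particular, the exact axes that ruckus passes to numpy.tensordot() are an assumption here, and all array values are made up for illustration.

```python
import numpy as np

# One datapoint of shape (2, 3): [[0, 1, 2], [3, 4, 5]].
x = np.arange(6.0).reshape(2, 3)

# take as a bool array: entries marked False are set to 0.
mask = np.array([[True, False, True], [False, True, False]])
masked = np.where(mask, x, 0.0)

# take as an int array: indices along the first feature dimension.
idx = np.array([1, 0, 1])
taken = x[idx]  # rows 1, 0, 1 of x, shape (3, 3)

# take as a tuple of int arrays: indices across multiple dimensions.
pair = (np.array([0, 1]), np.array([2, 0]))
picked = x[pair]  # the entries x[0, 2] and x[1, 0]

# filter with the same shape as the datapoint: elementwise product.
filt_same = np.full((2, 3), 0.5)
scaled = filt_same * x

# filter with more dimensions: contract its first dimensions with the datapoint
# (assumed here to mean all axes of the datapoint).
filt_big = np.ones((2, 3, 4))
contracted = np.tensordot(x, filt_big, axes=x.ndim)  # remaining shape (4,)
```

In the last case the output shape (4,) comes entirely from the trailing dimension of the filter, matching the statement that the final shape is determined by the remaining dimensions of filter.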
- Variables
shape_in_ (tuple) – The required shape of the input datapoints, aka the shape of the domain space \(X\).
shape_out_ (tuple) – The final shape of the transformed datapoints, aka the shape of the Hilbert space \(H\).
X_fit_ (numpy.ndarray of shape (n_samples,)+self.shape_in_) – The data which was used to fit the model.
- fit(X, y=None)
Fit the model from data in X.
- Parameters
X (numpy.ndarray of shape (n_samples, n_features_1,...,n_features_d)) – Training vector, where n_samples is the number of samples and (n_features_1,...,n_features_d) is the shape of the input data. Must be consistent with preprocessing instructions in self.take and self.filter.
y (Ignored) – Not used, present for API consistency by convention.
- Returns
The instance itself
- Return type
RKHS
- fit_function(y, X=None, regressor=None, alpha=1)
Fit a function using its values on the predictor data and a regressor.
- Parameters
y (numpy.ndarray of shape (n_samples, n_targets)) – Target vector, where n_samples is the number of samples and n_targets is the number of target functions.
X (numpy.ndarray of shape (n_samples, n_features_1,...,n_features_d)) – Default = None. Training vector, where n_samples is the number of samples and (n_features_1,...,n_features_d) is the shape of the input data. These must match self.shape_in_. If None, self.X_fit_ is used.
regressor (sklearn.base.BaseEstimator) – Default = None. The regressor object to use to fit the function. If None, a sklearn.linear_model.Ridge instance is used with fit_intercept=False and alpha specified below.
alpha (float) – Default = 1. The ridge parameter used in the default Ridge regressor.
- Returns
regressor, fitted to provide the function representation.
- Return type
- fit_transform(X, y=None)
Fit the model from data in X and transform X.
- Parameters
X (numpy.ndarray of shape (n_samples, n_features_1,...,n_features_d)) – Training vector, where n_samples is the number of samples and (n_features_1,...,n_features_d) is the shape of the input data. Must be consistent with preprocessing instructions in self.take and self.filter.
- Returns
The transformed data
- Return type
numpy.ndarray of shape (n_samples,)+self.shape_out_
- kernel(X, Y=None)
Evaluates the kernel on X and Y (or X and X).
- Parameters
X (numpy.ndarray of shape (n_samples_1, n_features_1,...,n_features_d)) – Data vector, where n_samples_1 is the number of samples in X and (n_features_1,...,n_features_d) is the shape of the input data. These must match self.shape_in_.
Y (numpy.ndarray of shape (n_samples_2, n_features_1,...,n_features_d)) – Default = None. Data vector, where n_samples_2 is the number of samples in Y and (n_features_1,...,n_features_d) is the shape of the input data. These must match self.shape_in_. If None, X is used.
- Returns
The matrix K[i,j] = k(X[i],Y[j])
- Return type
numpy.ndarray of shape (n_samples_1, n_samples_2)
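For the base linear kernel, the returned matrix can be sketched in NumPy as below. The helper linear_kernel is hypothetical and not part of ruckus; for datapoints with several feature dimensions it assumes that \(x^T y\) means the sum over all feature axes of the elementwise product.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2, 3))  # n_samples_1 = 5 datapoints of shape (2, 3)
Y = rng.normal(size=(4, 2, 3))  # n_samples_2 = 4 datapoints of shape (2, 3)

def linear_kernel(A, B=None):
    """Gram matrix K[i, j] = <A[i], B[j]> for the linear kernel (hypothetical helper)."""
    if B is None:
        B = A  # mirror the documented default: evaluate on X and X
    # Flatten the feature axes so each datapoint becomes one row vector.
    A_flat = A.reshape(len(A), -1)
    B_flat = B.reshape(len(B), -1)
    return A_flat @ B_flat.T

K = linear_kernel(X, Y)  # shape (n_samples_1, n_samples_2)
K_xx = linear_kernel(X)  # symmetric Gram matrix of X with itself
```

Each entry K[i, j] equals the inner product of the transformed points, so for the base class it can be checked directly against np.sum(X[i] * Y[j]).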
- transform(X)
Transform X.
- Parameters
X (numpy.ndarray of shape (n_samples, n_features_1,...,n_features_d)) – Data vector, where n_samples is the number of samples and (n_features_1,...,n_features_d) is the shape of the input data. These must match self.shape_in_.
- Returns
The transformed data
- Return type
numpy.ndarray of shape (n_samples,)+self.shape_out_