ruckus.base.ProductRKHS class
- class ruckus.base.ProductRKHS(factors, *, copy_X=True)
Bases: ruckus.base.RKHS
Given a sequence of RKHSs with Hilbert spaces \(H_1\), …, \(H_n\) and feature maps \(\phi_1\), …, \(\phi_n\), their product lives in the tensor product Hilbert space \(H_1\otimes \dots \otimes H_n\) and has feature map \(\phi_1 \otimes \dots \otimes \phi_n\) [1]. Correspondingly, the shape_out_ of a ProductRKHS instance is the tuple-sum (concatenation) of the shape_out_ tuples of its factors, while all its factors share the same shape_in_.
Product RKHSs are particularly useful for working with kernel embeddings of distributions and their conditional probabilities [2]. A ProductRKHS can be reduced to its marginal along a set of factors using the marginal() method, and can be reduced into a marginal space paired with a ridge-regressed conditional map using the conditional() method.
- Parameters
factors (list of RKHS objects) – The factor RKHS objects, listed in the order that their dimensions will appear in indexing.
copy_X (bool) – Default = True. If True, input X is copied and stored by the model in the X_fit_ attribute. If no further changes will be made to X, setting copy_X=False saves memory by storing a reference.
- Attributes
shape_in_ (tuple) – The required shape of the input datapoints, i.e. the shape of the domain space \(X\).
shape_out_ (tuple) – The final shape of the transformed datapoints, i.e. the shape of the Hilbert space \(H\).
X_fit_ (numpy.ndarray of shape (n_samples,)+self.shape_in_) – The data which was used to fit the model.
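The tensor-product construction described for this class can be illustrated with plain NumPy. The sketch below does not use ruckus itself: phi_1, phi_2 and product_features are hypothetical finite-dimensional feature maps chosen only for illustration. It shows that the output shape of the product feature map is the concatenation of the factor output shapes, and that inner products in the product space factorise into products of factor inner products.

```python
import numpy as np

def phi_1(X):
    # Hypothetical feature map into H_1, output shape (2,) per sample
    return np.stack([X[:, 0], X[:, 0] ** 2], axis=1)

def phi_2(X):
    # Hypothetical feature map into H_2, output shape (3,) per sample
    return np.stack([np.ones(len(X)), X[:, 1], X[:, 1] ** 2], axis=1)

def product_features(X):
    # Tensor product of the two feature maps: one outer product per sample
    return np.einsum("ni,nj->nij", phi_1(X), phi_2(X))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))      # both factors share the same input shape (2,)
F = product_features(X)          # shape (5,) + (2,) + (3,)

# Inner products of product features equal products of factor inner products
K_product = np.einsum("nij,mij->nm", F, F)
K_factors = (phi_1(X) @ phi_1(X).T) * (phi_2(X) @ phi_2(X).T)
```

Here F has shape (n_samples, 2, 3), matching the tuple-sum (2,) + (3,), and K_product agrees with K_factors up to floating-point error.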
- conditional(predictor_inds, response_inds, regressor=None, alpha=1.0)
Returns a pair of outputs: the first is a sklearn.pipeline.Pipeline consisting of the marginal RKHS of predictor_inds and a regressor which represents the conditional distribution embedding, and the second is the marginal RKHS of response_inds.
For two systems \(X\) and \(Y\), embedded in Hilbert spaces \(H_1\) and \(H_2\) respectively, the conditional distribution embedding is a linear map \(C_{Y|X}:H_1\rightarrow H_2\) such that \(C_{Y|X}\phi_1(x)\) gives the kernel embedding of the distribution of \(Y\) conditioned on \(X=x\). This is typically determined by ridge regression, though the user may pass a custom regressor for model selection purposes. See [1] for details.
- Parameters
predictor_inds (array-like of int) – List of indices of the factors in self.factors on which the response_inds will be conditioned.
response_inds (array-like of int) – List of indices of the factors in self.factors which are to be conditioned on the predictor_inds.
regressor (sklearn.base.BaseEstimator) – The regressor object to use to fit the conditional embedding. If None, a sklearn.linear_model.Ridge instance is used with fit_intercept=False and alpha specified below.
alpha (float) – The ridge parameter used in the default Ridge regressor.
- Returns
(pipe, response), where pipe is a pipeline consisting of the marginal of predictor_inds and the fitted regressor, and response is the marginal of response_inds.
- Return type
(sklearn.pipeline.Pipeline, ProductRKHS)
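As a rough illustration of the ridge-regressed conditional map, the following NumPy sketch fits a linear map from predictor features to response features using the closed-form ridge solution without an intercept. This is the same objective that a Ridge regressor with fit_intercept=False minimises, but it is not the ruckus implementation. The feature maps phi_1 and phi_2, the helper fit_conditional, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def phi_1(x):
    # Hypothetical finite-dimensional feature map for the predictor X
    return np.stack([np.ones_like(x), x], axis=1)

def phi_2(y):
    # Hypothetical finite-dimensional feature map for the response Y
    return np.stack([np.ones_like(y), y], axis=1)

def fit_conditional(Fx, Fy, alpha=1.0):
    # Ridge regression without intercept:
    # minimise ||Fy - Fx W||^2 + alpha * ||W||^2 over W
    d = Fx.shape[1]
    return np.linalg.solve(Fx.T @ Fx + alpha * np.eye(d), Fx.T @ Fy)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=500)
y = 2.0 * x + 0.1 * rng.normal(size=500)   # synthetic data with E[Y | X=x] = 2x

W = fit_conditional(phi_1(x), phi_2(y), alpha=1e-3)

# Applying the fitted map to phi_1(0.5) approximates the embedding of Y | X = 0.5
embedding = phi_1(np.array([0.5])) @ W
```

With these particular features, the second component of embedding estimates the conditional mean of \(Y\) given \(X=0.5\), which is close to 1 for this synthetic data.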
- fit(X, y=None)
Fit the model from data in X.
- Parameters
X (numpy.ndarray of shape (n_samples, n_features_1,...,n_features_d)) – Training vector, where n_samples is the number of samples and (n_features_1,...,n_features_d) is the shape of the input data. Must be consistent with the preprocessing instructions in fac.take and fac.filter for each fac in self.factors.
- Returns
The instance itself
- Return type
ProductRKHS
- kernel(X, Y=None)
Evaluates the kernel on X and Y (or X and X) by multiplying the kernels of the factors.
- Parameters
X (numpy.ndarray of shape (n_samples_1, n_features_1,...,n_features_d)) – Data vector, where n_samples_1 is the number of samples and (n_features_1,...,n_features_d) is the shape of the input data. These must match self.shape_in_.
Y (numpy.ndarray of shape (n_samples_2, n_features_1,...,n_features_d)) – Default = None. Data vector, where n_samples_2 is the number of samples and (n_features_1,...,n_features_d) is the shape of the input data. These must match self.shape_in_. If None, X is used.
- Returns
The matrix K[i,j] = k(X[i],Y[j])
- Return type
numpy.ndarray of shape (n_samples_1,n_samples_2)
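The rule of multiplying factor kernels can be sketched independently of ruckus. In the example below, rbf and product_kernel are hypothetical helpers, and the assignment of input columns to factors is an assumption standing in for whatever preprocessing the real factors apply. Each factor contributes a Gaussian kernel on its own columns, and the product kernel is the elementwise product of the factor Gram matrices.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    # Gaussian kernel matrix between the rows of A and the rows of B
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def product_kernel(X, Y=None, columns=((0,), (1, 2))):
    # Each hypothetical factor reads its own columns; Y defaults to X
    Y = X if Y is None else Y
    K = np.ones((X.shape[0], Y.shape[0]))
    for cols in columns:
        cols = list(cols)
        K *= rbf(X[:, cols], Y[:, cols])   # multiply in this factor's Gram matrix
    return K

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
Y = rng.normal(size=(6, 3))

K_xy = product_kernel(X, Y)   # shape (n_samples_1, n_samples_2) = (4, 6)
K_xx = product_kernel(X)      # Y omitted, so X is compared with itself
```

Because the Gaussian kernels here share one bandwidth and act on disjoint columns, their product coincides with a single Gaussian kernel on all three columns, which gives a convenient sanity check.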
- marginal(var_inds, copy_X=False)
Construct a ProductRKHS from only the factors specified by var_inds. Only to be used if the ProductRKHS is already fit and you would rather not fit again.
- Parameters
var_inds (array-like of int) – List of indices of the factors in self.factors from which to construct the marginal ProductRKHS.
copy_X (bool) – Default = False. If True, self.X_fit_ is copied and stored as the new model’s X_fit_ attribute. If no further changes will be made to X, setting copy_X=False saves memory by storing a reference.
- Returns
The marginal ProductRKHS of the var_inds.
- Return type
ProductRKHS
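The idea of taking a marginal without refitting, together with the copy-versus-reference choice for the fitted data, can be sketched with a toy class. ToyProduct and column_rbf below are hypothetical stand-ins written for this example; they are not the ruckus classes and only mimic the behaviour described above.

```python
import numpy as np

class ToyProduct:
    # Hypothetical stand-in for a fitted product of kernels (not the ruckus class)
    def __init__(self, factor_kernels, X_fit):
        self.factor_kernels = list(factor_kernels)
        self.X_fit_ = X_fit

    def kernel(self, X):
        # Product of the factor kernels between X and the stored fit data
        K = np.ones((X.shape[0], self.X_fit_.shape[0]))
        for k in self.factor_kernels:
            K *= k(X, self.X_fit_)
        return K

    def marginal(self, var_inds, copy_X=False):
        # Keep only the selected factors and reuse the fitted data, no refit
        X = self.X_fit_.copy() if copy_X else self.X_fit_
        return ToyProduct([self.factor_kernels[i] for i in var_inds], X)

def column_rbf(col):
    # Gaussian kernel acting on a single input column
    return lambda A, B: np.exp(-(A[:, None, col] - B[None, :, col]) ** 2)

X = np.random.default_rng(0).normal(size=(5, 3))
full = ToyProduct([column_rbf(0), column_rbf(1), column_rbf(2)], X)

marg = full.marginal([0, 2])                    # shares the stored array
marg_copy = full.marginal([0, 2], copy_X=True)  # owns an independent copy
```

In this sketch marg.X_fit_ is the same array object as the original data, so later in-place changes to X would also affect marg, while marg_copy is insulated from them at the cost of extra memory.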
- transform(X)
Transform X.
- Parameters
X (numpy.ndarray of shape (n_samples, n_features_1,...,n_features_d)) – Data vector, where n_samples is the number of samples and (n_features_1,...,n_features_d) is the shape of the input data. These must match self.shape_in_.
- Returns
The transformed data
- Return type
numpy.ndarray of shape (n_samples,)+self.shape_out_