ruckus.embedding.RandomFourierRBF class
- class ruckus.embedding.RandomFourierRBF(n_components=100, gamma=None, complex=False, engine=None, engine_params=None, take=None, filter=None, copy_X=True)
Bases: ruckus.base.RKHS
RandomFourierRBF generates an embedding map \(\phi:X\rightarrow H\) by constructing random Fourier phase signals; that is,
\[\begin{split}\phi(x) = \frac{1}{\sqrt{K}}\begin{bmatrix} e^{i x\cdot w_1} \\ \vdots \\ e^{i x\cdot w_K} \end{bmatrix}\end{split}\]
where \(K\) is the specified n_components and each of \(w_1,\dots,w_K\) is drawn from a multivariate normal distribution with covariance matrix \(\mathrm{diag}(\gamma,\dots,\gamma)\). As a result, the kernel \(k(x,y) = \left<\phi(x),\phi(y)\right>\) is approximately a Gaussian RBF with scale parameter \(\gamma\) [1].
Rather than drawing a truly random set of phase vectors (for which the approximation error converges as \(O(n^{-1/2})\)), we use quasi-Monte Carlo sampling via scipy.stats.qmc.QMCEngine(), for which it converges as \(O((\log n)^d n^{-1})\), where \(d\) is the number of features in \(X\).
- Parameters
n_components (int) – Default = 100. The number of random Fourier features to generate.
gamma (float) – Default = None. Specifies the scale parameter of the Gaussian kernel to be approximated. If None, set to 1/n_features.
complex (bool) – Default = False. If False, the output vector has shape (n_samples, 2*n_components), where real and imaginary parts are written in pairs.
engine (child class of scipy.stats.qmc.QMCEngine()) – Default = None. The sampler class to use. If None, set to scipy.stats.qmc.Sobol().
engine_params (dict) – Default = None. Initialization parameters to use for engine.
take (numpy.ndarray of dtype int or bool, or tuple of numpy.ndarray instances of type int, or None) – Default = None. Specifies which values to take from the datapoint for transformation. If None, the entire datapoint will be taken in its original shape. If a bool array, acts as a mask, setting values marked False to 0 and leaving values marked True unchanged. If an int array, the integers specify the indices (along the first feature dimension) which are to be taken, in the order/shape of the desired input. If a tuple of int arrays, allows for drawing indices across multiple dimensions, similar to passing a tuple to a numpy array.
filter (numpy.ndarray of dtype float or None) – Default = None. Specifies a linear preprocessing of the data. Applied after take. If None, no changes are made to the input data. If filter has the same shape as the input datapoints, filter and the datapoint are multiplied elementwise. If filter has a larger dimension than the datapoint, then its first dimensions will be contracted with the datapoint via numpy.tensordot(). The final shape is determined by the remaining dimensions of filter.
copy_X (bool) – Default = True. If True, input X is copied and stored by the model in the X_fit_ attribute. If no further changes will be made to X, setting copy_X=False saves memory by storing a reference.
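The take and filter rules above can be illustrated on a single datapoint with plain NumPy. This is only a sketch of the documented behaviour, not the library's implementation: apply_take and apply_filter are hypothetical helper names, they act on one datapoint rather than a batch, and the choice of axes passed to numpy.tensordot is an assumption about how the contraction is arranged.

```python
import numpy as np


def apply_take(x, take=None):
    """Hypothetical helper: select values from one datapoint x."""
    if take is None:
        return x                      # whole datapoint, original shape
    if isinstance(take, tuple):
        return x[take]                # tuple of int arrays: multi-dimensional indexing
    take = np.asarray(take)
    if take.dtype == bool:
        return np.where(take, x, 0)   # mask: False entries become 0
    return x[take]                    # int array: indices along the first feature dimension


def apply_filter(x, filt=None):
    """Hypothetical helper: linear preprocessing applied after take."""
    if filt is None:
        return x
    filt = np.asarray(filt, dtype=float)
    if filt.shape == x.shape:
        return filt * x               # same shape: elementwise product
    # Larger filter: contract its leading dimensions with the datapoint
    # (assumed arrangement); the result has shape filt.shape[x.ndim:].
    return np.tensordot(x, filt, axes=x.ndim)


x = np.arange(6.0).reshape(2, 3)      # one datapoint with feature shape (2, 3)

masked = apply_take(x, np.array([[True, False, True], [False, True, False]]))
reordered = apply_take(x, np.array([1, 0]))
picked = apply_take(x, (np.array([0, 1]), np.array([2, 0])))
scaled = apply_filter(x, 2.0 * np.ones((2, 3)))
contracted = apply_filter(x, np.ones((2, 3, 4)))
```

Here masked keeps the feature shape (2, 3) with the unselected entries zeroed, reordered swaps the two rows, picked gathers x[0, 2] and x[1, 0] into a vector, and contracted has shape (4,), the remaining dimension of the filter.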
- Attributes
ws_ (numpy.ndarray of shape (n_components, n_features)) – Randomly-selected phase coefficients used to generate Fourier features.
shape_in_ (tuple) – The required shape of the input datapoints, i.e. the shape of the domain space \(X\).
shape_out_ (tuple) – The final shape of the transformed datapoints, i.e. the shape of the Hilbert space \(H\).
X_fit_ (numpy.ndarray of shape (n_samples,)+self.shape_in_) – The data which was used to fit the model.
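The construction described for this class can be reproduced outside the library as a rough check. The sketch below is not the ruckus implementation: it draws the phase vectors with scipy.stats.qmc.MultivariateNormalQMC (which uses a scrambled Sobol engine by default), builds \(\phi\) directly, and compares the resulting kernel against \(\exp(-\gamma\|x-y\|^2/2)\), which is what the expectation of \(e^{i(x-y)\cdot w}\) works out to when \(w\) has zero mean and covariance \(\gamma I\). The zero mean, the variable names, and the interleaved real/imaginary layout at the end are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import qmc

rng = np.random.default_rng(0)
n_samples, n_features, n_components = 5, 3, 512
gamma = 1.0 / n_features              # mirrors the documented default for gamma=None

X = rng.normal(size=(n_samples, n_features))

# Quasi-Monte Carlo draws from a normal with covariance diag(gamma, ..., gamma).
# A zero mean is assumed here.
sampler = qmc.MultivariateNormalQMC(
    mean=np.zeros(n_features), cov=gamma * np.eye(n_features)
)
ws = sampler.random(n_components)     # shape (n_components, n_features), like ws_

# Complex embedding: phi(x)_k = exp(i x . w_k) / sqrt(K)
phi = np.exp(1j * X @ ws.T) / np.sqrt(n_components)

# Kernel from the embedding (conjugate on the second argument)
K_approx = (phi @ phi.conj().T).real

# Gaussian kernel implied by the sampling distribution above
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_exact = np.exp(-0.5 * gamma * sq_dists)
max_err = np.abs(K_approx - K_exact).max()

# Real-valued layout with real and imaginary parts written in pairs
# (assumed interleaving), giving shape (n_samples, 2 * n_components).
phi_real = np.stack([phi.real, phi.imag], axis=-1).reshape(
    n_samples, 2 * n_components
)
K_real = phi_real @ phi_real.T        # equals K_approx up to rounding
```

The real-valued layout gives the same Gram matrix as the complex one, since the ordinary dot product of the paired vectors equals the real part of the complex inner product.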
- fit(X, y=None)
Fit the model from data in X.
- Parameters
X (numpy.ndarray of shape (n_samples, n_features_1,...,n_features_d)) – Training vector, where n_samples is the number of samples and (n_features_1,...,n_features_d) is the shape of the input data. Must be consistent with preprocessing instructions in self.take and self.filter.
y (Ignored) – Not used, present for API consistency by convention.
- Returns
The instance itself
- Return type
RKHS
- transform(X)
Transform X.
- Parameters
X (numpy.ndarray of shape (n_samples, n_features_1,...,n_features_d)) – Data vector, where n_samples is the number of samples and (n_features_1,...,n_features_d) is the shape of the input data. These must match self.shape_in_.
- Returns
The transformed data
- Return type
numpy.ndarray of shape (n_samples,)+self.shape_out_