TheilSenRegressor
Theil-Sen Estimator: robust multivariate regression model.
The algorithm calculates least-squares solutions on subsets of size n_subsamples drawn from the samples in X. Any value of n_subsamples between the number of features and the number of samples yields an estimator that trades off robustness against efficiency. Because the number of least-squares solutions is “n_samples choose n_subsamples”, it can be extremely large, so it can be capped with max_subpopulation. If this limit is reached, the subsets are chosen randomly. In a final step, the spatial median (or L1 median) of all least-squares solutions is calculated.
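The final step described above, taking the spatial median of many candidate solutions, can be illustrated with a plain Weiszfeld iteration. This is only a sketch: `euclidean` and `spatialMedian` are hypothetical helpers that are not part of this class, and the underlying Python implementation may use a different variant of the algorithm. The `maxIter` and `tol` defaults simply mirror the documented `max_iter` and `tol` defaults.

```typescript
// Euclidean distance between two points of equal dimension.
function euclidean(a: number[], b: number[]): number {
  let sum = 0;
  for (let j = 0; j < a.length; j++) {
    const d = a[j] - b[j];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Plain Weiszfeld iteration for the spatial (L1) median.
// Hypothetical helper for illustration; assumes `points` is non-empty.
function spatialMedian(points: number[][], maxIter = 300, tol = 1e-3): number[] {
  const dim = points[0].length;

  // Start from the coordinate-wise mean.
  let current: number[] = [];
  for (let j = 0; j < dim; j++) {
    let s = 0;
    for (const p of points) s += p[j];
    current.push(s / points.length);
  }

  for (let iter = 0; iter < maxIter; iter++) {
    const next: number[] = [];
    for (let j = 0; j < dim; j++) next.push(0);
    let weightSum = 0;

    for (const p of points) {
      const dist = euclidean(p, current);
      // Simplification: skip points that coincide with the current estimate
      // so that we never divide by zero.
      if (dist < 1e-12) continue;
      const w = 1 / dist;
      weightSum += w;
      for (let j = 0; j < dim; j++) next[j] += w * p[j];
    }

    if (weightSum === 0) break; // every point equals the current estimate
    for (let j = 0; j < dim; j++) next[j] /= weightSum;

    const shift = euclidean(next, current);
    current = next;
    if (shift < tol) break; // estimate moved less than the tolerance
  }
  return current;
}
```

With four points on the unit square and one far outlier, the result stays inside the square, whereas the coordinate-wise mean is dragged toward the outlier. That is the robustness property the estimator relies on.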
Read more in the User Guide.
Python Reference
Constructors
constructor()
Signature
new TheilSenRegressor(opts?: object): TheilSenRegressor;
Parameters
Name | Type | Description |
---|---|---|
opts? | object | - |
opts.copy_X? | boolean | If true, X will be copied; else, it may be overwritten. Default Value true |
opts.fit_intercept? | boolean | Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations. Default Value true |
opts.max_iter? | number | Maximum number of iterations for the calculation of spatial median. Default Value 300 |
opts.max_subpopulation? | number | Instead of computing with a set of cardinality ‘n choose k’, where n is the number of samples and k is the number of subsamples (at least the number of features), consider only a stochastic subpopulation of a given maximal size if ‘n choose k’ is larger than max_subpopulation. For all but small problem sizes, this parameter determines memory usage and runtime if n_subsamples is not changed. Note that the data type should be int, but floats such as 1e4 are accepted too. Default Value 10000 |
opts.n_jobs? | number | Number of CPUs to use during the cross validation. undefined means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details. |
opts.n_subsamples? | number | Number of samples used to calculate the parameters. This is at least the number of features (plus 1 if fit_intercept=true) and at most the number of samples. A lower number leads to a higher breakdown point and lower efficiency, while a higher number leads to a lower breakdown point and higher efficiency. If undefined, the minimum number of subsamples is used, which gives maximal robustness. If n_subsamples is set to n_samples, Theil-Sen is identical to least squares. |
opts.random_state? | number | A random number generator instance to define the state of the random permutations generator. Pass an int for reproducible output across multiple function calls. See Glossary. |
opts.tol? | number | Tolerance when calculating spatial median. Default Value 0.001 |
opts.verbose? | boolean | Verbose mode when fitting the model. Default Value false |
Returns
TheilSenRegressor
Defined in: generated/linear_model/TheilSenRegressor.ts:25
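To make the defaults in the parameter table easier to see at a glance, the sketch below collects them in one object. `TheilSenOpts`, `DOCUMENTED_DEFAULTS`, and `withDocumentedDefaults` are hypothetical names introduced for illustration only. The constructor itself is documented as taking a plain `object`, and the real defaults are applied on the Python side.

```typescript
// Hypothetical type mirroring the documented constructor options.
interface TheilSenOpts {
  copy_X?: boolean;
  fit_intercept?: boolean;
  max_iter?: number;
  max_subpopulation?: number;
  n_jobs?: number;
  n_subsamples?: number;
  random_state?: number;
  tol?: number;
  verbose?: boolean;
}

// Defaults exactly as listed in the parameter table.
// Options with no listed default (n_jobs, n_subsamples, random_state) stay undefined.
const DOCUMENTED_DEFAULTS: TheilSenOpts = {
  copy_X: true,
  fit_intercept: true,
  max_iter: 300,
  max_subpopulation: 10000,
  tol: 0.001,
  verbose: false,
};

// Merge caller-supplied options over the documented defaults.
// Note: a key explicitly set to undefined overrides the default with undefined.
function withDocumentedDefaults(opts: TheilSenOpts = {}): TheilSenOpts {
  return { ...DOCUMENTED_DEFAULTS, ...opts };
}

// Example: a reproducible configuration with a tighter spatial-median tolerance.
const exampleOpts = withDocumentedDefaults({ random_state: 42, tol: 1e-4 });
// The resulting object could then be passed as `new TheilSenRegressor(exampleOpts)`.
```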
Properties
_isDisposed
boolean = false
Defined in: generated/linear_model/TheilSenRegressor.ts:23
_isInitialized
boolean = false
Defined in: generated/linear_model/TheilSenRegressor.ts:22
_py
PythonBridge
Defined in: generated/linear_model/TheilSenRegressor.ts:21
id
string
Defined in: generated/linear_model/TheilSenRegressor.ts:18
opts
any
Defined in: generated/linear_model/TheilSenRegressor.ts:19
Accessors
breakdown_
Approximated breakdown point.
Signature
breakdown_(): Promise<number>;
Returns
Promise<number>
Defined in: generated/linear_model/TheilSenRegressor.ts:349
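For intuition about the value this accessor reports, the sketch below evaluates one closed-form approximation of the breakdown point as a function of the sample and subsample counts. The formula is an assumption about how the underlying Python library approximates it and should be checked against the library source. `approxBreakdownPoint` is a hypothetical helper, not part of this class.

```typescript
// Assumed approximation of the breakdown point:
//   1 - (0.5^(1/k) * (n - k + 1) + k - 1) / n
// where n is the number of samples and k is the number of subsamples.
function approxBreakdownPoint(nSamples: number, nSubsamples: number): number {
  const root = Math.pow(0.5, 1 / nSubsamples);
  return 1 - (root * (nSamples - nSubsamples + 1) + nSubsamples - 1) / nSamples;
}

// Small subsets give a high breakdown point; using every sample gives almost none,
// which matches the trade-off described for n_subsamples.
const bpSmallSubsets = approxBreakdownPoint(1000, 2);
const bpAllSamples = approxBreakdownPoint(100, 100);
```

For k = 2 and large n, the value approaches 1 - 1/√2 (about 0.29), the classical breakdown point of the pairwise-slope Theil-Sen estimator.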
coef_
Coefficients of the regression model (median of distribution).
Signature
coef_(): Promise<ArrayLike>;
Returns
Promise<ArrayLike>
Defined in: generated/linear_model/TheilSenRegressor.ts:295
feature_names_in_
Names of features seen during fit. Defined only when X has feature names that are all strings.
Signature
feature_names_in_(): Promise<ArrayLike>;
Returns
Promise<ArrayLike>
Defined in: generated/linear_model/TheilSenRegressor.ts:457
intercept_
Estimated intercept of regression model.
Signature
intercept_(): Promise<number>;
Returns
Promise<number>
Defined in: generated/linear_model/TheilSenRegressor.ts:322
n_features_in_
Number of features seen during fit.
Signature
n_features_in_(): Promise<number>;
Returns
Promise<number>
Defined in: generated/linear_model/TheilSenRegressor.ts:430
n_iter_
Number of iterations needed for the spatial median.
Signature
n_iter_(): Promise<number>;
Returns
Promise<number>
Defined in: generated/linear_model/TheilSenRegressor.ts:376
n_subpopulation_
Number of combinations taken into account from ‘n choose k’, where n is the number of samples and k is the number of subsamples.
Signature
n_subpopulation_(): Promise<number>;
Returns
Promise<number>
Defined in: generated/linear_model/TheilSenRegressor.ts:403
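The relationship between ‘n choose k’, max_subpopulation, and the reported number of combinations can be sketched as follows. The helper names (`binomial`, `defaultNSubsamples`, `subpopulationSize`) are hypothetical and only illustrate the arithmetic described in the parameter documentation. The fitted value should always be read from the accessor itself.

```typescript
// Binomial coefficient "n choose k", computed multiplicatively.
// Exact while the result fits in a double; very large values overflow to Infinity.
function binomial(n: number, k: number): number {
  if (k < 0 || k > n) return 0;
  const kk = Math.min(k, n - k);
  let result = 1;
  for (let i = 1; i <= kk; i++) {
    result = (result * (n - kk + i)) / i;
  }
  return Math.round(result);
}

// Minimum subset size when n_subsamples is left undefined:
// number of features, plus 1 if an intercept is fitted, capped at the number of samples.
function defaultNSubsamples(nSamples: number, nFeatures: number, fitIntercept = true): number {
  return Math.min(nFeatures + (fitIntercept ? 1 : 0), nSamples);
}

// Number of subsets actually considered: all of them, unless the cap is reached.
function subpopulationSize(nSamples: number, nSubsamples: number, maxSubpopulation = 10000): number {
  return Math.min(binomial(nSamples, nSubsamples), maxSubpopulation);
}

// 10 samples, 1 feature with intercept: k = 2, so all 45 pairs are used.
const smallProblem = subpopulationSize(10, defaultNSubsamples(10, 1));
// 100 samples, 2 features with intercept: k = 3, 161700 subsets, capped at 10000.
const cappedProblem = subpopulationSize(100, defaultNSubsamples(100, 2));
```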
py
Signature
py(): PythonBridge;
Returns
PythonBridge
Defined in: generated/linear_model/TheilSenRegressor.ts:87
Signature
py(pythonBridge: PythonBridge): void;
Parameters
Name | Type |
---|---|
pythonBridge | PythonBridge |
Returns
void
Defined in: generated/linear_model/TheilSenRegressor.ts:91
Methods
dispose()
Disposes of the underlying Python resources.
Once dispose() is called, the instance is no longer usable.
Signature
dispose(): Promise<void>;
Returns
Promise<void>
Defined in: generated/linear_model/TheilSenRegressor.ts:150
fit()
Fit linear model.
Signature
fit(opts: object): Promise<any>;
Parameters
Name | Type | Description |
---|---|---|
opts | object | - |
opts.X? | ArrayLike[] | Training data. |
opts.y? | ArrayLike | Target values. |
Returns
Promise<any>
Defined in: generated/linear_model/TheilSenRegressor.ts:167
init()
Initializes the underlying Python resources.
This instance is not usable until the Promise returned by init() resolves.
Signature
init(py: PythonBridge): Promise<void>;
Parameters
Name | Type |
---|---|
py | PythonBridge |
Returns
Promise<void>
Defined in: generated/linear_model/TheilSenRegressor.ts:100
predict()
Predict using the linear model.
Signature
predict(opts: object): Promise<any>;
Parameters
Name | Type | Description |
---|---|---|
opts | object | - |
opts.X? | any | Samples. |
Returns
Promise<any>
Defined in: generated/linear_model/TheilSenRegressor.ts:209
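The documented lifecycle (init() before any other call, dispose() when finished) suggests a usage pattern along these lines. This is a sketch under assumptions: `Bridge` stands in for PythonBridge, `RegressorLike` is a hypothetical structural type covering only the four methods used, and `fitAndPredict` is an illustrative helper rather than part of the library. How a PythonBridge is obtained is not covered in this section, so it is passed in as a parameter.

```typescript
// Stand-in for PythonBridge; its real shape is not described in this section.
type Bridge = unknown;

// Hypothetical structural type with only the documented methods used below.
interface RegressorLike {
  init(py: Bridge): Promise<void>;
  fit(opts: { X?: number[][]; y?: number[] }): Promise<any>;
  predict(opts: { X?: number[][] }): Promise<any>;
  dispose(): Promise<void>;
}

// init -> fit -> predict, releasing the Python resources even if a step fails.
async function fitAndPredict(
  model: RegressorLike,
  py: Bridge,
  X: number[][],
  y: number[],
  XNew: number[][]
): Promise<any> {
  // The instance is not usable until init() resolves.
  await model.init(py);
  try {
    await model.fit({ X, y });
    return await model.predict({ X: XNew });
  } finally {
    // After dispose() the instance must not be used again.
    await model.dispose();
  }
}
```

Assuming a TheilSenRegressor instance is structurally compatible with `RegressorLike`, it could be passed as `model`. Accessors such as coef_ would have to be read before the `finally` block runs, since the instance is unusable after dispose().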
score()
Return the coefficient of determination of the prediction.
The coefficient of determination \(R^2\) is defined as \(1 - \frac{u}{v}\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative because the model can be arbitrarily worse. A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.
Signature
score(opts: object): Promise<number>;
Parameters
Name | Type | Description |
---|---|---|
opts | object | - |
opts.X? | ArrayLike[] | Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator. |
opts.sample_weight? | ArrayLike | Sample weights. |
opts.y? | ArrayLike | True values for X. |
Returns
Promise<number>
Defined in: generated/linear_model/TheilSenRegressor.ts:246
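The definition of \(R^2\) given for score() can be reproduced locally as a sanity check. `r2Score` below is a hypothetical, unweighted helper: it ignores sample_weight and does not guard against a constant y_true, for which \(v\) is zero.

```typescript
// Unweighted coefficient of determination: 1 - u / v.
function r2Score(yTrue: number[], yPred: number[]): number {
  let total = 0;
  for (const value of yTrue) total += value;
  const mean = total / yTrue.length;

  let u = 0; // residual sum of squares
  let v = 0; // total sum of squares
  for (let i = 0; i < yTrue.length; i++) {
    const residual = yTrue[i] - yPred[i];
    const deviation = yTrue[i] - mean;
    u += residual * residual;
    v += deviation * deviation;
  }
  return 1 - u / v;
}

const perfect = r2Score([1, 2, 3], [1, 2, 3]);      // predictions match exactly
const constantMean = r2Score([1, 2, 3], [2, 2, 2]); // always predicts the mean of y
const reversed = r2Score([1, 2, 3], [3, 2, 1]);     // worse than predicting the mean
```

The three cases correspond to the statements above: a perfect fit scores 1.0, a constant mean predictor scores 0.0, and a sufficiently bad model scores below zero.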