ColumnTransformer
Applies transformers to columns of an array or pandas DataFrame.
This estimator transforms different columns or column subsets of the input separately, then concatenates the features generated by each transformer into a single feature space. This is useful for heterogeneous or columnar data, where several feature extraction mechanisms or transformations need to be combined into a single transformer.
Read more in the User Guide.
Python Reference
Constructors
constructor()
Signature
new ColumnTransformer(opts?: object): ColumnTransformer;
Parameters
Name | Type | Description |
---|---|---|
opts? | object | - |
opts.n_jobs? | number | Number of jobs to run in parallel. undefined means 1 unless in a joblib.parallel\_backend context. \-1 means using all processors. See Glossary for more details. |
opts.remainder? | "drop" \| "passthrough" | By default (remainder='drop'), only the columns specified in transformers are transformed and combined in the output, and the non-specified columns are dropped. With remainder='passthrough', all remaining columns that were not specified in transformers but were present in the data passed to fit are passed through automatically. This subset of columns is concatenated with the output of the transformers. For dataframes, extra columns not seen during fit are excluded from the output of transform. When remainder is set to an estimator, the remaining non-specified columns are transformed by that estimator, which must support fit and transform. Note that using this feature requires the DataFrame columns passed to fit and transform to have identical order. Default Value: 'drop' |
opts.sparse_threshold? | number | If the output of the different transformers contains sparse matrices, these are stacked as a sparse matrix if the overall density is lower than this value. Use sparse\_threshold=0 to always return dense. When the transformed output consists of all dense data, the stacked result is dense and this keyword is ignored. Default Value: 0.3 |
opts.transformer_weights? | any | Multiplicative weights for features per transformer. The output of each transformer is multiplied by its weight. Keys are transformer names; values are the weights. |
opts.transformers? | any | List of (name, transformer, columns) tuples specifying the transformer objects to be applied to subsets of the data. |
opts.verbose? | boolean | If true, the time elapsed while fitting each transformer is printed as it completes. Default Value: false |
opts.verbose_feature_names_out? | boolean | If true, get\_feature\_names\_out prefixes every feature name with the name of the transformer that generated that feature. If false, get\_feature\_names\_out does not prefix any feature names and raises an error if feature names are not unique. Default Value: true |
Returns
ColumnTransformer
Defined in: generated/compose/ColumnTransformer.ts:25
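As a rough illustration of how remainder decides which columns reach the output, the sketch below resolves column indices in plain TypeScript. It does not call the library: resolveColumns and ColumnPlan are hypothetical names introduced only for this example, columns are assumed to be addressed by integer index, and an estimator-valued remainder is left out.

```typescript
// Hypothetical helper, NOT part of the ColumnTransformer API.
type Remainder = "drop" | "passthrough";

interface ColumnPlan {
  transformed: number[]; // columns claimed by some transformer, in listing order
  passedThrough: number[]; // unclaimed columns that are kept unchanged
}

function resolveColumns(
  nColumns: number,
  specified: number[][],
  remainder: Remainder = "drop",
): ColumnPlan {
  // Flatten the per-transformer column lists.
  const transformed: number[] = [];
  for (const cols of specified) {
    for (const c of cols) transformed.push(c);
  }

  // With 'drop' the unclaimed columns are discarded; with 'passthrough'
  // they are appended unchanged after the transformer outputs.
  const passedThrough: number[] = [];
  if (remainder === "passthrough") {
    for (let i = 0; i < nColumns; i++) {
      if (transformed.indexOf(i) === -1) passedThrough.push(i);
    }
  }
  return { transformed, passedThrough };
}

// Five input columns; two transformers claim columns [0, 1] and [3].
const plan = resolveColumns(5, [[0, 1], [3]], "passthrough");
console.log(plan.passedThrough); // columns 2 and 4 were never specified
```

With the default 'drop', the same call yields an empty passedThrough list, so only the transformer outputs survive.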
Properties
_isDisposed
boolean = false
Defined in: generated/compose/ColumnTransformer.ts:23
_isInitialized
boolean = false
Defined in: generated/compose/ColumnTransformer.ts:22
_py
PythonBridge
Defined in: generated/compose/ColumnTransformer.ts:21
id
string
Defined in: generated/compose/ColumnTransformer.ts:18
opts
any
Defined in: generated/compose/ColumnTransformer.ts:19
Accessors
n_features_in_
Number of features seen during fit. Only defined if the underlying transformers expose such an attribute when fit.
Signature
n_features_in_(): Promise<number>;
Returns
Promise<number>
Defined in: generated/compose/ColumnTransformer.ts:432
output_indices_
A dictionary from each transformer name to a slice, where the slice corresponds to indices in the transformed output. This is useful to inspect which transformer is responsible for which transformed feature(s).
Signature
output_indices_(): Promise<any>;
Returns
Promise<any>
Defined in: generated/compose/ColumnTransformer.ts:405
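To make the name-to-slice mapping concrete, here is a minimal sketch that derives such a mapping from the number of output features each transformer produces. The accessor itself resolves to any, so the { start, stop } shape used here is an assumption for illustration, and outputIndices is a hypothetical helper rather than a library function.

```typescript
// Hypothetical illustration; the real accessor returns Promise<any>.
interface Slice {
  start: number; // first output column owned by the transformer (inclusive)
  stop: number; // one past the last output column (exclusive)
}

function outputIndices(widths: Array<[string, number]>): Record<string, Slice> {
  const indices: Record<string, Slice> = {};
  let offset = 0;
  for (const [name, width] of widths) {
    // Outputs are concatenated in listing order, so each transformer
    // owns the next `width` columns.
    indices[name] = { start: offset, stop: offset + width };
    offset += width;
  }
  return indices;
}

// "num" emits 2 features and "cat" emits 3, for 5 output columns in total.
const idx = outputIndices([["num", 2], ["cat", 3]]);
console.log(idx["cat"]); // the slice covering the columns produced by "cat"
```

Given such a mapping, the transformer responsible for output column j is the one whose slice satisfies start <= j < stop.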
py
Signature
py(): PythonBridge;
Returns
PythonBridge
Defined in: generated/compose/ColumnTransformer.ts:73
Signature
py(pythonBridge: PythonBridge): void;
Parameters
Name | Type |
---|---|
pythonBridge | PythonBridge |
Returns
void
Defined in: generated/compose/ColumnTransformer.ts:77
sparse_output_
Boolean flag indicating whether the output of transform is a sparse matrix or a dense numpy array, which depends on the output of the individual transformers and the sparse\_threshold keyword.
Signature
sparse_output_(): Promise<boolean>;
Returns
Promise<boolean>
Defined in: generated/compose/ColumnTransformer.ts:378
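The rule behind this flag can be sketched as a density check over the per-transformer output blocks. The code below is an illustration under stated assumptions, not the library's implementation: BlockStats and isSparseOutput are hypothetical names, and dense blocks are assumed to count as fully populated when the overall density is computed.

```typescript
// Hypothetical summary of one transformer's output block.
interface BlockStats {
  sparse: boolean; // whether the block is a sparse matrix
  rows: number;
  cols: number;
  nonZero: number; // stored entries; only consulted for sparse blocks
}

function isSparseOutput(blocks: BlockStats[], sparseThreshold = 0.3): boolean {
  let anySparse = false;
  let stored = 0;
  let total = 0;
  for (const b of blocks) {
    const size = b.rows * b.cols;
    anySparse = anySparse || b.sparse;
    // Assumption: a dense block contributes all of its cells.
    stored += b.sparse ? b.nonZero : size;
    total += size;
  }
  // All-dense output stays dense and the threshold is ignored.
  if (!anySparse || total === 0) return false;
  // Stack as sparse only when the overall density is below the threshold.
  return stored / total < sparseThreshold;
}

const blocks: BlockStats[] = [
  { sparse: true, rows: 100, cols: 10, nonZero: 50 }, // e.g. a one-hot block
  { sparse: false, rows: 100, cols: 1, nonZero: 100 }, // e.g. a scaled column
];
console.log(isSparseOutput(blocks)); // density is 150 / 1100
console.log(isSparseOutput(blocks, 0)); // a threshold of 0 always yields dense
```

Because the comparison is strict, a threshold of 0 can never be undercut, which matches the documented way to force dense output.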
transformers_
The collection of fitted transformers as tuples of (name, fitted\_transformer, column). fitted\_transformer can be an estimator, 'drop', or 'passthrough'. If no columns were selected, this will be the unfitted transformer. If there are remaining columns, the final element is a tuple of the form ('remainder', transformer, remaining\_columns), corresponding to the remainder parameter. If there are remaining columns, then len(transformers\_)==len(transformers)+1; otherwise len(transformers\_)==len(transformers).
Signature
transformers_(): Promise<any[]>;
Returns
Promise<any[]>
Defined in: generated/compose/ColumnTransformer.ts:351
Methods
dispose()
Disposes of the underlying Python resources.
Once dispose() is called, the instance is no longer usable.
Signature
dispose(): Promise<void>;
Returns
Promise<void>
Defined in: generated/compose/ColumnTransformer.ts:138
fit()
Fit all transformers using X.
Signature
fit(opts: object): Promise<any>;
Parameters
Name | Type | Description |
---|---|---|
opts | object | - |
opts.X? | ArrayLike[] | Input data, of which specified subsets are used to fit the transformers. |
opts.y? | ArrayLike[] | Targets for supervised learning. |
Returns
Promise<any>
Defined in: generated/compose/ColumnTransformer.ts:155
fit_transform()
Fit all transformers, transform the data and concatenate results.
Signature
fit_transform(opts: object): Promise<ArrayLike>;
Parameters
Name | Type | Description |
---|---|---|
opts | object | - |
opts.X? | ArrayLike[] | Input data, of which specified subsets are used to fit the transformers. |
opts.y? | ArrayLike | Targets for supervised learning. |
Returns
Promise<ArrayLike>
Defined in: generated/compose/ColumnTransformer.ts:197
get_feature_names_out()
Get output feature names for transformation.
Signature
get_feature_names_out(opts: object): Promise<any>;
Parameters
Name | Type | Description |
---|---|---|
opts | object | - |
opts.input_features? | any | Input features. |
Returns
Promise<any>
Defined in: generated/compose/ColumnTransformer.ts:241
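The effect of verbose\_feature\_names\_out on the returned names can be sketched in a few lines. This is an illustration only: featureNamesOut is a hypothetical helper, and the double-underscore separator follows the scikit-learn convention, which this page does not itself state.

```typescript
// Hypothetical helper showing the naming rule, NOT the library call.
function featureNamesOut(
  perTransformer: Array<[string, string[]]>,
  verbose = true,
): string[] {
  const names: string[] = [];
  for (const [name, features] of perTransformer) {
    for (const f of features) {
      // Assumption: the prefix is joined with "__" as in scikit-learn.
      names.push(verbose ? name + "__" + f : f);
    }
  }
  if (!verbose) {
    // Without prefixes, duplicate names are an error.
    for (let i = 0; i < names.length; i++) {
      if (names.indexOf(names[i]) !== i) {
        throw new Error("Output feature names are not unique: " + names[i]);
      }
    }
  }
  return names;
}

// Two transformers that both emit a feature called "age".
const spec: Array<[string, string[]]> = [
  ["num", ["age"]],
  ["cat", ["age", "city"]],
];
const prefixed = featureNamesOut(spec);
console.log(prefixed); // prefixes keep the two "age" features distinct
```

Calling featureNamesOut(spec, false) on the same input throws, because the unprefixed name "age" would appear twice.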
init()
Initializes the underlying Python resources.
This instance is not usable until the Promise returned by init() resolves.
Signature
init(py: PythonBridge): Promise<void>;
Parameters
Name | Type |
---|---|
py | PythonBridge |
Returns
Promise<void>
Defined in: generated/compose/ColumnTransformer.ts:86
set_output()
Set the output container when "transform" and "fit\_transform" are called.
Calling set\_output will set the output of all estimators in transformers and transformers\_.
Signature
set_output(opts: object): Promise<any>;
Parameters
Name | Type | Description |
---|---|---|
opts | object | - |
opts.transform? | "default" \| "pandas" | Configure output of transform and fit\_transform. |
Returns
Promise<any>
Defined in: generated/compose/ColumnTransformer.ts:281
transform()
Transform X separately by each transformer, concatenate results.
Signature
transform(opts: object): Promise<ArrayLike>;
Parameters
Name | Type | Description |
---|---|---|
opts | object | - |
opts.X? | ArrayLike[] | The data to be transformed by subset. |
Returns
Promise<ArrayLike>
Defined in: generated/compose/ColumnTransformer.ts:316
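The "transform separately, then concatenate" behavior, together with the optional transformer\_weights scaling, can be sketched on plain number arrays. Everything below is a self-contained illustration rather than the wrapper's implementation: Step, selectColumns, and transformByColumn are hypothetical names, and each transformer is modeled as an ordinary function.

```typescript
type Matrix = number[][];

// Hypothetical stand-in for a fitted (name, transformer, columns) tuple.
interface Step {
  name: string;
  columns: number[];
  apply: (block: Matrix) => Matrix;
}

// Pick the requested columns from every row.
function selectColumns(X: Matrix, columns: number[]): Matrix {
  return X.map((row) => columns.map((c) => row[c]));
}

function transformByColumn(
  X: Matrix,
  steps: Step[],
  weights: Record<string, number> = {},
): Matrix {
  const out: Matrix = X.map((): number[] => []);
  for (const step of steps) {
    // Each transformer sees only its own column subset.
    const block = step.apply(selectColumns(X, step.columns));
    const w = weights[step.name] !== undefined ? weights[step.name] : 1;
    // Scale by the transformer's weight, then append side by side.
    block.forEach((row, i) => {
      for (const v of row) out[i].push(v * w);
    });
  }
  return out;
}

const X: Matrix = [
  [1, 10, 100],
  [2, 20, 200],
];
const steps: Step[] = [
  { name: "double", columns: [0], apply: (b) => b.map((r) => r.map((v) => v * 2)) },
  { name: "sum", columns: [1, 2], apply: (b) => b.map((r) => [r[0] + r[1]]) },
];
const result = transformByColumn(X, steps, { sum: 0.5 });
console.log(result); // one output column per step, with "sum" halved
```

The output has the same number of rows as X; its column count is the total number of features emitted across all steps.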