diff --git a/404.html b/404.html
index 16932f9cf..d56d937c8 100644
--- a/404.html
+++ b/404.html
@@ -13,10 +13,10 @@
-
-
-
-
+
+
+
+
@@ -34,7 +34,7 @@
mlr3pipelines
-0.5.0
+0.5.1
rhs
:: character
Ids of the 'right-hand-side' PipeOp
s that have some unconnected output channels and therefore act as Graph
output layer.
input
:: data.table
with columns name
(character
), train
(character
), predict
(character
), op.id
(character
), channel.name
(character
)
+
input
:: data.table
with columns name
(character
), train
(character
), predict
(character
), op.id
(character
), channel.name
(character
)
Input channels of the Graph
. For each channel lists the name, input type during training, input type during prediction,
PipeOp
$id
of the PipeOp
the channel pertains to, and channel name as the PipeOp
knows it.
output
:: data.table
with columns name
(character
), train
(character
), predict
(character
), op.id
(character
), channel.name
(character
)
+
output
:: data.table
with columns name
(character
), train
(character
), predict
(character
), op.id
(character
), channel.name
(character
)
Output channels of the Graph
. For each channel lists the name, output type during training, output type during prediction,
PipeOp
$id
of the PipeOp
the channel pertains to, and channel name as the PipeOp
knows it.
packages
:: character
@@ -153,7 +153,7 @@
phash
:: character(1)
Stores a checksum calculated on the Graph
configuration, which includes all PipeOp
hashes
except their $param_set$values
, and a hash of $edges
.
keep_results
:: logical(1)
+
keep_results
:: logical(1)
Whether to store intermediate results in the PipeOp
's $.result
slot, mostly for debugging purposes. Default FALSE
.
man
:: character(1)
Identifying string of the help page that shows with help()
.
logical(1)
) -> character
PipeOp
s. These are in the order in which the PipeOp
s were added if
sorted
is FALSE
, and topologically sorted if sorted
is TRUE
.
-add_pipeop(op, clone = TRUE)
+(PipeOp
| Learner
| Filter
| ...
, logical(1)
) -> self
Mutates Graph
by adding a PipeOp
to the Graph
. This does not add any edges, so the new PipeOp
will not be connected within the Graph
at first.
Instead of supplying a PipeOp
directly, an object that can naturally be converted to a PipeOp
can also
be supplied, e.g. a Learner
or a Filter
; see as_pipeop()
.
-The argument given as op
is always cloned; to access a Graph
's PipeOp
s by-reference, use $pipeops
.
+The argument given as op
is cloned if clone
is TRUE
(default); to access a Graph
's PipeOp
s
+by-reference, use $pipeops
.
Note that $add_pipeop()
is a relatively low-level operation; it is recommended to build graphs using %>>%
.
add_edge(src_id, dst_id, src_channel = NULL, dst_channel = NULL)
(character(1)
, character(1)
,
@@ -251,16 +252,19 @@
$predict
methods. See $packages
slot.
@@ -210,7 +210,7 @@ $train()
, because private$.train()
may theoretically be executed in a different R
-session (e.g. for parallelization).
$state
should furthermore always be set to something with copy-semantics, since it is never cloned. This is a limitation
not of PipeOp
or mlr3pipelines
, but of the way the system as a whole works, together with GraphLearner
and mlr3
.
input :: data.table
with columns name
(character
), train
(character
), predict
(character
)
+
input :: data.table
with columns name
(character
), train
(character
), predict
(character
)
Input channels of PipeOp
. Column name
gives the names (and order) of values in the list given to
$train()
and $predict()
. Column train
is the (S3) class that an input object must conform to during
training, column predict
is the (S3) class that an input object must conform to during prediction. Types
@@ -222,7 +222,7 @@
.train()
and .predict()
functions are called multiple times, once for each Multiplicity
element.
The type enclosed by square brackets indicates that only a Multiplicity
containing values of this type is accepted.
See Multiplicity
for more information.
output :: data.table
with columns name
(character
), train
(character
), predict
(character
)
+
output :: data.table
with columns name
(character
), train
(character
), predict
(character
)
Output channels of PipeOp
, in the order in which they will be given in the list returned by $train
and
$predict
functions. Column train
is the (S3) class that an output object must conform to during training,
column predict
is the (S3) class that an output object must conform to during prediction. The PipeOp
checks
@@ -400,9 +400,9 @@
PipeOpImpute$new(id, param_set = ParamSet$new(), param_vals = list(), whole_task_dependent = FALSE, packages = character(0), task_type = "Task")
id
:: character(1)
+
context_cols
:: character
Names of features being selected by the context_columns
parameter.
intasklayout
:: data.table
+
intasklayout
:: data.table
Copy of the training Task
's $feature_types
slot. This is used during prediction to ensure that
the prediction Task
has the same features, feature layout, and feature types as during training.
outtasklayout
:: data.table
+
outtasklayout
:: data.table
Copy of the trained Task
's $feature_types
slot. This is used during prediction to ensure that
the Task
resulting from the prediction operation has the same features, feature layout, and feature types as after training.
model
:: named list
@@ -178,17 +178,17 @@
PipeOpImpute
;
If this method is not overloaded, it defaults to selecting the columns of type indicated by the feature_types
construction argument.
.train_imputer(feature, type, context)
-(atomic
, character(1)
, data.table
) -> any
+(atomic
, character(1)
, data.table
) -> any
Abstract function that must be overloaded when inheriting.
Called once for each feature selected by affect_columns
to create the model entry to be used for private$.impute()
. This function
is only called for features with at least one non-missing value.
.train_nullmodel(feature, type, context)
-(atomic
, character(1)
, data.table
) -> any
+(atomic
, character(1)
, data.table
) -> any
Like .train_imputer()
, but only called for each feature that only contains missing values. This is not an abstract function
and, if not overloaded, gives a default response of 0
(integer
, numeric
), c(TRUE, FALSE)
(logical
), all available levels (factor
/ordered
),
or the empty string (character
).
.impute(feature, type, model, context)
-(atomic
, character(1)
, any
, data.table
) -> atomic
+(atomic
, character(1)
, any
, data.table
) -> atomic
Imputes the features. model
is the model created by private$.train_imputer()
. Default behaviour is to assume model
is an atomic vector
from which values are sampled to impute missing values of feature
. model
may have an attribute probabilities
for non-uniform sampling.
Task
has the same features, feature layout, and feature types as during training.
outtasklayout
:: data.table
+
outtasklayout
:: data.table
Copy of the trained Task
's $feature_types
slot. This is used during prediction to ensure that
the Task
resulting from the prediction operation has the same features, feature layout, and feature types as after training.
dt_columns
:: character
@@ -260,9 +260,9 @@
$state
slot. Works analogously to
private$.train_task()
. private$.predict_task()
should only be overloaded if private$.train_task()
is overloaded (i.e. private$.train_dt()
is not used).
.train_dt(dt, levels, target)
-(data.table
, named list
, any
) -> data.table
| data.frame
| matrix
+(data.table
, named list
, any
) -> data.table
| data.frame
| matrix
Train PipeOpTaskPreproc
on dt
, transform it and store a state in $state
. A transformed object must be returned
-that can be converted to a data.table
using as.data.table
. dt
does not need to be copied deliberately, it
+that can be converted to a data.table
using as.data.table
. dt
does not need to be copied deliberately, it
is possible and encouraged to change it in-place.
The levels
argument is a named list of factor levels for factorial or character features.
If the input Task
inherits from TaskSupervised
, the target
argument
@@ -271,9 +271,9 @@
PipeOpTaskPreproc
, together with private$.predict_dt()
and optionally
private$.select_cols()
; alternatively, private$.train_task()
and private$.predict_task()
can be overloaded.
.predict_dt(dt, levels)
-(data.table
, named list
) -> data.table
| data.frame
| matrix
+(data.table
, named list
) -> data.table
| data.frame
| matrix
Predict on new data in dt
, possibly using the stored $state
. A transformed object must be returned
-that can be converted to a data.table
using as.data.table
. dt
does not need to be copied deliberately, it
+that can be converted to a data.table
using as.data.table
. dt
does not need to be copied deliberately, it
is possible and encouraged to change it in-place.
The levels
argument is a named list of factor levels for factorial or character features.
This method can be overloaded when inheriting PipeOpTaskPreproc
, together with private$.train_dt()
and optionally
diff --git a/reference/PipeOpTaskPreprocSimple.html b/reference/PipeOpTaskPreprocSimple.html
index b6d1958a1..76d98bf9f 100644
--- a/reference/PipeOpTaskPreprocSimple.html
+++ b/reference/PipeOpTaskPreprocSimple.html
@@ -10,7 +10,7 @@
something that will be written into $state
(which must not be NULL), private$.transform() should modify its argument in-place;
it is called both during training and prediction.
-This inherits from PipeOpTaskPreproc and behaves essentially the same.'>
as.data.table(dict)
Dictionary
-> data.table::data.table
+
as.data.table(dict)
Dictionary
-> data.table::data.table
Returns a data.table
with column key
(character
).
-pipeline_bagging(graph, iterations = 10, frac = 0.7, averager = NULL)
+pipeline_bagging(
+ graph,
+ iterations = 10,
+ frac = 0.7,
+ averager = NULL,
+ replace = FALSE
+)
PipeOpClassifAvg$new(innum = 0, collect_multiplicity = FALSE, id = "classifavg", param_vals = list())
innum
:: numeric(1)
+
PipeOpClassifAvg$new(innum = 0, collect_multiplicity = FALSE, id = "classifavg", param_vals = list())
innum
:: numeric(1)
Determines the number of input channels.
If innum
is 0 (default), a vararg input channel is created that can take an arbitrary number of inputs.
collect_multiplicity
:: logical(1)
diff --git a/reference/mlr_pipeops_classweights.html b/reference/mlr_pipeops_classweights.html
index 90b232d1d..2398e797f 100644
--- a/reference/mlr_pipeops_classweights.html
+++ b/reference/mlr_pipeops_classweights.html
@@ -3,7 +3,7 @@
able to use for sample weighting. Sample weights are added to each sample according to the target class.
Only binary classification tasks are supported.
Caution: when constructed naively without parameter, the weights are all set to 1. The minor_weight parameter
-must be adjusted for this PipeOp to be useful.">
PipeOpClassWeights$new(id = "classweights", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of the resulting object, default "classweights"
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
PipeOpColApply$new(id = "colapply", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "colapply"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
Task
is based on the (column) name(s) of the return value of the applicator function,
prefixed with the original feature name separated by a dot (.
).
@@ -148,7 +148,7 @@ Calls map
on the data, using the value of applicator
as .f
and coerces the output via as.data.table
.
Calls map
on the data, using the value of applicator
as .f
and coerces the output via as.data.table
.
PipeOpCollapseFactors$new(id = "collapsefactors", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "collapsefactors"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
PipeOpColRoles$new(id = "colroles", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "colroles"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise
@@ -207,20 +207,20 @@
PipeOpCopy$new(outnum, id = "copy", param_vals = list())
outnum
:: numeric(1)
+
outnum
:: numeric(1)
Number of output channels, and therefore number of copies being made.
id
:: character(1)
Identifier of resulting object, default "copy"
.
PipeOpEncodeImpact$new(id = "encodeimpact", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "encodeimpact"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would
@@ -238,18 +238,19 @@
PipeOpImputeHist$new(id = "imputehist", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "imputehist"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
PipeOpImputeLearner$new(learner, id = NULL, param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "impute."
, followed by the id
of the Learner
.
learner
:: Learner
| character(1)
Learner
to wrap, or a string identifying a Learner
in the mlr3::mlr_learners
Dictionary
.
@@ -305,7 +305,7 @@
PipeOpImputeMean$new(id = "imputemean", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "imputemean"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
PipeOpImputeMedian$new(id = "imputemedian", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "imputemedian"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
train_log
:: data.table
with columns class
(character
), msg
(character
)
+
train_log
:: data.table
with columns class
(character
), msg
(character
)
Errors logged during training.
train_time
:: numeric(1)
Training time, in seconds.
predict_log
:: NULL
| data.table
with columns class
(character
), msg
(character
)
+
predict_log
:: NULL
| data.table
with columns class
(character
), msg
(character
)
Errors logged during prediction.
predict_time
:: NULL
| numeric(1)
Prediction time, in seconds.
PipeOpLearnerCV$new(learner, id = NULL, param_vals = list())
learner
:: Learner
Learner
to use for cross validation / prediction, or a string identifying a
+
learner
:: Learner
Learner
to use for cross validation / prediction, or a string identifying a
Learner
in the mlr3::mlr_learners
Dictionary
.
This argument is always cloned; to access the Learner
inside PipeOpLearnerCV
by-reference, use $learner
.
id
:: character(1)
@@ -148,11 +148,11 @@
$state
is set to the $state
slot of the Learner
object, together with the $state
elements inherited from the
PipeOpTaskPreproc
. It is a named list
with the inherited members, as well as:
model
:: any
Model created by the Learner
's $.train()
function.
train_log
:: data.table
with columns class
(character
), msg
(character
)
+
train_log
:: data.table
with columns class
(character
), msg
(character
)
Errors logged during training.
train_time
:: numeric(1)
Training time, in seconds.
predict_log
:: NULL
| data.table
with columns class
(character
), msg
(character
)
+
predict_log
:: NULL
| data.table
with columns class
(character
), msg
(character
)
Errors logged during prediction.
predict_time
:: NULL
| numeric(1)
Prediction time, in seconds.
PipeOpModelMatrix$new(id = "modelmatrix", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "modelmatrix"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
PipeOpMultiplicityExply$new(outnum, id = "multiplicityexply", param_vals = list())
outnum
:: numeric(1)
| character
+
outnum
:: numeric(1)
| character
Determines the number of output channels.
id
:: character(1)
Identifier of the resulting object, default "multiplicityexply"
.
PipeOpMultiplicityImply$new(innum = 0, id = "multiplicityimply", param_vals = list())
innum
:: numeric(1)
| character
+
innum
:: numeric(1)
| character
Determines the number of input channels.
If innum
is 0 (default), a vararg input channel is created that can take an arbitrary number
of inputs. If innum
is a character
vector, the number of input channels is the length of
diff --git a/reference/mlr_pipeops_mutate.html b/reference/mlr_pipeops_mutate.html
index b7877737c..a5236cf0b 100644
--- a/reference/mlr_pipeops_mutate.html
+++ b/reference/mlr_pipeops_mutate.html
@@ -1,6 +1,6 @@
PipeOpOVRUnite$new(id = "ovrunite", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of the resulting object, default "ovrunite"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
PipeOpQuantileBin$new(id = "quantilebin", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "quantilebin"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
PipeOpRandomProjection$new(id = "randomprojection", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "randomprojection"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that
@@ -234,6 +234,7 @@
PipeOpScaleRange$new(id = "scalerange", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "scalerange"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
PipeOpSmote$new(id = "smote", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "smote"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
PipeOpSpatialSign$new(id = "spatialsign", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "spatialsign"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
PipeOpTargetTrafoScaleRange$new(id = "targettrafoscalerange", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "targettrafoscalerange"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise
@@ -227,12 +227,12 @@
PipeOpTextVectorizer$new(id = "textvectorizer", param_vals = list())
id
:: character(1)
+
id
:: character(1)
Identifier of resulting object, default "textvectorizer"
.
param_vals
:: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list()
.
library("mlr3")
library("data.table")
# create some text data
-dt = data.table(
+dt = data.table(
txt = replicate(150, paste0(sample(letters, 3), collapse = " "))
)
task = tsk("iris")$cbind(dt)
@@ -361,6 +361,7 @@ Examples
#> Use 'as(., "TsparseMatrix")' instead.
#> See help("Deprecated") and help("Matrix-deprecated").
#> Species Petal.Length Petal.Width Sepal.Length Sepal.Width txt.d txt.n
+#> <fctr> <num> <num> <num> <num> <num> <num>
#> 1: setosa 1.4 0.2 5.1 3.5 1 1
#> 2: setosa 1.4 0.2 4.9 3.0 1 0
#> 3: setosa 1.3 0.2 4.7 3.2 0 0
@@ -373,6 +374,7 @@ Examples
#> 149: virginica 5.4 2.3 6.2 3.4 0 1
#> 150: virginica 5.1 1.8 5.9 3.0 0 0
#> txt.e txt.x txt.y txt.p txt.j txt.l txt.h txt.v txt.m txt.o txt.r txt.f
+#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 1 0 0 0 0 0 0 0 0 0 0 0
#> 2: 1 1 0 0 0 0 0 0 0 0 0 0
#> 3: 1 0 1 1 0 0 0 0 0 0 0 0
@@ -385,6 +387,7 @@ Examples
#> 149: 0 0 0 0 0 0 0 0 0 1 0 0
#> 150: 0 0 0 0 1 0 0 0 0 0 1 0
#> txt.k txt.t txt.s txt.u txt.b txt.z txt.c txt.q txt.g txt.w
+#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 0 0 0 0 0 0 0 0 0 0
#> 2: 0 0 0 0 0 0 0 0 0 0
#> 3: 0 0 0 0 0 0 0 0 0 0
@@ -400,15 +403,19 @@ Examples
one_line_of_iris = task$filter(13)
one_line_of_iris$data()
-#> Species Petal.Length Petal.Width Sepal.Length Sepal.Width txt
-#> 1: setosa 1.4 0.1 4.8 3 i k f
+#> Species Petal.Length Petal.Width Sepal.Length Sepal.Width txt
+#> <fctr> <num> <num> <num> <num> <char>
+#> 1: setosa 1.4 0.1 4.8 3 i k f
pos$predict(list(one_line_of_iris))[[1]]$data()
#> Species Petal.Length Petal.Width Sepal.Length Sepal.Width txt.d txt.n txt.e
+#> <fctr> <num> <num> <num> <num> <num> <num> <num>
#> 1: setosa 1.4 0.1 4.8 3 0 0 0
#> txt.x txt.y txt.p txt.j txt.l txt.h txt.v txt.m txt.o txt.r txt.f txt.k
+#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 0 0 0 0 0 0 0 0 0 0 1 1
#> txt.t txt.s txt.u txt.b txt.z txt.c txt.q txt.g txt.w
+#> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 0 0 0 0 0 0 0 0 0
diff --git a/reference/mlr_pipeops_threshold.html b/reference/mlr_pipeops_threshold.html
index 42432dcf3..3d67bacbb 100644
--- a/reference/mlr_pipeops_threshold.html
+++ b/reference/mlr_pipeops_threshold.html
@@ -1,7 +1,7 @@