creating objective function from a dataset #98
Comments
Hi @hududed, maybe something similar to this vignette from the old mlrMBO would work (https://mlrmbo.mlr-org.com/articles/supplementary/human_in_the_loop_MBO.html). Maybe you can provide some additional info. In general, for human in the loop BO within
library(mlr3mbo)
library(mlr3learners)
library(bbotk)
library(data.table)
set.seed(1)
# helper function to print xs
format_xs = function(xs) {
  # map_chr() is qualified with mlr3misc:: so that it works even if mlr3misc is not attached
  paste0(mlr3misc::map_chr(seq_along(xs), function(i) paste0(names(xs)[[i]], ": ", xs[[i]])), collapse = "; ")
}
# function that waits for evaluation and accepts user input
fun = function(xs) {
  y = readline(prompt = paste0("Evaluate: ", format_xs(xs), "\n", "y = "))
  list(y = as.numeric(y))
}
# objective
obfun = ObjectiveRFun$new(
  fun = fun,
  domain = ps(q = p_dbl(lower = -1, upper = 2), v = p_dbl(lower = -2, upper = 3)),
  codomain = ps(y = p_dbl(tags = "minimize")),
  properties = "noisy")
# instance
instance = OptimInstanceSingleCrit$new(
  objective = obfun,
  terminator = trm("evals", n_evals = 10))
# evaluate a custom design
design = data.table(q = c(0.5, 1, 2, -0.9, 1.8), v = c(-1.9, 1, 0, 2, 2.9))
instance$eval_batch(design)
# continue the optimization with mbo
opt("mbo")$optimize(instance)
when you run this code you will see that the evaluation of an
If this does not work for you, you can try to operate on the primitives of
# example with your data
# probably not meaningful because you did not provide much info
# assume maximization of ratio depending on power, time, pressure, resistance
set.seed(1)
data = data.table(read.csv("batch-ai-data.csv"))
search_space = ps(power = p_int(lower = 0, upper = 1000),
  time = p_int(lower = 0, upper = 5000),
  pressure = p_int(lower = 100, upper = 500),
  resistance = p_dbl(lower = 0, upper = 1))
codomain = ps(ratio = p_dbl(tags = "maximize"))
#data[, batch_nr := 1] # needed because Archive methods rely on it; assume data is the initial design
# construct the archive manually
archive = Archive$new(search_space = search_space, codomain = codomain)
# initialize archive with data
# archive$data = data
archive$add_evals(xdt = data[, c("power", "time", "pressure", "resistance")], ydt = data[, "ratio"])
# then work with the primitives as you would in `?bayesopt_ego`
# create a surrogate, acquisition function and acquisition function optimizer, for defaults, see `?mbo_defaults`
surrogate = srlrn(lrn("regr.km", control = list(trace = FALSE)), archive = archive) # GP
acq_function = acqf("ei", surrogate = surrogate) # EI
acq_optimizer = acqo(opt("random_search", batch_size = 1000),
  terminator = trm("evals", n_evals = 1000),
  acq_function = acq_function) # small random search
# now everything is initialized
# the following would be done repeatedly, i.e., this is now manually performing one iteration of the BO loop
acq_function$surrogate$update()
acq_function$update()
candidate = acq_optimizer$optimize() # tells you which candidate to evaluate
candidate
# proceed to "evaluate" the candidate (or any other point you want to) and update the archive manually
data_new = data.table(power = 370, time = 2779, pressure = 178, resistance = 0.05585319, ratio = 5)
#data_new[, batch_nr := archive$n_batch + 1] # we just evaluated a new point so we added the next batch
#archive$data = rbind(archive$data, data_new, fill = TRUE)
archive$add_evals(xdt = data_new[, c("power", "time", "pressure", "resistance")], ydt = data_new[, "ratio"])
# proceed to determine the best result manually (e.g., by surrogate prediction) ...
I hope you find this helpful.
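For the last step, one simple option is to take the best observed point. Below is a minimal base R sketch, assuming maximization of `ratio`. The data frame `evaluated` is a stand-in for the evaluated data (with mlr3mbo, the same information would live in `archive$data`), and its two rows are example values taken from this thread.

```r
# Evaluated points so far (stand-in for the archive contents)
evaluated = data.frame(
  power = c(370, 352),
  time = c(2779, 788),
  pressure = c(178, 160),
  resistance = c(0.05585319, 0.21729239),
  ratio = c(5, 9)
)

# Maximization: the best observed point is the row with the largest ratio
best = evaluated[which.max(evaluated$ratio), ]
print(best)
```

If the measurements are noisy, the best observed value is not necessarily the best configuration; predicting with the surrogate, as mentioned in the comment above, is the alternative.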
Yes, I think this is very helpful! Your second solution seems more appropriate, but let me try to expand with more info. I plan to train on the initial data set and propose the next N number of
I will try your solution and give updates soon, thanks!
@sumny So the second solution does propose a single
How do I work with the primitives of
Here is an example on how to work with the primitives of
# example with your data
# assume maximization of ratio depending on power, time, pressure, resistance
set.seed(1)
data = data.table(read.csv("batch-ai-data.csv"))
search_space = ps(power = p_int(lower = 0, upper = 1000),
  time = p_int(lower = 0, upper = 5000),
  pressure = p_int(lower = 100, upper = 500),
  resistance = p_dbl(lower = 0, upper = 1))
codomain = ps(ratio = p_dbl(tags = "maximize"))
#data[, batch_nr := 1] # needed because Archive methods rely on it; assume data is the initial design
# construct the archive manually
archive = Archive$new(search_space = search_space, codomain = codomain)
# initialize archive with data
# archive$data = data
archive$add_evals(xdt = data[, c("power", "time", "pressure", "resistance")], ydt = data[, "ratio"])
# then work with the primitives as you would in `?bayesopt_mpcl`
# create a surrogate, acquisition function and acquisition function optimizer, for defaults, see `?mbo_defaults`
surrogate = srlrn(lrn("regr.km", control = list(trace = FALSE)), archive = archive) # GP
acq_function = acqf("ei", surrogate = surrogate) # EI
acq_optimizer = acqo(opt("random_search", batch_size = 1000),
  terminator = trm("evals", n_evals = 1000),
  acq_function = acq_function) # small random search
q = 14 # we want 14 proposals
lie = data.table() # needed for constant liar
liar = mean # liar function, e.g., constant mean
# now everything is initialized
# the following would be done repeatedly, i.e., this is now manually performing one iteration of the BO loop
# ----- begin of loop
acq_function$surrogate$update()
acq_function$update()
candidate = acq_optimizer$optimize() # first candidate
# prepare lie objects
tmp_archive = archive$clone(deep = TRUE)
acq_function$surrogate$archive = tmp_archive
lie[, archive$cols_y := liar(archive$data[[archive$cols_y]])]
candidate_new = candidate
# obtain the other q-1 candidates using fake archive
for (i in seq_len(q)[-1L]) {
  tmp_archive$add_evals(xdt = candidate_new, xss_trafoed = transform_xdt_to_xss(candidate_new, tmp_archive$search_space), ydt = lie)
  # update all objects with lie and obtain new candidate
  acq_function$surrogate$update()
  acq_function$update()
  candidate_new = acq_optimizer$optimize()
  candidate = rbind(candidate, candidate_new)
}
acq_function$surrogate$archive = archive # reset the working archive to the actual one and not the temporary lie archive
# proceed to "evaluate" the candidates and update the archive manually
data_new = data.table(power = c(370, 352, ...),
  time = c(2779, 788, ...),
  pressure = c(178, 160, ...),
  resistance = c(0.05585319, 0.21729239, ...),
  ratio = c(5, 9, ...)) # evaluate all 14 candidates (indicated via dots)
#data_new[, batch_nr := archive$n_batch + 1] # we just evaluated a new batch
#archive$data = rbind(archive$data, data_new, fill = TRUE)
archive$add_evals(xdt = data_new[, c("power", "time", "pressure", "resistance")], ydt = data_new[, "ratio"])
# proceed to determine the best result manually (e.g., by surrogate prediction) ...
# ----- end of loop
This is actually plenty of code now. It might work for you, however, I will think about adding more higher level support for such human in the loop scenarios as yours (see issue #100). On a side note, another possibility might be to write the function evaluation of your
This way you would not have to manually work with the primitives on such a low level.
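The constant liar idea used in the loop above can also be illustrated without mlr3mbo. The following is a toy base R sketch, not how mlr3mbo implements it: the surrogate is a hypothetical inverse-distance-weighted mean with the nearest-neighbour distance as its "uncertainty", and the score is a UCB-style sum instead of a GP with EI. The data values, `predict_toy`, `score`, and `propose_batch` are all made up for illustration.

```r
set.seed(1)

# Toy 1-d data standing in for the real archive (hypothetical values)
observed = data.frame(x = c(0.1, 0.4, 0.9), y = c(1.0, 3.0, 2.0))

# Stand-in surrogate: inverse-distance-weighted mean,
# nearest-neighbour distance as a crude "uncertainty"
predict_toy = function(train, x_new) {
  d = abs(outer(x_new, train$x, "-")) # rows = candidates, cols = training points
  w = 1 / (d + 1e-8)
  list(mean = as.numeric(w %*% train$y) / rowSums(w),
    se = apply(d, 1, min))
}

# UCB-style score for maximization (kappa is an arbitrary choice)
score = function(train, x_new, kappa = 5) {
  p = predict_toy(train, x_new)
  p$mean + kappa * p$se
}

# Propose q points; after each proposal, pretend it was evaluated with the lie
propose_batch = function(observed, q, liar = mean, n_cand = 1000) {
  tmp = observed # temporary copy, the real data stays untouched
  lie = liar(observed$y) # constant lie computed once from the real outcomes
  proposals = numeric(0)
  for (i in seq_len(q)) {
    cand = runif(n_cand) # small random search over [0, 1]
    best = cand[which.max(score(tmp, cand))]
    proposals = c(proposals, best)
    tmp = rbind(tmp, data.frame(x = best, y = lie)) # add the fake evaluation
  }
  proposals
}

batch = propose_batch(observed, q = 4)
print(batch)
```

Because each proposal is added to the temporary data with the lie as its outcome, the uncertainty around it collapses and the next proposal is pushed elsewhere, which is what spreads the batch out. Only the real evaluations should be added to the actual data afterwards.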
Thanks for this. For single-point proposals, the loop to update the surrogate works for some learners e.g.
First, note that I updated the examples above to use
Regarding your errors.
lrn("regr.lightgbm")
(note the line
We should assert this in
If you still want to use regression models without an
Ah, I missed that edit! Thanks. OK, I may just look for those with both
Previously with mlrMBO we were able to instantiate the model with batch-ai-data.csv in initSMBO as design:
From the mlr3mbo docs it's clear how to create the objective function ObjectiveRFun$new(fun, domain, codomain) with an existing function fun. Is there an example of how to create the objective function without a known fun but instead with a dataset, as in initSMBO above?