bolero.behavior_search.ContextualBehaviorSearch

class bolero.behavior_search.ContextualBehaviorSearch[source]

Common interface for (contextual) behavior search.

__init__()

x.__init__(…) initializes x; see help(type(x)) for signature

get_args()

Get parameters for this estimator.

Returns:
params : mapping of string to any

Parameter names mapped to their values.
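One way such a name-to-value mapping could be produced is by inspecting the constructor signature. The following is only a sketch under that assumption: `ArgsMixin`, `ToySearch`, `step_size` and `max_episodes` are hypothetical names and are not taken from the library.

```python
import inspect


class ArgsMixin:
    """Hypothetical mixin, shown for illustration only."""

    def get_args(self):
        # Collect the constructor argument names, skipping `self`.
        signature = inspect.signature(type(self).__init__)
        names = [name for name in signature.parameters if name != "self"]
        # Assumes every constructor argument is stored under the same name.
        return {name: getattr(self, name) for name in names}


class ToySearch(ArgsMixin):
    def __init__(self, step_size=0.5, max_episodes=200):
        self.step_size = step_size
        self.max_episodes = max_episodes


params = ToySearch(step_size=0.1).get_args()
```

The sketch only works if each constructor argument is kept as an attribute with an identical name.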

get_behavior_from_results(result_path)[source]

Recover search state from file.

Parameters:
result_path : string

path in which we search for the file

get_best_behavior_template()[source]

Return current best estimate of contextual policy.

get_desired_context()[source]

Chooses desired context for next evaluation.

Returns:
context : array-like, shape (n_context_dims,), optional (default: None)

The context in which the next rollout shall be performed. If None, the environment may select the next context without any preferences.

get_next_behavior()[source]

Obtain next behavior for evaluation.

Returns:
behavior : Behavior

mapping from input to output

init(n_inputs, n_outputs, n_context_dims)[source]

Initialize the behavior search.

Parameters:
n_inputs : int

number of inputs of the behavior

n_outputs : int

number of outputs of the behavior

n_context_dims : int

number of context dimensions

is_behavior_learning_done()[source]

Check whether behavior learning is finished, e.g. because it has converged.

Returns:
finished : bool

Is the learning of a behavior finished?

set_context(context)[source]

Set context of next evaluation.

Note that the context set here is not necessarily the one that was requested by get_desired_context().

Parameters:
context : array-like, shape (n_context_dims,)

The context in which the next rollout will be performed
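The interplay between get_desired_context() and set_context() can be illustrated from the environment's side. The helper below is a hypothetical sketch, not part of the library: `resolve_context` and the feasible range [-1, 1] are assumptions chosen for illustration.

```python
import random


def resolve_context(desired, n_context_dims, rng):
    """Hypothetical environment-side helper: honor a request or pick freely."""
    if desired is None:
        # No preference was expressed, so the environment chooses on its own.
        return [rng.uniform(-1.0, 1.0) for _ in range(n_context_dims)]
    if len(desired) != n_context_dims:
        raise ValueError("expected %d context dimensions" % n_context_dims)
    # The environment may alter the request, here by clipping to [-1, 1].
    return [min(1.0, max(-1.0, float(c))) for c in desired]


rng = random.Random(0)
free_choice = resolve_context(None, 2, rng)
clipped = resolve_context([0.5, 3.0], 2, rng)
```

In the second call the requested value 3.0 lies outside the assumed range, so the context that would be passed to set_context() differs from the one that was requested.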

set_evaluation_feedback(feedbacks)[source]

Set feedback for the last behavior.

Parameters:
feedbacks : list of float

feedback for each step or for the whole episode, depending on the problem
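The methods documented so far suggest a call order for one learning episode: query the desired context, set the actual context, obtain a behavior, evaluate it, and report the feedback. The sketch below mimics that order with a toy stand-in. `ToySearch` is hypothetical and does not inherit from the library class; treating the behavior as a plain callable, using a fixed context of [1.0], and treating larger feedback as better are all assumptions made for brevity.

```python
import random


class ToySearch:
    """Hypothetical stand-in that mimics the documented method names."""

    def init(self, n_inputs, n_outputs, n_context_dims):
        self.n_outputs = n_outputs
        self.n_context_dims = n_context_dims
        self.rng = random.Random(0)
        self.best_weight, self.best_return = 0.0, float("-inf")
        self.episodes, self.max_episodes = 0, 200

    def get_desired_context(self):
        return None  # no preference: the environment picks the context

    def set_context(self, context):
        self.context = list(context)

    def get_next_behavior(self):
        # Propose a perturbed weight; the behavior scales the first context
        # dimension and ignores its inputs.
        self.weight = self.best_weight + self.rng.gauss(0.0, 0.5)
        output = self.weight * self.context[0]
        return lambda inputs: [output] * self.n_outputs

    def set_evaluation_feedback(self, feedbacks):
        # Keep the candidate only if its summed feedback beats the best so far.
        episode_return = sum(feedbacks)
        if episode_return > self.best_return:
            self.best_weight, self.best_return = self.weight, episode_return
        self.episodes += 1

    def is_behavior_learning_done(self):
        return self.episodes >= self.max_episodes


def run_episode(search):
    """Toy environment: the ideal output is twice the first context value."""
    desired = search.get_desired_context()
    context = [1.0] if desired is None else list(desired)
    search.set_context(context)
    behavior = search.get_next_behavior()
    output = behavior([0.0])[0]
    reward = -(output - 2.0 * context[0]) ** 2
    search.set_evaluation_feedback([reward])
    return reward


search = ToySearch()
search.init(n_inputs=1, n_outputs=1, n_context_dims=1)
returns = []
while not search.is_behavior_learning_done():
    returns.append(run_episode(search))
```

The toy search is a plain hill climber that compares returns directly, which is only meaningful here because the toy environment always uses the same context. A real contextual search would have to account for the context when comparing returns.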

write_results(result_path)[source]

Store current search state.

Parameters:
result_path : string

path in which the state should be stored
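A round trip between write_results() and get_behavior_from_results() might look like the following sketch. It assumes that result_path names a directory and that the state is serialized with pickle. Neither assumption is confirmed by this page, and `save_state`, `load_state` and the file name `search_state.pickle` are hypothetical.

```python
import os
import pickle
import tempfile

STATE_FILE = "search_state.pickle"  # hypothetical file name


def save_state(state, result_path):
    """Plays the role of write_results(): store the state under result_path."""
    os.makedirs(result_path, exist_ok=True)
    with open(os.path.join(result_path, STATE_FILE), "wb") as f:
        pickle.dump(state, f)


def load_state(result_path):
    """Plays the role of get_behavior_from_results(): read the state back."""
    with open(os.path.join(result_path, STATE_FILE), "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    state = {"best_weight": 1.75, "episodes": 42}
    save_state(state, tmp)
    restored = load_state(tmp)
```

Using the same result_path for both calls is what lets a later run pick up where an earlier one stopped.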