bolero.behavior_search.MonteCarloRL
bolero.behavior_search.MonteCarloRL(action_space, gamma=0.9, epsilon=0.1, random_state=None)
Tabular Monte Carlo is a model-free reinforcement learning method.
This class implements the epsilon-soft on-policy Monte Carlo control algorithm shown on page 120 of “Reinforcement Learning: An Introduction” (Sutton and Barto, 2nd edition, http://people.inf.elte.hu/lorincz/Files/RL_2006/SuttonBook.pdf). Both the action space and the state space must be discrete for this implementation.
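To make the algorithm named above concrete, here is a minimal, self-contained sketch of epsilon-soft on-policy Monte Carlo control on a toy problem. It does not use bolero and is not the library's implementation. The first-visit update, the random tie-breaking, the `step` callback, the `max_steps` cap, and the four-state corridor environment are all assumptions made for illustration; only `gamma=0.9` and `epsilon=0.1` mirror the defaults in the signature.

```python
import random
from collections import defaultdict


def epsilon_soft_action(q, state, actions, epsilon, rng):
    """Explore uniformly with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(q[(state, a)] for a in actions)
    # Break ties at random so an all-zero table does not favour one action.
    return rng.choice([a for a in actions if q[(state, a)] == best])


def mc_control(step, start_state, actions, n_episodes=500, gamma=0.9,
               epsilon=0.1, max_steps=50, seed=0):
    """First-visit on-policy Monte Carlo control (illustrative sketch).

    `step(state, action)` is a hypothetical environment callback that
    returns (next_state, reward, done).
    """
    rng = random.Random(seed)
    q = defaultdict(float)     # action-value estimates Q(s, a)
    visits = defaultdict(int)  # number of returns averaged into Q(s, a)
    for _ in range(n_episodes):
        # Roll out one episode with the current epsilon-soft policy.
        episode, state = [], start_state
        for _ in range(max_steps):
            action = epsilon_soft_action(q, state, actions, epsilon, rng)
            next_state, reward, done = step(state, action)
            episode.append((state, action, reward))
            state = next_state
            if done:
                break
        # Walk backwards through the episode, accumulating the return G.
        g = 0.0
        for t in range(len(episode) - 1, -1, -1):
            s, a, r = episode[t]
            g = gamma * g + r
            # First visit: update only if (s, a) does not occur earlier.
            if all((s, a) != (s2, a2) for s2, a2, _ in episode[:t]):
                visits[(s, a)] += 1
                q[(s, a)] += (g - q[(s, a)]) / visits[(s, a)]
    return q


def chain_step(state, action):
    """Toy corridor with states 0..3: action 1 moves right, action 0 left."""
    next_state = max(0, min(3, state + (1 if action == 1 else -1)))
    done = next_state == 3
    return next_state, (1.0 if done else 0.0), done


q = mc_control(chain_step, start_state=0, actions=[0, 1])
greedy = [max([0, 1], key=lambda a: q[(s, a)]) for s in range(3)]
```

Because the policy is derived from a lookup table keyed by `(state, action)`, the sketch only works when both spaces are discrete, which matches the restriction stated above.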
get_args()
Get parameters for this estimator.
get_behavior_from_results(result_path)
Recover search state from file.
get_best_behavior()
Returns the best behavior found so far.
get_next_behavior()
Obtain next behavior for evaluation.
init(n_inputs, n_outputs)
Initialize the behavior search.
is_behavior_learning_done()
Check if the value function converged.
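The page does not say how convergence of the value function is tested. One common approach, shown here purely as an assumption, is to compare two snapshots of a tabular value function and report convergence when no entry changed by more than a tolerance. The function name, the dictionary representation, and the `tol` default are hypothetical.

```python
def value_function_converged(q_old, q_new, tol=1e-4):
    """Return True if no table entry changed by tol or more (sketch).

    Both arguments are dicts mapping (state, action) to a value estimate;
    entries missing from one snapshot are treated as 0.0.
    """
    keys = set(q_old) | set(q_new)
    return all(abs(q_new.get(k, 0.0) - q_old.get(k, 0.0)) < tol for k in keys)


before = {("s0", 0): 0.50, ("s0", 1): 0.90}
barely_changed = {("s0", 0): 0.50, ("s0", 1): 0.90001}
still_learning = {("s0", 0): 0.70, ("s0", 1): 0.90}

stable = value_function_converged(before, barely_changed)
unstable = value_function_converged(before, still_learning)
```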
set_evaluation_feedback(feedbacks)
Set feedback for the last behavior.
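The method names on this page suggest an ask-and-tell loop: initialize the search, request a behavior, evaluate it, report the feedback, and repeat until learning is done. The sketch below illustrates that calling pattern with `ToySearch`, a hypothetical stand-in that only mirrors the method names listed here. It does not import bolero, and the representation of a behavior as a plain callable, the `evaluate` helper, and the treatment of `feedbacks` as a list of per-step rewards are assumptions.

```python
class ToySearch:
    """Hypothetical stand-in mirroring the method names on this page.

    It tries each discrete action once; it is not bolero's MonteCarloRL.
    """

    def __init__(self, action_space):
        self.action_space = list(action_space)

    def init(self, n_inputs, n_outputs):
        self.n_inputs, self.n_outputs = n_inputs, n_outputs
        self.pending = list(self.action_space)
        self.current = None
        self.best_action, self.best_return = None, float("-inf")

    def get_next_behavior(self):
        self.current = self.pending.pop(0)
        action = self.current
        return lambda inputs: action  # behavior that always picks `action`

    def set_evaluation_feedback(self, feedbacks):
        total = sum(feedbacks)  # feedback refers to the last behavior handed out
        if total > self.best_return:
            self.best_action, self.best_return = self.current, total

    def is_behavior_learning_done(self):
        return not self.pending

    def get_best_behavior(self):
        action = self.best_action
        return lambda inputs: action


def evaluate(behavior):
    """Hypothetical three-step episode; the reward is highest for action 2."""
    return [-abs(behavior(step) - 2) for step in range(3)]


search = ToySearch(action_space=[0, 1, 2, 3])
search.init(n_inputs=1, n_outputs=1)
while not search.is_behavior_learning_done():
    behavior = search.get_next_behavior()
    search.set_evaluation_feedback(evaluate(behavior))

best = search.get_best_behavior()
```

Keeping evaluation outside the search object means the same loop can drive any search that exposes these methods.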
write_results(result_path)
Store current search state.