bolero.behavior_search.MonteCarloRL
bolero.behavior_search.MonteCarloRL(action_space, gamma=0.9, epsilon=0.1, random_state=None)
Tabular Monte Carlo is a model-free reinforcement learning method.
This class implements the epsilon-soft on-policy Monte Carlo control algorithm shown on page 120 of “Reinforcement Learning: An Introduction” (Sutton and Barto, 2nd edition, http://people.inf.elte.hu/lorincz/Files/RL_2006/SuttonBook.pdf). Both the action space and the state space must be discrete for this implementation.
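The epsilon-soft on-policy control scheme described above can be sketched in plain Python without bolero. The following is a minimal first-visit sketch under stated assumptions: the three-step toy task (`run_episode`), the helper names `mc_control`, `policy`, and `greedy`, and the incremental-average update are illustrative choices, not bolero's implementation. Only `gamma` and `epsilon` mirror the constructor defaults shown in the signature.

```python
import random
from collections import defaultdict


def run_episode(policy, n_states=3):
    """Hypothetical toy task: the agent walks through states 0..n_states-1.

    In every state, action 1 yields reward 1 and action 0 yields reward 0;
    the state then advances regardless of the action taken.
    """
    trajectory = []
    for state in range(n_states):
        action = policy(state)
        reward = 1.0 if action == 1 else 0.0
        trajectory.append((state, action, reward))
    return trajectory


def mc_control(actions, n_episodes=500, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-soft on-policy first-visit Monte Carlo control (sketch)."""
    rng = random.Random(seed)
    Q = defaultdict(float)        # tabular action values Q[(state, action)]
    n_returns = defaultdict(int)  # number of returns averaged per (state, action)

    def greedy(state):
        return max(actions, key=lambda a: Q[(state, a)])

    def policy(state):
        # Epsilon-soft: every action keeps probability >= epsilon / len(actions).
        if rng.random() < epsilon:
            return rng.choice(actions)
        return greedy(state)

    for _ in range(n_episodes):
        trajectory = run_episode(policy)

        # Remember the first time step at which each (state, action) occurs.
        first_visit = {}
        for t, (s, a, _) in enumerate(trajectory):
            first_visit.setdefault((s, a), t)

        # Walk backwards to accumulate the discounted return G.
        G = 0.0
        for t in reversed(range(len(trajectory))):
            s, a, r = trajectory[t]
            G = gamma * G + r
            if first_visit[(s, a)] == t:
                n_returns[(s, a)] += 1
                # Incremental mean of all first-visit returns for (s, a).
                Q[(s, a)] += (G - Q[(s, a)]) / n_returns[(s, a)]
    return Q, greedy


Q, greedy = mc_control(actions=[0, 1])
print([greedy(s) for s in range(3)])
```

Because the policy is always epsilon-greedy with respect to the current `Q`, policy improvement happens implicitly after every episode; no separate policy table is needed in this sketch.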
get_args()
Get parameters for this estimator.
get_behavior_from_results(result_path)
Recover search state from file.
get_best_behavior()
Returns the best behavior found so far.
get_next_behavior()
Obtain next behavior for evaluation.
init(n_inputs, n_outputs)
Initialize the behavior search.
is_behavior_learning_done()
Check if the value function converged.
set_evaluation_feedback(feedbacks)
Set feedback for the last behavior.
write_results(result_path)
Store current search state.
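The methods listed above suggest an ask-and-tell loop: initialize the search, request a behavior, evaluate it, and report the feedback until learning is done. The sketch below illustrates one plausible call order; it is an assumption, not taken from bolero's own examples. `run_search`, `evaluate`, and `CountingStub` are hypothetical names, and the stub only records calls so that the example runs without bolero installed. The shape of `feedbacks` (a list of rewards here) and the string return values are likewise assumed.

```python
def run_search(search, evaluate, n_inputs, n_outputs, max_rollouts=100):
    """Drive any object that offers the methods listed above (duck-typed)."""
    search.init(n_inputs, n_outputs)
    n_rollouts = 0
    while n_rollouts < max_rollouts and not search.is_behavior_learning_done():
        behavior = search.get_next_behavior()
        feedbacks = evaluate(behavior)  # assumed: rewards collected in one rollout
        search.set_evaluation_feedback(feedbacks)
        n_rollouts += 1
    return search.get_best_behavior(), n_rollouts


class CountingStub:
    """Hypothetical stand-in, NOT the bolero class: it only records calls."""

    def __init__(self, done_after):
        self.done_after = done_after
        self.n_feedbacks = 0
        self.calls = []

    def init(self, n_inputs, n_outputs):
        self.calls.append("init")

    def get_next_behavior(self):
        self.calls.append("get_next_behavior")
        return "behavior-%d" % self.n_feedbacks

    def set_evaluation_feedback(self, feedbacks):
        self.calls.append("set_evaluation_feedback")
        self.n_feedbacks += 1

    def is_behavior_learning_done(self):
        # Pretend that learning has converged after a fixed number of rollouts.
        return self.n_feedbacks >= self.done_after

    def get_best_behavior(self):
        self.calls.append("get_best_behavior")
        return "best"


stub = CountingStub(done_after=3)
best, n_rollouts = run_search(stub, evaluate=lambda behavior: [0.0],
                              n_inputs=1, n_outputs=1)
print(n_rollouts, stub.calls)
```

Passing a real search object in place of the stub would require an `evaluate` function that runs the returned behavior in an environment; that part is omitted here because the source does not describe it.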