bolero.environment.Catapult
bolero.environment.Catapult(segments=10, catapult_pos=array([ 0., 0.]), velocity_penalty=0.1, context_distribution=None, context_interval=(2, 10), random_state=None, verbose=0)
Catapult environment, a benchmark for contextual policy search.
In this benchmark problem, the agent controls a catapult that shoots at specific target positions (the contexts) on a one-dimensional surface. The agent sets the parameters of the shot (velocity and angle of the catapult), and the environment simulates the shot. The actual position where the object hits the ground is not communicated to the agent. Instead, the agent is told only the cost of the trial, defined as cost = -abs(hit_position - target_position) - velocity_penalty * v, where target_position is the respective context, v is the velocity of the shot, and velocity_penalty is configurable. Both terms are non-positive for v >= 0, so values closer to zero correspond to better shots. Thus, this environment defines a contextual policy search problem.
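The cost formula above can be written out as a small standalone function. This is only a sketch of the stated formula, not bolero's implementation: the function name catapult_cost and the example numbers are hypothetical, and the hit position is passed in directly instead of being simulated.

```python
def catapult_cost(hit_position, target_position, v, velocity_penalty=0.1):
    """Cost as stated in the description (hypothetical helper, not part of bolero).

    cost = -abs(hit_position - target_position) - velocity_penalty * v
    """
    return -abs(hit_position - target_position) - velocity_penalty * v


# Illustrative values only: a target (context) at 4.0 and a shot velocity of 5.0.
# A perfect hit is still penalized for the velocity that was used.
perfect = catapult_cost(hit_position=4.0, target_position=4.0, v=5.0)

# Missing the target by 1.5 at the same velocity adds the distance term.
miss = catapult_cost(hit_position=5.5, target_position=4.0, v=5.0)

print(perfect, miss)
```

With the default velocity_penalty of 0.1, the agent is pushed toward the slowest shot that still reaches the target, since every unit of velocity is charged even when the hit is exact.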
See also
Bruno Castro da Silva, George Konidaris, Andrew Barto, “Active Learning of Parameterized Skills”, ICML 2014
__init__(segments=10, catapult_pos=array([ 0., 0.]), velocity_penalty=0.1, context_distribution=None, context_interval=(2, 10), random_state=None, verbose=0)
get_args()
Get parameters for this estimator.
request_context(context=None)
Request that a specific context is used.