python CartPole

import gym
import numpy

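def run_episode(env, params):
    # Assumes the classic Gym API (gym < 0.26): reset() returns only the
    # observation and step() returns a 4-tuple.
    observation = env.reset()
    totalreward = 0
    for _ in range(200):
        env.render()
        # Linear policy: push left (0) if the weighted sum of the
        # observation is negative, otherwise push right (1).
        action = 0 if numpy.matmul(params, observation) < 0 else 1
        observation, reward, done, info = env.step(action)
        totalreward += reward
        if done:
            print("Done. Observation:")
            print(observation)
            break

    return totalreward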

def train():
    env = gym.make('CartPole-v0')
    # env = gym.make("SpaceInvaders-v0")
    print(env.action_space.sample())

    noise_scaling = 0.1
    # Start from random weights in [-1, 1).
    params = numpy.random.rand(4) * 2 - 1
    # params = [ 0.83377038, 0.55692095, 0.87120624, 0.4070341 ]
    # params = [ 0.79753266, 0.53757924, 0.95562536, 0.41932016 ]
    best_reward = 0

    for _ in range(10000):
        # Hill climbing: perturb the best weights found so far and keep
        # the perturbed copy only if it earns a higher episode reward.
        new_params = params + (numpy.random.rand(4) * 2 - 1) * noise_scaling
        reward = run_episode(env, new_params)
        if reward > best_reward:
            best_reward = reward
            params = new_params
            print("==================================================================== New best: %d" % best_reward)
            print(params)
            # run_episode caps an episode at 200 steps, so 200 is the maximum reward.
            if reward == 200:
                break

train()
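
The search loop above can also be tried without Gym installed. The following is a minimal sketch of the same idea (a linear policy plus hill climbing with random noise), using only the standard library. The names `choose_action`, `hill_climb`, `toy_evaluate`, and `TOY_TARGET` are hypothetical and not part of the original script, and `toy_evaluate` is a made-up stand-in for `run_episode`, not a model of CartPole.

```python
import random


def choose_action(params, observation):
    """Linear policy: return 0 if the weighted sum is negative, else 1."""
    score = sum(p * o for p, o in zip(params, observation))
    return 0 if score < 0 else 1


def hill_climb(evaluate, n_params=4, noise_scaling=0.1,
               max_iters=10000, target=200, seed=0):
    """Keep a perturbed copy of the weights only when it scores higher."""
    rng = random.Random(seed)
    # Start from random weights in [-1, 1], as in train().
    params = [rng.uniform(-1, 1) for _ in range(n_params)]
    best_reward = 0
    for _ in range(max_iters):
        candidate = [p + rng.uniform(-1, 1) * noise_scaling for p in params]
        reward = evaluate(candidate)
        if reward > best_reward:
            best_reward, params = reward, candidate
            if reward >= target:
                break
    return params, best_reward


# Hypothetical target weights, used only by the toy objective below.
TOY_TARGET = [0.8, 0.5, 0.9, 0.4]


def toy_evaluate(params):
    """Stand-in for run_episode: approaches 200 as params approach TOY_TARGET."""
    dist2 = sum((p - t) ** 2 for p, t in zip(params, TOY_TARGET))
    return 200.0 / (1.0 + dist2)


if __name__ == "__main__":
    best_params, best = hill_climb(toy_evaluate, max_iters=500)
    print("best reward:", best)
    print("best params:", best_params)
```

To use it with a real environment, pass a function that wraps `run_episode(env, params)` as `evaluate`. Because a candidate is accepted only on a strict improvement, the search can stall on a plateau; that behavior matches the original loop.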
