modified README

rtz19970824 2018-02-24 16:26:19 +08:00
parent f3aee448e0
commit a40e5aec54


@@ -83,12 +83,6 @@ Try to use full names. Don't use abbreviations for class/function/variable names
The """xxx""" comment should be written right after the class/function definition. Also comment any part of the code that is not intuitive. We must comment, but for now we don't need to polish the comments.
# High Priority TODO
For Haosheng and Tongzheng: separate actor and critic, rewrite the interfaces for policy
Others can still focus on the task below.
## TODO
Search based method parallel.
@@ -106,6 +100,18 @@ Note: install openai/gym first to run the Atari environment; note that interface
Without preprocessing and other tricks, this example will not train to any meaningful results. Code should pass two tests: the individual module tests and a full run of this example code.
## Some bugs to fix
For DQN and other deterministic policies: should we use $\epsilon$-greedy or other exploration during collection?
In Batch.py, notice that we cannot stop by setting num_timestep.
Magic numbers
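
To make the exploration question concrete, here is a minimal sketch of $\epsilon$-greedy action selection for a policy that outputs one Q-value per action. The function name `epsilon_greedy` and its arguments are illustrative assumptions, not part of this repository's interface.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action with probability epsilon, else the greedy action.

    Hypothetical helper for illustration; `q_values` is a sequence with one
    estimated value per discrete action.
    """
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: index of the largest Q-value (the first one wins on ties).
    return max(range(len(q_values)), key=lambda action: q_values[action])


# With epsilon = 0 the choice is purely greedy.
greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

One possible convention is to call this with `epsilon > 0` while collecting data and with `epsilon=0.0` at evaluation time, so the deterministic policy itself stays unchanged.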
## One idea
Like zhusuan, we can register losses in the background so that we need not declare them in the example.
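
As a rough illustration of the idea, the sketch below uses a plain Python registry so that a loss is recorded when it is defined and later looked up by name. It does not reproduce zhusuan's actual mechanism; `register_loss`, `get_loss`, and `squared_error` are hypothetical names introduced only for this example.

```python
# Hypothetical registry: maps a loss name to the function that computes it.
_LOSSES = {}


def register_loss(name):
    """Decorator that records a loss function under `name` when it is defined."""
    def decorator(fn):
        _LOSSES[name] = fn
        return fn
    return decorator


def get_loss(name):
    """Look up a previously registered loss function by name."""
    return _LOSSES[name]


@register_loss("squared_error")
def squared_error(predictions, targets):
    """Mean squared error over two equal-length sequences of numbers."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Example code can now refer to the loss by name instead of constructing it.
loss_fn = get_loss("squared_error")
loss_value = loss_fn([1.0, 2.0], [1.0, 4.0])
```

With something like this, the example script would only name the loss it wants, and the library would own the definitions.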
## Dependency
TensorFlow (Version >= 1.4)
Gym