modified README

2018-02-24 16:26:19 +08:00 · a40e5aec54
commit a40e5aec54
parent f3aee448e0
1 changed file with 12 additions and 6 deletions
--- a/README.md
+++ b/README.md
@@ -83,12 +83,6 @@ Try to use full names. Don't use abbreviations for class/function/variable names
 The """xxx""" docstring should be written right after the class/function definition. Also comment any part of the code that is not intuitive. Comments are required, but for now they do not need to be polished.
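The commenting convention above can be illustrated with a minimal sketch. The class `ReplayBuffer` and its methods are hypothetical names chosen for this example only; they are not taken from the repository.

```python
class ReplayBuffer:
    """Store the most recent transitions (hypothetical example class)."""

    def __init__(self, capacity):
        """Create an empty buffer holding at most `capacity` transitions."""
        self.capacity = capacity
        self.transitions = []

    def add(self, transition):
        """Append a transition, discarding the oldest one when full."""
        # Not obvious at a glance: we drop from the front so the buffer
        # always keeps the most recent `capacity` transitions.
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
        self.transitions.append(transition)
```

Each docstring sits directly under its `class` or `def` line, and the inline comment is reserved for the one step that is not self-explanatory.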
 # High Priority TODO
 For Haosheng and Tongzheng: separate actor and critic, rewrite the interfaces for policy
 Others can still focus on the task below.
 ## TODO
 Parallelize search-based methods.
@@ -106,6 +100,18 @@ Note: install openai/gym first to run the Atari environment; note that interface
 Without preprocessing and other tricks, this example will not train to any meaningful result. Code should pass two tests: an individual module test and a run-through of this example code.
 ## Some bugs to fix
 For DQN and other deterministic policies: $\epsilon$-greedy or other exploration during collection?
 In Batch.py, note that we cannot stop by setting num_timestep
 Magic numbers
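For the exploration question above, a minimal sketch of $\epsilon$-greedy action selection is shown here. The function name `epsilon_greedy_action` and its arguments are assumptions made for illustration; this is not the repository's policy interface.

```python
import random


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one.

    q_values: sequence of estimated action values (hypothetical input).
    epsilon:  exploration probability in [0, 1].
    rng:      object providing random() and randrange(); defaults to the
              standard-library random module.
    """
    if rng.random() < epsilon:
        # Explore: choose uniformly among all actions.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda action: q_values[action])
```

With `epsilon=0.0` this reduces to the deterministic greedy policy, so the same function could serve both data collection and evaluation.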
 ## One idea
 Like zhusuan, we can register losses in the background so that we need not declare them in the example.
 ## Dependency
 TensorFlow (version >= 1.4)
 Gym