modified README

2018-02-24 16:26:19 +08:00 · 2018-02-24 16:26:19 +08:00 · a40e5aec54
commit a40e5aec54
parent f3aee448e0
1 changed files with 12 additions and 6 deletions
--- a/README.md
+++ b/README.md
@@ -83,12 +83,6 @@ Try to use full names. Don't use abbreviations for class/function/variable names

 The """xxx""" comment should be written right after the class/function definition. Also comment any part of the code that is not intuitive. Comments are required, but for now they do not need to be polished.

-# High Priority TODO
-
-For Haosheng and Tongzheng: separate actor and critic, rewrite the interfaces for policy
-
-Others can still focus on the task below.
-
 ## TODO
 Parallelize search-based methods.

@@ -106,6 +100,18 @@ Note: install openai/gym first to run the Atari environment; note that interface

 Without preprocessing and other tricks, this example will not train to any meaningful results. Code should pass two tests: the individual module tests and a full run of this example code.

+## Some bugs to fix
+
+For DQN and other deterministic policies: should $\epsilon$-greedy or other exploration be used during collection?
+
+In Batch.py, note that collection cannot currently be stopped by setting num_timestep.
+
+Magic numbers
+
+## One idea
+
+Like zhusuan, we can register losses in the background so that we need not declare them in the example.
+
 ## Dependency
 TensorFlow (version >= 1.4)
 Gym
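
On the open question above about exploration for DQN and other deterministic policies: one common option is to wrap the greedy choice in $\epsilon$-greedy at collection time. The following is only a minimal sketch under stated assumptions. `epsilon_greedy_action` is a hypothetical helper, not part of this repository, and it assumes the Q-values for the current observation are available as a plain Python sequence.

```python
import random


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Hypothetical helper: return an action index for one observation.

    With probability `epsilon` a uniformly random action is returned
    (exploration); otherwise the action with the largest Q-value is
    returned (exploitation). `rng` can be a seeded random.Random instance.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")
    num_actions = len(q_values)
    if num_actions == 0:
        raise ValueError("q_values must not be empty")
    if rng.random() < epsilon:
        # Explore: ignore the Q-values and sample uniformly.
        return rng.randrange(num_actions)
    # Exploit: index of the largest Q-value (first index wins on ties).
    return max(range(num_actions), key=lambda action: q_values[action])
```

With `epsilon=0.0` the helper reduces to the deterministic greedy policy, so the same function could serve both data collection (`epsilon > 0`) and evaluation (`epsilon = 0`).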