text generation using Estimator API


I have been trying to start transitioning to the Estimator API, since it is recommended by the TensorFlow people. However, I wonder how some of the basic stuff can be done efficiently within the Estimator framework. Over the weekend I tried to create a GRU-based model for text generation and followed the TensorFlow example for building custom estimators. I have been able to create a model that I can train relatively easily, and its results match the non-estimator version. However, for sampling (generating text) I ran into some trouble. I finally made it work, but it is very slow: every time it predicts a character, the Estimator framework loads the whole graph, which makes the whole thing slow. Is there a way to not load the graph every time, or any other solution?
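To make the problem concrete, the sampling loop looks roughly like the sketch below (the feature and prediction names such as "char_id", "state" and "next_id" are illustrative, not my exact code). Every character requires a separate predict() call, and each call rebuilds the graph and restores the checkpoint:

import numpy as np
import tensorflow as tf

def sample(estimator, start_id, state_size, num_chars):
    current_id = start_id
    state = np.zeros((1, state_size), dtype=np.float32)
    generated = [current_id]
    for _ in range(num_chars):
        input_fn = tf.estimator.inputs.numpy_input_fn(
            x={"char_id": np.array([[current_id]], dtype=np.int32),
               "state": state},
            num_epochs=1, shuffle=False)
        # Each predict() call creates a fresh Session and reloads the
        # checkpoint, which is what makes generation so slow.
        result = next(estimator.predict(input_fn=input_fn))
        current_id = int(result["next_id"])
        state = result["state"].reshape(1, -1)
        generated.append(current_id)
    return generated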
Second issue: I also had to use state_is_tuple=False, since I have to send the GRU state back and forth (between the model method and the generator method) and I can't send tuples. Does anyone know how to deal with this?
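One idea would be to keep state_is_tuple=True and pack/unpack the state tuple myself, roughly like the sketch below (assuming a MultiRNNCell of GRU cells; the layer and unit sizes are just placeholders):

import tensorflow as tf

NUM_LAYERS = 2
NUM_UNITS = 128

def pack_state(state_tuple):
    # tuple of NUM_LAYERS tensors [batch, NUM_UNITS]
    #   -> single tensor [batch, NUM_LAYERS * NUM_UNITS]
    return tf.concat(state_tuple, axis=1)

def unpack_state(packed):
    # single tensor [batch, NUM_LAYERS * NUM_UNITS]
    #   -> tuple of NUM_LAYERS tensors [batch, NUM_UNITS]
    return tuple(tf.split(packed, NUM_LAYERS, axis=1))

# Inside model_fn (PREDICT mode), roughly:
#   initial_state = unpack_state(features["state"])
#   outputs, final_state = tf.nn.dynamic_rnn(cell, inputs,
#                                            initial_state=initial_state)
#   predictions = {"next_id": ..., "state": pack_state(final_state)}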
thanks
P.S. Here is a link to my code example: https://github.com/amirharati/sample_estimator_charlm/blob/master/RnnLm.py
You can slightly abuse the functionality in tf.contrib.seq2seq for this purpose. Unfortunately this is quite involved and introduces tons of extra concepts. See here for a tutorial. The idea would be not to have an encoder-decoder like in the tutorial, but basically only a decoder that generates text "from nothing" (or some initial state). – xdurch0 Jul 11 at 7:27

thanks I will check it out – Amir Harati Jul 11 at 17:25
1 Answer
I finally had the time to try the suggestion by xdurch0. Here is the code in case anyone else has the same problem:
https://github.com/amirharati/sample_estimator_charlm/blob/master/RnnLmS2S.py
In short, it works, but it seems to take more time to converge relative to the simple RNN implementation (for example, the character-level model tends to generate repetitive patterns and has more difficulty learning what to emit, while the same approach works better for the word-level model).
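For anyone who just wants the shape of the idea without reading the whole file, the sampling part boils down to something like the sketch below (TF 1.x contrib API; names and sizes are illustrative, see the linked file for the actual code). The whole generation loop runs inside the graph, so predict() is called only once:

import tensorflow as tf

def sampling_subgraph(embedding, vocab_size, start_id, end_id,
                      batch_size, num_units, max_len=200):
    cell = tf.nn.rnn_cell.GRUCell(num_units)
    # Feeds each generated id back in as the next input; swap in
    # tf.contrib.seq2seq.SampleEmbeddingHelper to sample instead of argmax.
    helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
        embedding=embedding,
        start_tokens=tf.fill([batch_size], start_id),
        end_token=end_id)
    decoder = tf.contrib.seq2seq.BasicDecoder(
        cell=cell,
        helper=helper,
        initial_state=cell.zero_state(batch_size, tf.float32),
        output_layer=tf.layers.Dense(vocab_size))
    outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(
        decoder, maximum_iterations=max_len)
    # outputs.sample_id holds the generated character ids.
    return outputs.sample_id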