Post by jmewasiuk on Nov 26, 2016 19:53:06 GMT
Hi all,
So I have a preliminary model in TensorFlow for a multi-layer (stacked) LSTM network.
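To give a concrete picture of what I mean, here's a minimal sketch of that kind of stacked-LSTM graph (sizes and variable names are placeholders, not the final model):

```python
import tensorflow as tf

# Placeholder sizes -- swap in whatever we settle on.
vocab_size, embed_dim, hidden_units, num_layers = 8000, 128, 256, 2

inputs = tf.placeholder(tf.int32, [None, None])        # [batch, time] word ids
target = tf.placeholder(tf.int32, [None])              # next-word id per example

embedding = tf.get_variable("embedding", [vocab_size, embed_dim])
embedded = tf.nn.embedding_lookup(embedding, inputs)   # [batch, time, embed_dim]

# Stack LSTM cells into one multi-layer cell and unroll it over time.
cells = [tf.nn.rnn_cell.BasicLSTMCell(hidden_units, state_is_tuple=True)
         for _ in range(num_layers)]
stacked = tf.nn.rnn_cell.MultiRNNCell(cells, state_is_tuple=True)
outputs, _ = tf.nn.dynamic_rnn(stacked, embedded, dtype=tf.float32)

# Project the final timestep onto the vocabulary for next-word logits.
w = tf.get_variable("proj_w", [hidden_units, vocab_size])
b = tf.get_variable("proj_b", [vocab_size])
logits = tf.matmul(outputs[:, -1, :], w) + b
```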
Where I'm at now is formatting the input data.
An LSTM network basically processes sequences of chunks.
So for our case I figure that
scene (conversation) == whole sequence
each line == a chunk
so a scene is a sequence of lines
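Concretely, I'm picturing one parsed scene as something like this (speakers and words are made-up examples):

```python
# One scene == a sequence of (speaker, tokenized line) chunks.
scene = [
    ("ALICE", ["hey", "did", "you", "see", "that"]),
    ("BOB",   ["see", "what"]),
    ("ALICE", ["the", "thing", "outside"]),
]
```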
I think we want to create inputs (x's) and targets (t's) like so:
1. The t's depend on which character in the script we're training for, so we need a "target_id" -> the character's name should be good.
2. Format our sequence training so we train from the start to the finish of a scene (conversation).
Let tn = # of lines spoken by target_id (the character)
Let lti = length (in words) of line i spoken by target_id
Then for each scene we will have a number of "subscenes" that train the model on the probability of choosing the next word from our vocabulary given the sequence of words already used in the scene.
So... (not sure if I got all the indices accurately... let me know if you don't quite understand the sequence)
subscene 1: x1 = the chunks (lines) of words up to target_id's 1st line, t1 = 1st word of target_id's 1st line
subscene 2: x2 = x1 + t1, t2 = 2nd word of target_id's 1st line
...
subscene k-1: x_k-1 = the chunks (lines) of words up to target_id's i'th line plus the first lti - 1 words of that line, t_k-1 = the lti'th (last) word of target_id's i'th line
subscene k: x_k = the chunks (lines) of words up to target_id's (i+1)'th line, t_k = 1st word of target_id's (i+1)'th line
...
Stop when we reach the end of target_id's last line in the scene.
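Here's a rough sketch of how I'd build those (x, t) pairs for one scene, assuming the scene structure from above (a list of (speaker, words) tuples); make_subscenes is just a name I'm using here:

```python
def make_subscenes(scene, target_id):
    """Build (x, t) subscene pairs for one scene and one target character."""
    pairs = []
    context = []  # every word spoken so far in the scene, by any character
    for speaker, words in scene:
        if speaker == target_id:
            # One subscene per word of the target's line: predict word j
            # from the scene so far plus the first j words of this line.
            for j, word in enumerate(words):
                pairs.append((context + words[:j], word))
        context = context + words
    return pairs
```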
I figure with this type of training, we can give our model some line of text and then generate a number of lines (sentences) of varying lengths, up to some maximum, by looping: feed in the sequence of words generated so far and append the next predicted word. For a stopping condition, we track the probability of the last appended word and stop generating once the probability of every candidate word falls below some threshold, indicating we've reached "end of thought".
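A sketch of that generation loop, assuming a predict_next(seq) function that returns the model's probability distribution over the vocabulary (e.g. the softmax over the logits above); the threshold value is arbitrary:

```python
def generate_line(seed_words, predict_next, stop_prob=0.05, max_len=50):
    """Append predicted words until no candidate word is likely enough."""
    seq = list(seed_words)
    for _ in range(max_len):
        probs = predict_next(seq)  # distribution over the vocabulary
        best = max(range(len(probs)), key=lambda w: probs[w])
        if probs[best] < stop_prob:
            break                  # "end of thought" -- nothing is likely
        seq.append(best)
    return seq[len(seed_words):]   # just the newly generated words
```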
That all said, our raw input data needs to be consistently formatted so we can automate the separation of scenes, lines, and the characters who spoke them.
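For example, if we settle on a format like "NAME: dialogue" per spoken line with a blank line between scenes (just a proposal, not something we've agreed on), the parser could be as simple as:

```python
def parse_script(text):
    """Split raw script text into scenes of (speaker, words) chunks."""
    scenes, scene = [], []
    for raw in text.splitlines():
        raw = raw.strip()
        if not raw:              # blank line ends the current scene
            if scene:
                scenes.append(scene)
                scene = []
            continue
        speaker, _, line = raw.partition(":")
        scene.append((speaker.strip(), line.split()))
    if scene:
        scenes.append(scene)
    return scenes
```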
Forgot to add: I'm still working on coding up the per-epoch training loop for the model I have set up.
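Roughly, I'm aiming for something like this (building on the graph sketch above; batches, subscene_pairs, and epochs are placeholders for code that doesn't exist yet):

```python
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=logits, labels=target))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(epochs):
        # Sequences in a batch would need padding to a common length.
        for x_batch, t_batch in batches(subscene_pairs):  # hypothetical batcher
            _, l = sess.run([train_op, loss],
                            feed_dict={inputs: x_batch, target: t_batch})
```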