|
Post by dominic on Nov 28, 2016 20:43:01 GMT
My post got lost in the large post, so here is a repost from there. I can find more scripts if we need them, and if Umar's code works well for separating the scripts properly, it can fix up these scripts as well if that has to be done. Here is a link, yet again, to one of my OneDrive folders. I put a bunch of scripts there, and I hope the format is consistent enough to transcribe. 1drv.ms/f/s!AoNIuW-GkoxOgb5w8bOeMo4S3HWbxw Here is the site I got them from in case you guys want to get anything specific: www.imsdb.com
|
|
|
Post by dominic on Nov 28, 2016 23:51:37 GMT
I trained on the unformatted friends script last week, before we decided to use word by word instead of character by character. I let it train over the weekend and tried the results today. It took 28 hours to train with the Lua and Torch model, and the file was 6 MB. 1drv.ms/f/s!AoNIuW-GkoxOgb8Vlvi1zu7Nkw47qw In the output folder there are 3 files called friends###char.txt, where ### refers to the number of characters output. Some of the HTML tags are broken, and it mixed a lot of the dialogue with the [bracketed off scene actions], so if anything we can use this for the report or presentation to show the importance of formatting data correctly. Also, each of the friends outputs is the same except that more text is generated, so maybe if we used another checkpoint with less training it would output a different friends script.
|
|
|
Post by Umar on Nov 29, 2016 1:36:18 GMT
Hey Dominic, my Matlab script can format the other scripts too. The only prerequisite is that the input to the script is an Excel file with 2 columns: the first contains the name of the speaker and the second contains the sentence spoken. I wrote it this way because I thought it would take less time, and basically that was the first thought that came to my mind. I think this can be done by loading the file in Excel and separating based on some delimiter (space, tab, etc.).
I updated my script to extract all the bracketed () or [] scene information, so now it doesn't show up in the processed data.
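For anyone who wants to try the same idea outside Matlab, here is a rough Python sketch of the two steps described above: drop the bracketed () or [] scene information, then split each line into a speaker column and a sentence column. This is not my actual script. The function names are made up for illustration, the sample lines are invented, and it assumes each line looks like "Speaker: sentence" with a colon as the delimiter and no nested brackets.

```python
import re

# Matches (...) or [...] scene information; assumes brackets are not nested.
BRACKETED = re.compile(r"\([^)]*\)|\[[^\]]*\]")


def strip_scene_info(text):
    """Remove bracketed scene information and collapse leftover whitespace."""
    without_brackets = BRACKETED.sub(" ", text)
    return " ".join(without_brackets.split())


def split_line(line, delimiter=":"):
    """Return (speaker, sentence), or None if the line has no dialogue.

    The colon delimiter is an assumption about the script layout.
    """
    cleaned = strip_scene_info(line)
    if delimiter not in cleaned:
        return None
    speaker, sentence = cleaned.split(delimiter, 1)
    speaker, sentence = speaker.strip(), sentence.strip()
    if not speaker or not sentence:
        return None
    return speaker, sentence


def format_script(lines):
    """Build the two-column rows (speaker, sentence) from raw script lines."""
    rows = []
    for line in lines:
        row = split_line(line)
        if row is not None:
            rows.append(row)
    return rows


if __name__ == "__main__":
    # Invented sample lines, only to show the expected input shape.
    sample = [
        "[Scene: a coffee shop]",
        "Speaker A: Hello there! (waves)",
        "Speaker B: [sitting down] Hi.",
    ]
    for speaker, sentence in format_script(sample):
        print(speaker + "\t" + sentence)
```

The rows come out tab-separated, so they can be pasted into Excel as two columns. Lines that contain only scene information are skipped entirely.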
|
|
|
Post by dominic on Nov 29, 2016 2:41:35 GMT
Ok, sounds good. Should I re-upload the scripts in a space-delimited Excel file for you, or have you already done that?
|
|
|
Post by Umar on Nov 29, 2016 2:54:30 GMT
Assuming that you are talking about your scripts and not the friends script, yes, you need to convert them into an Excel format with two columns, like I have in the friends case.
|
|
|
Post by dominic on Nov 29, 2016 3:11:21 GMT
Ok I will go through the rest of the friends scripts and then do the movie scripts.
|
|
|
Post by dominic on Nov 29, 2016 7:24:52 GMT
The big post is now on page 3, so I am reposting a direct link to the friends data here so no one misses it.
1drv.ms/f/s!AoNIuW-GkoxOgb5w8bOeMo4S3HWbxw
|
|
|
Post by dominic on Nov 30, 2016 1:13:32 GMT
1drv.ms/f/s!AoNIuW-GkoxOgcJ9jvK3J544lBfMvw That's a direct link to the target and input data for the season splits.
|
|