Project Discussion

Umar
Administrator

Posts: 15

Project Discussion Nov 18, 2016 23:13:57 GMT

Post by Umar on Nov 18, 2016 23:13:57 GMT

Hi All,

As we discussed, our project is very similar to a chat bot, but with different personalities. The starting point would be to build and train a bot with one personality, then move on to the next level.

We can use data from movie scripts, and so far we are thinking about using personalities from the popular show FRIENDS.

Looking at the examples people have done, the chat bot can be put together using TensorFlow. It will be an RNN with an encoder-decoder structure. The following links are helpful.

lauragelston.ghost.io/speakeasy-pt1/
www.tensorflow.org/versions/r0.8/tutorials/seq2seq/index.html
www.wildml.com/2016/04/deep-learning-for-chatbots-part-1-introduction/
suriyadeepan.github.io/2016-06-28-easy-seq2seq/

Please add anything else you think of.

Thanks

dominic
New Member

Posts: 20

Project Discussion Nov 19, 2016 8:40:31 GMT

Post by dominic on Nov 19, 2016 8:40:31 GMT

Just posting some of the links I emailed for easy access.

Friends Scripts: www.friendstranscripts.tk
Wikipedia Database: en.wikipedia.org/wiki/Wikipedia:Database_download

Jennifer shared these links with us earlier that are really good for machine learning and natural language processing:
www.youtube.com/playlist?list=PLQVvvaa0QuDf2JswnfiGkliBInZnIC4HL
www.youtube.com/playlist?list=PLQVvvaa0QuDfKTOs3Keq_kaG2P55YRn5v

I will keep reading Umar's notes and hopefully we can have something done soon.

EDIT:

This guy does a lot of machine learning and natural language processing. I referenced his site for one of the assignments, and this page seems to have a lot of good information that we can use in the report, plus references that we can check out for more methods.
sebastianruder.com/word-embeddings-1/

Last Edit: Nov 21, 2016 8:59:18 GMT by dominic

jmewasiuk
New Member

Posts: 23

Project Discussion Nov 22, 2016 5:56:23 GMT

Post by jmewasiuk on Nov 22, 2016 5:56:23 GMT

I found this while going through the links you gave
www.wildml.com/2016/07/deep-learning-for-chatbots-2-retrieval-based-model-tensorflow/

It's part 2 of the link you posted above.

Is this the chat bot you were proposing to do?

Jen

dominic
New Member

Posts: 20

Project Discussion Nov 22, 2016 9:28:18 GMT

Post by dominic on Nov 22, 2016 9:28:18 GMT

Still trying to install all the Lua and CUDA dependencies. I will continue tomorrow and hopefully have something for Wednesday. In the meantime, here is the link I was talking about; you can all look it over if you have some free time.

karpathy.github.io/2015/05/21/rnn-effectiveness/

Inside the page is a link to his GitHub with all the code: github.com/karpathy/char-rnn

He has a Python version that I will try if this doesn't work, but it seems like the Lua version is improved with CUDA support.

EDIT:

Just as I posted this I saw your post, Jennifer. I think I looked over that page as well, but I will try to get this working and let you know.

Last Edit: Nov 22, 2016 9:29:52 GMT by dominic

Umar
Administrator

Posts: 15

Project Discussion Nov 22, 2016 9:44:50 GMT

Post by Umar on Nov 22, 2016 9:44:50 GMT

Nov 22, 2016 5:56:23 GMT jmewasiuk said:

I found this while going through the links you gave
www.wildml.com/2016/07/deep-learning-for-chatbots-2-retrieval-based-model-tensorflow/

It's part 2 of the link you posted above.

Is this the chat bot you were proposing to do?

Jen

Yes, I was thinking of doing something like this.

saad
New Member

Posts: 12

Project Discussion Nov 22, 2016 10:07:51 GMT

Post by saad on Nov 22, 2016 10:07:51 GMT

Good.

saad
New Member

Posts: 12

Project Discussion Nov 22, 2016 10:08:47 GMT

Post by saad on Nov 22, 2016 10:08:47 GMT

github.com/gunthercox/ChatterBot
www.quora.com/What-is-the-best-way-to-learn-and-write-a-AI-Chat-bot

saad
New Member

Posts: 12

Project Discussion Nov 22, 2016 10:21:19 GMT

Post by saad on Nov 22, 2016 10:21:19 GMT

github.com/glenncameronjr/deeplearning-chatbot
www.wildml.com/

dominic
New Member

Posts: 20

Project Discussion Nov 23, 2016 1:01:58 GMT

Post by dominic on Nov 23, 2016 1:01:58 GMT

Just an update on my part. I am having issues getting the graphics drivers working within my VirtualBox VM. I am stuck in a login loop, if anyone knows how to resolve it. I am looking through troubleshooting guides online and hope to find a fix soon, but if anyone gets any of the chatbots working, let us know.

EDIT:

1drv.ms/i/s!AoNIuW-GkoxOgbwJ2M66VwUBl1QQ5A

Here is a screenshot from the training so far on the CPU. It takes about half a second per iteration, with about 21,000 iterations in total, on a 1 MB file containing lines from Shakespeare. I will try to get the GPU working somehow, because training will most likely take way too long on our file.
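
As a rough check on the numbers above, here is a minimal sketch of the time estimate. It assumes the half second applies to each of the roughly 21,000 steps and that the time per step stays constant; the function name is made up for illustration.

```python
def estimated_training_hours(seconds_per_step: float, total_steps: int) -> float:
    """Return the total training time in hours, assuming a constant time per step."""
    return seconds_per_step * total_steps / 3600.0


# Figures quoted in the post: about 0.5 s per step, about 21,000 steps.
cpu_hours = estimated_training_hours(0.5, 21000)
print(f"Estimated CPU training time: {cpu_hours:.1f} hours")
```

Under those assumptions the run comes out at a little under three hours on the CPU.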

Last Edit: Nov 23, 2016 2:10:15 GMT by dominic

dominic
New Member

Posts: 20

Project Discussion Nov 23, 2016 6:25:37 GMT

Post by dominic on Nov 23, 2016 6:25:37 GMT

I tried to get my GPU working through the virtual machine. Even though a new extension for PCI passthrough is included with the latest version of VirtualBox, it does not support all cards, mine included. However, I will run the software to train on the Shakespeare data and see how similar the results are to the author's. I estimate it will take 3 hours to train, so hopefully I will update you guys tonight in a few hours. It looks like I would have to dual boot with Ubuntu to use my hardware; Jennifer mentioned she already has a dual boot with a better graphics card, so we can most likely just use that.

However, if we can't get the chat bot working properly, with user input and the bot responding, we can change the project slightly by using machine learning to see how well it can write new episodes for TV shows. This would slightly change the project from the original idea, but we can do some more research on recurrent neural networks, as there are a lot of sources on the page that we can refer to for the graduate requirements of the project. We would have to compile all the Friends data into one file like this one: raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt, and then feed it into the algorithm. The author mentions the Shakespeare data is about 1 MB, and there is just over 6 MB of data in the Friends scripts. Luckily, he explains how to deal with larger amounts of data in his code and specifically mentions 1 MB and 6 MB.

Otherwise we can use Saad's project idea and go with that. We can discuss this all tomorrow after class and make a final decision before submitting that proposal.

saad
New Member

Posts: 12

Project Discussion Nov 23, 2016 9:28:21 GMT

Post by saad on Nov 23, 2016 9:28:21 GMT

Plan B (In case: Plan A does not work)

Guys, today I am still thinking if we can use NLP, Deep Learning for Chatbot iaproject. If we somehow get a pre-trained chatbot (just like Dr. Mori shared with us in assignment 3), it will be great.

Here is the link for Plan B. The goal is to track the players using the boosted particle filter. The particle filter is an example of Bayesian filtering (HMM) and is used for tracking multiple objects in a scene. The PAMI Transactions paper is listed on his webpage.

www.cs.ubc.ca/~vailen/publications.shtml

dominic
New Member

Posts: 20

Project Discussion Nov 23, 2016 9:35:39 GMT

Post by dominic on Nov 23, 2016 9:35:39 GMT

It just finished training and I managed to grab some outputs. Click on the link to my OneDrive and go to the outputs folder to see the text outputs.

1drv.ms/f/s!AoNIuW-GkoxOgbwwLyn-wbwqxszciw

I put two text and image files in there for convenience. All the trained data is in the CV folder, as the algorithm makes a checkpoint every 1000 iterations. The naming convention, for the example 'lm_lstm_epoch2.36_1.7675.t7', is as follows: 2.36 is the current epoch (two full passes done and working on a third) and 1.7675 is the validation loss at that moment. The validation loss improves to about 1.4 by the final epoch. If you wish to try it out with the pretrained data, follow the instructions at the following link. One note: if we want to use the GPU, we would have to retrain with the GPU, otherwise there are some compatibility errors.

github.com/karpathy/char-rnn
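
The checkpoint naming convention described above can be read programmatically, which may help when picking a checkpoint to sample from. This is a minimal sketch that only assumes file names shaped like the one example in the post; the helper names and the second file name in the usage lines are hypothetical.

```python
import re
from typing import Iterable, NamedTuple


class Checkpoint(NamedTuple):
    epoch: float     # fractional epoch at which the checkpoint was written
    val_loss: float  # validation loss at that moment


# Matches names shaped like 'lm_lstm_epoch2.36_1.7675.t7'.
_PATTERN = re.compile(r"epoch(?P<epoch>\d+\.\d+)_(?P<loss>\d+\.\d+)\.t7$")


def parse_checkpoint_name(filename: str) -> Checkpoint:
    """Extract the epoch and validation loss encoded in a checkpoint file name."""
    match = _PATTERN.search(filename)
    if match is None:
        raise ValueError(f"unrecognised checkpoint name: {filename!r}")
    return Checkpoint(float(match["epoch"]), float(match["loss"]))


def best_checkpoint(filenames: Iterable[str]) -> str:
    """Return the file name with the lowest validation loss."""
    return min(filenames, key=lambda name: parse_checkpoint_name(name).val_loss)


example = parse_checkpoint_name("lm_lstm_epoch2.36_1.7675.t7")
print(example.epoch, example.val_loss)

# The second name below is hypothetical, made up only to show the comparison.
print(best_checkpoint(["lm_lstm_epoch2.36_1.7675.t7", "lm_lstm_epoch50.00_1.4000.t7"]))
```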

jmewasiuk
New Member

Posts: 23

Project Discussion Nov 24, 2016 0:48:14 GMT

Post by jmewasiuk on Nov 24, 2016 0:48:14 GMT

So I'm running training on my GPU. So far it has motored through 12 epochs in somewhere between 5 and 10 minutes (I don't know the exact time, as I didn't think to record it until just now; ~0.12 s/batch, 6000/21150. D'oh, it's on epoch 14 now). This is the Lua/Torch example.
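
For anyone who wants to turn those progress numbers into an epoch count and a time remaining, here is a small sketch. The batch counts and the time per batch come from the post; the total of 50 epochs is an assumption (it is not stated in the thread), and the function name is made up.

```python
def progress_summary(done_batches: int, total_batches: int,
                     seconds_per_batch: float, total_epochs: int):
    """Return (current fractional epoch, minutes remaining) for a training run."""
    batches_per_epoch = total_batches / total_epochs
    current_epoch = done_batches / batches_per_epoch
    remaining_minutes = (total_batches - done_batches) * seconds_per_batch / 60.0
    return current_epoch, remaining_minutes


# 6000 of 21150 batches at ~0.12 s/batch; 50 total epochs is an ASSUMPTION.
epoch_now, minutes_left = progress_summary(6000, 21150, 0.12, 50)
print(f"epoch {epoch_now:.2f}, about {minutes_left:.0f} minutes left")
```

With the assumed 50 epochs, 6000 batches lands just past epoch 14, which agrees with the epoch count reported in the post.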

dominic
New Member

Posts: 20

Project Discussion Nov 24, 2016 4:10:11 GMT

Post by dominic on Nov 24, 2016 4:10:11 GMT

OK great. Umar and I are in the lab installing Ubuntu on my lab computer because it has an NVIDIA GT 640, so we can use that here to train models as well.

EDIT:

I concatenated all the Friends data and I will try to train on it tomorrow, unless you would like to train it, Jennifer. Here is a link to my updated OneDrive folder with the Friends texts. They are in HTML format, so hopefully that won't be an issue.

1drv.ms/f/s!AoNIuW-GkoxOgbwInz8fWG8DB78RRw
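
If the HTML does turn out to be an issue, one option is to strip the markup before concatenating. The following is a minimal sketch using only the Python standard library; it assumes each transcript is a plain HTML page with the dialogue in paragraph tags, which has not been checked against the actual files. The function names and any paths passed in are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path
from typing import Iterable


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "br":
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag == "p":
            self.chunks.append("\n")  # end of paragraph becomes a line break

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML page, one non-empty line per paragraph."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.chunks).splitlines())
    return "\n".join(line for line in lines if line)


def concatenate_transcripts(html_paths: Iterable[Path], out_path: Path) -> None:
    """Write the plain text of every HTML file into one training file."""
    with out_path.open("w", encoding="utf-8") as out:
        for path in html_paths:
            html = path.read_text(encoding="utf-8", errors="replace")
            out.write(html_to_text(html) + "\n\n")


sample = "<html><body><p><b>Monica:</b> Hi!</p><p>Joey: Hey &amp; hello.</p></body></html>"
print(html_to_text(sample))
```

The result is a single plain-text file in the same spirit as the tinyshakespeare input.txt, which is the shape of input the training script was run on earlier in the thread.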

Last Edit: Nov 24, 2016 9:31:49 GMT by dominic

john
New Member

Posts: 3

Project Discussion Nov 24, 2016 19:25:40 GMT via mobile

Post by john on Nov 24, 2016 19:25:40 GMT

Hi guys,

So I've looked into possible web frameworks to use for our input and output. The one I found is called Flask. This framework is simple and seems to have enough functionality for our purposes. I'm working on making an HTML page for input, and then another HTML page for the output. I'm at work currently, but I'll hopefully be able to get a fully working version up by tonight or tomorrow morning.

John
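
To make the input page and output page idea concrete, here is a minimal Flask sketch. It is not John's implementation: the routes, the page markup, and the generate_reply placeholder are all assumptions for illustration, and the placeholder would be replaced by a call into whichever trained model the group ends up using.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Input page: a single text box that posts to /reply.
INPUT_PAGE = """<!doctype html>
<title>Chat bot input</title>
<form method="post" action="/reply">
  <input type="text" name="message" autofocus>
  <input type="submit" value="Send">
</form>
"""

# Output page: shows the user's message and the bot's reply.
OUTPUT_PAGE = """<!doctype html>
<title>Chat bot output</title>
<p>You: {{ message }}</p>
<p>Bot: {{ reply }}</p>
<a href="/">Back</a>
"""


def generate_reply(message: str) -> str:
    """Hypothetical placeholder; a trained model would be called here."""
    return f"(no model yet) you said: {message}"


@app.route("/")
def input_page():
    return render_template_string(INPUT_PAGE)


@app.route("/reply", methods=["POST"])
def output_page():
    message = request.form.get("message", "")
    return render_template_string(
        OUTPUT_PAGE, message=message, reply=generate_reply(message)
    )
```

Saved as, say, app.py (a hypothetical file name), this can be served locally with `flask --app app run` and opened in a browser.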
