reproduce multimodal dbm result #98

Open
xcszbdnl opened this issue Jun 1, 2016 · 4 comments

Comments

@xcszbdnl
xcszbdnl commented Jun 1, 2016

Hello, everyone.
I'm trying to reproduce the multimodal DBM result. However, @nitishsrivastava didn't provide an example for the multimodal DBM, only one for the multimodal DBN.
So I have written the run scripts, used the model files he provides at
[http://www.cs.toronto.edu/~nitish/multimodal/], and fixed some bugs in them. For example, deepnet.proto does not have the parameter "mcmc_steps"; it has been changed to "mf_steps"...
However, the model could not reproduce the result reported in Nitish's paper, so there may still be some bugs in it. I have been debugging for a few weeks and cannot fix it.
So, is there anyone who can work with me to fix it? Then it can be merged into the master branch to help others reproduce the multimodal DBM result.
I have forked the code and started a new branch, multimodal_dbm_example_branch.
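For anyone applying the same rename to their own copies of the model files, a minimal sketch follows. It assumes the files are text-format protobufs in which the old field appears as `mcmc_steps: <value>`, and that only the field name needs to change; whether the old values remain sensible as mean-field step counts is a separate question that this does not address. The helper names `rename_field` and `rename_field_in_file` are hypothetical and not part of deepnet.

```python
import re

# Assumption: the old field appears as "mcmc_steps: <value>" in text-format
# protobuf files. Only the field name is rewritten; values are left untouched.
OLD_FIELD = "mcmc_steps"
NEW_FIELD = "mf_steps"

# Match the field name only when it is a whole word followed by a colon,
# so comments or longer identifiers containing the name are not modified.
_FIELD_RE = re.compile(r"\b%s\b(?=\s*:)" % re.escape(OLD_FIELD))


def rename_field(text):
    """Return text with every `mcmc_steps:` key renamed to `mf_steps:`."""
    return _FIELD_RE.sub(NEW_FIELD, text)


def rename_field_in_file(path):
    """Rewrite one file in place; return True if anything changed."""
    with open(path) as f:
        original = f.read()
    updated = rename_field(original)
    if updated != original:
        with open(path, "w") as f:
            f.write(updated)
    return updated != original
```

Calling `rename_field_in_file` on each model file reports which files were actually touched, which makes it easier to review the changes before training.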

@hoffmast

Hello @xcszbdnl.
What kind of changes did you make to produce your multimodal DBM example? It looks like you may have copied the multimodal DBN code and changed it to use the files @nitishsrivastava provided for multimodal DBMs (and fixed a few bugs, as you said); is that correct? Also, what errors are you getting? Is the only known problem that the results are not as good as those in the paper?

I have a few thoughts on why the results might not be as good. First, @nitishsrivastava may have fine-tuned hyperparameters on his deepnet model in ways that are not reflected in the code he provided, and that tuning may account for the better results. Second, the training script (i.e. runall_dbm.sh) may have to be modified more thoroughly. From my understanding, one of the big differences between DBNs and DBMs is the training procedure. DBNs are trained as a stack of RBMs, I believe, with each RBM trained to completion before moving to the next one in the stack. DBMs, however, are trained jointly, as a unit, so that the training of any given layer can affect the training of the other layers, both above and below it. Perhaps by analyzing the differences between deepnet's DBN code and its DBM code, we can work out how runall_dbm.sh needs to be written to reproduce the results in the paper.
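To make that difference concrete for binary units, here is the standard form of the two conditionals. The notation is assumed for illustration and is not taken from deepnet: $W^{(1)}$ and $W^{(2)}$ are the weights below and above the first hidden layer, $b^{(1)}$ is its bias, and $\sigma$ is the logistic function.

```latex
\text{DBN (bottom-up pass):}\quad
q\big(h^{(1)}_j = 1 \mid v\big)
  = \sigma\Big(\sum_i W^{(1)}_{ij} v_i + b^{(1)}_j\Big)

\text{DBM:}\quad
p\big(h^{(1)}_j = 1 \mid v, h^{(2)}\big)
  = \sigma\Big(\sum_i W^{(1)}_{ij} v_i + \sum_k W^{(2)}_{jk} h^{(2)}_k + b^{(1)}_j\Big)
```

The extra top-down term in the DBM case is why a layer cannot be trained or read out in isolation once the model is treated as a DBM.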

@xcszbdnl
Author

I have made the following changes:

  1. In DBN training, the DBN is trained as a stack of RBMs. A DBM uses the pretrained RBMs as an initialization. So to train a multimodal DBM, the first step is pretraining the RBMs, just as multimodal_dbn does, changing only some hyperparameters such as up_factor and down_factor.
  2. A DBM should be trained as a complete model. Therefore, in addition to pretraining, I added DBM training, which trains the whole DBM model. Lines 126-128 do this.
  3. After training the DBM, we should extract features from it. The multimodal DBN uses only the hidden features of the pretrained RBMs. For the DBM, however, we should use features extracted from the DBM itself. Lines 128-130 do this. @nitishsrivastava did not provide feature extraction for the DBM, so I wrote extract_dbm_representation.py to extract features from it.
  4. To sample text, we should use the DBM model, not the DBN model.
    Classification is then the same as for the multimodal DBN.
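As a point of comparison for step 3, here is a minimal sketch of mean-field feature extraction for a small DBM with one visible layer and two hidden layers. It is not the code in extract_dbm_representation.py or in deepnet: all units are assumed binary (the real multimodal model uses other visible unit types), the function names are hypothetical, and the `mf_steps` argument only mirrors the name of the hyperparameter mentioned earlier in this thread.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, x):
    """Return W x for a row-major matrix W (list of rows)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matvec_t(W, x):
    """Return W^T x for a row-major matrix W (list of rows)."""
    return [sum(W[i][j] * x[i] for i in range(len(W)))
            for j in range(len(W[0]))]


def mean_field_features(v, W1, b1, W2, b2, mf_steps=10):
    """Mean-field posteriors (mu1, mu2) of a v - h1 - h2 DBM.

    W1 has shape (len(v), len(b1)); W2 has shape (len(b1), len(b2)).
    """
    bottom_up = matvec_t(W1, v)      # input from the data, fixed across steps
    mu1 = [0.5] * len(b1)            # uninformative starting point
    mu2 = [0.5] * len(b2)
    for _ in range(mf_steps):
        top_down = matvec(W2, mu2)   # h1 also receives input from h2
        mu1 = [sigmoid(a + t + b)
               for a, t, b in zip(bottom_up, top_down, b1)]
        mu2 = [sigmoid(a + b)
               for a, b in zip(matvec_t(W2, mu1), b2)]
    return mu1, mu2
```

One sanity check this suggests: with `W2` set to zero, `mu1` reduces to the plain bottom-up RBM activation, so comparing the extracted layer-1 features against the pretrained RBM features can help localize whether a drop in quality comes from the joint training or from the extraction step.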

I didn't get any errors. The whole training procedure looks fine.
I just cannot reproduce the multimodal result. After extracting the representations from the DBM, image hidden layer 1 gets 0.42, but image hidden layer 2 gets only 0.16, and the joint layer gets only 0.12 (like random results). (I used 20000 training steps; Nitish uses 2000000, but the number of training steps does not seem to be the cause. I just want to get a result like 0.4x or 0.5x first, and then I can use 2000000 training steps.)

@YusukeO
YusukeO commented Sep 21, 2017

Hi, xcszbdnl

Your project excites me a lot.
Do you still get low precision with the DBM model?
If you have found a solution, please share it.

Thanks
