DL4J: Add Padam adaptive gradient updater #6253

Open
AlexDBlack opened this issue Aug 23, 2018 · 9 comments
Labels
Enhancement (New features and other enhancements), good first issue, help wanted

Comments

@AlexDBlack
Contributor

The Padam updater was recently described here: https://arxiv.org/pdf/1806.06763.pdf

It is an extension of Adam/AMSGrad that claims SGD-like final accuracy while keeping the fast convergence of Adam/AMSGrad. Mathematically, it's basically a blend of SGD and AMSGrad.
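To make the "blend" concrete, here is a minimal sketch of one Padam step on plain Java arrays, as I read the paper: the AMSGrad moment updates are kept, but the denominator is raised to a power p instead of 0.5. This is not the ND4J updater API; the class name `PadamSketch`, the field names, and the `eps` term added to the denominator are assumptions for illustration, and bias correction is left out.

```java
/** Illustrative Padam step on plain arrays; a sketch, not the ND4J updater API. */
public final class PadamSketch {
    private final double lr, beta1, beta2, p, eps;
    private final double[] m;     // first-moment estimate
    private final double[] v;     // second-moment estimate
    private final double[] vHat;  // running element-wise max of v (as in AMSGrad)

    public PadamSketch(int n, double lr, double beta1, double beta2, double p, double eps) {
        this.lr = lr;
        this.beta1 = beta1;
        this.beta2 = beta2;
        this.p = p;       // partial adaptivity: 0.5 gives AMSGrad, values near 0 approach SGD with momentum
        this.eps = eps;   // assumption: small constant for numerical stability
        this.m = new double[n];
        this.v = new double[n];
        this.vHat = new double[n];
    }

    /** Applies one update to params in place, given the gradient. */
    public void step(double[] params, double[] grad) {
        for (int i = 0; i < params.length; i++) {
            m[i] = beta1 * m[i] + (1.0 - beta1) * grad[i];
            v[i] = beta2 * v[i] + (1.0 - beta2) * grad[i] * grad[i];
            vHat[i] = Math.max(vHat[i], v[i]);
            // The only change relative to AMSGrad: exponent p instead of a square root.
            params[i] -= lr * m[i] / (Math.pow(vHat[i], p) + eps);
        }
    }
}
```

With p = 0.5 the step reduces to AMSGrad (without bias correction), which is why extending the existing AMSGrad classes is a plausible starting point.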

Implementing this isn't a high priority for the core team. If anyone wants to tackle this, there are configuration and implementation classes here (we'll need one of each for Padam):
https://github.com/deeplearning4j/deeplearning4j/tree/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/learning/config
https://github.com/deeplearning4j/deeplearning4j/tree/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/learning

AlexDBlack added the Enhancement and help wanted labels Aug 23, 2018

@stolsvik

For some extra reference: #5843 (comment) and the few comments that follow it.

@achalagarwal

I'd like to give this a shot! As it's my first time contributing to DL4J, do you have any advice/suggestions for me?

@achalagarwal

@AlexDBlack

Are we sure about creating new classes for Padam?
As it's just a slight modification of AMSGrad, wouldn't it be better to support Padam through the AMSGrad classes by simply adding the extra fields where required?

@saudet
Contributor

saudet commented Mar 24, 2019 via email

@AlexDBlack
Contributor Author

Yeah, I'm ok with a separate class (extending AMSGrad if that makes sense).
Though we might end up with a little redundancy, I think I'd prefer a dedicated class for it for usability reasons - i.e., it'll be easier to find as a dedicated class rather than as an option in AMSGrad.

@achalagarwal

@AlexDBlack

Hi, I added the required classes. It is safe to merge:
Unified Commit

@saudet

I did not add a predicate for the range of the parameter; instead I log a warning. I'll need your help with predicates.

I haven't opened a pull request yet because I haven't tested the code. I couldn't build the project (tried a lot of things) using IntelliJ on macOS. Can someone point me to a thorough readme/guide for building it?

@saudet
Contributor

saudet commented Mar 29, 2019

@achalagarwal Its name is "Preconditions", actually; just do something like this:

Preconditions.checkArgument(bias != null, "LayerNorm: Use constructor without bias argument if bias is null / not available.");

https://github.com/deeplearning4j/deeplearning4j/blob/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/api/ops/impl/transforms/custom/LayerNorm.java#L47

Use Maven on the command line with mvn clean install -Dmaven.test.skip before trying it in an IDE.
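Applied to Padam, the range check for the partial parameter could look roughly like the sketch below. It is written in plain Java with `IllegalArgumentException` so that it runs on its own; in ND4J the `Preconditions.checkArgument` call shown above would play that role. The class name `PadamConfigSketch`, the field name `partial`, and the range (0, 0.5] (my reading of the paper) are assumptions.

```java
/** Hypothetical config holder showing a fail-fast range check for Padam's exponent. */
public final class PadamConfigSketch {
    private final double partial;

    public PadamConfigSketch(double partial) {
        // Written as a negated conjunction so that NaN is rejected as well.
        if (!(partial > 0.0 && partial <= 0.5)) {
            throw new IllegalArgumentException(
                "Padam: partial parameter p must be in (0, 0.5], got " + partial);
        }
        this.partial = partial;
    }

    public double getPartial() {
        return partial;
    }
}
```

Failing at construction time, rather than logging a warning, means a bad value surfaces before any training step runs.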

@achalagarwal

achalagarwal commented Apr 2, 2019

The build was successful, but I had to skip a couple of projects due to network issues (HTTP requests failed).

On Ubuntu:
mvn clean install -Dmaven.test.skip -pl '!:deeplearning4j-dataimport-solrj,!:deeplearning4j-modelexport-solr'

@AlexDBlack

Now, how do you suggest I validate the correctness of Padam? Do you want me to build a model and replicate results from a publication? This will take a lot of time. Are there relevant tests for the linalg/learning modules? I could not find any.

Commit

cc: @saudet

@AlexDBlack
Contributor Author

@achalagarwal we have updater tests here, adding to that would be good:
https://github.com/deeplearning4j/deeplearning4j/blob/master/deeplearning4j/deeplearning4j-core/src/test/java/org/deeplearning4j/nn/updater/TestUpdaters.java

We'll carefully review the implementation too once you've opened a pull request. That should be good enough I think.
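One cheap check that needs no trained model: with p = 0.5, a Padam step should match an AMSGrad step on the same inputs. The sketch below shows the idea on a scalar in plain Java; it is not written against TestUpdaters or the ND4J API, and the class and method names are hypothetical. Bias correction is left out of both steps, so this compares the textbook forms only.

```java
/** Sanity check: Padam with p = 0.5 should track AMSGrad on f(x) = x^2. */
public final class PadamSanityCheck {

    // state[0] = m (first moment), state[1] = v (second moment), state[2] = running max of v
    static void updateMoments(double g, double[] state, double beta1, double beta2) {
        state[0] = beta1 * state[0] + (1.0 - beta1) * g;
        state[1] = beta2 * state[1] + (1.0 - beta2) * g * g;
        state[2] = Math.max(state[2], state[1]);
    }

    public static double padamStep(double theta, double g, double[] state, double lr, double p) {
        updateMoments(g, state, 0.9, 0.999);
        return theta - lr * state[0] / Math.pow(state[2], p);
    }

    public static double amsgradStep(double theta, double g, double[] state, double lr) {
        updateMoments(g, state, 0.9, 0.999);
        return theta - lr * state[0] / Math.sqrt(state[2]);
    }

    /** Largest absolute gap between the two trajectories over the given number of steps. */
    public static double maxGap(int steps) {
        double a = 1.0, b = 1.0;
        double[] stateA = new double[3], stateB = new double[3];
        double gap = 0.0;
        for (int t = 0; t < steps; t++) {
            a = padamStep(a, 2.0 * a, stateA, 0.01, 0.5);   // gradient of x^2 is 2x
            b = amsgradStep(b, 2.0 * b, stateB, 0.01);
            gap = Math.max(gap, Math.abs(a - b));
        }
        return gap;
    }

    public static void main(String[] args) {
        System.out.println("max |padam - amsgrad| = " + maxGap(20));
    }
}
```

The same comparison could be repeated against the existing AMSGrad updater inside the updater tests, plus one hand-computed step for a value of p other than 0.5.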
