This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[Example] fix cpp example inception-bn and training acc issue #13284

Merged (11 commits) on Nov 27, 2018

Conversation

@roywei (Member) commented Nov 15, 2018

Description

  1. Fix: [cpp-package] inception_bn.cpp is wrong #9417
    Root cause:
    The Pooling in InceptionFactoryB was not given the correct padding.
    Fix: pass pad (1, 1), matching the reference Python implementation (pad=(1, 1)).
    Tested: the example now works.

  2. Fix some models not training: CPP examples training acc does not increase #13243, Strange behaviour/possible bug with mxnet::cpp::Symbol::Variable name argument #12966, [C++] Some Variable Name cause no Gradient #8108
    Root cause:
    Wrong logic for the gradient requirement: symbols whose names have length 3 or 4 did not get gradients.
    Changed to:
    Symbols whose names contain "data" or "label" do not require gradients.

  3. Added parameter initialization, same as the Python versions.

  4. Added inception-bn to the unit tests.
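
As a rough illustration of items 1 and 2 above, here is a standalone sketch. It is not code from this PR and does not use the MXNet API. The function names (`PoolOutputSize`, `NeedsGradOld`, `NeedsGradNew`) are hypothetical, and the 28-pixel input, 3x3 kernel and stride 2 are assumed values chosen only to show how padding changes the output shape under the usual floor-based size formula. `NeedsGradOld` only models the behaviour described above (names of length 3 or 4 lose their gradient); it is not the original implementation.

```cpp
#include <string>

// Output size along one spatial dimension for a pooling (or convolution)
// window, using the common floor-based formula:
//   out = floor((in + 2 * pad - kernel) / stride) + 1
// Assumes in + 2 * pad >= kernel, so integer division acts as floor.
int PoolOutputSize(int in, int kernel, int stride, int pad) {
  return (in + 2 * pad - kernel) / stride + 1;
}

// Simplified model of the behaviour reported in item 2 (an assumption,
// not the original code): any name of length 3 or 4 gets no gradient.
bool NeedsGradOld(const std::string& name) {
  return name.size() != 3 && name.size() != 4;
}

// Rule described in item 2 after the change: a symbol is excluded from
// gradient computation only if its name contains "data" or "label".
bool NeedsGradNew(const std::string& name) {
  const bool is_input = name.find("data") != std::string::npos ||
                        name.find("label") != std::string::npos;
  return !is_input;
}
```

With these assumed values, `PoolOutputSize(28, 3, 2, 1)` gives 14 while `PoolOutputSize(28, 3, 2, 0)` gives 13, so a pooling branch without padding would presumably not match a branch that does use pad (1, 1). For the gradient rule, a short parameter name such as `"fc1"` is rejected by `NeedsGradOld` but accepted by `NeedsGradNew`, while `"data"` and `"softmax_label"` are excluded by the new rule.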

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

see description

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@roywei roywei requested a review from nswamy as a code owner November 15, 2018 17:40
@roywei roywei changed the title from "[Example ]fix cpp example inception-bn and training acc issue" to "[Example] fix cpp example inception-bn and training acc issue" Nov 15, 2018
@stu1130 (Contributor) left a comment

the rest LGTM

@@ -172,7 +173,13 @@ int main(int argc, char const *argv[]) {
auto val_iter = MXDataIter("MNISTIter");
setDataIter(&val_iter, "Label", data_files, batch_size);

Optimizer* opt = OptimizerRegistry::Find("ccsgd");

is there any reason to change the optimizer?

@@ -36,6 +36,9 @@ cp ../../build/cpp-package/example/lenet_with_mxdataiter .
cp ../../build/cpp-package/example/resnet .
./resnet 5

cp ../../build/cpp-package/example/resnet .
./inception-bn 5

cp ../../build/cpp-package/example/mlp .

should we also add the mlp_csv example here?

@roywei (Member, Author) replied

@stu1130 ccsgd was deprecated a long time ago. I will also replace ccsgd in all the other examples.

@kalyc (Contributor) commented Nov 16, 2018

@roywei please take a look at the failed CI and re-trigger it

@roywei roywei requested a review from marcoabreu as a code owner November 19, 2018 21:12
@stu1130 (Contributor) left a comment

LGTM

@stu1130 (Contributor) commented Nov 20, 2018

@mxnet-label-bot add [pr-awaiting-review]

@marcoabreu marcoabreu added the pr-awaiting-review PR is waiting for code review label Nov 20, 2018
@roywei (Member, Author) commented Nov 20, 2018

@marcoabreu according to the dev list discussion, this PR will be blocked by #13344 because it changes the Jenkinsfile

@roywei (Member, Author) commented Nov 22, 2018

@nswamy I created a separate PR, #13367. Let's merge this one if CI passes.
Thanks!

Labels
pr-awaiting-review PR is waiting for code review
Development

Successfully merging this pull request may close these issues.

[cpp-package] inception_bn.cpp is wrong
5 participants