How can I get the log during training the Yolov5 model? #3457

Open
gganduu opened this issue Nov 11, 2021 · 10 comments · Fixed by #3552
@gganduu

gganduu commented Nov 11, 2021

The PyTorch YOLOv5 training loss returns two values, one of which is used for logging, but the Analytics Zoo loss only allows a single return value. How can I get the training log then?
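
One possible way to keep the logging values while still handing the estimator a single-valued loss is to wrap the two-output loss. This is only a minimal sketch: `SingleOutputLoss` and `toy_loss` are hypothetical names used for illustration and are not part of YOLOv5 or Analytics Zoo, and in a distributed run the recorded values would live in whichever worker process called the loss.

```python
class SingleOutputLoss:
    """Wrap a loss that returns (loss, log_items) so it returns only the loss.

    `two_output_loss` is a hypothetical stand-in for a YOLOv5-style loss;
    it is not an Analytics Zoo or YOLOv5 API.
    """

    def __init__(self, two_output_loss):
        self.two_output_loss = two_output_loss
        self.history = []  # log items from each call, in call order

    def __call__(self, pred, target):
        loss, log_items = self.two_output_loss(pred, target)
        self.history.append(log_items)  # keep the second value for later
        return loss  # the trainer only ever sees a single value


def toy_loss(pred, target):
    # Toy stand-in: squared error plus a dict of values meant for logging.
    err = (pred - target) ** 2
    return err, {"sq_err": err}


loss_fn = SingleOutputLoss(toy_loss)
value = loss_fn(3.0, 1.0)
print(value, loss_fn.history)
```

The wrapper leaves the original loss untouched, so the same two-output function can still be used outside the estimator.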

@yangw1234
Contributor

Hi @gganduu I do not understand your question.

Could you provide more details like code snippets?

@gganduu
Author

gganduu commented Nov 12, 2021

Yes, we have already sent the code by email (to Wang Yang and Qiu Xin).

@yangw1234
Contributor

yangw1234 commented Nov 12, 2021

Synced offline; here is a brief summary:

Feature request:

Automatically print the loss and other metrics for every epoch or iteration when calling Estimator.fit.

Current workaround we provide:

Fit one epoch at a time in a loop. E.g.

change

est = Estimator.from_pytorch(...)
est.fit(data, epochs=epochs)

to

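est = Estimator.from_pytorch(...)
for i in range(epochs):
    # train a single epoch; result holds whatever fit() returns for it
    result = est.fit(data, epochs=1)
    # optionally grab the current model, e.g. to checkpoint it
    model = est.get_model()
    print(f"epoch {i}: {result}")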
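
The loop can also be packaged as a small helper that both prints and collects the per-epoch results. The sketch below is an illustration under assumptions: `FakeEstimator` is a made-up stand-in that only mimics the `fit(data, epochs=...)` call shape, `fit_with_log` is a hypothetical helper, and the stats it returns are invented; a real estimator decides what `fit` returns.

```python
class FakeEstimator:
    """Hypothetical stand-in with the same fit(data, epochs=...) call shape."""

    def __init__(self):
        self.epochs_done = 0

    def fit(self, data, epochs=1):
        self.epochs_done += epochs
        # Made-up stats; a real estimator decides what fit() returns.
        return {"num_samples": len(data), "train_loss": 1.0 / self.epochs_done}


def fit_with_log(est, data, epochs, log_fn=print):
    """Run `epochs` one-epoch fits, logging and collecting each result."""
    history = []
    for i in range(epochs):
        result = est.fit(data, epochs=1)
        log_fn(f"epoch {i}: {result}")
        history.append(result)
    return history


history = fit_with_log(FakeEstimator(), data=[1, 2, 3], epochs=3)
```

Passing a different `log_fn` (for example a logger method) changes where the lines go without touching the loop.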

@jason-dai
Contributor

What should be the desired behavior?

@yangw1234
Contributor

What should be the desired behavior?

I think it is a trade-off between flexibility and usability.

Our current behavior is more flexible, like PyTorch: users can checkpoint at their preferred frequency and print logs in their desired format. The downside is that they have to write some code.

If we are targeting usability, I think it also makes sense to implement a fixed number of checkpointing and logging strategies and ask users to configure the one closest to their needs.

It is a judgment call.

@jason-dai
Contributor

What's the keras behavior?

@yangw1234
Contributor

What's the keras behavior?

Keras automatically prints the training loss, metrics and speed. Model checkpointing can be configured with the ModelCheckpoint callback.
Stick to Keras?
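
To make the Keras-style option concrete, here is a rough sketch of callback-driven logging and checkpointing. It only imitates the shape of Keras callbacks (`on_epoch_end(epoch, logs)` and a `monitor` key) in plain Python; `Callback`, `PrintLogs`, `KeepBest` and `fit` below are hypothetical illustrations, not Keras or Analytics Zoo classes, and the losses are made up.

```python
class Callback:
    """Base hook; mimics the shape of a Keras callback, not the real class."""

    def on_epoch_end(self, epoch, logs):
        pass


class PrintLogs(Callback):
    """Print the logs after every epoch (the 'automatic output' part)."""

    def on_epoch_end(self, epoch, logs):
        print(f"epoch {epoch}: {logs}")


class KeepBest(Callback):
    """Remember the epoch with the lowest monitored value.

    A real checkpoint callback would save the model at this point.
    """

    def __init__(self, monitor="loss"):
        self.monitor = monitor
        self.best = None
        self.best_epoch = None

    def on_epoch_end(self, epoch, logs):
        value = logs[self.monitor]
        if self.best is None or value < self.best:
            self.best, self.best_epoch = value, epoch


def fit(train_one_epoch, epochs, callbacks=()):
    """Hypothetical training driver: runs epochs and notifies callbacks."""
    for epoch in range(epochs):
        logs = train_one_epoch(epoch)
        for cb in callbacks:
            cb.on_epoch_end(epoch, logs)


losses = [0.9, 0.4, 0.6]  # made-up per-epoch losses
keeper = KeepBest()
fit(lambda e: {"loss": losses[e]}, epochs=3, callbacks=[PrintLogs(), keeper])
```

With this shape, users who want the default behavior pass nothing extra, while others swap in their own callbacks.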

@jason-dai
Contributor

Either Keras or PyTorch Lightning style?

dding3 pushed a commit to dding3/BigDL that referenced this issue Nov 17, 2021
* add hyperzoo for k8s support (intel-analytics#2140)

* add hyperzoo for k8s support

* format

* format

* format

* format

* run examples on k8s readme (intel-analytics#2163)

* k8s  readme

* fix jdk download issue (intel-analytics#2219)

* add doc for submit jupyter notebook and cluster serving to k8s (intel-analytics#2221)

* add hyperzoo doc

* add hyperzoo doc

* add hyperzoo doc

* add hyperzoo doc

* fix jdk download issue (intel-analytics#2223)

* bump to 0.9s (intel-analytics#2227)

* update jdk download url (intel-analytics#2259)

* update some previous docs (intel-analytics#2284)

* K8docsupdate (intel-analytics#2306)

* Update README.md

* Update s3 related links in readme  and documents (intel-analytics#2489)

* Update s3 related links in readme  and documents

* Update s3 related links in readme and documents

* Update s3 related links in readme and documents

* Update s3 related links in readme and documents

* Update s3 related links in readme and documents

* Update s3 related links in readme and documents

* update

* update

* modify line length limit

* update

* Update mxnet-mkl version in hyper-zoo dockerfile (intel-analytics#2720)

Co-authored-by: gaoping <pingx.gao@intel.com>

* update bigdl version (intel-analytics#2743)

* update bigdl version

* hyperzoo dockerfile add cluster-serving (intel-analytics#2731)

* hyperzoo dockerfile add cluster-serving

* update

* update

* update

* update jdk url

* update jdk url

* update

Co-authored-by: gaoping <pingx.gao@intel.com>

* Support init_spark_on_k8s (intel-analytics#2813)

* initial

* fix

* code refactor

* bug fix

* update docker

* style

* add conda to docker image (intel-analytics#2894)

* add conda to docker image

* Update Dockerfile

* Update Dockerfile

Co-authored-by: glorysdj <glorysdj@gmail.com>

* Fix code blocks indents in .md files (intel-analytics#2978)

* Fix code blocks indents in .md files

Previously a lot of the code blocks in markdown files were horribly indented with bad white spaces in the beginning of lines. Users can't just select, copy, paste, and run (in the case of python). I have fixed all these, so there is no longer any code block with bad white space at the beginning of the lines.
It would be nice if you could try to make sure in future commits that all code blocks are properly indented inside and have the right amount of white space in the beginning!

* Fix small style issue

* Fix indents

* Fix indent and add \ for multiline commands

Change indent from 3 spaces to 4, and add "\" for multiline bash commands

Co-authored-by: Yifan Zhu <fanzhuyifan@gmail.com>

* enable bigdl 0.12 (intel-analytics#3101)

* switch to bigdl 0.12

* Hyperzoo example ref (intel-analytics#3143)

* specify pip version to fix oserror 0 of proxy (intel-analytics#3165)

* Bigdl0.12.1 (intel-analytics#3155)

* bigdl 0.12.1

* bump 0.10.0-Snapshot (intel-analytics#3237)

* update runtime image name (intel-analytics#3250)

* update jdk download url (intel-analytics#3316)

* update jdk8 url (intel-analytics#3411)

Co-authored-by: ardaci <dongjie.shi@intel.com>

* update hyperzoo docker image (intel-analytics#3429)

* update hyperzoo image (intel-analytics#3457)

* fix jdk in az docker (intel-analytics#3478)

* fix jdk in az docker

* fix jdk for hyperzoo

* fix jdk in jenkins docker

* fix jdk in cluster serving docker

* fix jdk

* fix readme

* update python dep to fit cnvrg (intel-analytics#3486)

* update ray version doc (intel-analytics#3568)

* fix deploy hyperzoo issue (intel-analytics#3574)

Co-authored-by: gaoping <pingx.gao@intel.com>

* add spark fix and net-tools and status check (intel-analytics#3742)

* intsall netstat and add check status

* add spark fix for graphene

* bigdl 0.12.2 (intel-analytics#3780)

* bump to 0.11-S and fix version issues except ipynb

* add multi-stage build Dockerfile (intel-analytics#3916)

* add multi-stage build Dockerfile

* multi-stage build dockerfile

* multi-stage build dockerfile

* Rename Dockerfile.multi to Dockerfile

* delete Dockerfile.multi

* remove comments, add TINI_VERSION to common arg, remove Dockerfile.multi

* multi-stage add tf_slim

Co-authored-by: shaojie <shaojiex.bai@intel.com>

* update hyperzoo doc and k8s doc (intel-analytics#3959)

* update userguide of k8s

* update k8s guide

* update hyperzoo doc

* Update k8s.md

add note

* Update k8s.md

add note

* Update k8s.md

update notes

* fix 4087 issue (intel-analytics#4097)

Co-authored-by: shaojie <shaojiex.bai@intel.com>

* fixed 4086 and 4083 issues (intel-analytics#4098)

Co-authored-by: shaojie <shaojiex.bai@intel.com>

* Reduce image size (intel-analytics#4132)

* Reduce Dockerfile size
1. del redis stage
2. del flink stage
3. del conda & exclude some python packages
4. add copies layer stage

* update numpy version to 1.18.1

Co-authored-by: zzti-bsj <shaojiex.bai@intel.com>

* update hyperzoo image (intel-analytics#4250)

Co-authored-by: Adria777 <Adria777@github.com>

* bigdl 0.13 (intel-analytics#4210)

* bigdl 0.13

* update

* print exception

* pyspark2.4.6

* update release PyPI script

* update

* flip snapshot-0.12.0 and spark2.4.6 (intel-analytics#4254)

* s-0.12.0 master

* Update __init__.py

* Update python.md

* fix docker issues due to version update (intel-analytics#4280)

* fix docker issues

* fix docker issues

* update Dockerfile to support spark 3.1.2 && 2.4.6 (intel-analytics#4436)

Co-authored-by: shaojie <otnw_bsj@163.com>

* update hyperzoo, add lib for tf2 (intel-analytics#4614)

* delete tf 1.15.0 (intel-analytics#4719)

Co-authored-by: Le-Zheng <30695225+Le-Zheng@users.noreply.github.com>
Co-authored-by: pinggao18 <44043817+pinggao18@users.noreply.github.com>
Co-authored-by: pinggao187 <44044110+pinggao187@users.noreply.github.com>
Co-authored-by: gaoping <pingx.gao@intel.com>
Co-authored-by: Kai Huang <huangkaivision@gmail.com>
Co-authored-by: GavinGu07 <55721214+GavinGu07@users.noreply.github.com>
Co-authored-by: Yifan Zhu <zhuyifan@stanford.edu>
Co-authored-by: Yifan Zhu <fanzhuyifan@gmail.com>
Co-authored-by: Song Jiaming <litchy233@gmail.com>
Co-authored-by: ardaci <dongjie.shi@intel.com>
Co-authored-by: Yang Wang <yang3.wang@intel.com>
Co-authored-by: zzti-bsj <2779090360@qq.com>
Co-authored-by: shaojie <shaojiex.bai@intel.com>
Co-authored-by: Lingqi Su <33695124+Adria777@users.noreply.github.com>
Co-authored-by: Adria777 <Adria777@github.com>
Co-authored-by: shaojie <otnw_bsj@163.com>
@jason-dai
Contributor

Do you need to backport to AZ?

@jason-dai jason-dai reopened this Nov 25, 2021
@yangw1234
Contributor

Do you need to backport to AZ?

I'll backport. This issue was closed automatically by GitHub.
