[tune] Incorrect trial/iteration increment for function API throughout trial #3949

@andrewztan

Describe the problem

There's an issue with the trial/iteration counter when running experiments with the function API. It is similar to #3834, but the issue persists throughout the experiment, not just at the end.

Source code / logs

== Status ==
Using FIFO scheduling algorithm.
Resources requested: 1/4 CPUs, 0/0 GPUs
Memory usage on this node: 6.6/8.6 GB
Result logdir: /Users/andrewtan/ray_results/my_exp
RUNNING trials:
 - exp_0:	RUNNING

Result for exp_0:
  date: 2019-02-04_16-40-41
  done: true
  experiment_id: df4e206afbd4446cb7a9c8257c12cbc4
  hostname: airbears2-10-142-33-12.airbears2.1918.berkeley.edu
  iterations_since_restore: 1
  itr: 99
  node_ip: 10.142.33.12
  pid: 96939
  time_since_restore: 1.0051090717315674
  time_this_iter_s: 1.0051090717315674
  time_total_s: 1.0051090717315674
  timestamp: 1549327241
  timesteps_since_restore: 0
  training_iteration: 1

== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/4 CPUs, 0/0 GPUs
Memory usage on this node: 6.8/8.6 GB
Result logdir: /Users/andrewtan/ray_results/my_exp
TERMINATED trials:
 - exp_0:	TERMINATED [pid=96939], 1 s, 1 iter

== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/4 CPUs, 0/0 GPUs
Memory usage on this node: 6.8/8.6 GB
Result logdir: /Users/andrewtan/ray_results/my_exp
TERMINATED trials:
 - exp_0:	TERMINATED [pid=96939], 1 s, 1 iter

The test file for this run has a stopping criterion of 100 iterations. However, the logger shows the experiment ending at iteration 1 (training_iteration: 1), even though the reported itr value is 99. It seems that the iteration counts logged by the Function Runner and the Trainable are not synced.
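
To make the symptom concrete, here is a library-free toy model that produces the same mismatch as the log above (itr: 99 alongside training_iteration: 1). It is only a sketch. `ToyReporter`, `ToyRunner`, and `my_func` are hypothetical names, and the assumption that the wrapper counts one iteration per `train()` call while the function reports many times is a guess at the kind of desynchronization involved, not Ray Tune's actual implementation.

```python
# Toy model only: these classes are hypothetical and are not Ray Tune's code.


class ToyReporter:
    """Keeps only the most recent result reported by the user function."""

    def __init__(self):
        self.latest = None
        self.calls = 0

    def __call__(self, **metrics):
        # Each report overwrites the previous one.
        self.latest = dict(metrics)
        self.calls += 1


class ToyRunner:
    """Counts one training_iteration per train() call, not per report."""

    def __init__(self, func):
        self.func = func
        self.reporter = ToyReporter()
        self.training_iteration = 0

    def train(self):
        # Assumption: the user function runs to completion before the
        # wrapper reads a result, so intermediate reports are lost.
        self.func(self.reporter)
        self.training_iteration += 1
        result = dict(self.reporter.latest)
        result["training_iteration"] = self.training_iteration
        result["done"] = True
        return result


def my_func(reporter):
    # Stand-in for a function-API trainable that reports 100 times.
    for i in range(100):
        reporter(itr=i)


runner = ToyRunner(my_func)
result = runner.train()
print(result)
```

In this model the function reports 100 times, but the wrapper's counter advances only once, so a stopping criterion keyed on training_iteration would never see the values between 1 and 100.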
