fix log monitor read error #5221

ConeyLiu · 2019-07-18T06:47:20Z

What do these changes do?

Currently, the log monitor hits read errors because it monitors every file under the logs directory, even though the code only needs the worker log files. With this patch, we only read files whose names match the worker log file format.

Related issue number

Closes #5220

Linter

I've run scripts/format.sh to lint the changes in this PR.

ConeyLiu · 2019-07-18T07:14:52Z

scripts/format.sh failed with: flake8: error: no such option: --inline-quotes

AmplabJenkins · 2019-07-18T09:51:45Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15471/

AmplabJenkins · 2019-07-18T10:15:21Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15473/

richardliaw · 2019-07-22T22:56:39Z

pip install flake8-quotes

richardliaw · 2019-07-22T22:58:00Z

python/ray/log_monitor.py

@@ -93,7 +101,9 @@ def update_log_filenames(self):

        for log_filename in log_filenames:
            full_path = os.path.join(self.logs_dir, log_filename)
-            if full_path not in self.log_filenames:
+            filename = full_path.split("/")[-1]


os.path.basename?
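
For illustration, a minimal sketch of the difference behind this suggestion. The directory and file name below are made up for the example and are not taken from the PR:

```python
import os

# Hypothetical path, used only for illustration.
full_path = os.path.join("/tmp/ray/logs", "worker-0001.out")

# Manual split: assumes "/" is the only separator in the path.
name_by_split = full_path.split("/")[-1]

# os.path.basename: follows the path rules of the current platform.
name_by_basename = os.path.basename(full_path)

print(name_by_split, name_by_basename)
```

On POSIX systems both calls return the same file name; the manual split can differ where os.path.join inserts a separator other than "/".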

richardliaw · 2019-07-22T22:58:29Z

python/ray/log_monitor.py

@@ -93,7 +101,9 @@ def update_log_filenames(self):

        for log_filename in log_filenames:
            full_path = os.path.join(self.logs_dir, log_filename)
-            if full_path not in self.log_filenames:
+            filename = full_path.split("/")[-1]
+            if (file_should_monitor(filename)


This won't even run, right? It needs to be self.file_should_monitor.

Can you use glob instead?
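
A rough sketch of what a glob-based filter could look like. It assumes worker logs are named worker*.out and worker*.err; find_worker_logs and the file names are hypothetical and only serve the example, they are not the code from this PR:

```python
import glob
import os
import tempfile


def find_worker_logs(logs_dir):
    """Return sorted paths of files named worker*.out or worker*.err."""
    paths = []
    # One glob call per suffix, since glob has no "either/or" syntax.
    for suffix in ("out", "err"):
        paths.extend(glob.glob(os.path.join(logs_dir, "worker*." + suffix)))
    return sorted(paths)


# Build a throwaway logs directory with a mix of (made-up) log files.
with tempfile.TemporaryDirectory() as logs_dir:
    for name in ("worker-1.out", "worker-1.err", "raylet.out", "monitor.err"):
        open(os.path.join(logs_dir, name), "w").close()
    found = [os.path.basename(p) for p in find_worker_logs(logs_dir)]

print(found)
```

Only the two worker files are returned; the raylet and monitor files are skipped because their names do not start with "worker".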

ConeyLiu · 2019-07-23T03:02:40Z

Thanks @richardliaw for the comments. Is there a way to run the unit tests locally? I ran
pip install -e . --verbose and got the following error, which seems unrelated.

ERROR: /home/lxy/git_repository/ray/BUILD.bazel:718:1: Executing genrule //:redis failed (Exit 2) bash failed: error executing command
      (cd /home/lxy/.cache/bazel/_bazel_lxy/92fa78d6e2702ccf396ac1998b6e1ae8/sandbox/linux-sandbox/4845/execroot/com_github_ray_project_ray && \
      exec env - \
        LD_LIBRARY_PATH=/usr/local/lib/:/usr/local/lib/:/usr/local/lib/: \
        PATH=/home/lxy/utils/gradle-4.0/bin:/home/lxy/anaconda3/bin:/home/lxy/utils/scala-2.11.8/bin:/home/lxy/utils/maven/bin:/home/lxy/utils/jdk1.8.0_111/bin:/home/lxy/bin:/home/lxy/.local/bin:/home/lxy/utils/gradle-4.0/bin:/home/lxy/anaconda3/bin:/home/lxy/utils/scala-2.11.8/bin:/home/lxy/utils/maven/bin:/home/lxy/utils/jdk1.8.0_111/bin:/home/lxy/bin:/home/lxy/.local/bin:/home/lxy/utils/gradle-4.0/bin:/home/lxy/anaconda3/bin:/home/lxy/utils/scala-2.11.8/bin:/home/lxy/utils/maven/bin:/home/lxy/utils/jdk1.8.0_111/bin:/home/lxy/bin:/home/lxy/.local/bin:/home/lxy/utils/gradle-4.0/bin:/home/lxy/anaconda3/bin:/home/lxy/utils/scala-2.11.8/bin:/home/lxy/utils/maven/bin:/home/lxy/utils/jdk1.8.0_111/bin:/home/lxy/bin:/home/lxy/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/lxy/utils/sbt/bin:/home/lxy/utils/sbt/bin:/home/lxy/utils/sbt/bin:/home/lxy/bin:/home/lxy/utils/sbt/bin:/home/lxy/bin \
        PYTHON_BIN_PATH=/home/lxy/anaconda3/bin/python3 \
      /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh;
            set -x &&
            curl -sL "https://github.com/antirez/redis/archive/5.0.3.tar.gz" | tar xz --strip-components=1 -C . &&
            make &&
            mv ./src/redis-server bazel-out/k8-opt/bin/redis-server &&
            chmod +x bazel-out/k8-opt/bin/redis-server &&
            mv ./src/redis-cli bazel-out/k8-opt/bin/redis-cli &&
            chmod +x bazel-out/k8-opt/bin/redis-cli
        ')
    Execution platform: @bazel_tools//platforms:host_platform

AmplabJenkins · 2019-07-23T05:54:48Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15576/

richardliaw · 2019-07-24T00:12:04Z

python/ray/log_monitor.py

-            if full_path not in self.log_filenames:
-                self.log_filenames.add(full_path)
+        # we only monitor worker log files
+        log_file_paths = glob.glob("{}/worker*[.out|.err]".format(


Aren't there other log files that the log_monitor tracks, like the monitor log and the raylet log?
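
A side note on the pattern in the diff above, offered as a sketch rather than a statement about the final code: in glob/fnmatch syntax, square brackets denote a single-character set, so "[.out|.err]" matches one character from that set, not the suffix ".out" or ".err". The file names below are made up, and fnmatchcase is used only to show how the pattern is interpreted:

```python
from fnmatch import fnmatchcase

# Pattern as it appears in the diff hunk above.
pattern = "worker*[.out|.err]"

# Hypothetical file names for illustration.
names = [
    "worker-1.out",
    "worker-1.err",
    "worker-1.txt",
    "worker-1.log",
    "raylet.out",
]

# Keep the names the pattern accepts: anything starting with "worker"
# whose last character is one of ".", "o", "u", "t", "|", "e", "r".
matched = [name for name in names if fnmatchcase(name, pattern)]

print(matched)
```

Under this reading, worker-1.txt is accepted as well because it ends in "t", while worker-1.log is rejected. If only the two suffixes are wanted, globbing "worker*.out" and "worker*.err" separately is one way to express that.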

ConeyLiu · 2019-07-24T03:28:14Z

python/ray/log_monitor.py

@@ -195,7 +199,7 @@ def check_log_files_and_publish_updates(self):
            # Record the current position in the file.
            file_info.file_position = file_info.file_handle.tell()

-            if len(lines_to_publish) > 0 and is_worker:
+            if len(lines_to_publish) > 0:
                self.redis_client.publish(


@richardliaw we only publish the worker log files, though I'm not sure why. Please correct me if I'm wrong.

robertnishihara · 2019-08-01T22:45:37Z

@ConeyLiu, thanks for submitting the PR. You're right that we currently only stream the worker logs back to the driver. However, in the future I think it would make sense to collect all of the process logs and store them in some central database or something like that.

The PR looks good to me.

ConeyLiu · 2019-08-02T05:17:59Z

Thanks a lot, @richardliaw and @robertnishihara.

fix log monitor read error

ff9332f

update with pylint

c271a97

richardliaw reviewed Jul 22, 2019

address comments

14261a6

richardliaw reviewed Jul 24, 2019

robertnishihara approved these changes Aug 1, 2019

robertnishihara merged commit 3ae54a2 into ray-project:master Aug 1, 2019

ConeyLiu deleted the log-monitor-bug-fix branch August 2, 2019 05:18

edoakes pushed a commit to edoakes/ray that referenced this pull request Aug 9, 2019

Fix log monitor read error (ray-project#5221)

4bac54b
