Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix log monitor read error #5221

Merged
merged 3 commits into from
Aug 1, 2019

Conversation

ConeyLiu
Copy link
Contributor

@ConeyLiu ConeyLiu commented Jul 18, 2019

What do these changes do?

Currently, we will encounter some errors when monitoring all the file under logs. And we also just need the worker logfile from the code. In this patch, we only read those file which matches a worker log file format.

Related issue number

Closes #5220

Linter

  • I've run scripts/format.sh to lint the changes in this PR.

@ConeyLiu
Copy link
Contributor Author

scripts/format.sh failed as: flake8: error: no such option: --inline-quotes

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15471/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15473/
Test PASSed.

@richardliaw
Copy link
Contributor

pip install flake8-quotes

@@ -93,7 +101,9 @@ def update_log_filenames(self):

for log_filename in log_filenames:
full_path = os.path.join(self.logs_dir, log_filename)
if full_path not in self.log_filenames:
filename = full_path.split("/")[-1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.path.basename?

@@ -93,7 +101,9 @@ def update_log_filenames(self):

for log_filename in log_filenames:
full_path = os.path.join(self.logs_dir, log_filename)
if full_path not in self.log_filenames:
filename = full_path.split("/")[-1]
if (file_should_monitor(filename)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this won't even run right? self.file_should_monitor

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use glob instead?

@ConeyLiu
Copy link
Contributor Author

thanks @richardliaw for the comments. Is there any way to running the UT tests in locally? I have run the
pip install -e . --verbose and got the following errors. It seems not related.

ERROR: /home/lxy/git_repository/ray/BUILD.bazel:718:1: Executing genrule //:redis failed (Exit 2) bash failed: error executing command
      (cd /home/lxy/.cache/bazel/_bazel_lxy/92fa78d6e2702ccf396ac1998b6e1ae8/sandbox/linux-sandbox/4845/execroot/com_github_ray_project_ray && \
      exec env - \
        LD_LIBRARY_PATH=/usr/local/lib/:/usr/local/lib/:/usr/local/lib/: \
        PATH=/home/lxy/utils/gradle-4.0/bin:/home/lxy/anaconda3/bin:/home/lxy/utils/scala-2.11.8/bin:/home/lxy/utils/maven/bin:/home/lxy/utils/jdk1.8.0_111/bin:/home/lxy/bin:/home/lxy/.local/bin:/home/lxy/utils/gradle-4.0/bin:/home/lxy/anaconda3/bin:/home/lxy/utils/scala-2.11.8/bin:/home/lxy/utils/maven/bin:/home/lxy/utils/jdk1.8.0_111/bin:/home/lxy/bin:/home/lxy/.local/bin:/home/lxy/utils/gradle-4.0/bin:/home/lxy/anaconda3/bin:/home/lxy/utils/scala-2.11.8/bin:/home/lxy/utils/maven/bin:/home/lxy/utils/jdk1.8.0_111/bin:/home/lxy/bin:/home/lxy/.local/bin:/home/lxy/utils/gradle-4.0/bin:/home/lxy/anaconda3/bin:/home/lxy/utils/scala-2.11.8/bin:/home/lxy/utils/maven/bin:/home/lxy/utils/jdk1.8.0_111/bin:/home/lxy/bin:/home/lxy/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/lxy/utils/sbt/bin:/home/lxy/utils/sbt/bin:/home/lxy/utils/sbt/bin:/home/lxy/bin:/home/lxy/utils/sbt/bin:/home/lxy/bin \
        PYTHON_BIN_PATH=/home/lxy/anaconda3/bin/python3 \
      /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh;
            set -x &&
            curl -sL "https://github.com/antirez/redis/archive/5.0.3.tar.gz" | tar xz --strip-components=1 -C . &&
            make &&
            mv ./src/redis-server bazel-out/k8-opt/bin/redis-server &&
            chmod +x bazel-out/k8-opt/bin/redis-server &&
            mv ./src/redis-cli bazel-out/k8-opt/bin/redis-cli &&
            chmod +x bazel-out/k8-opt/bin/redis-cli
        ')
    Execution platform: @bazel_tools//platforms:host_platform

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15576/
Test PASSed.

if full_path not in self.log_filenames:
self.log_filenames.add(full_path)
# we only monior worker log files
log_file_paths = glob.glob("{}/worker*[.out|.err]".format(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't there other log files that the log_monitor tracks? like the monitor log, the raylet log?

@@ -195,7 +199,7 @@ def check_log_files_and_publish_updates(self):
# Record the current position in the file.
file_info.file_position = file_info.file_handle.tell()

if len(lines_to_publish) > 0 and is_worker:
if len(lines_to_publish) > 0:
self.redis_client.publish(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richardliaw we only publish worker log file, not sure why. Please correct me if I am not right.

@robertnishihara
Copy link
Collaborator

@ConeyLiu, thanks for submitting the PR. You're right that we currently only stream the worker logs back to the driver. However, in the future I think it would make sense to collect all of the process logs and store them in some central database or something like that.

The PR looks good to me.

@robertnishihara robertnishihara merged commit 3ae54a2 into ray-project:master Aug 1, 2019
@ConeyLiu
Copy link
Contributor Author

ConeyLiu commented Aug 2, 2019

thanks a lot @richardliaw @robertnishihara.

@ConeyLiu ConeyLiu deleted the log-monitor-bug-fix branch August 2, 2019 05:18
edoakes pushed a commit to edoakes/ray that referenced this pull request Aug 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

log monitor report UnicodeDecodeError
4 participants