[Question] Evaluation by docker does not work #34

Closed
VichyTong opened this issue Jul 27, 2024 · 5 comments

@VichyTong

Hi Bigcodebench team,

Thank you for your excellent work! I ran into a problem when I tried to evaluate the pre-generated samples.

When I run the evaluation in Docker with this command:

sudo docker run \
--name eval \
-m 128g \
-e http_proxy=http://host.docker.internal:7890 \
-e https_proxy=http://host.docker.internal:7890 \
--add-host=host.docker.internal:host-gateway \
-v $(pwd):/app bigcodebench/bigcodebench-evaluate:latest \
--split instruct \
--samples 01-ai--Yi-1.5-9B-Chat--bigcodebench-instruct--vllm-0-1-sanitized-calibrated.jsonl

I get error messages such as ImportError: /usr/local/lib/python3.10/site-packages/matplotlib/_c_internal_utils.cpython-310-x86_64-linux-gnu.so: failed to map segment from shared object. I saw the suggested fix in the README and assigned 128 GB of memory to the container.

When I run docker stats eval, the memory limit appears to be set correctly:

CONTAINER ID   NAME      CPU %     MEM USAGE / LIMIT   MEM %     NET I/O           BLOCK I/O   PIDS
bcb3f4c10b3c   eval      0.00%     124MiB / 128GiB     0.09%     3.67MB / 69.5kB   0B / 16MB   17

Could you please help me analyze the problem?

@terryyz
Collaborator

terryyz commented Jul 27, 2024

Hi @VichyTong, I'll try to have a look next week. Meanwhile, would you mind checking whether this is related to issue #8 (comment)?

@terryyz terryyz added the bug Something isn't working label Jul 27, 2024
@terryyz terryyz self-assigned this Jul 27, 2024
@VichyTong
Author

Hi @terryyz, thank you for your quick reply! Sure, I will check it.

@VichyTong
Author

Thank you @terryyz, I fixed this issue by raising the as, data, and stack limits.

Here are the flags I added to the command:

--max-as-limit 40960 \
--max-data-limit 40960 \
--max-stack-limit 40960

Now I can run the evaluation perfectly!
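
For readers wondering why a 128 GB container limit did not help: a per-process cap can make memory mapping fail no matter how much memory the container has. The sketch below is not BigCodeBench code. It assumes, without confirmation from this thread, that the as/data/stack flags correspond to POSIX resource limits such as RLIMIT_AS. The helper name try_allocate and the sizes are hypothetical, and the code relies on Python's Unix-only resource module.

```python
import resource
import subprocess
import sys

MIB = 1024 * 1024


def try_allocate(limit_mib: int, alloc_mib: int) -> bool:
    """Return True if a child Python process capped at limit_mib MiB of
    address space can allocate a buffer of alloc_mib MiB."""

    def cap_address_space() -> None:
        # Runs in the child between fork and exec. Only the soft limit is
        # changed, and it is never raised above the inherited hard limit.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = limit_mib * MIB
        if hard != resource.RLIM_INFINITY:
            soft = min(soft, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    child_code = f"buf = bytearray({alloc_mib * MIB})"
    proc = subprocess.run(
        [sys.executable, "-c", child_code],
        preexec_fn=cap_address_space,
        capture_output=True,
    )
    # A MemoryError in the child gives a non-zero exit code.
    return proc.returncode == 0


if __name__ == "__main__":
    print("64 MiB under a 4096 MiB cap:", try_allocate(4096, 64))
    print("1024 MiB under a 512 MiB cap:", try_allocate(512, 1024))
```

The cap applies to the process itself, independently of the -m value passed to docker run, which would be consistent with docker stats showing a 128GiB limit while the import still failed.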

@terryyz
Collaborator

terryyz commented Jul 29, 2024

Hi @VichyTong, it's great to hear that you resolved the issue :) I think the value for max-stack-limit probably doesn't need to be that big though.

Just a note that I'm also working on an HF space where users can run the code execution in real time, so that results are consistent. I'll let you know once it's ready.

@terryyz
Collaborator

terryyz commented Jul 29, 2024

I'm closing the issue for now. Feel free to reopen it!

@terryyz terryyz closed this as completed Jul 29, 2024