Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[testbed] Failing test TestLogLargeFiles #35053

Open
crobert-1 opened this issue Sep 6, 2024 · 4 comments
Open

[testbed] Failing test TestLogLargeFiles #35053

crobert-1 opened this issue Sep 6, 2024 · 4 comments
Labels
flaky test a test is flaky testbed

Comments

@crobert-1
Copy link
Member

Component(s)

testbed

Describe the issue you're reporting

This test was introduced in #34417 and has been failing every time on main since it was merged.

Example failing CI/CD link

Failure output sample:

=== RUN   TestLogLargeFiles/filelog-largefiles-6GB-lifetime
2024/09/03 18:15:12 Starting mock backend...
2024/09/03 18:15:12 Starting Agent (/home/ghrunner/actions-runner/_work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/bin/oteltestbedcol_linux_amd64)
2024/09/03 18:15:12 Writing Agent log to /home/ghrunner/actions-runner/_work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/testbed/tests/results/TestLogLargeFiles/filelog-largefiles-6GB-lifetime/agent.log
2024/09/03 18:15:12 Agent running, pid=319946
2024/09/03 18:15:12 Starting load generator at 330000 items/sec.
2024/09/03 18:15:15  | Sent:   456,310 logs (152,134/sec) | Received:   377,693 items (125,898/sec)
2024/09/03 18:15:18  | Sent:   919,930 logs (153,338/sec) | Received:   796,380 items (132,732/sec)
2024/09/03 18:15:21  | Sent: 1,391,800 logs (154,655/sec) | Received: 1,223,337 items (135,927/sec)
2024/09/03 18:15:24  | Sent: 1,857,050 logs (154,762/sec) | Received: 1,666,629 items (138,886/sec)
2024/09/03 18:15:27  | Sent: 2,328,690 logs (155,252/sec) | Received: 2,101,[63](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10687803539/job/29627380658#step:10:64)1 items (140,109/sec)
2024/09/03 18:15:30  | Sent: 2,795,580 logs (155,315/sec) | Received: 2,520,476 items (140,027/sec)
2024/09/03 18:15:33  | Sent: 3,263,870 logs (155,427/sec) | Received: 2,931,350 items (139,588/sec)
2024/09/03 18:15:36  | Sent: 3,724,920 logs (155,209/sec) | Received: 3,333,528 items (138,897/sec)
2024/09/03 18:15:39  | Sent: 4,193,120 logs (155,304/sec) | Received: 3,784,939 items (140,183/sec)
2024/09/03 18:15:42  | Sent: 4,650,343 logs (155,014/sec) | Received: 4,220,074 items (140,669/sec)
2024/09/03 18:15:45  | Sent: 5,118,250 logs (155,101/sec) | Received: 4,605,991 items (139,575/sec)
2024/09/03 18:15:48  | Sent: 5,589,908 logs (155,278/sec) | Received: 5,074,158 items (140,949/sec)
2024/09/03 18:15:51  | Sent: 6,060,430 logs (155,398/sec) | Received: 5,517,357 items (141,470/sec)
2024/09/03 18:15:54  | Sent: 6,526,930 logs (155,405/sec) | Received: 5,919,778 items (140,947/sec)
2024/09/03 18:15:57  | Sent: 6,997,510 logs (155,502/sec) | Received: 6,404,174 items (142,314/sec)
2024/09/03 18:16:00  | Sent: 7,456,890 logs (155,353/sec) | Received: 6,822,765 items (142,141/sec)
2024/09/03 18:16:03  | Sent: 7,923,870 logs (155,371/sec) | Received: 7,208,780 items (141,348/sec)
2024/09/03 18:16:06  | Sent: 8,384,086 logs (155,262/sec) | Received: 7,594,612 items (140,[64](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10687803539/job/29627380658#step:10:65)1/sec)
2024/09/03 18:16:09  | Sent: 8,853,290 logs (155,322/sec) | Received: 8,021,773 items (140,733/sec)
2024/09/03 18:16:12  | Sent: 9,324,691 logs (155,413/sec) | Received: 8,464,927 items (141,082/sec)
2024/09/03 18:16:15  | Sent: 9,789,944 logs (155,397/sec) | Received: 8,925,052 items (141,667/sec)
2024/09/03 18:16:18  | Sent:10,255,310 logs (155,385/sec) | Received: 9,360,214 items (141,821/sec)
2024/09/03 18:16:21  | Sent:10,708,080 logs (155,191/sec) | Received: 9,721,646 items (140,893/sec)
2024/09/03 18:16:24  | Sent:11,175,730 logs (155,219/sec) | Received:10,181,679 items (141,412/sec)
2024/09/03 18:16:27  | Sent:11,643,380 logs (155,246/sec) | Received:10,616,888 items (141,558/sec)
2024/09/03 18:16:30  | Sent:12,107,460 logs (155,225/sec) | Received:11,035,333 items (141,478/sec)
2024/09/03 18:16:33  | Sent:12,572,785 logs (155,220/sec) | Received:11,470,281 items (141,608/sec)
2024/09/03 18:16:36  | Sent:13,030,820 logs (155,130/sec) | Received:11,872,402 items (141,338/sec)
2024/09/03 18:16:39  | Sent:13,494,780 logs (155,113/sec) | Received:12,299,212 items (141,370/sec)
2024/09/03 18:16:42  | Sent:13,957,860 logs (155,088/sec) | Received:12,734,249 items (141,491/sec)
2024/09/03 18:16:45  | Sent:14,420,470 logs (155,059/sec) | Received:13,177,441 items (141,693/sec)
2024/09/03 18:16:48  | Sent:14,874,680 logs (154,945/sec) | Received:13,588,002 items (141,541/sec)
2024/09/03 18:16:51  | Sent:15,334,243 logs (154,892/sec) | Received:13,998,743 items (141,401/sec)
2024/09/03 18:16:54  | Sent:15,800,610 logs (154,908/sec) | Received:14,442,229 items (141,590/sec)
2024/09/03 18:16:57  | Sent:16,258,990 logs (154,848/sec) | Received:14,877,316 items (141,688/sec)
2024/09/03 18:17:00  | Sent:16,713,960 logs (154,759/sec) | Received:15,271,694 items (141,404/sec)
2024/09/03 18:17:03  | Sent:17,166,880 logs (154,[65](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10687803539/job/29627380658#step:10:66)7/sec) | Received:15,706,679 items (141,501/sec)
2024/09/03 18:17:06  | Sent:17,635,930 logs (154,702/sec) | Received:16,141,761 items (141,594/sec)
2024/09/03 18:17:09  | Sent:18,105,370 logs (154,747/sec) | Received:16,601,641 items (141,894/sec)
2024/09/03 18:17:12  | Sent:18,573,104 logs (154,776/sec) | Received:17,028,523 items (141,904/sec)
2024/09/03 18:17:15  | Sent:19,029,290 logs (154,710/sec) | Received:17,422,645 items (141,647/sec)
2024/09/03 18:17:18  | Sent:19,496,760 logs (154,737/sec) | Received:17,857,908 items (141,729/sec)
2024/09/03 18:17:21  | Sent:19,956,009 logs (154,698/sec) | Received:18,260,124 items (141,551/sec)
2024/09/03 18:17:24  | Sent:20,421,360 logs (154,708/sec) | Received:18,703,629 items (141,694/sec)
2024/09/03 18:17:27  | Sent:20,893,310 logs (154,765/sec) | Received:19,180,003 items (142,074/sec)
2024/09/03 18:17:30  | Sent:21,359,050 logs (154,776/sec) | Received:19,631,671 items (142,258/sec)
2024/09/03 18:17:33  | Sent:21,816,782 logs (154,729/sec) | Received:20,017,450 items (141,967/sec)
2024/09/03 18:17:36  | Sent:22,290,144 logs (154,793/sec) | Received:20,460,956 items (142,090/sec)
2024/09/03 18:17:39  | Sent:22,764,200 logs (154,859/sec) | Received:20,970,246 items (142,654/sec)
2024/09/03 18:17:42  | Sent:23,240,529 logs (154,937/sec) | Received:21,470,993 items (143,140/sec)
2024/09/03 18:17:45  | Sent:23,698,508 logs (154,892/sec) | Received:21,856,971 items (142,856/sec)
2024/09/03 18:17:48  | Sent:24,159,760 logs (154,870/sec) | Received:22,325,0[66](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10687803539/job/29627380658#step:10:67) items (143,109/sec)
2024/09/03 18:17:51  | Sent:24,625,800 logs (154,879/sec) | Received:22,760,076 items (143,145/sec)
2024/09/03 18:17:54  | Sent:25,094,112 logs (154,902/sec) | Received:23,203,394 items (143,230/sec)
2024/09/03 18:17:57  | Sent:25,563,070 logs (154,928/sec) | Received:23,[67](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10687803539/job/29627380658#step:10:68)9,302 items (143,510/sec)
2024/09/03 18:18:00  | Sent:26,030,629 logs (154,944/sec) | Received:24,130,792 items (143,635/sec)
2024/09/03 18:18:03  | Sent:26,511,[68](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10687803539/job/29627380658#step:10:69)2 logs (155,039/sec) | Received:24,648,234 items (144,141/sec)
2024/09/03 18:18:06  | Sent:26,966,550 logs (154,980/sec) | Received:25,017,796 items (143,780/sec)
2024/09/03 18:18:09  | Sent:27,426,236 logs (154,951/sec) | Received:25,444,624 items (143,755/sec)
2024/09/03 18:18:12  | Sent:27,900,440 logs (155,002/sec) | Received:25,904,307 items (143,912/sec)
2024/09/03 18:18:15  | Sent:28,362,114 logs (154,984/sec) | Received:26,331,400 items (143,887/sec)
2024/09/03 18:18:18  | Sent:28,822,270 logs (154,958/sec) | Received:26,741,998 items (143,774/sec)
2024/09/03 18:18:21  | Sent:29,291,290 logs (154,980/sec) | Received:27,193,772 items (143,882/sec)
2024/09/03 18:18:24  | Sent:29,756,420 logs (154,981/sec) | Received:27,645,373 items (143,986/sec)
2024/09/03 18:18:27  | Sent:30,221,310 logs (154,981/sec) | Received:28,096,960 items (144,087/sec)
2024/09/03 18:18:30  | Sent:30,687,292 logs (154,986/sec) | Received:28,548,800 items (144,185/sec)
2024/09/03 18:18:32 Stopped generator. Sent:31,001,590 logs (154,671/sec)
2024/09/03 18:18:33  | Sent:31,001,590 logs (154,236/sec) | Received:29,016,944 items (144,362/sec)
2024/09/03 18:18:36  | Sent:31,001,590 logs (151,968/sec) | Received:29,468,185 items (144,451/sec)
2024/09/03 18:18:39  | Sent:31,001,590 logs (149,766/sec) | Received:29,952,[69](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10687803539/job/29627380658#step:10:70)5 items (144,698/sec)
2024/09/03 18:18:42  | Sent:31,001,590 logs (147,626/sec) | Received:30,453,464 items (145,016/sec)
2024/09/03 18:18:43 Time out waiting for all logs received
    test_case.go:289: Time out waiting for all logs received
2024/09/03 18:18:43 Gracefully terminating Agent pid=319946, sending SIGTEM...
2024/09/03 18:18:43 Agent process stopped, exit code=0
2024/09/03 18:18:43 Stopping mock backend...
2024/09/03 18:18:43 Stopped backend. Received:30,587,096 items (145,035/sec)
--- FAIL: TestLogLargeFiles (322.21s)
    --- FAIL: TestLogLargeFiles/filelog-largefiles-2Gb-lifetime (111.31s)
    --- FAIL: TestLogLargeFiles/filelog-largefiles-6GB-lifetime (210.90s)
=== RUN   TestLargeFileOnce
2024/09/03 18:19:00 Starting mock backend...
2024/09/03 18:19:00 Starting Agent (/home/ghrunner/actions-runner/_work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/bin/oteltestbedcol_linux_amd64)
2024/09/03 18:19:00 Writing Agent log to /home/ghrunner/actions-runner/_work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/testbed/tests/results/TestLargeFileOnce/agent.log
2024/09/03 18:19:00 Agent running, pid=320035
2024/09/03 18:19:03  | Sent:         0 logs (0/sec) | Received:   402,571 items (134,173/sec)
2024/09/03 18:19:06  | Sent:         0 logs (0/sec) | Received:   8[70](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10687803539/job/29627380658#step:10:71),575 items (145,070/sec)
2024/09/03 18:19:09  | Sent:         0 logs (0/sec) | Received: 1,346,982 items (149,646/sec)
2024/09/03 18:19:12  | Sent:         0 logs (0/sec) | Received: 1,815,047 items (151,247/sec)
2024/09/03 18:19:15  | Sent:         0 logs (0/sec) | Received: 2,250,551 items (150,026/sec)
2024/09/03 18:19:18  | Sent:         0 logs (0/sec) | Received: 2,775,922 items (154,216/sec)
2024/09/03 18:19:21 Time out waiting for all logs received
    test_case.go:289: Time out waiting for all logs received
2024/09/03 18:19:21 Gracefully terminating Agent pid=320035, sending SIGTEM...
2024/09/03 18:19:21 Agent process stopped, exit code=0
2024/09/03 18:19:21 Stopped generator. Sent:         0 logs (0/sec)
2024/09/03 18:19:21 Stopping mock backend...
2024/09/03 18:19:21 Stopped backend. Received: 3,129,025 items (150,944/sec)
--- FAIL: TestLargeFileOnce (38.22s)
FAIL
exit status 1
FAIL	github.com/open-telemetry/opentelemetry-collector-contrib/testbed/tests	360.7[73](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10687803539/job/29627380658#step:10:74)s
make: *** [Makefile:11: run-tests] Error 1
# Test PerformanceResults
Started: Tue, 03 Sep 2024 18:13:21 +0000

Test                                    |Result|Duration|CPU Avg%|CPU Max%|RAM Avg MiB|RAM Max MiB|Sent Items|Received Items|
----------------------------------------|------|-------:|-------:|-------:|----------:|----------:|---------:|-------------:|
LogLargeFiles/filelog-largefiles-2Gb-lifetime|FAIL  |    111s|     0.0|     0.0|          0|          0|  16395063|      16395063|Time out waiting for all logs received
LogLargeFiles/filelog-largefiles-6GB-lifetime|FAIL  |    211s|     0.0|     0.0|          0|          0|  31001590|      30587096|Time out waiting for all logs received
LargeFileOnce                           |FAIL  |     21s|     0.0|     0.0|          0|          0|         0|       3129025|Time out waiting for all logs received

Total duration: 343s
@crobert-1 crobert-1 added the needs triage New item requiring triage label Sep 6, 2024
Copy link
Contributor

github-actions bot commented Sep 6, 2024

Pinging code owners:

  • testbed: @open-telemetry/collector-approvers

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1
Copy link
Member Author

Pinging @VihasMakwana as you likely have more context here. 👍

@VihasMakwana
Copy link
Contributor

@crobert-1 The failures seem to be due to the following reason:

  • We are receiving the logs, but there's a delay in processing them.
  • Currently, we have a 10-second timeout, which works on my local machine but apparently not in the CI environment.
    • I wonder why it passed on my PR though.
  • I’ll propose a fix to extend the timeout. Once that's done, we can monitor the system to see if it resolves the issue.

@VihasMakwana
Copy link
Contributor

@crobert-1 #35054

@crobert-1 crobert-1 added flaky test a test is flaky and removed needs triage New item requiring triage labels Sep 9, 2024
mx-psi pushed a commit that referenced this issue Sep 12, 2024
jriguera pushed a commit to springernature/opentelemetry-collector-contrib that referenced this issue Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flaky test a test is flaky testbed
Projects
None yet
Development

No branches or pull requests

2 participants