Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[chore]: fix go routine leaks in tests #34729

Open
wants to merge 19 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Makefile.Common
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,7 @@ GCI := $(TOOLS_BIN_DIR)/gci
GOTESTSUM := $(TOOLS_BIN_DIR)/gotestsum
TESTIFYLINT := $(TOOLS_BIN_DIR)/testifylint

GOTESTSUM_OPT?= --rerun-fails=1
GOTESTSUM_OPT?=
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this option removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enabling the option led to re-running the failed tests (which mostly failed to to go routine leaks and race conditions), which always led to a passed test. This behavior was from some reason blocking the actual problems which were shown after it was removed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this in #31253 because we have a problem with flaky tests and not enough resources to deal with them. I feel like this is still the case and that if we remove this we will get a more flaky CI which is detrimental to contributors. Is there any way we can detect leaks while still re-running other unit tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not aware of any possibility here. In fact it seems very strange to me that all of the failing tests become passed after re-running them for the 2. time.

We can leave the re-run option in place until we find the root cause why the re-executed tests have very deterministic behavior in terms of go-leaks and data-races, but we should at least consider merging the fixed tests as part of this PR.

Maybe it can be a caching issue in the gotestsum wrapper, not sure here...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you report this upstream on the gotestsum repository then to see if it's actually an issue with them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reported here

But there seems to be a bigger problem with the --rerun-fails option, as I see multiple issues created on the upstream related to it https://github.com/gotestyourself/gotestsum/issues

TESTIFYLINT_OPT?= --enable-all --disable=float-compare,require-error,suite-subtest-run

# BUILD_TYPE should be one of (dev, release).
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions exporter/awscloudwatchlogsexporter/metadata.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -22,3 +22,5 @@ tests:
top:
# See https://github.com/census-instrumentation/opencensus-go/issues/1191 for more information.
- "go.opencensus.io/stats/view.(*worker).start"
- "net/http.(*persistConn).writeLoop"
- "internal/poll.runtime_pollWait"
2 changes: 1 addition & 1 deletion exporter/awss3exporter/generated_package_test.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 5 additions & 0 deletions exporter/awss3exporter/metadata.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -10,3 +10,8 @@ status:

tests:
expect_consumer_error: true
goleak:
ignore:
top:
- "net/http.(*persistConn).writeLoop"
- "internal/poll.runtime_pollWait"
1 change: 1 addition & 0 deletions extension/observer/dockerobserver/extension.go
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,7 @@ func (d *dockerObserver) Start(ctx context.Context, _ component.Host) error {
}

func (d *dockerObserver) Shutdown(_ context.Context) error {
d.StopListAndWatch()
d.cancel()
return nil
}
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

7 changes: 6 additions & 1 deletion extension/observer/dockerobserver/metadata.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -11,4 +11,9 @@ status:
# TODO: The tests are not passing on Windows. Either fix them or mark component as not supported on Windows.
tests:
skip_lifecycle: true
skip_shutdown: true
skip_shutdown: true
goleak:
ignore:
top:
- "net/http.(*persistConn).writeLoop"
- "internal/poll.runtime_pollWait"
3 changes: 3 additions & 0 deletions extension/observer/k8sobserver/extension_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -127,6 +127,7 @@ func TestExtensionObserveServices(t *testing.T) {
}, sink.removed[0])

require.NoError(t, ext.Shutdown(context.Background()))
obs.StopListAndWatch()
}

func TestExtensionObservePods(t *testing.T) {
Expand Down Expand Up @@ -209,6 +210,7 @@ func TestExtensionObservePods(t *testing.T) {
}, sink.removed[0])

require.NoError(t, ext.Shutdown(context.Background()))
obs.StopListAndWatch()
}

func TestExtensionObserveNodes(t *testing.T) {
Expand Down Expand Up @@ -308,4 +310,5 @@ func TestExtensionObserveNodes(t *testing.T) {
}, sink.removed[0])

require.NoError(t, ext.Shutdown(context.Background()))
obs.StopListAndWatch()
}
2 changes: 1 addition & 1 deletion extension/observer/k8sobserver/generated_package_test.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 4 additions & 0 deletions extension/observer/k8sobserver/metadata.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -12,3 +12,7 @@ status:
tests:
skip_lifecycle: true
skip_shutdown: true
goleak:
ignore:
top:
- "k8s.io/apimachinery/pkg/watch.(*Broadcaster).loop"
2 changes: 1 addition & 1 deletion extension/sigv4authextension/generated_package_test.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 5 additions & 0 deletions extension/sigv4authextension/metadata.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -10,4 +10,9 @@ status:
active: [Aneurysm9, erichsueh3]

tests:
goleak:
ignore:
top:
- "net/http.(*persistConn).writeLoop"
- "internal/poll.runtime_pollWait"
config:
2 changes: 1 addition & 1 deletion internal/aws/metrics/metric_calculator.go
Original file line number Diff line number Diff line change
Expand Up @@ -122,7 +122,7 @@ type MapWithExpiry struct {
// NewMapWithExpiry automatically starts a sweeper to enforce the maps TTL. ShutDown() must be called to ensure that these
// go routines are properly cleaned up ShutDown() must be called.
func NewMapWithExpiry(ttl time.Duration) *MapWithExpiry {
m := &MapWithExpiry{lock: &sync.Mutex{}, ttl: ttl, entries: make(map[any]*MetricValue), doneChan: make(chan struct{})}
m := &MapWithExpiry{lock: &sync.Mutex{}, ttl: ttl, entries: make(map[any]*MetricValue), doneChan: make(chan struct{}, 1000)}
go m.sweep(m.CleanUp)
return m
}
Expand Down
2 changes: 0 additions & 2 deletions processor/geoipprocessor/integration_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,6 @@ import (
)

func TestProcessorWithMaxMind(t *testing.T) {
t.Parallel()

tmpDBfiles := testdata.GenerateLocalDB(t, "./internal/provider/maxmindprovider/testdata/")
defer os.RemoveAll(tmpDBfiles)

Expand Down
2 changes: 1 addition & 1 deletion receiver/aerospikereceiver/generated_package_test.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 8 additions & 0 deletions receiver/aerospikereceiver/metadata.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,14 @@ status:
active: [djaglowski, antonblock]
seeking_new: true

tests:
goleak:
ignore:
top:
- "github.com/aerospike/aerospike-client-go/v7.(*baseMultiCommand).parseRecordResults"
- "github.com/aerospike/aerospike-client-go/v7.(*Cluster).clusterBoss"
- "sync.runtime_Semacquire"

resource_attributes:
aerospike.node.name:
description: Name of the Aerospike node collected from
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -122,6 +122,7 @@ type Cadvisor struct {
k8sDecorator Decorator
ecsInfo EcsInfo
containerOrchestrator string
metricsExtractors []extractors.MetricExtractor
}

func init() {
Expand Down Expand Up @@ -158,15 +159,13 @@ func New(containerOrchestrator string, hostInfo hostInfo, logger *zap.Logger, op
return c, nil
}

var metricsExtractors = []extractors.MetricExtractor{}

func GetMetricsExtractors() []extractors.MetricExtractor {
return metricsExtractors
func (c *Cadvisor) GetMetricsExtractors() []extractors.MetricExtractor {
return c.metricsExtractors
}

func (c *Cadvisor) Shutdown() error {
var errs error
for _, ext := range metricsExtractors {
for _, ext := range c.metricsExtractors {
errs = errors.Join(errs, ext.Shutdown())
}

Expand Down Expand Up @@ -341,7 +340,7 @@ func (c *Cadvisor) GetMetrics() []pmetric.Metrics {
return result
}

out := processContainers(containerinfos, c.hostInfo, c.containerOrchestrator, c.logger)
out := processContainers(containerinfos, c.hostInfo, c.containerOrchestrator, c.logger, c.GetMetricsExtractors())
results := c.decorateMetrics(out)

if c.containerOrchestrator == ci.ECS {
Expand Down Expand Up @@ -394,12 +393,12 @@ func (c *Cadvisor) initManager(createManager createCadvisorManager) error {
return err
}

metricsExtractors = []extractors.MetricExtractor{}
metricsExtractors = append(metricsExtractors, extractors.NewCPUMetricExtractor(c.logger))
metricsExtractors = append(metricsExtractors, extractors.NewMemMetricExtractor(c.logger))
metricsExtractors = append(metricsExtractors, extractors.NewDiskIOMetricExtractor(c.logger))
metricsExtractors = append(metricsExtractors, extractors.NewNetMetricExtractor(c.logger))
metricsExtractors = append(metricsExtractors, extractors.NewFileSystemMetricExtractor(c.logger))
c.metricsExtractors = make([]extractors.MetricExtractor, 0, 5)
c.metricsExtractors = append(c.metricsExtractors, extractors.NewCPUMetricExtractor(c.logger))
c.metricsExtractors = append(c.metricsExtractors, extractors.NewMemMetricExtractor(c.logger))
c.metricsExtractors = append(c.metricsExtractors, extractors.NewDiskIOMetricExtractor(c.logger))
c.metricsExtractors = append(c.metricsExtractors, extractors.NewNetMetricExtractor(c.logger))
c.metricsExtractors = append(c.metricsExtractors, extractors.NewFileSystemMetricExtractor(c.logger))

return nil
}
Original file line number Diff line number Diff line change
Expand Up @@ -89,6 +89,7 @@ func TestGetMetrics(t *testing.T) {
assert.NotNil(t, c)
assert.NoError(t, err)
assert.NotNil(t, c.GetMetrics())
assert.NoError(t, c.Shutdown())
}

func TestGetMetricsNoEnv(t *testing.T) {
Expand All @@ -109,6 +110,7 @@ func TestGetMetricsNoClusterName(t *testing.T) {
assert.NotNil(t, c)
assert.NoError(t, err)
assert.Nil(t, c.GetMetrics())
assert.NoError(t, c.Shutdown())
}

func TestGetMetricsErrorWhenCreatingManager(t *testing.T) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ type podKey struct {
namespace string
}

func processContainers(cInfos []*cInfo.ContainerInfo, mInfo extractors.CPUMemInfoProvider, containerOrchestrator string, logger *zap.Logger) []*extractors.CAdvisorMetric {
func processContainers(cInfos []*cInfo.ContainerInfo, mInfo extractors.CPUMemInfoProvider, containerOrchestrator string, logger *zap.Logger, metricExtractors []extractors.MetricExtractor) []*extractors.CAdvisorMetric {
var metrics []*extractors.CAdvisorMetric
podKeys := make(map[string]podKey)

Expand All @@ -47,7 +47,7 @@ func processContainers(cInfos []*cInfo.ContainerInfo, mInfo extractors.CPUMemInf
if len(info.Stats) == 0 {
continue
}
outMetrics, outPodKey, err := processContainer(info, mInfo, containerOrchestrator, logger)
outMetrics, outPodKey, err := processContainer(info, mInfo, containerOrchestrator, logger, metricExtractors)
if err != nil {
logger.Warn("drop some container info", zap.Error(err))
continue
Expand All @@ -73,7 +73,7 @@ func processContainers(cInfos []*cInfo.ContainerInfo, mInfo extractors.CPUMemInf
continue
}

metrics = append(metrics, processPod(info, mInfo, podKeys, logger)...)
metrics = append(metrics, processPod(info, mInfo, podKeys, logger, metricExtractors)...)
}

// This happens when our cgroup path based pod detection logic is not working.
Expand All @@ -87,7 +87,7 @@ func processContainers(cInfos []*cInfo.ContainerInfo, mInfo extractors.CPUMemInf
}

// processContainers get metrics for individual container and gather information for pod so we can look it up later.
func processContainer(info *cInfo.ContainerInfo, mInfo extractors.CPUMemInfoProvider, containerOrchestrator string, logger *zap.Logger) ([]*extractors.CAdvisorMetric, *podKey, error) {
func processContainer(info *cInfo.ContainerInfo, mInfo extractors.CPUMemInfoProvider, containerOrchestrator string, logger *zap.Logger, metricExtractors []extractors.MetricExtractor) ([]*extractors.CAdvisorMetric, *podKey, error) {
var result []*extractors.CAdvisorMetric
var pKey *podKey

Expand Down Expand Up @@ -152,7 +152,7 @@ func processContainer(info *cInfo.ContainerInfo, mInfo extractors.CPUMemInfoProv

tags[ci.Timestamp] = strconv.FormatInt(extractors.GetStats(info).Timestamp.UnixNano(), 10)

for _, extractor := range GetMetricsExtractors() {
for _, extractor := range metricExtractors {
if extractor.HasValue(info) {
result = append(result, extractor.GetValue(info, mInfo, containerType)...)
}
Expand All @@ -164,7 +164,7 @@ func processContainer(info *cInfo.ContainerInfo, mInfo extractors.CPUMemInfoProv
return result, pKey, nil
}

func processPod(info *cInfo.ContainerInfo, mInfo extractors.CPUMemInfoProvider, podKeys map[string]podKey, logger *zap.Logger) []*extractors.CAdvisorMetric {
func processPod(info *cInfo.ContainerInfo, mInfo extractors.CPUMemInfoProvider, podKeys map[string]podKey, logger *zap.Logger, metricExtractors []extractors.MetricExtractor) []*extractors.CAdvisorMetric {
var result []*extractors.CAdvisorMetric
if isContainerInContainer(info.Name) {
logger.Debug("drop metric because it's nested container", zap.String("name", info.Name))
Expand All @@ -183,7 +183,7 @@ func processPod(info *cInfo.ContainerInfo, mInfo extractors.CPUMemInfoProvider,

tags[ci.Timestamp] = strconv.FormatInt(extractors.GetStats(info).Timestamp.UnixNano(), 10)

for _, extractor := range GetMetricsExtractors() {
for _, extractor := range metricExtractors {
if extractor.HasValue(info) {
result = append(result, extractor.GetValue(info, mInfo, ci.TypePod)...)
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ import (
"testing"

"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"go.uber.org/zap"

"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awscontainerinsightreceiver/internal/cadvisor/extractors"
Expand Down Expand Up @@ -49,8 +50,7 @@ func TestIsContainerInContainer(t *testing.T) {

func TestProcessContainers(t *testing.T) {
// set the metrics extractors for testing
originalMetricsExtractors := metricsExtractors
metricsExtractors = []extractors.MetricExtractor{}
metricsExtractors := []extractors.MetricExtractor{}
metricsExtractors = append(metricsExtractors, extractors.NewCPUMetricExtractor(zap.NewNop()))
metricsExtractors = append(metricsExtractors, extractors.NewMemMetricExtractor(zap.NewNop()))
metricsExtractors = append(metricsExtractors, extractors.NewDiskIOMetricExtractor(zap.NewNop()))
Expand All @@ -63,9 +63,10 @@ func TestProcessContainers(t *testing.T) {
containerInContainerInfos := testutils.LoadContainerInfo(t, "./extractors/testdata/ContainerInContainer.json")
containerInfos = append(containerInfos, containerInContainerInfos...)
mInfo := testutils.MockCPUMemInfo{}
metrics := processContainers(containerInfos, mInfo, "eks", zap.NewNop())
metrics := processContainers(containerInfos, mInfo, "eks", zap.NewNop(), metricsExtractors)
assert.Len(t, metrics, 3)

// restore the original value of metrics extractors
metricsExtractors = originalMetricsExtractors
for _, e := range metricsExtractors {
require.NoError(t, e.Shutdown())
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ func TestNetStats(t *testing.T) {

containerType := ci.TypeNode
extractor := NewNetMetricExtractor(nil)
defer require.NoError(t, extractor.Shutdown())
var cMetrics []*CAdvisorMetric
if extractor.HasValue(result[0]) {
cMetrics = extractor.GetValue(result[0], nil, containerType)
Expand Down Expand Up @@ -156,5 +157,4 @@ func TestNetStats(t *testing.T) {
for i := range expectedFields {
AssertContainsTaggedField(t, cMetrics[i], expectedFields[i], expectedTags[i])
}
require.NoError(t, extractor.Shutdown())
}
2 changes: 0 additions & 2 deletions receiver/datadogreceiver/receiver_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -213,8 +213,6 @@ func TestDatadogInfoEndpoint(t *testing.T) {
} {
tc := tc
t.Run(tc.name, func(t *testing.T) {
t.Parallel()

cfg := createDefaultConfig().(*Config)
cfg.Endpoint = "localhost:0" // Using a randomly assigned address

Expand Down
2 changes: 1 addition & 1 deletion receiver/dockerstatsreceiver/generated_package_test.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 8 additions & 0 deletions receiver/dockerstatsreceiver/metadata.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,14 @@ status:

sem_conv_version: 1.6.1

tests:
goleak:
ignore:
top:
- "github.com/testcontainers/testcontainers-go.(*Reaper).Connect.func1"
- "net/http.(*persistConn).writeLoop"
- "internal/poll.runtime_pollWait"

# Note: there are other, additional resource attributes that the user can configure through the yaml
resource_attributes:
container.runtime:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -147,7 +147,7 @@ func TestItemCardinalityFilter_Filter(t *testing.T) {
assert.Empty(t, filteredItems)

// Doing this to avoid of relying on timeouts and sleeps(avoid potential flaky tests)
syncChannel := make(chan bool)
syncChannel := make(chan bool, 10)

filterCasted.cache.SetExpirationCallback(func(string, any) {
if filterCasted.cache.Count() > 0 {
Expand Down Expand Up @@ -208,7 +208,7 @@ func TestItemCardinalityFilter_FilterItems(t *testing.T) {
assert.Len(t, filteredItems, totalLimit)

// Doing this to avoid of relying on timeouts and sleeps(avoid potential flaky tests)
syncChannel := make(chan bool)
syncChannel := make(chan bool, 10)

filterCasted.cache.SetExpirationCallback(func(string, any) {
if filterCasted.cache.Count() > 0 {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,5 +10,5 @@ import (
)

func TestMain(m *testing.M) {
goleak.VerifyTestMain(m)
goleak.VerifyTestMain(m, goleak.IgnoreTopFunction("github.com/ReneKroon/ttlcache/v2.(*Cache).checkExpirationCallback"))
}
Loading
Loading