Skip to content

Ensure HashSet.TrimExcess() does not change the count #97989

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 6, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 18 additions & 0 deletions src/libraries/Common/tests/System/Collections/TestBase.Generic.cs
Original file line number Diff line number Diff line change
Expand Up @@ -255,6 +255,24 @@ protected IEnumerable<T> CreateHashSet(IEnumerable<T> enumerableToMatchTo, int c
return set;
}

#if NETCOREAPP
/// <summary>
/// Create a HashSet with a specific initial capacity and fill it with a specific number of elements.
/// </summary>
protected HashSet<T> CreateHashSetWithCapacity(int count, int capacity)
{
var set = new HashSet<T>(capacity, GetIEqualityComparer());
int seed = 528;

for (int i = 0; i < count; i++)
{
while (!set.Add(CreateT(seed++)));
}

return set;
}
#endif

/// <summary>
/// Helper function to create an SortedSet fulfilling the given specific parameters. The function will
/// create an SortedSet using the Comparer constructor and then add values
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -212,12 +212,39 @@ public void HashSet_TrimAccessWithInvalidArg_ThrowOutOfRange(int size, int newCa
AssertExtensions.Throws<ArgumentOutOfRangeException>(() => hashSet.TrimExcess(newCapacity));
}

[Fact]
public void TrimExcess_Generic_LargeInitialCapacity_TrimReducesSize()
[Theory]
[InlineData(0, 20, 7)]
[InlineData(10, 20, 10)]
[InlineData(10, 20, 13)]
public void HashHet_Generic_TrimExcess_LargePopulatedHashSet_TrimReducesSize(int initialCount, int initialCapacity, int trimCapacity)
{
HashSet<T> set = CreateHashSetWithCapacity(initialCount, initialCapacity);
HashSet<T> clone = new(set, set.Comparer);

Assert.True(set.Capacity >= initialCapacity);
Assert.Equal(initialCount, set.Count);

set.TrimExcess(trimCapacity);

Assert.True(trimCapacity <= set.Capacity && set.Capacity < initialCapacity);
Assert.Equal(initialCount, set.Count);
Assert.Equal(clone, set);
}

[Theory]
[InlineData(10, 20, 0)]
[InlineData(10, 20, 7)]
public void HashHet_Generic_TrimExcess_LargePopulatedHashSet_TrimCapacityIsLessThanCount_ThrowsArgumentOutOfRangeException(int initialCount, int initialCapacity, int trimCapacity)
{
var set = new HashSet<T>(20);
set.TrimExcess(7);
Assert.Equal(7, set.Capacity);
HashSet<T> set = CreateHashSetWithCapacity(initialCount, initialCapacity);

Assert.True(set.Capacity >= initialCapacity);
Assert.Equal(initialCount, set.Count);

Assert.Throws<ArgumentOutOfRangeException>(() => set.TrimExcess(trimCapacity));

Assert.True(set.Capacity >= initialCapacity);
Assert.Equal(initialCount, set.Count);
}

[Theory]
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1010,7 +1010,7 @@ private void Resize(int newSize, bool forceNewHashCodes)
/// Sets the capacity of a <see cref="HashSet{T}"/> object to the actual number of elements it contains,
/// rounded up to a nearby, implementation-specific value.
/// </summary>
public void TrimExcess() => SetCapacity(Count);
public void TrimExcess() => TrimExcess(Count);

/// <summary>
/// Sets the capacity of a <see cref="HashSet{T}"/> object to the specified number of entries,
Expand All @@ -1022,12 +1022,6 @@ public void TrimExcess(int capacity)
{
ArgumentOutOfRangeException.ThrowIfLessThan(capacity, Count);

SetCapacity(capacity);
}

private void SetCapacity(int capacity)
{
Debug.Assert(capacity >= Count);
int newSize = HashHelpers.GetPrime(capacity);
Entry[]? oldEntries = _entries;
int currentCapacity = oldEntries == null ? 0 : oldEntries.Length;
Expand Down Expand Up @@ -1055,7 +1049,7 @@ private void SetCapacity(int capacity)
}
}

_count = capacity;
_count = count;
_freeCount = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_freeCount = capacity - count; ?

Copy link
Contributor

@weltkante weltkante Feb 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, not necessarily

  • _count is the number of "used" array slots (both free and actual hashset content)
  • _freeCount is the number of free slots within the first _count array slots

So the slots starting from index _count to the end of the backing array are not counted in _freeCount. This is beneficial because you can skip this section when iterating the hashset, so you don't want to include them in the counters. Trimming moves all content to the front of the new array so you get the most out of this repacking by leaving _freeCount at zero.

}

Expand Down