ArrayIndexOutOfBoundsException in ByteBlockPool #1003

hidingmyname opened this issue Oct 29, 2024 · 0 comments

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

A field with a very large number of small tokens can cause an ArrayIndexOutOfBoundsException in ByteBlockPool due to an arithmetic overflow.

The issue was originally reported in Lucene (https://issues.apache.org/jira/browse/LUCENE-8614 and https://issues.apache.org/jira/browse/LUCENE-10441): an arithmetic overflow occurs in the byteOffset calculation on the last line of the nextBuffer() method, when ByteBlockPool advances to the next buffer.

Although both Lucene issue reports remain open, the developers have in fact resolved the problem through a PR.

The resolution in Lucene uses Math.addExact to throw an ArithmeticException when the offset overflows in ByteBlockPool. The fix in ByteBlockPool is as follows:

- byteOffset += BYTE_BLOCK_SIZE;
+ byteOffset = Math.addExact(byteOffset, BYTE_BLOCK_SIZE);
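
The effect of this one-line change can be sketched in isolation. The Java snippet below is an illustration only, not the Lucene source: the class and method names are hypothetical, and the block size (1 << 15) and the initial offset (-BYTE_BLOCK_SIZE) are assumed to match those of ByteBlockPool.

```java
public class OffsetOverflowSketch {
    // Assumption: mirrors ByteBlockPool.BYTE_BLOCK_SIZE (1 << 15).
    public static final int BYTE_BLOCK_SIZE = 1 << 15;

    // Old behavior: plain int addition wraps around silently on overflow.
    public static int advanceUnchecked(int byteOffset) {
        return byteOffset + BYTE_BLOCK_SIZE;
    }

    // Fixed behavior: Math.addExact throws ArithmeticException on overflow.
    public static int advanceChecked(int byteOffset) {
        return Math.addExact(byteOffset, BYTE_BLOCK_SIZE);
    }

    // Counts how many checked advances succeed before one throws,
    // starting from the assumed initial offset of -BYTE_BLOCK_SIZE.
    public static int advancesBeforeOverflow() {
        int byteOffset = -BYTE_BLOCK_SIZE;
        int count = 0;
        while (true) {
            try {
                byteOffset = advanceChecked(byteOffset);
                count++;
            } catch (ArithmeticException e) {
                return count;
            }
        }
    }

    public static void main(String[] args) {
        // Largest block-aligned offset that still fits in an int.
        int last = Integer.MAX_VALUE - BYTE_BLOCK_SIZE + 1;
        System.out.println("unchecked next offset: " + advanceUnchecked(last));
        System.out.println("checked advances before overflow: " + advancesBeforeOverflow());
    }
}
```

Under these assumptions, the unchecked version yields a negative offset after the last block that fits, which is the kind of value that can later surface as an out-of-bounds array index, while the checked version fails at the point of overflow.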

A test case is included in the Lucene repo. I have migrated this test case to Lucene.Net, and the test fails. See Steps To Reproduce.

Expected Behavior

Throw an exception when the offset overflows in ByteBlockPool (an OverflowException in .NET, the counterpart of Java's ArithmeticException).

Steps To Reproduce

The migrated test case is provided below:

using System;
using Lucene.Net.Util;
using NUnit.Framework;

namespace TestProject1
{
    [TestFixture]
    public class Test_1
    {
        [Test]
        public void TestTooManyAllocs()
        {
            // Use a mock allocator that doesn't waste memory
            ByteBlockPool pool = new ByteBlockPool(new MockAllocator(0));
            pool.NextBuffer();

            bool throwsException = false;
            int maxIterations = int.MaxValue / ByteBlockPool.BYTE_BLOCK_SIZE + 1;

            for (int i = 0; i < maxIterations; i++)
            {
                try
                {
                    pool.NextBuffer();
                }
                catch (OverflowException)
                {
                    // The offset overflows on the last attempt to call NextBuffer()
                    throwsException = true;
                    break;
                }
            }

            Assert.That(throwsException, Is.True);
            Assert.That(pool.ByteOffset + ByteBlockPool.BYTE_BLOCK_SIZE < pool.ByteOffset, Is.True);
        }

        private class MockAllocator : ByteBlockPool.Allocator
        {
            private readonly byte[] buffer;

            public MockAllocator(int blockSize) : base(blockSize)
            {
                buffer = Array.Empty<byte>();
            }

            public override void RecycleByteBlocks(byte[][] blocks, int start, int end)
            {
                // No-op
            }

            public override byte[] GetByteBlock()
            {
                return buffer;
            }
        }
    }
}

Exceptions (if any)

Assert.That(throwsException, Is.True)
Expected: True
But was: False

Lucene.NET Version

4.8.0-beta00016

.NET Version

8.0.403

Operating System

Windows 10

Anything else?

No response

@paulirwin paulirwin added this to the 4.8.0-beta00018 milestone Nov 18, 2024