Closed
Description
Enhancement Request
Is Your Enhancement Request Related to an Issue?
At present, the Parallel Processing class uses numpy to split datasets into chunks, where
- number of chunks == throttled max threads <= specified max threads
- chunk size == (dataset size // throttled max threads) + (0 or 1)
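For intuition, here is a minimal sketch of that rule (matching what `numpy.array_split` does): each chunk gets the floor of `size / threads` elements, and the remainder is distributed one extra element per chunk from the front.

```python
# Illustration: 10 items split across 3 chunks.
# divmod(10, 3) == (3, 1), so the first 1 chunk gets an
# extra element and the sizes come out as [4, 3, 3].
dataset = list(range(10))
threads = 3

base, overflow = divmod(len(dataset), threads)
sizes = [base + (1 if i < overflow else 0) for i in range(threads)]
print(sizes)  # [4, 3, 3]
```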
Numpy is only used for calculating the chunks, which is hardly the best fit here: numpy is a huge dependency, and pulling it in just for chunking seems impractical.
This got me thinking: would it be more practical to drop numpy for a pure Python alternative, or to stick with numpy's C-backed implementation?
Additional Context
To figure this out, I profiled a pure Python solution against the numpy solution and found that, for a dataset of 10^6 entries:
profilingNP.py

```python
import time

import numpy


def profile(func):
    """Run func 100 times and report its average wall-clock time."""
    def wrapped(*args, **kwargs):
        iterations = 100
        total_time = 0
        for _ in range(iterations):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            total_time += (time.perf_counter() - start)
        avg_time = round(total_time / iterations, 10)
        print(f'{func.__name__} took on average {avg_time}s over {iterations} iterations')
        return result, avg_time
    return wrapped


dataset = list(range(10**6))
threads = 8


# numpy solution
@profile
def np():
    chunks = numpy.array_split(dataset, threads)
    return [chunk.tolist() for chunk in chunks]


# pure python solution
@profile
def pure():
    length = len(dataset)
    chunk_count = length // threads
    overflow = length % threads
    i = 0
    final = []
    while i < length:
        # The first `overflow` chunks get one extra element each
        chunk_length = chunk_count + int(overflow > 0)
        b = i + chunk_length
        final.append(dataset[i:b])
        overflow -= 1
        i = b
    return final


if __name__ == '__main__':
    npResult, npTime = np()
    pureResult, pureTime = pure()
    print(f'Pure python was {round(((npTime - pureTime) / npTime) * 100, 10)}% faster than the numpy solution')
    assert npResult == pureResult, 'There was an algorithm error'
```