LdaModel trains beyond size of corpus when using an iterable #2553
Description
Problem description
When streaming documents/bags of words to LdaModel via a custom iterable, LdaModel will train beyond the size of the corpus, with output like

19-07-05 22:53:43 PROGRESS: pass 0, at document #178000/50000

where the number to the left of the / is higher than the number to the right of it.
Steps/code/corpus to reproduce
from gensim.models import LdaModel
import logging

logging.basicConfig(format='%(asctime)s %(message)s',
                    datefmt='%y-%m-%d %H:%M:%S', level=logging.INFO)

class TestIterable:
    """Yields the same bag-of-words 50,000 times per iteration."""

    def __init__(self):
        self.bag_of_words = [(0, 2), (3, 1), (6, 1), (100, 2)]
        self.cursor = 0

    def __iter__(self):
        self.cursor = 0
        logging.info('TestIterable() __iter__ was called')
        return self

    def __next__(self):
        if self.cursor < 50000:
            self.cursor += 1
            return self.bag_of_words
        else:
            logging.info('TestIterable() raised StopIteration')
            raise StopIteration

corpus = TestIterable()
# uncommenting this line will turn the corpus into a plain list
# corpus = [document for document in corpus]

logging.info('performing lda training')
trained_model = LdaModel(corpus, num_topics=2)
Using the TestIterable() results in LdaModel training indefinitely. Converting the TestIterable() corpus to a list leads to the expected result of a proper training run.
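As a sanity check, the iterable behaves as expected outside of gensim: iterating it by hand stops after exactly 50,000 documents, and it restarts cleanly because __iter__ resets the cursor. A minimal check (my own addition, not part of the failing run):

corpus = TestIterable()
# a full pass yields exactly 50,000 documents, then StopIteration
assert sum(1 for _ in corpus) == 50000
# a second pass restarts from zero, since __iter__ resets self.cursor
assert sum(1 for _ in corpus) == 50000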
I have not written many iterables so far, so of course there could be a problem on my end. But as far as I can infer from the LdaModel documentation, all that is required is an iterable -- and to the best of my knowledge, corpus = TestIterable() is a proper iterable, and an iterator as well.
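One difference that might matter: TestIterable is its own iterator (__iter__ returns self), so any two iterations share the same cursor. A corpus whose __iter__ hands out a fresh generator per call avoids that sharing; a minimal sketch of that variant (my guess at what LdaModel may expect, not something the docs state):

class RestartableCorpus:
    """Iterable but not an iterator: each iter() call gets a fresh generator."""

    def __init__(self, n_docs=50000):
        self.bag_of_words = [(0, 2), (3, 1), (6, 1), (100, 2)]
        self.n_docs = n_docs

    def __iter__(self):
        # a new independent generator per call; overlapping iterations
        # cannot reset or advance each other's position
        return (self.bag_of_words for _ in range(self.n_docs))

With this variant, if LdaModel calls iter() on the corpus again mid-pass (e.g. for a second pass, or to count documents), the first iteration is unaffected, whereas with TestIterable the shared self.cursor would be reset.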
Thanks a lot!
Versions
Linux-3.10.0-862.14.4.el7.x86_64-x86_64-with-centos-7.5.1804-Core
Python 3.6.4 (default, Apr 10 2018, 07:54:00)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-16)]
NumPy 1.14.2
SciPy 1.0.1
gensim 3.7.3
FAST_VERSION 0