Hi,
I encountered this bug while trying to scrape a specific site:
`
from inscriptis import get_annotated_text
from inscriptis.model.config import ParserConfig

page = """
|
|
rules = {'ul': ['ul'], 'table': ['table']}
output = get_annotated_text(page, ParserConfig(annotation_rules=rules))
# output: {'text': ' * item1 * item5\n * item2 * item6\n * item3 * item7\n * item4 * item8\n', 'label': [(0, 85, 'table'), (0, 40, 'ul'), (11, 51, 'ul')]}
start_index, end_index, annotation = output['label'][1]
print(repr(output['text'][start_index:end_index]))  # ' * item1 * item5\n * item2 * item'
`
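As can be seen, slicing the text with a label's offsets does not return the text of the labelled element: the `ul` span cuts across both columns and ends mid-item, so the offsets are inaccurate when the annotated content sits inside a table.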
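For anyone trying to reproduce this, a small helper like the one below makes the mismatch easier to see by listing what every label slices to. This is only a sketch: `spans_by_label` is a hypothetical name (not part of the library), and `sample` is hand-built toy data that merely assumes the same `{'text': ..., 'label': [(start, end, name), ...]}` shape as the output above.

```python
def spans_by_label(annotated):
    """Return (name, start, end, sliced_text) for every annotation.

    Assumes `annotated` is a dict with a 'text' string and a 'label'
    list of (start, end, name) tuples, as in the output shown above.
    """
    text = annotated["text"]
    return [
        (name, start, end, text[start:end])
        for start, end, name in annotated["label"]
    ]


# Hypothetical toy data, not produced by the library.
sample = {
    "text": "  * a\n  * b\n",
    "label": [(0, 12, "ul")],
}

for name, start, end, snippet in spans_by_label(sample):
    print(f"{name} [{start}:{end}] -> {snippet!r}")
```

Running the same loop over the real `output` should show each `ul` slice next to its offsets, which makes it clear where a span drifts into the neighbouring column.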