Numbers are inverted in MWT and sentence text #26

Numbers written with more than one digit appear with their digits in reverse order in the multi-word token (MWT) form and in the `# text` sentence comment. This occurs throughout the corpus, for example here:

```CoNLL-U
# sent_id = 5930
# text = מ5491 עד 1989 היה זה אזור אסור.
1-2	מ5491	_	_	_	_	_	_	_	_
1	מ	מ	ADP	ADP	_	2	case	_	_
2	1945	1945	NUM	NUM	_	7	nmod	_	_
3	עד	עד	ADP	ADP	_	4	case	_	_
4	1989	1989	NUM	NUM	_	7	nmod	_	_
5	היה	_	AUX	AUX	Gender=Masc|Number=Sing|Person=3|Polarity=Pos|Tense=Past|VerbType=Cop	7	cop	_	_
6	זה	זה	PRON	PRON	Gender=Masc|Number=Sing|Person=3	7	nsubj	_	_
7	אזור	אזור	NOUN	NOUN	Gender=Masc|Number=Sing	0	root	_	_
8	אסור	אסור	ADJ	ADJ	Gender=Masc|Number=Sing	7	amod	_	SpaceAfter=No
9	.	.	PUNCT	PUNCT	_	7	punct	_	_
```
https://github.com/UniversalDependencies/UD_Hebrew-HTB/blob/master/he_htb-ud-test.conllu#L6229-L6230

The second year number in this sentence (1989) is correct in both the token and the sentence text. The first year number should be 1945: it is reversed to 5491 in the MWT line and in the sentence text, but the word token itself (token 2) is correct. I suspect this happens only when the number is part of an MWT, but without the original underlying text it is hard to be sure for numbers that are not obviously years.
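
To get a rough list of affected sentences, something like the following Python sketch might help. It is not part of any UD tooling; the function name `find_reversed_numbers` and the heuristic are my own assumptions. For each MWT range it compares the digit runs (two or more digits) in the MWT surface form with the digit runs in the word tokens it covers, and reports cases where one is the exact reverse of the other. Palindromic numbers such as 1991 cannot be detected this way, and the `# text` line is not checked.

```python
import re

DIGIT_RUN = re.compile(r"\d{2,}")  # two or more adjacent digits


def find_reversed_numbers(conllu_text):
    """Return (sent_id, mwt_form, mwt_digits, token_digits) for each suspect MWT."""
    hits = []
    sent_id = None
    mwt = None  # (end_id, surface_form, [word forms]) for the currently open MWT range
    for line in conllu_text.splitlines():
        if not line.strip():
            mwt = None  # sentence boundary
            continue
        if line.startswith("# sent_id"):
            sent_id = line.split("=", 1)[1].strip()
            continue
        cols = line.split("\t")
        if line.startswith("#") or len(cols) != 10:
            continue
        tok_id, form = cols[0], cols[1]
        if "-" in tok_id:  # MWT range line such as "1-2"
            mwt = (int(tok_id.split("-")[1]), form, [])
            continue
        if mwt is None or not tok_id.isdigit():
            continue
        end, surface, parts = mwt
        parts.append(form)
        if int(tok_id) == end:  # last word of the range: compare digit runs
            surface_runs = DIGIT_RUN.findall(surface)
            word_runs = [run for part in parts for run in DIGIT_RUN.findall(part)]
            for s, w in zip(surface_runs, word_runs):
                if s != w and s == w[::-1]:
                    hits.append((sent_id, surface, s, w))
            mwt = None
    return hits


# The first tokens of the sentence quoted above, with tabs written as \t.
SAMPLE = "\n".join([
    "# sent_id = 5930",
    "# text = מ5491 עד 1989 היה זה אזור אסור.",
    "1-2\tמ5491\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\tמ\tמ\tADP\tADP\t_\t2\tcase\t_\t_",
    "2\t1945\t1945\tNUM\tNUM\t_\t7\tnmod\t_\t_",
    "3\tעד\tעד\tADP\tADP\t_\t4\tcase\t_\t_",
    "4\t1989\t1989\tNUM\tNUM\t_\t7\tnmod\t_\t_",
])

for hit in find_reversed_numbers(SAMPLE):
    print(hit)
```

On the sample, only the MWT in sentence 5930 is flagged; 1989 is not, because it is not inside an MWT range. Running the function over the full `.conllu` file contents should give a candidate list that still needs manual review.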
