This repo contains the data and code for an investigation into the possibility of using a text's rhythmical pattern as a feature for author attribution.
Henry of Brussels (in Dutch: Hendrik van Brussel) was a poet active in the larger Brussels area between ca. 1250 and ca. 1330. One text can be attributed to him with reasonable certainty, since he incorporated his name in its final verse lines: Van de Coninc Saladijn (English: Of King Saladin).
Three other texts have been more tentatively attributed to Henry:
- His translation of the Roman de la Rose, in Dutch: De Brabantse Rose. Most scholars agree that Henry is the author of this text, since the scribe names a certain 'Henry of Brussels' as the original translator.
- An abridged version of the story of Cassamus, attributed to Henry because it immediately follows the above-mentioned Rose text in the manuscript (and is even partly interlaced with it).
- Finally, the Roman van Limborch, an adventurous chivalric epic. The evidence for this attribution is very thin, however: one verse line mentions a certain "Heinriic" [...] "who made this text".
On the basis of a stylometric investigation, Mike Kestemont has reopened the controversial issue of which Middle Dutch texts can be attributed to Henry of Brussels. His stylometric comparisons show that the Cassamus, the Roman van Limborch and the Brabantse Rose are highly similar in the rhyme vocabulary they use. Because Van de Coninc Saladijn is short, its authorship is more difficult to assess in this way.
Interestingly, Knuvelder has noted that both the Roman van Limborch and De Brabantse Rose share a "well-developed rhythm". This is more or less the intuition of a scholar (albeit one well versed in Middle Dutch poetry). Yet it raises an intriguing question: in addition to their similar rhyme word vocabulary, would it be possible to link these texts based on their rhythmical characteristics?
The data used in this project is Middle Dutch poetry that has been scanned fully automatically. Note: the development of the scansion machine is the subject of previous projects. In essence, each verse line is represented as a string of 0's (unstressed syllables) and 1's (stressed syllables), with one bracketed group per word. For example:
Verse line | Stress pattern |
---|---|
het es gheseit dat in drome | [0][1][0, 1][0][0][1, 0] |
niet ne es dan ydele gome | [1][0][0][0][1, 0, 0][1, 0] |
loghene ende onwaerhede | [1, 0, 0][1, 0][0, 1, 0, 0] |
nochtan heeft men te meniger stede | [0, 1][0][1][0][1, 0, 0][1, 0] |
dicken die drome vonden waer | [1, 0][0][1, 0][1, 0][1] |
ende die dinghen ghescien daer naer | [1, 0][0][1, 0][0, 1][0][1] |
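
To make the notation in the table concrete, here is a minimal Python sketch of how such a stress pattern could be read into per-word syllable lists and flattened into a single 0/1 string. The function names (`parse_stress_pattern`, `flatten`, `stress_ngrams`) are hypothetical and not taken from this repo, and counting stress n-grams is only one assumed way of turning rhythm into a feature; it is not necessarily the approach used in this project.

```python
import re
from collections import Counter


def parse_stress_pattern(pattern):
    """Parse '[0][1][0, 1]' into [[0], [1], [0, 1]] (one inner list per word)."""
    # Each bracketed group holds the comma-separated syllable stresses of one word.
    groups = re.findall(r"\[([01](?:\s*,\s*[01])*)\]", pattern)
    return [[int(s) for s in re.split(r"\s*,\s*", g)] for g in groups]


def flatten(words):
    """Join all syllable stresses of a verse line into one string such as '0101'."""
    return "".join(str(s) for word in words for s in word)


def stress_ngrams(flat, n=3):
    """Count overlapping stress n-grams (an assumed, illustrative feature)."""
    return Counter(flat[i:i + n] for i in range(len(flat) - n + 1))


# First verse line from the table: "het es gheseit dat in drome"
pattern = "[0][1][0, 1][0][0][1, 0]"
words = parse_stress_pattern(pattern)
flat = flatten(words)
trigrams = stress_ngrams(flat, n=3)

print(words)     # [[0], [1], [0, 1], [0], [0], [1, 0]]
print(flat)      # 01010010
print(trigrams)  # counts per stress trigram
```

Keeping the per-word grouping alongside the flattened string means that word boundaries remain available if a later feature needs them.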