<citedRange unit="entry"><foreign>...</foreign></citedRange> #310
Comments
I would rather not allow the use of XML elements except for So, if @danbalogh is OK with that, I would suggest we only allow plain text in |
So it seems you are suggesting that I should encode as follows:
That could work, though I'd prefer a solution that doesn't force me to diverge from the usual pattern only because I want to see italics. I imagine that people will only want to see italics in cases where the @unit of |
@arlogriffiths @danbalogh @manufrancis This can be made to work. But we first need to decide what are the criteria for determining whether the contents of We should at least have a way to specify unambiguously whether there is a single item or many. To remove ambiguities, I propose we use the plural form of |
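To make the singular/plural proposal concrete, here is a minimal sketch of how the two forms might look. The plural value `pages` only illustrates the proposal and is not an established value in the guide, and the range 319-320 is invented for the example; only `bib:Goris1954_01` and page 319 come from this issue.

```xml
<!-- Single item: singular unit, as in the current encoding -->
<bibl><ptr target="bib:Goris1954_01"/><citedRange unit="page">319</citedRange></bibl>

<!-- Several items: hypothetical plural unit, so the display does not have to guess -->
<bibl><ptr target="bib:Goris1954_01"/><citedRange unit="pages">319-320</citedRange></bibl>
```

With an explicit plural value the display transformation could pick the label from the attribute alone, without inspecting the contents of the element.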
My number one comment on this, you can probably guess, is that this is a fine detail to which we should not devote a lot of time and effort. Number two. @michaelnmmeyer, I'm completely OK with not permitting XML elements within citedRange at all; I'm also OK with permitting them in Three. Why not make the display of Four. I don't know in what way the current solution for determining singular/plural unit does not work. I don't recall the details, but if it doesn't work as expected, couldn't the display transformation be tweaked further? I would very much dislike a further complication of our already hellishly complicated reference encoding with the introduction of units like "pages" etc. In addition to the increased (practically doubled) complexity, I have the following concerns with Michaël's [edited] note that conversion to this in existing files can be automated. The smaller one: OK, so conversion can be automated, and we or Michaël make the change in all existing files on date X. Can we realistically expect all encoders to switch to the new system consistently from date X onward, or would the auto-conversion have to be repeated regularly? The bigger one: if conversion can indeed be automated, then why can't the same algorithm that makes the conversion in the files be used in the display transformation to achieve the desired display without altering and complicating the code? |
For 3). We have a few cases where several entries are given (as in For 4). The problem is that the format of references is unrestricted, but that the app is still supposed to guess whether they refer to a single item or to multiple ones, and thus often produces "wrong" results. There is no way to fix this besides encoding the reference explicitly with |
Anything you and the PIs can agree on will be acceptable for me, so there is no need to go along with my wishes here. However, for 4) you have not answered my main concern: how can it be possible to automate replacing the value of But I repeat: anything is acceptable to me. |
For 4), manual corrections would indeed be needed. I would rather simplify the current encoding than complicate it. My position is that we would be much happier if we just abandoned all the
everywhere. |
Since I don't think we ever want to make those references machine-actionable to the level of |
Indeed, the main reason why we introduced I don't understand what people could be dissatisfied about now that we have the option @unit="mixed", which gives complete freedom, doesn't it? Any "hacked" usage is probably due to people being unaware of the option @unit="mixed". I am flexible about any mix of the variables presented so far, as long as we leave the basics of the present system intact. Notably, I am willing to play along with the proposal to introduce explicit encoding of plural in the values of @unit and the partially automated, partially manual path to implementing the change that Michaël proposes. But I am also able to accept sticking to singular values only in exchange for some loss of flexibility elsewhere in order for the machine to be able to tell whether sg. or pl. is intended. |
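For readers unfamiliar with the `mixed` option mentioned above, a rough sketch of what such a reference might look like follows. The wording inside the element is an assumption made up for illustration (the issue does not prescribe a format for free-text contents); the bibliography key, volume, page, and entry word are taken from the example in the issue description.

```xml
<!-- unit="mixed": the encoder writes the whole range as free text,
     so the display shows it as typed and adds no unit label of its own -->
<bibl>
  <ptr target="bib:Goris1954_01"/>
  <citedRange unit="mixed">vol. 2, p. 319, s.v. tanggung</citedRange>
</bibl>
```

The trade-off is that nothing inside a `mixed` range is machine-readable: volume, page, and entry can no longer be told apart by the attribute.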
To be noted that Dan's proposal is close to LaTeX's behaviour: you have a special case for citing pages (e.g. |
even if it's only 50% of our references that we're talking about, I insist that we need a structuring mechanism such as the one we have in place. |
Fair enough, let's forget about discarding the existing units. This takes us back to the point where we need solutions for the following details:
Anything I missed? For 1, my preferred solution would be to stick to the present units, and let the display transformation algorithm take care of plural display. Since this does not work perfectly in all circumstances, we need information about the cases where it does not (or is not expected to) work correctly, and assess whether any of those cases are systematic. For the systematic cases, it may be possible to add sub-rules for the transformation algorithm. For the non-systematic cases, we would then have to change the problematic citations to For 2, I think the best solution is to prescribe that In addition, regarding what I anticipate to be systematic cases in 1, I think it would make sense to prescribe in the EGD and EGC that the contents of |
I think it is important to make transformation rules simple enough and easy to remember, so that people can predict what the output will look like and so that they have a chance to remember them. Perhaps more importantly, they should not change (even for "improvements"), because this would inevitably introduce mistakes in existing entries. So, I propose to stick to the core of Dan's comment. We would have:
|
All of this is acceptable to me, provided that the PIs are happy with it. The one thing that worries me is that, at least for the Indian subcontinent, there is a huge number of references to ARIE appendices for which we had specifically required the format |
My current proposal is as follows. I'm numbering each item so that the rest of you can refer to them easily.
Opinions, please. |
Thanks Daniel! My general stand is: I am in favour of straightforward rules with as few exceptions as possible.
2.A. OK. I fully agree with "We should not need an idiosyncratic label for the sake of one publication, no matter how fundamental." 2.B. very OK for explicit plural labels.
4.A. I would dispense with automated plural detection (since with 2.B we will have values for explicit plural labels). 4.A.1 I can live without en dash, that is without machine en dash recognition (but those who like it would just need to type them in their XML files). 4.B. very OK. 4.C. OK for italicisation of contents of 4.D.1 Seems to me a case of idiosyncratic practice for very rare cases. Those who need to refer to several "entries" could just enter as many 4.E. [edit: newly added] Instead of explicit plurals, for some or all of the rarer units (to be discussed which), we could instead suggest using |
Thank you, Manu. Some comments on your comments.
|
I am wondering if a cost/benefit analysis has been made. What would we lose if we just kept our EGD as it is? How much automation can be brought to bear if we implement all these numerous changes? Since, as Manu says, TIME FLIES, I am not so eager to be obliged to manually adjust the @unit values of thousands of <citedRange> elements, if this can be avoided either by automating the adjustment process or by not changing the encoding rules at all.
On 31 Oct. 2024 at 15:53, manufrancis wrote:
Thanks Daniel!
My general stand is:
I am in favour of straightforward rules with as few exceptions as possible.
I would in general be as explicit as possible so as to avoid machine plural recognition or machine en dash recognition. There will be, I guess, many unrecognised or unforeseeable cases.
I think Michaël should not devote time to "trivial" things such as en dash, italicisation of entries, etc. TIME FLIES.
1. I would rather avoid such an exception, which would result in people not using @unit when they should.
2.A. OK. I fully agree with "We should not need an idiosyncratic label for the sake of one publication, no matter how fundamental."
2.B. very OK for explicit plural labels.
1. OK with no XML elements at all within <citedRange>. If someone feels the need to use <foreign> in <Bibl>, let him italicise words in free-text in the epigraphical lemma.
4.A. I would dispense with automated plural detection (since with 2.B we will have values for explicit plural labels).
4.A.1 I can live without en dash, that is without machine en dash recognition (but those who like it would just need to type them in their XML files).
4.B. very OK.
4.C. OK for italicisation of contents of <citedRange unit="entry"> (the more so as it does not require using <foreign>).
4.D.1 Seems to me a case of idiosyncratic practice for very rare cases. Those who need to refer to several "entries" could just enter as many <Bibl> as they have entries (or, if permitted, but I guess it is not, have several <citedRange unit="entry"> in a <Bibl>) (or be explicit in free-text in the epigraphical lemma).
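The two alternatives raised in 4.D.1 can be sketched as follows. This is only an illustration: `SECOND-ENTRY` is a placeholder rather than a real headword, and whether the second form (several `<citedRange unit="entry">` in one `<bibl>`) is permitted is exactly what the comment leaves open.

```xml
<!-- Alternative 1: one <bibl> per entry -->
<bibl><ptr target="bib:Goris1954_01"/><citedRange unit="entry">tanggung</citedRange></bibl>
<bibl><ptr target="bib:Goris1954_01"/><citedRange unit="entry">SECOND-ENTRY</citedRange></bibl>

<!-- Alternative 2 (possibly not permitted): several entry ranges in one <bibl> -->
<bibl>
  <ptr target="bib:Goris1954_01"/>
  <citedRange unit="entry">tanggung</citedRange>
  <citedRange unit="entry">SECOND-ENTRY</citedRange>
</bibl>
```

In both forms each `<citedRange>` holds exactly one entry as plain text, so italics can be applied to the whole content without needing `<foreign>`.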
|
Keeping things as they are is also acceptable to me. My impression is that the main motive for revising the citation system has always been your (@arlogriffiths) desire to get meticulously styled displays, with plural units when applicable and italics where desired but nowhere else.
That's it. I don't think there could be thousands of these in the corpus. I'm OK to live with the costs and also OK to live without the benefits, so whatever you PIs can agree on. |
I concur with Dan:
And I am ready to update my XML files accordingly (and to have my team update theirs). |
@arlogriffiths : the ball seems to be in your court. |
code:
<bibl><ptr target="bib:Goris1954_01"/><citedRange unit="volume">2</citedRange><citedRange unit="page">319</citedRange><citedRange unit="entry"><foreign>tanggung</foreign></citedRange></bibl>
display:
issue: The use of <foreign> in the above context does not yet have the desired effect in display.