Taxonomies populated during migration cannot contain term names longer than 255 characters #1017
Comments
I've used the title_length module for nodes with long titles. The module is relatively simple and could probably be adapted to extend the length of name fields. Of course, you could still get really long values. If you are comfortable trimming the value to the max length, use the substr process plugin:
You can even log all of your name values...
and come back after the fact to find any that were longer than 255 (and thus trimmed).
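The trim-and-log idea can be sketched outside Drupal as follows. This is a minimal Python illustration of the behaviour, not the migration configuration referred to above; `trim_term_name`, `MAX_NAME_LENGTH`, and the logger name are hypothetical, and the only value taken from this issue is the 255-character limit.

```python
import logging

# Limit on taxonomy term names cited in this issue.
MAX_NAME_LENGTH = 255

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("term_names")


def trim_term_name(name, max_length=MAX_NAME_LENGTH):
    """Return name cut to max_length characters, logging when a trim happens."""
    if len(name) > max_length:
        # Log the full original value so it can be reviewed after the run.
        logger.warning(
            "Term name trimmed from %d to %d characters: %s",
            len(name), max_length, name,
        )
        return name[:max_length]
    return name
```

Logging the untrimmed value is what makes it possible to come back later and find every name that was cut.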
@seth-shaw-unlv excellent, thanks for the suggestions. I assume this problem might not manifest itself for most 7.x instances, since having long values like those is probably an edge case. I know we've discussed some workflows that allow people working on migrations to do some sort of audit of their metadata prior to running the migration. Stuff like using the substr plugin is a good way to prevent errors from occurring, but it would be good to document the problem and ways of preventing it from happening in the first place. @rosiel, @rtilla1 and other MIG groups, what do you think?
@mjordan Worst case scenario is we have a field to hold the long text and then trim the actual name to 255. Sort of like how a node's body gets trimmed for the summary blurb. I wouldn't do it unless it were a serious issue, but at least there's a way for folks to keep their long titles and not upset the drupal gods.
@dannylamb are you suggesting that we add that extra field to the node (multiple copies of it), or can we add it to the term itself (one copy of it)?
I'm saying stash it in an extra field on the term itself. Like
Might be worth considering. Let's let the MIG folks weigh in to see what they think.
I'm definitely in favor of having an extra field with more space in it for storing longer titles, but instead of using the standard required "title" field to be a truncated version, I'd suggest using that in a way that is the title of the node as opposed to the work that represents it, perhaps in the spirit of an ID or something. This is what we are doing with our current 7.x submission process at FSU: "titles" are automatically generated using a pattern of "type_timestamp_hash", for instance "honorsthesis_1549124815_ab134724", and this title ends up getting reused as a unique ID in the resulting MODS record. Having two different fields for the same title, with one being the "long title", might be confusing and appear as if it's an alternate or variant title instead of the true title, while making the titles represent different things makes the difference clear.
This was discussed at the Feb 6 CLAW call. @alxp proposed (forgive me if my paraphrasing is wrong) that a new field such as field_full_title be created to hold the actual titles/names, and the built-in fields be used rarely-if-at-all due to that restriction. I think the justification for not hacking the core Drupal tables was to make future upgrades safer.
Just practicing some migrations using Islandora 8 1.0.0 and hit this again. I'm not sure how common it is in the wild to have corporate names, for example, that are longer than 255, but when it occurs, migrations stop dead in their tracks.
@mjordan using the trim method I mention above will keep the migration moving, however, there is something to be said for letting the migration die to emphasize you have an issue to address.
@seth-shaw-unlv I have to respectfully disagree with this. I do not want my migration of 100,000 objects to fail at object 99,999 because that one has an errant source metadata value. YMMV but I'd rather have that target object created but the problem logged. We could have it both ways too - a strict mode and a permissive mode.
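To make the strict/permissive distinction concrete, here is a small Python sketch, assuming a single switch chooses between failing on the first over-long name and trimming it while recording the problem. Nothing here comes from Islandora or the migrate modules; `prepare_term_name`, `TermNameTooLong`, and the `strict` and `problems` parameters are invented for illustration.

```python
class TermNameTooLong(ValueError):
    """Raised in strict mode when a term name exceeds the limit."""


def prepare_term_name(name, strict=False, max_length=255, problems=None):
    """Return a name that fits the limit, or raise if strict is True.

    In permissive mode the over-long original is appended to `problems`
    (when a list is supplied) and a trimmed copy is returned.
    """
    if len(name) <= max_length:
        return name
    if strict:
        # Hard fail: stop so the source metadata gets fixed first.
        raise TermNameTooLong(
            f"Term name is {len(name)} characters; limit is {max_length}"
        )
    if problems is not None:
        problems.append(name)
    return name[:max_length]
```

With `strict=False` a long run keeps going and leaves a list of names to review afterwards; with `strict=True` the first bad value stops the run.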
@mjordan I don't think we necessarily disagree; I use the trim method myself because I certainly don't want my migrations failing part way through. I just leave open the possibility that someone may prefer the hard fail. Documentation of options and implications is key, as usual.
Gotcha. Different people will want to take different approaches. My personal preference would be to use a tool like that to find potential problems so they can be addressed prior to a migration. Over in #1021 I attempted to develop a tool to allow for practice migrations. This worked for some field types, but not for taxonomies since they are referenced and therefore must exist.
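A pre-migration audit of the source metadata could look something like the sketch below. It is a rough Python illustration, not the tool from #1021: it assumes the term names come from MODS `namePart`, `topic`, and `geographic` elements, which is a guess about what the migrations read, and `find_long_values` and `CANDIDATE_ELEMENTS` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# MODS version 3 namespace, in ElementTree's {uri}tag notation.
MODS_NS = "{http://www.loc.gov/mods/v3}"

# Assumption: elements whose text may end up as taxonomy term names.
CANDIDATE_ELEMENTS = ("namePart", "topic", "geographic")


def find_long_values(mods_xml, max_length=255):
    """Return (element name, length) pairs for values longer than max_length."""
    root = ET.fromstring(mods_xml)
    long_values = []
    for tag in CANDIDATE_ELEMENTS:
        for element in root.iter(MODS_NS + tag):
            text = (element.text or "").strip()
            if len(text) > max_length:
                long_values.append((tag, len(text)))
    return long_values
```

Running a check like this over every MODS record before a migration would produce a list of values to fix at the source, or at least to expect in the trim log.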
Taxonomy term names cannot exceed 255 characters (see name field below):
The GUI form that is used to create terms probably enforces this constraint, but dynamically populated terms like those produced by the corporate, subject, geographic, and person migrations that are part of https://github.com/Islandora-Devops/migrate_7x_claw risk failing if the term "name" is longer than 255 characters. Here's an example of what happens:
You would hope that names for people, things, and places as expressed in MODS documents out in the wild are not longer than 255 characters, but they probably exist! Since we're populating term names from uncontrolled text values in XML, we need a strategy for dealing with long term names.