Include Custom Attributes in Doc.to_array() #5382

No, not easily, because it's possible to have custom attribute values that can't be represented as integers.

I think you might run into problems with .from_array() if you have appended values in your array (I don't think it has an option to ignore particular columns when loading), but if you manage the details for reloading the docs yourself, then it's certainly an option.
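For the case where the custom values do fit into integers, a rough sketch of "managing the details yourself" might look like the following. The `score` extension and the column layout are made up for illustration, and the reload step deliberately avoids `.from_array()` by rebuilding the doc from the `ORTH` and `SPACY` columns and slicing off the extra column by hand.

```python
import numpy as np
import spacy
from spacy.attrs import ORTH, SPACY
from spacy.tokens import Doc, Token

# Hypothetical integer-valued extension, used only for this illustration.
Token.set_extension("score", default=0, force=True)

nlp = spacy.blank("en")
doc = nlp("Custom attributes need care")
for i, token in enumerate(doc):
    token._.score = i * 10

# Export the built-in attributes as usual: shape (n_tokens, 2).
builtin_attrs = [ORTH, SPACY]
builtin = doc.to_array(builtin_attrs)

# Append the custom values as one extra column. Casting to uint64 keeps
# the combined array integer-typed (mixing uint64 and int64 would
# silently promote everything to float64).
scores = np.asarray([t._.score for t in doc], dtype=np.uint64)
combined = np.column_stack([builtin, scores])

# Reload by hand: rebuild the doc from the built-in columns, then
# restore the custom column separately.
n_builtin = len(builtin_attrs)
words = [nlp.vocab.strings[int(h)] for h in combined[:, 0]]
spaces = [bool(s) for s in combined[:, 1]]
restored = Doc(nlp.vocab, words=words, spaces=spaces)
for token, value in zip(restored, combined[:, n_builtin]):
    token._.score = int(value)
```

This only works for values that are non-negative integers, so anything else (strings, floats, dicts) would need its own encoding.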

If you want to store large numbers of docs at once, have you looked at using DocBin with the store_user_data option? See https://spacy.io/usage/saving-loading#docs. The msgpack serialization of custom attributes should be pretty efficient for built-in Python types.
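A minimal sketch of the DocBin route, assuming a blank English pipeline and a made-up `note` extension (neither is from the original question). Note that the extension has to be registered again in whatever process loads the docs, and that `store_user_data=True` is passed on the loading side as well.

```python
import spacy
from spacy.tokens import DocBin, Token

# Hypothetical extension holding a non-integer value (a plain dict).
Token.set_extension("note", default=None, force=True)

nlp = spacy.blank("en")
texts = ["First example doc", "Second example doc"]

# store_user_data=True also serializes doc.user_data, which is where
# the values of custom extension attributes are kept.
doc_bin = DocBin(store_user_data=True)
for text in texts:
    doc = nlp(text)
    doc[0]._.note = {"source": "demo", "weight": 0.5}
    doc_bin.add(doc)

data = doc_bin.to_bytes()

# Load everything back; the same vocab is reused here for simplicity.
loaded_bin = DocBin(store_user_data=True).from_bytes(data)
loaded = list(loaded_bin.get_docs(nlp.vocab))
```

The values here are built-in Python types only; custom classes would likely need to be converted to something msgpack can handle first.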

Answer selected by ines
Labels
feat / doc Feature: Doc, Span and Token objects
This discussion was converted from issue #5382 on December 11, 2020 00:20.