Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tantivy] Fix for schema deserialization error #902

Merged
merged 1 commit into from
Oct 5, 2020

Conversation

ppodolsky
Copy link
Contributor

There is a feature in serde_yaml that it cannot deal with borrowed strings. It makes impossible to deserialize yaml to Schema due to error invalid type: string \"field_name\", expected a borrowed string
This extra allocation is a tol-tol solution, but it's a far more easier way then rewriting whole serde_yaml and I hope deserializing schema is not a hot path.

@fulmicoton
Copy link
Collaborator

I don't understand anything about this bug.

Tantivy does not use Yaml. If you really need yaml serialization that's a feature request you want and you will need to explain why this is useful to you.

@fulmicoton fulmicoton closed this Oct 5, 2020
@ppodolsky
Copy link
Contributor Author

ppodolsky commented Oct 5, 2020

There was a one line test in the PR that did not work:
serde_yaml::from_str::<FieldEntry>(field_content).unwrap();
or similar
serde_yaml::from_str::<Schema>(schema_content).unwrap();

If you really need yaml serialization that's a feature request

You already have yaml serialization due to using serde crate :) There is just a small trouble with deserialization.

@ppodolsky
Copy link
Contributor Author

ppodolsky commented Oct 5, 2020

About the reasons - we are storing schemas in files, not in the code. Yaml is much easier for human to read and edit:

[{
  "name": "title",
  "type": "text",
  "options": {
    "indexing": {
      "record": "position",
      "tokenizer": "default"
    },
    "stored": false
  }
}, {
  "name": "body",
  "type": "text",
  "options": {
    "indexing": {
      "record": "position",
      "tokenizer": "default"
    },
    "stored": false
}]

vs

- name: text
  type: text
  options:
    indexing:
      record: position
      tokenizer: default
    stored: true
- name: body
  type: text
  options:
    indexing:
      record: position
      tokenizer: default
    stored: true

You can easily be mistaken with these curvy brackets while editing file manually

@fulmicoton
Copy link
Collaborator

fulmicoton commented Oct 5, 2020

I see... And I had missed the fact that the dependency was just a dev-dependency for the unit test! I think that is ok then.
I may or may not ditch the unit test in the future when I get annoyed by the extra compilation time.

@fulmicoton fulmicoton reopened this Oct 5, 2020
@fulmicoton fulmicoton merged commit 687a36a into quickwit-oss:main Oct 5, 2020
@ppodolsky
Copy link
Contributor Author

I may or may not ditch the unit test in the future when I get annoyed by the extra compilation time.

Sure, the test is mostly for illustrating the issue.

@fulmicoton
Copy link
Collaborator

Yes it is great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants