Add entity linking script #243

Merged 2 commits on Nov 12, 2020

x389liu (Member) commented Nov 12, 2020

No description provided.

### Input Prep

Let us take MS MARCO passage dataset as an example. We need to download the MS MARCO passage dataset and convert the tsv collection into jsonl files by following the
detailed instruction [here](https://github.com/x389liu/pyserini/blob/master/docs/experiments-msmarco-passage.md#data-prep).

Member commented:

Link to the original repo, not your clone?
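
For context, the TSV-to-JSONL conversion mentioned in the quoted passage can be sketched roughly as follows. This is a minimal sketch, not the script from the repository: it assumes each input line has the form `id<TAB>text` and that the target JSONL uses `id` and `contents` fields, and the function name `tsv_to_jsonl` is hypothetical.

```python
import json


def tsv_to_jsonl(tsv_lines):
    """Yield one JSON string per `id<TAB>text` input line (assumed layout)."""
    for line in tsv_lines:
        line = line.rstrip("\n")
        if not line:
            # Skip blank lines rather than failing on them.
            continue
        # Split only on the first tab so tabs inside the text are preserved.
        doc_id, text = line.split("\t", 1)
        yield json.dumps({"id": doc_id, "contents": text})
```

Each yielded string can be written as one line of an output `.jsonl` file; the linked data-prep instructions remain the authoritative procedure.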


### REL

First, we follow the github [instruction](https://github.com/informagi/REL#installation-from-source) to install REL and

Member commented:

GitHub

@@ -171,3 +171,78 @@ Then we have sentences:
| 4 | If she wins, she will join Theresa May of Britain and Angela Merkel of Germany in the ranks of women who lead prominent Western democracies. |
| ... | ... |

## Entity Linking

Unfortunately, spaCy does not provide any pre-trained Entity Linking model currently. However, we found another great

Member commented:

entity linking should be lowercase.

We adopt the convention of each sentence on its own line - please reformat?

2 participants