2023-08-25-e5_small_en (#13939)
* Add model 2023-08-25-e5_small_en

* Add model 2023-08-25-e5_small_opt_en

* Add model 2023-08-25-e5_small_quantized_en

* Add model 2023-08-25-e5_base_en

* Add model 2023-08-25-e5_small_v2_opt_en

* Add model 2023-08-25-e5_base_opt_en

* Add model 2023-08-25-e5_base_quantized_en

* Add model 2023-08-25-e5_small_v2_en

* Add model 2023-08-25-e5_small_v2_quantized_en

* Add model 2023-08-25-e5_base_v2_en

* Add model 2023-08-25-e5_base_v2_opt_en

* Add model 2023-08-25-e5_base_v2_quantized_en

* Add model 2023-08-25-e5_large_v2_en

---------

Co-authored-by: ahmedlone127 <ahmedlone127@gmail.com>
jsl-models and ahmedlone127 authored Aug 25, 2023
1 parent 06f07da commit b1b99f5
Showing 13 changed files with 874 additions and 0 deletions.
67 changes: 67 additions & 0 deletions docs/_posts/ahmedlone127/2023-08-25-e5_base_en.md
@@ -0,0 +1,67 @@
---
layout: model
title: E5 Base Sentence Embeddings
author: John Snow Labs
name: e5_base
date: 2023-08-25
tags: [en, open_source, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.1.0
spark_version: 3.0
supported: true
engine: onnx
annotator: E5Embeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Text Embeddings by Weakly-Supervised Contrastive Pre-training. Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei, arXiv 2022

## Predicted Entities



{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/e5_base_en_5.1.0_3.0_1692963566674.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/e5_base_en_5.1.0_3.0_1692963566674.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
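
The archive name in the download link appears to pack several fields together: model name, language, Spark NLP version, Spark version, and an upload timestamp in epoch milliseconds. That layout is inferred from the links in these cards, not from any documented convention, so the following is only a sketch; `parse_archive_name` and `ARCHIVE_RE` are hypothetical helpers written for illustration.

```python
import re
from datetime import datetime, timezone

# Assumed layout, inferred from the download links in these cards:
# <name>_<lang>_<spark_nlp_version>_<spark_version>_<epoch_millis>.zip
ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2})_(?P<nlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<ts>\d+)\.zip$"
)


def parse_archive_name(filename):
    """Split an archive file name into its assumed components."""
    match = ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized archive name: {filename}")
    info = match.groupdict()
    # The trailing number is treated as milliseconds since the Unix epoch.
    millis = int(info.pop("ts"))
    info["uploaded"] = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return info


info = parse_archive_name("e5_base_en_5.1.0_3.0_1692963566674.zip")
print(info["name"], info["lang"], info["nlp"], info["spark"], info["uploaded"].date())
```

Under that assumption the timestamp decodes to the same day as the card's `date` field, which is a useful sanity check when matching an archive to its card.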

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
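from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import E5Embeddings

# Assumes the input DataFrame has a raw "text" column
document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("documents")
embeddings = E5Embeddings.pretrained("e5_base", "en") \
    .setInputCols(["documents"]) \
    .setOutputCol("e5_embeddings")
pipeline = Pipeline().setStages([document_assembler, embeddings])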
```
```scala
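import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.E5Embeddings
import org.apache.spark.ml.Pipeline
// Assumes the input DataFrame has a raw "text" column
val document = new DocumentAssembler().setInputCol("text").setOutputCol("documents")
val embeddings = E5Embeddings.pretrained("e5_base", "en")
  .setInputCols(Array("documents")).setOutputCol("e5_embeddings")
val pipeline = new Pipeline().setStages(Array(document, embeddings))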
```
</div>
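
Sentence embeddings like the ones this pipeline produces are commonly compared with cosine similarity. The sketch below shows that comparison in plain Python, with no Spark dependency. The three-dimensional vectors are made up for illustration and stand in for real model output, and `cosine_similarity` is a hypothetical helper, not part of Spark NLP.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical vectors standing in for real embeddings
query = [0.1, 0.3, 0.5]
close = [0.2, 0.6, 1.0]   # same direction as query, twice the length
far = [0.5, -0.3, 0.0]    # points in a different direction

print(cosine_similarity(query, close))
print(cosine_similarity(query, far))
```

Because the measure ignores vector length, `close` scores essentially 1 against `query` even though it is twice as long, while `far` scores below zero.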

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|e5_base|
|Compatibility:|Spark NLP 5.1.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[documents]|
|Output Labels:|[e5]|
|Language:|en|
|Size:|258.6 MB|
67 changes: 67 additions & 0 deletions docs/_posts/ahmedlone127/2023-08-25-e5_base_opt_en.md
@@ -0,0 +1,67 @@
---
layout: model
title: E5 Base Sentence Embeddings
author: John Snow Labs
name: e5_base_opt
date: 2023-08-25
tags: [en, open_source, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.1.0
spark_version: 3.0
supported: true
engine: onnx
annotator: E5Embeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Text Embeddings by Weakly-Supervised Contrastive Pre-training. Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei, arXiv 2022

## Predicted Entities



{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/e5_base_opt_en_5.1.0_3.0_1692963694288.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/e5_base_opt_en_5.1.0_3.0_1692963694288.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
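from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import E5Embeddings

# Assumes the input DataFrame has a raw "text" column
document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("documents")
embeddings = E5Embeddings.pretrained("e5_base_opt", "en") \
    .setInputCols(["documents"]) \
    .setOutputCol("e5_embeddings")
pipeline = Pipeline().setStages([document_assembler, embeddings])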
```
```scala
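import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.E5Embeddings
import org.apache.spark.ml.Pipeline
// Assumes the input DataFrame has a raw "text" column
val document = new DocumentAssembler().setInputCol("text").setOutputCol("documents")
val embeddings = E5Embeddings.pretrained("e5_base_opt", "en")
  .setInputCols(Array("documents")).setOutputCol("e5_embeddings")
val pipeline = new Pipeline().setStages(Array(document, embeddings))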
```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|e5_base_opt|
|Compatibility:|Spark NLP 5.1.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[documents]|
|Output Labels:|[e5]|
|Language:|en|
|Size:|258.7 MB|
67 changes: 67 additions & 0 deletions docs/_posts/ahmedlone127/2023-08-25-e5_base_quantized_en.md
@@ -0,0 +1,67 @@
---
layout: model
title: E5 Base Sentence Embeddings
author: John Snow Labs
name: e5_base_quantized
date: 2023-08-25
tags: [en, open_source, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.1.0
spark_version: 3.0
supported: true
engine: onnx
annotator: E5Embeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Text Embeddings by Weakly-Supervised Contrastive Pre-training. Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei, arXiv 2022

## Predicted Entities



{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/e5_base_quantized_en_5.1.0_3.0_1692963757236.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/e5_base_quantized_en_5.1.0_3.0_1692963757236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
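from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import E5Embeddings

# Assumes the input DataFrame has a raw "text" column
document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("documents")
embeddings = E5Embeddings.pretrained("e5_base_quantized", "en") \
    .setInputCols(["documents"]) \
    .setOutputCol("e5_embeddings")
pipeline = Pipeline().setStages([document_assembler, embeddings])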
```
```scala
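import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.E5Embeddings
import org.apache.spark.ml.Pipeline
// Assumes the input DataFrame has a raw "text" column
val document = new DocumentAssembler().setInputCol("text").setOutputCol("documents")
val embeddings = E5Embeddings.pretrained("e5_base_quantized", "en")
  .setInputCols(Array("documents")).setOutputCol("e5_embeddings")
val pipeline = new Pipeline().setStages(Array(document, embeddings))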
```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|e5_base_quantized|
|Compatibility:|Spark NLP 5.1.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[documents]|
|Output Labels:|[e5]|
|Language:|en|
|Size:|67.1 MB|
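
The `Size` rows of the three E5 Base cards (`e5_base`, `e5_base_opt`, `e5_base_quantized`) can be compared directly. The short sketch below does that arithmetic, with the sizes copied from the cards; `size_ratio` is a hypothetical helper written for this comparison.

```python
# Sizes in MB, copied from the "Size" rows of the three E5 Base model cards
sizes_mb = {
    "e5_base": 258.6,
    "e5_base_opt": 258.7,
    "e5_base_quantized": 67.1,
}


def size_ratio(variant, baseline="e5_base"):
    """Archive size of `variant` relative to `baseline`."""
    return sizes_mb[variant] / sizes_mb[baseline]


for name, size in sizes_mb.items():
    print(f"{name}: {size} MB, {size_ratio(name):.2f}x the size of e5_base")
```

The quantized archive comes out at roughly a quarter of the base model, while the `opt` variant is essentially the same size. A fourfold reduction would be consistent with 8-bit weights replacing 32-bit floats, but the cards do not state the quantization scheme, so treat that as a guess.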
68 changes: 68 additions & 0 deletions docs/_posts/ahmedlone127/2023-08-25-e5_base_v2_en.md
@@ -0,0 +1,68 @@
---
layout: model
title: E5 Base v2 Sentence Embeddings
author: John Snow Labs
name: e5_base_v2
date: 2023-08-25
tags: [e5, sentence_embeddings, en, open_source, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.1.0
spark_version: 3.0
supported: true
engine: onnx
annotator: E5Embeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Text Embeddings by Weakly-Supervised Contrastive Pre-training. Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei, arXiv 2022

## Predicted Entities



{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/e5_base_v2_en_5.1.0_3.0_1692964050132.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/e5_base_v2_en_5.1.0_3.0_1692964050132.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
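from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import E5Embeddings

# Assumes the input DataFrame has a raw "text" column
document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("documents")
embeddings = E5Embeddings.pretrained("e5_base_v2", "en") \
    .setInputCols(["documents"]) \
    .setOutputCol("e5_embeddings")
pipeline = Pipeline().setStages([document_assembler, embeddings])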
```
```scala
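import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.E5Embeddings
import org.apache.spark.ml.Pipeline
// Assumes the input DataFrame has a raw "text" column
val document = new DocumentAssembler().setInputCol("text").setOutputCol("documents")
val embeddings = E5Embeddings.pretrained("e5_base_v2", "en")
  .setInputCols(Array("documents")).setOutputCol("e5_embeddings")
val pipeline = new Pipeline().setStages(Array(document, embeddings))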
```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|e5_base_v2|
|Compatibility:|Spark NLP 5.1.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[documents]|
|Output Labels:|[e5]|
|Language:|en|
|Size:|258.7 MB|
68 changes: 68 additions & 0 deletions docs/_posts/ahmedlone127/2023-08-25-e5_base_v2_opt_en.md
@@ -0,0 +1,68 @@
---
layout: model
title: E5 Base v2 Sentence Embeddings
author: John Snow Labs
name: e5_base_v2_opt
date: 2023-08-25
tags: [e5, sentence_embeddings, en, open_source, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.1.0
spark_version: 3.0
supported: true
engine: onnx
annotator: E5Embeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Text Embeddings by Weakly-Supervised Contrastive Pre-training. Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei, arXiv 2022

## Predicted Entities



{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/e5_base_v2_opt_en_5.1.0_3.0_1692964193495.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/e5_base_v2_opt_en_5.1.0_3.0_1692964193495.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
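from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import E5Embeddings

# Assumes the input DataFrame has a raw "text" column
document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("documents")
embeddings = E5Embeddings.pretrained("e5_base_v2_opt", "en") \
    .setInputCols(["documents"]) \
    .setOutputCol("e5_embeddings")
pipeline = Pipeline().setStages([document_assembler, embeddings])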
```
```scala
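import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.E5Embeddings
import org.apache.spark.ml.Pipeline
// Assumes the input DataFrame has a raw "text" column
val document = new DocumentAssembler().setInputCol("text").setOutputCol("documents")
val embeddings = E5Embeddings.pretrained("e5_base_v2_opt", "en")
  .setInputCols(Array("documents")).setOutputCol("e5_embeddings")
val pipeline = new Pipeline().setStages(Array(document, embeddings))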
```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|e5_base_v2_opt|
|Compatibility:|Spark NLP 5.1.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[documents]|
|Output Labels:|[e5]|
|Language:|en|
|Size:|258.8 MB|