2023-07-30-albert_embeddings_ALR_BERT_ro #13910

Merged
92 commits
3fa6517
Add model 2023-07-30-albert_embeddings_ALR_BERT_ro
ahmedlone127 Jul 30, 2023
cde32e0
Add model 2023-07-30-albert_embeddings_albert_base_japanese_v1_ja
ahmedlone127 Jul 30, 2023
32c799e
Add model 2023-07-30-albert_embeddings_albert_large_arabic_ar
ahmedlone127 Jul 30, 2023
7cb92b3
Add model 2023-07-30-albert_embeddings_albert_fa_base_v2_fa
ahmedlone127 Jul 30, 2023
7748145
Add model 2023-07-30-albert_embeddings_albert_german_ner_de
ahmedlone127 Jul 30, 2023
827cd1d
Add model 2023-07-30-albert_embeddings_albert_fa_zwnj_base_v2_fa
ahmedlone127 Jul 30, 2023
4d93517
Add model 2023-07-30-albert_embeddings_marathi_albert_mr
ahmedlone127 Jul 30, 2023
f77835a
Add model 2023-07-30-albert_embeddings_albert_tiny_bahasa_cased_ms
ahmedlone127 Jul 30, 2023
5569cf9
Add model 2023-07-30-albert_embeddings_albert_base_bahasa_cased_ms
ahmedlone127 Jul 30, 2023
167c396
Add model 2023-07-30-albert_embeddings_fralbert_base_fr
ahmedlone127 Jul 30, 2023
d135b9e
Add model 2023-07-30-albert_embeddings_marathi_albert_v2_mr
ahmedlone127 Jul 30, 2023
d2cfe6a
Add model 2023-07-30-albert_embeddings_albert_base_arabic_ar
ahmedlone127 Jul 30, 2023
6c6beef
Add model 2023-07-30-albert_embeddings_albert_large_bahasa_cased_ms
ahmedlone127 Jul 30, 2023
42e5f55
Add model 2023-07-30-camembert_embeddings_das22_10_camembert_pretrain…
ahmedlone127 Jul 30, 2023
f0f5261
Add model 2023-07-30-camembert_embeddings_zhenghuabin_generic_model_fr
ahmedlone127 Jul 30, 2023
15e0255
Add model 2023-07-30-camembert_embeddings_das22_10_camembert_pretrain…
ahmedlone127 Jul 30, 2023
ed624b5
Add model 2023-07-30-camembert_embeddings_camembert_mlm_fr
ahmedlone127 Jul 30, 2023
3bef5bf
Add model 2023-07-30-camembert_embeddings_edge2992_generic_model_fr
ahmedlone127 Jul 30, 2023
9cc5153
Add model 2023-07-30-camembert_embeddings_elusive_magnolia_generic_mo…
ahmedlone127 Jul 30, 2023
939c740
Add model 2023-07-30-camembert_embeddings_zhenghuabin_generic_model_fr
ahmedlone127 Jul 30, 2023
17ba326
Add model 2023-07-30-camembert_embeddings_camembert_aux_amandes_mt
ahmedlone127 Jul 30, 2023
f49a9bc
Add model 2023-07-30-camembert_embeddings_elliotsmith_generic_model_fr
ahmedlone127 Jul 30, 2023
5c9a8d6
Add model 2023-07-30-camembert_embeddings_dianeshan_generic_model_fr
ahmedlone127 Jul 30, 2023
9ed8525
fixed wrong version
ahmedlone127 Jul 31, 2023
574699e
Add model 2023-07-31-camembert_embeddings_ankitkupadhyay_generic_mode…
ahmedlone127 Jul 31, 2023
b9d18d2
Add model 2023-07-31-camembert_embeddings_devtrent_generic_model_fr
ahmedlone127 Jul 31, 2023
61e1750
Add model 2023-07-31-camembert_embeddings_eduardopds_generic_model_fr
ahmedlone127 Jul 31, 2023
4e3150c
Add model 2023-07-31-camembert_embeddings_adeiMousa_generic_model_fr
ahmedlone127 Jul 31, 2023
d1a13b7
Add model 2023-07-31-camembert_embeddings_ericchchiu_generic_model_fr
ahmedlone127 Jul 31, 2023
0887eb5
Add model 2023-07-31-camembert_embeddings_Sebu_generic_model_fr
ahmedlone127 Jul 31, 2023
f8905a7
Add model 2023-07-31-camembert_embeddings_Weipeng_generic_model_fr
ahmedlone127 Jul 31, 2023
9e81bb3
Add model 2023-07-31-camembert_embeddings_codingJacob_generic_model_fr
ahmedlone127 Jul 31, 2023
6c3267c
Add model 2023-07-31-camembert_embeddings_SummFinFR_fr
ahmedlone127 Jul 31, 2023
7d2cf8e
Add model 2023-07-31-camembert_embeddings_MYX4567_generic_model_fr
ahmedlone127 Jul 31, 2023
1688272
Add model 2023-07-31-camembert_embeddings_Katster_generic_model_fr
ahmedlone127 Jul 31, 2023
10e443e
Add model 2023-07-31-camembert_embeddings_MYX4567_generic_model_fr
ahmedlone127 Jul 31, 2023
3fa03d5
Add model 2023-07-31-camembert_embeddings_JonathanSum_generic_model_fr
ahmedlone127 Jul 31, 2023
7e9c823
Add model 2023-07-31-camembert_embeddings_Leisa_generic_model_fr
ahmedlone127 Jul 31, 2023
ad58b0e
Add model 2023-07-31-camembert_embeddings_adam1224_generic_model_fr
ahmedlone127 Jul 31, 2023
8941058
Add model 2023-07-31-camembert_embeddings_est_roberta_et
ahmedlone127 Jul 31, 2023
e50073c
Add model 2023-07-31-camembert_embeddings_generic2_fr
ahmedlone127 Jul 31, 2023
2f1fe4d
Add model 2023-07-31-camembert_embeddings_ysharma_generic_model_2_fr
ahmedlone127 Jul 31, 2023
9ee72c6
Add model 2023-07-31-camembert_embeddings_DoyyingFace_generic_model_fr
ahmedlone127 Jul 31, 2023
fc188fc
Add model 2023-07-31-camembert_embeddings_Henrywang_generic_model_fr
ahmedlone127 Jul 31, 2023
8e355ee
Add model 2023-07-31-camembert_embeddings_xkang_generic_model_fr
ahmedlone127 Jul 31, 2023
1c33bfd
Add model 2023-07-31-camembert_embeddings_wangst_generic_model_fr
ahmedlone127 Jul 31, 2023
90c9145
Add model 2023-07-31-camembert_embeddings_seyfullah_generic_model_fr
ahmedlone127 Jul 31, 2023
24c00f5
Add model 2023-07-31-camembert_embeddings_tnagata_generic_model_fr
ahmedlone127 Jul 31, 2023
755e135
Add model 2023-07-31-camembert_embeddings_yancong_generic_model_fr
ahmedlone127 Jul 31, 2023
107a644
Add model 2023-07-31-camembert_embeddings_safik_generic_model_fr
ahmedlone127 Jul 31, 2023
c689ebb
Add model 2023-07-31-camembert_embeddings_tpanza_generic_model_fr
ahmedlone127 Jul 31, 2023
83409d4
Add model 2023-07-31-camembert_embeddings_peterhsu_generic_model_fr
ahmedlone127 Jul 31, 2023
0a17c4f
Add model 2023-07-31-camembert_embeddings_pgperrone_generic_model_fr
ahmedlone127 Jul 31, 2023
054ad09
Add model 2023-07-31-camembert_embeddings_osanseviero_generic_model_fr
ahmedlone127 Jul 31, 2023
59c8fb7
Add model 2023-07-31-camembert_embeddings_lijingxin_generic_model_fr
ahmedlone127 Jul 31, 2023
6a16bdb
Add model 2023-08-01-camembert_embeddings_kaushikacharya_generic_mode…
ahmedlone127 Aug 1, 2023
da6cea2
Add model 2023-08-01-camembert_embeddings_new_generic_model_fr
ahmedlone127 Aug 1, 2023
5dced86
Add model 2023-08-01-camembert_embeddings_mbateman_generic_model_fr
ahmedlone127 Aug 1, 2023
f16a0fb
Add model 2023-08-01-camembert_embeddings_lijingxin_generic_model_2_fr
ahmedlone127 Aug 1, 2023
061ce2e
Add model 2023-08-01-camembert_embeddings_katrin_kc_generic_model_fr
ahmedlone127 Aug 1, 2023
364a258
Add model 2023-08-01-camembert_embeddings_linyi_generic_model_fr
ahmedlone127 Aug 1, 2023
614636a
Add model 2023-08-01-camembert_embeddings_lewtun_generic_model_fr
ahmedlone127 Aug 1, 2023
35a8099
Add model 2023-08-01-camembert_embeddings_joe8zhang_generic_model_fr
ahmedlone127 Aug 1, 2023
0d63cc6
Add model 2023-08-01-camembert_embeddings_sloberta_sl
ahmedlone127 Aug 1, 2023
bb679a5
Add model 2023-08-01-camembert_embeddings_generic_model_test_fr
ahmedlone127 Aug 1, 2023
feaec8d
Add model 2023-08-01-camembert_embeddings_jcai1_generic_model_fr
ahmedlone127 Aug 1, 2023
4d16433
Add model 2023-08-01-camembert_embeddings_umberto_commoncrawl_cased_v…
ahmedlone127 Aug 1, 2023
0dcc83e
Add model 2023-08-01-camembert_embeddings_DataikuNLP_camembert_base_fr
ahmedlone127 Aug 1, 2023
b570905
Add model 2023-08-01-camembert_embeddings_umberto_wikipedia_uncased_v…
ahmedlone127 Aug 1, 2023
0f181ce
Add model 2023-08-01-camembert_base_oscar_4gb_fr
ahmedlone127 Aug 1, 2023
abd307b
Add model 2023-08-01-camembert_embeddings_distilcamembert_base_fr
ahmedlone127 Aug 1, 2023
05aa016
Add model 2023-08-01-camembert_base_wikipedia_4gb_fr
ahmedlone127 Aug 1, 2023
70f3bbd
Add model 2023-08-01-camembert_base_ccnet_fr
ahmedlone127 Aug 1, 2023
d1af927
Add model 2023-08-01-camembert_base_oscar_4gb_fr
ahmedlone127 Aug 1, 2023
ffb1f02
Add model 2023-08-01-camembert_embeddings_hackertec_generic_fr
ahmedlone127 Aug 1, 2023
b3cd1b5
Add model 2023-08-01-camembert_base_ccnet_fr
ahmedlone127 Aug 1, 2023
4639142
Add model 2023-08-01-camembert_embeddings_h4d35_generic_model_fr
ahmedlone127 Aug 1, 2023
4c1072f
Add model 2023-08-01-camembert_embeddings_bertweetfr_base_fr
ahmedlone127 Aug 1, 2023
75a8aa8
Add model 2023-08-01-camembert_base_ccnet_4gb_fr
ahmedlone127 Aug 1, 2023
eb6082c
Add model 2023-08-01-camembert_base_ccnet_4gb_fr
ahmedlone127 Aug 1, 2023
edc9bcd
Add model 2023-08-01-xlmroberta_embeddings_fairlex_fscs_minilm_xx
ahmedlone127 Aug 1, 2023
f5e3a68
Add model 2023-08-01-xlmroberta_embeddings_fairlex_cail_minilm_zh
ahmedlone127 Aug 1, 2023
91c5ebe
Add model 2023-08-01-camembert_base_fr
ahmedlone127 Aug 1, 2023
af69086
Add model 2023-08-01-camembert_base_opt_fr
ahmedlone127 Aug 1, 2023
0b2b966
Add model 2023-08-01-camembert_base_quantized_fr
ahmedlone127 Aug 1, 2023
07a9536
Add model 2023-08-02-albert_base_uncased_en
ahmedlone127 Aug 2, 2023
9bfe82b
Add model 2023-08-02-albert_base_uncased_opt_en
ahmedlone127 Aug 2, 2023
9ed1c86
Add model 2023-08-02-albert_base_uncased_quantized_en
ahmedlone127 Aug 2, 2023
fe011d1
Add model 2023-08-02-albert_large_uncased_en
ahmedlone127 Aug 2, 2023
5482cf1
Add model 2023-08-02-albert_large_uncased_en
ahmedlone127 Aug 2, 2023
dbee304
Add model 2023-08-02-albert_large_uncased_opt_en
ahmedlone127 Aug 2, 2023
b055479
Add model 2023-08-02-albert_large_uncased_quantized_en
ahmedlone127 Aug 2, 2023
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
---
layout: model
title: Romanian ALBERT Embeddings (from dragosnicolae555)
author: John Snow Labs
name: albert_embeddings_ALR_BERT
date: 2023-07-30
tags: [albert, embeddings, ro, open_source, onnx]
task: Embeddings
language: ro
edition: Spark NLP 5.0.2
spark_version: 3.0
supported: true
engine: onnx
annotator: AlbertEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained ALBERT Embeddings model, uploaded to Hugging Face, adapted and imported into Spark NLP. `ALR_BERT` is a Romanian model originally trained by `dragosnicolae555`.

## Predicted Entities



{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_embeddings_ALR_BERT_ro_5.0.2_3.0_1690752767725.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_embeddings_ALR_BERT_ro_5.0.2_3.0_1690752767725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
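
The archive name in the links above appears to follow the pattern `<name>_<language>_<Spark NLP version>_<Spark version>_<timestamp>.zip`. That pattern is inferred from the file name and the front matter of this card, not taken from a documented specification, and the last field is assumed to be an epoch timestamp in milliseconds. A minimal sketch that splits a link into those parts (the helper `parse_artifact_name` is hypothetical, not a Spark NLP function):

```python
from datetime import datetime, timezone


def parse_artifact_name(url: str) -> dict:
    """Split a model archive name into its apparent components."""
    filename = url.rsplit("/", 1)[-1]
    stem = filename[:-len(".zip")] if filename.endswith(".zip") else filename
    # Split from the right, because the model name itself contains underscores.
    name, lang, nlp_version, spark_version, millis = stem.rsplit("_", 4)
    return {
        "name": name,
        "language": lang,
        "spark_nlp_version": nlp_version,
        "spark_version": spark_version,
        # Assumption: the trailing number is epoch time in milliseconds.
        "timestamp_utc": datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc),
    }


info = parse_artifact_name(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "albert_embeddings_ALR_BERT_ro_5.0.2_3.0_1690752767725.zip"
)
print(info["name"], info["language"], info["timestamp_utc"].date())
```

Splitting from the right keeps the sketch independent of how many underscores the model name contains.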

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
# Assumes an active Spark session named `spark`, e.g. created with sparknlp.start()
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, AlbertEmbeddings

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

embeddings = AlbertEmbeddings.pretrained("albert_embeddings_ALR_BERT", "ro") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline(stages=[documentAssembler, tokenizer, embeddings])

data = spark.createDataFrame([["Îmi place Spark NLP"]]).toDF("text")

result = pipeline.fit(data).transform(data)
```
```scala
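import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.embeddings.AlbertEmbeddings
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for toDF; assumes a SparkSession named `spark`

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val embeddings = AlbertEmbeddings.pretrained("albert_embeddings_ALR_BERT", "ro")
  .setInputCols(Array("document", "token"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings))

val data = Seq("Îmi place Spark NLP").toDF("text")

val result = pipeline.fit(data).transform(data)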
```

{:.nlu-block}
```python
import nlu
nlu.load("ro.embed.ALR_BERT").predict("""Îmi place Spark NLP""")
```
</div>
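
Each row of `result` carries one vector per token in the `embeddings` column. Once those vectors have been collected as plain lists of floats, they can be compared directly. The sketch below assumes that step has already happened. The `cosine_similarity` helper is illustrative and not part of Spark NLP, and the three-dimensional vectors are made-up stand-ins for real model output, which has far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat an all-zero vector as having no similarity to anything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical token vectors, for illustration only.
vec_a = [0.2, 0.1, 0.7]
vec_b = [0.2, 0.1, 0.7]
vec_c = [-0.7, 0.1, 0.2]

print(cosine_similarity(vec_a, vec_b))
print(cosine_similarity(vec_a, vec_c))
```

Returning `0.0` for an all-zero vector is a design choice of this sketch. Raising an error instead would be equally reasonable.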

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|albert_embeddings_ALR_BERT|
|Compatibility:|Spark NLP 5.0.2+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[sentence, token]|
|Output Labels:|[bert]|
|Language:|ro|
|Size:|51.7 MB|
|Case sensitive:|false|
@@ -0,0 +1,99 @@
---
layout: model
title: Arabic ALBERT Embeddings (Base)
author: John Snow Labs
name: albert_embeddings_albert_base_arabic
date: 2023-07-30
tags: [albert, embeddings, ar, open_source, onnx]
task: Embeddings
language: ar
edition: Spark NLP 5.0.2
spark_version: 3.0
supported: true
engine: onnx
annotator: AlbertEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained ALBERT Embeddings model, uploaded to Hugging Face, adapted and imported into Spark NLP. `albert-base-arabic` is an Arabic model originally trained by `asafaya`.

## Predicted Entities



{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_embeddings_albert_base_arabic_ar_5.0.2_3.0_1690753212237.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_embeddings_albert_base_arabic_ar_5.0.2_3.0_1690753212237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
# Assumes an active Spark session named `spark`, e.g. created with sparknlp.start()
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, AlbertEmbeddings

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

embeddings = AlbertEmbeddings.pretrained("albert_embeddings_albert_base_arabic", "ar") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline(stages=[documentAssembler, tokenizer, embeddings])

data = spark.createDataFrame([["أنا أحب شرارة NLP"]]).toDF("text")

result = pipeline.fit(data).transform(data)
```
```scala
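import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.embeddings.AlbertEmbeddings
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for toDF; assumes a SparkSession named `spark`

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val embeddings = AlbertEmbeddings.pretrained("albert_embeddings_albert_base_arabic", "ar")
  .setInputCols(Array("document", "token"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings))

val data = Seq("أنا أحب شرارة NLP").toDF("text")

val result = pipeline.fit(data).transform(data)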
```

{:.nlu-block}
```python
import nlu
nlu.load("ar.embed.albert").predict("""أنا أحب شرارة NLP""")
```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|albert_embeddings_albert_base_arabic|
|Compatibility:|Spark NLP 5.0.2+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[sentence, token]|
|Output Labels:|[bert]|
|Language:|ar|
|Size:|42.0 MB|
|Case sensitive:|false|
@@ -0,0 +1,99 @@
---
layout: model
title: Malay ALBERT Embeddings (Base)
author: John Snow Labs
name: albert_embeddings_albert_base_bahasa_cased
date: 2023-07-30
tags: [albert, embeddings, ms, open_source, onnx]
task: Embeddings
language: ms
edition: Spark NLP 5.0.2
spark_version: 3.0
supported: true
engine: onnx
annotator: AlbertEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained ALBERT Embeddings model, uploaded to Hugging Face, adapted and imported into Spark NLP. `albert-base-bahasa-cased` is a Malay model originally trained by `malay-huggingface`.

## Predicted Entities



{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_embeddings_albert_base_bahasa_cased_ms_5.0.2_3.0_1690753174981.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_embeddings_albert_base_bahasa_cased_ms_5.0.2_3.0_1690753174981.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
# Assumes an active Spark session named `spark`, e.g. created with sparknlp.start()
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, AlbertEmbeddings

documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

embeddings = AlbertEmbeddings.pretrained("albert_embeddings_albert_base_bahasa_cased", "ms") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline(stages=[documentAssembler, tokenizer, embeddings])

data = spark.createDataFrame([["Saya suka Spark NLP"]]).toDF("text")

result = pipeline.fit(data).transform(data)
```
```scala
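import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.embeddings.AlbertEmbeddings
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for toDF; assumes a SparkSession named `spark`

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val embeddings = AlbertEmbeddings.pretrained("albert_embeddings_albert_base_bahasa_cased", "ms")
  .setInputCols(Array("document", "token"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings))

val data = Seq("Saya suka Spark NLP").toDF("text")

val result = pipeline.fit(data).transform(data)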
```

{:.nlu-block}
```python
import nlu
nlu.load("ms.embed.albert_base_bahasa_cased").predict("""Saya suka Spark NLP""")
```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|albert_embeddings_albert_base_bahasa_cased|
|Compatibility:|Spark NLP 5.0.2+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[sentence, token]|
|Output Labels:|[bert]|
|Language:|ms|
|Size:|42.9 MB|
|Case sensitive:|false|