2023-01-29-legclf_other_definitional_provisions_clause_en #13428

Merged
19 commits
70cb97a
Add model 2023-01-29-legclf_other_definitional_provisions_clause_en
Mary-Sci Jan 29, 2023
8349ff1
Add model 2023-01-29-legclf_due_authorization_clause_en
Mary-Sci Jan 29, 2023
b9a0e8f
Add model 2023-01-29-legclf_due_authorization_clause_en
Mary-Sci Jan 29, 2023
a2092eb
Add model 2023-01-29-legclf_assignment_and_subletting_clause_en
Mary-Sci Jan 29, 2023
bfe22f1
Add model 2023-01-29-legclf_titles_and_subtitles_clause_en
Mary-Sci Jan 29, 2023
00652a0
Add model 2023-01-29-legclf_no_material_adverse_effect_clause_en
Mary-Sci Jan 29, 2023
f882b62
Add model 2023-01-29-legclf_subcontractors_clause_en
Mary-Sci Jan 29, 2023
558ae3c
Add model 2023-01-29-legclf_military_leave_clause_en
Mary-Sci Jan 29, 2023
30d6102
Add model 2023-01-29-legclf_fringe_benefits_clause_en
Mary-Sci Jan 29, 2023
ec44d81
Add model 2023-01-29-legclf_cusip_numbers_clause_en
Mary-Sci Jan 29, 2023
bf8a54d
Update 2023-01-29-legclf_due_authorization_clause_en.md
Mary-Sci Jan 29, 2023
17458fb
Update 2023-01-29-legclf_assignment_and_subletting_clause_en.md
Mary-Sci Jan 29, 2023
5d2ed2d
Update 2023-01-29-legclf_cusip_numbers_clause_en.md
Mary-Sci Jan 29, 2023
1d39867
Update 2023-01-29-legclf_fringe_benefits_clause_en.md
Mary-Sci Jan 29, 2023
fbbf6f2
Update 2023-01-29-legclf_military_leave_clause_en.md
Mary-Sci Jan 29, 2023
e79e155
Update 2023-01-29-legclf_no_material_adverse_effect_clause_en.md
Mary-Sci Jan 29, 2023
68784b9
Update 2023-01-29-legclf_other_definitional_provisions_clause_en.md
Mary-Sci Jan 29, 2023
4276e9b
Update 2023-01-29-legclf_subcontractors_clause_en.md
Mary-Sci Jan 29, 2023
72e0e5b
Update 2023-01-29-legclf_titles_and_subtitles_clause_en.md
Mary-Sci Jan 29, 2023
@@ -0,0 +1,120 @@
---
layout: model
title: Legal Assignment And Subletting Clause Binary Classifier
author: John Snow Labs
name: legclf_assignment_and_subletting_clause
date: 2023-01-29
tags: [en, legal, classification, assignment, subletting, clauses, assignment_and_subletting, licensed, tensorflow]
task: Text Classification
language: en
edition: Legal NLP 1.0.0
spark_version: 3.0
supported: true
engine: tensorflow
annotator: LegalClassifierDLModel
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

This model is a binary classifier (True/False) for the `assignment-and-subletting` clause type. To use it, make sure you provide enough context as input. Adding a sentence splitter to the pipeline makes the model see individual sentences rather than the whole text, so it is better to skip it unless you want binary classification at sentence level.

If you have large legal documents and want to look for clauses, we recommend splitting them with any of the techniques available in our Legal NLP Workshop Tokenization & Splitting Tutorial (link [here](https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings_JSL/Legal/1.Tokenization_Splitting.ipynb)), namely:
- Paragraph splitting (by multiline);
- Splitting by headers / subheaders;
- etc.

Keep in mind that the embeddings used by this model accept up to 512 tokens. If your text is longer than that, consider splitting it into smaller pieces (see the tutorial linked above).
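
As a rough illustration of the paragraph splitting (by multiline) mentioned above, the following is a minimal sketch in plain Python. The function name `split_paragraphs` and the sample text are hypothetical and are not taken from the tutorial, which may use different tooling.

```python
import re


def split_paragraphs(text):
    """Split text on one or more blank lines and drop empty pieces.

    Hypothetical helper for illustration only.
    """
    parts = re.split(r"\n\s*\n", text)
    return [part.strip() for part in parts if part.strip()]


# Made-up sample with two paragraphs separated by blank lines
sample = (
    "1. Assignment.\nTenant shall not assign this Lease.\n\n\n"
    "2. Notices.\nAll notices shall be in writing."
)
paragraphs = split_paragraphs(sample)
for paragraph in paragraphs:
    print(repr(paragraph))
```

Each resulting paragraph can then be passed to the pipeline as its own row in the `text` column.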

This model can be combined with any of the other 200+ Legal Clause Classifiers available in Models Hub, producing a series of True/False values, one for each legal clause model you add.
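
One way to picture the combined output is as a mapping from clause type to a True/False flag. The sketch below assumes you have already collected one predicted label per clause model for a given text; the `predictions` values are invented for illustration and `clause_flags` is a hypothetical helper, not part of the library.

```python
def clause_flags(predictions):
    """Map {clause_label: predicted_label} to {clause_label: bool}.

    A clause is flagged True when its classifier returned its own
    clause label rather than `other`.
    """
    return {clause: predicted == clause for clause, predicted in predictions.items()}


# Invented predictions from three clause classifiers for one piece of text
predictions = {
    "assignment-and-subletting": "assignment-and-subletting",
    "cusip-numbers": "other",
    "due-authorization": "other",
}
flags = clause_flags(predictions)
print(flags)
```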

## Predicted Entities

`assignment-and-subletting`, `other`

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_assignment_and_subletting_clause_en_1.0.0_3.0_1674993574865.zip){:.button.button-orange}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_assignment_and_subletting_clause_en_1.0.0_3.0_1674993574865.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}

```python

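from johnsnowlabs import nlp, legal

# Start a Spark session (requires the johnsnowlabs package and a valid license)
spark = nlp.start()

# Wrap the raw text column as a Spark NLP document
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Embed the whole document; no sentence splitter, so the classifier sees the full context
embeddings = nlp.BertSentenceEmbeddings.pretrained("sent_bert_base_cased", "en")\
    .setInputCols("document")\
    .setOutputCol("sentence_embeddings")

# Binary clause classifier: predicts `assignment-and-subletting` or `other`
doc_classifier = legal.ClassifierDLModel.pretrained("legclf_assignment_and_subletting_clause", "en", "legal/models")\
    .setInputCols(["sentence_embeddings"])\
    .setOutputCol("category")

nlpPipeline = nlp.Pipeline(stages=[
    document_assembler,
    embeddings,
    doc_classifier])

df = spark.createDataFrame([["YOUR TEXT HERE"]]).toDF("text")

model = nlpPipeline.fit(df)

result = model.transform(df)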

```

</div>

## Results

```bash

+---------------------------+
|result                     |
+---------------------------+
|[assignment-and-subletting]|
|[other]                    |
|[other]                    |
|[assignment-and-subletting]|
+---------------------------+

```

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|legclf_assignment_and_subletting_clause|
|Compatibility:|Legal NLP 1.0.0+|
|License:|Licensed|
|Edition:|Official|
|Input Labels:|[sentence_embeddings]|
|Output Labels:|[class]|
|Language:|en|
|Size:|22.7 MB|

## References

Legal documents, scraped from the Internet and classified in-house.

## Benchmarking

```bash
label                      precision  recall  f1-score  support
assignment-and-subletting  1.00       0.96    0.98      26
other                      0.97       1.00    0.99      38
accuracy                   -          -       0.98      64
macro-avg                  0.99       0.98    0.98      64
weighted-avg               0.98       0.98    0.98      64
```
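
To see how the per-class figures above relate to each other, the sketch below recomputes precision, recall and F1 from confusion counts. The counts are an inference from the reported support and rounded metrics (25 of 26 clause examples found, one of them predicted as `other`); they are not published in the model card.

```python
def prf(tp, fp, fn):
    """Return (precision, recall, f1) rounded to two decimals."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return round(precision, 2), round(recall, 2), round(f1, 2)


# Inferred counts (assumption, not from the source)
clause = prf(tp=25, fp=0, fn=1)   # assignment-and-subletting, support 26
other = prf(tp=38, fp=1, fn=0)    # other, support 38
accuracy = round((25 + 38) / 64, 2)

print(clause, other, accuracy)
```

Under these assumed counts the rounded values agree with the table rows for both classes and for accuracy.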
120 changes: 120 additions & 0 deletions docs/_posts/Mary-Sci/2023-01-29-legclf_cusip_numbers_clause_en.md
@@ -0,0 +1,120 @@
---
layout: model
title: Legal Cusip Numbers Clause Binary Classifier
author: John Snow Labs
name: legclf_cusip_numbers_clause
date: 2023-01-29
tags: [en, legal, classification, cusip, numbers, clauses, cusip_numbers, licensed, tensorflow]
task: Text Classification
language: en
edition: Legal NLP 1.0.0
spark_version: 3.0
supported: true
engine: tensorflow
annotator: LegalClassifierDLModel
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

This model is a binary classifier (True/False) for the `cusip-numbers` clause type. To use it, make sure you provide enough context as input. Adding a sentence splitter to the pipeline makes the model see individual sentences rather than the whole text, so it is better to skip it unless you want binary classification at sentence level.

If you have large legal documents and want to look for clauses, we recommend splitting them with any of the techniques available in our Legal NLP Workshop Tokenization & Splitting Tutorial (link [here](https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings_JSL/Legal/1.Tokenization_Splitting.ipynb)), namely:
- Paragraph splitting (by multiline);
- Splitting by headers / subheaders;
- etc.

Keep in mind that the embeddings used by this model accept up to 512 tokens. If your text is longer than that, consider splitting it into smaller pieces (see the tutorial linked above).
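
A minimal sketch of keeping each input under a size budget: paragraphs are packed greedily into chunks until a word limit is reached. The helper `pack_chunks` and the default `max_words=300` are assumptions made for illustration. Whitespace-separated words are only a rough proxy for the tokenizer's token count, so the budget is deliberately set below 512.

```python
def pack_chunks(paragraphs, max_words=300):
    """Greedily group paragraphs into chunks of at most max_words words.

    A single paragraph longer than max_words is kept as its own
    oversized chunk; splitting it further is left to the caller.
    """
    chunks, current, count = [], [], 0
    for paragraph in paragraphs:
        n_words = len(paragraph.split())
        if current and count + n_words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(paragraph)
        count += n_words
    if current:
        chunks.append("\n\n".join(current))
    return chunks


# Synthetic paragraphs of 200, 150 and 100 words
paragraphs = [" ".join(["a"] * 200), " ".join(["b"] * 150), " ".join(["c"] * 100)]
chunks = pack_chunks(paragraphs, max_words=300)
print([len(chunk.split()) for chunk in chunks])
```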

This model can be combined with any of the other 200+ Legal Clause Classifiers available in Models Hub, producing a series of True/False values, one for each legal clause model you add.

## Predicted Entities

`cusip-numbers`, `other`

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_cusip_numbers_clause_en_1.0.0_3.0_1674994284758.zip){:.button.button-orange}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_cusip_numbers_clause_en_1.0.0_3.0_1674994284758.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}

```python

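from johnsnowlabs import nlp, legal

# Start a Spark session (requires the johnsnowlabs package and a valid license)
spark = nlp.start()

# Wrap the raw text column as a Spark NLP document
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Embed the whole document; no sentence splitter, so the classifier sees the full context
embeddings = nlp.BertSentenceEmbeddings.pretrained("sent_bert_base_cased", "en")\
    .setInputCols("document")\
    .setOutputCol("sentence_embeddings")

# Binary clause classifier: predicts `cusip-numbers` or `other`
doc_classifier = legal.ClassifierDLModel.pretrained("legclf_cusip_numbers_clause", "en", "legal/models")\
    .setInputCols(["sentence_embeddings"])\
    .setOutputCol("category")

nlpPipeline = nlp.Pipeline(stages=[
    document_assembler,
    embeddings,
    doc_classifier])

df = spark.createDataFrame([["YOUR TEXT HERE"]]).toDF("text")

model = nlpPipeline.fit(df)

result = model.transform(df)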

```

</div>

## Results

```bash

+---------------+
|result         |
+---------------+
|[cusip-numbers]|
|[other]        |
|[other]        |
|[cusip-numbers]|
+---------------+

```

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|legclf_cusip_numbers_clause|
|Compatibility:|Legal NLP 1.0.0+|
|License:|Licensed|
|Edition:|Official|
|Input Labels:|[sentence_embeddings]|
|Output Labels:|[class]|
|Language:|en|
|Size:|22.7 MB|

## References

Legal documents, scraped from the Internet and classified in-house.

## Benchmarking

```bash
label          precision  recall  f1-score  support
cusip-numbers  0.93       0.96    0.95      28
other          0.97       0.95    0.96      37
accuracy       -          -       0.95      65
macro-avg      0.95       0.96    0.95      65
weighted-avg   0.95       0.95    0.95      65
```
```
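
The `macro-avg` and `weighted-avg` rows can be approximated from the two per-class rows: the macro average is the plain mean over classes, and the weighted average weights each class by its support. The sketch below uses the rounded values from the table, so its results match the reported rows only to within rounding (about ±0.01). The helper names are hypothetical.

```python
# (precision, recall, f1, support) copied from the per-class rows above
rows = {
    "cusip-numbers": (0.93, 0.96, 0.95, 28),
    "other": (0.97, 0.95, 0.96, 37),
}


def macro_avg(rows, index):
    """Unweighted mean of one metric across classes."""
    return sum(row[index] for row in rows.values()) / len(rows)


def weighted_avg(rows, index):
    """Mean of one metric across classes, weighted by class support."""
    total = sum(row[3] for row in rows.values())
    return sum(row[index] * row[3] for row in rows.values()) / total


for name, index in (("precision", 0), ("recall", 1), ("f1-score", 2)):
    print(name, macro_avg(rows, index), weighted_avg(rows, index))
```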
120 changes: 120 additions & 0 deletions docs/_posts/Mary-Sci/2023-01-29-legclf_due_authorization_clause_en.md
@@ -0,0 +1,120 @@
---
layout: model
title: Legal Due Authorization Clause Binary Classifier
author: John Snow Labs
name: legclf_due_authorization_clause
date: 2023-01-29
tags: [en, legal, classification, due, authorization, clauses, due_authorization, licensed, tensorflow]
task: Text Classification
language: en
edition: Legal NLP 1.0.0
spark_version: 3.0
supported: true
engine: tensorflow
annotator: LegalClassifierDLModel
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

This model is a binary classifier (True/False) for the `due-authorization` clause type. To use it, make sure you provide enough context as input. Adding a sentence splitter to the pipeline makes the model see individual sentences rather than the whole text, so it is better to skip it unless you want binary classification at sentence level.

If you have large legal documents and want to look for clauses, we recommend splitting them with any of the techniques available in our Legal NLP Workshop Tokenization & Splitting Tutorial (link [here](https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings_JSL/Legal/1.Tokenization_Splitting.ipynb)), namely:
- Paragraph splitting (by multiline);
- Splitting by headers / subheaders;
- etc.

Keep in mind that the embeddings used by this model accept up to 512 tokens. If your text is longer than that, consider splitting it into smaller pieces (see the tutorial linked above).
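
The header-based splitting mentioned above could look roughly like the sketch below. It assumes headers appear at the start of a line in the form `Section N` or `Article N`; real contracts vary widely, so the `HEADER` pattern, the function name and the sample text are all illustrative assumptions rather than the tutorial's method.

```python
import re

# Assumed header shape: a line starting with "Section 3" or "Article 12"
HEADER = re.compile(r"^(?:Section|Article)\s+\d+[.:]?.*$", re.MULTILINE)


def split_by_headers(text):
    """Split text into a preamble (if any) plus one piece per header."""
    starts = [match.start() for match in HEADER.finditer(text)]
    if not starts:
        return [text.strip()] if text.strip() else []
    bounds = starts + [len(text)]
    sections = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    preamble = text[:starts[0]].strip()
    return ([preamble] if preamble else []) + sections


# Made-up sample document
sample = (
    "This Agreement is made by the parties.\n"
    "Section 1. Due Authorization\n"
    "The Company has full power.\n"
    "Section 2. Notices\n"
    "All notices shall be in writing."
)
sections = split_by_headers(sample)
for section in sections:
    print(repr(section))
```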

This model can be combined with any of the other 200+ Legal Clause Classifiers available in Models Hub, producing a series of True/False values, one for each legal clause model you add.

## Predicted Entities

`due-authorization`, `other`

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/legal/models/legclf_due_authorization_clause_en_1.0.0_3.0_1674993500619.zip){:.button.button-orange}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/legal/models/legclf_due_authorization_clause_en_1.0.0_3.0_1674993500619.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}

```python

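from johnsnowlabs import nlp, legal

# Start a Spark session (requires the johnsnowlabs package and a valid license)
spark = nlp.start()

# Wrap the raw text column as a Spark NLP document
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Embed the whole document; no sentence splitter, so the classifier sees the full context
embeddings = nlp.BertSentenceEmbeddings.pretrained("sent_bert_base_cased", "en")\
    .setInputCols("document")\
    .setOutputCol("sentence_embeddings")

# Binary clause classifier: predicts `due-authorization` or `other`
doc_classifier = legal.ClassifierDLModel.pretrained("legclf_due_authorization_clause", "en", "legal/models")\
    .setInputCols(["sentence_embeddings"])\
    .setOutputCol("category")

nlpPipeline = nlp.Pipeline(stages=[
    document_assembler,
    embeddings,
    doc_classifier])

df = spark.createDataFrame([["YOUR TEXT HERE"]]).toDF("text")

model = nlpPipeline.fit(df)

result = model.transform(df)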

```

</div>

## Results

```bash

+-------------------+
|result             |
+-------------------+
|[due-authorization]|
|[other]            |
|[other]            |
|[due-authorization]|
+-------------------+

```

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|legclf_due_authorization_clause|
|Compatibility:|Legal NLP 1.0.0+|
|License:|Licensed|
|Edition:|Official|
|Input Labels:|[sentence_embeddings]|
|Output Labels:|[class]|
|Language:|en|
|Size:|22.7 MB|

## References

Legal documents, scraped from the Internet and classified in-house.

## Benchmarking

```bash
label              precision  recall  f1-score  support
due-authorization  0.98       1.00    0.99      61
other              1.00       0.99    1.00      106
accuracy           -          -       0.99      167
macro-avg          0.99       1.00    0.99      167
weighted-avg       0.99       0.99    0.99      167
```
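
As a sanity check on a table like this, one can search for the integer confusion counts that reproduce the rounded per-class precision and recall given the supports. The sketch below is a brute-force illustration; `consistent_counts` is a hypothetical helper, and the counts it finds are inferred from the reported figures rather than published by the authors.

```python
def consistent_counts(n_pos, n_neg, reported):
    """Find (tp, tn) pairs whose rounded metrics match `reported`.

    reported = (precision_pos, recall_pos, precision_neg, recall_neg),
    each rounded to two decimals as in the benchmark table.
    """
    solutions = []
    for tp in range(n_pos + 1):
        for tn in range(n_neg + 1):
            fn, fp = n_pos - tp, n_neg - tn
            if tp + fp == 0 or tn + fn == 0:
                continue  # precision undefined when a class is never predicted
            metrics = (tp / (tp + fp), tp / n_pos, tn / (tn + fn), tn / n_neg)
            if all(round(m, 2) == r for m, r in zip(metrics, reported)):
                solutions.append((tp, tn))
    return solutions


# Supports and rounded metrics taken from the table above
solutions = consistent_counts(61, 106, (0.98, 1.00, 1.00, 0.99))
print(solutions)
```

A single matching pair suggests the reported rows are internally consistent: all 61 clause examples recovered and one `other` example predicted as `due-authorization`.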