Description
(BTW, JSL team does great work--thank you!)
On a Mac M1, BertEmbeddings.pretrained() crashes with the error:
no jnitensorflow in java.library.path
I recognize that TensorFlow and Spark NLP on Mac M1 has been a long, ongoing discussion, and I have read many online posts, issues, and PRs about it. I am posting this issue because the JSL installation instructions imply that everything should work on a Mac M1, and I hope to consolidate and clarify those discussions here.
Is this truly a bug, or have I just set one Java/Scala/Spark/JSL environment variable slightly wrong?
I also cross-posted this issue to a known working JSL demo project at maziyarpanahi/spark-nlp-starter#1.
Thank you for any help with this.
Expected Behavior
BertEmbeddings.pretrained() should download and load the model without crashing.
Current Behavior
Startup:
$ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-m1_2.12:4.2.3
...
com.johnsnowlabs.nlp#spark-nlp-m1_2.12 added as a dependency
...
found com.johnsnowlabs.nlp#tensorflow-m1_2.12;0.4.3 in central
:: resolution report :: resolve 995ms :: artifacts dl 43ms
:: modules in use:
com.amazonaws#aws-java-sdk-bundle;1.11.828 from central in [default]
...
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.2.2
      /_/
Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 1.8.0_292)
Type in expressions to have them evaluated.
Type :help for more information.
Code:
scala> import com.johnsnowlabs.nlp.SparkNLP
scala> val spark = SparkNLP.start(m1 = true)
scala> import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
scala> val explainDocumentPipeline = PretrainedPipeline("explain_document_ml")
explain_document_ml download started this may take some time.
Approximate size to download 9.2 MB
Download done! Loading the resource.
explainDocumentPipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline =
PretrainedPipeline(explain_document_ml,en,public/models,false,None)
scala> val annotations = explainDocumentPipeline.annotate(
"We are very happy about SparkNLP")
annotations: Map[String,Seq[String]] = Map(
document -> List(We are very happy about SparkNLP),
spell -> ArraySeq(We, are, very, happy, about, SparkNLP),
pos -> ArrayBuffer(PRP, VBP, RB, JJ, IN, NNP),
lemmas -> ArraySeq(We, be, very, happy, about, SparkNLP),
token -> ArraySeq(We, are, very, happy, about, SparkNLP),
stems -> ArraySeq(we, ar, veri, happi, about, sparknlp),
sentence -> ArraySeq(We are very happy about SparkNLP))
scala> import com.johnsnowlabs.nlp.annotator.BertEmbeddings
import com.johnsnowlabs.nlp.annotator.BertEmbeddings
scala> val electra = BertEmbeddings.pretrained("electra_base_uncased", "en")
electra_base_uncased download started this may take some time.
Approximate size to download 389.1 MB
Download done! Loading the resource.
java.lang.UnsatisfiedLinkError: no jnitensorflow in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860)
...
Caused by: java.lang.UnsatisfiedLinkError: Could not find jnitensorflow in class, module, and library paths.
at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1705)
... 97 more
Possible Solution
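I do not have a confirmed fix. One thing that may be worth checking, purely as an assumption on my part, is whether the JVM itself is a native arm64 build, since the M1 native libraries presumably cannot load into an x86_64 JVM running under Rosetta. The sketch below only prints standard JVM system properties; the names keys, jvmInfo, and isArm64Jvm are my own and are not part of Spark NLP.

```scala
// Diagnostic sketch: print the JVM properties that influence which native
// libraries can be loaded. Paste into spark-shell or run as a Scala script.
val keys = Seq("os.name", "os.arch", "java.vendor", "java.version", "java.library.path")

// Collect each property, substituting a marker if the JVM does not define it.
val jvmInfo: Map[String, String] =
  keys.map(k => k -> Option(System.getProperty(k)).getOrElse("<unset>")).toMap

keys.foreach(k => println(s"$k = ${jvmInfo(k)}"))

// Assumption: a native arm64 JVM on Apple Silicon reports os.arch = "aarch64",
// while an x86_64 JVM running under Rosetta reports "x86_64".
val isArm64Jvm = jvmInfo("os.arch") == "aarch64"
println(s"JVM is native arm64: $isArm64Jvm")
```

If os.arch does not report aarch64, trying a native arm64 JDK build might be a reasonable next step, though I have not verified that this resolves the error.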
Steps to Reproduce
- See above.
Context
I want to develop and test with an IDE on a local Mac M1, then deploy to Databricks.
Your Environment
Hardware:
Model Name: MacBook Pro
Model Identifier: MacBookPro17,1
Chip: Apple M1
Total Number of Cores: 8 (4 performance and 4 efficiency)
Memory: 16 GB
Software:
System Version: macOS 12.6.1 (21G217)
Kernel Version: Darwin 21.6.0
Boot Volume: Macintosh HD
java -version
openjdk version "1.8.0_292"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_292-b10)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.292-b10, mixed mode)
scala> SparkNLP.version()
res1: String = 4.2.3
scala> spark.version
res2: String = 3.2.2