Mac M1: jnitensorflow error with BertEmbeddings.pretrained #13079

@rwoodard-prog

(BTW, JSL team does great work--thank you!)

Description

On a Mac M1, BertEmbeddings.pretrained() crashes with the error:

no jnitensorflow in java.library.path

I recognize that TensorFlow and Spark NLP on Mac M1 has been a long, ongoing discussion, and I have read many online posts, issues, and PRs about it. I am posting this issue because the JSL installation instructions imply that everything should work on a Mac M1, and I hope to consolidate and clarify those discussions here.

Is this truly a bug, or do I just have one slightly wrong Java/Scala/Spark/JSL environment variable?

I cross-posted this issue with a known working JSL demo project at maziyarpanahi/spark-nlp-starter#1.

Thank you for any help with this.

Expected Behavior

BertEmbeddings.pretrained() should load the model without crashing.

Current Behavior

Startup:

$ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-m1_2.12:4.2.3
...
com.johnsnowlabs.nlp#spark-nlp-m1_2.12 added as a dependency
...
	found com.johnsnowlabs.nlp#tensorflow-m1_2.12;0.4.3 in central
:: resolution report :: resolve 995ms :: artifacts dl 43ms
	:: modules in use:
	com.amazonaws#aws-java-sdk-bundle;1.11.828 from central in [default]
...
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.2.2
      /_/

Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 1.8.0_292)
Type in expressions to have them evaluated.
Type :help for more information.

Code:

scala> import com.johnsnowlabs.nlp.SparkNLP

scala> val spark = SparkNLP.start(m1 = true)

scala> import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

scala> val explainDocumentPipeline = PretrainedPipeline("explain_document_ml")
explain_document_ml download started this may take some time.
Approximate size to download 9.2 MB
Download done! Loading the resource.
explainDocumentPipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline =
   PretrainedPipeline(explain_document_ml,en,public/models,false,None)

scala> val annotations = explainDocumentPipeline.annotate(
    "We are very happy about SparkNLP")
annotations: Map[String,Seq[String]] = Map(
    document -> List(We are very happy about SparkNLP), 
    spell -> ArraySeq(We, are, very, happy, about, SparkNLP), 
    pos -> ArrayBuffer(PRP, VBP, RB, JJ, IN, NNP), 
    lemmas -> ArraySeq(We, be, very, happy, about, SparkNLP), 
    token -> ArraySeq(We, are, very, happy, about, SparkNLP), 
    stems -> ArraySeq(we, ar, veri, happi, about, sparknlp), 
    sentence -> ArraySeq(We are very happy about SparkNLP))

scala> import com.johnsnowlabs.nlp.annotator.BertEmbeddings
import com.johnsnowlabs.nlp.annotator.BertEmbeddings

scala> val electra = BertEmbeddings.pretrained("electra_base_uncased", "en")
electra_base_uncased download started this may take some time.
Approximate size to download 389.1 MB
Download done! Loading the resource.
java.lang.UnsatisfiedLinkError: no jnitensorflow in java.library.path
  at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860)
...
Caused by: java.lang.UnsatisfiedLinkError: Could not find jnitensorflow in class, module, and library paths.
  at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1705)
  ... 97 more
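
The UnsatisfiedLinkError above says the JVM could not locate a native library named jnitensorflow. A quick check that may help narrow this down is to print which architecture the running JVM reports and which directories it searches for native libraries. The following is only a diagnostic sketch: the object name `JvmNativeInfo` and its helper methods are hypothetical and not part of Spark NLP, and it assumes (without this issue confirming it) that an architecture mismatch is worth ruling out. It uses only standard JVM system properties and can be pasted into the same spark-shell session with `:paste`.

```scala
// Diagnostic sketch (hypothetical helper, not part of Spark NLP):
// report the architecture the JVM believes it runs on and the
// directories it searches for native libraries.
object JvmNativeInfo {

  /** Split a java.library.path value into its non-empty entries. */
  def libraryPathEntries(
      raw: String,
      sep: String = java.io.File.pathSeparator): Seq[String] =
    Option(raw).toSeq
      .flatMap(_.split(java.util.regex.Pattern.quote(sep)))
      .filter(_.nonEmpty)

  /** True when os.arch names a 64-bit ARM JVM. */
  def isArm64(osArch: String): Boolean =
    Set("aarch64", "arm64").contains(Option(osArch).getOrElse("").toLowerCase)

  /** Build a short, human-readable report from JVM system properties. */
  def report(): String = {
    val arch    = System.getProperty("os.arch")
    val entries = libraryPathEntries(System.getProperty("java.library.path"))
    (Seq(
      s"os.name   = ${System.getProperty("os.name")}",
      s"os.arch   = $arch (arm64 JVM: ${isArm64(arch)})",
      s"java.home = ${System.getProperty("java.home")}",
      "java.library.path entries:"
    ) ++ entries.map("  " + _)).mkString("\n")
  }
}

println(JvmNativeInfo.report())
```

If `os.arch` is not an arm64 value on an M1 machine, the JVM is likely running as an x86_64 process, which would be one candidate explanation to investigate. The output alone does not prove that this is the cause of the error.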

Possible Solution

Steps to Reproduce

  1. See above.

Context

I want to develop and test with an IDE on a local Mac M1, then deploy to Databricks.

Your Environment

Hardware:

  Model Name:	MacBook Pro
  Model Identifier:	MacBookPro17,1
  Chip:	Apple M1
  Total Number of Cores:	8 (4 performance and 4 efficiency)
  Memory:	16 GB

Software:

  System Version:	macOS 12.6.1 (21G217)
  Kernel Version:	Darwin 21.6.0
  Boot Volume:	Macintosh HD

java -version
openjdk version "1.8.0_292"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_292-b10)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.292-b10, mixed mode)

scala> SparkNLP.version()
res1: String = 4.2.3

scala> spark.version
res2: String = 3.2.2
