Spark NLP 4.2.7: Patch release
Overview
Spark NLP 4.2.7 comes with some important bug fixes and improvements. We highly recommend updating to this latest version if you are using Spark NLP 4.2.x.
As always, we would like to thank our community for their feedback, questions, and feature requests.
Bug Fixes & Enhancements
- Fix the outputAnnotatorType issue in pipelines with the Finisher annotator. This change adds outputAnnotatorType to AnnotatorTransformer to avoid loading the outputAnnotatorType attribute when a stage in the pipeline does not use it
- Fix the wrong sentence index calculation in metadata by annotators in the pipeline when the setExplodeSentences param was set to true in the SentenceDetector annotator
- Fix the issue in Tokenizer when a custom pattern uses lookaheads/lookbehinds and produces zero-width matches. This led to indexes not being calculated correctly
- Fix embeddings missing from the output of the .fullAnnotate() method when the parseEmbeddings param was set to True/true
- Fix broken links to the Python API pages, as the generation of the PyDocs was slightly changed in a previous release. This makes the Python APIs accessible from the Annotators and Transformers pages like before
- Change default values of the explodeEntities and mergeEntities parameters to true in the GraphExtraction annotator
- Better error handling when there are empty paths/relations in the GraphExtraction annotator. The new message better guides the user on how to configure GraphExtraction to output meaningful relationships
- Removed the duplicated definition of method setWeightedDistPath from ContextSpellCheckerApproach
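To illustrate why zero-width matches in the Tokenizer fix above need care, here is a minimal sketch in plain Python using only the standard re module. It is not Spark NLP's implementation: the helper name split_with_offsets and the sample text are hypothetical, and the inclusive end index is an assumption about the begin/end convention used by annotations. A lookahead such as (?=[A-Z]) matches at a position without consuming any characters, so the next token has to start exactly where the match was found.

```python
import re


def split_with_offsets(text, pattern):
    """Split text at every match of pattern.

    Returns (token, begin, end) triples where end is inclusive
    (an assumed convention, chosen for this illustration).
    """
    tokens = []
    start = 0
    for match in re.finditer(pattern, text):
        # Emit the text between the previous split point and this match.
        if match.start() > start:
            tokens.append((text[start:match.start()], start, match.start() - 1))
        # For a zero-width match, end() == start(): nothing is consumed,
        # so the next token begins at the match position itself.
        start = match.end()
    # Emit whatever remains after the last match.
    if start < len(text):
        tokens.append((text[start:], start, len(text) - 1))
    return tokens


text = "helloWorldAgain"
# Zero-width lookahead: split before every uppercase letter.
tokens = split_with_offsets(text, r"(?=[A-Z])")
# A consuming pattern goes through the same code path.
csv_tokens = split_with_offsets("a, b", r",\s*")
```

Every offset in tokens slices back to its token, that is, text[begin:end + 1] == token, which is the property that breaks when a zero-width match is treated as if it had consumed a character.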
Documentation
- TF Hub & HuggingFace to Spark NLP
- Models Hub with new models
- Spark NLP documentation
- Spark NLP Scala APIs
- Spark NLP Python APIs
- Spark NLP Workshop notebooks
- Spark NLP publications
- Spark NLP in Action
- Spark NLP training certification notebooks for Google Colab and Databricks
- Spark NLP Display for visualization of different types of annotations
- Discussions: engage with other community members, share ideas, and show off how you use Spark NLP!
Installation
Python
#PyPI
pip install spark-nlp==4.2.7
Spark Packages
spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, and 3.3.x (Scala 2.12):
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.2.7
pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.2.7
GPU
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:4.2.7
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:4.2.7
M1
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-m1_2.12:4.2.7
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-m1_2.12:4.2.7
AArch64
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:4.2.7
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:4.2.7
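The four coordinates above differ only in the artifact name, so a script that launches Spark on mixed hardware can derive them from one table. The following is a small sketch under that assumption; the helper package_coordinate and the variant keys are hypothetical names introduced for illustration and are not part of Spark NLP.

```python
GROUP_ID = "com.johnsnowlabs.nlp"
SCALA_VERSION = "2.12"
SPARK_NLP_VERSION = "4.2.7"

# Artifact names taken from the --packages commands listed above.
ARTIFACTS = {
    "cpu": "spark-nlp",
    "gpu": "spark-nlp-gpu",
    "m1": "spark-nlp-m1",
    "aarch64": "spark-nlp-aarch64",
}


def package_coordinate(variant="cpu"):
    """Build the group:artifact_scala:version string for --packages."""
    artifact = ARTIFACTS[variant]  # raises KeyError for an unknown variant
    return f"{GROUP_ID}:{artifact}_{SCALA_VERSION}:{SPARK_NLP_VERSION}"


# The result can be passed to --packages on the command line, or set as
# the spark.jars.packages configuration value when building a session.
gpu_coordinate = package_coordinate("gpu")
```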
Maven
spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, and 3.3.x:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp_2.12</artifactId>
<version>4.2.7</version>
</dependency>
spark-nlp-gpu:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-gpu_2.12</artifactId>
<version>4.2.7</version>
</dependency>
spark-nlp-m1:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-m1_2.12</artifactId>
<version>4.2.7</version>
</dependency>
spark-nlp-aarch64:
<!-- https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-aarch64 -->
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-aarch64_2.12</artifactId>
<version>4.2.7</version>
</dependency>
FAT JARs
- CPU on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-assembly-4.2.7.jar
- GPU on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-assembly-4.2.7.jar
- M1 on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-m1-assembly-4.2.7.jar
- AArch64 on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-aarch64-assembly-4.2.7.jar
What's Changed
@dcecchini @Cabir40 @agsfer @gadde5300 @bunyamin-polat @rpranab @jdobes-cz @josejuanmartinez @diatrambitas @maziyarpanahi
Full Changelog: 4.2.6...4.2.7