Spark NLP 4.2.7: Patch release

@maziyarpanahi maziyarpanahi released this 12 Jan 16:46
· 672 commits to master since this release

📢 Overview

Spark NLP 4.2.7 🚀 comes with some important bug fixes and improvements. We therefore highly recommend updating to this latest version if you are using Spark NLP 4.2.x.

As always, we would like to thank our community for their feedback, questions, and feature requests. 🎉


🐛 ⭐ Bug Fixes & Enhancements

  • Fix outputAnnotatorType issue in pipelines with the Finisher annotator. This change adds outputAnnotatorType to AnnotatorTransformer to avoid loading the outputAnnotatorType attribute when a stage in the pipeline does not use it.
  • Fix the wrong sentence index calculation in the metadata produced by annotators in the pipeline when the setExplodeSentences param was set to true in the SentenceDetector annotator
  • Fix the issue in Tokenizer when a custom pattern uses lookaheads/lookbehinds and produces zero-width matches. This led to indexes not being calculated correctly
  • Fix embeddings missing from the output of the .fullAnnotate() method when the parseEmbeddings param was set to True/true
  • Fix broken links to the Python API pages, as the generation of the PyDocs was slightly changed in a previous release. This makes the Python APIs accessible from the Annotators and Transformers pages like before
  • Change the default values of the explodeEntities and mergeEntities parameters to true in the GraphExtraction annotator
  • Better error handling when there are empty paths/relations in the GraphExtraction annotator. The new message better guides the user on how to configure GraphExtraction to output meaningful relationships
  • Removed the duplicated definition of method setWeightedDistPath from ContextSpellCheckerApproach
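
To make the Tokenizer fix above more concrete, here is a minimal sketch of why zero-width matches complicate index calculation. It is not Spark NLP's Tokenizer code: it uses Python's standard `re` module purely for illustration, and `split_with_offsets` is a hypothetical helper. A pattern built only from a lookbehind and a lookahead matches a position rather than any characters, so the match start and end are equal and the token offsets must be derived from the surrounding text.

```python
import re


def split_with_offsets(text, pattern):
    """Split text at every match of pattern.

    Returns (token, begin, end) triples, where end is inclusive.
    Hypothetical helper for illustration only.
    """
    tokens = []
    start = 0
    for match in re.finditer(pattern, text):
        # For a zero-width match, match.start() == match.end(),
        # so no characters are consumed by the separator.
        if match.start() > start:
            tokens.append((text[start:match.start()], start, match.start() - 1))
        start = match.end()
    if start < len(text):
        tokens.append((text[start:], start, len(text) - 1))
    return tokens


# Zero-width boundary: a lowercase letter followed by an uppercase letter.
boundary = r"(?<=[a-z])(?=[A-Z])"
print(split_with_offsets("camelCaseWord", boundary))
# [('camel', 0, 4), ('Case', 5, 8), ('Word', 9, 12)]
```

With a separator that consumes characters, such as `-`, the same helper skips the separator's width, which is the case that index logic usually assumes.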

📖 Documentation


Installation

Python

#PyPI

pip install spark-nlp==4.2.7
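
After installing, it can be useful to check which version is actually present in the environment. The sketch below relies only on the standard library's `importlib.metadata`; the helper names `installed_version` and `needs_upgrade` are hypothetical, and the comparison assumes plain numeric versions such as `4.2.6` (no pre-release suffixes).

```python
from importlib import metadata


def installed_version(package="spark-nlp"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def needs_upgrade(version, target=(4, 2, 7)):
    """True when version is in the same major.minor line as target but older."""
    # Assumes a purely numeric "X.Y.Z" version string.
    parts = tuple(int(p) for p in version.split(".")[:3])
    return parts[:2] == target[:2] and parts < target


current = installed_version()
if current is None:
    print("spark-nlp is not installed in this environment")
elif needs_upgrade(current):
    print(f"spark-nlp {current} is an older 4.2.x release; consider 4.2.7")
else:
    print(f"spark-nlp {current} found")
```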

Spark Packages

spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, and 3.3.x (Scala 2.12):

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.2.7

pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.2.7

GPU

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:4.2.7

pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:4.2.7

M1

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-m1_2.12:4.2.7

pyspark --packages com.johnsnowlabs.nlp:spark-nlp-m1_2.12:4.2.7

AArch64

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:4.2.7

pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:4.2.7

Maven

spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, and 3.3.x:

<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp_2.12</artifactId>
    <version>4.2.7</version>
</dependency>

spark-nlp-gpu:

<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp-gpu_2.12</artifactId>
    <version>4.2.7</version>
</dependency>

spark-nlp-m1:

<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp-m1_2.12</artifactId>
    <version>4.2.7</version>
</dependency>

spark-nlp-aarch64:

<!-- https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-aarch64 -->
<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp-aarch64_2.12</artifactId>
    <version>4.2.7</version>
</dependency>

FAT JARs

What's Changed

@dcecchini @Cabir40 @agsfer @gadde5300 @bunyamin-polat @rpranab @jdobes-cz @josejuanmartinez @diatrambitas @maziyarpanahi

Full Changelog: 4.2.6...4.2.7