Commit 7d9ce60

DOCS-12790 (#32)
* DOCS-12790 - Spark 2.4.1 release notes and updates.
* DOCS-12790 - A few more edits.
* DOCS-12790 - Update landing page.
1 parent 578336e commit 7d9ce60

File tree

9 files changed: +53, -14 lines changed


conf.py

Lines changed: 3 additions & 3 deletions
@@ -62,9 +62,9 @@
 }

 source_constants = {
-    'current-version': '2.4.0',
-    'spark-core-version': '2.4.0',
-    'spark-sql-version': '2.4.0'
+    'current-version': '2.4.1',
+    'spark-core-version': '2.4.1',
+    'spark-sql-version': '2.4.1'
 }

 intersphinx_mapping = {}

source/configuration.txt

Lines changed: 12 additions & 2 deletions
@@ -96,6 +96,10 @@ The following options for reading from MongoDB are available:

      - Required. The collection name from which to read data.

+   * - ``batchSize``
+
+     - Size of the internal batches used within the cursor.
+
    * - ``localThreshold``

      - The threshold (in milliseconds) for choosing a server from

@@ -449,6 +453,12 @@ The following options for writing to MongoDB are available:

      - Required. The collection name to write data to

+   * - ``extendedBsonTypes``
+
+     - Enables extended BSON types when writing data to MongoDB.
+
+       *Default*: ``true``
+
    * - ``localThreshold``

      - The threshold (milliseconds) for choosing a server from multiple

@@ -555,7 +565,7 @@ share the MongoClient across threads.
    * - System Property name
      - Description

-   * - ``spark.mongodb.keep_alive_ms``
-     - The length of time to keep a MongoClient available for sharing.
+   * - ``mongodb.keep_alive_ms``
+     - The length of time to keep a ``MongoClient`` available for sharing.

       *Default*: 5000
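
Both new options are passed the same way as the existing ``ReadConfig``/``WriteConfig`` options, via ``.option()`` on the reader or writer. A minimal ``pyspark`` sketch (the URI and the ``people.contacts``/``people.contacts_copy`` namespaces below are illustrative assumptions, not part of this change):

.. code-block:: python

   # Read with the new ``batchSize`` option, which sizes the internal
   # batches used within the cursor.
   df = spark.read.format("mongo") \
       .option("uri", "mongodb://127.0.0.1/people.contacts") \
       .option("batchSize", "1000") \
       .load()

   # Write with the new ``extendedBsonTypes`` option disabled
   # (it defaults to ``true``).
   df.write.format("mongo") \
       .mode("append") \
       .option("uri", "mongodb://127.0.0.1/people.contacts_copy") \
       .option("extendedBsonTypes", "false") \
       .save()

The renamed ``mongodb.keep_alive_ms`` setting is a JVM system property rather than a read/write option, so it would be set on the driver JVM (for example, by adding ``-Dmongodb.keep_alive_ms=10000`` to the driver's JVM options) instead of through ``.option()``.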

source/index.txt

Lines changed: 4 additions & 0 deletions
@@ -53,6 +53,10 @@ versions of Apache Spark and MongoDB:

 .. admonition:: Announcements

+   - **Jun 06, 2019**, `MongoDB Connector for Spark versions v2.4.1,
+     v2.3.3, v2.2.7, and v2.1.6
+     <https://www.mongodb.com/products/spark-connector>`_ Released.
+
    - **Dec 07, 2018**, `MongoDB Connector for Spark versions v2.4.0,
      v2.3.2, v2.2.6, and v2.1.5
      <https://www.mongodb.com/products/spark-connector>`_ Released.

source/python/aggregation.txt

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ to use when creating a DataFrame.
 .. code-block:: none

    pipeline = "{'$match': {'type': 'apple'}}"
-   df = spark.read.format("com.mongodb.spark.sql.DefaultSource").option("pipeline", pipeline).load()
+   df = spark.read.format("mongo").option("pipeline", pipeline).load()
    df.show()

 In the ``pyspark`` shell, the operation prints the following output:
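
The snippet above assumes a ``pyspark`` shell already configured with the connector; a self-contained sketch of the same pipeline read follows. The shell invocation, connector coordinates, and ``test.fruit`` namespace are assumptions, not part of this change:

.. code-block:: python

   # pyspark started with, for example:
   #   ./bin/pyspark \
   #     --conf "spark.mongodb.input.uri=mongodb://127.0.0.1/test.fruit" \
   #     --packages org.mongodb.spark:mongo-spark-connector_2.11:2.4.1
   pipeline = "{'$match': {'type': 'apple'}}"
   df = spark.read.format("mongo").option("pipeline", pipeline).load()
   df.show()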

source/python/filters-and-sql.txt

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ source:

 .. code-block:: python

-   df = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()
+   df = spark.read.format("mongo").load()

 The following example includes only
 records in which the ``qty`` field is greater than or equal to ``10``.

source/python/read-from-mongodb.txt

Lines changed: 2 additions & 2 deletions
@@ -22,7 +22,7 @@ from within the ``pyspark`` shell.

 .. code-block:: python

-   df = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()
+   df = spark.read.format("mongo").load()

 Spark samples the records to infer the schema of the collection.

@@ -47,5 +47,5 @@ To read from a collection called ``contacts`` in a database called

 .. code-block:: python

-   df = spark.read.format("com.mongodb.spark.sql.DefaultSource").option("uri",
+   df = spark.read.format("mongo").option("uri",
    "mongodb://127.0.0.1/people.contacts").load()

source/python/write-to-mongodb.txt

Lines changed: 2 additions & 2 deletions
@@ -28,7 +28,7 @@ by using the ``write`` method:

 .. code-block:: python

-   people.write.format("com.mongodb.spark.sql.DefaultSource").mode("append").save()
+   people.write.format("mongo").mode("append").save()

 The above operation writes to the MongoDB database and collection
 specified in the :ref:`spark.mongodb.output.uri<pyspark-shell>` option

@@ -83,5 +83,5 @@ To write to a collection called ``contacts`` in a database called

 .. code-block:: python

-   people.write.format("com.mongodb.spark.sql.DefaultSource").mode("append").option("database",
+   people.write.format("mongo").mode("append").option("database",
    "people").option("collection", "contacts").save()

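The ``people`` DataFrame is assumed to exist in the snippets above; a short sketch showing one way to build it and write it with the new short-form source name (the sample rows are illustrative, not from this commit):

.. code-block:: python

   # Build a small DataFrame to stand in for ``people``.
   people = spark.createDataFrame(
       [("Bilbo Baggins", 50), ("Gandalf", 1000)], ["name", "age"])

   # Write it using the short-form source name introduced in 2.4.1.
   people.write.format("mongo").mode("append") \
       .option("database", "people") \
       .option("collection", "contacts") \
       .save()
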
source/release-notes.txt

Lines changed: 25 additions & 0 deletions
@@ -4,6 +4,31 @@ Release Notes

 .. default-domain:: mongodb

+MongoDB Connector for Spark `2.4.1`_
+------------------------------------
+
+*Released on June 6, 2019*
+
+- Ensures nullable fields or container types accept ``null`` values.
+- Added the ``ReadConfig.batchSize`` property. For more information, see
+  :ref:`spark-input-conf`.
+- Renamed the system property ``spark.mongodb.keep_alive_ms`` to
+  ``mongodb.keep_alive_ms``.
+- Added ``MongoDriverInformation`` to the default ``MongoClient``.
+- Updated to the latest Java driver (3.10.+).
+- Updated ``PartitionerHelper.matchQuery`` so it no longer includes
+  ``$ne``/``$exists`` checks.
+- Added logging support for partitioners and their queries.
+- Added the ``WriteConfig.extendedBsonTypes`` setting so users can disable
+  extended BSON types when writing. For more information, see
+  :ref:`spark-output-conf`.
+- The Java SPI can now use the short form ``spark.read.format("mongo")``.
+- ``spark.read.format("mongo")`` can be used in place of
+  ``spark.read.format("com.mongodb.spark.sql")`` and
+  ``spark.read.format("com.mongodb.spark.sql.DefaultSource")``.
+
+.. _2.4.1: https://github.com/mongodb/mongo-spark/compare/2.4.0...r2.4.1
+
 MongoDB Connector for Spark `2.4.0`_
 ------------------------------------

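
To illustrate the last two items, all three format names resolve to the same data source in 2.4.1, so the following reads are interchangeable (a sketch that assumes a session already configured with ``spark.mongodb.input.uri``):

.. code-block:: python

   # Short form and long forms load the same MongoDB source.
   df_short = spark.read.format("mongo").load()
   df_long = spark.read.format("com.mongodb.spark.sql").load()
   df_full = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()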

source/scala/datasets-and-sql.txt

Lines changed: 3 additions & 3 deletions
@@ -103,15 +103,15 @@ Alternatively, you can use ``SparkSession`` methods to create DataFrames:
    ) // ReadConfig used for configuration

    val df4 = sparkSession.read.mongo() // SparkSession used for configuration
-   sqlContext.read.format("com.mongodb.spark.sql").load()
+   sqlContext.read.format("mongo").load()

    // Set custom options
    import com.mongodb.spark.config._

    val customReadConfig = ReadConfig(Map("readPreference.name" -> "secondaryPreferred"), Some(ReadConfig(sc)))
    val df5 = sparkSession.read.mongo(customReadConfig)

-   val df6 = sparkSession.read.format("com.mongodb.spark.sql").options(customReadConfig.asOptions).load()
+   val df6 = sparkSession.read.format("mongo").options(customReadConfig.asOptions).load()

 .. _scala-dataset-filters:

@@ -260,7 +260,7 @@ to MongoDB using the DataFrameWriter directly:
 .. code-block:: scala

    centenarians.write.option("collection", "hundredClub").mode("overwrite").mongo()
-   centenarians.write.option("collection", "hundredClub").mode("overwrite").format("com.mongodb.spark.sql").save()
+   centenarians.write.option("collection", "hundredClub").mode("overwrite").format("mongo").save()

 DataTypes
 ---------
