Commit c3f27b2

kjmrknsn authored and srowen committed
[MINOR][DOCS] Fix typos
## What changes were proposed in this pull request?

Fix typos. This PR is the complete version of apache#23145.

## How was this patch tested?

N/A

Closes apache#23185 from kjmrknsn/docUpdate.

Authored-by: Keiji Yoshida <kjmrknsn@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
1 parent 2b2c94a commit c3f27b2

File tree: 8 files changed, +13 −13 lines changed


docs/configuration.md

Lines changed: 1 addition & 1 deletion
@@ -498,7 +498,7 @@ Apart from these, the following properties are also available, and may be useful
   <td>
     Reuse Python worker or not. If yes, it will use a fixed number of Python workers,
     does not need to fork() a Python process for every task. It will be very useful
-    if there is large broadcast, then the broadcast will not be needed to transferred
+    if there is a large broadcast, then the broadcast will not need to be transferred
     from JVM to Python worker for every task.
   </td>
 </tr>
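The property this hunk describes is, as far as the surrounding docs page shows, `spark.python.worker.reuse`; the key itself sits outside the hunk, so treat the name here as an assumption. A minimal `spark-defaults.conf` sketch:

```
# spark-defaults.conf (sketch; property name assumed from the docs page)
# Keep Python workers alive between tasks so a large broadcast is shipped
# from the JVM to each worker once, not once per task.
spark.python.worker.reuse  true
```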

docs/graphx-programming-guide.md

Lines changed: 2 additions & 2 deletions
@@ -522,7 +522,7 @@ val joinedGraph = graph.joinVertices(uniqueCosts,

 A key step in many graph analytics tasks is aggregating information about the neighborhood of each
 vertex.
-For example, we might want to know the number of followers each user has or the average age of the
+For example, we might want to know the number of followers each user has or the average age of
 the followers of each user. Many iterative graph algorithms (e.g., PageRank, Shortest Path, and
 connected components) repeatedly aggregate properties of neighboring vertices (e.g., current
 PageRank Value, shortest path to the source, and smallest reachable vertex id).
@@ -700,7 +700,7 @@ a new value for the vertex property, and then send messages to neighboring verti
 super step. Unlike Pregel, messages are computed in parallel as a
 function of the edge triplet and the message computation has access to both the source and
 destination vertex attributes. Vertices that do not receive a message are skipped within a super
-step. The Pregel operators terminates iteration and returns the final graph when there are no
+step. The Pregel operator terminates iteration and returns the final graph when there are no
 messages remaining.

 > Note, unlike more standard Pregel implementations, vertices in GraphX can only send messages to
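The loop this hunk documents (vertices receive messages, update, send new messages, and iteration stops when no messages remain) can be sketched without GraphX. Below is a plain-Python imitation of a Pregel-style iteration, not the GraphX `Pregel` API; it propagates the smallest reachable vertex id, one of the examples the surrounding text names, and all function and variable names are illustrative:

```python
def pregel_min_id(vertices, edges):
    """Pregel-style loop: propagate the smallest reachable vertex id
    along undirected edges until no messages remain."""
    state = {v: v for v in vertices}   # smallest id seen so far per vertex
    active = set(vertices)             # vertices that changed last superstep
    while active:                      # terminate when no messages remain
        messages = {}                  # dst -> smallest id received this step
        for src, dst in edges:
            for a, b in ((src, dst), (dst, src)):
                if a in active:
                    messages[b] = min(messages.get(b, state[a]), state[a])
        # Vertices with no message are skipped; the rest update if improved.
        active = set()
        for v, msg in messages.items():
            if msg < state[v]:
                state[v] = msg
                active.add(v)
    return state
```

On the graph 1–2–3 with an isolated vertex 4, `pregel_min_id([1, 2, 3, 4], [(1, 2), (2, 3)])` converges to `{1: 1, 2: 1, 3: 1, 4: 4}`.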

docs/ml-datasource.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ displayTitle: Data sources
 ---

 In this section, we introduce how to use data source in ML to load data.
-Beside some general data sources such as Parquet, CSV, JSON and JDBC, we also provide some specific data sources for ML.
+Besides some general data sources such as Parquet, CSV, JSON and JDBC, we also provide some specific data sources for ML.

 **Table of Contents**

docs/ml-features.md

Lines changed: 4 additions & 4 deletions
@@ -359,7 +359,7 @@ Assume that we have the following DataFrame with columns `id` and `raw`:
 ~~~~
  id | raw
 ----|----------
- 0  | [I, saw, the, red, baloon]
+ 0  | [I, saw, the, red, balloon]
  1  | [Mary, had, a, little, lamb]
 ~~~~

@@ -369,7 +369,7 @@ column, we should get the following:
 ~~~~
  id | raw                          | filtered
 ----|------------------------------|--------------------
- 0  | [I, saw, the, red, baloon]   | [saw, red, baloon]
+ 0  | [I, saw, the, red, balloon]  | [saw, red, balloon]
  1  | [Mary, had, a, little, lamb] | [Mary, little, lamb]
 ~~~~
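These tables come from the `StopWordsRemover` example in `ml-features.md`. The filtering they show can be imitated in a few lines of plain Python; this is a sketch of the behavior, not the Spark ML API, and the tiny stop-word list is chosen only to reproduce the two rows above:

```python
# Reproduce the docs' tables: drop stop words from each tokenized row.
STOP_WORDS = {"i", "the", "a", "had"}  # tiny illustrative list, not Spark's

def remove_stop_words(tokens):
    """Keep only tokens that are not stop words (case-insensitive)."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

rows = [
    (0, ["I", "saw", "the", "red", "balloon"]),
    (1, ["Mary", "had", "a", "little", "lamb"]),
]
filtered = [(rid, remove_stop_words(raw)) for rid, raw in rows]
# filtered matches the second table: [saw, red, balloon] and [Mary, little, lamb]
```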

@@ -1302,15 +1302,15 @@ need to know vector size, can use that column as an input.
 To use `VectorSizeHint` a user must set the `inputCol` and `size` parameters. Applying this
 transformer to a dataframe produces a new dataframe with updated metadata for `inputCol` specifying
 the vector size. Downstream operations on the resulting dataframe can get this size using the
-meatadata.
+metadata.

 `VectorSizeHint` can also take an optional `handleInvalid` parameter which controls its
 behaviour when the vector column contains nulls or vectors of the wrong size. By default
 `handleInvalid` is set to "error", indicating an exception should be thrown. This parameter can
 also be set to "skip", indicating that rows containing invalid values should be filtered out from
 the resulting dataframe, or "optimistic", indicating that the column should not be checked for
 invalid values and all rows should be kept. Note that the use of "optimistic" can cause the
-resulting dataframe to be in an inconsistent state, me:aning the metadata for the column
+resulting dataframe to be in an inconsistent state, meaning the metadata for the column
 `VectorSizeHint` was applied to does not match the contents of that column. Users should take care
 to avoid this kind of inconsistent state.
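The three `handleInvalid` modes the hunk describes ("error", "skip", "optimistic") can be mimicked outside Spark. A plain-Python sketch with illustrative names, not the `VectorSizeHint` implementation:

```python
# Mimic VectorSizeHint's handleInvalid modes for rows whose vector
# is null or has the wrong length.
def apply_size_hint(vectors, size, handle_invalid="error"):
    if handle_invalid == "optimistic":
        # No checking: every row is kept, so the declared size may not
        # match the column contents (the "inconsistent state" above).
        return list(vectors)
    out = []
    for vec in vectors:
        if vec is not None and len(vec) == size:
            out.append(vec)
        elif handle_invalid == "error":
            raise ValueError(f"vector does not match size hint {size}: {vec!r}")
        # handle_invalid == "skip": silently drop the invalid row
    return out
```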

docs/ml-pipeline.md

Lines changed: 1 addition & 1 deletion
@@ -62,7 +62,7 @@ In addition to the types listed in the Spark SQL guide, `DataFrame` can use ML [

 A `DataFrame` can be created either implicitly or explicitly from a regular `RDD`. See the code examples below and the [Spark SQL programming guide](sql-programming-guide.html) for examples.

-Columns in a `DataFrame` are named. The code examples below use names such as "text," "features," and "label."
+Columns in a `DataFrame` are named. The code examples below use names such as "text", "features", and "label".

 ## Pipeline components

docs/mllib-linear-methods.md

Lines changed: 2 additions & 2 deletions
@@ -272,7 +272,7 @@ In `spark.mllib`, the first class $0$ is chosen as the "pivot" class.
 See Section 4.4 of
 [The Elements of Statistical Learning](http://statweb.stanford.edu/~tibs/ElemStatLearn/) for
 references.
-Here is an
+Here is a
 [detailed mathematical derivation](http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297).

 For multiclass classification problems, the algorithm will output a multinomial logistic regression
@@ -350,7 +350,7 @@ known as the [mean squared error](http://en.wikipedia.org/wiki/Mean_squared_erro
 <div class="codetabs">

 <div data-lang="scala" markdown="1">
-The following example demonstrate how to load training data, parse it as an RDD of LabeledPoint.
+The following example demonstrates how to load training data, parse it as an RDD of LabeledPoint.
 The example then uses LinearRegressionWithSGD to build a simple linear model to predict label
 values. We compute the mean squared error at the end to evaluate
 [goodness of fit](http://en.wikipedia.org/wiki/Goodness_of_fit).
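The mean squared error that the second hunk's example computes is simple to state on its own. A self-contained plain-Python sketch, not the MLlib LinearRegressionWithSGD example itself:

```python
# Mean squared error between true labels and model predictions:
# the average of the squared residuals.
def mean_squared_error(labels, predictions):
    assert len(labels) == len(predictions) and labels, "need matched, non-empty inputs"
    return sum((y - p) ** 2 for y, p in zip(labels, predictions)) / len(labels)
```

For example, labels `[1.0, 2.0, 3.0]` against predictions `[1.0, 2.0, 5.0]` give residuals 0, 0, and 2, so the MSE is 4/3.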

docs/security.md

Lines changed: 1 addition & 1 deletion
@@ -337,7 +337,7 @@ Configuration for SSL is organized hierarchically. The user can configure the de
 which will be used for all the supported communication protocols unless they are overwritten by
 protocol-specific settings. This way the user can easily provide the common settings for all the
 protocols without disabling the ability to configure each one individually. The following table
-describes the the SSL configuration namespaces:
+describes the SSL configuration namespaces:

 <table class="table">
 <tr>
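The hierarchy this hunk describes means defaults set under `spark.ssl.*` apply to every protocol unless a namespace overrides them. A hedged `spark-defaults.conf` sketch (property names are assumed from Spark's security docs; paths and values are placeholders):

```
# Defaults shared by all supported protocols
spark.ssl.enabled           true
spark.ssl.keyStore          /path/to/keystore.jks
spark.ssl.keyStorePassword  changeit
# Namespace-specific override (e.g. the web UI); all other
# protocols keep the defaults above.
spark.ssl.ui.port           4443
```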

docs/sparkr.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -296,7 +296,7 @@ head(agg(rollup(df, "cyl", "disp", "gear"), avg(df$mpg)))
296296

297297
### Operating on Columns
298298

299-
SparkR also provides a number of functions that can directly applied to columns for data processing and during aggregation. The example below shows the use of basic arithmetic functions.
299+
SparkR also provides a number of functions that can be directly applied to columns for data processing and during aggregation. The example below shows the use of basic arithmetic functions.
300300

301301
<div data-lang="r" markdown="1">
302302
{% highlight r %}
