Changed file: docs/src/reference/asciidoc/core/spark.adoc (6 additions, 6 deletions)
@@ -860,13 +860,13 @@ jssc.start() <4>
 <4> launch stream job

 [float]
-[[spark-write-dyn]]
+[[spark-streaming-write-dyn]]
 ==== Writing to dynamic/multi-resources

 For cases when the data being written to {es} needs to be indexed under different buckets (based on the data content) one can use the `es.resource.write` field which accepts a pattern that is resolved from the document content, at runtime. Following the aforementioned <<cfg-multi-writes,media example>>, one could configure it as follows:

 [float]
-[[spark-write-dyn-scala]]
+[[spark-streaming-write-dyn-scala]]
 ===== Scala

 [source,scala]
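For readers following along, here is a minimal, self-contained sketch of what such a dynamic-resource write can look like from Spark Streaming in Scala. It is not part of the patch: the sample documents, the queue-backed stream, and the `my-collection-{media_type}/doc` pattern are illustrative assumptions; only the `{field}` resolution and the streaming import come from the documentation above.

[source,scala]
----
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.elasticsearch.spark.streaming._ // implicit saveToEs on DStreams

import scala.collection.mutable

object DynamicResourceSketch extends App {
  val conf = new SparkConf().setAppName("es-dynamic-write").setMaster("local[2]")
  val sc   = new SparkContext(conf)
  val ssc  = new StreamingContext(sc, Seconds(1))

  // Illustrative documents; the media_type field drives the target resource
  val game = Map("media_type" -> "game",  "title" -> "FF VI")
  val book = Map("media_type" -> "book",  "title" -> "Harry Potter")
  val cd   = Map("media_type" -> "music", "title" -> "Surfing With The Alien")

  // A queue-backed stream keeps the sketch runnable without an external source
  val microbatches = mutable.Queue(sc.makeRDD(Seq(game, book, cd)))
  val dstream = ssc.queueStream(microbatches)

  // {media_type} is resolved against each document at write time,
  // so the three documents above end up in three different resources
  dstream.saveToEs("my-collection-{media_type}/doc")

  ssc.start()
  ssc.awaitTerminationOrTimeout(10000)
}
----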
@@ -887,7 +887,7 @@ ssc.start()
 For each document/object about to be written, {eh} will extract the +media_type+ field and use its value to determine the target resource.

 [float]
-[[spark-write-dyn-java]]
+[[spark-streaming-write-dyn-java]]
 ===== Java

 As expected, things in Java are strikingly similar:
@@ -912,7 +912,7 @@ jssc.start();
 <1> Save each object based on its resource pattern, +media_type+ in this example

 [float]
-[[spark-write-meta]]
+[[spark-streaming-write-meta]]
 ==== Handling document metadata

 {es} allows each document to have its own http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_document_metadata.html[metadata]. As explained above, through the various <<cfg-mapping, mapping>> options one can customize these parameters so that their values are extracted from their belonging document. Furthermore, one can even include/exclude what parts of the data are sent back to {es}. In Spark, {eh} extends this functionality allowing metadata to be supplied _outside_ the document itself through the use of http://spark.apache.org/docs/latest/programming-guide.html#working-with-key-value-pairs[_pair_ ++RDD++s].
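To make the pair-RDD idea concrete, a small sketch of supplying per-document metadata through the key of a pair RDD. The +airports/2015+ resource and the sample records are assumptions made for the example; the +saveToEsWithMeta+ method and the +Metadata+ keys are the {eh} facilities described above.

[source,scala]
----
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._              // implicit saveToEsWithMeta on pair RDDs
import org.elasticsearch.spark.rdd.Metadata._ // ID, VERSION, ...

object MetadataRddSketch extends App {
  val sc = new SparkContext(
    new SparkConf().setAppName("es-meta-write").setMaster("local[2]"))

  val otp = Map("iata" -> "OTP", "name" -> "Otopeni")
  val muc = Map("iata" -> "MUC", "name" -> "Munich")

  // The metadata travels as the pair's key, outside the document itself
  val otpMeta = Map(ID -> 1)
  val mucMeta = Map(ID -> 2)

  sc.makeRDD(Seq((otpMeta, otp), (mucMeta, muc)))
    .saveToEsWithMeta("airports/2015") // illustrative resource name
}
----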
@@ -924,7 +924,7 @@ Thus a +DStream+'s keys can be a +Map+ containing the +Metadata+ for each docume
 This sounds more complicated than it is, so let us see some examples.

 [float]
-[[spark-write-meta-scala]]
+[[spark-streaming-write-meta-scala]]
 ===== Scala

 Pair ++DStream++s, or simply put ++DStream++s with the signature +DStream[(K,V)]+, can take advantage of the +saveToEsWithMeta+ methods that are available either through the _implicit_ import of the +org.elasticsearch.spark.streaming+ package or the +EsSparkStreaming+ object.
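The same idea carries over to streaming: below is a hedged sketch of a pair +DStream+ written with +saveToEsWithMeta+ via the implicit import. The queue-backed stream, the documents, and the +airports/2015+ resource are assumptions reused from the RDD sketch above.

[source,scala]
----
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.elasticsearch.spark.rdd.Metadata._  // ID, ...
import org.elasticsearch.spark.streaming._     // implicit saveToEsWithMeta on pair DStreams

import scala.collection.mutable

object MetadataDStreamSketch extends App {
  val sc  = new SparkContext(
    new SparkConf().setAppName("es-meta-stream").setMaster("local[2]"))
  val ssc = new StreamingContext(sc, Seconds(1))

  val otp = Map("iata" -> "OTP", "name" -> "Otopeni")
  val muc = Map("iata" -> "MUC", "name" -> "Munich")

  // Keys carry the per-document metadata, values carry the documents
  val pairs = Seq((Map(ID -> 1), otp), (Map(ID -> 2), muc))
  val microbatches = mutable.Queue(sc.makeRDD(pairs))

  val dstream = ssc.queueStream(microbatches)   // DStream[(K, V)]
  dstream.saveToEsWithMeta("airports/2015")     // illustrative resource name

  ssc.start()
  ssc.awaitTerminationOrTimeout(10000)
}
----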
@@ -990,7 +990,7 @@ ssc.start()
 <7> The +DStream+ is configured to index the data accordingly using the +saveToEsWithMeta+ method

 [float]
-[[spark-write-meta-java]]
+[[spark-streaming-write-meta-java]]
 ===== Java

 In a similar fashion, on the Java side, +JavaEsSparkStreaming+ provides +saveToEsWithMeta+ methods that are applied to +JavaPairDStream+ (the equivalent in Java of +DStream[(K,V)]+).