Commit 2780d53

feat(data-format): add support for MANUAL data format (in addition to current DUMP format)
closes #5
1 parent 1a301b8 commit 2780d53

File tree

8 files changed

+492
-170
lines changed

README.md

Lines changed: 36 additions & 11 deletions
@@ -14,6 +14,7 @@ The library reads all the metadata it needs from the entity classes (index name,
1414
* Support for **JUnit 4** via `LoadEsDataRule`, `DeleteEsDataRule`
1515
* Support for **JUnit Jupiter** via `@LoadEsDataConfig` / `@LoadEsDataExtension` or `@DeleteEsDataConfig` / `@DeleteEsDataExtension`
1616
* Built-in support for **gzipped data**
17+
* Multiple data formats (dump, manual)
1718
* Written in **Java 8**
1819
* Based on **Spring (Data, Test)**
1920

@@ -26,7 +27,7 @@ The library reads all the metadata it needs from the entity classes (index name,
2627

2728
## Installation & Usage
2829

29-
The library is splitted into 2 independent sub-modules, both are available on [JCenter](https://bintray.com/bintray/jcenter?filterByPkgName=spring-esdata-loader) and [Maven Central](https://search.maven.org/search?q=spring-esdata-loader):
30+
The library is split into 2 independent sub-modules, both are available on [JCenter](https://bintray.com/bintray/jcenter?filterByPkgName=spring-esdata-loader) and [Maven Central](https://search.maven.org/search?q=spring-esdata-loader):
3031

3132
* `spring-esdata-loader-junit4` for testing with **JUnit 4**
3233
* `spring-esdata-loader-junit-jupiter` for testing with **JUnit Jupiter**
@@ -80,25 +81,28 @@ To get started,
8081
* [junit4](/junit4) - if you are using **JUnit 4**
8182
* [junit-jupiter](/junit-jupiter) - if you are using **JUnit Jupiter**
8283

83-
## Data Format
84+
## Supported Data Formats
85+
86+
`spring-esdata-loader` currently supports 2 formats to load data into Elasticsearch: **DUMP** and **MANUAL**.
87+
88+
### Dump data format
8489

85-
The data to be loaded must follow the appropriate format.
8690
Here is an example:
8791
```json
88-
{"_index":"author","_type":"Author","_id":"5","_score":1,"_source":{"id":"5","firstName":"firstName5","lastName":"lastName5"}}
89-
{"_index":"author","_type":"Author","_id":"8","_score":1,"_source":{"id":"8","firstName":"firstName8","lastName":"lastName8"}}
90-
{"_index":"author","_type":"Author","_id":"9","_score":1,"_source":{"id":"9","firstName":"firstName9","lastName":"lastName9"}}
91-
{"_index":"author","_type":"Author","_id":"10","_score":1,"_source":{"id":"10","firstName":"firstName10","lastName":"lastName10"}}
92+
{"_index":"author","_type":"Author","_id":"1","_score":1,"_source":{"id":"1","firstName":"firstName1","lastName":"lastName1"}}
9293
{"_index":"author","_type":"Author","_id":"2","_score":1,"_source":{"id":"2","firstName":"firstName2","lastName":"lastName2"}}
94+
{"_index":"author","_type":"Author","_id":"3","_score":1,"_source":{"id":"3","firstName":"firstName3","lastName":"lastName3"}}
9395
{"_index":"author","_type":"Author","_id":"4","_score":1,"_source":{"id":"4","firstName":"firstName4","lastName":"lastName4"}}
96+
{"_index":"author","_type":"Author","_id":"5","_score":1,"_source":{"id":"5","firstName":"firstName5","lastName":"lastName5"}}
9497
{"_index":"author","_type":"Author","_id":"6","_score":1,"_source":{"id":"6","firstName":"firstName6","lastName":"lastName6"}}
95-
{"_index":"author","_type":"Author","_id":"1","_score":1,"_source":{"id":"1","firstName":"firstName1","lastName":"lastName1"}}
9698
{"_index":"author","_type":"Author","_id":"7","_score":1,"_source":{"id":"7","firstName":"firstName7","lastName":"lastName7"}}
97-
{"_index":"author","_type":"Author","_id":"3","_score":1,"_source":{"id":"3","firstName":"firstName3","lastName":"lastName3"}}
99+
{"_index":"author","_type":"Author","_id":"8","_score":1,"_source":{"id":"8","firstName":"firstName8","lastName":"lastName8"}}
100+
{"_index":"author","_type":"Author","_id":"9","_score":1,"_source":{"id":"9","firstName":"firstName9","lastName":"lastName9"}}
101+
{"_index":"author","_type":"Author","_id":"10","_score":1,"_source":{"id":"10","firstName":"firstName10","lastName":"lastName10"}}
98102

99103
```
100104
You can use a tool like [elasticdump](https://npmjs.com/package/elasticdump) (*requires* [NodeJS](https://nodejs.org/)) to extract existing data
101-
from your Elasticseach server, and them dump them into a JSON file.
105+
from your Elasticsearch server, and then dump it into a JSON file.
102106

103107
```
104108
$ npx elasticdump --input=http://localhost:9200/my_index --output=my_index_data.json
@@ -108,11 +112,32 @@ $ npx elasticdump --input=http://localhost:9200/my_index --output=my_index_data.
108112
109113
> If you change the `--output` part above into `--output=$ | gzip > my_data.json.gz` the data will be automatically gzipped
110114
115+
### Manual data format
116+
117+
In this format, you specify your target data directly (no metadata like `_index`, `_source`, ...), as an Array of JSON objects.
118+
119+
This is more suitable when you create test data from scratch (as opposed to dumping existing data from an ES server) because it is easier to tweak later on to accommodate future modifications in tests. _(Thanks to @DPorcheron for the idea 💡!)_
120+
121+
Here is an example:
122+
```json
123+
[
124+
{"id":"1","firstName":"firstName1","lastName":"lastName1"},
125+
{"id":"2","firstName":"firstName2","lastName":"lastName2"},
126+
{"id":"3","firstName":"firstName3","lastName":"lastName3"},
127+
{"id":"4","firstName":"firstName4","lastName":"lastName4"},
128+
{"id":"5","firstName":"firstName5","lastName":"lastName5"},
129+
{"id":"6","firstName":"firstName6","lastName":"lastName6"},
130+
{"id":"7","firstName":"firstName7","lastName":"lastName7"},
131+
{"id":"8","firstName":"firstName8","lastName":"lastName8"},
132+
{"id":"9","firstName":"firstName9","lastName":"lastName9"},
133+
{"id":"10","firstName":"firstName10","lastName":"lastName10"}
134+
]
135+
```
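The two formats differ in their very first significant character: a MANUAL file is a single JSON array, while a DUMP file is a sequence of JSON objects, one per line. The commit does not show how the library detects the format, so the following is only a minimal sketch of one possible heuristic. The class `DataFormatSniffer`, its nested `Format` enum (a local stand-in for `EsDataFormat`) and the `sniff` method are hypothetical names, not part of the library.

```java
// Hypothetical sketch: guesses the data format from the first non-whitespace character.
class DataFormatSniffer {

    // Local stand-in for the library's EsDataFormat enum, so the sketch compiles on its own.
    enum Format { DUMP, MANUAL, UNKNOWN }

    /**
     * '[' suggests MANUAL (a JSON array of documents),
     * '{' suggests DUMP (one JSON object per line, with _index/_source metadata).
     * Anything else, including empty content, is reported as UNKNOWN.
     */
    static Format sniff(String content) {
        for (int i = 0; i < content.length(); i++) {
            char c = content.charAt(i);
            if (Character.isWhitespace(c)) {
                continue; // skip leading blanks and newlines
            }
            if (c == '[') {
                return Format.MANUAL;
            }
            if (c == '{') {
                return Format.DUMP;
            }
            return Format.UNKNOWN;
        }
        return Format.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(sniff("[{\"id\":\"1\",\"firstName\":\"firstName1\"}]"));
        System.out.println(sniff("{\"_index\":\"author\",\"_source\":{\"id\":\"1\"}}"));
    }
}
```

A real implementation would read only the first few bytes of the (possibly gzipped) file rather than the whole content, and might validate the structure further.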
111136
## Contributing
112137

113138
Contributions are always welcome! Just fork the project, work on your feature/bug fix, and submit it.
114139
You can also contribute by creating issues. Please read the [contribution guidelines](.github/CONTRIBUTING.md) for more information.
115140

116141
## License
117142

118-
Copyright (c) 2019 Tine Kondo. Licensed under the MIT License (MIT)
143+
Copyright (c) 2019 Tine Kondo. Licensed under the MIT License (MIT)
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
1+
package com.github.spring.esdata.loader.core;
2+
3+
/**
4+
* Enum defining the data formats supported by the tool to load data into Elasticsearch.
5+
*/
6+
public enum EsDataFormat {
7+
/**
8+
* Format of data as dumped from an existing Elasticsearch server (e.g. using tools like `elasticdump`).
9+
*/
10+
DUMP,
11+
/**
12+
* Format of data as manually created by the user. Must be an array of JSON objects, each representing the actual content to put into the related ES index.
13+
*/
14+
MANUAL,
15+
16+
UNKNOWN
17+
}

core/src/main/java/com/github/spring/esdata/loader/core/IndexData.java

Lines changed: 91 additions & 82 deletions
@@ -6,102 +6,111 @@
66
* Data is basically represented by:
77
* <ul>
88
* <li><code>esEntityClass</code>: mapping class, used to create the corresponding index and mapping</li>
9-
* <li><code>location</code>: path to the JSON file that contains the actual data to import</li>
9+
* <li><code>location</code>: path to the JSON file that contains the actual data to import (can be gzipped)</li>
1010
* <li><code>nbMaxItems</code> (<i>optional</i>): how many max items to load (<code>all</code> <i>by default</i> )</li>
1111
* <li><code>nbSkipItems</code> (<i>optional</i>): how many items to skip (<code>0</code> <i>by default</i> )</li>
12+
* <li><code>format</code> (<i>optional</i>): format of the data to import (<code>null</code> <i>by default</i>, will be detected from JSON file content )</li>
1213
* </ul>
1314
*
1415
* @author tinesoft
15-
*
1616
*/
1717
public class IndexData {
1818

19-
final Class<?> esEntityClass;
20-
final String location;
21-
final boolean gzipped;
22-
final Long nbMaxItems;
23-
final Long nbSkipItems;
19+
final Class<?> esEntityClass;
20+
final String location;
21+
final boolean gzipped;
22+
final Long nbMaxItems;
23+
final Long nbSkipItems;
24+
final EsDataFormat format;
25+
26+
/**
27+
* @param esEntityClass mapping class of the data to be indexed in ES
28+
* @param location path to the file that contains data (as JSON) to be indexed
29+
* @param gzipped whether or not the data is gzipped (the {@code of(...)} factories derive it from a {@code .gz} file extension)
30+
* @param nbMaxItems maximum number of items to load
31+
* @param nbSkipItems number of items to skip
32+
* @param format format of the data to load
33+
*/
34+
public IndexData(final Class<?> esEntityClass, final String location, final boolean gzipped, final Long nbMaxItems,
35+
final Long nbSkipItems, final EsDataFormat format) {
36+
this.esEntityClass = esEntityClass;
37+
this.location = location;
38+
this.gzipped = gzipped;
39+
this.nbMaxItems = nbMaxItems;
40+
this.nbSkipItems = nbSkipItems;
41+
this.format = format;
42+
}
43+
44+
/**
45+
* Builds a new {@link IndexData} using provided parameters.
46+
*
47+
* @param esEntityClass mapping class of the data to be indexed in ES
48+
* @param location path to the file that contains data (as JSON) to be indexed
49+
* @return a new {@link IndexData}
50+
*/
51+
public static IndexData of(final Class<?> esEntityClass, final String location) {
52+
return of(esEntityClass, location, Long.MAX_VALUE, 0L, null);
53+
}
2454

25-
/**
26-
*
27-
* @param esEntityClass
28-
* mapping class of the data to be indexed in ES
29-
* @param location
30-
* path to the file that contains data (as JSON) to be indexed
31-
* @param gzipped
32-
* whether or not the data is gzipped (true by default)
33-
* @param nbMaxItems
34-
* maximum number of items to load
35-
* @param nbSkipItems
36-
* number of items to skip
37-
*/
38-
public IndexData(final Class<?> esEntityClass, final String location, final boolean gzipped, final Long nbMaxItems,
39-
final Long nbSkipItems) {
40-
this.esEntityClass = esEntityClass;
41-
this.location = location;
42-
this.gzipped = gzipped;
43-
this.nbMaxItems = nbMaxItems;
44-
this.nbSkipItems = nbSkipItems;
45-
}
55+
/**
56+
* Builds a new {@link IndexData} using provided parameters.
57+
*
58+
* @param esEntityClass mapping class of the data to be indexed in ES
59+
* @param location path to the file that contains data (as JSON) to be indexed
60+
* @param nbMaxItems maximum number of items to load
61+
* @return a new {@link IndexData}
62+
*/
63+
public static IndexData of(final Class<?> esEntityClass, final String location, final Long nbMaxItems) {
64+
return of(esEntityClass, location, nbMaxItems, 0L, null);
65+
}
4666

47-
/**
48-
* Builds a new {@link IndexData} using provided parameters.
49-
*
50-
* @param esEntityClass mapping class of the data to be indexed in ES
51-
* @param location path to the file that contains data (as JSON) to be indexed
52-
* @return a new {@link IndexData}
53-
*/
54-
public static IndexData of(final Class<?> esEntityClass, final String location) {
55-
return of(esEntityClass, location, Long.MAX_VALUE, 0L);
56-
}
67+
/**
68+
* Builds a new {@link IndexData} using provided parameters.
69+
*
70+
* @param esEntityClass mapping class of the data to be indexed in ES
71+
* @param location path to the file that contains data (as JSON) to be indexed
72+
* @param nbMaxItems maximum number of items to load
73+
* @param nbSkipItems number of items to skip
74+
* @param format format of the data to load
75+
* @return a new {@link IndexData}
76+
*/
77+
public static IndexData of(final Class<?> esEntityClass, final String location, final Long nbMaxItems,
78+
final Long nbSkipItems, final EsDataFormat format) {
79+
boolean gzipped = location.toLowerCase().endsWith(".gz");
80+
return new IndexData(esEntityClass, location, gzipped, nbMaxItems, nbSkipItems, format);
81+
}
5782

58-
/**
59-
* Builds a new {@link IndexData} using provided parameters.
60-
*
61-
* @param esEntityClass mapping class of the data to be indexed in ES
62-
* @param location path to the file that contains data (as JSON) to be indexed
63-
* @param nbMaxItems maximum number of items to load
64-
* @return a new {@link IndexData}
65-
*/
66-
public static IndexData of(final Class<?> esEntityClass, final String location, final Long nbMaxItems) {
67-
return of(esEntityClass, location, nbMaxItems, 0L);
68-
}
83+
/**
84+
* Builds a new {@link IndexData} using provided parameter.
85+
*
86+
* @param a {@link LoadEsData} to construct the data from
87+
* @return a new {@link IndexData}
88+
*/
89+
public static IndexData of(final LoadEsData a) {
90+
return of(a.esEntityClass(), a.location(), a.nbMaxItems(), a.nbSkipItems(), a.format());
91+
}
6992

70-
/**
71-
* Builds a new {@link IndexData} using provided parameters.
72-
*
73-
* @param esEntityClass mapping class of the data to be indexed in ES
74-
* @param location path to the file that contains data (as JSON) to be indexed
75-
* @param nbMaxItems maximum number of items to load
76-
* @param nbSkipItems number of items to skip
77-
* @return a new {@link IndexData}
78-
*/
79-
public static IndexData of(final Class<?> esEntityClass, final String location, final Long nbMaxItems,
80-
final Long nbSkipItems) {
81-
boolean gzipped = location.toLowerCase().endsWith(".gz");
82-
return new IndexData(esEntityClass, location, gzipped, nbMaxItems, nbSkipItems);
83-
}
93+
public Class<?> getEsEntityClass() {
94+
return this.esEntityClass;
95+
}
8496

85-
/**
86-
* Builds a new {@link IndexData} using provided parameter.
87-
*
88-
* @param a {@link LoadEsData} to construct the data from
89-
* @return a new {@link IndexData}
90-
*/
91-
public static IndexData of(final LoadEsData a) {
92-
return of(a.esEntityClass(), a.location(), a.nbMaxItems(), a.nbSkipItems());
93-
}
97+
public String getLocation() {
98+
return this.location;
99+
}
94100

95-
public Class<?> getEsEntityClass() {
96-
return this.esEntityClass;
97-
}
101+
public boolean isGzipped() {
102+
return this.gzipped;
103+
}
98104

99-
public String getLocation() {
100-
return this.location;
101-
}
105+
public Long getNbMaxItems() {
106+
return this.nbMaxItems;
107+
}
102108

103-
public boolean isGzipped() {
104-
return this.gzipped;
105-
}
109+
public Long getNbSkipItems() {
110+
return this.nbSkipItems;
111+
}
106112

107-
}
113+
public EsDataFormat getFormat() {
114+
return this.format;
115+
}
116+
}
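The `of(...)` factory above derives `gzipped` from a case-insensitive `.gz` suffix on `location`. How the loader then opens the file is not part of this diff; the sketch below only illustrates one way such a flag could be consumed with the standard `java.util.zip` classes. `GzipAwareReader` and its `open` helper are hypothetical names introduced for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch: reads JSON data that may or may not be gzipped.
class GzipAwareReader {

    // Same check as IndexData.of(...) in the diff: a case-insensitive ".gz" suffix.
    static boolean isGzipped(String location) {
        return location.toLowerCase().endsWith(".gz");
    }

    // Wraps the raw stream in a GZIPInputStream only when the data is gzipped.
    static BufferedReader open(InputStream raw, boolean gzipped) throws IOException {
        InputStream in = gzipped ? new GZIPInputStream(raw) : raw;
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        String location = "authors.json.GZ"; // assumed file name, for illustration only

        // Build a small gzipped payload in memory to stand in for the file on disk.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write("[{\"id\":\"1\"}]".getBytes(StandardCharsets.UTF_8));
        }

        try (BufferedReader reader =
                 open(new ByteArrayInputStream(bytes.toByteArray()), isGzipped(location))) {
            System.out.println(reader.readLine());
        }
    }
}
```

Relying on the extension keeps the factory cheap, at the cost of misclassifying gzipped files that do not end in `.gz`; checking the gzip magic bytes would be an alternative.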

core/src/main/java/com/github/spring/esdata/loader/core/LoadEsData.java

Lines changed: 8 additions & 5 deletions
@@ -1,10 +1,6 @@
11
package com.github.spring.esdata.loader.core;
22

3-
import java.lang.annotation.ElementType;
4-
import java.lang.annotation.Repeatable;
5-
import java.lang.annotation.Retention;
6-
import java.lang.annotation.RetentionPolicy;
7-
import java.lang.annotation.Target;
3+
import java.lang.annotation.*;
84

95
/**
106
* {@code @LoadEsData} is a {@linkplain Repeatable repeatable} annotation
@@ -46,4 +42,11 @@
4642
*/
4743
long nbSkipItems() default 0;
4844

45+
/**
46+
* Format of the data to load. If unspecified, the library will try to guess the right format from the content of the file at {@link #location()}.
47+
*
48+
* @return format of the data to load
49+
*/
50+
EsDataFormat format() default EsDataFormat.UNKNOWN;
51+
4952
}
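Note that "unspecified" takes two forms in this commit: the annotation defaults to `EsDataFormat.UNKNOWN`, while the `IndexData.of(esEntityClass, location)` overloads pass `null`. Code that consumes `getFormat()` presumably has to treat both as a request for detection. The sketch below shows that resolution step under this assumption; `FormatResolution`, `needsDetection` and `resolve` are hypothetical names, and the nested `Format` enum is a local stand-in for `EsDataFormat`.

```java
// Hypothetical sketch: treats both null and UNKNOWN as "format not specified".
class FormatResolution {

    // Local stand-in for the library's EsDataFormat enum.
    enum Format { DUMP, MANUAL, UNKNOWN }

    // True when the caller did not pin a format, whichever way that was expressed.
    static boolean needsDetection(Format declared) {
        return declared == null || declared == Format.UNKNOWN;
    }

    // Keeps an explicitly declared format, otherwise falls back to the detected one.
    static Format resolve(Format declared, Format detected) {
        return needsDetection(declared) ? detected : declared;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, Format.MANUAL));
        System.out.println(resolve(Format.UNKNOWN, Format.DUMP));
        System.out.println(resolve(Format.DUMP, Format.MANUAL));
    }
}
```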
