To support delta-import, the table needs a timestamp column (such as `last_modified`) so that Solr can detect which rows changed since the last import.
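The role of the timestamp column can be sketched in a few lines of Python. This is only an illustration of the comparison a delta-import relies on, not Solr's own code: the rows, the `find_deltas` helper and the chosen times are hypothetical. In the DataImportHandler the same idea is normally written as a `deltaQuery` that compares the column against `${dataimporter.last_index_time}`.

```python
from datetime import datetime

# Hypothetical rows standing in for the MySQL `user` table.
rows = [
    {"id": 1, "first_name": "Jesus", "last_modified": datetime(2021, 1, 8, 9, 0)},
    {"id": 2, "first_name": "Cheryl", "last_modified": datetime(2021, 1, 9, 12, 30)},
    {"id": 3, "first_name": "Asha", "last_modified": datetime(2021, 1, 10, 8, 15)},
]


def find_deltas(rows, last_index_time):
    """Return the ids of rows modified strictly after the previous import."""
    return [row["id"] for row in rows if row["last_modified"] > last_index_time]


# Assume the previous import finished at this (made-up) time.
last_index_time = datetime(2021, 1, 9, 0, 0)
changed_ids = find_deltas(rows, last_index_time)
print(changed_ids)  # [2, 3]
```

Without such a column there is nothing to compare against, so every import would have to be a full import.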
A Python script generates the sample data using Faker, SQLAlchemy and pandas.
Steps:
1. Create the database in MySQL: `create database solr_test;`.
2. Create the `user` table with the above schema. You can use `sample-data-generator/create-user-table.sql` to do that.
3. In `sample-data-generator/random-data-generator.py`, set the number of users you want to create, then run the script: `python random-data-generator.py`.
4. Check the `solr_test.user` table for records.
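To give a feel for the rows the generator produces, here is a rough standard-library stand-in. It is not the repository's script: the real one uses Faker, SQLAlchemy and pandas and writes to MySQL, whereas this sketch picks values from small made-up lists and keeps the rows in memory. The `id` and `last_modified` columns are assumed from the delta-import requirement and the query output, and `generate_users` is a hypothetical helper.

```python
import random
from datetime import datetime, timedelta

# Small made-up pools; the real script draws values from Faker instead.
FIRST_NAMES = ["Jesus", "Cheryl", "Maria", "Arjun"]
LAST_NAMES = ["Pacheco", "Clayton", "Singh", "Lopez"]
JOBS = ["Neurosurgeon", "Teacher", "Engineer"]
COUNTRIES = ["India", "Brazil", "Canada"]


def generate_users(count, seed=0):
    """Build `count` user rows with the columns the Solr fields expect."""
    rng = random.Random(seed)  # seeded so repeated runs give the same rows
    base = datetime(2021, 1, 1)
    users = []
    for user_id in range(1, count + 1):
        users.append({
            "id": user_id,
            "first_name": rng.choice(FIRST_NAMES),
            "last_name": rng.choice(LAST_NAMES),
            "job": rng.choice(JOBS),
            "country": rng.choice(COUNTRIES),
            # Spread timestamps out so later delta-imports have rows to find.
            "last_modified": base + timedelta(minutes=rng.randrange(0, 10_000)),
        })
    return users


users = generate_users(5)
print(len(users), sorted(users[0]))
```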
- Download the desired version of Solr.
- Unzip it to your desired location.
- Download the MySQL connector and place the `*.jar` in `contrib/dataimporthandler/lib`.
- Start Solr: `./bin/solr start`
- Stop Solr: `./bin/solr stop`
- Restart Solr: `./bin/solr restart`
- Create a new core `user`: `./bin/solr create -c user`
- Start Solr.

- Create a new core `user`. This creates a directory `user` (`server/solr/user`).

- Create `data-config.xml` inside `server/solr/user/conf`. This file holds the database configuration and the entity config.

      <dataConfig>
        <dataSource type="JdbcDataSource"
                    driver="com.mysql.jdbc.Driver"
                    url="jdbc:mysql://localhost:3306/solr_test"
                    user="root"
                    password="root" />

- Edit `server/solr/user/conf/solrconfig.xml` and add the following blocks:

      <lib dir="../../contrib/dataimporthandler/lib" regex=".*\.jar" />
      <lib dir="../../dist/" regex="solr-dataimporthandler-.*\.jar" />

      <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
        <lst name="defaults">
          <str name="config">data-config.xml</str>
        </lst>
      </requestHandler>

- Add the field mapping in `server/solr/user/conf/managed-schema`:

      <field name="first_name" type="text_general" indexed="true" stored="true" />
      <field name="last_name" type="text_general" indexed="true" stored="true" />
      <field name="job" type="text_general" indexed="true" stored="true" />
      <field name="country" type="text_general" indexed="true" stored="true" />

- Now restart the Solr server.

- Make a POST request for delta-import:

      curl 'http://localhost:8983/solr/user/dataimport?_=1610196841059&indent=on&wt=json' \
        -H 'Connection: keep-alive' \
        -H 'Accept: application/json, text/plain, */*' \
        -H 'DNT: 1' \
        -H 'X-Requested-With: XMLHttpRequest' \
        -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36' \
        -H 'Content-type: application/x-www-form-urlencoded' \
        -H 'Origin: http://localhost:8983' \
        -H 'Sec-Fetch-Site: same-origin' \
        -H 'Sec-Fetch-Mode: cors' \
        -H 'Sec-Fetch-Dest: empty' \
        -H 'Referer: http://localhost:8983/solr/' \
        -H 'Accept-Language: en-IN,en-US;q=0.9,en;q=0.8,hi-IN;q=0.7,hi;q=0.6,en-GB;q=0.5' \
        --data-raw 'command=delta-import&verbose=false&clean=false&commit=true&core=user&name=dataimport' \
        --compressed

- Now query to find all Neurosurgeons living in India:

      GET http://localhost:8983/solr/user/select?q=country%3AIndia%20AND%20job%3ANeurosurgeon

  Response:

      {
        "responseHeader":{
          "status":0,
          "QTime":0,
          "params":{
            "q":"country:India AND job:Neurosurgeon"}},
        "response":{"numFound":2,"start":0,"numFoundExact":true,"docs":[
            {
              "country":["India"],
              "last_name":["Pacheco"],
              "id":"86813",
              "job":["Neurosurgeon"],
              "first_name":["Jesus"],
              "_version_":1688414340002086913
            },
            {
              "country":["India"],
              "last_name":["Clayton"],
              "id":"89267",
              "job":["Neurosurgeon"],
              "first_name":["Cheryl"],
              "_version_":1688414345436856320
            }
          ]
        }
      }
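The two HTTP calls above can also be prepared from Python. The sketch below only builds the requests and does not send them; it assumes the core URL from this walkthrough and keeps just the `command`, `clean` and `commit` form fields, on the assumption that the browser-specific headers in the curl command are not required. The helper names (`delta_import_request`, `select_url`, `flatten`) are hypothetical.

```python
from urllib.parse import quote, urlencode

SOLR_CORE_URL = "http://localhost:8983/solr/user"  # core URL used in this walkthrough


def delta_import_request(core_url=SOLR_CORE_URL):
    """Return (url, form_body) for a delta-import POST to the DataImportHandler."""
    body = urlencode({
        "command": "delta-import",
        "clean": "false",   # keep documents that are already indexed
        "commit": "true",   # commit so imported changes become searchable
    })
    return f"{core_url}/dataimport?wt=json", body


def select_url(query, core_url=SOLR_CORE_URL):
    """Return the GET URL for a query, percent-encoding spaces as %20."""
    return f"{core_url}/select?" + urlencode({"q": query}, quote_via=quote)


def flatten(doc):
    """Unwrap single-element lists, as seen in the multi-valued response fields."""
    return {k: v[0] if isinstance(v, list) and len(v) == 1 else v for k, v in doc.items()}


url, body = delta_import_request()
print(url)
print(body)
print(select_url("country:India AND job:Neurosurgeon"))
print(flatten({"country": ["India"], "id": "86813", "job": ["Neurosurgeon"]}))
```

To actually send the import request, `urllib.request.urlopen(url, data=body.encode())` issues a POST with the form body; the query URL can be opened the same way without `data`.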
