Thursday, July 7, 2016

Configure Solr Data Import Handler with Neo4j

Overview

This article explains how Solr can be configured to fetch data from a Neo4j database using the Data Import Handler.

Install the "movie" sample database in Neo4j (3.0.1)

  1. Click the "Favourite" icon (star mark) in the Neo4j browser
  2. Click on "Movie Graph" under "Example Graphs"
  3. Run the automatically generated query ":play movie-graph"
  4. Follow the instructions to create the database

Configuring Solr (6.0.0)

Create a core

You need to create a core before you can index and search.
To create a core, use the following command:
     solr create -c <name>
     e.g. solr create -c Movie -d basic_configs

Note that a new folder is created at /solr-6.0.0/server/solr/Movie, and the new core appears in the Solr admin interface.

Import solr-dataimporthandler jars 

Add the following entry to /solr-6.0.0/server/solr/Movie/conf/solrconfig.xml:

<lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-dataimporthandler-.*\.jar" /> 

Configure dataImportHandler as a requestHandler

Add the following entry to /solr-6.0.0/server/solr/Movie/conf/solrconfig.xml:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
        <str name="config">db-data-config.xml</str>
    </lst>
</requestHandler>

Note: db-data-config.xml holds the Data Import Handler configuration: the data source to connect to and the query that produces the documents.

db-data-config.xml configurations

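Create db-data-config.xml in /solr-6.0.0/server/solr/Movie/conf/ and add the following content:

<dataConfig>
    <!-- Connection to Neo4j through the JDBC driver; replace user and password with your own credentials -->
    <dataSource driver="org.neo4j.jdbc.Driver" url="jdbc:neo4j://localhost:7474" user="neo4j" password="rahal" />
    <document>
        <!-- Each row returned by the Cypher query becomes one Solr document -->
        <entity name="movie"
            query="MATCH (n:Movie) RETURN n.tagline AS tagline, n.title AS title, n.released AS released">
            <!-- Map the column aliases of the query to the Solr field names -->
            <field column="tagline" name="Tagline" />
            <field column="title" name="Title" />
            <field column="released" name="Released" />
        </entity>
    </document>
</dataConfig>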

Modify the managed-schema file

Add the following entries to /solr-6.0.0/server/solr/Movie/conf/managed-schema:

    <uniqueKey>Title</uniqueKey>
    <field name="Tagline" type="string" indexed="true" stored="true"/>
    <field name="Title" type="string" indexed="true" stored="true"/>
    <field name="Released" type="string" indexed="true" stored="true"/>

Remove the following two entries, since Title now serves as the unique key:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />

<uniqueKey>id</uniqueKey>
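
If range queries on the release year or word-level search on the tagline are wanted, the fields could be declared with other types. The fragment below is only a sketch: it assumes that the `int` and `text_general` field types are defined in the managed-schema generated from basic_configs, and that the `released` values arrive from Neo4j as integers. These declarations would replace the string-typed ones, not be added alongside them.

```xml
<!-- Sketch: alternative declarations; use these instead of the string-typed versions -->
<!-- Assumes fieldType "text_general" exists: tokenized, so single words in a tagline can match -->
<field name="Tagline" type="text_general" indexed="true" stored="true"/>
<!-- Title stays a string because it is the uniqueKey -->
<field name="Title" type="string" indexed="true" stored="true"/>
<!-- Assumes fieldType "int" exists: allows range queries such as Released:[1990 TO 1999] -->
<field name="Released" type="int" indexed="true" stored="true"/>
```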

Add Neo4j JDBC and dependencies

Get the Neo4j JDBC driver and its dependencies using the following Maven configuration.

  <dependencies>
    <dependency>
      <groupId>org.neo4j</groupId>
      <artifactId>neo4j-jdbc</artifactId>
      <version>2.0.0-M06</version>
    </dependency>
  </dependencies>

  <repositories>
    <repository>
      <id>neo4j-public</id>
      <url>http://m2.neo4j.org/content/groups/public</url>
    </repository>
  </repositories>

You can find the required dependencies by creating a Maven project with the above configuration, for example in the Eclipse IDE.
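
To collect the JARs without an IDE, the same configuration can be wrapped in a small throwaway project that copies all resolved dependencies into one directory. The pom.xml below is a sketch under assumptions: the project coordinates (`com.example`, `neo4j-jdbc-deps`) and the output directory `jars` are hypothetical, and it relies on the `copy-dependencies` goal of the standard maven-dependency-plugin.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates; any values work for a throwaway project -->
  <groupId>com.example</groupId>
  <artifactId>neo4j-jdbc-deps</artifactId>
  <version>1.0</version>
  <packaging>pom</packaging>

  <dependencies>
    <dependency>
      <groupId>org.neo4j</groupId>
      <artifactId>neo4j-jdbc</artifactId>
      <version>2.0.0-M06</version>
    </dependency>
  </dependencies>

  <repositories>
    <repository>
      <id>neo4j-public</id>
      <url>http://m2.neo4j.org/content/groups/public</url>
    </repository>
  </repositories>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-dependency-plugin</artifactId>
        <executions>
          <execution>
            <!-- Copy the driver and its transitive dependencies during "package" -->
            <id>copy-jars</id>
            <phase>package</phase>
            <goals>
              <goal>copy-dependencies</goal>
            </goals>
            <configuration>
              <!-- Hypothetical output location: target/jars -->
              <outputDirectory>${project.build.directory}/jars</outputDirectory>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
```

Running `mvn package` in the directory that contains this file should place the driver and its dependencies under target/jars.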




The following JARs are required:
  1. httpclient-4.3.2.jar
  2. httpcore-4.3.1.jar
  3. httpmime-4.3.jar
  4. jackson-core-asl-1.9.12.jar
  5. jackson-mapper-asl-1.9.12.jar
  6. neo4j-cypher-dsl-1.9.RC2.jar
  7. neo4j-jdbc-2.0.0-M06.jar
  8. org.restlet-2.2.2.jar
  9. org.restlet.ext.httpclient-2.2.2.jar
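
Solr needs these JARs on the classpath of the Movie core. One way to arrange that, sketched here as an assumption, is to copy them into a directory of your choice and load that directory from solrconfig.xml with a `<lib>` directive, in the same way as the data import handler JARs. The directory name `contrib/neo4j-jdbc/lib` is hypothetical.

```xml
<!-- Sketch: load the Neo4j JDBC driver and its dependencies for the Movie core -->
<!-- "contrib/neo4j-jdbc/lib" is a hypothetical directory; copy the nine JARs there first -->
<lib dir="${solr.install.dir:../../../..}/contrib/neo4j-jdbc/lib" regex=".*\.jar" />
```

Alternatively, JARs placed in the core's own lib directory (here /solr-6.0.0/server/solr/Movie/lib) should be picked up without a `<lib>` entry.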

Importing the data

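Use the Solr admin interface to import the data: select the Movie core, open the Dataimport tab, choose the full-import command, and click Execute.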