Hot questions about using Neo4j nodes

Question:

Let's say I have a property "name" on nodes in neo4j. Now I want to enforce that there is at most one node for a given name by merging all nodes that share a name. More precisely: if there are three nodes whose name is "dog", I want them to be replaced by just one node with name "dog", which:

  1. Gathers all properties from all the original three nodes.
  2. Has all arcs that were attached to the original three nodes.

The background for this is the following: in my graph, there are often several nodes of the same name which should be considered "equal" (although some have richer property information than others). Putting a.name = b.name in a WHERE clause is extremely slow.

EDIT: I forgot to mention that my Neo4j is of version 2.3.7 currently (I cannot update it).

SECOND EDIT: There is a known list of labels for the nodes and for the possible arcs. The type of the nodes is known.

THIRD EDIT: I want to call the above "node collapse" procedure from Java, so a mixture of Cypher queries and procedural code would also be a useful solution.


Answer:

I think you need something like synonym nodes.

1) Go through all nodes and create a node synonym:

MATCH (N)
WITH N
  MERGE (S:Synonym {name: N.name})
  MERGE (S)<-[:hasSynonym]-(N)
RETURN count(S);

2) Remove the synonyms with only one node:

MATCH (S:Synonym)
MATCH (S)<-[:hasSynonym]-(N)
WITH S, count(N) AS cnt
WHERE cnt = 1
DETACH DELETE S;

3) Transport properties and relationships for the remaining synonyms (with apoc):

MATCH (S:Synonym)
MATCH (S)<-[:hasSynonym]-(N)
WITH [S] + collect(N) AS nodesForMerge
CALL apoc.refactor.mergeNodes(nodesForMerge) YIELD node
RETURN count(node);

4) Remove Synonym label:

MATCH (S:Synonym)
CALL apoc.create.removeLabels([S], ['Synonym']) YIELD node
RETURN count(node);
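
Regarding the THIRD EDIT (driving the collapse from Java): point 1 of the question, gathering all properties, boils down to a property-map merge. Below is a minimal sketch on plain Maps, independent of how you read the maps out of Neo4j; the "first node wins" conflict rule is an assumption, so order the richer nodes first.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class Main {
    // Merge the property maps of duplicate nodes into one map.
    // On conflicting keys the first node listed wins, so callers should
    // order the nodes from richest to poorest property information.
    static Map<String, Object> mergeProperties(List<Map<String, Object>> nodes) {
        Map<String, Object> merged = new LinkedHashMap<>();
        for (Map<String, Object> props : nodes) {
            for (Map.Entry<String, Object> e : props.entrySet()) {
                merged.putIfAbsent(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> merged = mergeProperties(List.of(
                Map.of("name", "dog", "legs", 4),
                Map.of("name", "dog", "sound", "woof"),
                Map.of("name", "dog")));
        System.out.println(merged.size()); // 3 distinct keys: name, legs, sound
    }
}
```

The relationship side (point 2) would analogously re-point every arc of the duplicates to the surviving node before deleting them, which is what apoc.refactor.mergeNodes does for you in one call.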

Question:

I am trying to insert about 2 million nodes into Neo4j and I'm having trouble with performance.

I am using neo4j enterprise 2.2.0 with a server extension written in java. My computer has an ssd, 32gb ram, Intel Core i7 cpu and is running Windows 8. I run a standalone version of the server and start it by running Neo4j.bat in the bin-folder.

It takes about 25 seconds to insert 10,000 nodes with no relationships right now (I will need to add relationships later, but one problem at a time).

I think this is a matter of configuration, so I played around with the settings a bit, but saw no change in performance. What I find weird is that even if I set the initmemory and maxmemory settings to 15000 in neo4j-wrapper.conf, the java process only allocates 3gb maximum.

I attached my code and configurations below, does anyone have a clue what I am doing wrong? What performance should I expect when inserting a large graph?

Code for inserting
for (Thing t : things) {
    List<ValuePair> properties = parseThing(t);
    String uid = createUid(t);

    try (Transaction tx = graphDb.beginTx()) {

        Node node = graphDb.createNode();
        node.setProperty("uid", uid);

        for (ValuePair vp : properties) {
            node.setProperty(vp.getName(), vp.getValue());
        }

        tx.success();
    }
}

(First I was adding a DynamicLabel when creating the nodes, but it was even slower. Is it possible to use labels and still get good performance when inserting nodes?)

Configurations

neo4j.properties

################################################################
# Neo4j
#
# neo4j.properties - database tuning parameters
#
################################################################

# Enable this to be able to upgrade a store from an older version.
#allow_store_upgrade=true

# The amount of memory to use for mapping the store files, in bytes (or
# kilobytes with the 'k' suffix, megabytes with 'm' and gigabytes with 'g').
# If Neo4j is running on a dedicated server, then it is generally recommended
# to leave about 2-4 gigabytes for the operating system, give the JVM enough
# heap to hold all your transaction state and query context, and then leave the
# rest for the page cache.
# The default page cache memory assumes the machine is dedicated to running
# Neo4j, and is heuristically set to 75% of RAM minus the max Java heap size.
dbms.pagecache.memory=4g

# Enable this to specify a parser other than the default one.
#cypher_parser_version=2.0

# Keep logical logs, helps debugging but uses more disk space, enabled for
# legacy reasons To limit space needed to store historical logs use values such
# as: "7 days" or "100M size" instead of "true".
#keep_logical_logs=7 days

# Autoindexing

# Enable auto-indexing for nodes, default is false.
#node_auto_indexing=true

# The node property keys to be auto-indexed, if enabled.
#node_keys_indexable=name,age

# Enable auto-indexing for relationships, default is false.
#relationship_auto_indexing=true

# The relationship property keys to be auto-indexed, if enabled.
#relationship_keys_indexable=name,age

# Enable shell server so that remote clients can connect via Neo4j shell.
#remote_shell_enabled=true
# The network interface IP the shell will listen on (use 0.0.0 for all interfaces).
#remote_shell_host=127.0.0.1
# The port the shell will listen on, default is 1337.
#remote_shell_port=1337

# The type of cache to use for nodes and relationships.
cache_type=hpc

cache.memory_ratio=70

# Maximum size of the heap memory to dedicate to the cached nodes.
node_cache_size=2g
#relationship_cache_size=6g

# Maximum size of the heap memory to dedicate to the cached relationships.
#relationship_cache_size=

# Enable online backups to be taken from this database.
online_backup_enabled=true

# Port to listen to for incoming backup requests.
online_backup_server=127.0.0.1:6362


# Uncomment and specify these lines for running Neo4j in High Availability mode.
# See the High availability setup tutorial for more details on these settings
# http://neo4j.com/docs/2.2.0/ha-setup-tutorial.html

# ha.server_id is the number of each instance in the HA cluster. It should be
# an integer (e.g. 1), and should be unique for each cluster instance.
#ha.server_id=

# ha.initial_hosts is a comma-separated list (without spaces) of the host:port
# where the ha.cluster_server of all instances will be listening. Typically
# this will be the same for all cluster instances.
#ha.initial_hosts=192.168.0.1:5001,192.168.0.2:5001,192.168.0.3:5001

# IP and port for this instance to listen on, for communicating cluster status
# information with other instances (also see ha.initial_hosts). The IP
# must be the configured IP address for one of the local interfaces.
#ha.cluster_server=192.168.0.1:5001

# IP and port for this instance to listen on, for communicating transaction
# data with other instances (also see ha.initial_hosts). The IP
# must be the configured IP address for one of the local interfaces.
#ha.server=192.168.0.1:6001

# The interval at which slaves will pull updates from the master. Comment out
# the option to disable periodic pulling of updates. Unit is seconds.
ha.pull_interval=10

# Amount of slaves the master will try to push a transaction to upon commit
# (default is 1). The master will optimistically continue and not fail the
# transaction even if it fails to reach the push factor. Setting this to 0 will
# increase write performance when writing through master but could potentially
# lead to branched data (or loss of transaction) if the master goes down.
#ha.tx_push_factor=1

# Strategy the master will use when pushing data to slaves (if the push factor
# is greater than 0). There are two options available "fixed" (default) or
# "round_robin". Fixed will start by pushing to slaves ordered by server id
# (highest first) improving performance since the slaves only have to cache up
# one transaction at a time.
#ha.tx_push_strategy=fixed

# Policy for how to handle branched data.
#branched_data_policy=keep_all

# Clustering timeouts
# Default timeout.
#ha.default_timeout=5s

# How often heartbeat messages should be sent. Defaults to ha.default_timeout.
#ha.heartbeat_interval=5s

# Timeout for heartbeats between cluster members. Should be at least twice that of ha.heartbeat_interval.
#heartbeat_timeout=11s

neo4j-server.properties

################################################################
# Neo4j
#
# neo4j-server.properties - runtime operational settings
#
################################################################

#***************************************************************
# Server configuration
#***************************************************************

# location of the database directory
org.neo4j.server.database.location=data/graph.db

# Low-level graph engine tuning file
org.neo4j.server.db.tuning.properties=conf/neo4j.properties

# Database mode
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the neo4j.properties config file, then uncomment this line:
#org.neo4j.server.database.mode=HA

# Let the webserver only listen on the specified IP. Default is localhost (only
# accept local connections). Uncomment to allow any connection. Please see the
# security section in the neo4j manual before modifying this.
#org.neo4j.server.webserver.address=0.0.0.0

# Require (or disable the requirement of) auth to access Neo4j
dbms.security.auth_enabled=true

#
# HTTP Connector
#

# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7474

#
# HTTPS Connector
#

# Turn https-support on/off
org.neo4j.server.webserver.https.enabled=true

# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7473

# Certificate location (auto generated if the file does not exist)
org.neo4j.server.webserver.https.cert.location=conf/ssl/snakeoil.cert

# Private key location (auto generated if the file does not exist)
org.neo4j.server.webserver.https.key.location=conf/ssl/snakeoil.key

# Internally generated keystore (don't try to put your own
# keystore there, it will get deleted when the server starts)
org.neo4j.server.webserver.https.keystore.location=data/keystore

# Comma separated list of JAX-RS packages containing JAX-RS resources, one
# package name for each mountpoint. The listed package names will be loaded
# under the mountpoints specified. Uncomment this line to mount the
# org.neo4j.examples.server.unmanaged.HelloWorldResource.java from
# neo4j-server-examples under /examples/unmanaged, resulting in a final URL of
# http://localhost:7474/examples/unmanaged/helloworld/{nodeId}
#org.neo4j.server.thirdparty_jaxrs_classes=org.neo4j.examples.server.unmanaged=/examples/unmanaged

org.neo4j.server.thirdparty_jaxrs_classes=my.project.package=/mypath

#*****************************************************************
# HTTP logging configuration
#*****************************************************************

# HTTP logging is disabled. HTTP logging can be enabled by setting this
# property to 'true'.
org.neo4j.server.http.log.enabled=false

# Logging policy file that governs how HTTP log output is presented and
# archived. Note: changing the rollover and retention policy is sensible, but
# changing the output format is less so, since it is configured to use the
# ubiquitous common log format
org.neo4j.server.http.log.config=conf/neo4j-http-logging.xml

#*****************************************************************
# Administration client configuration
#*****************************************************************

# location of the servers round-robin database directory. possible values:
# - absolute path like /var/rrd
# - path relative to the server working directory like data/rrd
# - commented out, will default to the database data directory.
org.neo4j.server.webadmin.rrdb.location=data/rrd

neo4j-wrapper.conf

#********************************************************************
# Property file references
#********************************************************************

wrapper.java.additional=-Dorg.neo4j.server.properties=conf/neo4j-server.properties
wrapper.java.additional=-Djava.util.logging.config.file=conf/logging.properties
wrapper.java.additional=-Dlog4j.configuration=file:conf/log4j.properties

#********************************************************************
# JVM Parameters
#********************************************************************

wrapper.java.additional.1=-XX:+UseConcMarkSweepGC
wrapper.java.additional.2=-XX:+CMSClassUnloadingEnabled
wrapper.java.additional.3=-XX:-OmitStackTraceInFastThrow
wrapper.java.additional.4=-XX:hashCode=5

# Remote JMX monitoring, uncomment and adjust the following lines as needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/7/docs/technotes/guides/management/agent.html
# On Unix based systems the jmx.password file needs to be owned by the user that will run the server,
# and have permissions set to 0600.
# For details on setting these file permissions on Windows see:
#     http://docs.oracle.com/javase/7/docs/technotes/guides/management/security-windows.html
#wrapper.java.additional=-Dcom.sun.management.jmxremote.port=3637
#wrapper.java.additional=-Dcom.sun.management.jmxremote.authenticate=true
#wrapper.java.additional=-Dcom.sun.management.jmxremote.ssl=false
#wrapper.java.additional=-Dcom.sun.management.jmxremote.password.file=conf/jmx.password
#wrapper.java.additional=-Dcom.sun.management.jmxremote.access.file=conf/jmx.access

# Some systems cannot discover host name automatically, and need this line configured:
#wrapper.java.additional=-Djava.rmi.server.hostname=$THE_NEO4J_SERVER_HOSTNAME

# Uncomment the following lines to enable garbage collection logging
#wrapper.java.additional=-Xloggc:data/log/neo4j-gc.log
#wrapper.java.additional=-XX:+PrintGCDetails
#wrapper.java.additional=-XX:+PrintGCDateStamps
#wrapper.java.additional=-XX:+PrintGCApplicationStoppedTime
#wrapper.java.additional=-XX:+PrintPromotionFailure
#wrapper.java.additional=-XX:+PrintTenuringDistribution

# Java Heap Size: by default the Java heap size is dynamically
# calculated based on available system resources.
# Uncomment these lines to set specific initial and maximum
# heap size in MB.
wrapper.java.initmemory=15000
wrapper.java.maxmemory=15000

#********************************************************************
# Wrapper settings
#********************************************************************
# path is relative to the bin dir
wrapper.pidfile=../data/neo4j-server.pid

#********************************************************************
# Wrapper Windows NT/2000/XP Service Properties
#********************************************************************
# WARNING - Do not modify any of these properties when an application
#  using this configuration file has been installed as a service.
#  Please uninstall the service before modifying this section.  The
#  service can then be reinstalled.

# Name of the service
wrapper.name=neo4j

# User account to be used for linux installs. Will default to current
# user if not set.
wrapper.user=

#********************************************************************
# Other Neo4j system properties
#********************************************************************
wrapper.java.additional=-Dneo4j.ext.udc.source=zip

wrapper.java.additional=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005 -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005

You will make me very happy if you help me solve this!


Answer:

You need to create more than one node per transaction; otherwise the transaction overhead consumes most of the time.

Please try it this way:

try (Transaction tx = graphDb.beginTx()) {

    for (Thing t : things) {

        List<ValuePair> properties = parseThing(t);
        String uid = createUid(t);

        Node node = graphDb.createNode();
        node.setProperty("uid", uid);

        for (ValuePair vp : properties) {
            node.setProperty(vp.getName(), vp.getValue());
        }
    }

    tx.success();
}
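
For 2 million nodes, a single transaction at the other extreme can pressure the heap; a common middle ground is committing in batches (say, 10,000 to 50,000 nodes per transaction). As a standalone sketch of just the batching arithmetic (no Neo4j API here, and the batch size is an assumed value):

```java
public class Main {
    static final int BATCH_SIZE = 10_000;

    // Count how many transactions a batched insert loop would open:
    // a new transaction starts whenever the previous batch was committed.
    static int countTransactions(int totalItems) {
        int txCount = 0;
        int inCurrentTx = 0;
        for (int i = 0; i < totalItems; i++) {
            if (inCurrentTx == 0) {
                txCount++;       // graphDb.beginTx() would go here
            }
            inCurrentTx++;       // createNode()/setProperty() would go here
            if (inCurrentTx == BATCH_SIZE) {
                inCurrentTx = 0; // tx.success() and close would go here
            }
        }
        return txCount;
    }

    public static void main(String[] args) {
        System.out.println(countTransactions(2_000_000)); // 200 commits instead of 2,000,000
        System.out.println(countTransactions(25_000));    // 3 commits (10k + 10k + 5k)
    }
}
```

With one node per transaction the same loop would commit 2,000,000 times, which is where the observed slowness comes from.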

Question:

I've got a simple graph model: 1 User has N SocialUser.

I'm wondering if there is any way through spring-data-neo4j to automatically delete all referenced SocialUser nodes when I remove a User entity.

This is what I've got so far:

Domain:

@NodeEntity
public class User implements IdentifiableEntity<String> {

   @GraphId
   private Long nodeId;
   // ...

   @RelatedTo(type = "HAS", direction = Direction.OUTGOING)
   Set<SocialUser> socialUsers = new HashSet<>();
}

@NodeEntity
public class SocialUser implements BasicNodeEntity {

   @GraphId
   private Long nodeId;
   //...

   @RelatedTo(type = "HAS", direction = Direction.INCOMING)
   User user;
}

Data: (screenshot omitted)

What I've tried (screenshots omitted): in both cases, only the User is deleted.

At the moment I've encapsulated the deletion of both entities in a @Transactional method in the User service. Something like this:

   @Autowired
   Neo4jOperations template;

   @Transactional
   public void delete(String userId) throws Exception {
      User user = get(userId);
      if (user == null) throw new ResourceNotFoundException("user not found");
      Set<SocialUser> socialUsers = template.fetch(user.getSocialUsers());
      for (SocialUser socialUser : socialUsers) template.delete(socialUser);
      userRepository.delete(user);
   }

But this is probably not the best way to achieve it. I've also thought that it could be better to directly execute a Cypher statement that deletes all referenced nodes.

Anyone can advise me how to deal with this? Any help would be greatly appreciated. Thanks!


Answer:

I know it's been a while, but after spending some time working with SDN and Neo4j, it seems that the best way to accomplish this is a Cypher query.

MATCH (user:User{id:'userId'})-[has:HAS]->(socialUser:SocialUser)
DELETE user, has, socialUser

With SDN, we can take advantage of repositories:

@Repository
public interface UserRepository extends Neo4jRepository<User> {

    @Query("MATCH (user:User{id:{id}})-[has:HAS]->(socialUser:SocialUser) DELETE user, has, socialUser")
    void delete(String id);
}

Hope this helps other people.

Question:

I realize that a similar question was asked here, but I'm trying to understand the reasoning behind the approach of SDN to create a label matching my class name, and also creating the same label prefixed by an underscore.

So, for example, I have a Patient class. When I create my @NodeEntity decorated Patient class through my Neo4j repository and then query it back through the Neo4j web console, I see Patient and _Patient as the labels.

As an extension to this question, say I have the following inheritance hierarchy of classes representing nodes:

@NodeEntity
public class Patient extends Person {
   //class definition here
}

@NodeEntity
public abstract class Person {
    //class definition here
}

When I save an instance of Patient to the database, it will have three labels: Person, Patient, _Patient. Why wouldn't my node also have a _Person label?


Answer:

When your hierarchy has more than one non-abstract class, the label prefixed with _ allows SDN to determine the concrete type properly.

E.g. with a similar hierarchy:

Person
Patient extends Person
EbolaPatient extends Patient

then let's say you save an instance of class Patient: the node will have the _Patient label, and when you save an EbolaPatient instance it will have the _EbolaPatient label.

If you then retrieve the nodes (e.g. as a collection using findAll on a person repository) SDN will correctly instantiate the entities as Patient and EbolaPatient.

Another option for implementing this would be to find the label that is lowest in the class hierarchy, which is a lot more complicated than having an additional prefixed label.
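
For illustration only (this is not SDN's actual code), that "lowest label in the class hierarchy" strategy could look like this in plain Java, assuming a linear hierarchy like the Person/Patient/EbolaPatient example above:

```java
import java.util.Arrays;
import java.util.List;

public class Main {
    static class Person {}
    static class Patient extends Person {}
    static class EbolaPatient extends Patient {}

    // Pick the class lowest in a linear hierarchy: keep replacing the
    // current best whenever we see a class that is the same or more derived.
    static Class<?> mostDerived(List<Class<?>> candidates) {
        Class<?> best = candidates.get(0);
        for (Class<?> c : candidates) {
            if (best.isAssignableFrom(c)) {
                best = c; // c is best itself or a subclass of it
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Class<?>> labels = Arrays.asList(Person.class, EbolaPatient.class, Patient.class);
        System.out.println(mostDerived(labels).getSimpleName()); // EbolaPatient
    }
}
```

A single prefixed label avoids doing this scan on every load, which is why SDN's approach is simpler.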

For details of how this is implemented, see the LabelBasedNodeTypeRepresentationStrategy class in the SDN project.

Question:

Summary: With SDN4, I'm persisting 10 objects, of which half have the same content; only the ids to which they are linked differ. The linking ids are set as @Transient. Still, two objects are created with the same content instead of one with two links to it. How can I avoid this behavior?

Detail: We have two domain objects and their information sources (CSV files); they look as follows:

Domain object A CSV:

key,name
1,test1
3,test3

POJO A:

@Transient
private int key;    
private String name;

@Relationship(type = "HAS_CERTIFICATION", direction = "OUTGOING")
private Set<B> bObject = new HashSet<>();

public void setName(String name) {
    this.name = name;
}

@Relationship(type = "HAS_CERTIFICATION", direction = "OUTGOING")
public void hasCertification(B b) {
    bObject.add(b);
    b.getA().add(this);
}

Domain object B CSV:

foreignKey,name,value
1,ISO9001,TRUE
1,ISO14001,TRUE
3,ISO9001,TRUE
3,ISO14001,TRUE

POJO B:

@Transient
private int foreignKey;
private String name;
private String value;

@Relationship(type = "HAS_CERTIFICATION", direction = "INCOMING")
private Set<A> a = new HashSet<>();

public void setName(String name) {
    this.name = name;
}

public void setValue(String value) {
    this.value = value;
}

@Relationship(type = "HAS_CERTIFICATION", direction = "INCOMING")
public Set<A> getA() {
    return a;
}

These CSV files are parsed and loaded into SDN4 via their respective POJOs (A and B).

Now we loop over these objects and add the relationships:

private void aHasCertification(
        Optional<List<B>> b,
        Optional<List<A>> a) {
    for (A aObj : a.get()) {
        for (B bObj : b.get()) {
            if (bObj.getForeignKey() == aObj.getKey()) {
                aObj.hasCertification(bObj);
            }
        }
    }
}

Then the root repository, repositoryA, is used to save the loaded objects: repositoryA.save(domainObjectA);

Now when I query the database (MATCH (n) RETURN n),

for each of the A objects there will be two ISO9001 and two ISO14001 objects, instead of what I would expect: one of each, with links to both A:1 and A:3.


Answer:

If I understand you correctly, instead of duplicated ISO9001/ISO14001 nodes per A, you're expecting a single node of each, linked to both A nodes?

The OGM has no way of knowing that two instances of B with the same "name" are the same node. What you will need to do is load the B node by property and, if it exists, use it to relate to A; otherwise create it. I suspect you'll need to process the CSV data a bit further instead of modelling your objects with an almost 1:1 mapping to a CSV row.
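
A minimal, Neo4j-free sketch of that look-up-or-create idea (the Certification class and its fields are invented for illustration): a cache keyed by the identifying property guarantees exactly one object per distinct certification, which then collects one link per related A.

```java
import java.util.HashMap;
import java.util.Map;

public class Main {
    // Invented stand-in for domain object B, identified by its name.
    static class Certification {
        final String name;
        int linkCount = 0; // how many A objects relate to this node
        Certification(String name) { this.name = name; }
    }

    static final Map<String, Certification> cache = new HashMap<>();

    // Return the existing instance for `name`, creating it only on first use.
    static Certification getOrCreate(String name) {
        return cache.computeIfAbsent(name, Certification::new);
    }

    public static void main(String[] args) {
        // CSV rows 1 and 3 both reference ISO9001 and ISO14001.
        for (String[] row : new String[][] {
                {"1", "ISO9001"}, {"1", "ISO14001"},
                {"3", "ISO9001"}, {"3", "ISO14001"}}) {
            getOrCreate(row[1]).linkCount++;
        }
        System.out.println(cache.size());                   // 2 distinct certifications
        System.out.println(cache.get("ISO9001").linkCount); // linked twice
    }
}
```

With the OGM, the cache lookup would instead be a find-by-property query (or an in-memory map built while parsing the CSV), so that repositoryA.save() sees one shared B instance rather than several equal ones.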

Question:

I am currently working on an enterprise-architecture-management system for my bachelor thesis at an insurance company. The company wants to display all self-made applications (Application node) in a Neo4j graph database with webservice dependencies as relationships. Alongside the applications, the company also wants to persist the [Maven] versions (Version node) of an application (HAS_VERSION relationship), because new versions can gain new dependencies or lose old ones (USES relationship). And here is my point: I want to provide unique Version subnodes for each application instance. Such as:

APPLICATION Foo HAS_VERSION 1.0; 1.1; 2.0 and APPLICATION Bar HAS_VERSION 1.0; 2.0; 3.0. The Version nodes 1.0 and 2.0 for the two applications should be separate nodes, so this example should display altogether 8 nodes (2 applications and 6 versions) in the graph database. Is there a straightforward way to implement this with spring-data-neo4j? I already tried the @Unique annotation on the Version set in my Application POJO class, but that results in unique nodes for each version, so the example above displays 6 nodes (2 applications and 4 versions) in the database, and both applications have HAS_VERSION relationships to the shared version nodes 1.0 and 2.0. But I explicitly want unique Version "subnodes" per application, because a USES relationship to a version of a depending application should be directly visible in the database. With shared unique Version nodes it is not.

I also implemented a self-made way to do this, but it wasn't very efficient, because it uses many read and write operations on the database, and some applications have many versions, so the @Fetch annotation is very resource-wasting. Now I am curious whether spring-data-neo4j already provides a solution to my problem. Is there something to solve this efficiently?


Answer:

Yes, here's an example with Spring Data Neo4j 4 (currently in version M1). Note how you can control the persistence horizon, or "depth", to control the fetching of related entities.

Application.java

@NodeEntity
public class Application {

    Long id;
    String name;
    @Relationship(type="HAS_VERSION", direction = "OUTGOING")
    Set<Version> versions = new HashSet<>();

    public Application() {
    }

    public Application(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public void addVersion(Version version) {
        versions.add(version);
    }

    public Set<Version> getVersions() {
        return versions;
    }

    public Long getId() {
        return id;
    }
}

Version.java

@NodeEntity
public class Version {

    Long id;

    String versionNumber;

    @Relationship(type = "HAS_VERSION", direction = "INCOMING")
    Application application;


    public Version() {
    }

    public Version(String versionNumber) {
        this.versionNumber = versionNumber;
    }

    public String getVersionNumber() {
        return versionNumber;
    }

    public void setVersionNumber(String versionNumber) {
        this.versionNumber = versionNumber;
    }

    public Application getApplication() {
        return application;
    }

    public void setApplication(Application application) {
        this.application = application;
    }

    public Long getId() {
        return id;
    }
}

Repositories

public interface VersionRepository extends GraphRepository<Version> {

}

public interface ApplicationRepository extends GraphRepository<Application>{

}

And a test:

@Test
public void testDomain() {
    Application foo = new Application("Foo");
    Version foo10 = new Version("1.0"); // create a NEW version
    Version foo11 = new Version("1.1"); // create a NEW version
    Version foo20 = new Version("2.0"); // create a NEW version
    foo.addVersion(foo10);
    foo.addVersion(foo11);
    foo.addVersion(foo20);
    applicationRepository.save(foo);

    Application bar = new Application("Bar");
    Version bar10 = new Version("1.0"); // create a NEW version
    Version bar20 = new Version("2.0"); // create a NEW version
    Version bar30 = new Version("3.0"); // create a NEW version
    bar.addVersion(bar10);
    bar.addVersion(bar20);
    bar.addVersion(bar30);
    applicationRepository.save(bar);

    session.clear();

    assertEquals(2, applicationRepository.count()); // 2 apps
    assertEquals(6, versionRepository.count());     // 6 versions

    session.clear();

    // Depth=0 will only load the application and the properties on the
    // application node, but not its related objects.
    // If you don't specify depth, it defaults to 1.
    Application barWithoutVersionsLoaded = applicationRepository.findOne(bar.getId(), 0);

    assertEquals("Bar", barWithoutVersionsLoaded.getName());
    assertEquals(0, barWithoutVersionsLoaded.getVersions().size()); // no versions loaded
}

Question:

Data model:

I have a tree structure stored in neo4j, where nodes of :Node type can be parents for nodes of the same type.

The :Node nodes are shown on the right of the (omitted) diagram. The root of the tree (displayed in red) shares some attributes with the leaves, so there is an abstract class called AbstractNode:

public abstract class AbstractNode {
    private Long id;
    @NotEmpty
    private String code;
    @Relationship(type = "SUBTREE_OF", direction = Relationship.INCOMING)
    private Set<Node> children;

    <getters & setters omitted>
}

Class for the parent node:

public class CodeSet extends AbstractNode {
    @Relationship(type = "SUBTREE_OF", direction = Relationship.OUTGOING)
    private Application parent;

    <getters and setters omitted>
}

Class for the child node:

public class Node extends AbstractNode {
    @NotEmpty
    private String description;
    @NotEmpty
    private String type;
    @NotEmpty
    private String name;
    @NotNull
    @Relationship(type = "SUBTREE_OF", direction = Relationship.OUTGOING)
    private AbstractNode parent;

    <getters and setters omitted>
}

Service layer:

This method is used for retrieving node info down to the specified depth:

@Transactional(readOnly = true)
public Node findById(Long id, int depth) throws EntityNotFoundException {
    Node entity = nodeRepository.findOne(id, depth);
    if (entity == null) {
        throw new EntityNotFoundException(String.format("Node %d not found", id));
    }
    return entity;
}

Problem: When fetching the :Node nodes, the ones that have parents of the same type have these parents in their list of children, which is obviously wrong and causes other problems. See the (omitted) debugger screenshot for the described data set.

How to resolve this?


Answer:

With Spring Data Neo4j (SDN), when there is a combination of incoming and outgoing relationships of the same type, you need to annotate both the field and the setter/getter of the incoming-relationship field; otherwise you will end up with an incorrect mapping.

This is said in the SDN documentation:

The direction attribute on a @Relationship defaults to OUTGOING. Any fields or methods backed by an INCOMING relationship must be explicitly annotated with an INCOMING direction.

There is also an issue/feature request in neo4j-ogm (the mapping library used in Spring Data Neo4j 4+) about this.

Question:

I need to create 50,000 nodes (:Person) with some relationship to 50 nodes (:office). It is guaranteed that :office has no duplicate elements, but :Person does.

My code looks like this:

CREATE INDEX ON :office(rc)
CREATE INDEX ON :Person(numDoc)
USING PERIODIC COMMIT 500 LOAD CSV FROM 
"file:///path/to/file" AS csvLine
MATCH (x:office{ rc:"345421"})
MERGE (n:Person { numDoc: toint(csvLine[1]) })
CREATE (n) -[:Afil]-> (x)

It actually works, but it takes a long time (around 2 hours). I would like to know a more efficient way to write this query.


Answer:

Please make the spelling/capitalization consistent, and try to use Neo4j 2.3.0.

CREATE INDEX ON :Office(rc);

CREATE INDEX ON :Person(numDoc);

// make sure the indexes are online

:schema await

MATCH (x:Office{ rc:"345421"})
LOAD CSV FROM  "file:///path/to/file" AS csvLine
MERGE (n:Person { numDoc: toInt(csvLine[1]) })
CREATE (n) -[:Afil]-> (x);

If you only create 50k people and 50k relationships, you don't need periodic commit. You can then match the office once up front and not for every row.

You can check whether your query uses the indexes you created by prepending it with EXPLAIN.

Question:

I'm using Spring Data Neo4j and I'd like to pull a list of owners. An owner can be a :Person (human) or an :Entity (company). I'm not sure what type T I could use in the GraphRepository<T> interface.

My Query is:

MATCH ()-[r:OWNED_BY]->(o) RETURN o

And this is the code I tried:

public interface Owners extends GraphRepository<Object> {

    @Query("start o=MATCH ()-[r:OWNED_BY]->(o) RETURN o;")
    Iterable<Object> getOwners();

}

I had an idea that I could perhaps extend a common base class, such as PersistentBaseObject with an id and a name, or an interface like HasIdAndName. I'm not sure how I'd integrate this, though.


Answer:

Yes, you could extend a common base class, perhaps like this:

public class Owner {

    Long id;
    String name;
...
}

public class Person extends Owner {

    private int age;
...
}
public class Entity extends Owner {

    private String location;
...
}

And add a matching repository for Owner

public interface OwnerRepository extends GraphRepository<Owner> {

}

which will allow you to do things such as ownerRepository.findAll(). But since you're using a @Query, there is no reason you can't put this method even on the PersonRepository (at least in SDN 4; I'm not sure about SDN 3.x):

 @Query("MATCH ()-[r:OWNED_BY]->(o) RETURN o")
 Iterable<Owner> getOwners();

Note however, that now your Person and Entity nodes are going to have an extra Owner label.

EDIT:

The additional label can be avoided by changing Owner to an interface. Then the @Query on a repository returning a collection of Owner should still work.

Question:

I have cypher query which should delete relationship between 2 nodes

MATCH (t:User) - [r:LINKED_TO] - (p:Movie) 
WHERE ID (t) = {0}, ID (p) = {5} 
DELETE r 
RETURN r, t

after run I have error like

Invalid input ',': expected whitespace, '.', node labels, '[', "=~", IN, STARTS, ENDS, CONTAINS, IS, '^', '*', '/', '%', '+', '-', '=', "<>", "!=", '<', '>', "<=", ">=", AND, XOR, OR, LOAD CSV, START, MATCH, UNWIND, MERGE, CREATE, SET, DELETE, REMOVE, FOREACH, WITH, CALL, RETURN, UNION, ';' or end of input (line 1, column 67 (offset: 66))

What is the problem? I can not fix it:(


Answer:

You need to specify the second term in your where clause with AND. Also, once you get past that you would have an error trying to return r - after all, you just deleted it :)

MATCH (t:User) - [r:LINKED_TO] - (p:Movie) 
WHERE ID (t) = {0}
AND ID (p) = {5} 
DELETE r 
RETURN t,p

Question:

I've created 1 million nodes with Cypher, but during the creation of these nodes I forgot to create indexes.

Now, how do I create indexes for all the nodes? I tried CREATE INDEX ON :user(userID) and I received this message: Added 1 index, statement executed in 32 ms.

But I expected a message saying "Added 1000000 indexes ..." since I have 1 million nodes with the user label and the userID property.


Answer:

When you do a CREATE INDEX you only create one index. The million entries are going to be things within that one index.

Indexes are built in the background, so when this command executes successfully the index exists, but it probably isn't fully populated until some time later.

See this blog post for more details, notably this part:

You create a schema index with CREATE INDEX ON :Label(property) e.g. CREATE INDEX ON :Person(name).

You list available schema indexes (and constraints) and their status (POPULATING,ONLINE,FAILED) with :schema in the browser or schema in the shell.

Always make sure that the indexes and constraints you want to use in your operations are ONLINE otherwise they won’t be used and your queries will be slow.

My guess is that right after you execute CREATE INDEX, if you were to check the status of the index you'd find it in the POPULATING state.

Question:

I have a neo4j database up and running. I also have a process that runs every 5 minutes and creates nodes of type "Point".

"Point" has the following properties:

pointId, cameraId, classId, groupId, datetime

Two "Point" nodes are related to each other if their pointId, cameraId, classId and groupId are all the same.

Is it possible to somehow get all the "Point" nodes that are related to each other and, based on each group of nodes, create a new "Line" node with "Line"-[:CONTAINS]->"Point"?

UPDATE: The following image shows what I have and what I need. For simplicity, I've just defined the property "camera"; if Point nodes share the same camera they need to be grouped.


Answer:

Yes, it's possible.

You need to collect all the points for each pair of these properties. Then create the Line node and then create a relationship between the created line and all the grouped points.

Add the required properties to the line node in the following query.

MATCH (p:Point)
WITH p.pointId as pointId, p.cameraId as cameraId, p.classId as classId, p.groupId as groupId, collect(p) as related_points
CREATE (line:Line)
WITH line, related_points
UNWIND related_points as point
CREATE (line)-[:CONTAINS]->(point)
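Note that re-running this query creates a second Line node for every group. Since the process runs every 5 minutes, a MERGE-based variant keeps it idempotent; this is a sketch that assumes you copy the grouping keys onto the Line node so MERGE can match it again:

```cypher
MATCH (p:Point)
WITH p.pointId AS pointId, p.cameraId AS cameraId, p.classId AS classId, p.groupId AS groupId,
     collect(p) AS related_points
// MERGE finds the existing Line for this key combination or creates it
MERGE (line:Line {pointId: pointId, cameraId: cameraId, classId: classId, groupId: groupId})
WITH line, related_points
UNWIND related_points AS point
MERGE (line)-[:CONTAINS]->(point)
```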

Question:

I have started using neo4j and I have several versions of a graph in my neo4j database (the only thing that changes is the timestamp at the top node).

I was wondering how to get only the relations to that one node. I currently use this:

"START n=node(*) MATCH (n)-[r]->(m) RETURN n,r,m;"

But this just displays all of them. I know I have to change the n=node(*) but I don't know to what. (the name of the top node is: Info) so maybe something like

"START n=node(i:Info{timeStamp:'20/04/2018'}) MATCH (n)-[r]->(m) RETURN n,r,m;"

but that would just give me the relations to that one node... and I need the whole graph


Answer:

Do this:

MATCH (n:Info)-[r]->(m)
WHERE n.timeStamp = '20/04/2018'
RETURN n, r, m;

For quicker access to the top node, you should also create an index on :Info(timeStamp):

CREATE INDEX ON :Info(timeStamp);

[UPDATED]

To also get all the relationships and nodes to depth 2, you could do this:

MATCH (n:Info)-[r1]->(m1)-[r2]->(m2)
WHERE n.timeStamp = '20/04/2018'
RETURN n, r1, m1, r2, m2;

To get all the relationships and nodes up to an arbitrary depth (say, 5), you can do this (each returned path will be one of the matching paths from n to a child node):

MATCH path=(n:Info)-[r*..5]->(m)
WHERE n.timeStamp = '20/04/2018'
RETURN path;

You could also just use [r*] for an unbounded variable-length search, but that can cause the server to run out of memory or take an extremely long time to finish.

Question:

I am trying to write a function in Java that finds nodes in my Neo4j graph database with the same name and label.

@UserFunction(value = "boris.example")
@Description("finds nodes with same name")

   public  ResourceIterator<Node> passName(@Name("nodeId") long nodeId)
   {
        Node nodeX = null;

        Node node = db.getNodeById( nodeId ); //Find node with spcific ID
        Object nameVal = node.getProperties("name"); //Get its name
        Label label = Label.label("VersionableObject"); //Decl. of label

        // Find nodes by label and name
        ResourceIterator<Node> nodes = db.findNodes(label, "name", nameVal); 
        nodes.close();

        return nodes;
   }

This function causes the following exception and Neo4j refuses to start:

"Argument ResourceIterator<Node> at position 0 in passName with type Label cannot be converted to a Neo4j type: Don't know how to map org.neo4j.graphdb.ResourceIterator to the Neo4j Type System."

Then I tried to convert ResourceIterator to Stream using

Stream<Node> nodesStream = nodes.stream();

Same error - Neo4j can't map a Stream either.

ResourceIterator<Node> nodes = db.findNodes(label, "name", nameVal);    

        while (nodes.hasNext()) {

            nodeX = nodes.next();

        }
        nodes.close();

return nodeX;

The output in Neo4j is empty, even though it should find 3 nodes.

Converting the Stream to Array or List did not work as well.

Finding one node and displaying it or its properties works, e.g.:

Node node = db.getNodeById( nodeId );
Map<String, Object> propertyMap = node.getProperties("name");

EDIT

After changing @UserFunction to @Procedure with Mode.READ:

package org.neo4j.example.procedure_template;


import org.neo4j.graphdb.ResourceIterator;
import java.util.stream.Stream;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Label;
import org.neo4j.graphdb.Node;
import org.neo4j.logging.Log;
import org.neo4j.procedure.*;
import org.neo4j.procedure.Description;
import org.neo4j.procedure.Name;


public class FindNode {

    @Context
    public GraphDatabaseService db;

    @Context
    public Log log;


    public FindNode() {
    }

    @Procedure(value = "boris.getAllNodesWithProperty", mode = Mode.READ)
    @Description("boris.getAllNodesWithProperty - finds Node by ID and return defined Property")

       public Stream<Node> passName(@Name("nodeId") long nodeId)

       {
        Node node = db.getNodeById( nodeId );
        Object nameVal = node.getProperties("name");
        Label label = Label.label("VersionableObject");

        ResourceIterator<Node> nodes = db.findNodes(label, "name", nameVal);
        Stream<Node> nodesStream = nodes.stream();

        return nodesStream;
       }

} 

In this case I get: "Procedures with zero output fields must be declared as VOID".

Any ideas? Thank you!


Answer:

In Neo4j you can extend Cypher by creating some custom user defined function or procedure.

A user defined function is just a converter: it is read-only and returns a single value whose type can be long, Long, double, Double, boolean, Boolean, String, Node, Relationship, Path, Map<String, Object> or List<T>.

A procedure can take arguments, perform operations on the database, and return results (as a Stream).

In your code you have defined a user function, and I think what you want is a procedure. That's why you get the cast error from ResourceIterator to the Neo4j type system.

You have to make these changes:

  • replace @UserFunction(value = "boris.example") by @Procedure(value = "boris.example", mode = Mode.READ)
  • Change the return type of your method to Stream<Node>

Cheers.

Edit

A procedure must return a stream of records, i.e. POJOs with public fields. So you have to wrap your Node in a POJO like this:

public class FindNode {

    @Context
    public GraphDatabaseService db;

    @Context
    public Log log;


    @Procedure(value = "boris.getAllNodesWithProperty", mode = Mode.READ)
    @Description("boris.getAllNodesWithProperty - finds Node by ID and return defined Property")
    public Stream<NodeResult> passName(@Name("nodeId") long nodeId)

    {
        Node node = db.getNodeById( nodeId );
        Object nameVal = node.getProperty("name");
        Label label = Label.label("VersionableObject");

        ResourceIterator<Node> nodes = db.findNodes(label, "name", nameVal);

        return nodes.stream().map( item -> new NodeResult(item));
    }

    public class NodeResult {

        public final Node node;

        public NodeResult(Node node) {
            this.node = node;
        }
    }


}
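Once deployed, the procedure can be called from Cypher like this (42 is a placeholder node id):

```cypher
CALL boris.getAllNodesWithProperty(42) YIELD node
RETURN node
```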

Question:

I have the follwing structure

        firstNode = graphDb.createNode();
        firstNode.setProperty( "person", "Andy " ); 
        Label myLabel = DynamicLabel.label("A");
        firstNode.addLabel(myLabel);
        secondNode = graphDb.createNode();
        secondNode.setProperty( "person", "Bobby" );
        Label myLabel1 = DynamicLabel.label("B");
        secondNode.addLabel(myLabel1);
        ThirdNode = graphDb.createNode();
        ThirdNode.setProperty( "person", "Chris " );
        Label myLabel2 = DynamicLabel.label("C");
        ThirdNode.addLabel(myLabel2);....

        relationship = firstNode.createRelationshipTo( secondNode, RelTypes.emails );
        relationship.setProperty( "relationship", "email " );
        relationship = firstNode.createRelationshipTo( ThirdNode, RelTypes.emails );
        relationship.setProperty( "relationship", "email " );
        relationship = secondNode.createRelationshipTo( ThirdNode, RelTypes.emails );
        relationship.setProperty( "relationship", "email " );
        relationship = secondNode.createRelationshipTo( FourthNode, RelTypes.emails );
        relationship.setProperty( "relationship", "email " );

firstNode is linked to second and third by the relation "emails". Similarly, second node is connected to third, fourth, first.

For each node I want output something like this: secondNode=[firstNode, FourthNode, ThirdNode], firstNode=[secondNode, thirdNode], thirdNode=...

I tried something like this:

try{
        ExecutionEngine engine = new ExecutionEngine(graphDb);
        ExecutionResult result = engine.execute("MATCH (secondNode{person:'Bobby'})<-[:emails]-(node)RETURN node");

        System.out.println(result.dumpToString());
        tx1.success();
    } 

I got the output: Node[0]{person:"Andy "}

I am very new to Cypher. How do I write the match statement for this? Is it possible?


Answer:

  • Your label should be something like :Person, not :A, :B, :C
  • You want to aggregate by your first node.
  • You should use uppercase relationship types

try something like this:

MATCH (sender:Person)-[:EMAILS]->(receiver) 
RETURN sender,collect(receiver) as receivers

Question:

I'm using Spring Data Neo4j 4.0.0 with Neo4j 2.2.1 and I'm trying to create a relationship between two nodes with the exact same labels.

So, I have a NodeEntity class and I have a variable inside with the same Type as the class itself, and annotate it as Relationship. But, when I save the object to the database using the save() method of the repository object, the relationship can't be created.

Thank you in advance and your suggestion would be really appreciated!

EDIT

Here is the node entity classes

public class ArchitectureUnitState extends UnitState {

    public ArchitectureUnitState()
    {
        super();
    }

    public ArchitectureUnitState(String name, String description, String parentArchitectureUnitName)
    {
        super(name, description);
        this.parentArchitectureUnitName = parentArchitectureUnitName;
    }

    @Relationship(type="PART_OF", direction = Relationship.OUTGOING)
    private ArchitectureUnitState architectureUnitState;

    @Relationship(type="STATE_OF", direction = Relationship.OUTGOING)
    private ArchitectureUnit architectureUnit;

    @Transient
    private String parentArchitectureUnitName;

    public void partOf(ArchitectureUnitState architectureUnitState) {
        this.architectureUnitState = architectureUnitState;
    }

    public void stateOf(ArchitectureUnit architectureUnit) {
        this.architectureUnit = architectureUnit;
    }

    public void childOf(String parentArchitectureUnitName) {
        this.parentArchitectureUnitName = parentArchitectureUnitName;
    }

    public String getParentName() {
        return parentArchitectureUnitName;
    }
}

@NodeEntity
public class UnitState {
    @GraphId
    protected Long id;

    private String name;
    private String description;

    public UnitState() {

    }

    public UnitState(String name, String description) {
        this.name = name;
        this.description = description;
    }

    public void setName(String name) {
        this.name = name;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    public String getName() {
        return name;
    }

    public String getDescription() {
        return description;
    }
}

So, the sequence is: I created the ArchitectureUnitState objects, map one to another, then save with the save() method of the ArchitectureUnitStateRepository.

If I do it like this, the PART_OF relationships aren't created, although I can see in the debugger that the values are there.

My workaround right now is I save all the ArchitectureUnitState nodes first, retrieve them again from the database, map one to another, then save it again. This way, the relationships can be created, but I need to save two times.


Answer:

Here's my test case using your classes above.

    @Test
        public void testArchitectureState() {
            ArchitectureUnitState state1 = new ArchitectureUnitState("one","desc one","root");
            ArchitectureUnitState state2 = new ArchitectureUnitState("two","desc two","root");
            ArchitectureUnit unit1 = new ArchitectureUnit("unit1");
            ArchitectureUnit unit2 = new ArchitectureUnit("unit2");
            state1.partOf(state2);
            state1.stateOf(unit1);
            state2.stateOf(unit2);
            architectureUnitStateRepository.save(state1);

            state1 = architectureUnitStateRepository.findByName("one");
            assertEquals("two", state1.getArchitectureUnitState().getName());
            assertEquals("unit1", state1.getArchitectureUnit().getName());

            state2 = architectureUnitStateRepository.findByName("two");
            assertNull(state2.getArchitectureUnitState()); 
            assertEquals("unit2", state2.getArchitectureUnit().getName());

} 

It does pass as expected, and the nodes created in the graph appear to indicate the same.

Note that assertNull(state2.getArchitectureUnitState()); holds true because the direction of the relation is specified as OUTGOING. There is no outgoing PART_OF relation from state2, so none will be loaded.

If I change the test to

@Test
    public void testArchitectureBothWays() {
        ArchitectureUnitState state1 = new ArchitectureUnitState("one","desc one","root");
        ArchitectureUnitState state2 = new ArchitectureUnitState("two","desc two","root");
        ArchitectureUnit unit1 = new ArchitectureUnit("unit1");
        ArchitectureUnit unit2 = new ArchitectureUnit("unit2");
        state1.partOf(state2);
        state2.partOf(state1);
        state1.stateOf(unit1);
        state2.stateOf(unit2);
        architectureUnitStateRepository.save(state1);

        state1 = architectureUnitStateRepository.findByName("one");
        assertEquals("two", state1.getArchitectureUnitState().getName());
        assertEquals("unit1", state1.getArchitectureUnit().getName());


        state2 = architectureUnitStateRepository.findByName("two");
        assertEquals("one",state2.getArchitectureUnitState().getName());
        assertEquals("unit2", state2.getArchitectureUnit().getName());
    }

then we have a relationship in both directions and now state2 has a relationship to state1.

Question:

import org.neo4j.driver.v1.AuthTokens;
import org.neo4j.driver.v1.Driver;
import org.neo4j.driver.v1.GraphDatabase;
import org.neo4j.driver.v1.Session;


public class adding {

    static Driver driver;

    public static void main(String args[]) {
        driver = GraphDatabase.driver("bolt://localhost:7687", AuthTokens.basic("neo4j","neo4j"));
        Session session = driver.session();
        session.run("CREATE (n:Person {username: 'bob'})");
        session.run("CREATE (n:Person {username: 'tom'})");
        session.run("CREATE (n:Person {username: 'bob'})"); // I don't want this ran

    }
}

Using Neo4j with Java, how do I forbid the creation of the same node again? For example, you can see in the above code that 'bob' is created twice.

How do I validate in java so that I know a bob node already exists.

MATCH (n:Person {username: 'bob'}) RETURN n in Neo4j would show that there is already a bob node, so I'd just put an if condition around the third bob to make sure no duplicate is created - but I'm not sure how to write that in Java. This code is a simplified version; my main code takes user input rather than hard-coded creations.


Answer:

I think StatementResult is what you are looking for to run the Match query.

And you could do something like this.

import org.neo4j.driver.v1.AuthTokens;
import org.neo4j.driver.v1.Driver;
import org.neo4j.driver.v1.GraphDatabase;
import org.neo4j.driver.v1.Session;
import org.neo4j.driver.v1.StatementResult;

import java.util.Arrays;
import java.util.List;

import static org.neo4j.driver.v1.Values.parameters;


public class adding {

    static Driver driver;

    public static void main(String args[]) {
        driver = GraphDatabase.driver("bolt://localhost:7687", AuthTokens.basic("neo4j", "neo4j"));
        Session session = driver.session();

        List<String> userList = Arrays.asList("bob", "tom", "bob");
        userList.forEach(user -> {
            // Use query parameters rather than string concatenation
            StatementResult result = session.run(
                    "MATCH (a:Person {username: {username}}) RETURN a",
                    parameters("username", user));
            // Check if the node exists and create it only if it doesn't
            if (!result.hasNext()) {
                session.run("CREATE (n:Person {username: {username}})",
                        parameters("username", user));
            }
        });
    }
}

Also, you may need to create constraints to avoid duplicates.

And if you like to get the result in a POJO class with Spring data then you need @QueryResult annotation to map the results to the POJO.
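A more robust alternative: checking and then creating is not atomic under concurrent writers, so it is safer to let the database enforce uniqueness and use MERGE, which only creates the node when it doesn't already exist:

```cypher
// Run once; any future write that would duplicate a username will fail
CREATE CONSTRAINT ON (p:Person) ASSERT p.username IS UNIQUE;

// Then, instead of MATCH + CREATE from Java:
MERGE (n:Person {username: {username}})
```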

Question:

Using the Neo4j Java API, how can I find the path between two given nodes for which the product of all relationship weights is maximal over all paths between them? In my graph database there are two kinds of elements: nodes and relationships. Both have a name property, but relationships have an extra property, weight (a double with values in (0,1]). My code is below; how should I modify it?

public static ArrayList<Path> getAllOptPaths(Long startNodeId, Long endNodeId, GraphDatabaseService db){
    ArrayList<Path> optPathsBetweenTwoNodes = new ArrayList<Path>();
    try (Transaction tx = db.beginTx()){
        Node node1 = db.getNodeById(startNodeId);
        Node node2 = db.getNodeById(endNodeId);

        PathExpander<Object> pathExpander = PathExpanders.allTypesAndDirections();
        CostEvaluator<Double> costEvaluator = CommonEvaluators.doubleCostEvaluator("Cost");

        // find all paths between given two nodes
        PathFinder<WeightedPath> dijkstraPathsFinder = GraphAlgoFactory.dijkstra(pathExpander, costEvaluator);
        WeightedPath path = dijkstraPathsFinder.findSinglePath(node1, node2);

        optPathsBetweenTwoNodes.add(path);

        tx.success();
    } catch (Exception e) {
        e.printStackTrace();
    }
    return optPathsBetweenTwoNodes;       
}

Answer:

Exploring paths is also possible in a Cypher query, which can deliver the result of a weighted path calculation (and thus the path with the minimum total weight). This webpage contains an example of a weighted path query. The Cypher reduce function will also help you specify how to accumulate (and weigh) your path costs.
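Two notes on the Java code in the question: the cost evaluator reads a "Cost" property while the relationships store weight, and Dijkstra minimizes a sum of costs while you want to maximize a product. Because the weights lie in (0,1], maximizing the product of weights is equivalent to minimizing the sum of -log(weight), so a cost evaluator returning -Math.log(weight) makes Dijkstra find the max-product path. A minimal, self-contained sketch of why the transformation works (the class name is made up; the Neo4j wiring is left out):

```java
// Shows that the path minimizing the sum of -log(w) is exactly the
// path maximizing the product of its weights w, for w in (0,1].
public class MaxProductPath {

    // Product of the weights along a path
    public static double product(double[] weights) {
        double p = 1.0;
        for (double w : weights) p *= w;
        return p;
    }

    // Dijkstra-style additive cost: sum of -log(w)
    public static double negLogCost(double[] weights) {
        double s = 0.0;
        for (double w : weights) s += -Math.log(w);
        return s;
    }

    public static void main(String[] args) {
        double[] pathA = {0.9, 0.9};   // product = 0.81
        double[] pathB = {0.95, 0.5};  // product = 0.475
        // The path with the larger product has the smaller additive cost,
        // so minimizing negLogCost selects the max-product path.
        System.out.println(product(pathA) > product(pathB));        // true
        System.out.println(negLogCost(pathA) < negLogCost(pathB));  // true
    }
}
```

In the original method, this would amount to replacing the doubleCostEvaluator with a custom CostEvaluator<Double> that returns -Math.log((double) relationship.getProperty("weight")).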

Question:

I have written a simple Java program to create relationships among more than 600 thousand existing nodes; the relationship data comes from 1 million rows. The problem is that it takes almost forever to finish: only 7930 relationships had been created at the time of writing this post.

How this problem could be solved?

public void createRealtionBetweenNodes(Postgresql object) {

    Map<Integer, String[][]> data = object.getRelationData();

    try (Driver driver = org.neo4j.driver.v1.GraphDatabase.driver("bolt://localhost:7687", AuthTokens.basic("user", "password"));
         Session session = driver.session()) {

        for (int incrementer : data.keySet()) {

            String[][] dataholder = data.get(incrementer);

            session.run("MATCH (x:Node {id: {id1}}), (y:Node {id: {id2}}) CREATE (x)-[:Knows {ID: {KID}}]->(y);",
                    parameters("id1", dataholder[0][0], "id2", dataholder[0][1], "KID", dataholder[0][2]));
        }
    }
}

Answer:

I would guess, based on the code you provided, that Node.id is not indexed. If it isn't, and there are a lot of :Node nodes, every MATCH has to scan many nodes before it finds the right id. Try creating an index and running this again.
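Concretely, that means creating the index once, and optionally also batching the relationship creation so you don't pay one network round-trip per row. A sketch (the batch parameter name and map keys are made up; each batch would be a list of {id1, id2, kid} maps passed from Java as a single parameter):

```cypher
CREATE INDEX ON :Node(id);

// One statement per batch of, say, 10,000 rows:
UNWIND {batch} AS row
MATCH (x:Node {id: row.id1}), (y:Node {id: row.id2})
CREATE (x)-[:Knows {ID: row.kid}]->(y);
```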

Question:

I'm having an issue with rendering a Jung graph - the graph is creating duplicate nodes for some reason.

I load the nodes (two different types) into custom vertex classes (MovieVertex extends RootNode implements NodeInfo and PersonVertex extends RootNode implements NodeInfo). RootNode has a name field that I display on the vertex label using the following code:

DirectedSparseGraph<NodeInfo, String> g = new DirectedSparseGraph<NodeInfo, String>();
// Code to read node data from a Neo4j graph database
List<Map<String, Object>> nodes = grapher.read(cql);

try (Session session = grapher.driver.session()) {
    StatementResult result = session.run(cql2);
    while (result.hasNext()) {
        Record record = result.next();
        String targetNode = record.get(1).get("title").toString();
        String sourceNode = record.get(0).get("name").toString();
        String tagline = record.get(1).get("tagline").toString();
        String released = record.get(1).get("released").toString();
        int born = record.get(0).get("born").asInt();
        String rel = sourceNode + "-ACTED_IN-" + targetNode;
        // The problem is probably here - it is creating duplicate vertices for the same data
        MovieVertex mv = new MovieVertex(targetNode, tagline, released);
        PersonVertex pv = new PersonVertex(sourceNode, born);
        g.addVertex(pv);
        g.addVertex(mv);
        g.addEdge(rel, pv, mv);

        ISOMLayout<NodeInfo, String> layout = new ISOMLayout<NodeInfo, String>(g);
        VisualizationViewer<NodeInfo, String> vv =
                new VisualizationViewer<NodeInfo, String>(layout, new Dimension(800, 800));
        vv.getRenderContext().setVertexLabelTransformer(new ToStringLabeller());
        JFrame frame = new JFrame();
        frame.getContentPane().add(vv);
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.pack();
        frame.setVisible(true);
    }

Edit: I added the following code to the custom vertex classes to address the problem:

    @Override
    public boolean equals(Object o) {

        if (o == this) return true;
        if (!(o instanceof MovieVertex)) {
            return false;
        }

        MovieVertex mv = (MovieVertex) o;

        return new EqualsBuilder()
                .append(title, mv.title)
                .append(tagline, mv.tagline)
                .append(released, mv.released)
                .isEquals();
    }

  @Override
    public int hashCode() {
        return new HashCodeBuilder(17, 37)
                .append(title)
                .append(tagline)
                .append(released)
                .toHashCode();
    }


Answer:

This is a common problem with any custom Java class that is used as a Map key (as JUNG vertices are). Unless you override the equals and hashCode methods in your custom vertex class, you will get duplicates. You would see the same problem if you added your MovieVertex or PersonVertex instances to a java.util.Set or used them as keys in a Map. Perhaps you can use your name field to compute the value for hashCode and to determine when two vertices are equal.
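A minimal, self-contained illustration of the effect (the class names here are made up; the same applies to MovieVertex/PersonVertex):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Without equals/hashCode, two vertices holding identical data are distinct
// to a HashSet (and to JUNG's internal maps); with them, they collapse to one.
public class VertexEqualityDemo {

    static class IdentityVertex {               // default identity-based equality
        final String name;
        IdentityVertex(String name) { this.name = name; }
    }

    static class ValueVertex {                  // value-based equality
        final String name;
        ValueVertex(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof ValueVertex && ((ValueVertex) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    public static void main(String[] args) {
        Set<IdentityVertex> byIdentity = new HashSet<>();
        byIdentity.add(new IdentityVertex("Tom Hanks"));
        byIdentity.add(new IdentityVertex("Tom Hanks"));
        System.out.println(byIdentity.size()); // 2 -> duplicate vertices

        Set<ValueVertex> byValue = new HashSet<>();
        byValue.add(new ValueVertex("Tom Hanks"));
        byValue.add(new ValueVertex("Tom Hanks"));
        System.out.println(byValue.size());    // 1 -> deduplicated
    }
}
```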

Question:

I have a list of 10,000 names in a Java program. I would like to create/merge a node for each of them in a Neo4j 3.3.0 database.

I know that I can contact the database through

<dependency>
  <groupId>org.neo4j.driver</groupId>
  <artifactId>neo4j-java-driver</artifactId>
  <version>1.4.4</version>
</dependency>

and send Cypher queries. I would like to avoid sending thousands of individual queries to the database. I read about the possibility of loading CSV files, but it seems strange to first write a CSV file from Java and make it available over HTTP just to hand it to the database.


Answer:

You can use the FOREACH clause to process all the names passed in via a single list parameter. This is very similar to @BrunoPeres' answer, but perhaps a bit more readable.

try ( Session session = driver.session() )
{
    List<String> list = new LinkedList<>();
    list.add("Jon");
    list.add("Doe");
    list.add("Bruno");

    session.writeTransaction( new TransactionWork<Void>()
    {
        @Override
        public Void execute( Transaction tx )
        {
            tx.run(
                    "FOREACH(name IN $names | CREATE (p:Person) SET p.name = name)",
                    parameters( "names", list ) );
            return null;
        }
    });
}

NOTE: The FOREACH clause can only contain (after the |) Cypher clauses that write to the DB (like CREATE and SET).
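If you need clauses FOREACH doesn't allow, such as MATCH or RETURN, the usual alternative is UNWIND over the same list parameter:

```cypher
// Equivalent per-name processing; MERGE avoids duplicate Person nodes
UNWIND $names AS name
MERGE (p:Person {name: name})
RETURN count(p)
```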

Question:

I have source node and destination node I want to put restriction on nodes and relation types in the path. I am using Neo4j Java API.

Consider following toy example,

We have three person nodes A, B & C.

Source Node: A & Destination Node: B. Many other kinds of paths may exist between them. I want to restrict paths to a specific format like:

(person) -[worksAt]-> (company) -[CompetitorOf]-> (company) <-[worksAt]- (person)

This can be very easily achieved from cypher query, but I want to know is there any way we can do it using Java API.

NOTE:

  1. Kindly do not suggest putting restriction on path length, that doesn't solve the problem. I want to restrict the node and relation types in path.
  2. The example above is a toy one. The graph I am working with is more complex, and there are so many possible paths that traversing and validating each of them individually is not feasible.

Answer:

It's not really clear from your question what you're actually trying to compute. Do you have A and B and want to find if their companies are competitors? Do you have C and want to find who among their friends work at competing companies?

Anyway, if you're using the traversal API (you're talking about paths), you can write a custom PathExpander which will use the last relationship in the Path to determine the next type of relationship to traverse.

If you're just traversing the relationships manually, I don't really see the problem: just call Node.getRelationships(RelationshipType, Direction) with the proper parameters at each step.

Contrary to what you do in Cypher, you don't declare the pattern you're looking for in the path, you just compute the path to follow the wanted pattern.
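As a sketch of the PathExpander approach (the RelTypes names match the toy example; adapt them to your real schema), the expander looks at the last relationship on the path and only offers the type that legally follows it in the pattern (person)-[worksAt]->(company)-[CompetitorOf]->(company)<-[worksAt]-(person):

```java
PathExpander<Object> expander = new PathExpander<Object>() {
    @Override
    public Iterable<Relationship> expand(Path path, BranchState<Object> state) {
        Relationship last = path.lastRelationship();
        if (last == null) {                          // at the start node (a person)
            return path.endNode().getRelationships(RelTypes.worksAt, Direction.OUTGOING);
        } else if (last.isType(RelTypes.worksAt)) {  // at a company
            return path.endNode().getRelationships(RelTypes.CompetitorOf, Direction.OUTGOING);
        } else if (last.isType(RelTypes.CompetitorOf)) {
            return path.endNode().getRelationships(RelTypes.worksAt, Direction.INCOMING);
        }
        return Collections.emptyList();              // pattern exhausted
    }

    @Override
    public PathExpander<Object> reverse() {
        throw new UnsupportedOperationException();
    }
};
```

Combined with an evaluator that only includes paths of length 3, this restricts the traversal to exactly the wanted shape.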

Question:

On Neo4j, programatically I run BFS as follows:

public Traverser runBFSPaths(Node startNode) {
    TraversalDescription myTraversal = graphDb.traversalDescription()
       .breadthFirst()
       .relationships(relationshipType)
       .evaluator(Evaluators.excludeStartPosition());
    return myTraversal.traverse(startNode);
}

If I have multiple node labels, how can I restrict the BFS above to only one node type (without using Cypher)? Or is it much easier to express this in Cypher?


Answer:

You may want to look at the path expander procedures in APOC Procedures. These use bfs expansion by default, and let you specify node labels for whitelist, blacklist, or for end nodes you're interested in. You can also specify the relationships to use for the expansion.

Question:

I have a query like this:

WITH ['1000Anthem.txt','1007AW.txt','100Art.txt'] as NDS 
UNWIND RANGE(0, size(NDS)-2) as i 
UNWIND RANGE(i+1, size(NDS)-1) as j 
WITH NDS, NDS[i] as N1, NDS[j] as N2 
MATCH path = (N1)-[*]-(N2) 
WHERE length(path)+1 <=size(NDS) 
 AND ALL(n in nodes(path) WHERE n in NDS) 
RETURN path    

I got the following error

Type mismatch: N1 already defined with conflicting type String (expected Node) (line 2, column 15 (offset: 224)) "MATCH path = (N1)-[*]-(N2) WHERE length(path)+1 <=size(NDS) AND ALL(n in nodes(path) WHERE n in NDS) RETURN path"


Answer:

Your N1 and N2 variables are bound to strings from your list.

The MATCH afterwards is attempting to use them as nodes, which isn't possible. A string is not a node.

If you want to lookup a node where one of its properties is equal to the string, you'll need a different approach, using different variables for the nodes, and a predicate in the WHERE clause to filter only nodes where the node's property is equal to the string.

EDIT

You haven't provided any context into what these nodes are supposed to be, no labels, and non-descriptive variable names, so I'm just going to make a wild guess and say these are nodes labeled :File, with the property name.

Your lookup and collecting of the nodes at the start of the query would be something like:

WITH ['1000Anthem.txt','1007AW.txt','100Art.txt'] as NDS 
MATCH (f:File)
WHERE f.name in NDS
WITH collect(f) as NDS
...

If you have an index on :File(name), then the index will be used to speed up that lookup. Your NDS variable at this point will be a collection of nodes instead of a collection of strings, so the remaining parts of your query will be syntactically correct.

Question:

I have an initial user node that can own multiple items, and the items can also be organized into groups (each item into multiple groups). Therefore the user can have two corresponding relationships: owning the group and owning the item. The items have an extra relationship to the groups.

I would like to find the items that are owned by the user but are not in any group.

Is there a way to select these items in one traversal method, or should I get all items owned by the user and then iterate over them to find the ones without a relationship to a group?

Edit: I am sorry that my question was not clear enough. By "traversal method" I meant using the Neo4j Traversal Framework in Java. For example:

TraversalDescription td = db.traversalDescription()
    .breadthFirst()
    .relationships(OWNS, Direction.BOTH)
     //IS THERE A WAY HOW TO SAY THE NEXT RELATIONSHIP (EDGE) DOES NOT EXIST?
    .evaluator(Evaluators.excludeStartPosition())
    .uniqueness(Uniqueness.NODE_GLOBAL);

Traverser t = td.traverse(userNode);
    for (Path p : t) {
        //OR: SHOULD I LOOP TROUGH ALL THE RELATIONSHIPS OF THE END NODE - ITEM
        System.out.println(p.endNode().getProperty("name"));
    }

The two comments in the code are the places where I expected either a method for the traversalDescription or iterating through all relationships of the endNodes.


Answer:

In Cypher:

MATCH (me:Person{name:'me'})-[:OWNS]->(i:Item)
WHERE NOT (i)-[:PART_OF]->()  // not part of a group
RETURN i

In Java API:

Node me = graphDb.findNode(Label.label("Person"), "name", "me");
Iterable<Relationship> owns = me.getRelationships(RelationshipType.withName("OWNS"), Direction.OUTGOING);
Stream<Node> nodes = StreamSupport.stream(owns.spliterator(), false)
    .filter(r -> !r.getEndNode().hasRelationship(RelationshipType.withName("PART_OF"), Direction.OUTGOING))
    .map(Relationship::getEndNode);

Question:

I just want to make sure I don't load in memory all the nodes of the db, only those called with nodes.next() in the iterator. This is what I have:

    try (Transaction tx = graphDB.beginTx()) {
        Node startNode = graphDB.getNodeById(1);
        ResourceIterator<Node> nodes = graphDB.traversalDescription().evaluator(Evaluators.all()).traverse(startNode).nodes().iterator();

        while (nodes.hasNext()) {
            Node node = nodes.next();
            // do stuff with the node...
        }
    }

Is it correct?


Answer:

Yes, that's correct.

Only the nodes that are pulled by nodes.next() are loaded.

During iteration, the traverser pulls data as needed to fulfill the Iterator.

Question:

I'm using Neo4j OGM for communicating with my Neo4j database. I'm trying to store a JSON collection of data. This collection contains a lot of duplicate data. Is there a way to ask the OGM to filter the duplicate data for me, so that my graph does not contain duplicates?

If the OGM does not contain this functionality, what is the best way to filter this data, or what is the best way to check whether data already exists in the database?


Answer:

You can use the Cypher MERGE clause (instead of CREATE) to avoid creating duplicates. You should read the documentation to understand how to use MERGE correctly.
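For example, a minimal sketch with a hypothetical :Person label keyed by name:

```cypher
MERGE (p:Person {name: 'Alice'})
ON CREATE SET p.created = timestamp()
RETURN p
```

Keep in mind that MERGE matches on the entire pattern, so merge only on the key property and set any remaining properties with ON CREATE / ON MATCH; otherwise two nodes that differ in any property will not be considered the same.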

Question:

I am currently using Spring and Neo4j. One goal is to display the graph using Linkurious. However, how can I tell Spring, through spring-data-neo4j, the labels of the nodes? I need the labels to color the graph in Linkurious. If I use findAll() as defined in the graph repository, only node properties are returned.

Any suggestion?

UPDATE

I tried to use @QueryResult, but there's something wrong with the response. To be more specific:

I define

@QueryResult
public class NodeWithLabel {
    GLNode glNode;
    ArrayList<String> labels;
}

then in the repository, I have

@Query("MATCH (n:GLNode) RETURN n AS glNode, labels(n) as labels")
Collection<NodeWithLabel> getAllNodesWithLabel();

Finally, I get the result as an ArrayList<E>, so Spring MVC responds with empty objects like [{},{},{},{}]. Normally, as with the built-in findAll() method, a LinkedHashSet<E> is returned, and in that case Spring MVC can send back a proper JSON response.


Answer:

SDN 4.0 does not map nodes/relationships to domain entities in a @QueryResult. The code you've posted will work with SDN 4.1.

If you want to achieve the same in SDN 4.0, you can do this:

@QueryResult
public class NodeWithLabel {
    Long id;
    Map<String,Object> node;
    ArrayList<String> labels;
}


@Query("MATCH (n:GLNode) RETURN ID(n) as id, labels(n) as labels, {properties : n} as node")
Collection<NodeWithLabel> getAllNodesWithLabel();

Note: I strongly recommend that you plan an upgrade to SDN 4.1.

Question:

I use the Neo4j Java core API and want to update 10 million nodes. I thought it would be better to do it with multithreading, but the performance is not good (35 minutes for setting the properties).

To explain: each "Person" node has at least one "POINTSREL" relationship to a "Point" node, which has the property "Points". I want to sum up the points from the "Point" nodes and set the total as a property on the "Person" node.

Here is my code:

Transaction transaction = service.beginTx();
ResourceIterator<Node> iterator = service.findNodes(Labels.person);
transaction.success();
transaction.close();

ExecutorService executor = Executors.newFixedThreadPool(5);

while(iterator.hasNext()){
    executor.execute(new MyJob(iterator.next()));
}

//wait until all threads are done
executor.shutdown();

try {
    executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
} catch (InterruptedException e) {
    e.printStackTrace();
}

And here the runnable class

private class MyJob implements Runnable {

    private Node node;

    /* collect useful parameters in the constructor */
    public MyJob(Node node) {
        this.node = node;
    }

    public void run() {
        Transaction transaction = service.beginTx();
        Iterable<org.neo4j.graphdb.Relationship> rel = this.node.getRelationships(RelationType.POINTSREL, Direction.OUTGOING);

        double sum = 0;
        for(org.neo4j.graphdb.Relationship entry : rel){
            try{
                sum += (Double)entry.getEndNode().getProperty("Points");
            } catch(Exception e){
                e.printStackTrace();
            }
        }
        this.node.setProperty("Sum", sum);

        transaction.success();
        transaction.close();
    }
}

Is there a better (faster) way to do that?

About my setup: an AWS instance with 8 CPUs and 32GB RAM.

neo4j-wrapper.conf

# Java Heap Size: by default the Java heap size is dynamically
# calculated based on available system resources.
# Uncomment these lines to set specific initial and maximum
# heap size in MB.
wrapper.java.initmemory=16000
wrapper.java.maxmemory=16000

neo4j.properties

# The type of cache to use for nodes and relationships.
cache_type=soft
cache.memory_ratio=30.0
neostore.nodestore.db.mapped_memory=2G
neostore.relationshipstore.db.mapped_memory=7G
neostore.propertystore.db.mapped_memory=2G
neostore.propertystore.db.strings.mapped_memory=2G
neostore.propertystore.db.arrays.mapped_memory=512M

Answer:

From my perspective there are several things that can be improved.

Offtopic

If you are using Java 7 (or greater), consider using try-with-resources to handle transactions. It will protect you from resource leaks when errors occur.

Performance

First of all - batching. Currently, for each node, you are:

  • Creating a job
  • Starting a thread (actually, there is a pool in the executor)
  • Starting a transaction

You should consider making the updates in batches instead. This means that you should:

  • Collect N nodes (e.g. N=1000)
  • Create a single job for those N nodes
  • Create a single transaction in the job
  • Update the N nodes in that transaction
  • Close the transaction

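The batch-collection step can be sketched in plain Java, independent of Neo4j (here T would be org.neo4j.graphdb.Node in your code; each resulting batch is then the unit of work for one job):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class Batches {
    // Drain the iterator into fixed-size batches; the last batch may be smaller.
    public static <T> List<List<T>> partition(Iterator<T> nodes, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        List<T> current = new ArrayList<>(batchSize);
        while (nodes.hasNext()) {
            current.add(nodes.next());
            if (current.size() == batchSize) {
                batches.add(current);
                current = new ArrayList<>(batchSize);
            }
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Each inner list then goes to one MyJob, which opens one transaction, updates every node in its batch, and closes the transaction.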
Setup

You have 8 CPUs, which means you can create a bigger thread pool. I think Executors.newFixedThreadPool(16) will be OK.

Hacks

You have 32GB RAM. I can suggest:

  • Decrease the Java heap size to 8GB; in my experience, a very large heap can lead to long GC pauses and performance degradation
  • Increase the mapped memory size, to make sure that more data can be kept in the cache

Specifically for your case: if all your data fits in RAM, you can change cache_type to hard (see the cache settings documentation for details).

Configuration

As you said, you are using the Core API. Is this an embedded graph database or a server extension?

If it is an embedded graph database, you should verify that your database settings are applied to the created instance.

Question:

In Neo4j 2.1, I used code like this:

ResourceIterable<Node> it = GlobalGraphOperations.at(db).getAllNodesWithLabel(FOO);
TraversalDescription td = db.traversalDescription().breadthFirst().
  relationships(BAR_REL).uniqueness(Uniqueness.NODE_GLOBAL);
for (Path p : td.traverse(it)) {
  ...
}

In Neo4j 2.2, the getAllNodesWithLabel() method is deprecated, but I'm not sure how to eliminate it. The replacement method db.findNodes(Label) is close, but it returns an Iterator rather than an Iterable, and there's no way I can see to start a traversal with an Iterator unless I wrap it in a dummy Iterable or something. Anyone have a pointer?


Answer:

There are (at least) 2 convenient ways to solve this:

1) As you've mentioned, wrap the Iterator into an Iterable. Neo4j ships a helper for this out of the box:

import org.neo4j.helpers.collection.IteratorUtil;
...
for (Path p : td.traverse(IteratorUtil.asIterable(it))) {
  ...
}

2) The traverse method also accepts a Node[] array (as varargs), so you can e.g. use Guava's Iterables.toArray()

I'd prefer 1) because it has less memory overhead and does not pull in another dependency.
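If IteratorUtil happens to be unavailable in your setup, option 1) can also be done with a one-line lambda, since Iterable has a single abstract method (plain-Java sketch; note the result is single-use, because iterator() always hands back the same Iterator):

```java
import java.util.Arrays;
import java.util.Iterator;

public class OnceIterable {
    // Adapt an Iterator to a one-shot Iterable; iterating it a second time yields nothing.
    public static <T> Iterable<T> of(Iterator<T> it) {
        return () -> it;
    }
}
```

You would then call td.traverse() with OnceIterable.of(db.findNodes(label)) in place of the IteratorUtil call.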

Question:

I am new to Neo4j and I would like to query a Neo4j database containing only nodes in order to create links between them according to 2 lists I already have.

For example, I want to connect nodes with names in a List A with nodes with names from List B.

This is the code I wrote :

public class Main {

    public static void main(String[] args) {

        GraphDatabaseFactory graphDbFactory = new GraphDatabaseFactory();
        GraphDatabaseService graphDb = graphDbFactory.newEmbeddedDatabase("C:\\Zakaria\\NeoTests\\Tetralecture");

        ExecutionEngine execEngine = new ExecutionEngine(graphDb);

/* Here is a loop to read from listA and listB so no need to worry about them */

        try (Transaction ignored = graphDb.beginTx()) {
            String query = "MATCH (auteur1:AUTEUR{Name:'" + listA.get(i) + "'}), (auteur2:AUTEUR{Name:'" + listB.get(j) + "'}) return auteur1, auteur2";
            ExecutionResult execResult = execEngine.execute(query);
            Iterator<Node> aut_column = execResult.columnAs("auteur1");
            for(Node node : IteratorUtil.asIterable(aut_column)) {
                String nodeResult = node + " : " + node.getProperty("Name");
                System.out.println(nodeResult);
            }
        }

    }

}

This example only displays one list of authors, from the single column auteur1; I would like to be able to display both of them.

If I can do that, I think manipulating the nodes from both lists and creating links between them will be easier.

Thanks for your help!


Answer:

Does this work for you?

public class Main {

    public static void main(String[] args) {

        GraphDatabaseFactory graphDbFactory = new GraphDatabaseFactory();
        GraphDatabaseService graphDb = graphDbFactory.newEmbeddedDatabase("C:\\Zakaria\\NeoTests\\Tetralecture");

        ExecutionEngine execEngine = new ExecutionEngine(graphDb);

/* Here is a loop to read from listA and listB so no need to worry about them */

        try (Transaction ignored = graphDb.beginTx()) {
            String query = "MATCH (auteur1:AUTEUR{Name:'" + listA.get(i) + "'}), (auteur2:AUTEUR{Name:'" + listB.get(j) + "'}) return auteur1, auteur2";
            ExecutionResult execResult = execEngine.execute(query);
            for(Map<String, Object> row : execResult) {
                final Node node1 = (Node)row.get("auteur1");
                final Node node2 = (Node)row.get("auteur2");
                String nodeResult = node1 + " : " + node1.getProperty("Name") + "; " + node2 + " : " + node2.getProperty("Name");
                System.out.println(nodeResult);
            }
        }

    }

}
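Once both nodes are available in the same row, the link itself can be created in the same Cypher statement. A sketch using query parameters instead of string concatenation (which also protects against Cypher injection; the :RELATED_TO relationship type is a placeholder for whatever link you want):

```cypher
MATCH (auteur1:AUTEUR {Name: {nameA}}), (auteur2:AUTEUR {Name: {nameB}})
MERGE (auteur1)-[:RELATED_TO]->(auteur2)
RETURN auteur1, auteur2
```

The parameter values are passed as a Map<String, Object> (keys nameA and nameB) via execEngine.execute(query, params).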

Question:

This is my pom.xml:

<dependency>
            <groupId>org.springframework.data</groupId>
            <artifactId>spring-data-neo4j</artifactId>
            <version>3.2.0.RELEASE</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.data</groupId>
            <artifactId>spring-data-neo4j-rest</artifactId>
            <version>3.2.1.RELEASE</version>
        </dependency>

and config.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:neo4j="http://www.springframework.org/schema/data/neo4j"
       xmlns:context="http://www.springframework.org/schema/context"
       xsi:schemaLocation=" http://www.springframework.org/schema/beans
                            http://www.springframework.org/schema/beans/spring-beans.xsd
                            http://www.springframework.org/schema/data/neo4j
                            http://www.springframework.org/schema/data/neo4j/spring-neo4j.xsd">

     <!-- REST Connection to Neo4j server -->    
    <bean id="graphDatabaseService"
          class="org.springframework.data.neo4j.rest.SpringRestGraphDatabase">
        <constructor-arg index="0" value="http://localhost:7474/db/data" />

    </bean>

    <!-- graphDatabaseService- Neo4j configuration (creates Neo4jTemplate) -->
    <neo4j:config
            storeDirectory="db/neo4j/data/graph.db"
            base-package="x.y.z.mediator.domain.model"
            graphDatabaseService="graphDatabaseService"/>

</beans>

And I want to autowire the Neo4jTemplate from the Neo4j standalone server:

@Repository
public class EmployeeDAO_Neo4j implements EmployeeDAO {

    @Inject
    private Neo4jTemplate neo4jTemplate;


    @Override
    public List<Map<String, Object>> findAll(String query) {

        //Object is the node address
        Result<Map<String, Object>> result = neo4jTemplate.query("MATCH (emp:`EmpBase`) RETURN emp;", null);
        //Below line results with: {emp=http://localhost:7474/db/data/node/1}
        System.out.println("Results:"+ result.as(List.class).get(1).toString());


        List<Map<String, Object>> list = new ArrayList<Map<String, Object>>();
        Map<String, Object> map = null;



        for(EmployeeBase u : result.to(EmployeeBase.class)) {
            map = new HashMap<String, Object>( );
           map.put(u.getE_id().toString(), u.toString());

            list.add(map);
        }

        System.out.println(result.to(EmployeeBase.class).as(List.class).get(1));

        return list;
    }
}

The returned list has 4 elements, but every System.out.println in the above code prints {e_id=1, e_bossId=null, e_name='null'}, so the entity is not populated, even though the test data are:

$ cat /tmp/empbase.csv 
e_id,e_bossid,e_name
11,11,Smith
12,11,Johnson
13,11,Roberts
14,13,Doe

The line with results only gives:{emp=http://localhost:7474/db/data/node/1}

The data in Neo4j server comes from csv import. empfull.csv:

CREATE empbase;

// Import data and schema for empbase; the '_EmpBase' is required by SpringData-neo4j
USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM "file:/tmp/empbase.csv" AS row
CREATE (:EmpBase:_EmpBase {  neo_eb_id:      row.e_id,
                    neo_eb_bossID:  row.e_bossid,
                    neo_eb_name:    row.e_name});

//Create index
CREATE INDEX ON :EmpBase:(neo_eb_id);

// Create relationships
LOAD CSV WITH HEADERS FROM "file:/tmp/empbase.csv" AS row
MATCH (employee:EmpBase:_EmpBase    {neo_eb_id: row.e_id})
MATCH (manager:EmpBase:_EmpBase     {neo_eb_id: row.e_bossid})
MERGE (employee)-[:REPORTS_TO]->(manager);

Is it a bug in SDN, or is something wrong with this config?

PS: The db/neo4j/data/graph.db from the XML config is never generated. I run this project with mvn clean package. PS2: This is the EmployeeBase.java:

@NodeEntity
public class EmployeeBase {

    @GraphId
    private Long e_id;

    private Integer e_bossId;
    private String  e_name;

    public Long getE_id() {
        return e_id;
    }

    public Integer getE_bossId() {
        return e_bossId;
    }

    public String getE_name() {
        return e_name;
    }

    @Override
    public String toString() {
        return "{" +
                "e_id=" + e_id +
                ", e_bossId=" + e_bossId +
                ", e_name='" + e_name + '\'' +
                '}';
    }

}


Answer:

You used the wrong property names in your import: you prefixed them all with neo_eb_ instead of using e_. You also had some typos, e.g. in e_bossId.

CREATE empbase;

// Import data and schema for empbase; the '_EmpBase' is required by SpringData-neo4j
USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM "file:/tmp/empbase.csv" AS row
CREATE (:EmpBase:_EmpBase {  e_id:      row.e_id,
                    e_bossId:  row.e_bossid,
                    e_name:    row.e_name});

//Create index
CREATE INDEX ON :EmpBase(e_id);

// Create relationships
LOAD CSV WITH HEADERS FROM "file:/tmp/empbase.csv" AS row
MATCH (employee:EmpBase:_EmpBase    {e_id: row.e_id})
MATCH (manager:EmpBase:_EmpBase     {e_id: row.e_bossid})
MERGE (employee)-[:REPORTS_TO]->(manager);

Question:

I'm setting up a P.O.C. using Neo4j, and technically have everything I need working but would like it set up properly.

As a quick overview - I can create nodes and relationships, and traverse the graph (i.e. return all features available in a specific market) so I know these nodes/relationships have been created.

However, when I query to simply return a node based on ID, it returns ONLY the data for that node - and not any relationships or connected nodes, for example the markets it's available in.

I've looked at various examples online that return not only the node but also the connected nodes - though I follow what they're doing, I can't seem to get it to work with mine.

Feature Repository:

    @Repository
    public interface FeatureRepository<T extends Feature> extends Neo4jRepository<T, Long> {
    ...
    }

Colour Repository:

    @Repository
    public interface ColourRepository extends FeatureRepository<Colour>{
        @Query("CREATE(feat:Colour:Feature {marketingDesc:{marketing}, engineeringDesc:{engineering}, code:{code}})")
        Colour createColour(@Param("marketing") String marketingDesc, @Param("engineering") String engineeringDesc, @Param("code") String code);

        @Query("MATCH (c:Colour {code:{colourCode}}) MATCH (c)-[:AVAILABLE_IN]->(market) RETURN c AS colour, COLLECT(market) AS markets")
        Colour getColourByCode(@Param("colourCode") String colourCode);

        Colour findByCode(@Param("code") String code);
    }

Feature Entity:

    @NodeEntity(label = "Feature")
    @Getter
    @Setter
    @AllArgsConstructor
    @NoArgsConstructor
    public class Feature {
        @Id
        @GeneratedValue
        private Long id;
        private String marketingDesc;
        private String engineeringDesc;
        @Index(unique = true)
        private String code;

        @Relationship(type = "HAS_OPTION", direction = Relationship.INCOMING)
        private List<Option> options = new ArrayList<>();

        @Relationship(type = "AVAILABLE_IN")
        private List<Market> markets = new ArrayList<>();

        @Relationship(type = "HAS_PREREQUISITE", direction = Relationship.UNDIRECTED)
        private List<Prerequisite> prerequisites = new ArrayList<>();
    }

Colour Entity:

    @AllArgsConstructor
    @NodeEntity(label = "Colour")
    public class Colour extends Feature {
    }

Market Entity:

    @NodeEntity(label = "Market")
    @Getter
    @Setter
    @AllArgsConstructor
    @NoArgsConstructor
    public class Market {
        @Id
        @GeneratedValue
        private Long id;

        @Index(unique = true)
        private String code;
        private String market;

        @Relationship(type = "AVAILABLE_IN", direction = Relationship.INCOMING)
        private List<Option> features = new ArrayList<>();
    }

Relationship Entity (for features to be connected to markets they can be bought in):

    @RelationshipEntity(type = "AVAILABLE_IN")
    @Getter
    @Setter
    @AllArgsConstructor
    @NoArgsConstructor
    public class Available {
        @Id
        @GeneratedValue
        private Long Id;
        private List<String> availableIn = new ArrayList<>();

        @StartNode
        private Feature feature;
        @EndNode
        private Market market;
    }

Controller:

    @RestController
    public class ConfigController {

        private final Handler configHandler;

        public ConfigController(Handler configHandler) {
            this.configHandler = configHandler;
        }

     @PostMapping(path = "/create/colour", consumes = APPLICATION_JSON_VALUE, produces = APPLICATION_JSON_VALUE)
        public SimpleResponse createColour(@RequestBody Colour request) {
            ColourService service = new ColourService(configHandler);
            Colour created = service.createColour(request);
            return SimpleResponse.builder().result("Created:", created).build();
        }

        @PostMapping(path = "/create/market", consumes = APPLICATION_JSON_VALUE, produces = APPLICATION_JSON_VALUE)
        public SimpleResponse createMarket(@RequestBody Market request) {
            MarketService service = new MarketService(configHandler);
            Market created = service.createMarket(request);
            return SimpleResponse.builder().result("Created", created).build();
        }

        @PostMapping(path = "/create/relationship/availableIn", consumes = APPLICATION_JSON_VALUE, produces = APPLICATION_JSON_VALUE)
        public SimpleResponse createAvailableInRelationship(@RequestBody OptionAvailInRequest request){
            RelationshipService service = new RelationshipService(configHandler);
            Object result = service.createAvailableInRelationship(request);
            return SimpleResponse.builder().result("Result:", result).build();
        }

        @GetMapping(path = "/colour/{code}")
        public SimpleResponse getColourByCode(@PathVariable(value = "code") String code) {
            ColourService service = new ColourService(configHandler);
            Colour colour = service.getColourByCode(code);
            return SimpleResponse.builder().result("Colour:", colour).build();
        }

        @GetMapping(path = "/features/available/{mrktCode}")
        public SimpleResponse getFeaturesInMarket(@PathVariable(value = "mrktCode") String mrktCode){
            RelationshipService service = new RelationshipService(configHandler);
            Collection<Feature> features = service.getFeaturesInMarket(mrktCode);
            return SimpleResponse.builder().result("Features:", features).build();
        }
    }

Neo4jConfig file:

    @Configuration
    @EnableNeo4jRepositories(basePackages = "package.location")
    @EnableTransactionManagement
    public class Neo4jConfig {
        @Bean
        public org.neo4j.ogm.config.Configuration configuration() {
            org.neo4j.ogm.config.Configuration configuration =
                    new org.neo4j.ogm.config.Configuration.Builder().build();


            return configuration;
        }

        @Bean
        public SessionFactory sessionFactory(org.neo4j.ogm.config.Configuration configuration) {

            return new SessionFactory(configuration,"package.location");
        }

        @Bean
        public Neo4jTransactionManager transactionManager(SessionFactory sessionFactory) {
            return new Neo4jTransactionManager(sessionFactory);
        }
    }

So, for example, here I can create a Colour Node:

Example value:

{
  "code": "string",
  "engineeringDesc": "string",
  "id": 0,
  "marketingDesc": "string",
  "markets": [
    {
      "code": "string",
      "features": [
        {}
      ],
      "id": 0,
      "market": "string"
    }
  ],
  "options": [
    {}
  ],
  "prerequisites": [
    {}
  ]
}

What I send:

{
  "code": "BLU",
  "engineeringDesc": "Blue engineering",
  "marketingDesc": "Blue marketing"
}

And this creates a Colour Node successfully:

{
  "result": {
    "Created:": {
      "id": 0,
      "marketingDesc": "Blue marketing",
      "engineeringDesc": "Blue engineering",
      "code": "BLU",
      "options": [],
      "markets": [],
      "prerequisites": []
    }
  },
  "error": null
}

I can create a Market Node: Example Value:

{
  "code": "string",
  "features": [
    {}
  ],
  "id": 0,
  "market": "string"
}

What I send:

{
  "code": "UB",
  "market": "England"
}

Which creates a Market Node successfully:

{
  "result": {
    "Created": {
      "id": 1,
      "code": "UB",
      "market": "England",
      "features": []
    }
  },
  "error": null
}

I can then create a relationship between the two, to say that colour is available in that market:

{
  "featureCode": "BLU",
  "marketCode": "UB"
}

Which I can verify has been created by hitting: localhost:8080/features/available/UB

{
  "result": {
    "Features:": [
      {
        "id": 0,
        "marketingDesc": "Blue marketing",
        "engineeringDesc": "Blue engineering",
        "code": "BLU",
        "options": [],
        "markets": [],
        "prerequisites": []
      }
    ]
  },
  "error": null
}

However when I then go to return the Colour Node itself: localhost:8080/colour/BLU

{
  "result": {
    "Colour:": {
      "id": 0,
      "marketingDesc": "Blue marketing",
      "engineeringDesc": "Blue engineering",
      "code": "BLU",
      "options": [],
      "markets": [],
      "prerequisites": []
    }
  },
  "error": null
}

The 'markets' field is always empty. I have tried custom queries and queries built by the Neo4j helper (e.g. findByCode etc.); every example I can find successfully returns the related nodes, but I can't seem to get mine to.

Can anyone help? P.S. Please let me know if there is anything else that would be helpful for you to see. Been trying to get this sorted for days....


Answer:

Got the answer to this question...

Feature Entity should have been:

    @Relationship(type = "AVAILABLE_IN")
    @ApiModelProperty(hidden = true)
    private Set<Available> markets = new HashSet<>();

Market Entity should have been:

    @Relationship(type = "AVAILABLE_IN", direction = Relationship.INCOMING)
    @ApiModelProperty(hidden = true)
    private Set<Available> features = new HashSet<>();

Which gets the markets section of the feature JSON no longer null...

Now I have the problem that there's an infinite recursion loop between the two classes, with a feature displaying its markets and the markets displaying their features.

EDIT:

For anyone else with this/similar issues, I've found a really good github resource. GitHub neo4j ogm walkthrough

Helped a lot.

Question:

I am working with nodes that have 7 or 8 attributes each. Since Neo4j is based on node objects, if I am interested in only one of those attributes, is it faster to return the whole node and then get the attribute, or to return the attribute directly? I am talking about queries returning millions of records, and I am using the Java API to gather the results.


Answer:

If you're talking about the actual return, then returning the node will implicitly return all attributes, so it's going to be more expensive.

If you haven't done the return yet, and are still processing within the transaction, then property access won't happen until you actually access the property or properties yourself.

For either case, Cypher or Java, it's often best to withhold property access until you've done your filtering/limiting/aggregating and just use the node instead, if it makes sense for you to do so. This would avoid performing property access on nodes that may be filtered out due to these operations.
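In Cypher the difference looks like this (label and property names are placeholders):

```cypher
// Cheaper: only the requested property is returned per row
MATCH (n:Person)
RETURN n.name

// More expensive: the whole node is returned, implicitly pulling all of its properties
MATCH (n:Person)
RETURN n
```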

Question:

I'm using neo4j-ogm 3.1.5 in my project.

In my code, when I fetch any relationship entity with depth = 1, it fetches the startNode and endNode, and it also fetches the relationships of the startNode and endNode. Basically, the depth parameter works as depth = depth + 1, because the same value of depth is passed on to the nodes when fetching a relationship entity.

As far as I understand, the depth parameter is used basically like Hibernate's LAZY or EAGER loading. In the SchemaRelationshipLoadClauseBuilder class, it happens in the method

public String build(String variable, String label, int depth)
Steps to Reproduce

Fetch a relationship entity using findById method

Current Implementation

In SchemaRelationshipLoadClauseBuilder, the following method:

public String build(String variable, String label, int depth)
  • calls expand(sb, "n", start, depth) instead of expand(sb, "n", start, depth-1), AND
  • calls expand(sb, "m", end, depth) instead of expand(sb, "m", end, depth-1).

The thing is, this causes a problem in my project, as the startNode and endNode of a relationship entity can have more than 100,000 relationships of the same kind, and fetching all of those relationships will exhaust the machine's memory.

Can anyone explain why this is so?


Answer:

The reason for this behaviour is not a bug but the nature of a Cypher query: you cannot load a relationship on its own. There have to be start and end nodes to form a correct query.

The depth then gets applied to both nodes. Of course, this is to some extent a matter of definition; if stepping from a relationship to its nodes already counted as a "hop", that would also call the general depth model in Neo4j-OGM into question, because then every relationship (without touching a node) would count as one hop and reaching the node would be the next one.

Question:

This question is similar to "apoc.gephi.add doesn't work: NODE[25512922] has no property with propertyKey='name'", yet I want to add a few things. The issue is with Neo4j 3.2.12 and APOC version 3.2.0.3: if the node does not contain the property "name" (exactly this string), then it raises the error

NODE[x] has no property with propertyKey='name'

Even if the node contains properties with the substring 'name' (for example, "propertyname"), it throws the same error. On checking the code in https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/3.2/src/main/java/apoc/gephi/Gephi.java, it looks like the last block of code in the method caption(Node n) should take care of the absence of a node property named "name". However, for some reason, it is not able to detect the substring "name" in the property names. Can someone shed some light on this issue?


Answer:

I have made some tests on the latest version, and everything is working.

So I started to check the commit log of the procedure, and I found this: https://github.com/neo4j-contrib/neo4j-apoc-procedures/commit/8b25b05fa461ae0177db1b0604f628b73f12e08d#diff-d27b3f05da2e50dbcd2c95ca367b0e65

So it is a bug in the procedure, but it has been corrected in version 3.2.0.4. You just have to upgrade your APOC version.
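If you are unsure which version is running, APOC exposes it as a function (assuming APOC is installed at all):

```cypher
RETURN apoc.version();
```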

Question:

Let's say I want to create 100 nodes and import them into the Neo4j graph database. Does that mean I have to call

Node nodeName = graphDb.createNode();

100 times, and use 100 different variable names as well? That is a lot of work. Is there another way to create a large number of nodes, without writing them one by one?


Answer:

There are multiple solutions:

A. Standard Java and simple names

IntStream.range(0, 1000).forEach(i ->
    database.createNode(Label.label("Person")).setProperty("name", "person-" + i)
);

B. Use Java Faker to generate first names:

https://github.com/DiUS/java-faker

C. Use graphgen online and import into your neo4j

http://graphgen.graphaware.com

D. Use the graphgen procedure for Neo4j 3.x:

https://github.com/graphaware/neo4j-graphgen-procedure
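E. If you go through Cypher (driver or REST) rather than the embedded API, a common pattern is to build the node data as parameter maps in plain Java and create a whole batch per query, e.g. with something like UNWIND $batch AS row CREATE (:Person {name: row.name}). The query string and batch size are assumptions about your schema; the batching itself is ordinary Java:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NodeBatches {
    // Build one parameter map per node, then split into batches so each
    // batch can be sent as the $batch parameter of a single UNWIND query.
    public static List<List<Map<String, Object>>> buildBatches(int nodeCount, int batchSize) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            Map<String, Object> row = new HashMap<>();
            row.put("name", "person-" + i);
            rows.add(row);
        }
        List<List<Map<String, Object>>> batches = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += batchSize) {
            batches.add(rows.subList(start, Math.min(start + batchSize, rows.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<List<Map<String, Object>>> batches = buildBatches(100, 30);
        System.out.println(batches.size());                    // 4 batches: 30 + 30 + 30 + 10
        System.out.println(batches.get(3).size());             // last batch has 10 rows
        System.out.println(batches.get(0).get(0).get("name")); // person-0
    }
}
```

One query per batch keeps transactions small while still avoiding one round trip per node.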

Question:

I've been trying to add labels to newly created nodes using the Neo4j REST API. The following is the request I've tried (a Java REST client, not curl):

Client client = ClientBuilder.newClient();

WebTarget target = client.target("http://localhost:7474/db/data/");

String propertyUri_labels = "http://localhost:7474/db/data/node/9/labels";

Response response = target
                .path(propertyUri_labels)
                .request(MediaType.APPLICATION_JSON)
                .header("application/xml", "true")
                .accept(MediaType.APPLICATION_JSON)
                .put(Entity.entity("\"" + "Artist" + "\"", MediaType.APPLICATION_JSON_TYPE));
        
System.out.println( String.format( "PUT to [%s], status code [%d]",
                propertyUri_labels, response.getStatus() ) );

Answer:

Try a POST instead of a PUT, i.e. replace the .put(...) call with .post(...) and keep the same entity body. Check the documentation => http://neo4j.com/docs/rest-docs/current/#rest-api-node-labels

Question:

I am trying to save a new entity that contains another new entity of a different type, together with a new relationship between them, and it fails. Basically, I am trying to understand transitive persistence.

Spring Data Neo4j version: 3.3.2.RELEASE; Neo4j server: neo4j-community-2.2.3

Here is what I have tested:

SUCCEED: save new entities separately and then create/save a relationship: entity A, entity B, relationship C between A and B.

FAILED1: save new entities separately, then create several relationships of the same type and save them: entities A1, A2, entities B1, B2, relationships C1, C2, connected as A1-C1-B1, A1-C2-B2, A2-C1-B2.

Result: I get, of course, the A and B entities, but no C relationships.

FuncModerate fm1 = new FuncModerate(m, s, "HEAL");
FuncModerate fm2 = new FuncModerate(m2, s1, "HEAL");
FuncModerate fm3 = new FuncModerate(m3, s1, "HEAL");

Set<FuncModerate> fms = new HashSet<FuncModerate>();
fms.add(fm1);
fms.add(fm2);
fms.add(fm3);
neo4jOps.save(fms); // Exception occurs

Exception Log:

java.lang.NullPointerException
at org.springframework.data.neo4j.support.Neo4jTemplate.getMappingPolicy(Neo4jTemplate.java:549)
at org.springframework.data.neo4j.support.Neo4jTemplate.getMappingPolicy(Neo4jTemplate.java:726)
at org.springframework.data.neo4j.support.Neo4jTemplate.save(Neo4jTemplate.java:354)
at org.ming.controller.Neo4JController.newF(Neo4JController.java:72)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)

FAILED2: save a new entity A, create two entities B, set them on A, and save A again. The first B entity is saved with the right relationship, but then an exception occurs and the second B entity is not saved.

Entity A:

@NodeEntity
public class Medicine {
    @GraphId Long id;

    @RelatedTo(type = "HEAL")
    private Set<Symptom> symptom;

    ...

 }

Entity B:

@NodeEntity
public class Symptom {
    @GraphId Long nodeId;

    ...

}

Controller:

@RequestMapping(value = "/new", method = RequestMethod.POST)
public void newF() {
    Medicine m = new Medicine("moderate hungry");
    neo4jOps.save(m);
    Symptom s = new Symptom("much confidence");
    Symptom s2 = new Symptom("less angry");
    Set<Symptom> ss = new HashSet<Symptom>();
    ss.add(s);  // got saved
    ss.add(s2); // not got saved
    m.setSymptom(ss);
    neo4jOps.save(m);

    ...

}

Exception Log:

GRAVE: Servlet.service() for servlet [dispatcher] in context with path     [/springlearn] threw exception [Request processing failed; nested     exception is org.neo4j.graphdb.NotFoundException: '__type__' on     http://localhost:7474/db/data/relationship/14] with root cause
org.neo4j.graphdb.NotFoundException: '__type__' on     http://localhost:7474/db/data/relationship/14
    at     org.neo4j.rest.graphdb.entity.RestEntity.getProperty(RestEntity.java:125)
at org.springframework.data.neo4j.support.typerepresentation.AbstractIndexBasedTypeRepresentationStrategy.readAliasFrom(AbstractIndexBasedTypeRepresentationStrategy.java:126)
at org.springframework.data.neo4j.support.mapping.TRSTypeAliasAccessor.readAliasFrom(TRSTypeAliasAccessor.java:36)
at org.springframework.data.neo4j.support.mapping.TRSTypeAliasAccessor.readAliasFrom(TRSTypeAliasAccessor.java:26)
at org.springframework.data.convert.DefaultTypeMapper.readType(DefaultTypeMapper.java:102)
at org.springframework.data.convert.DefaultTypeMapper.getDefaultedTypeToBeUsed(DefaultTypeMapper.java:165)
at org.springframework.data.convert.DefaultTypeMapper.readType(DefaultTypeMapper.java:142)
at org.springframework.data.neo4j.support.mapping.Neo4jEntityConverterImpl.read(Neo4jEntityConverterImpl.java:77)

Answer:

I solved the problem for FAILED2: the B entities must be saved to the database before saving A, which references them as relationship-end nodes. Then the relationship declared by the @RelatedTo annotation from A to B is created. If a B entity does not exist when A is saved, the exception above is thrown.

Question:

I am developing a question answering application using Neo4j in Java. For that I need to find the intermediate nodes between two given nodes, through any relationship.

For Example given Graph:

A - x -> C
B - y -> C

Therefore, given the nodes [A, B], the output should be [C], because it is connected to both A and B through relationships x and y respectively. Is this possible using the Java driver for Neo4j?

Thanks


Answer:

If A and B have ids 1 and 2, the Cypher query you want looks something like:

MATCH (A)--(C)--(B)
WHERE id(A) = 1 AND id(B) = 2
RETURN C

Run this query from your Java setup and you should be good to go.
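For intuition, what the query computes is the intersection of the neighbor sets of A and B. On a plain adjacency map the same logic looks like this in Java (the map-based graph representation here is an assumption for illustration, not the driver API):

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class IntermediateNodes {
    // Nodes connected to both a and b, regardless of relationship type:
    // the intersection of the two neighbor sets.
    public static Set<String> intermediates(Map<String, Set<String>> neighbors, String a, String b) {
        Set<String> result = new HashSet<>(neighbors.getOrDefault(a, Set.of()));
        result.retainAll(neighbors.getOrDefault(b, Set.of()));
        return result;
    }

    public static void main(String[] args) {
        // A -x-> C, B -y-> C, viewed as undirected neighbor sets
        Map<String, Set<String>> neighbors = Map.of(
                "A", Set.of("C"),
                "B", Set.of("C"),
                "C", Set.of("A", "B"));
        System.out.println(intermediates(neighbors, "A", "B")); // [C]
    }
}
```

In the database the MATCH pattern does this intersection for you; the Java side only needs to run the query and read the C column of the result.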