Hot questions for using Neo4j on the JVM

Question:

I am using Mac OS X Yosemite and I have the Java 8 JRE installed, but when I start Neo4j, it is unable to find any JVMs matching version 1.7.


Answer:

The Neo4j startup script looks for JAVA_HOME and, if it is not set, defaults to Java 1.7. So, set JAVA_HOME in your .bash_profile file:

export JAVA_HOME=$(/usr/libexec/java_home)

Then, either restart your Terminal or do

$ source ~/.bash_profile

Now, if you restart Neo4j, you shouldn't see the error message.
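
If you have more than one JDK installed, you can also pin JAVA_HOME to a specific major version rather than taking the default. A minimal sketch, assuming a Java 8 JDK is present (the version argument is only an example):

export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
echo $JAVA_HOME   # verify which JDK was picked up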

Question:

I'm running neo4j 2.3.0-RC1 embedded, using the Java API. It keeps crashing without warning, and I'm trying to figure out why.

I was previously using this code just fine with 1.9.8. Upgrading to 2.0+ has required adding transactions, altering some Cypher syntax and the boot-time Spring config, and a handful of other changes.

The vast majority of the code remains the same, and is functionally correct as confirmed by unit and integration tests.

When the engine is booted, it is adding new nodes on a fairly constant basis. The output below shows a mysterious crash after 290 minutes.

This happens consistently: sometimes after 2 hours, sometimes after 5. It never happened at all with 1.9.8.

The JVM is run with ./start-engine.sh > console.out 2>&1 &.

The operative line of start-engine.sh is

$JAVA_HOME/bin/java -server $JAVA_OPTIONS $JPROFILER_OPTIONS -cp '.:lib/*' package.engine.Main $*

Below are the last few lines of console.out.

17437.902: RevokeBias                       [     112          6              5    ]      [    20     6    27    43    26    ]  1
17438.020: RevokeBias                       [     112          3              9    ]      [     5     0     5     0     0    ]  3
17438.338: GenCollectForAllocation          [     113          2              2    ]      [     1     0    11     4    32    ]  2
17438.857: BulkRevokeBias                   [     112          3             13    ]      [     0     0    28     6     2    ]  3
./start-engine.sh: line 17: 19647 Killed                  $JAVA_HOME/bin/java -server $JAVA_OPTIONS $JPROFILER_OPTIONS -cp '.:lib/*' package.engine.Main $*

There is no stacktrace, and no other error output.

These are the last few lines of messages.log from within /mnt/engine-data

2015-10-30 18:07:44.457+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [845664646]:  Starting check pointing...
2015-10-30 18:07:44.458+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [845664646]:  Starting store flush...
2015-10-30 18:07:44.564+0000 INFO  [o.n.k.i.s.c.CountsTracker] About to rotate counts store at transaction 845664650 to [/mnt/engine-data/neostore.counts.db.b], from [/mnt/engine-data/neostore.counts.db.a].
2015-10-30 18:07:44.565+0000 INFO  [o.n.k.i.s.c.CountsTracker] Successfully rotated counts store at transaction 845664650 to [/mnt/engine-data/neostore.counts.db.b], from [/mnt/engine-data/neostore.counts.db.a].
2015-10-30 18:07:44.834+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [845664646]:  Store flush completed
2015-10-30 18:07:44.835+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [845664646]:  Starting appending check point entry into the tx log...
2015-10-30 18:07:44.836+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [845664646]:  Appending check point entry into the tx log completed
2015-10-30 18:07:44.836+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [845664646]:  Check pointing completed
2015-10-30 18:07:44.836+0000 INFO  [o.n.k.i.t.l.p.LogPruningImpl] Log Rotation [35826]:  Starting log pruning.
2015-10-30 18:07:44.844+0000 INFO  [o.n.k.i.t.l.p.LogPruningImpl] Log Rotation [35826]:  Log pruning complete.

So everything looks fine up until the moment of crash, and the crash comes as a complete surprise.

There is plenty of other data in messages.log, but I don't know what I'm looking for.


$ java -version
java version "1.7.0_65"
Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)

$ uname -a
Linux 3.13.0-65-generic #106-Ubuntu SMP Fri Oct 2 22:08:27 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Answer:

You may be seeing the effects of the Linux Out-of-Memory Killer, which will kill processes when the system is running critically low on physical memory. This can explain why you found nothing in the logs.

To quote this excellent article:

Because many applications allocate their memory up front and often don't utilize the memory allocated, the kernel was designed with the ability to over-commit memory to make memory usage more efficient. ……… When too many applications start utilizing the memory they were allocated, the over-commit model sometimes becomes problematic and the kernel must start killing processes …

The above-quoted article is a great resource for learning about the OOM Killer, and includes a lot of information on how to troubleshoot and configure Linux to try to avoid the problem.

And to quote the answer to this question:

The OOM Killer has to select the best process to kill. Best here refers to that process which will free up maximum memory upon killing and is also least important to the system.

Since it is very possible that the neo4j process is the most memory-intensive process on your system, it makes sense that it would be killed when physical resources start running out.

One way to avoid the OOM Killer is to try to keep other memory-intensive processes off of the same system. That should make over-commitment of memory much less likely. But you should at least read the first article above, to understand the OOM Killer better -- there is a lot to know.
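
If you want to confirm that the OOM Killer was responsible, the kernel leaves a record of each kill. A rough sketch using standard Linux tools (the process name comes from the start-engine.sh command above; log file paths vary by distribution):

dmesg | grep -i 'killed process'                        # kernel log entry written by the OOM Killer
grep -i 'out of memory' /var/log/syslog                 # the same event via syslog, if it is captured there
cat /proc/$(pgrep -f package.engine.Main)/oom_score     # how likely the running JVM is to be chosen next time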

Question:

I am running Neo4j 2.1.7 on a Windows 8.1 Pro laptop. I have 16 GB of RAM, but I keep running out of heap memory. I have a largish database of maybe 250K nodes, but nothing close to what I am planning to run.

I have set -Xmx to 1024m in neo4j-community.vmoptions. I tried to increase it further, but then neo4j-community.exe won't start up.

Any advice would be gratefully received

Regards, Richard


Answer:

As per the official docs,

When using Neo4j Server, JVM configuration goes into the conf/neo4j-wrapper.conf file

So set the heap size in the conf/neo4j-wrapper.conf file like this:

wrapper.java.additional=-Xmx4g

On Windows, that folder and file may not exist by default. See this for a similar issue; you need to create the folder and the file inside your Neo4j installation directory.

You said that using -Xmx4g results in the error "The JVM could not be started. The maximum heap size (-Xmx) might be too large or an antivirus or firewall tool could be blocking execution."

That error means the JVM could not obtain the 4 GB it asked for at startup. With -Xmx4g, the JVM asks the host OS to reserve a 4 GB block up front for future use. Since you already have 16 GB of RAM, check whether another process is taking too much memory.
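
For reference, here is a sketch of what the relevant part of conf/neo4j-wrapper.conf might look like for a 2.x server install; the exact keys and defaults vary between releases, so compare against the template file shipped with your version:

# Initial and maximum JVM heap size, in MB
wrapper.java.initmemory=1024
wrapper.java.maxmemory=4096
# Extra JVM flags go on additional lines
wrapper.java.additional=-XX:+UseConcMarkSweepGC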

Question:

I have a problem using neo4j-shell on my Mac. When I tried to start it up, this information showed up.

But I have Xms and Xmx set to 128 and 512 respectively, and I did that in my .bash_profile file.

How can I fix this problem? Thx!


Answer:

You're getting caught by a typo. Instead of -Xmx512 it should be -Xmx512m.

You're specifying that the initial size should be 128MB, and that the maximum size should be 512 bytes. Without the m, that doesn't make sense. :)

EDIT - I tried your sample, and upped -XX:MaxPermSize to 512m, and it works for me:

export JAVA_OPTS="-Xms128m -Xmx512m -XX:PermSize=128m -XX:MaxPermSize=512m"
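
To double-check what the JVM actually picks up from JAVA_OPTS, you can ask it to print its final flag values. A small sketch (the grep pattern is just a convenience; flag names differ slightly between Java versions):

java $JAVA_OPTS -XX:+PrintFlagsFinal -version 2>/dev/null | grep -Ei 'initialheapsize|maxheapsize|maxpermsize'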

Question:

I'm trying to load a graph of several hundred million nodes from CSV using the neo4j-admin import tool. The import runs for about two hours but then crashes with the following error:

Exception in thread "Thread-0" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.lang.String.substring(String.java:1969)
at java.util.Formatter.parse(Formatter.java:2557)
at java.util.Formatter.format(Formatter.java:2501)
at java.util.Formatter.format(Formatter.java:2455)
at java.lang.String.format(String.java:2940)
at org.neo4j.unsafe.impl.batchimport.input.BadCollector$RelationshipsProblemReporter.getReportMessage(BadCollector.java:209)
at org.neo4j.unsafe.impl.batchimport.input.BadCollector$RelationshipsProblemReporter.message(BadCollector.java:195)
at org.neo4j.unsafe.impl.batchimport.input.BadCollector.processEvent(BadCollector.java:93)
at org.neo4j.unsafe.impl.batchimport.input.BadCollector$$Lambda$110/603650290.accept(Unknown Source)
at org.neo4j.concurrent.AsyncEvents.process(AsyncEvents.java:137)
at org.neo4j.concurrent.AsyncEvents.run(AsyncEvents.java:111)
at java.lang.Thread.run(Thread.java:748)

I've been trying to adjust my maximum and initial heap size settings in a few different ways. First I tried simply setting a HEAP_SIZE= variable before running the import command, as described here, and I also tried setting the heap size on the JVM like this:

export JAVA_OPTS=%JAVA_OPTS% -Xms100g -Xmx100g

but whatever setting I use, when the import starts I get the same report:

Available resources:
  Total machine memory: 1.48 TB
  Free machine memory: 95.00 GB
  Max heap memory : 26.67 GB
  Processors: 48
  Configured max memory: 1.30 TB
  High-IO: true

As you can see, I'm building this on a large server that should have plenty of resources available. I'm assuming I'm not setting the JVM parameters correctly for Neo4j, but I can't find anything online showing me the correct way to do this.

What might be causing my GC memory error and how can I resolve it? Is this something I can resolve by throwing more resources at the JVM and if so, how do I do that so the neo4j-admin import tool can use it?

RHEL 7, Neo4j CE 3.4.11, Java 1.8.0_131


Answer:

The issue was resolved by increasing the maximum heap size. The problem was that I wasn't setting the heap allocation correctly.

It turns out there was a simple solution; it was just a matter of where I set the heap memory. Initially, I had run export JAVA_OPTS='-server -Xms300g -Xmx300g' at the command line and then run my bash script to call neo4j-admin import. This was not working; neo4j-admin import continued to use the same heap configuration regardless.

The solution was to simply include the command to set the heap memory in the shell script that called neo4j-admin import. My shell script ended up looking like this:

#!/bin/bash

export JAVA_OPTS='-server -Xms300g -Xmx300g'

/usr/local/neo4j-community-3.4.11/bin/neo4j-admin import \
--ignore-missing-nodes=true \
--database=mag_cs2.graphdb \
--multiline-fields=true \
--high-io=true

This seems super obvious but it took me almost a week to realize what I needed to change. Hopefully, this saves someone else the same headache.
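
As an aside, the 3.x neo4j-admin tool also reads a HEAP_SIZE environment variable for exactly this purpose (the question above mentions it), so a variant of the script could set that instead of JAVA_OPTS. A sketch under the same assumptions, with the path and flags copied from the script above:

#!/bin/bash

# Ask neo4j-admin for a 300 GB heap while the import runs
export HEAP_SIZE=300g

/usr/local/neo4j-community-3.4.11/bin/neo4j-admin import \
--ignore-missing-nodes=true \
--database=mag_cs2.graphdb \
--multiline-fields=true \
--high-io=true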