Neo4j: Improving the Performance of Counting Relationships

Posted By : Yasir Zuberi | 31-Aug-2019

One challenge that almost every Neo4j user encounters is poor performance when working with node relationships.

 

I faced this problem when I had to count the number of relationships attached to a node.

 

To understand the performance improvement, let's create a few nodes and some relationships between them.


CREATE (n:Person { name: 'Mike', profile: 'Software Developer' });

CREATE (n:ProgrammingLanguage { name: 'Java', description: 'Java Programming Language' });
CREATE (n:ProgrammingLanguage { name: 'Php', description: 'Php Programming Language' });
CREATE (n:ProgrammingLanguage { name: 'Python', description: 'Python Programming Language' });
CREATE (n:ProgrammingLanguage { name: 'Go', description: 'Go Programming Language' });
CREATE (n:ProgrammingLanguage { name: 'C++', description: 'C++ Programming Language' });

So we have created one Person node and five ProgrammingLanguage nodes.

 

	// AND binds tighter than OR, so the language names are grouped with IN;
	// otherwise every Person would be linked to Php, Python, Go and C++.
	MATCH (developer:Person),(language:ProgrammingLanguage)
	WHERE developer.name = 'Mike' AND language.name IN ['Java', 'Php', 'Python', 'Go', 'C++']
	CREATE (developer)-[r:KNOW_LANGUAGE]->(language);
        

Now we have created relationships between the Person node and the ProgrammingLanguage nodes.
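
As a quick sanity check before profiling, the query below lists the languages now linked to Mike. It is only a sketch and assumes the database contains nothing but the sample data created above; with that data it should return the five language names.

```cypher
// List every language Mike is connected to via KNOW_LANGUAGE
MATCH (developer:Person {name: 'Mike'})-[:KNOW_LANGUAGE]->(language:ProgrammingLanguage)
RETURN language.name AS language
ORDER BY language;
```

If fewer or more than five rows come back, the setup statements were probably run more than once or against a non-empty database, and the counts in the following sections will differ.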

 

Let's take a look at two entirely different approaches for counting the total number of incoming and outgoing relationships of the Person named Mike.

 

First approach: using count(*) (lower performance in terms of execution time and database accesses)

 

	MATCH (n:Person {name:'Mike'})-[]-() RETURN count(*);
        Output : 5

// For a detailed analysis, prefix the query with PROFILE
	PROFILE MATCH (n:Person {name:'Mike'})-[]-() RETURN count(*);

Query Results
+----------+
| count(*) |
+----------+
| 5        |
+----------+
1 row
21 ms

Execution Plan
Compiler CYPHER 3.5

Planner COST

Runtime INTERPRETED

Runtime version 3.5

+-------------------+----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
| Operator          | Estimated Rows | Rows | DB Hits | Page Cache Hits | Page Cache Misses | Page Cache Hit Ratio | Variables               | Other                     |
+-------------------+----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
| +ProduceResults   |              1 |    1 |       0 |               0 |                 0 |               0.0000 | count(*)                |                           |
| |                 +----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
| +EagerAggregation |              1 |    1 |       0 |               0 |                 0 |               0.0000 | count(*)                |                           |
| |                 +----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
| +Expand(All)      |              1 |    5 |       6 |               0 |                 0 |               0.0000 | anon[31], anon[35] -- n | (n)--()                   |
| |                 +----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
| +Filter           |              0 |    1 |       1 |               0 |                 0 |               0.0000 | n                       | n.name = $`  AUTOSTRING0` |
| |                 +----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
| +NodeByLabelScan  |              1 |    1 |       2 |               0 |                 0 |               0.0000 | n                       | :Person                   |
+-------------------+----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+

Total database accesses: 9

        

Second approach: using size() (better performance in terms of execution time and database accesses)

	MATCH (n:Person {name:'Mike'}) RETURN size((n)-[]-());
        Output : 5

// For a detailed analysis, prefix the query with PROFILE
	PROFILE MATCH (n:Person {name:'Mike'}) RETURN size((n)-[]-());

Query Results
+-----------------+
| size((n)-[]-()) |
+-----------------+
| 5               |
+-----------------+
1 row
16 ms

Execution Plan
Compiler CYPHER 3.5

Planner COST

Runtime INTERPRETED

Runtime version 3.5

+------------------+----------------+------+---------+-----------------+-------------------+----------------------+----------------------+------------------------------------------------------+
| Operator         | Estimated Rows | Rows | DB Hits | Page Cache Hits | Page Cache Misses | Page Cache Hit Ratio | Variables            | Other                                                |
+------------------+----------------+------+---------+-----------------+-------------------+----------------------+----------------------+------------------------------------------------------+
| +ProduceResults  |              0 |    1 |       0 |               0 |                 0 |               0.0000 | n, size((n)-[]-())   |                                                      |
| |                +----------------+------+---------+-----------------+-------------------+----------------------+----------------------+------------------------------------------------------+
| +Projection      |              0 |    1 |       1 |               0 |                 0 |               0.0000 | size((n)-[]-()) -- n | {size((n)-[]-()) : GetDegree(Variable(n),None,BOTH)} |
| |                +----------------+------+---------+-----------------+-------------------+----------------------+----------------------+------------------------------------------------------+
| +Filter          |              0 |    1 |       1 |               0 |                 0 |               0.0000 | n                    | n.name = $`  AUTOSTRING0`                            |
| |                +----------------+------+---------+-----------------+-------------------+----------------------+----------------------+------------------------------------------------------+
| +NodeByLabelScan |              1 |    1 |       2 |               0 |                 0 |               0.0000 | n                    | :Person                                              |
+------------------+----------------+------+---------+-----------------+-------------------+----------------------+----------------------+------------------------------------------------------+

Total database accesses: 4
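
The same size() form can also be restricted by direction or relationship type, which is useful when only incoming or only outgoing relationships matter. The query below is a minimal sketch against the sample data from this post; the aliases outgoing, incoming and knownLanguages are illustrative names, not part of the original queries.

```cypher
// Count Mike's relationships by direction and by type (Cypher 3.5 syntax)
MATCH (n:Person {name: 'Mike'})
RETURN size((n)-->())                 AS outgoing,       // relationships starting at n
       size((n)<--())                 AS incoming,       // relationships ending at n
       size((n)-[:KNOW_LANGUAGE]->()) AS knownLanguages; // outgoing, one type only
```

With only the sample data loaded, all five relationships point from Mike to a language, so outgoing and knownLanguages should both be 5 and incoming should be 0. Running the query with PROFILE is a simple way to confirm that each expression is still planned as a degree lookup rather than an expand.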

 

The table below summarizes the performance difference between the two Cypher queries:

+--------------------------------------------------------+---------------------+-------------------------+
| Cypher Query                                           | Execution Time (ms) | Total database accesses |
+--------------------------------------------------------+---------------------+-------------------------+
| MATCH (n:Person {name:'Mike'})-[]-() RETURN count(*);  |         21          |            9            |
+--------------------------------------------------------+---------------------+-------------------------+
| MATCH (n:Person {name:'Mike'}) RETURN size((n)-[]-()); |         16          |            4            |
+--------------------------------------------------------+---------------------+-------------------------+

 

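Looking at the data in the table above, the second approach is clearly cheaper: it needs 4 database accesses instead of 9 and ran in 16 ms instead of 21 ms. The execution plans show why. The first query uses Expand(All), which produced one row per relationship (5 rows, 6 DB hits), while the second query replaces it with a GetDegree projection that cost a single DB hit. On this tiny dataset the saving is small, but the cost of Expand(All) grows with the number of relationships on the node, so the gap can be expected to widen on larger graphs.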
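
The plans in this post come from Cypher 3.5. In Neo4j 5, size() no longer accepts a pattern expression such as (n)-[]-(), and the COUNT subquery expression is the replacement. The query below is a sketch of the equivalent form, assuming a Neo4j 5 database with the same sample data; the alias degree is illustrative.

```cypher
// Neo4j 5 equivalent of size((n)-[]-()): a COUNT subquery expression
MATCH (n:Person {name: 'Mike'})
RETURN COUNT { (n)--() } AS degree;
```

Whether the planner turns this into a degree lookup depends on the version and the pattern, so it is worth checking the plan with PROFILE, as done above, rather than assuming the same DB hit counts.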

About Author

Yasir Zuberi

Yasir is a Lead Developer. He is a bright Java and Grails developer and has worked on the development of various SaaS applications using the Grails framework.