Neo4j: improving the performance of counting relationships
Posted By : Yasir Zuberi | 31-Aug-2019
One of the challenges that almost every Neo4j user encounters is poor performance when working with node relationships.
I ran into this problem when I had to count the number of relationships attached to a node.
To understand the performance improvement, let's create a few nodes and relationships between them.
CREATE (n:Person { name: 'Mike', profile: 'Software Developer' });
CREATE (n:ProgrammingLanguage { name: 'Java', description: 'Java Programming Language' });
CREATE (n:ProgrammingLanguage { name: 'Php', description: 'Php Programming Language' });
CREATE (n:ProgrammingLanguage { name: 'Python', description: 'Python Programming Language' });
CREATE (n:ProgrammingLanguage { name: 'Go', description: 'Go Programming Language' });
CREATE (n:ProgrammingLanguage { name: 'C++', description: 'C++ Programming Language' });
So we have created one Person node and five ProgrammingLanguage nodes.
// Link Mike to each of the five languages
MATCH (developer:Person),(language:ProgrammingLanguage)
WHERE developer.name = 'Mike' AND language.name IN ['Java', 'Php', 'Python', 'Go', 'C++']
CREATE (developer)-[r:KNOW_LANGUAGE]->(language);
Now we have created relationships between the Person node and the ProgrammingLanguage nodes.
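A quick way to confirm the setup is to list what Mike is connected to. This is a minimal sketch that uses only the labels, property names and relationship type created above; the alias languageName is just an illustrative name. If the statements were run once against an empty database, it should return the five language names.

```cypher
// List the languages linked to Mike through KNOW_LANGUAGE
MATCH (developer:Person {name: 'Mike'})-[:KNOW_LANGUAGE]->(language:ProgrammingLanguage)
RETURN language.name AS languageName
ORDER BY languageName;
```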
Let's take a look at two entirely different approaches for calculating the total number of incoming and outgoing relationships for the Person named Mike.
First approach: using count (lower performance in terms of execution time and database accesses)
MATCH (n:Person {name:'Mike'})-[]-() RETURN count(*);
Output : 5
// For a detailed analysis, prefix the query with PROFILE
PROFILE MATCH (n:Person {name:'Mike'})-[]-() RETURN count(*);
Detailed Query Results
Query Results
+----------+
| count(*) |
+----------+
| 5 |
+----------+
1 row
21 ms
Execution Plan
Compiler CYPHER 3.5
Planner COST
Runtime INTERPRETED
Runtime version 3.5
+-------------------+----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
| Operator | Estimated Rows | Rows | DB Hits | Page Cache Hits | Page Cache Misses | Page Cache Hit Ratio | Variables | Other |
+-------------------+----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
| +ProduceResults | 1 | 1 | 0 | 0 | 0 | 0.0000 | count(*) | |
| | +----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
| +EagerAggregation | 1 | 1 | 0 | 0 | 0 | 0.0000 | count(*) | |
| | +----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
| +Expand(All) | 1 | 5 | 6 | 0 | 0 | 0.0000 | anon[31], anon[35] -- n | (n)--() |
| | +----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
| +Filter | 0 | 1 | 1 | 0 | 0 | 0.0000 | n | n.name = $` AUTOSTRING0` |
| | +----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
| +NodeByLabelScan | 1 | 1 | 2 | 0 | 0 | 0.0000 | n | :Person |
+-------------------+----------------+------+---------+-----------------+-------------------+----------------------+-------------------------+---------------------------+
Total database accesses: 9
Second approach: using size (better performance in terms of execution time and database accesses)
MATCH (n:Person {name:'Mike'}) RETURN size((n)-[]-());
Output : 5
// For a detailed analysis, prefix the query with PROFILE
PROFILE MATCH (n:Person {name:'Mike'}) RETURN size((n)-[]-());
Detailed Query Results
Query Results
+-----------------+
| size((n)-[]-()) |
+-----------------+
| 5 |
+-----------------+
1 row
16 ms
Execution Plan
Compiler CYPHER 3.5
Planner COST
Runtime INTERPRETED
Runtime version 3.5
+------------------+----------------+------+---------+-----------------+-------------------+----------------------+----------------------+------------------------------------------------------+
| Operator | Estimated Rows | Rows | DB Hits | Page Cache Hits | Page Cache Misses | Page Cache Hit Ratio | Variables | Other |
+------------------+----------------+------+---------+-----------------+-------------------+----------------------+----------------------+------------------------------------------------------+
| +ProduceResults | 0 | 1 | 0 | 0 | 0 | 0.0000 | n, size((n)-[]-()) | |
| | +----------------+------+---------+-----------------+-------------------+----------------------+----------------------+------------------------------------------------------+
| +Projection | 0 | 1 | 1 | 0 | 0 | 0.0000 | size((n)-[]-()) -- n | {size((n)-[]-()) : GetDegree(Variable(n),None,BOTH)} |
| | +----------------+------+---------+-----------------+-------------------+----------------------+----------------------+------------------------------------------------------+
| +Filter | 0 | 1 | 1 | 0 | 0 | 0.0000 | n | n.name = $` AUTOSTRING0` |
| | +----------------+------+---------+-----------------+-------------------+----------------------+----------------------+------------------------------------------------------+
| +NodeByLabelScan | 1 | 1 | 2 | 0 | 0 | 0.0000 | n | :Person |
+------------------+----------------+------+---------+-----------------+-------------------+----------------------+----------------------+------------------------------------------------------+
Total database accesses: 4
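The same size pattern can be narrowed to a single relationship type or direction. The queries below are a sketch based on the sample data created earlier; the aliases knownLanguages and incoming are illustrative names. On Cypher 3.5 the planner would be expected to answer these with a degree lookup as well, but that is an assumption worth confirming with PROFILE on your own version.

```cypher
// Outgoing KNOW_LANGUAGE relationships only
MATCH (n:Person {name:'Mike'})
RETURN size((n)-[:KNOW_LANGUAGE]->()) AS knownLanguages;

// Incoming relationships of any type
MATCH (n:Person {name:'Mike'})
RETURN size((n)<-[]-()) AS incoming;
```

With the sample data, the first query should return 5 and the second 0, since Mike only has outgoing relationships.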
The table below shows the performance difference between the two Cypher queries:
+--------------------------------------------------------+---------------------+-------------------------+
| Cypher query                                           | Execution time (ms) | Total database accesses |
+--------------------------------------------------------+---------------------+-------------------------+
| MATCH (n:Person {name:'Mike'})-[]-() RETURN count(*);  | 21                  | 9                       |
+--------------------------------------------------------+---------------------+-------------------------+
| MATCH (n:Person {name:'Mike'}) RETURN size((n)-[]-()); | 16                  | 4                       |
+--------------------------------------------------------+---------------------+-------------------------+
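The second approach needs fewer than half the database accesses (4 versus 9) and also ran faster (16 ms versus 21 ms). The execution plans show why: size((n)-[]-()) is planned as a GetDegree lookup on the node, while the count(*) query has to walk every relationship with Expand(All) before aggregating, so its cost grows with the number of relationships.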
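Note that the plans above come from Cypher 3.5. To the best of my knowledge, Neo4j 5 no longer accepts a pattern inside size() and provides a COUNT { } subquery expression instead. The query below is a sketch of the equivalent under that assumption (the alias degree is illustrative); run it with PROFILE to check whether your version still plans it as a degree lookup.

```cypher
// Neo4j 5 style: COUNT subquery instead of size() over a pattern
MATCH (n:Person {name:'Mike'})
RETURN COUNT { (n)--() } AS degree;
```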
About Author
Yasir Zuberi
Yasir is a Lead Developer. He is a bright Java and Grails developer and has worked on the development of various SaaS applications using the Grails framework.