Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Postgres daemon uses 100% of CPU for several minutes despite indexing deactivated #12190

Open
2 tasks done
ytzemih opened this issue Nov 14, 2024 · 12 comments
Open
2 tasks done

Comments

@ytzemih
Copy link

ytzemih commented Nov 14, 2024

JabRef version

Latest development branch build (please note build date below)

Operating system

GNU / Linux

Details on version and operating system

JabRef 6.0--2024-11-13--5f1dd59 Linux 6.1.0-26-amd64 amd64 Java 23.0.1 JavaFX 23.0.1+4

Checked with the latest development build (copy version output from About dialog)

  • I made a backup of my libraries before testing the latest development version.
  • I have tested the latest development version and the problem persists

Steps to reproduce the behaviour

I can't exactly replicate this issue, it occurs a few times a day. The following steps might increase the likelihood to stimulate the issue:

  1. I've generally deactivated "fulltext search" or background features in the preference pane.
  2. Use (global) search and leave search term in the search field (might be an optional step).
  3. Add a new entry to a large DB OR edit the comments field and other field of an existing entry.
  4. Save file.

I'm using top to see that there is a postgre instance spun off and to check its CPU load. This issue has not occurred under JR5.x.

Appendix

Log File
78754 xxx     25   5  235792  56520  50536 R  99.7   0.4   7:35.39 postgres
@ytzemih
Copy link
Author

ytzemih commented Nov 14, 2024

When killing the postgre instance (needed, otherwise my laptop will die), I'm getting about 10-15 of such exceptions thrown:

2024-11-14 13:47:06 [pool-2-thread-2] org.jabref.logic.search.retrieval.BibFieldsSearcher.search()
ERROR: Error during bib fields search execution: org.postgresql.util.PSQLException: This connection has been closed.
	at org.jabref.merged.module@6.0.6575/org.postgresql.jdbc.PgConnection.checkClosed(Unknown Source)
	at org.jabref.merged.module@6.0.6575/org.postgresql.jdbc.PgConnection.prepareStatement(Unknown Source)
	at org.jabref.merged.module@6.0.6575/org.postgresql.jdbc.PgConnection.prepareStatement(Unknown Source)
	at org.jabref@6.0.6575/org.jabref.logic.search.retrieval.BibFieldsSearcher.search(Unknown Source)
	at org.jabref@6.0.6575/org.jabref.logic.search.retrieval.BibFieldsSearcher.isMatched(Unknown Source)
	at org.jabref@6.0.6575/org.jabref.logic.search.IndexManager.isEntryMatched(Unknown Source)
	at org.jabref@6.0.6575/org.jabref.gui.groups.GroupNodeViewModel$SearchIndexListener.lambda$listen$1(Unknown Source)
	at org.jabref@6.0.6575/org.jabref.logic.util.BackgroundTask$2.call(Unknown Source)
	at org.jabref@6.0.6575/org.jabref.logic.util.BackgroundTask$2.call(Unknown Source)
	at org.jabref@6.0.6575/org.jabref.gui.util.UiTaskExecutor$1.call(Unknown Source)
	at javafx.graphics@23.0.1/javafx.concurrent.Task$TaskCallable.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)

I don't know why such a comprehensive search is triggered when just editing some entry fields... I've not been using search, but I realise that my global search is non-empty. Apparently, it seems to be the search update is again triggered to often. I realise that there is a related issue with JR5, right?

@ytzemih
Copy link
Author

ytzemih commented Nov 14, 2024

Perhaps it helps to distinguish two search triggers,

  1. one when a user requests a search (that should be immediate)
  2. one when some update is made in the DB (that should be delayed to a bunch of updates)

@koppor koppor added this to the 6.0-alpha milestone Nov 14, 2024
@ytzemih
Copy link
Author

ytzemih commented Nov 26, 2024

I'm getting this behaviour quite often. So, just reporting further observations:
There are up to 5 threads pool-2-thread-x with x=1..5. Now, killing postgre immediately throws 100s of these errors within a second:

2024-11-26 17:08:00 [pool-2-thread-5] org.jabref.logic.search.retrieval.BibFieldsSearcher.search()
ERROR: Error during bib fields search execution: org.postgresql.util.PSQLException: This connection has been closed.

Before my CPU kept spinning, which suggests to me a classical DB deadlock situation (spinlocks) created by several (5?) java postgre calls not being synchronized (that is, not being properly monitored). It seems reasonable to start having a look there.

@github-project-automation github-project-automation bot moved this to Normal priority in Prioritization Nov 26, 2024
@Siedlerchr Siedlerchr moved this from Normal priority to High priority in Prioritization Nov 26, 2024
@subhramit subhramit added the [outdated] type: bug Confirmed bugs or reports that are very likely to be bugs label Dec 1, 2024
@subhramit subhramit changed the title Postgre daemon uses 100% of CPU for several minutes despite indexing deactivated Postgres daemon uses 100% of CPU for several minutes despite indexing deactivated Dec 5, 2024
@koppor
Copy link
Member

koppor commented Dec 9, 2024

Before my CPU kept spinning, which suggests to me a classical DB deadlock situation (spinlocks) created by several (5?) java postgre calls not being synchronized (that is, not being properly monitored). It seems reasonable to start having a look there.

Can you help us with that? Our contributor developing that feature disappeared.

@ytzemih
Copy link
Author

ytzemih commented Dec 10, 2024

@koppor Sorry to hear that and thanks for asking.

First of all, can you or anybody else confirm that there is a bug?

I'm not familiar with the JR code base, nor it's architecture, which means, it'd take me a while to isolate the problem and make informed decisions. From February onward, however, I'd have time to take have a look into this issue if this helps you?

@koppor
Copy link
Member

koppor commented Dec 21, 2024

First of all, can you or anybody else confirm that there is a bug?

TBH, if it appears at one user, it is most likely that it will appear at other users, too.

We will do an alpha-release these days - and hope for feedback at https://discourse.jabref.org/c/beta-testing/11.

I'm not familiar with the JR code base, nor it's architecture, which means, it'd take me a while to isolate the problem and make informed decisions. From February onward, however, I'd have time to take have a look into this issue if this helps you?

Sure! We are all volunteers and busy with other things. (I personally need to get my funding ordered for 2025 and beyond). Thus, any help is very much welcome. Looking forward!

@ytzemih
Copy link
Author

ytzemih commented Dec 23, 2024

Alright, good to know. I'll see whether I can spot the issue in the then alpha-release and, if possible, can propose a fix in February.

In the meanwhile, all the best for your funding acquisition!

@LoayGhreeb
Copy link
Member

Hello @ytzemih,

First of all, disabling the "fulltext search" will not affect this issue. Fulltext search does not use PostgreSQL; we use Lucene for the fulltext search of linked files. PostgreSQL is used for searching bib fields and search groups.

Killing the PostgreSQL server process will certainly cause a lot of exceptions, as jabref expects the server to be running.

I cannot reproduce the issue on my machine. The initial indexing of a large library with 6.5K entries took about 8% of CPU usage, while updating or inserting entries did not exceed 0.3% CPU usage.

Do you have any search groups in your library? This might be the cause of the issue. After every update to the index, for each character you type in the field, a search query is performed for every search group you have.

@ytzemih
Copy link
Author

ytzemih commented Jan 9, 2025

@LoayGhreeb Thanks for clarifying the use of PostgreSQL.

I'm indeed using search groups, a handful of them the files affected.

In the meanwhile, I've observed that the issue occurs when I work on bib entries, in particular, editing comments and other fields, which involves a lot of typing. And, certainly, I'd expect any hooks for search routines NOT to be spawned upon every other key stroke. Rather, I'd expect (in a way like pawn detection algorithms with touch pads) to only react after a typing activity, say one second after the last key stroke.

Anyway, your conjecture about the cause seems reasonable.

So, I understand that the number of PostgreSQL instances spawned then corresponds to the number of search groups I use? I should be looking at the code at some point.

If it is the search groups, this is not a bug but a performance problem, right?

@ytzemih
Copy link
Author

ytzemih commented Jan 9, 2025

The question then is, whether an update of all search groups is necessary if no or at most one group is selected? Is an update of the search perhaps done too often?

Whatever it is, there seems to be some interaction between search, global search, and search groups that can cause performance issues and might need some rethinking.

@ytzemih
Copy link
Author

ytzemih commented Jan 16, 2025

This issue is still occurring for me after the fix of #12373. It, however, seems to be performance issue. I didn't have this before with JR 5.x, but used similarly large bib files as well as search groups. JR is spinning up my CPU after each couple of changes, which is not so nice when working with my laptop without power connection.

If PostgreSQL has already been used in JR 5 for searching bib fields and search groups, then there is some change in JR 6 that seems to cause this issue. If PostgreSQL has been newly introduced in JR6, that implementation might need a look at.

@koppor koppor removed the [outdated] type: bug Confirmed bugs or reports that are very likely to be bugs label Jan 20, 2025
@ytzemih
Copy link
Author

ytzemih commented Jan 25, 2025

After a few weeks of using JR6, I think, this issue could be a severe showstopper for anyone using a combination of medium to large bib-files and search groups. I just wanted to add some information for the protocol:

Closing the just changed file (because the postgresql daemon is killing my laptop) , I get (unsurprisingly) the following three kinds of exceptions (1000s of them actually but with the same ID):

  1. 2025-01-25 14:09:11 [pool-2-thread-4] org.jabref.logic.search.retrieval.BibFieldsSearcher.search()

ERROR: Error during bib fields search execution: org.postgresql.util.PSQLException: ERROR: relation "bib_fields.i1otlkgvdvhl" does not exist
Position: 44
at org.jabref.merged.module@6.0.15043/org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(Unknown Source)
at org.jabref.merged.module@6.0.15043/org.postgresql.core.v3.QueryExecutorImpl.processResults(Unknown Source)
at org.jabref.merged.module@6.0.15043/org.postgresql.core.v3.QueryExecutorImpl.execute(Unknown Source)
at org.jabref.merged.module@6.0.15043/org.postgresql.jdbc.PgStatement.executeInternal(Unknown Source)
at org.jabref.merged.module@6.0.15043/org.postgresql.jdbc.PgStatement.execute(Unknown Source)
at org.jabref.merged.module@6.0.15043/org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(Unknown Source)
at org.jabref.merged.module@6.0.15043/org.postgresql.jdbc.PgPreparedStatement.executeQuery(Unknown Source)
at org.jabref@6.0.15043/org.jabref.logic.search.retrieval.BibFieldsSearcher.search(Unknown Source)
at org.jabref@6.0.15043/org.jabref.logic.search.IndexManager.lambda$search$5(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)

  1. 2025-01-25 14:09:11 [pool-2-thread-3] org.jabref.logic.search.indexing.BibFieldsIndexer.removeField()

ERROR: Error deleting field from entry in index: org.postgresql.util.PSQLException: This connection has been closed.
at org.jabref.merged.module@6.0.15043/org.postgresql.jdbc.PgConnection.checkClosed(Unknown Source)
at org.jabref.merged.module@6.0.15043/org.postgresql.jdbc.PgConnection.createStatement(Unknown Source)
at org.jabref.merged.module@6.0.15043/org.postgresql.jdbc.PgConnection.createStatement(Unknown Source)
at org.jabref@6.0.15043/org.jabref.logic.search.indexing.BibFieldsIndexer.removeField(Unknown Source)
at org.jabref@6.0.15043/org.jabref.logic.search.indexing.BibFieldsIndexer.updateEntry(Unknown Source)
at org.jabref@6.0.15043/org.jabref.logic.search.IndexManager$8.call(Unknown Source)
at org.jabref@6.0.15043/org.jabref.gui.util.UiTaskExecutor$1.call(Unknown Source)
at javafx.graphics@23.0.1/javafx.concurrent.Task$TaskCallable.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)

  1. 2025-01-25 14:09:11 [pool-2-thread-3] org.jabref.logic.search.indexing.BibFieldsIndexer.insertField()

ERROR: Could not add an entry to the index.: org.postgresql.util.PSQLException: This connection has been closed.
at org.jabref.merged.module@6.0.15043/org.postgresql.jdbc.PgConnection.checkClosed(Unknown Source)
at org.jabref.merged.module@6.0.15043/org.postgresql.jdbc.PgConnection.prepareStatement(Unknown Source)
at org.jabref.merged.module@6.0.15043/org.postgresql.jdbc.PgConnection.prepareStatement(Unknown Source)
at org.jabref@6.0.15043/org.jabref.logic.search.indexing.BibFieldsIndexer.insertField(Unknown Source)
at org.jabref@6.0.15043/org.jabref.logic.search.indexing.BibFieldsIndexer.updateEntry(Unknown Source)
at org.jabref@6.0.15043/org.jabref.logic.search.IndexManager$8.call(Unknown Source)
at org.jabref@6.0.15043/org.jabref.gui.util.UiTaskExecutor$1.call(Unknown Source)
at javafx.graphics@23.0.1/javafx.concurrent.Task$TaskCallable.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)

My observation: I'm often editing the comments field to add some notes. JR is spawing a postgresql instance upon my changes to update the index. But, of course, I'm continuing to type my comments because I don't know that. It looks as if I am interfering with the DB ACID concept? If this is the case, a quick fix could be to not only wait with the index update a few seconds after typing but also to allow typing to kill the indexer. There seems to be a too fine granularity / too close interaction between the user and the DB changes. Please, refute or confirm my theory as you see fit.

Perhaps, this helps us to hunt down the location of the cause.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: High priority
Development

No branches or pull requests

4 participants