Rule::unify_source method execution takes about 30% of ForwardChainer execution time #54
Comments
Not to hijack the conversation, but the flame-graph also points at Also: what's Also: as a general rule, you want to run any/all atomspace things with |
If you haven't configured hugepages, see below; it's cut-n-paste from atomspace-postgres, so you have to adjust these instructions for your app:
Tuning HugePages
Using 2MB-sized HugePages can also offer a large performance boost,
Then you need to find out what the group id (gid) was:
Suppose that this shows group id 1234 for
Don't forget to
Actually, you may as well set hugepages to 80% of RAM, or whatever the typical working-set size for the atomspace is.
Linas, thank you for spotting this. I believe the pthread stuff is a mutex lock at: ure/opencog/ure/forwardchainer/SourceSet.cc Lines 37 to 41 in 0a353dd
So the root cause is not boost but intensive locking.
I believe it is the exponent calculation here: ure/opencog/ure/forwardchainer/SourceSet.cc Lines 175 to 188 in 0a353dd
Hmm. Given that |
Thanks. Great findings! Regarding the locks, these are just crude half-way steps towards parallel URE. Once we move to C++17 I'll use more refined (unique/shared) mutexes. Maybe some thread pool, etc, not sure yet. Regarding |
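For readers unfamiliar with the unique/shared mutex idea mentioned above, here is a minimal C++17 sketch. It is not the URE code: `GuardedSet` and its methods are hypothetical names, and the container holds plain integers purely for illustration. The point is only that readers take a `std::shared_lock` and so do not block one another, while writers take a `std::unique_lock`.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical stand-in for a lock-protected container; NOT the URE SourceSet.
class GuardedSet {
public:
    // Writers take an exclusive lock: no reader or other writer runs meanwhile.
    void insert(int value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        items_.push_back(value);
    }

    // Readers take a shared lock, so concurrent readers do not block each other.
    std::size_t size() const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        return items_.size();
    }

    bool contains(int value) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        for (int item : items_)
            if (item == value) return true;
        return false;
    }

    // Read one element; the index must be in range.
    int at(std::size_t index) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        assert(index < items_.size());
        return items_[index];
    }

private:
    mutable std::shared_mutex mutex_;  // mutable so const readers can lock it
    std::vector<int> items_;
};
```

Whether this helps in practice depends on the read/write ratio; if nearly every access mutates the set, a shared mutex gains little over a plain one.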
Calculate once at insertion rather than at each iteration. Address the exp part of opencog#54
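The idea in that commit message can be sketched roughly as follows. This is an illustration under assumptions, not the actual change: the `Source` struct, the `WeightedSources` class, and the weight formula `exp(-complexity / temperature)` are all hypothetical, since the real expression in SourceSet.cc is not quoted in this thread. What the sketch shows is the pattern: evaluate the exponential once when a source is inserted and store it, so that later iterations only read the cached value.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical source entry; field names are illustrative, not the URE's.
struct Source {
    double complexity;  // input to the (assumed) weight formula
    double weight;      // cached result of std::exp, computed at insertion
};

class WeightedSources {
public:
    explicit WeightedSources(double temperature) : temperature_(temperature) {
        assert(temperature > 0.0);  // the assumed formula divides by it
    }

    // The exponential is evaluated exactly once, here, per inserted source.
    void insert(double complexity) {
        sources_.push_back({complexity, std::exp(-complexity / temperature_)});
    }

    // Called at each iteration: copies cached weights, no std::exp call.
    std::vector<double> weights() const {
        std::vector<double> out;
        out.reserve(sources_.size());
        for (const Source& s : sources_) out.push_back(s.weight);
        return out;
    }

    std::size_t size() const { return sources_.size(); }

private:
    double temperature_;
    std::vector<Source> sources_;
};
```

The trade-off is one extra `double` per source, plus the need to recompute the cache if any parameter of the formula changes after insertion.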
I profiled ForwardChainer to find out how performance can be improved and found that it spends a significant amount of time in the Rule::unify_source method. The share depends on rule complexity: the more complex the rules, the larger it is. The following benchmark demonstrates this: opencog/benchmark#27
The problem is that unify_source does almost nothing. The first part converts the rule's BindLink into another one with random variable naming (RewriteLink::alpha_convert), and the second constructs a new BindLink with some variables grounded.
A proper fix should probably be raised against opencog/atomspace, but I am raising the issue here because the main concern is URE performance.
Flame graph of the benchmarked code (best opened in a web browser): perf.svg.gz
The following steps require perf and the FlameGraph tool (https://github.com/brendangregg/FlameGraph). An Intel CPU with support for Intel Processor Trace is required as well.
Steps to reproduce:
Run the benchmark in the background:
Collect a profile using Intel Processor Trace. Only 1 second of profile is needed; it will be too large otherwise:
Build the flame graph of the do_step call (this can take a lot of time):