-
-
Notifications
You must be signed in to change notification settings - Fork 21.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add user-facing log interception callback. #87576
base: master
Are you sure you want to change the base?
Conversation
A bunch of answers here. First, my general experience is that in every engine, errors regularly get unnoticed by people. People don't look at the error log unless they already know there's something posted, and people never expect an error. I've seen this happen in Unity, Unreal, Godot, and custom engines; it's just a universal truth and I've given up convincing people to look at the error log on their own, they just don't do it. So this is being used for a system that pops up an error dialog when an error shows up. It pops up in the game viewport, which makes it impossible to ignore (the other effective solution I've seen is a modal popup.) Example: The funny part is that when I originally implemented this, I realized this particular error had been happening for a few days and I hadn't even noticed it while working on the error reporting system. Second, testers need to be aware of errors also, and will often be testing on the fully-built game (otherwise you're just begging for release-only errors, which will happen.) They need some visible error log, and that means I need to be able to reliably record errors to show ingame. Third, stuff like Sentry is very convenient for reporting errors to a central server. But you still need a way to get those errors, and right now there's no way besides modifying the engine code. This is a general solution for retrieving all messages so you can do whatever you like with them. Fourth, I'm setting this system up for mod support. Modders don't have the full game engine tools available either, and they still need access to errors. In addition, their players need access to errors. Note that all of this is modeled off Rimworld's Developer Mode, which is considered a critical feature for mod developers: The end result of all of this is that in-game error reporting is really really handy. |
Please open a feature proposal to track this feature and the demand/ideas 🙂 |
@zorbathut thank you for the ping! I agree with everything you wrote. Can't speak for the code unfortunately. Will try to understand how to test this PR, it's a first for me, but i think this is very important, so I'll try to put in the effort
There are already some existing proposals afaik: a possible candidate is godotengine/godot-proposals#1896 or maybe godotengine/godot-proposals#3062 |
Would have noticed them if they were listed in the original PR 🙂, or if OP had linked to them instead |
No worries! Luckily we're a nice community, so we can compensate for each other's shortcomings 💪 |
I did look through the proposals for anything that matched but didn't see it through skimming - unfortunately there aren't great keywords, "log" gives a huge number of results and I guess I didn't choose any better keywords. But thanks for the links, though, at least someone managed to find them :D |
Yeah well, i've been lurking around this specific logging issue for a while, so I remember the most promising proposals and PRs by heart 😅 Maybe could be useful to cross-reference this with some or all the proposals I linked above, with the old PR and with some other related famous ones, like #56014 |
Proposals linked in the intro :) As much as I'd love to be able to handle stuff like Breakpad, this kinda doesn't - it just handles errors, not crashes, and true crash handling is tough because the environment is - sort of by definition - in a broken state. Trying to enter the .NET runtime might just crash it further. I think practically that would have to be a C++ interface just to minimize the cross-contamination with the rest of the code. This is also something I might do someday but it may take a while :) |
Just want to add to this, that having an in-game log pad / in-game message capture mechanism allows developers to early-report these errors under release builds, without delaying until they actually find some part of the functionality is broken and it is too late to pause for debugging/detail checking. |
Just to update on some brainstorming . . .
. . . my current best solution is to provide two ways of registering a callback, one Buffered and one Nonthreadsafe. Either as two separate functions or as one function with a two-value enum as a parameter. If it's registered as Buffered, it just stores up messages until they're dispatched all together in the main thread, if it's registered as Nonthreadsafe it has the current behavior. In theory, this means people will look at the two options and say "oh, a decision! I should understand this decision" instead of either finding themselves annoyed by buffered-by-default or ending up with bizarre crashes thanks to unintentionally choosing nonthreadsafe-by-default. Better suggestions appreciated, but this seems like a solution that is at least neither actively dangerous nor actively insufficient. |
Are you sure godot/core/string/print_string.h Lines 56 to 57 in 5034478
godot/core/error/error_macros.h Lines 60 to 61 in 5034478
There is a separate mechanism for the debugger, although this is probably less relevant since it is only applicable in debug builds. Finally there is Also, I'm not sure if there are differences between the Output Log and the Debugger > Errors tab, and there may be other logging/error/notification systems that I'm not aware of. I can't judge the implementation, just trying to add useful input. |
It will not surprise me at all if I'm missing stuff :)
This one seems to function as a fallback for printing errors if the OS singleton doesn't yet exist. Right now I'm starting right after the OS singleton is initialized, so it's not necessary, but by using this I might be able to buffer even earlier. I'm tempted to look into this one, but it's also going to be a pain because with the current API I'll have to detach the error_handler event when I attach to the OS singleton in order to avoid getting double-warnings.
This actually does route through the OS singleton and would crash if the OS singleton wasn't initialized, so I can be certain I'm not losing anything by not registering this one. However, it's also set up to silence itself if
Yep, that sounds reasonable to me too. I kinda-sorta feel like some more unified Single Way To Do Logging might be the right approach, but that's a much heavier-weight refactor than I am currently up for. That might be the appropriate time to solve the One nice part about the design I currently have is that all of this would work without API changes; |
I second the callback thing, was going to suggest this myself, at least to explore the option of adding support for registering a custom callback to the existing logging output, it catches everything that currently exists with various data provided |
Alright, so it looks like the TODO list is:
A lot of the other potential improvements (start recording earlier, record more types of message) wouldn't involve breaking backwards compatibility and can be done later. Any way I can attract someone with the authority to say "yes, assuming you do this competently we would be happy to integrate it"? Or to provide a different set of changes that need to be made? |
7933c7d
to
96cdd83
Compare
Big update time. I've made the proposed changes and fleshed out the documentation. I also squashed and rebased the entire thing because the history was frankly misleading; if anyone wants to see the original, it's in https://github.com/zorbathut/godot/commits/pr_loghook_old/ . The threading work is finicky, because threading work is always finicky. I've done my best to document everything, especially the subtle things. I've tested this a bunch. It is intrinsically fiddly, though, and it won't surprise me if there are things I didn't think of. I can't find any issues but more testing is always appreciated. As far as I am concerned, this is ready to go (pending review). Note that I considered the last two revisions not ready to go. This revision is ready to go! |
How noticeable is this if at all? Will it impact |
Whoops! Good catch. It would have logged stuff forever and perpetually leaked memory. Fixed now.
I haven't benchmarked it, but for each log message it's basically creating a Dictionary wrapped in a Variant, with (at the worst case) six key/value pairs. And that's two log messages reported at startup, usually. So, maybe thirty small allocations? I'd expect this to be in the single-microsecond range. And remember this is once at startup, not per-frame. Stuff like "parsing startup config" is going to be orders of magnitude larger. |
@zorbathut the folks at Sentry are pretty pumped about this PR and potential followups. One of their developers should appear in followups, at least as commenters, If I understood correctly. About this specific PR, quoting a message by Bruno Garcia in getsentry/sentry-godot#1 :
I think at this point the "providing access to the actual C# exception object" will be considered in followups? Or is it something feasible in the current PR? Asking without any expectation or urgency, I really can't estimate the load of the feature (maybe one day I'll study the Godot source code... but not today) In the meantime, thanks a lot for your work! Can't wait to log everything! |
I do not personally want to add more stuff to the PR, partly because I don't want to slow down the process, partly because of entirely selfish reasons, specifically that I want this code in place and I don't care about the exception object :V I don't remember offhand how the exception catching works - I think I had an early version that did this in #68169 - and it might be necessary to add a parameter to Godot error messages for a LanguageSpecificErrorData or something along those lines, that we could wrap the C# Exception in. Then when passed to Godot we look for the existence of that and slap it into a new field in the error handler. Happy to talk to @bruno-garcia about this, I will probably get to it on my own eventually but uh don't hold your breath on that one I've got a TODO list the size of a giraffe and available free time more comparable to an underweight squirrel.
You're welcome! :) |
Co-authored-by: A Thousand Ships <96648715+AThousandShips@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general approach I think is OK, what I would change is that, instead of registering callables which is kinda unintuitive what you get ( or adding information via dictionaries) we do it via Extension API.
your Logger binding is OK, but instead, register objects like this rather than callables.
Here is some mock-up of how I would do it:
// IN H
class LoggerExtension : public Reference {
GDCLASS(LoggerExtension, Reference)
protected:
enum ErrorType { //need to bind this to script, can be done here.
};
GDVIRTUAL7( _log_error, String, String, int, String, String, bool, ErrorType)
public:
virtual void log_error(const char *p_function, const char *p_file, int p_line, const char *p_code, const char *p_rationale, bool p_editor_notify = false, ::ErrorType p_type = ERR_ERROR);
};
VARIANT_ENUM_CAST( LoggerExtension::ErrorType)
// in CPP
void LoggerExtension::log_error(const char *p_function, const char *p_file, int p_line, const char *p_code, const char *p_rationale, bool p_editor_notify = false, ::ErrorType p_type = ::ERR_ERROR) {
GDVIRTUAL_CALL(_log_error,p_function,p_file,p_line,p_code,p_rationale, p_editor_notify,ErrorType(p_type));
}
void LoggerExtension::_bind_methods() {
GDVIRTUAL_BIND(_log_error, "function", "file", "line", "code", "rationale", "editor_notify", "error_type")
BIND_ENUM_CONSTANT( error types )
}
So, a few big downsides here. First, this makes it impossible to provide optional parameters. We don't actually have all these parameters in all cases, so either we need to provide default values ("line will be filled if it exists, otherwise it will be -1") or we need to change them to some kind of Optional<> type. I'm not sure there is an Optional right now (there probably should be!) Second, this makes it considerably harder to add new parameters in the future. Even in this thread we're talking about adding an Exception object; here, we either have to break everyone's code to add a new parameter, or add an entire new set of callbacks. Third, I admit I don't see the advantage. The Callable infrastructure already allows a callable on a member of an object; why force an entire object derivation? In C# I can just do this: Godot.LogManager.RegisterLogCaptureNonthreadsafe(Callable.From((Variant var) => {
// code here
})); and I admit I don't see a benefit to making it bigger than that. Fourth, the entire point of Callable is to use for callbacks, which is exactly what this is:
I do agree the dictionary solution is sorta crummy, but the solution here isn't mandatory inheritance, it's proper This feels like not the right decision to me. |
@zorbathut I prefer a class is used like this because its friendlier to statically typed languages (extension languages, C#) and its more intuitive to receive the arguments. Additionally, performance wise its far more efficient to do it like this. If more parameters are added, this function is left as is and hidden and another one is added, so it will continue working. |
How does this work from the user perspective? Doesn't this either mean "the program stops compiling", "you stop getting some events", or "we're now dispatching events to multiple functions, some of which don't have implementations"? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed this one earlier, assuming we'd still go with callables, as is_null
and is_valid
are not complementary, so this doesn't guarantee it's valid
// in most cases, this is going to be a deleted callable, so ignore it; | ||
// is_null() would be enough for that, we intentionally invalidate it to null in register_callable | ||
// | ||
// it's apparently possible for callables to be yanked out from under us, though, and if we call them, we crash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment isn't necessary, especially not with "apparently", as it isn't helpful in comments, the code is self explanatory and the comment just clutters up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do specifically want to call out "it is expected and normal that it's null, because our removal does that"; it's not otherwise clear that this is standard behavior. Do you have another phrasing you'd recommend?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No phrasing needed, this is the universal way to check this, it doesn't need explaining here, it's not unusual or unexpected, there's almost no checks against Callable.is_null
in the codebase, and most remaining ones are invalid and will hopefully be removed soon
This isn't obvious to new users at times, but this isn't the place to clarify this
There's in fact only four valid uses of this outside of Callable
itself, and about 40 invalid ones, being fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, again, the thing I want to clarify isn't "we're checking if the callable is valid here", it's "normal functionality of the removal function replaces callables with null callables and this will happen much more often than you would normally expect callable invalidation to happen".
I'll take the entire comment out if you insist but I think that's removing important information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class is already pretty over annotated, this makes it even more cluttered
But if a comment is necessary just add "Check the callable is valid and hasn't been disconnected", anything more is unnecessary, the code explains itself, it isn't just to prevent calling it if it's null, that'll be handled automatically, it's to prevent crashing, but that's something that you don't need to explain here
So anyone reading the code will see that removing the callable sets it to empty, so no confusion is going to happen
Comments should explain things that aren't clear from the code, and the clearing is already explained in the other method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, once approved, this code needs a comment pass to clean up unnecessary comments and their style as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class is already pretty over annotated, this makes it even more cluttered
Intentionally so; it's delicate multithreading work and there's a lot of implicit stuff going on.
Comments should explain things that aren't clear from the code, and the clearing is already explained in the other method
I think the problem is locality. Sure, the clearing is already explained in the other method, but I can absolutely see someone (for example) changing callp()
to do the is_valid()
check itself, then looking at every place callp()
is used and saying "aha, don't need this check anymore!"
If you're looking at A(), and there's an important bit of behavior that you need to know in order to properly modify A(), and that bit of behavior is located in B(), and there's no comment in A() directing you to B(), and there's no other indication that you should go look at B(), then this is - in my opinion - a bug waiting to happen.
Generally, once approved, this code needs a comment pass to clean up unnecessary comments and their style as well
Is there a way that the reviewer can just submit whatever changes they want? If you want to change the code, you're welcome to just do it.
Again, I think this is a mistake, if you insist I'll change it to whatever you want, but at this point it might just be easier if you made the change instead of me; it'll take less time for you to do it than for you to explain exactly what you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't have to change it, it's just my suggestion, but I'll go over this code once it's approved fully and review the comments, and you can apply those changes then
but I can absolutely see someone (for example) changing callp() to do the is_valid() check itself
That kind of sweeping change won't happen in such a careless way, and anyone doing so would go over the code carefully, no worries, also this wouldn't make any difference here as the check wouldn't be needed in this case either, so it doesn't really apply here
But there's no rush, the decision on whether to use callables or not hasn't been made, but I'll go over the style of this and give suggestions when it's ready to merge otherwise, the style needs fixing anyway
As a C# developer, I'm fairly concerned about the dictionaries we are passing on here. |
Can use a custom RefCounted class instead of a dictionary? --- logger.h ---
class LogInfo : public RefCounted{
public:
enum LogType {
TYPE_INFO,
TYPE_WARNING,
TYPE_ERROR,
TYPE_SCRIPT,
TYPE_SHADER,
TYPE_UNKNOWN,
};
String text;
LogType type;
String get_text() const;
LogType get_log_type() const;
StringName get_log_type_name() const;
};
class LogInfoError : public LogInfo{
public:
String file;
int line;
String function;
String rationale;
String get_file() const;
int get_line() const;
String get_function() const;
String get_rationale() const;
}; |
RefCounted classes need a significant amount of allocation too, and this will also affect GDScript users. They do have the upside of being closer to structs in terms of type safety (and autocompletion/error reporting) though. |
On some level I'm just not convinced that optimizing this for top performance is really relevant. Compare the cost of allocating a Dictionary or a RefCounted class against the cost of generating the stacktrace; my suspicion is that generating the stacktrace is already going to be considerably slower. Compare that against the cost of displaying the text in the Godot editor in debug mode; compare that against the cost of sending the text to VS Code or Rider in debug mode; I'm deeply suspicious that these allocations just aren't that relevant. It's possible there's going to be someone who needs to output a million individual log lines per second for some reason, and I think my answer to them would be "look, you have weird requirements, you should probably not be using the standard debug output system, especially because it's going to be generating a stack trace on every single line, this is designed for debug not for high-throughput data output". |
Disabling |
The log interception feature is essentially zero-overhead if it's not being used; it's about the same performance hit as adding a config variable would be. The way to disable it is "don't use it". In an ideal world with infinite refactor time available, I think we'd have a (honestly in an ideal world we might just have |
Hey! In the end as I am just rehashing large part of the notifications to add also script backtraces, I added custom logging support in my PR (#91006). My approach is simpler, does not add any extra logging by default and thread safety is up to the implementer of the logger (so we don't need to do this built-in into the engine, which is not normally needed in a lot of logging APIs as they handle their own thread safety). Let me know if you also need anything in mine that may be missing that this one provides. |
Two problems:
It is simpler, but it's simpler by virtue of not doing things that are actually really important; virtually all the complexity of this solution comes from handling the tough cases. |
This is intended, if you want to get such early reporting, write an extension. Those can be hooked into the logger at the very beginning of the engine initialization. I am not going to just save all messages internally "just in case" someone adds a custom logger because 99.999% of times It's not going to happen.
That's fine, this is not a feature intended for every user, It's instead meant for extending core engine functionality in the most efficient way possible. The logging function can be documented with an example (including a simple mutex) for users who want to do it from GDScript, should be very simple anyway. |
That's not what my patch does, however.
I strongly feel that exposing a big subtle threadsafety vulnerability directly to users is not a good idea. This will cause hard-to-track-down instability in people's games. Can't stop you, obviously, but I think it's a mistake. |
I wanted a way to catch all Godot messages, including errors, so they can be displayed to the user and/or handled in code. (If you're saying "why do you want that", read the next comment!) Not only should this handle all messages after the callback is applied, I wanted to handle all messages before the callback is applied so that startup warnings or errors are reported as well.
This provides a LogManager singleton that can have a callback attached. It saves messages until Frame 1, meaning that if you apply your callback during Frame 0, it quickly replays its entire saved log for you.
This used to be #68169 but takes a completely different approach at solving the same underlying issue. For my purposes, this is far better, and it's also supported on all scripting languages.
Personal concerns with this solution:
It imposes a bit of CPU overhead and memory churn on every log message. I don't think this is a big issue because there's already a ton of that, but this does impose further costs.Essentially solved; it still imposes a function call, std::atomic lookup, and conditional, but I don't think that's a big problem.The callback must be threadsafe, but will generally work pretty much fine even if it's not threadsafe. This is a perfect recipe for introducing subtle threading issues in people's code. I'm not sure how to fix this - providing the message immediately is important to me, and I haven't come up with a way to coherently single-thread that process without introducing performance issues and long message delays.Solved by splitting the callback into Nonthreadsafe and Buffered versions.There is currently no way to attach more than one callback, nor is there a way to remove a callback.Solved.If multiple callbacks were supported, the current design would either mean that the second callback hook didn't get the full debug log replay, or it would require some further engineering work to support that.Solved.Things that do in fact need to be fixed before commit:
The documentation is lacking. (I'll fix this if the design is approved in general.)Solved.Edit: Related feature proposals:
Add global crash handler/logger godot-proposals#1896
Add a way to print a backtrace without stopping/crashing Godot godot-proposals#3062
Add user-facing log interception callback godot-proposals#8964
Bugsquad edit, closes: Add user-facing log interception callback godot-proposals#8964