-
Notifications
You must be signed in to change notification settings - Fork 195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Add ability to configure logging for debugging #360
base: master
Are you sure you want to change the base?
Conversation
|
||
|
||
class ConfigurationError(Exception): | ||
"""Exception used for configuration errors""" | ||
|
||
|
||
def load_config(filename="config.toml", **kwargs): | ||
def _deep_update(source, overrides): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or comparable such that within the function it's more apparent when read
def _deep_update(source, overrides): | |
def _deep_update(old_dict, new_dict): |
def load_config(filename="config.toml", **kwargs): | ||
def _deep_update(source, overrides): | ||
"""Recursively update a nested dictionary. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here the Args
part seems to be missing
if isinstance(value, collections.Mapping) and value: | ||
# Override value is a non-empty dictionary. | ||
# Update the source key with the override dictionary. | ||
returned = _deep_update(source.get(key, {}), value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Splitting this up might make it easier to grasp the small steps here
returned = _deep_update(source.get(key, {}), value) | |
source_key = source.get(key, {}) | |
returned = _deep_update(source_key, value) |
"""dict: Nested dictionary representing the allowed configuration | ||
sections, options, default values, and allowed types for Strawberry | ||
Fields configurations. For each configuration option key, the | ||
corresponding value is a length-2 tuple, containing: | ||
|
||
* A type or tuple of types, representing the allowed type | ||
for that configuration option. | ||
|
||
* The default value for that configuration option. | ||
|
||
.. note:: | ||
|
||
By TOML convention, keys with a default value of ``None`` | ||
will **not** be present in the generated/loaded configuration | ||
file. This is because TOML has no concept of ``NoneType`` or ``Null``, | ||
instead, the non-presence of a key indicates that the configuration | ||
value is not set. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! 🎊
@@ -154,6 +154,8 @@ def create_job(self, target: str, program: Program, run_options: dict = None) -> | |||
|
|||
circuit = bb.serialize() | |||
|
|||
self.log.debug("Submitting job\n%s", circuit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to keep the level consistent here and on line 164: self.log.info("Job %s was successfully submitted.", job_id)
. Either both could be DEBUG
or both could be INFO
.
One thing I was wondering about: for errors, how should we handle logging? Would it make sense to have the same message when instantiating the error and also log that at the same time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! Errors should definitely be logged, the generated log files should have a running commentary of what happens during program execution.
One way of doing this could be using logger.exception
.
This article seems to provide a nice overview of potential logging/exception patterns: https://www.loggly.com/blog/exceptional-logging-of-exceptions-in-python/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally had this as info
, but I'm not so sure they both belong at the same level. From the Python docs:
Level | Description |
---|---|
DEBUG |
Designates fine-grained informational events that are most useful to debug an application. |
INFO |
Designates informational messages that highlight the progress of the application at coarse-grained level. |
-
Job submission provides information about the progress of the job submission at a very course-grained level. Seems particularly well suited, given that
info
is the default level, and will be displayed on the output by default. -
Logging the submitted circuit is instead more fine-grained debugging data. In most cases, users won't want this displayed as 'info', since the submitted circuit should be evident from their Python code (or determined via
prog.compile()
). However, this will be useful information when debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! Happy if these are kept as they are right now :)
match the expected type defined in the configuration spec. | ||
""" | ||
res = {} | ||
for k, v in config_spec.items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any use case for being able to pass config_spec
to this function? Most of the times one would be using DEFAULT_CONFIG_SPEC
, but now the function allows any nested dictionary structure. Not sure if the complexity arising from this is worth having.
To me, it seems that the block within the isinstance(v, tuple)
case could be renamed and organized into a new function. Since we know that the maximum depth of nesting is 2, using recursion here seems like a general solution to a specific problem which can still be avoided to yield a simpler line of logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably agree with @antalszava that recursion probably should be avoided if not completely necessary (like in this case since only a very shallow, specific depth of 2 is used), in favour of a more explicit solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good point 🙂 I tried to keep this as general as possible, because TOML allows for nested sections. For example, in the future, we might have something like this:
[api]
hostname = "platform.strawberryfields.ai"
[default]
token = "personal_token"
[org1]
token = "organization_token"
When loaded via toml.load
, it would look like this:
{
"api": {
"hostname": "platform.strawberryfields.ai",
"default": {"token": "personal_token"},
"org1": {"token": "organization_token"}
}
}
I suppose it's a question of do we want to disallow nested TOML config files (in which case it makes sense to restrict to simply using DEFAULT_CONFIG_SPEC
here), or if we want to keep this function general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, yes :) I'd be leaning towards keeping it simple for now and updating it whenever a specific nested section is introduced for the configuration (but this is subjective point of view). Overall, this would come with the
- Advantage: the configuration related code is tailored onto the current needs, is slightly simpler and easier to be picked up
- Disadvantage: once a nested section is being added, the logic here will have to change to this more general solution
**Example** | ||
|
||
>>> _generate_config(DEFAULT_CONFIG_SPEC, api={"port": 54}) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come that the logging
section is not included when using DEFAULT_CONFIG_SPEC
? To me, it is something unexpected
Edit: added a suggestion, though this might once again change if the suggestion of having ""
as default filename is accepted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I think I just forgot to update the docstring!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right that makes sense :)
|
||
DEFAULT_CONFIG_SPEC = { | ||
"api": { | ||
"authentication_token": (str, ""), | ||
"hostname": (str, "platform.strawberryfields.ai"), | ||
"use_ssl": (bool, True), | ||
"port": (int, 443), | ||
} | ||
}, | ||
"logging": {"level": (str, "info"), "logfile": ((str, type(None)), None)}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this one to go with the same as for authentication_token
?
(Edit: although, it also came to my mind that the advantage of leaving it as is would keep the distinction between required and not required options)
"logging": {"level": (str, "info"), "logfile": ((str, type(None)), None)}, | |
"logging": {"level": (str, "info"), "logfile": (str, "")}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for readability, I think it would be nicer to always write JSON-style nested dictionaries on several lines. 🙂
"logging": {"level": (str, "info"), "logfile": ((str, type(None)), None)}, | |
"logging": { | |
"level": (str, "info"), | |
"logfile": ((str, type(None)), None) | |
}, |
Seems like the circular dependency could be resolved by having module imports: https://stackoverflow.com/a/22210807/12899253 Edit: locally I also had to separate parsing logging optionsfunction into a Overall the overhead of pulling the logging options each time This is also the reason why in #359 such a solution was chosen for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks great! 💯 Only a few comments, but nothing major.
|
||
DEFAULT_CONFIG_SPEC = { | ||
"api": { | ||
"authentication_token": (str, ""), | ||
"hostname": (str, "platform.strawberryfields.ai"), | ||
"use_ssl": (bool, True), | ||
"port": (int, 443), | ||
} | ||
}, | ||
"logging": {"level": (str, "info"), "logfile": ((str, type(None)), None)}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for readability, I think it would be nicer to always write JSON-style nested dictionaries on several lines. 🙂
"logging": {"level": (str, "info"), "logfile": ((str, type(None)), None)}, | |
"logging": { | |
"level": (str, "info"), | |
"logfile": ((str, type(None)), None) | |
}, |
DEFAULT_CONFIG_SPEC = { | ||
"api": { | ||
"authentication_token": (str, ""), | ||
"hostname": (str, "platform.strawberryfields.ai"), | ||
"use_ssl": (bool, True), | ||
"port": (int, 443), | ||
} | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious; is the type defined for each value something pythonic or is it only part of this code (it looks like the type validation is done in _generate_config
below)? Nice either way. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing Pythonic, just a way of having a single 'source of truth' for the allowed configuration at the top of the file 🙂 This means that changes to the configuration/default values/allowed types requires simply modifying this variable --- all subsequent functions are independent of the allowed keys/options/types.
match the expected type defined in the configuration spec. | ||
""" | ||
res = {} | ||
for k, v in config_spec.items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably agree with @antalszava that recursion probably should be avoided if not completely necessary (like in this case since only a very shallow, specific depth of 2 is used), in favour of a more explicit solution.
res[k] = _generate_config(v, **kwargs.get(k, {})) | ||
return res |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(apart from maybe not needing to use recursion) really nice! 🎉
if logging: | ||
config_details = "Loaded configuration: {}".format(config) | ||
auth_token = config.get("api", {}).get("authentication_token", "") | ||
config_details = config_details.replace(auth_token[5:], "*" * len(auth_token[5:])) | ||
log.debug(config_details) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit hacky, though 😆
@@ -46,7 +47,8 @@ | |||
"hostname": "platform.strawberryfields.ai", | |||
"use_ssl": True, | |||
"port": 443, | |||
} | |||
}, | |||
'logging': {'level': 'info'} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'logging': {'level': 'info'} | |
"logging": {"level": "info"} |
@josh146 There seem to be an issue with updating from environment variables. |
Co-Authored-By: antalszava <antalszava@gmail.com> Co-Authored-By: Theodor <theodor@xanadu.ai>
Context:
When debugging configuration and API access, it's useful to be able to output debugging logging information to a file. This can then be shared when requesting help.
Currently, Strawberry Fields performs logging of internal state and warning, but requires the user have knowledge of the Python logging module in order to modify the logging level/output the log information to a file.
Description of the Change:
Adds a new section to the configuration pertaining to logging:
By changing the
level
option, the user can specify the types of log messages that are displayed in the terminal when using the default SF logger. For instance, by settinglevel = "debug"
, more detailed information is output during execution.By setting the
logfile
option, the user can specify a logfile that is appended to during SF execution. Currently, the logfile contains all logged messages, of level DEBUG and up.Click here to see an example logfile created when submitting a remote program.
Note: these options only affect the built-in SF logger handlers. If the user creates their own custom logging handler, this overrides the SF built-in handlers.
Adds a DEBUG log message on job submission, logging the submitted blackbird script.
Adds a DEBUG log message on
RemoteEngine.run_async()
, providing details on program compilation if it occurs.Adds an INFO log message if the job is complete after a refresh
Adds a
sf.configuration.SESSION_CONFIG
global variable that stores the configuration of Strawberry Fields as first import. This is used to configure the logger --- the configuration of the logger should not change after this.load_config()
now logs debugging information, including the loaded configuration. Note that loaded API tokens are automatically masked.Minor refactor of the
sf.configuration
module:The logging section has been added to
DEFAULT_CONFIG_SPEC
.Some existing functions had hardcoded dependence on the
"api"
section of the configuration file. This has been removed; now no functions, except forstore_account()
, depend on the structure of the"api"
section. As a result,create_config
has been replaced with_generate_config
, which automatically generates a default configuration object given a configuration spec.find_config_file
seemed to duplicate the logic inget_available_config_paths
, so now depends on the former.get_api_section
was currently used only inload_config
. Sinceload_config
is now agnostic of the configuration structure, this function is redundant.keep_valid_options
was removed; in testing, it tended to lead to bad user experience to filter out unknown options. Typos (e.g.,prot
instead ofport
) would be pruned from the loaded configuration, making debugging difficult.Since there is now more than just an
api
section in the configuration,store_account
now loads the existing configuration, modifies it as requested, and then re-writes it to disk. This avoids non-API configuration from being overwritten.Benefits:
Possible Drawbacks:
Currently,
create_logger
requires a loaded configuration to configure the logger. However,load_config
creates a logger in order to log details about finding/loading the configuration! This leads to a circular dependency. Currently, this is avoided by passinglogging=False
toload_config
on first load, to turn off any logging (since the logger has not been created yet). However, this is ugly, and bad form. If something goes wrong with the users logging configuration, it won't be logged!We could consider having a function
sf.configure_logger()
in order to configure the logging section of the config file within SF, similar to howsf.store_account()
configures the API section. E.g.:However, since the logger is configured on import, these settings would not take affect until the next SF import.
Related GitHub Issues: n/a