Add callbacks #11

brettshollenberger · 2024-10-11T18:37:32Z

Hey Andrew! Thanks for this library 😄

I wanted to add the callbacks API so I could support an integration with Wandb

I have the Ruby implementation of this callback over here as an example of the use case and have tested the integration in my Wandb console

Some dummy data just to show the integration:

I kept the API consistent w/ the Python implementation but let me know if there's anything else you'd want to see here

ankane · 2024-10-13T18:52:05Z

Hey Brett, great to hear from you! This looks pretty neat. It looks like the Python API uses the return values of the callbacks in train. I think the best way to do that would be to add TrainingCallback and CallbackContainer classes like Python (code), but I can do that in a follow-up commit if you just want to get the tests passing.
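
A minimal sketch of how a container could use the callbacks' return values, as the comment above describes. `SketchTrainingCallback`, `SketchCallbackContainer`, and `TagCallback` are hypothetical names for illustration, a plain Hash stands in for the booster, and none of this is the gem's actual implementation:

```ruby
# Hypothetical stand-ins for illustration; not the gem's actual classes.
class SketchTrainingCallback
  # Hooks that receive the model return it (possibly replaced).
  def before_training(model)
    model
  end

  def after_training(model)
    model
  end
end

class SketchCallbackContainer
  def initialize(callbacks)
    callbacks.each do |callback|
      unless callback.is_a?(SketchTrainingCallback)
        raise TypeError, "callback must be a SketchTrainingCallback"
      end
    end
    @callbacks = callbacks
  end

  # Thread the model through every callback, keeping each return value.
  def before_training(model)
    @callbacks.reduce(model) { |m, callback| callback.before_training(m) }
  end

  def after_training(model)
    @callbacks.reduce(model) { |m, callback| callback.after_training(m) }
  end
end

# A callback that returns a new model instead of mutating the old one.
class TagCallback < SketchTrainingCallback
  def initialize(tag)
    @tag = tag
  end

  def before_training(model)
    model.merge(tags: model.fetch(:tags, []) + [@tag])
  end
end

container = SketchCallbackContainer.new([TagCallback.new("a"), TagCallback.new("b")])
booster = { name: "stub" } # a Hash stands in for a Booster here
booster = container.before_training(booster)
```

Because the caller reassigns `booster` from the container's return value, a callback that returns a different object still takes effect in the rest of training.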

brettshollenberger · 2024-10-14T21:22:20Z

Good idea! Let me know what you think, should be passing now

ankane

Awesome, thanks for adding those classes. Added some comments inline. It looks like the new files need to be required.

require_relative "xgboost/callback_container"
require_relative "xgboost/training_callback"

Also, if you're seeing an error running the tests locally, try running:

bundle exec rake vendor:platform

ankane · 2024-10-14T22:29:56Z

lib/xgboost.rb

      booster = Booster.new(params: params)
+      cb_container = CallbackContainer.new(callbacks)
+      cb_container.before_training(model: booster)


This should update booster to match Python.

ankane · 2024-10-14T22:31:01Z

lib/xgboost.rb

@@ -59,32 +63,36 @@ def train(params, dtrain, num_boost_round: 10, evals: nil, early_stopping_rounds
      end

      num_boost_round.times do |iteration|
+        cb_container.before_iteration(model: booster, epoch: iteration)


This should break if the return value is falsy.

ankane · 2024-10-14T22:32:52Z

lib/xgboost.rb

        booster.update(dtrain, iteration)

-        if evals.any?


Please keep the existing code where possible to keep the changeset minimal / easier to review.

brettshollenberger · 2024-10-15

Sorry, disabled rubocop

ankane · 2024-10-14T22:33:21Z

lib/xgboost.rb

        end
+        cb_container.after_iteration(model: booster, epoch: iteration, res: res)


This should break for falsy values like before_iteration.

ankane · 2024-10-14T22:33:38Z

lib/xgboost.rb

      end
+      cb_container.after_training(model: booster)


This should update booster like before_training.

ankane · 2024-10-14T22:34:10Z

lib/xgboost/callback_container.rb

+      @history = {}
+
+      callbacks.each do |callback|
+        unless callback.class.ancestors.include?(TrainingCallback)


callback.is_a?(TrainingCallback)
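
For ordinary subclasses the two checks give the same answer, which a small sketch can show. `BaseCallback` and `LoggingCallback` are hypothetical stand-ins for `TrainingCallback` and a user-defined callback:

```ruby
# Hypothetical stand-ins for TrainingCallback and a user callback.
class BaseCallback; end
class LoggingCallback < BaseCallback; end

callback = LoggingCallback.new

# The form used in the diff: walk the class's ancestor list by hand.
verbose = callback.class.ancestors.include?(BaseCallback)

# The suggested form: ask the object directly.
concise = callback.is_a?(BaseCallback)

# An unrelated object fails the check.
unrelated = "not a callback".is_a?(BaseCallback)
```

`is_a?` is the shorter and more conventional way to express the same type check.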

ankane · 2024-10-14T22:35:10Z

test/train_test.rb

@@ -55,6 +55,88 @@ def test_feature_names_and_types
    assert_nil model.feature_types
  end

+  class MockCallback < XGBoost::TrainingCallback


Let's create a separate file for the callback tests.

brettshollenberger · 2024-10-15T17:37:32Z

Thanks Andrew, had to get rid of some aggressive Rubocop settings complicating the changes. Tests are all passing for me and everything's committed now 😅 Looks like you have to approve the GitHub workflow run here, but I think we should have it this time.

Co-authored-by: Brett Shollenberger <brett.shollenberger@gmail.com>

ankane · 2024-10-16T01:31:55Z

Thanks @brettshollenberger! Merged in the commit above with a few minor changes:

- I had the logic backwards for before/after_iteration - it should stop if it returns a truthy value
- Changed callbacks to use positional arguments instead of keyword
- Left out the params change, as it's not present in the Python library (from what I can tell)
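
A simplified sketch of the first two points, under stated assumptions: `StopAfter` and `run_rounds` are hypothetical names, the `(model, epoch, evals_log)` argument order mirrors the Python signature, and the real `train` method does considerably more than this loop:

```ruby
# Hypothetical callback: asks training to stop once `limit` rounds have run.
class StopAfter
  def initialize(limit)
    @limit = limit
  end

  # Positional arguments, mirroring Python's (model, epoch, evals_log).
  def before_iteration(_model, _epoch, _evals_log)
    false # falsy: keep going
  end

  def after_iteration(_model, epoch, _evals_log)
    epoch + 1 >= @limit # truthy: stop training
  end
end

# Simplified loop; the block stands in for the booster update step.
def run_rounds(model, num_boost_round, callbacks)
  evals_log = {}
  rounds_run = 0
  num_boost_round.times do |iteration|
    # Stop when any callback returns a truthy value.
    break if callbacks.any? { |cb| cb.before_iteration(model, iteration, evals_log) }
    yield model, iteration
    rounds_run += 1
    break if callbacks.any? { |cb| cb.after_iteration(model, iteration, evals_log) }
  end
  rounds_run
end

updates = []
rounds = run_rounds(:stub_model, 10, [StopAfter.new(3)]) { |_model, i| updates << i }
```

With a limit of 3, the loop runs iterations 0 through 2 and then breaks, even though 10 rounds were requested.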

Going to spend a little time getting the overall code more in sync with Python, and then will push a new release.

brettshollenberger · 2024-10-16T15:36:20Z

@ankane nice, thanks!

With the params thing, it's not exactly 1:1, but I think it's an easier solution than the Python Wandb callback, which calls xgboost/core#save_config.

https://github.com/wandb/wandb/blob/8698af5862e44baf31af5411b81bea546e069257/wandb/integration/xgboost/xgboost.py#L117

    def before_training(self, model: Booster) -> Booster:
        """Run before training is finished."""
        # Update W&B config
        config = model.save_config()
        wandb.config.update(json.loads(config))

        return model

https://github.com/dmlc/xgboost/blob/3f9bfaf86e6db6a4f54734aa7d164df55aa69ef6/python-package/xgboost/core.py#L2008

    def save_config(self) -> str:
        """Output internal parameter configuration of Booster as a JSON
        string.

        .. versionadded:: 1.0.0

        """
        json_string = ctypes.c_char_p()
        length = c_bst_ulong()
        _check_call(
            _LIB.XGBoosterSaveJsonConfig(
                self.handle, ctypes.byref(length), ctypes.byref(json_string)
            )
        )
        assert json_string.value is not None
        result = json_string.value.decode()  # pylint: disable=no-member
        return result

ankane · 2024-10-16T17:23:16Z

I think it's better to keep things synced for maintainability in most cases. Added save_config in the commit above.
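
A rough Ruby counterpart to the Python hook quoted above, as a sketch only. `StubBooster` and `ConfigCaptureCallback` are hypothetical, the JSON returned by the stub is made up for illustration, and the only assumption about the real booster is that `save_config` returns a JSON string as in Python:

```ruby
require "json"

# Hypothetical stand-in for a Booster; only save_config matters here.
# The config content below is invented for the example.
class StubBooster
  def save_config
    JSON.generate({ "learner" => { "objective" => { "name" => "reg:squarederror" } } })
  end
end

# Hypothetical callback mirroring the Python hook: read the config
# before training starts, then hand the model back unchanged.
class ConfigCaptureCallback
  attr_reader :config

  def before_training(model)
    @config = JSON.parse(model.save_config)
    model
  end
end

callback = ConfigCaptureCallback.new
model = callback.before_training(StubBooster.new)
objective = callback.config.dig("learner", "objective", "name")
```

A logging integration could forward `callback.config` to its own config store at this point instead of keeping it on the callback.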

brettshollenberger · 2024-10-17T12:05:21Z

Awesome, thank you!

ankane · 2024-10-17T20:55:02Z

Great, just pushed 0.9.0. Let me know if you need anything else that's missing.

brettshollenberger force-pushed the callbacks branch 3 times, most recently from 96dd7ec to f66a4ba · October 14, 2024 21:21

ankane reviewed Oct 14, 2024

Add callbacks

80ff93c

brettshollenberger force-pushed the callbacks branch from f66a4ba to 80ff93c · October 15, 2024 17:32

Add params

85b8a12

ankane added a commit that referenced this pull request Oct 16, 2024

Added support for callbacks - #11

73c4b4c

Co-authored-by: Brett Shollenberger <brett.shollenberger@gmail.com>

ankane closed this Oct 16, 2024

ankane added a commit that referenced this pull request Oct 16, 2024

Added save_config method to Booster - #11

c0542ef
