
Binary classification metrics throws error in v 1.0.0 #148

Closed · manujosephv opened this issue on Jan 19, 2023 · 5 comments

Labels: wontfix (This will not be worked on)

Comments

manujosephv (Owner) commented Jan 19, 2023

Describe the bug
A binary classification task with default settings throws an error during metric calculation.

To Reproduce
Steps to reproduce the behavior:

from pytorch_tabular import TabularModel
from pytorch_tabular.config import DataConfig, OptimizerConfig, TrainerConfig
from pytorch_tabular.models import CategoryEmbeddingModelConfig

data_config = DataConfig(
    target=['target'],  # target should always be a list. Multi-targets are only supported for regression. Multi-task classification is not implemented
    continuous_cols=num_col_names,
    categorical_cols=cat_col_names,
)
trainer_config = TrainerConfig(
    auto_lr_find=True,  # Runs the LR finder to automatically derive a learning rate
    batch_size=1024,
    max_epochs=100,
    accelerator="auto",  # can be 'cpu', 'gpu', 'tpu', or 'ipu'
)
optimizer_config = OptimizerConfig()  # default optimizer settings

model_config = CategoryEmbeddingModelConfig(
    task="classification",
    layers="32-16",  # Number of nodes in each layer
    activation="LeakyReLU",  # Activation between each layer
    dropout=0.1,
    initialization="kaiming",
    learning_rate=1e-3,
)

tabular_model = TabularModel(
    data_config=data_config,
    model_config=model_config,
    optimizer_config=optimizer_config,
    trainer_config=trainer_config,
)
tabular_model.fit(train=train, validation=val)

This will throw the error below:

RuntimeError: Predictions and targets are expected to have the same shape, but got torch.Size([419, 2]) and torch.Size([419]).

Expected behavior
It should train with the default Accuracy metric.

Additional context
pytorch_tabular version 1.0.0

manujosephv (Owner, Author) commented:
This is a known bug and will be fixed in the next release.

Workaround:

Define the metrics explicitly, as below:

model_config = CategoryEmbeddingModelConfig(
    task="classification",
    layers="32-16",  # Number of nodes in each layer
    activation="LeakyReLU",  # Activation between each layer
    dropout=0.1,
    initialization="kaiming",
    learning_rate=1e-3,
    metrics=['accuracy', "f1_score"],
    metrics_params=[dict(task="multiclass", num_classes=2), dict(task="multiclass", num_classes=2)],
)
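These metric names are resolved against torchmetrics' functional API and metrics_params is forwarded to those functions, so the dicts above supply what the newer torchmetrics interface (>= 0.11) expects. A minimal sketch of the equivalent direct call, using placeholder tensors shaped like the ones in the error message:

import torch
from torchmetrics.functional import accuracy

# Placeholder tensors mirroring the shapes in the error:
# (N, 2) class probabilities vs. (N,) integer targets.
preds = torch.rand(419, 2).softmax(dim=-1)
target = torch.randint(0, 2, (419,))

# With explicit task/num_classes the shapes are handled without error.
print(accuracy(preds, target, task="multiclass", num_classes=2))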

manujosephv changed the title from "Classification metrics throws error in v 1.0.0" to "Binary classification metrics throws error in v 1.0.0" on Jan 19, 2023
manujosephv pinned this issue on Jan 19, 2023
manujosephv (Owner, Author) commented Jan 20, 2023

This is fixed in v1.0.1 on PyPI.

Run pip install -U pytorch_tabular to install the new version.

stale bot commented Mar 22, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix (This will not be worked on) label on Mar 22, 2023
stale bot closed this as completed on Mar 29, 2023
destefani commented Apr 4, 2023

I ran into a similar problem. I'm using CategoryEmbeddingModelConfig for binary classification and would like to get the F1 score of the positive class. The problem with the workaround suggested above is that it only works with averaging over classes. Normally in torchmetrics you would use binary_f1_score, but because it lives in torchmetrics.functional.classification rather than torchmetrics.functional, it is not recognised.

I'm running v1.0.1 just in case.
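For reference, a minimal sketch of the binary_f1_score call in question (assuming torchmetrics >= 0.11; the tensors are placeholders):

import torch
from torchmetrics.functional.classification import binary_f1_score

# Placeholder data: probabilities of the positive class vs. integer labels
preds = torch.tensor([0.8, 0.2, 0.6, 0.4])
target = torch.tensor([1, 0, 1, 1])

# F1 of the positive class only, with no averaging over classes
print(binary_f1_score(preds, target))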

manujosephv (Owner, Author) commented:
torchmetrics has updated its interfaces a bit and created separate functions like binary_f1_score. PyTorchTabular hasn't been updated to use these new functions yet. But if I remember correctly, the legacy f1_score still has a task parameter with which you can get the binary (positive-class) score directly, doesn't it?
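A minimal sketch of that, assuming torchmetrics >= 0.11, where the legacy entry point dispatches on task:

import torch
from torchmetrics.functional import f1_score
from torchmetrics.functional.classification import binary_f1_score

# Placeholder data for illustration
preds = torch.tensor([0.8, 0.2, 0.6, 0.4])
target = torch.tensor([1, 0, 1, 1])

# In torchmetrics >= 0.11 the legacy f1_score dispatches on `task`,
# so task="binary" is equivalent to calling binary_f1_score directly.
assert f1_score(preds, target, task="binary") == binary_f1_score(preds, target)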
