[Needs reproducing] Handle one-dimensional embeddings for clustering #211

AndreiMoraru123

Hi folks

I was trying out Brain on a dataset at work and noticed that passing a roi_field (the detection labels) to a method that relies on clustering raises this error:

fob.compute_representativeness(dataset, roi_field="ground_truth")
ValueError: Expected 2D array, got 1D array instead:
array=[0.06107287 0.06011039 0.06012045 0.05879862 0.05759485 0.05654685
0.05719245 0.0446697  0.04646276 0.04628405 0.04667709 0.04758289
0.04678112 0.04675514 0.04684635 0.04685971 0.04689653 0.04614001
0.04592452 0.04747253 0.04706833 0.04716367 0.04662085 0.04660043
0.04788357 0.03245687 0.04705974 0.04701892 0.04882907 0.05358888
0.05417315 0.05604687 0.05566651 0.04973941 0.04815942 0.04763582
0.04768014 0.0472154  0.04717452 0.04974527 0.04951364 0.04999169
0.04736153 0.04292721 0.03433677 0.03456343 0.04151432 0.03937531
0.04073388 0.04284388].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

This was computed on a slice of 50 dataset samples, so we are in the first case of the error message: a single feature per sample, which calls for array.reshape(-1, 1).
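The reshape suggested by the error message can be sketched as follows. This is only an illustration of the idea: `ensure_2d` is a hypothetical helper, not a function in fiftyone-brain, and where such a check would belong inside Brain is an assumption that this thread does not settle.

```python
import numpy as np


def ensure_2d(embeddings):
    """Return embeddings as a 2D (n_samples, n_features) array.

    Hypothetical helper for illustration only; not part of fiftyone-brain.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    if embeddings.ndim == 1:
        # One scalar per sample: treat it as a single feature column,
        # i.e. the array.reshape(-1, 1) case from the error message
        embeddings = embeddings.reshape(-1, 1)
    if embeddings.ndim != 2:
        raise ValueError(f"Expected 1D or 2D embeddings, got {embeddings.ndim}D")
    return embeddings


# Three of the values from the traceback above, one scalar per sample
flat = np.array([0.06107287, 0.06011039, 0.06012045])
fixed = ensure_2d(flat)
print(flat.shape, "->", fixed.shape)  # (3,) -> (3, 1)
```

Arrays that are already 2D pass through unchanged, so a guard like this would only affect the one-dimensional case reported here.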

I also saw this for uniqueness with the Brain version we use (0.16). After cloning the latest version locally, I see that uniqueness no longer works the same way, so I did not test it there. Other methods that rely on clustering may be affected as well.

I can't provide the dataset and did not test this on a zoo dataset.

I will leave this as a draft until I know better, but you can let me know if you'd rather have this as an issue.

@brimoor brimoor changed the title Handle one-dimensional embeddings for clustering [Needs reproducing] Handle one-dimensional embeddings for clustering Nov 29, 2024
@brimoor
Contributor

brimoor commented Nov 29, 2024

Hi @AndreiMoraru123 👋

I'm not able to reproduce your error. The following works as expected when running fiftyone-brain==0.17.0 for me. Does it work for you?

import fiftyone as fo
import fiftyone.brain as fob
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")
dataset.delete_sample_field("uniqueness")

# Test with 50% missing ROI fields (whole images are used instead in these cases)
dataset.take(100).clear_sample_field("ground_truth")

fob.compute_uniqueness(dataset, roi_field="ground_truth")
fob.compute_representativeness(dataset, roi_field="ground_truth")

@AndreiMoraru123
Author

You are right, this works with a dataset from the zoo, and I can't reproduce the error there either. However, I did run into it when using a fo.Dataset with fo.Detections.

That run used Brain 0.16, but the representativeness API looks the same. I can't run Colab notebooks with different fiftyone versions, as they apparently have a problem with pymongo. I'll see if I can test it some other way.

@brimoor
Contributor

brimoor commented Dec 3, 2024

Converted to issue: #215

@brimoor brimoor closed this Dec 3, 2024