oops! synthetic data contains missing values #17
Comments
Same here |
Thank you for using SMOGN. Have you ensured that you have a valid pandas dataframe with no missing values? |
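For anyone wanting to run that check before calling smogn.smoter, here is a minimal sketch. The helper name `report_bad_values` is made up for illustration and is not part of SMOGN. Counting infinite values is an extra assumption on my part: `isna()` does not flag them, so they are easy to overlook, but nothing in this thread confirms they trigger the error.

```python
import numpy as np
import pandas as pd


def report_bad_values(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: per-column counts of missing and infinite entries.

    Returns only the columns that have at least one such entry, so an
    empty result means the frame looks clean by this check.
    """
    # NaN / None / NaT in any column
    n_missing = df.isna().sum()

    # Infinite values only make sense for numeric columns; others count as 0
    numeric = df.select_dtypes(include=[np.number])
    inf_counts = np.isinf(numeric.to_numpy(dtype=float)).sum(axis=0)
    n_inf = pd.Series(inf_counts, index=numeric.columns).reindex(
        df.columns, fill_value=0
    )

    report = pd.DataFrame({"n_missing": n_missing, "n_inf": n_inf})
    return report[(report["n_missing"] > 0) | (report["n_inf"] > 0)]
```

If `report_bad_values(foo)` comes back empty and the error still appears, the input frame is probably not the culprit, which matches what several people report here.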
Great idea implementing this, thanks. |
This only occurs for me in a specific circumstance: when N is 1000 I get the error, but when N = 1001 it works. The value of N that triggers it depends on the choice of parameters. |
I'm still not sure what causes it. It seems to happen intermittently when I switch samp_method from 'extreme' to 'balanced', but it looks random (my dataset has ~300 columns and ~150 rows, with no missing entries). I've run the exact same code twice and had it fail one time and succeed the other.

Here's a bit of an ugly workaround. Since the failure looks random (I've never seen it happen more than twice in a row), you could wrap the smoter call in a while loop like this:

    import smogn

    n_tries = 0
    done = False
    while not done:
        try:
            # keep the result; foo is the input DataFrame, "bar" the target column
            result = smogn.smoter(data=foo, y="bar")
            done = True
        except ValueError:
            # allow up to 5 retries, then re-raise the last error
            if n_tries < 5:
                n_tries += 1
            else:
                raise

That said, keep in mind that this will still fail if there really is something wrong with the dataset you're feeding smogn. This only helps when things fail seemingly at random. |
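The same retry idea can be packaged as a small reusable function, sketched below. `retry_on_value_error` and its `max_retries` parameter are hypothetical names, not part of SMOGN. The sketch assumes, as the workaround above does, that the failure surfaces as a `ValueError`.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_value_error(fn: Callable[[], T], max_retries: int = 5) -> T:
    """Call fn() until it succeeds; re-raise once max_retries retries have failed."""
    n_tries = 0
    while True:
        try:
            return fn()
        except ValueError:
            # Out of retries: let the most recent error propagate
            if n_tries >= max_retries:
                raise
            n_tries += 1


# Usage with the call from the workaround (foo and "bar" are placeholders):
# result = retry_on_value_error(lambda: smogn.smoter(data=foo, y="bar"))
```

Passing the call as a zero-argument function keeps the helper independent of smogn, and returning the result avoids losing the resampled DataFrame.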
I encountered this error too, even though I had verified that the dataframe passed to smogn.smoter had no missing values, so I'm not sure why it appears. Is there any way to get past the error? Also, when the data has more than 2000 rows, the distmatrix step is very slow. Is there any way to speed things up? |
I don't know why this error is happening. How can I avoid it?