Sparse to Dense #79

Open
brett1479 opened this issue Apr 21, 2021 · 2 comments · May be fixed by #81

@brett1479

In choice_calcs.py line 930, it appears the library is calling rows_to_obs.toarray() which converts a sparse array to a potentially huge dense array (for my use case, the resulting sparse array is small, and the dense version is about 200 GiB).
Here is the existing code:

weights_per_obs =\
        np.max(rows_to_obs.toarray() * weights[:, None], axis=0)

Is the intended behavior given below?

M = rows_to_obs.multiply(weights.reshape(-1,1))
weights_per_obs = np.max(M, axis=0).toarray().reshape(-1)

If so, I think this is a simple fix.
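
A minimal sketch for checking that the two expressions agree, without touching pylogit itself. The toy mapping matrix and weight values below are made up for illustration, and the sparse version calls the matrix's own `.max(axis=0)` method rather than `np.max`, which is an assumption about how one might spell the same reduction:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy mapping matrix (hypothetical data): 5 long-format rows x 2 observations.
# rows_to_obs[i, j] == 1 when long-format row i belongs to observation j.
row_idx = np.arange(5)
obs_idx = np.array([0, 0, 0, 1, 1])
rows_to_obs = csr_matrix((np.ones(5), (row_idx, obs_idx)), shape=(5, 2))

# One weight per long-format row (hypothetical values).
weights = np.array([2.0, 2.0, 2.0, 0.5, 0.5])

# Existing approach: densify first, then broadcast and reduce over rows.
dense_result = np.max(rows_to_obs.toarray() * weights[:, None], axis=0)

# Proposed approach: scale each row while staying sparse, then reduce.
scaled = rows_to_obs.multiply(weights.reshape(-1, 1))
sparse_result = scaled.tocsc().max(axis=0).toarray().reshape(-1)

# Both should give one weight per observation.
assert np.allclose(dense_result, sparse_result)
```

Only the final `(1, n_obs)` result is densified here, so peak memory should scale with the number of stored entries rather than with rows times observations. Both versions treat the implicit zeros as candidates for the maximum, so they should also agree when some weights are negative.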

@mathijsvdv

I have run into the exact same issue in my usage of Pylogit and had to work around it with a similar rewrite. I like your simple fix: it's very concise and will preserve the sparse matrix structure.

Perhaps you can create a pull request that implements this exact fix and have Timothy review it. Make sure to also include the same fix in nested_choice_calcs.py on lines 569 and 746.

@mathijsvdv mathijsvdv linked a pull request Jul 20, 2021 that will close this issue
@timothyb0912 timothyb0912 linked a pull request Aug 17, 2021 that will close this issue
@friedertheo

Hi @brett1479 and @mathijsvdv,

Does a similar trick also work for the mixed logit calculation? Because when estimating it, the dense array becomes super large. Using xlogit (or Stata's cmxtmixlogit) needs 1/20 of the memory pylogit requires. However, I need pylogit's feature for constrained optimization.

Any help is very much appreciated!
