Numba typing errors #15

Open
Drzwioddomu opened this issue Nov 10, 2022 · 0 comments

Hi!

I'm trying to reproduce the experiment in your README, but I keep getting Numba typing errors that are not very descriptive.

My code:

from gefs import RandomForest
from experiments.prep import get_data, train_test_split

data, ncat = get_data('wine')
X_train, X_test, y_train, y_test, data_train, data_test = train_test_split(data, ncat)
rf = RandomForest(n_estimators=30, ncat=ncat)
rf.fit(X_train, y_train)
gef = rf.topc()

Traceback:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "***/test_gefs.py", line 7, in <module>
    rf.fit(X_train, y_train)
  File "***/gefs/trees.py", line 533, in fit
    self.estimators = build_forest(X, y, self.n_estimators, self.bootstrap,
  File "/opt/conda/lib/python3.9/site-packages/numba/core/dispatcher.py", line 468, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/opt/conda/lib/python3.9/site-packages/numba/core/dispatcher.py", line 409, in error_rewrite
    raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
- Resolution failure for literal arguments:
Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in method choice of numpy.random.mtrand.RandomState object at 0x7f2c36da0940>) found for signature:

 >>> choice(array(int64, 1d, C), OptionalType(int64), replace=Literal[bool](False))

There are 2 candidate implementations:
  - Of which 2 did not match due to:
  Overload in function 'choice': File: numba/cpython/randomimpl.py: Line 1360.
    With argument(s): '(array(int64, 1d, C), OptionalType(int64), replace=bool)':
   Rejected as the implementation raised a specific error:
     TypingError: Failed in nopython mode pipeline (step: nopython frontend)
   No implementation of function Function(<built-in function empty>) found for signature:

    >>> empty(OptionalType(int64), class(int64))

   There are 2 candidate implementations:
         - Of which 2 did not match due to:
         Overload in function 'ol_np_empty': File: numba/np/arrayobj.py: Line 4086.
           With argument(s): '(OptionalType(int64), class(int64))':
          Rejected as the implementation raised a specific error:
            TypingError: Cannot parse input types to function np.empty(OptionalType(int64), class(int64))
     raised from /opt/conda/lib/python3.9/site-packages/numba/np/arrayobj.py:4105
   
   During: resolving callee type: Function(<built-in function empty>)
   During: typing of call at /opt/conda/lib/python3.9/site-packages/numba/cpython/randomimpl.py (1417)
   
   
   File "../../../../../../opt/conda/lib/python3.9/site-packages/numba/cpython/randomimpl.py", line 1417:
           def choice_impl(a, size=None, replace=True):
               <source elided>
               if replace:
                   out = np.empty(size, dtype)
                   ^

  raised from /opt/conda/lib/python3.9/site-packages/numba/core/typeinfer.py:1086

During: resolving callee type: Function(<built-in method choice of numpy.random.mtrand.RandomState object at 0x7f2c36da0940>)
During: typing of call at ***/gefs/split.py (145)


File "gefs/split.py", line 145:
def find_best_split(node, tree, random_state):
    <source elided>
    np.random.seed(random_state)
    vars = np.random.choice(np.arange(tree.X.shape[1]), tree.max_features, replace=False)
    ^

During: resolving callee type: type(CPUDispatcher(<function find_best_split at 0x7f2b8ca93ee0>))
During: typing of call at ***/gefs/trees.py (132)

During: resolving callee type: type(CPUDispatcher(<function find_best_split at 0x7f2b8ca93ee0>))
During: typing of call at ***/gefs/trees.py (132)

During: resolving callee type: type(CPUDispatcher(<function find_best_split at 0x7f2b8ca93ee0>))
During: typing of call at ***/gefs/trees.py (132)


File "gefs/trees.py", line 132:
def build_tree(tree, parent, counts, ordered_ids):
    <source elided>
        node = queue.pop(0)
        split = find_best_split(node, tree, np.random.randint(1e6))
        ^

During: resolving callee type: type(CPUDispatcher(<function build_tree at 0x7f2b8ca9f700>))
During: typing of call at ***/gefs/trees.py (465)

During: resolving callee type: type(CPUDispatcher(<function build_tree at 0x7f2b8ca9f700>))
During: typing of call at ***/gefs/trees.py (465)


File "gefs/trees.py", line 465:
    def fit(self, X, y):
        <source elided>
        ordered_ids = np.arange(X.shape[0], dtype=np.int64)
        self.root, self.n_nodes = build_tree(self, None, counts, ordered_ids)
        ^

- Resolution failure for non-literal arguments:
None

During: resolving callee type: BoundFunction((<class 'numba.core.types.misc.ClassInstanceType'>, 'fit') for instance.jitclass.Tree#7f2b8caad490<X:OptionalType(array(float64, 2d, A)),y:OptionalType(array(int64, 1d, A)),ncat:OptionalType(array(int64, 1d, A)),scope:OptionalType(array(int64, 1d, A)),imp_measure:unicode_type,min_samples_leaf:int64,min_samples_split:int64,n_classes:int64,max_features:OptionalType(int64),n_nodes:int64,root:instance.jitclass.TreeNode#7f2b8caa6b80<id:int64,counts:array(int64, 1d, A),idx:array(int64, 1d, A),split:OptionalType(instance.jitclass.Split#7f2b8ca89bb0<score:float64,var:int64,threshold:array(float64, 1d, A),surr_var:array(int64, 1d, A),surr_thr:array(float64, 1d, A),surr_go_left:array(bool, 1d, A),surr_blind:bool,left_ids:array(int64, 1d, A),right_ids:array(int64, 1d, A),left_counts:array(int64, 1d, A),right_counts:array(int64, 1d, A),type:unicode_type>),parent:OptionalType(DeferredType#139825020508336),left_child:OptionalType(DeferredType#139825020508336),right_child:OptionalType(DeferredType#139825020508336),isleaf:OptionalType(bool),depth:int16>,depth:int16,max_depth:int64,surrogate:bool,random_state:int64>)
During: typing of call at ***/gefs/trees.py (179)


File "gefs/trees.py", line 179:
def build_forest(X, y, n_estimators, bootstrap, ncat, imp_measure,
    <source elided>
                                               estimators[i].random_state)
            estimators[i].fit(Xtree_, ytree_)

My guess is that this happens because some dependencies have been updated. I'm running the code in a conda environment with the following versions installed:

numba                     0.56.3
numpy                     1.22.3 
pandas                    1.4.2
scipy                     1.9.0
sklearn                   1.1.2
tqdm                      4.64.0 
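
A listing like the one above can also be collected programmatically, which may make it easier to compare environments. The following is only a minimal sketch using the standard-library `importlib.metadata` module (Python 3.8+). The helper names `installed_versions` and `format_versions` are hypothetical, and it assumes the `sklearn` entry corresponds to the `scikit-learn` distribution name.

```python
from importlib import metadata

# Distribution names as published on PyPI. Assumption: the "sklearn" entry
# in the list above refers to the "scikit-learn" distribution.
PACKAGES = ["numba", "numpy", "pandas", "scipy", "scikit-learn", "tqdm"]


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def format_versions(versions):
    """Format the mapping as aligned 'name version' lines."""
    return "\n".join(
        f"{name:<25} {version if version is not None else 'not installed'}"
        for name, version in versions.items()
    )


if __name__ == "__main__":
    print(format_versions(installed_versions(PACKAGES)))
```

Running the script inside the active conda environment prints one line per package, with `not installed` for anything missing.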

Could you possibly upload a solved environment, or a freeze file with the specific package versions that allow the code to run properly?

BR,
Maurycy
