Changed annotation column name/id to one required parameter when loading Raven files into BoxedAnnotations #1057

Closed
wants to merge 239 commits into from

syunkova
Collaborator

  • Instead of separate parameters for string and integer IDs, the annotation column is now specified with a single required parameter. It accepts an integer between 1 and the number of columns in the Raven file, a string matching the annotation column's name, or None. If the parameter is omitted or set to anything else, loading reports an error.
  • Also fixed a dtype issue: when no annotation column is specified, the annotation column is filled entirely with NaN by default. Previously this changed the column's dtype from object to float64; it now stays object, which matters when subsetting the dataframe and working with specific rows or cells.
  • Finally, a minor spelling fix in the testing doc for annotations :)
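
The parameter handling and the dtype fix described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `resolve_annotation_column` and `add_empty_annotation_column` are hypothetical helper names, the column names are made up, and the sketch assumes that integers are 1-based column positions and that invalid values raise an exception.

```python
import numpy as np
import pandas as pd


def resolve_annotation_column(df, annotation_column):
    """Hypothetical helper: map the user-supplied value to a column name, or None."""
    if annotation_column is None:
        return None  # caller explicitly asked for no annotation column
    # bool is a subclass of int, so reject it before the integer check
    if isinstance(annotation_column, bool):
        raise TypeError("annotation_column must be an int, a str, or None")
    if isinstance(annotation_column, int):
        # assumption: integers are 1-based positions, as in "between 1 and n columns"
        if not 1 <= annotation_column <= len(df.columns):
            raise ValueError(f"column index must be between 1 and {len(df.columns)}")
        return df.columns[annotation_column - 1]
    if isinstance(annotation_column, str):
        if annotation_column not in df.columns:
            raise KeyError(f"no column named {annotation_column!r}")
        return annotation_column
    raise TypeError("annotation_column must be an int, a str, or None")


def add_empty_annotation_column(df, name="annotation"):
    """Hypothetical helper: add an all-NaN column while keeping object dtype."""
    out = df.copy()
    # assigning the scalar np.nan directly would create a float64 column instead
    out[name] = pd.Series([np.nan] * len(out), index=out.index, dtype=object)
    return out


# illustrative table standing in for a loaded Raven file
raven = pd.DataFrame({"Begin Time (s)": [0.0, 1.5], "Species": ["a", "b"]})
by_index = resolve_annotation_column(raven, 2)
by_name = resolve_annotation_column(raven, "Species")
unlabeled = add_empty_annotation_column(raven[["Begin Time (s)"]])
```

Here `by_index` and `by_name` both resolve to the `Species` column, and the `annotation` column of `unlabeled` keeps the object dtype even though every value is NaN.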

sammlapp and others added 30 commits June 20, 2023 12:41
can now use BoxedAnnotations() to create object with standard columns in .df and no annotations
provides methods to import crowsetta Annotation, BBox, and Sequence classes (loading as BoxedAnnotations) and to export to Annotation objects, with annotations stored either as .bbox or .sequence.
added to pyproject.toml and also to autodoc_mock_imports to avoid error when building documentation
for crowsetta integration
gives user advice if preprocessor.forward() gets a pd.DataFrame instead of a pd.Series
modifies input validation for SafeAudioDataloader
Windows users are getting errors logging to wandb. We can't reproduce them, but Louis suspected it may be because of " / " in wandb logging keys ending up in file paths. This commit removes slashes, spaces, and other special characters ([]) from wandb.log() string keys
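
A minimal sketch of this kind of key sanitization is shown below. The helper name `sanitize_log_key` is hypothetical, and using an underscore as the replacement is an assumption: the commit message does not say what the characters are replaced with or exactly which characters are covered.

```python
import re


def sanitize_log_key(key: str) -> str:
    """Hypothetical helper: replace characters that could be unsafe in file paths."""
    # collapse each run of characters other than letters, digits, "_", "." and "-"
    # into a single underscore (assumed replacement character)
    cleaned = re.sub(r"[^A-Za-z0-9_.-]+", "_", key)
    # drop underscores left over at either end, e.g. from a trailing "]"
    return cleaned.strip("_")


# illustrative metric names, not taken from the library
metrics = {"train / loss": 0.12, "val/precision[0]": 0.9}
safe_metrics = {sanitize_log_key(k): v for k, v in metrics.items()}
```

Note that two different keys could map to the same sanitized key under this scheme; the sketch does not handle such collisions.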
changes the behavior of extend_to() so that it doesn't trim audio
three changes:
(1) AudioSample.from_series rounds duration by default to avoid floating point precision errors in subtracting end-start
(2) AudioPreprocessor includes trimming and extending audio to expected duration by default
(3) generate_clip_times_df rounds starts before calculating ends, to preserve sample duration as the difference between the quantities
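
The floating-point issue behind changes (1) and (3) can be illustrated with plain Python. The values and the number of decimal places below are illustrative assumptions, not the library's defaults.

```python
# change (1): subtracting end - start can leave a tiny floating-point error
start, end = 0.1, 0.3
raw_duration = end - start           # slightly less than 0.2
duration = round(end - start, 9)     # rounding recovers 0.2

# change (3): round the start first, then derive the end from the rounded start
clip_duration = 3.0
raw_start = 0.1 + 0.2                # slightly more than 0.3
clip_start = round(raw_start, 9)     # rounding recovers 0.3
clip_end = clip_start + clip_duration
```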
resolves check for values other than 0/1 in labels #891

now asserts that label values are >=0 and <=1 during CNN.train() and CNN.eval(). Adds tests for both. Also adds a missing test for input validation check of wrong class list during CNN.train()
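
A range check of this kind could look roughly like the sketch below. `check_label_range` is a hypothetical name, and the sketch works on nested lists rather than the dataframes or tensors the library actually handles.

```python
def check_label_range(labels):
    """Hypothetical check: every label value must lie in the closed interval [0, 1]."""
    for row in labels:
        for value in row:
            assert 0 <= value <= 1, f"label values must be in [0, 1], got {value}"


# hard labels (0/1) and soft labels (e.g. 0.5) both pass
check_label_range([[0, 1, 0.5], [1, 0, 0]])
```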
sammlapp and others added 29 commits September 10, 2024 16:52
resolves code style: call super() without arguments #1013

reduces chances of bugs, improves code style
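
For reference, the style change looks like this in a generic example. The class names `Base` and `Child` are illustrative and do not come from the codebase.

```python
class Base:
    def __init__(self, name):
        self.name = name


class Child(Base):
    def __init__(self, name, extra):
        # Python 3 form; equivalent to super(Child, self).__init__(name),
        # but it cannot go stale if the class is renamed
        super().__init__(name)
        self.extra = extra


child = Child("clip", extra=3)
```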
now from pyplot rather than matplotlib.cm
list methods and editable attributes in SpectrogramClassifier and SpatialEvent class docstrings
 #871 and #854 request this

re-organizes assignment of attributes in SpatialEvent init to group by editable, static, and computed
resolves better Action printing #994
update lock file

resolves update black version? #1002
resolves #502
I reviewed the utils modules to check for docstrings, and decided to import set_seed into the top-level API to make it easier to find
allows google colab preferred version 9.4.0
note that I updated it to install from develop branch, but once we merge develop we'll want it to install the pypi release

fixing 3 package versions to exact matches of colab defaults removes the error/warning requiring the user to restart the runtime
going to close:
SynchronizedRecorderArray.estimate_locations assumes names for the levels of the multi-index of the detections dataframe #712
because this assumption is well-documented in docstrings
resolves colab notebooks should always use num_workers=0 #986
resolves Docs update: how to use embeddings #969 by adding embedding example

also provides model zoo examples
was still getting mps error after I thought it was fixed
during refactor, did not pass these (incl batch_size and num_workers) to the method run_validation, resulting in use of batch_size=1 and num_workers=0 during validations in SpectrogramClassifier.train() loop
Merge branch 'develop' into 936_annotation_column_name
@syunkova syunkova closed this Sep 17, 2024
4 participants