Changed annotation column name/id to one required parameter when loading Raven files into BoxedAnnotations #1057

syunkova · 2024-09-17T22:45:34Z

Instead of separate parameters for string and integer IDs, the annotation column is now specified with a single mandatory parameter that accepts: an integer between 1 and the number of columns in the Raven file; a string matching the annotation column name; or None. If the parameter is omitted or set to anything else, an error is raised.
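A minimal sketch of the selection rule described above, not the code from this PR. The helper name `resolve_annotation_column` and its arguments are hypothetical, and raising `ValueError` for an out-of-range integer or an unmatched name is an assumption about how invalid values are rejected.

```python
def resolve_annotation_column(columns, annotation_column):
    """Return the name of the annotation column, or None for no annotations.

    Hypothetical helper: `columns` is the list of column names in one Raven file.
    """
    if annotation_column is None:
        # Explicitly requesting no annotation column
        return None
    # bool is a subclass of int, so reject it before the integer branch
    if isinstance(annotation_column, bool):
        raise ValueError("annotation_column must be an int, a str, or None")
    if isinstance(annotation_column, int):
        # Integers are 1-based: 1 is the first column of the file
        if not 1 <= annotation_column <= len(columns):
            raise ValueError(
                f"annotation_column must be between 1 and {len(columns)}"
            )
        return columns[annotation_column - 1]
    if isinstance(annotation_column, str):
        if annotation_column not in columns:
            raise ValueError(f"no column named {annotation_column!r}")
        return annotation_column
    raise ValueError("annotation_column must be an int, a str, or None")


columns = ["Selection", "Begin Time (s)", "End Time (s)", "Species"]
by_index = resolve_annotation_column(columns, 4)
by_name = resolve_annotation_column(columns, "Species")
```

Both calls select the same column here; passing `None` returns `None` instead of a column name.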
Additionally fixed a dtype issue: when no column is specified, the annotation column is filled entirely with NaN by default. Previously this changed the column type from object to float64; it now remains an object type (significant for subsetting the dataframe and working with specific rows/cells).
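The dtype behavior can be reproduced with plain pandas. This is an illustrative sketch, not the change made in this PR; the dataframe and its column names are made up for the example.

```python
import numpy as np
import pandas as pd

# Made-up table standing in for a loaded annotation file
df = pd.DataFrame({"start_time": [0.0, 1.5], "end_time": [1.0, 2.5]})

# Assigning the scalar np.nan creates a float64 column
df["annotation_float"] = np.nan

# Building the column with an explicit object dtype keeps it an object column
df["annotation"] = pd.Series([np.nan] * len(df), index=df.index, dtype=object)

# Individual cells of the object column can later hold string labels
df.loc[0, "annotation"] = "bird"
```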
Finally, minor spelling fix in testing doc for annotations :)

Windows users are getting errors logging to wandb. We can't reproduce them, but Louis suspected it may be because " / " in wandb logging keys ends up in file paths. This commit removes slashes, spaces, and other special characters ([]) from wandb.log() string keys.
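A rough sketch of this kind of key cleanup, using only the standard library. The function name `sanitize_log_key` is hypothetical, and replacing the offending characters with underscores (rather than deleting them) is an assumption made for readability, not necessarily what the commit does.

```python
import re


def sanitize_log_key(key):
    """Return a logging key without slashes, spaces, or brackets."""
    # Collapse each run of characters other than letters, digits,
    # underscore, dot, or hyphen into a single underscore
    cleaned = re.sub(r"[^A-Za-z0-9_.\-]+", "_", key)
    # Drop underscores left over at either end of the key
    return cleaned.strip("_")


loss_key = sanitize_log_key("train / loss")
sample_key = sanitize_log_key("top samples [robin]")
```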

Three changes: (1) AudioSample.from_series rounds duration by default to avoid floating point precision errors when subtracting end - start; (2) AudioPreprocessor trims and extends audio to the expected duration by default; (3) generate_clip_times_df rounds starts before calculating ends, to preserve the sample duration as the difference between the two quantities.
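The floating point issue behind change (1) can be shown in a few lines. This is an illustration only: `rounded_duration` is a hypothetical helper, and the choice of 10 decimal places is an assumption, not the precision used in the library.

```python
def rounded_duration(start, end, decimals=10):
    """Duration of a clip, rounded to absorb floating point error."""
    return round(end - start, decimals)


start = 0.1
end = start + 0.2  # intended clip duration is 0.2 seconds

raw = end - start  # carries a tiny floating point error
fixed = rounded_duration(start, end)
```

Here `raw` is not exactly equal to `0.2`, while `fixed` is, which is why comparing a sample's duration against an expected duration is safer after rounding.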

sammlapp and others added 30 commits June 20, 2023 12:41

rename "raven_file" to "annotation_file"

dde681b

allow initializing empty BoxedAnnotations object

e54d207

can now use BoxedAnnotations() to create object with standard columns in .df and no annotations

remove outdated part of annotation tutorial

ed23254

add crowsetta integration

00b83c6

provides methods to import crowsetta Annotation, BBox, and Sequence classes (loading as BoxedAnnotations) and to export to Annotation objects, with annotations stored either as .bbox or .sequence.

add to_csv and from_csv for BoxedAnnotations

ea5564b

add crowsetta dependency

c61cede

added to pyproject.toml and also to autodoc_mock_imports to avoid error when building documentation

require python >=3.9

01cc582

for crowsetta integration

add kwargs to Spectrogram.from_audio

ace47bb

resolves #942 helpful message when index not set

d278fe4

Also somewhat related to #865

resolve 803 useful error if preprocessor.forward gets df

9d7dc5e

gives the user advice if preprocessor.forward() gets a pd.DataFrame instead of a pd.Series

resolve #865 helpful error for wrong train_df format

7ddc1ba

modifies input validation for SafeAudioDataloader

handle 0-length samples scenario

212e7d4

fix input validation for SafeAudioDataloader

b89d4aa

resolve 911 change labels of Spectrogram.plot() and add kHz arg

35fd080

fix extend_to resolves #972 and #948

98704d4

changes the behavior of extend_to() so that it doesn't trim audio
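A simplified sketch of the intended behavior, using a plain list of samples rather than the library's audio class. The free function `extend_to` below and its zero-padding at the end are assumptions made for illustration.

```python
def extend_to(samples, sample_rate, duration):
    """Pad samples with zeros up to `duration` seconds; never trim."""
    target_length = int(round(duration * sample_rate))
    if len(samples) >= target_length:
        # Already long enough: return the audio unchanged instead of trimming
        return list(samples)
    padding = [0.0] * (target_length - len(samples))
    return list(samples) + padding


short_clip = extend_to([0.1] * 4, sample_rate=2, duration=3.0)
long_clip = extend_to([0.1] * 8, sample_rate=2, duration=3.0)
```

The short clip is padded to 6 samples, while the long clip keeps all 8 of its samples.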

Solved #855.

1e35014

Black formatted.

7b69fda

Modified empty annotation behavior and changed tests.

d8e6c2d

Black formatted.

506dcd3

bugfix resolves #930

36229f4

Merge branch 'develop' into issue_945_overlap

8364957

Merge branch 'develop' into issue_945_overlap

f94832c

black

a8c32fa

Merge branch 'develop' into issue_749_crowsetta

ab7e754

Cleaned dtypes and assignment function in preprocessor.forward().

c35d712

Black formatted.

556a7ad

Merge branch 'develop' into patch_wandb_windows

b99efac

check for labels outside range [0,1]

59906ac

Resolves "check for values other than 0/1 in labels" #891: now asserts that label values are >=0 and <=1 during CNN.train() and CNN.eval(). Adds tests for both. Also adds a missing test for the input validation check of a wrong class list during CNN.train().
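A minimal sketch of a range check of this kind, written against nested lists so that it needs no dependencies. The function name `assert_labels_in_unit_range` is hypothetical and this is not the check added to the CNN class.

```python
def assert_labels_in_unit_range(label_rows):
    """Assert that every label value lies in the closed interval [0, 1]."""
    for row_index, row in enumerate(label_rows):
        for value in row:
            assert 0 <= value <= 1, (
                f"label value {value} in row {row_index} is outside [0, 1]"
            )


# Soft labels between 0 and 1 pass the check
assert_labels_in_unit_range([[0, 1], [0.5, 0.25]])
```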

sammlapp and others added 29 commits September 10, 2024 16:52

Merge pull request #1053 from kitzeslab/feat_categorical_labels

fdd1266

Feat_categorical_labels

Merge branch 'develop' into issue_942_wrong_index

19b9645

call super() without arguments

ad84d6f

Resolves "code style: call super() without arguments" #1013. Reduces chances of bugs and improves code style.
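For reference, the style in question looks like this. The class names are made up for the example and do not come from the codebase.

```python
class BasePreprocessor:
    def __init__(self, sample_duration):
        self.sample_duration = sample_duration


class SpectrogramPreprocessor(BasePreprocessor):
    def __init__(self, sample_duration, height):
        # Python 3 form: no class name or self is repeated here, so renaming
        # or copying the class cannot leave a stale reference behind
        super().__init__(sample_duration)
        self.height = height


preprocessor = SpectrogramPreprocessor(sample_duration=2.0, height=224)
```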

update import of get_cmap

59fb509

now from pyplot rather than matplotlib.cm

add lightning_logs/ dir to gitignore

9d02de0

list methods & attributes in docstrings

80b4d5d

Lists methods and editable attributes in the SpectrogramClassifier and SpatialEvent class docstrings; #871 and #854 request this. Re-organizes assignment of attributes in SpatialEvent init to group them by editable, static, and computed.

better action printing

7cb3523

resolves "better Action printing" #994

specify black ~24.3

341d077

Updates lock file. Resolves "update black version?" #1002

improve documentation of utils modules

d79c214

Resolves #502. I reviewed the utils modules to check for docstrings, and decided to import set_seed into the top-level API to make it easier to find.

decrease pillow version requirement

1265d2d

allows google colab preferred version 9.4.0

update tutorial notebook pip install

270dcd4

Note that I updated it to install from the develop branch, but once we merge develop we'll want it to install the PyPI release. Fixes 3 package versions to exact matches of the Colab defaults; removes the error/warning requiring the user to restart the runtime.

Merge pull request #1054 from kitzeslab/issue_942_wrong_index

b4aae3e

Fix index checking

add docstring for CategoricalLabels.from_multihot_df

661aa20

Going to close "SynchronizedRecorderArray.estimate_locations assumes names for the levels of the multi-index of the detections dataframe" #712, because this assumption is well-documented in the docstrings.

use num_workers=0 for colab notebooks

4c37f33

resolves "colab notebooks should always use num_workers=0" #986

update predict tutorial

7896c63

Resolves "Docs update: how to use embeddings" #969 by adding an embedding example. Also provides model zoo examples.

use model zoo tagged version

3b89136

update dead links in docs

1961d01

resolves #968

catch any exception when checking channel dimensions

abbc086

convert both targets and labels to float32 for eval()

3118648

was still getting mps error after I thought it was fixed

pass dataloader kwargs to run_validation

7a2ea1b

During the refactor, these kwargs (incl. batch_size and num_workers) were not passed to the method run_validation, resulting in use of batch_size=1 and num_workers=0 during validation in the SpectrogramClassifier.train() loop.

changed annotation column selection format

65d6c48

added black formatting

a348cc6

updating to include newer changes

8625726

Merge branch 'develop' into 936_annotation_column_name

added tests for changes with loading annotations

a936cbb

format test file

8fdd83c

fixed np.nan error when setting whole column to it

261215b

quick fix for testing np nan column

6566aeb

fixed numeric values for annotation column value

2168f70

added black formatting to annotations + tests

b2755f1

syunkova closed this Sep 17, 2024
