Adding utility functions #48

Merged

fabiocat93 merged 10 commits into audio_abstract_dtype from utility_functions

Jun 4, 2024

Collaborator

fabiocat93 commented Jun 3, 2024

I have added utility functions for computing cosine similarity, EER, WER, CCA, CKA, and cross-correlation.
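For readers unfamiliar with the first of these metrics, here is a minimal, dependency-free sketch of cosine similarity. It is not the code from this PR: the function name and the plain-list inputs are assumptions made for illustration, and the actual implementation may well operate on tensors instead.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors.

    Hypothetical helper for illustration; not the senselab function.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # The angle to a zero vector is undefined, so fail loudly.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Parallel vectors score close to 1, orthogonal vectors score 0, and opposite vectors score close to -1.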

fabiocat93 added 5 commits

June 3, 2024 09:42


          adding speech to text evaluation task


          adding cca and cka functions

62a88aa


          adding cosine similarity function

97e6102


          adding cross correlation

31466dd


          adding eer function

fabiocat93 requested a review from wilke0818

June 3, 2024 17:28


          fixing spell issue

710e2d1

codecov-commenter commented Jun 3, 2024 (edited)

Codecov Report

Attention: Patch coverage is 94.76744% with 18 lines in your changes missing coverage. Please review.

Project coverage is 59.38%. Comparing base (3e2e6b6) to head (66ff00b).

Files	Patch %	Lines
...lab/audio/tasks/speech_to_text_evaluation_pydra.py	0.00%	7 Missing ⚠️
src/senselab/audio/tasks/preprocessing_pydra.py	0.00%	4 Missing ⚠️
src/senselab/audio/tasks/preprocessing.py	93.18%	3 Missing ⚠️
src/senselab/utils/tasks/cca_cka.py	95.55%	2 Missing ⚠️
src/senselab/utils/tasks/cosine_similarity.py	90.90%	1 Missing ⚠️
src/tests/audio/tasks/preprocessing_test.py	98.11%	1 Missing ⚠️

Additional details and impacted files

@@                    Coverage Diff                    @@
##           audio_abstract_dtype      #48       +/-   ##
=========================================================
+ Coverage                 41.62%   59.38%   +17.75%     
=========================================================
  Files                        25       36       +11     
  Lines                       627      943      +316     
=========================================================
+ Hits                        261      560      +299     
- Misses                      366      383       +17



          fixing typing issue

wilke0818 approved these changes

Collaborator

wilke0818 left a comment

Looks mostly good to me. I have two suggestions that might be helpful in the future and could be minor tasks for others later:

In speech_to_text_evaluation, I think it would be good to give more concrete definitions of what some of these metrics mean, or to reference the jiwer docs, since I haven't heard of (or am not sure of the definitions of) WIP or WIL, for example.
In cca_cka, we should create an enum for the allowed kernel types, as a better software practice.
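As a rough illustration of the definitions asked about above, the sketch below computes WER, WIP (word information preserved), and WIL (word information lost) from alignment counts, using the formulas as I understand them from the jiwer documentation. The class and function names are hypothetical and are not part of this PR or of jiwer; only the formulas are meant to carry over.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignmentCounts:
    """Word-level alignment counts between a reference and a hypothesis."""

    hits: int
    substitutions: int
    deletions: int
    insertions: int

    @property
    def reference_length(self) -> int:
        # Every reference word is a hit, a substitution, or a deletion.
        return self.hits + self.substitutions + self.deletions

    @property
    def hypothesis_length(self) -> int:
        # Every hypothesis word is a hit, a substitution, or an insertion.
        return self.hits + self.substitutions + self.insertions


def word_error_rate(c: AlignmentCounts) -> float:
    """WER = (S + D + I) / N_ref."""
    if c.reference_length == 0:
        raise ValueError("reference must contain at least one word")
    return (c.substitutions + c.deletions + c.insertions) / c.reference_length


def word_information_preserved(c: AlignmentCounts) -> float:
    """WIP = (H / N_ref) * (H / N_hyp); taken as 0 for an empty hypothesis."""
    if c.reference_length == 0:
        raise ValueError("reference must contain at least one word")
    if c.hypothesis_length == 0:
        return 0.0
    return (c.hits / c.reference_length) * (c.hits / c.hypothesis_length)


def word_information_lost(c: AlignmentCounts) -> float:
    """WIL = 1 - WIP."""
    return 1.0 - word_information_preserved(c)
```

For example, aligning the reference "the cat sat on the mat" with the hypothesis "the cat sit on mat" gives 4 hits, 1 substitution, 1 deletion, and 0 insertions, so WER is 2/6, WIP is (4/6) * (4/5), and WIL is one minus that.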

fabiocat93 added 2 commits

June 3, 2024 18:47


          adding preprocessing functions

a74517e


          treating cka kernels with enum

66ff00b

Collaborator Author

fabiocat93 commented Jun 3, 2024

Thank you, @wilke0818, for your feedback. For now, I have created an enum for the kernel types. I will address the documentation issues later, when I can give them more time, if that sounds reasonable.
I have also added some more functionality for audio pre-processing. Please feel free to review the code whenever you have time.
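A minimal sketch of what an enum for kernel types can look like is shown below. The names `CKAKernelType` and `parse_kernel`, and the two members (linear and RBF, which are common kernel choices for CKA), are assumptions for illustration; the enum actually added in commit 66ff00b may differ.

```python
from enum import Enum


class CKAKernelType(Enum):
    """Hypothetical set of allowed CKA kernels (names are assumptions)."""

    LINEAR = "linear"
    RBF = "rbf"


def parse_kernel(name: str) -> CKAKernelType:
    """Convert a user-supplied string into a kernel type, rejecting typos."""
    normalized = name.strip().lower()
    try:
        # Looking up an Enum by value raises ValueError for unknown values.
        return CKAKernelType(normalized)
    except ValueError:
        allowed = ", ".join(k.value for k in CKAKernelType)
        raise ValueError(f"unknown kernel {name!r}; allowed: {allowed}") from None
```

The benefit over bare strings is that a misspelled kernel name fails immediately with a clear message, and type checkers can verify call sites that pass `CKAKernelType` members.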

wilke0818 reviewed

src/senselab/audio/tasks/preprocessing.py Outdated

+                  """
+                  down_mixed_audios = []
+                  for audio in audios:
+                      if audio.waveform.dim() != 2 or audio.waveform.size(0) < 1:

Collaborator

wilke0818 Jun 4, 2024

Not sure if this is needed since Pydantic should be handling this?

Collaborator Author

fabiocat93 Jun 4, 2024

fair. i have removed this

wilke0818 reviewed

src/senselab/audio/tasks/preprocessing.py Outdated

+                      if audio.waveform.dim() != 2 or audio.waveform.size(0) < 1:
+                          raise ValueError("waveform should have shape (num_channels, num_samples)")
+                      down_mixed_audio = audio.copy()

Collaborator

wilke0818 Jun 4, 2024

I think we need to change this to audio.model_copy(deep=True), but I'm curious whether that is better than just creating a new Audio instance.

Collaborator Author

fabiocat93 Jun 4, 2024

I am not sure there is a big difference here, but I can easily switch to creating a new Audio object so that we are consistent in coding style and it is clearer how we create a new entity/item.

src/senselab/audio/tasks/preprocessing.py Outdated

+                          raise ValueError("waveform should have shape (num_channels, num_samples)")
+                      down_mixed_audio = audio.copy()
+                      down_mixed_audio.waveform = audio.waveform.mean(dim=0)

Collaborator

wilke0818 Jun 4, 2024

I worry slightly about this, though maybe your test cases cover it: does this guarantee that waveform has shape (1, num_samples)? I'm not 100% sure that the field_validator runs in this case; if it does, this is fine. If it doesn't, use audio.waveform.mean(dim=0, keepdim=True).

Collaborator Author

fabiocat93 Jun 4, 2024

it was managed anyway, but i have now added keepdim=True to make it clearer
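For context on the shape question: in PyTorch, `waveform.mean(dim=0)` drops the channel axis and returns shape (num_samples,), whereas `keepdim=True` keeps it and returns (1, num_samples). The sketch below mimics the shape-preserving behavior with nested lists in place of tensors so that it runs without dependencies. The function name `downmix_to_mono` is hypothetical and this is not the code from the PR.

```python
from typing import List

# A waveform as nested lists with shape (num_channels, num_samples).
Waveform = List[List[float]]


def downmix_to_mono(waveform: Waveform) -> Waveform:
    """Average all channels and keep the channel axis: (1, num_samples)."""
    if not waveform or not waveform[0]:
        raise ValueError("waveform should have shape (num_channels, num_samples)")
    num_channels = len(waveform)
    num_samples = len(waveform[0])
    if any(len(channel) != num_samples for channel in waveform):
        raise ValueError("all channels must have the same number of samples")
    mono = [
        sum(channel[i] for channel in waveform) / num_channels
        for i in range(num_samples)
    ]
    # Wrapping in an outer list plays the role of keepdim=True.
    return [mono]
```

Returning `[mono]` rather than `mono` means downstream code can keep assuming the (num_channels, num_samples) layout, with num_channels equal to 1.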

src/senselab/audio/tasks/preprocessing.py

+                                               Shape: (num_channels, num_samples).
+                  Returns:
+                      List[Audio]: The list of audio objects with a mono waveform averaged from all channels. Shape: (num_samples).

Collaborator

wilke0818 Jun 4, 2024

I guess this gets into a later comment I made, but at least in the Audio class as of right now, we standardize the waveform field to be (num_channels, num_samples), so perhaps, for consistency, we should keep it as (1, num_samples)?

src/senselab/audio/tasks/preprocessing.py

		return down_mixed_audios


		def select_channel_from_audios(audios: List[Audio], channel_index: int) -> List[Audio]:

Collaborator

wilke0818 Jun 4, 2024

basically see all of my comments for the above function

src/senselab/utils/tasks/eer.py Outdated

Collaborator

wilke0818 Jun 4, 2024

We've broken up a lot of the tasks into individual files (i.e., individual modules), and I'm wondering if we should consolidate a bit to make them easier to use. I'm not super opinionated on this, though.

Collaborator Author

fabiocat93 Jun 4, 2024

This is a good point. I think we can proceed like this for the first release and do some restructuring next week.


          fixing style issues

3ba1620

This was linked to issues Jun 4, 2024

Task: Preprocessing #24

Closed

Task: Utilities #30

Closed

fabiocat93 added the enhancement label

fabiocat93 self-assigned this

fabiocat93 merged commit 66c646b into audio_abstract_dtype

4 checks passed
