-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Release 0.1.2 #47
Release 0.1.2 #47
Conversation
wilke0818
commented
Jun 2, 2024
- Created Audio Pydantic model class for storing an Audio representation that can support mono and stereo audio files (Task: Internal architecture (data structures) #42)
- Audio class offers different formats for creating instances and importantly maintains a consistent internal representation of the Audio as a torch.Tensor of shape (num_channels, num_samples)
- Audios maintain, but currently do not utilize metadata information
- Created AudioDataset class to help manage large number of Audios and importantly to offer functionality for running Audio tasks and pipelines with Pydra in an efficient manner
- Rewrote data_augmentations and preprocessing to show use cases and code simplification that these abstract data types will allow (Task: data augmentation. #43)
…r prepping a set of Audios for a Pydra task
…ntations to show their use
…selab into audio_abstract_dtype
thank you, @wilke0818; this is great! I have made some changes to align this branch with Some suggestions:
|
Adding utility functions
def augment_hf_dataset( | ||
dataset: Dict[str, Any], augmentation: Compose, audio_column: str = "audio" | ||
) -> Dict[str, Any]: | ||
def augment_audio_dataset( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we say "audios" instead of "audio_dataset" since it's a list of audios for how it is rn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, not urgent, for the very good documentation that we will write, it's a good idea to include some suggestions for parameters to use for the different types of augmentations
""" | ||
augmentation.output_type = "dict" | ||
new_audios = [] | ||
device_type, dtype = _select_device_and_dtype( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do you manage the scenario when the developer wants to use a device which is not supported? this is the question you asked me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont
As a general comment, we have many |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in general, if we create some workflows, it makes sense having a _pydra version of the scripts. Otherwise, we can probably remove marking functions as pydra tasks or we can do it in the same .py file. will leave this here for our discussion later today @wilke0818
from senselab.utils.data_structures.video import Video | ||
|
||
|
||
class SenselabDataset(BaseModel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as we said, we need to merge our Dataset classes @wilke0818
""" | ||
pass | ||
|
||
def create_audio_split_for_pydra_task(self, batch_size: int = 1) -> List[List[Audio]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we also need a method for combining results together?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a good point. Worth considering how useful we think this Pydra split functionality is
(e.g. participant demographics, video settings, location information) | ||
""" | ||
|
||
frames: torch.Tensor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we validate the shape of this torch.Tensor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we can, I think it should be 4 dimensions?
MPS: str = "mps" | ||
|
||
|
||
def _select_device_and_dtype( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we maybe have 3 params?
- requested_device (optional)
- available_devices (we can compute this)
- compatible_devices (this depends on the models)
in this way, we can more easily handle when the user asks for a device that is not available or compatible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure.
Audio(waveform=stereo_audio[0].waveform[1], sampling_rate=stereo_audio[0].sampling_rate), | ||
] | ||
).create_audio_split_for_pydra_task(2) | ||
batch_inverted = augment_audio_dataset(batched_audio[0], apply_augmentation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have you tested this code with and without GPU?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, as I mentioned before, I am not sure exactly how we would go about realistically testing on GPU. We could mock it, but that seems like it doesn't really test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will need to be adjusted based on the merge we will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably mostly creation logic and testing. Probably easiest to merge your branch in and just clean it up. Hopefully there'll be minimal conflicts.
adding some more API classes (participant, session, dataset)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in person