
Commit

organizing interface+service abstractions and testing it with input_output tasks
fabiocat93 committed Apr 12, 2024
1 parent d7227da commit d56f9c5
Showing 83 changed files with 2,419 additions and 873 deletions.
101 changes: 0 additions & 101 deletions FEATURES.md

This file was deleted.

38 changes: 38 additions & 0 deletions FEATURES.tmp.md
@@ -0,0 +1,38 @@
# Functionalities
This file exists only to support development.

AUDIO

[TODO]:
- speech to text
  1. to transcribe speech into text
     - INPUT:
       1. a datasets object with the audio recordings in the "audio" column
       2. the audio column (default = "audio")
       3. the speech-to-text service to use (name, version, revision, and, for some services only and sometimes optionally, the language of the transcription model we want to use)
     - PREPROCESSING:
       1. adapt the language to the service format
       2. organize the dataset into batches
     - PROCESSING:
       1. transcribe the dataset
     - POSTPROCESSING:
       1. format the transcripts to follow a standard structure
     - OUTPUT:
       1. a new dataset including only the transcripts of the audios in a standardized json format (plus an index?)
     - TESTS:
       1. test input errors (a field is missing, params are missing) and check that the audio column exists and contains audio objects
       2. test the transcript of a test file is ok
       3. test the language is supported (and the tool handles errors)

  2. to compute word error rate
     - INPUT:
       1. a dataset object with the "transcript" and the "groundtruth" columns
       2. a service with a name (default is jiwer)
     - PROCESSING:
       1. compute the per-row WER between the 2 columns
     - OUTPUT:
       1. a dataset with the "WER" column
     - TESTS:
       1. test input errors (one or more fields are missing, the 2 columns don't contain strings)
       2. test output is ok
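
The word error rate item above can be illustrated with a minimal sketch. It is not the package's implementation: it uses plain Python rows (a list of dicts) instead of a dataset object, computes WER as the word-level edit distance divided by the number of reference words, and does not call any external service. The function names `word_error_rate` and `add_wer_column` are hypothetical.

```python
from typing import Dict, List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        # WER is undefined for an empty reference. As an arbitrary convention,
        # this sketch returns 0.0 if both are empty and 1.0 otherwise.
        return 0.0 if not hyp_words else 1.0

    # previous[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref_words)


def add_wer_column(
    rows: List[Dict[str, object]],
    transcript_column: str = "transcript",
    groundtruth_column: str = "groundtruth",
) -> List[Dict[str, object]]:
    """Return copies of the rows with an extra "WER" entry (hypothetical helper)."""
    scored = []
    for index, row in enumerate(rows):
        for column in (transcript_column, groundtruth_column):
            if column not in row:
                raise KeyError(f"row {index} is missing the '{column}' column")
            if not isinstance(row[column], str):
                raise TypeError(f"row {index}: '{column}' must contain a string")
        wer = word_error_rate(row[groundtruth_column], row[transcript_column])
        scored.append({**row, "WER": wer})
    return scored
```

The two validation branches correspond to the input-error tests listed above (missing fields, non-string columns).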

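The speech-to-text preprocessing and postprocessing steps listed earlier (adapting the language, batching, formatting transcripts as JSON) could look roughly like the sketch below. Everything here is an assumption made for illustration: the language table, the function names, and the JSON fields are hypothetical and do not describe any real service or the package's actual format.

```python
import json
from typing import Iterator, Optional, Sequence

# Hypothetical mapping from user-facing language names to the codes that a
# made-up service expects; real services define their own formats.
LANGUAGE_CODES = {"english": "en", "italian": "it"}


def adapt_language(language: Optional[str]) -> Optional[str]:
    """Convert a language name or code to the (hypothetical) service format."""
    if language is None:
        # The language is optional for some services, so None passes through.
        return None
    key = language.strip().lower()
    if key in LANGUAGE_CODES:
        return LANGUAGE_CODES[key]
    if key in LANGUAGE_CODES.values():
        return key
    raise ValueError(f"unsupported language: {language!r}")


def make_batches(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def format_transcript(index: int, text: str, language: Optional[str]) -> str:
    """Serialize one transcript as JSON with a fixed, illustrative set of keys."""
    record = {"index": index, "text": text.strip(), "language": language}
    return json.dumps(record, sort_keys=True)
```

Raising `ValueError` for an unknown language gives the "language is supported" test something concrete to assert against.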
46 changes: 37 additions & 9 deletions README.md
@@ -10,19 +10,32 @@

[![pages](https://img.shields.io/badge/api-docs-blue)](https://sensein.github.io/pipepal)

Welcome to the ```pipepal``` repo! This is a Python package for doing incredible stuff with speech and voice.
Welcome to the ```pipepal``` repo! This is a Python package for streamlining the processing and analysis of behavioral data, such as voice and speech patterns, with robust and reproducible methodologies.

**Caution:** this package is still under development and may change rapidly over the next few weeks.

## Features
- A few
- Cool
- Things
- These may include a wonderful CLI interface.
- **Modular design**: Utilize a variety of task-specific transformations that can be easily integrated or used standalone, allowing for flexible data manipulation and analysis strategies.

## Installation
Install this package via :
- **Pre-built pipelines**: Access pre-configured pipelines combining multiple transformations tailored for common analysis tasks, which help in reducing setup time and effort.

- **Reproducibility**: Ensures consistent outputs through the use of fixed seeds and version-controlled processing steps, making your results verifiable and easily comparable.

- **Easy integration**: Designed to fit into existing workflows with minimal configuration, `pipepal` can be used alongside other data analysis tools and frameworks seamlessly.

- **Extensible**: Open to modifications and contributions, the package can be expanded with custom transformations and pipelines to meet specific research needs. <u>Do you want to contribute? Please reach out!</u>

- **Comprehensive documentation**: Comes with detailed documentation for all features and modules, including examples and guides on how to extend the package for other types of behavioral data analysis.

- **Performance optimized**: Efficiently processes large datasets with optimized code and algorithms, ensuring quick turnaround times even for complex analyses.

- **Interactive examples**: Includes Jupyter notebooks that provide practical examples of how `pipepal` can be implemented to derive insights from real-world datasets.

Whether you're researching speech disorders, analyzing customer service calls, or studying communication patterns, `pipepal` provides the tools and flexibility needed to extract meaningful conclusions from your data.
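
To make the modular-design and reproducibility points above concrete, here is a minimal sketch of composing standalone transformations into a pipeline with a fixed seed. It does not use the `pipepal` API; `make_pipeline`, `normalize`, and `add_noise` are hypothetical names chosen for illustration.

```python
import random
from typing import Callable, List, Sequence

Transformation = Callable[[List[float]], List[float]]


def make_pipeline(steps: Sequence[Transformation]) -> Transformation:
    """Chain transformations so the output of one step feeds the next."""
    def run(samples: List[float]) -> List[float]:
        for step in steps:
            samples = step(samples)
        return samples
    return run


def normalize(samples: List[float]) -> List[float]:
    """Scale samples so the largest absolute value becomes 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return list(samples)
    return [s / peak for s in samples]


def add_noise(seed: int, scale: float = 0.01) -> Transformation:
    """Build a step that adds uniform noise drawn from a seeded generator."""
    def step(samples: List[float]) -> List[float]:
        # A fresh generator per call keeps repeated runs identical.
        rng = random.Random(seed)
        return [s + rng.uniform(-scale, scale) for s in samples]
    return step


pipeline = make_pipeline([normalize, add_noise(seed=0)])
first_run = pipeline([0.5, -1.0, 0.25])
second_run = pipeline([0.5, -1.0, 0.25])
print(first_run == second_run)  # the seeded step makes both runs match
```

Each step also works on its own, for example `normalize([2.0, -4.0])`, which is the sense in which transformations can be used standalone.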


## Installation
Install this package via:

```sh
pip install pipepal
@@ -42,5 +55,20 @@ hello_world()
```

## To do:
- [ ] A
- [ ] lot
- [ ] Integrating more audio tasks and moving functions from b2aiprep package:
  - [ ] data_augmentation
  - [ ] data_representation
  - [x] example_task
  - [x] input_output
  - [ ] raw_signal_processing
  - [ ] speaker_diarization
  - [ ] speech emotion recognition
  - [ ] speech enhancement
  - [ ] speech_to_text
  - [ ] text_to_speech
  - [ ] voice conversion
- [ ] Integrating more video tasks:
  - [x] input_output

- [ ] Preparing some pipelines with pydra
- [ ] Populating the CLI
Binary file added data_for_testing/audio_48khz_mono_16bits.wav
Binary file not shown.
Binary file added data_for_testing/audio_48khz_stereo_16bits.wav
Binary file not shown.
Binary file not shown.
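
The WAV file names above encode a sample rate, channel count, and bit depth. A test could check those properties with the standard-library `wave` module, as in this sketch. The helper names `describe_wav` and `make_silent_wav` are hypothetical, and the in-memory file stands in for the binary test data, which is not shown here.

```python
import io
import wave
from typing import BinaryIO, Dict, Union


def describe_wav(source: Union[str, BinaryIO]) -> Dict[str, int]:
    """Read the header of a PCM WAV file and return its basic format."""
    with wave.open(source, "rb") as wav_file:
        return {
            "sample_rate_hz": wav_file.getframerate(),
            "channels": wav_file.getnchannels(),
            "bits_per_sample": wav_file.getsampwidth() * 8,
        }


def make_silent_wav(
    sample_rate_hz: int = 48_000,
    channels: int = 1,
    sample_width_bytes: int = 2,
    n_frames: int = 480,
) -> io.BytesIO:
    """Create a short silent WAV in memory, used here only as a stand-in file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width_bytes)
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(b"\x00" * (n_frames * channels * sample_width_bytes))
    buffer.seek(0)
    return buffer


print(describe_wav(make_silent_wav()))
```

Calling `describe_wav("data_for_testing/audio_48khz_mono_16bits.wav")` would be expected to report 48000 Hz, 1 channel, and 16 bits, assuming the file name describes its contents accurately.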
16 changes: 16 additions & 0 deletions data_for_testing/output_dataset/dataset_info.json
@@ -0,0 +1,16 @@
{
  "citation": "",
  "description": "",
  "features": {
    "pokemon": {
      "dtype": "string",
      "_type": "Value"
    },
    "type": {
      "dtype": "string",
      "_type": "Value"
    }
  },
  "homepage": "",
  "license": ""
}
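
This metadata file appears to have the layout that the Hugging Face `datasets` library writes when a dataset is saved to disk. As a small sketch that needs only the standard library, the column names and dtypes can be read from it as follows. The helper name `column_dtypes` is hypothetical, and the JSON is inlined so the example is self-contained.

```python
import json
from typing import Dict

# Contents of dataset_info.json as shown in the diff above.
DATASET_INFO = """
{
  "citation": "",
  "description": "",
  "features": {
    "pokemon": {"dtype": "string", "_type": "Value"},
    "type": {"dtype": "string", "_type": "Value"}
  },
  "homepage": "",
  "license": ""
}
"""


def column_dtypes(info_json: str) -> Dict[str, str]:
    """Map each column listed under "features" to its declared dtype."""
    info = json.loads(info_json)
    return {name: spec["dtype"] for name, spec in info["features"].items()}


print(column_dtypes(DATASET_INFO))  # {'pokemon': 'string', 'type': 'string'}
```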
13 changes: 13 additions & 0 deletions data_for_testing/output_dataset/state.json
@@ -0,0 +1,13 @@
{
  "_data_files": [
    {
      "filename": "data-00000-of-00001.arrow"
    }
  ],
  "_fingerprint": "57821607a631abce",
  "_format_columns": null,
  "_format_kwargs": {},
  "_format_type": null,
  "_output_all_columns": false,
  "_split": null
}
Binary file added data_for_testing/video_48khz_stereo_16bits.mp4
Binary file not shown.