generated from sensein/python-package-template
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
organizing interface+service abstractions and testing it with input_output tasks
1 parent d7227da
commit d56f9c5
Showing 83 changed files with 2,419 additions and 873 deletions.
This file was deleted.
@@ -0,0 +1,38 @@
# Functionalities
This file is here only as a support for development.

AUDIO

[TODO]:
- speech to text
    1. to transcribe speech into text
        - INPUT:
            1. a datasets object with the audio recordings in the "audio" column
            2. the name of the audio column (default = "audio")
            3. the speech to text service to use (its name, version, and revision, plus the language of the transcription model we want to use; only some services take a language, and it is sometimes optional)
        - PREPROCESSING:
            1. adapt the language to the service format
            2. organize the dataset into batches
        - PROCESSING:
            1. transcribe the dataset
        - POSTPROCESSING:
            1. format the transcripts to follow a standard organization
        - OUTPUT:
            1. a new dataset including only the transcripts of the audios in a standardized json format (plus an index?)
        - TESTS:
            1. test input errors (a field is missing, the audio column exists and contains audio objects, params missing)
            2. test that the transcript of a test file is correct
            3. test that the language is supported (and that the tool handles errors)

    2. to compute word error rate
        - INPUT:
            1. a dataset object with the "transcript" and the "groundtruth" columns
            2. a service with a name (default is jitter)
        - PROCESSING:
            1. compute the per-row WER between the 2 columns
        - OUTPUT:
            1. a dataset with the "WER" column
        - TESTS:
            1. test input errors (one or more fields are missing, the 2 columns don't contain strings)
            2. test that the output is correct
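
The INPUT check and the "organize the dataset into batches" step of the transcription task (task 1 above) might look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this commit: the dataset is modeled as a plain list of dicts rather than a datasets object, and the names `validate_rows`, `batched`, and `batch_size` are hypothetical. The language adaptation step and the check that each value really is an audio object are left out, because the spec does not say what those formats are.

```python
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]


def validate_rows(rows: List[Row], audio_column: str = "audio") -> None:
    """Raise if the dataset is empty or a row lacks a usable audio entry.

    Hypothetical helper: rows are plain dicts, not a real datasets object.
    """
    if not rows:
        raise ValueError("dataset is empty")
    for index, row in enumerate(rows):
        if audio_column not in row:
            raise KeyError(f"row {index} has no {audio_column!r} column")
        if row[audio_column] is None:
            raise ValueError(f"row {index} has an empty {audio_column!r} entry")


def batched(rows: List[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield consecutive slices of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]
```

Validating before batching means an input error surfaces before any transcription service is called, which matches the "test input errors" item in the TESTS list.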
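
The per-row WER computation of task 2 above can be sketched without any external service. This is a minimal, dependency-free illustration, not the commit's implementation: the dataset is again a plain list of dicts, the names `word_error_rate` and `add_wer_column` are hypothetical, and WER is taken to be the word-level edit distance divided by the number of reference (groundtruth) words.

```python
from typing import Any, Dict, List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def add_wer_column(
    rows: List[Dict[str, Any]],
    transcript_col: str = "transcript",
    groundtruth_col: str = "groundtruth",
) -> List[Dict[str, Any]]:
    """Return copies of the rows with a "WER" entry added to each."""
    out = []
    for index, row in enumerate(rows):
        for col in (transcript_col, groundtruth_col):
            if col not in row:
                raise KeyError(f"row {index} has no {col!r} column")
            if not isinstance(row[col], str):
                raise TypeError(f"row {index}: {col!r} must be a string")
        wer = word_error_rate(row[groundtruth_col], row[transcript_col])
        out.append({**row, "WER": wer})
    return out
```

The two error branches correspond to the input errors named in the TESTS list: a missing column and a column that does not contain strings.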
Binary file not shown.
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,16 @@
{
  "citation": "",
  "description": "",
  "features": {
    "pokemon": {
      "dtype": "string",
      "_type": "Value"
    },
    "type": {
      "dtype": "string",
      "_type": "Value"
    }
  },
  "homepage": "",
  "license": ""
}
@@ -0,0 +1,13 @@
{
  "_data_files": [
    {
      "filename": "data-00000-of-00001.arrow"
    }
  ],
  "_fingerprint": "57821607a631abce",
  "_format_columns": null,
  "_format_kwargs": {},
  "_format_type": null,
  "_output_all_columns": false,
  "_split": null
}
Binary file not shown.