feat: add FAQ based on faqtory (#1283)

pyannote · Mar 16, 2023 · 39a261a
1 parent b56add2

Showing 12 changed files with 174 additions and 12 deletions.
diff --git a/.faq/FAQ.md b/.faq/FAQ.md
@@ -0,0 +1,20 @@
+
+# Frequently Asked Questions
+
+{%- for question in questions %}
+- [{{ question.title }}](#{{ question.slug }})
+{%- endfor %}
+
+
+{%- for question in questions %}
+
+<a name="{{ question.slug }}"></a>
+## {{ question.title }}
+
+{{ question.body }}
+
+{%- endfor %}
+
+<hr>
+
+Generated by [FAQtory](https://github.com/willmcgugan/faqtory)
diff --git a/.faq/suggest.md b/.faq/suggest.md
@@ -0,0 +1,20 @@
+{%- if questions -%}
+{% if questions|length == 1 %}
+We found the following entry in the [FAQ]({{ faq_url }}) which you may find helpful:
+{%- else %}
+We found the following entries in the [FAQ]({{ faq_url }}) which you may find helpful:
+{%- endif %}
+
+{% for question in questions %}
+- [{{ question.title }}]({{ faq_url }}#{{ question.slug }})
+{%- endfor %}
+
+Feel free to close this issue if you found an answer in the FAQ. Otherwise, please give us a little time to review.
+
+{%- else -%}
+Thank you for your issue. Give us a little time to review it.
+
+PS. You might want to check the [FAQ]({{ faq_url }}) if you haven't done so already.
+{%- endif %}
+
+This is an automated reply, generated by [FAQtory](https://github.com/willmcgugan/faqtory)
diff --git a/.github/workflows/new_issue.yml b/.github/workflows/new_issue.yml
@@ -0,0 +1,27 @@
+name: issues
+on:
+  issues:
+    types: [opened]
+jobs:
+  add-comment:
+    runs-on: ubuntu-latest
+    permissions:
+      issues: write
+    steps:
+      - uses: actions/checkout@v3
+        with:
+          ref: main
+      - name: Install FAQtory
+        run: pip install FAQtory
+      - name: Run Suggest
+        run: faqtory suggest "${{ github.event.issue.title }}" > suggest.md
+      - name: Read suggest.md
+        id: suggest
+        uses: juliangruber/read-file-action@v1
+        with:
+          path: ./suggest.md
+      - name: Suggest FAQ
+        uses: peter-evans/create-or-update-comment@a35cf36e5301d70b76f316e867e7788a55a31dae
+        with:
+          issue-number: ${{ github.event.issue.number }}
+          body: ${{ steps.suggest.outputs.content }}
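Reviewer note: the `Run Suggest` step hands the issue title to `faqtory suggest`, which selects FAQ entries whose `title` or `alt_titles` resemble it. This commit does not show how FAQtory scores a match, so the sketch below is only an illustration. It uses `difflib.SequenceMatcher` from the standard library as a stand-in, and the `QUESTIONS` table, the `similarity` and `suggest` functions, and the 0.6 threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory copy of two question files added in this commit:
# canonical title -> alternative phrasings (alt_titles).
QUESTIONS = {
    "Does pyannote support streaming speaker diarization?": [
        "Is it possible to do realtime speaker diarization?",
        "Can it process online audio buffers?",
    ],
    "How can I improve performance?": [
        "Pretrained pipelines do not produce good results on my data. What can I do?",
        "It does not work! Help me!",
    ],
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (stand-in scoring, not FAQtory's)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest(issue_title: str, threshold: float = 0.6) -> list:
    """Return canonical titles whose title or any alt_title resembles the issue title."""
    matches = []
    for title, alt_titles in QUESTIONS.items():
        candidates = [title, *alt_titles]
        best = max(similarity(issue_title, candidate) for candidate in candidates)
        if best >= threshold:
            matches.append(title)
    return matches
```

Under these assumptions, an issue titled like one of the `alt_titles` selects the matching entry, and the `.faq/suggest.md` template then renders it as a link in the automated comment.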
diff --git a/FAQ.md b/FAQ.md
@@ -1,11 +1,17 @@
-# Frequently asked questions
 
-## How does one capitalize and pronounce the name of this awesome library?
+# Frequently Asked Questions
+- [Can I apply pretrained pipelines on audio already loaded in memory?](#can-i-apply-pretrained-pipelines-on-audio-already-loaded-in-memory)
+- [Can I use gated models (and pipelines) offline?](#can-i-use-gated-models-(and-pipelines)-offline)
+- [Does pyannote support streaming speaker diarization?](#does-pyannote-support-streaming-speaker-diarization)
+- [How can I improve performance?](#how-can-i-improve-performance)
+- [How does one spell and pronounce pyannote.audio?](#how-does-one-spell-and-pronounce-pyannoteaudio)
 
-📝 Written in lower case: `pyannote.audio` (or `pyannote` if you are lazy).  Not `PyAnnote` nor `PyAnnotate` (*sic*).
-📢 [Pronounced](https://www.howtopronounce.com/french/pianote) like the french verb *pianoter*.  *pi* like in **pi**ano, not *py* like in **py**thon.
-🎹 *pianoter* means *to play the piano* (hence the logo 🤯).
+<a name="can-i-apply-pretrained-pipelines-on-audio-already-loaded-in-memory"></a>
+## Can I apply pretrained pipelines on audio already loaded in memory?
 
+Yes: read [this tutorial](tutorials/applying_a_pipeline.ipynb) until the end.
+
+<a name="can-i-use-gated-models-(and-pipelines)-offline"></a>
 ## Can I use gated models (and pipelines) offline?
 
 **Short answer**: yes, see [this tutorial](tutorials/applying_a_model.ipynb) for models and [that one](tutorials/applying_a_pipeline.ipynb) for pipelines.
@@ -16,10 +22,33 @@ For instance, before gating `pyannote/speaker-diarization`, I had no idea that s
 
 That being said, this whole authentication process does not prevent you from using official `pyannote.audio` models offline (i.e. without going through the authentication process in every `docker run ...` or whatever you are using in production): see [this tutorial](tutorials/applying_a_model.ipynb) for models and [that one](tutorials/applying_a_pipeline.ipynb) for pipelines.
 
-## **[Pretrained pipelines](https://huggingface.co/models?other=pyannote-audio-pipeline) do not produce good results on my data. What can I do?**
+<a name="does-pyannote-support-streaming-speaker-diarization"></a>
+## Does pyannote support streaming speaker diarization?
+
+**Short answer:** not out of the box, no.
+
+**Long answer:** [I](https://herve.niderb.fr) am looking for sponsors to add this feature. In the meantime, [`diart`](https://github.com/juanmc2005/StreamingSpeakerDiarization) is the closest you can get to a streaming `pyannote.audio`. You might also be interested in [this blog post](https://herve.niderb.fr/fastpages/2021/08/05/Streaming-voice-activity-detection-with-pyannote.html) about streaming voice activity detection based on `pyannote.audio`.
+
+<a name="how-can-i-improve-performance"></a>
+## How can I improve performance?
+
+**Long answer:**
 
 1. Manually annotate dozens of conversations as precisely as possible.
 2. Separate them into train (80%), development (10%) and test (10%) subsets.
 3. Set up the data for use with [`pyannote.database`](https://github.com/pyannote/pyannote-database#speaker-diarization).
 4. Follow [this recipe](https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/adapting_pretrained_pipeline.ipynb).
 5. Enjoy.
+
+**Also:** [I am available](https://herve.niderb.fr) for contracting to help you with that.
+
+<a name="how-does-one-spell-and-pronounce-pyannoteaudio"></a>
+## How does one spell and pronounce pyannote.audio?
+
+📝 Written in lower case: `pyannote.audio` (or `pyannote` if you are lazy). Not `PyAnnote` nor `PyAnnotate` (sic).
+📢 Pronounced like the French verb `pianoter`. `pi` like in `pi`ano, not `py` like in `py`thon.
+🎹 `pianoter` means to play the piano (hence the logo 🤯).
+
+<hr>
+
+Generated by [FAQtory](https://github.com/willmcgugan/faqtory)
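Reviewer note: the anchors in the generated `FAQ.md` come from `question.slug` in the `.faq/FAQ.md` template. The rule FAQtory applies is not part of this commit. The sketch below is an assumption inferred from the five anchors above (lowercase, spaces become hyphens, `?` and `.` are dropped, parentheses are kept), and `slugify` is a hypothetical name.

```python
def slugify(title: str) -> str:
    """Approximate the anchors seen in the generated FAQ.md (inferred rule)."""
    lowered = title.lower()
    # Drop the punctuation that is absent from the generated anchors.
    for char in "?.":
        lowered = lowered.replace(char, "")
    # Collapse whitespace runs into single hyphens.
    return "-".join(lowered.split())
```

This reproduces, for example, `how-does-one-spell-and-pronounce-pyannoteaudio` from the last question title, where the dot in `pyannote.audio` disappears.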
diff --git a/README.md b/README.md
@@ -60,6 +60,7 @@ pip install pyannote.audio
 ## Documentation
 
 - [Changelog](CHANGELOG.md)
+- [Frequently asked questions](FAQ.md)
 - Models
     - Available tasks explained
     - [Applying a pretrained model](tutorials/applying_a_model.ipynb)
@@ -84,12 +85,6 @@ pip install pyannote.audio
     - [Speaker verification](tutorials/speaker_verification.ipynb)
     - Visualization and debugging
 
-## Frequently asked questions
-
-* [How does one capitalize and pronounce the name of this awesome library?](FAQ.md)
-* [Can I use gated models (and pipelines) offline?](FAQ.md)
-* [Pretrained pipelines do not produce good results on my data. What can I do?](FAQ.md)
-
 ## Benchmark
 
 Out of the box, `pyannote.audio` default speaker diarization [pipeline](https://hf.co/pyannote/speaker-diarization) is expected to be much better (and faster) in v2.x than in v1.1. Those numbers are diarization error rates (in %)

diff --git a/faq.yml b/faq.yml
@@ -0,0 +1,7 @@
+# FAQtory settings
+
+faq_url: "https://github.com/pyannote/pyannote-audio/blob/develop/FAQ.md" # Replace this with the URL to your FAQ.md!
+
+questions_path: "./questions" # Where questions should be stored
+output_path: "./FAQ.md" # Where FAQ.md should be generated
+templates_path: ".faq" # Path to templates
diff --git a/questions/README.md b/questions/README.md
@@ -0,0 +1,6 @@
+
+# Questions
+
+Your questions should go in this directory.
+
+Question files should be named with the extension ".question.md".
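Reviewer note: the question files added below all follow the same layout, a front matter block delimited by `---` lines (holding `title` and `alt_titles`) followed by the Markdown answer. A minimal sketch of locating such files and separating the two parts, using only the standard library, could look like this. The names `find_question_files` and `split_question` are hypothetical, and this is not FAQtory's own loader.

```python
from pathlib import Path
from typing import List, Tuple


def find_question_files(directory: str) -> List[Path]:
    """Return the files in `directory` that end with `.question.md`, sorted by name."""
    return sorted(Path(directory).glob("*.question.md"))


def split_question(text: str) -> Tuple[str, str]:
    """Split a question file into (front matter, body).

    Returns an empty front matter when the file does not start with `---`
    or when the closing `---` line is missing.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text.strip()
    # Look for the line that closes the front matter block.
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            front_matter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:]).strip()
            return front_matter, body
    return "", text.strip()
```

The front matter string would still need a YAML parser to extract `title` and `alt_titles`; that step is left out here to keep the sketch dependency-free.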
diff --git a/questions/bad_performance.question.md b/questions/bad_performance.question.md
@@ -0,0 +1,16 @@
+---
+title: "How can I improve performance?"
+alt_titles:
+  - "Pretrained pipelines do not produce good results on my data. What can I do?"
+  - "It does not work! Help me!"
+---
+
+**Long answer:**
+
+1. Manually annotate dozens of conversations as precisely as possible.
+2. Separate them into train (80%), development (10%) and test (10%) subsets.
+3. Set up the data for use with [`pyannote.database`](https://github.com/pyannote/pyannote-database#speaker-diarization).
+4. Follow [this recipe](https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/adapting_pretrained_pipeline.ipynb).
+5. Enjoy.
+
+**Also:** [I am available](https://herve.niderb.fr) for contracting to help you with that.
diff --git a/questions/from_memory.question.md b/questions/from_memory.question.md
@@ -0,0 +1,7 @@
+---
+title: "Can I apply pretrained pipelines on audio already loaded in memory?"
+alt_titles:
+  - "Can I apply models on an audio array?"
+---
+
+Yes: read [this tutorial](tutorials/applying_a_pipeline.ipynb) until the end.
diff --git a/questions/offline.question.md b/questions/offline.question.md
@@ -0,0 +1,15 @@
+---
+title: "Can I use gated models (and pipelines) offline?"
+alt_titles:
+  - "Why does one need to authenticate to access the pretrained models?"
+  - "Can I use pyannote.audio pretrained pipelines without the Hugging Face token?"
+  - "How can I solve the permission issue?"
+---
+
+**Short answer**: yes, see [this tutorial](tutorials/applying_a_model.ipynb) for models and [that one](tutorials/applying_a_pipeline.ipynb) for pipelines.
+
+**Long answer**: gating models and pipelines allows [me](https://herve.niderb.fr) to know a bit more about the `pyannote.audio` user base and eventually helps me write grant proposals to make `pyannote.audio` even better. So, please fill in the gating forms as precisely as possible.
+
+For instance, before gating `pyannote/speaker-diarization`, I had no idea that so many people were relying on it in production. Hint: sponsors are more than welcome! Maintaining open source libraries is time-consuming.
+
+That being said, this whole authentication process does not prevent you from using official `pyannote.audio` models offline (i.e. without going through the authentication process in every `docker run ...` or whatever you are using in production): see [this tutorial](tutorials/applying_a_model.ipynb) for models and [that one](tutorials/applying_a_pipeline.ipynb) for pipelines.
diff --git a/questions/pyannote.question.md b/questions/pyannote.question.md
@@ -0,0 +1,10 @@
+---
+title: "How does one spell and pronounce pyannote.audio?"
+alt_titles:
+  - "Why the name of the library?"
+  - "Why the logo of the library?"
+---
+
+📝 Written in lower case: `pyannote.audio` (or `pyannote` if you are lazy). Not `PyAnnote` nor `PyAnnotate` (sic).
+📢 Pronounced like the French verb `pianoter`. `pi` like in `pi`ano, not `py` like in `py`thon.
+🎹 `pianoter` means to play the piano (hence the logo 🤯).
diff --git a/questions/streaming.question.md b/questions/streaming.question.md
@@ -0,0 +1,10 @@
+---
+title: "Does pyannote support streaming speaker diarization?"
+alt_titles:
+  - "Is it possible to do realtime speaker diarization?"
+  - "Can it process online audio buffers?"
+---
+
+**Short answer:** not out of the box, no.
+
+**Long answer:** [I](https://herve.niderb.fr) am looking for sponsors to add this feature. In the meantime, [`diart`](https://github.com/juanmc2005/StreamingSpeakerDiarization) is the closest you can get to a streaming `pyannote.audio`. You might also be interested in [this blog post](https://herve.niderb.fr/fastpages/2021/08/05/Streaming-voice-activity-detection-with-pyannote.html) about streaming voice activity detection based on `pyannote.audio`.