Commit 145d32d

Move things around. Simplify code and wording

juanmc2005 authored Nov 11, 2023
1 parent ab8f516 commit 145d32d
Showing 1 changed file with 42 additions and 44 deletions: README.md

<div align="center">
  <h4>
    <a href="#%EF%B8%8F-stream-audio">
      🎙️ Stream audio
    </a>
    <span> | </span>
    <a href="#-installation">
      💾 Installation
    </a>
    <span> | </span>
    <a href="#-models">
      🧠 Available models
    </a>
    <br />
    <a href="#-tune-hyper-parameters">
      📈 Tuning
    </a>
    <span> | </span>
    <a href="#-build-pipelines">
      🧠🔗 Pipelines
    </a>
    <span> | </span>
    <a href="#-websockets">
<img width="100%" src="/demo.gif" title="Real-time diarization example" />
</p>

## 🎙️ Stream audio

### From the command line
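
For example, to diarize speech from the default microphone (`diart.stream` is installed with the package and also accepts a path to an audio file):

```shell
diart.stream microphone
```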
### From python
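
A minimal sketch built from the classes used later in this README (`SpeakerDiarization`, `MicrophoneAudioSource` and `StreamingInference`), assuming the default pipeline configuration:

```python
from diart import SpeakerDiarization
from diart.inference import StreamingInference
from diart.sources import MicrophoneAudioSource

# Default diarization pipeline reading from the microphone
pipeline = SpeakerDiarization()
mic = MicrophoneAudioSource()
inference = StreamingInference(pipeline, mic)

# Blocks until the audio source is closed; returns the accumulated prediction
prediction = inference()
```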

For inference and evaluation on a dataset, we recommend using `Benchmark` (see notes on [reproducibility](#-reproducibility)).
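
As a sketch, assuming `Benchmark` takes the directory of audio files and the directory of reference RTTM annotations, and is called with a pipeline class and its configuration:

```python
from diart import SpeakerDiarization, SpeakerDiarizationConfig
from diart.inference import Benchmark

# Hypothetical dataset layout: a directory of audio files and a directory
# with the matching reference RTTM annotations
benchmark = Benchmark("/wav/dir", "/rttm/dir")

# Run the pipeline on every file and compute evaluation metrics
benchmark(SpeakerDiarization, SpeakerDiarizationConfig())
```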

## 💾 Installation

**1) Make sure your system has the following dependencies:**

```
ffmpeg < 4.4
portaudio == 19.6.X
libsndfile >= 1.2.2
```
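
On Debian/Ubuntu, for instance, these can usually be installed with apt; the package names below are our assumption, and you should check that the installed versions satisfy the constraints above:

```shell
sudo apt install ffmpeg portaudio19-dev libsndfile1
```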

Alternatively, we provide an `environment.yml` file for a pre-configured conda environment:

```shell
conda env create -f diart/environment.yml
conda activate diart
```

**2) Install the package:**
```shell
pip install diart
```

### Get access to 🎹 pyannote models

By default, diart is based on [pyannote.audio](https://github.com/pyannote/pyannote-audio) models stored in the [huggingface](https://huggingface.co/) hub.
To allow diart to use them, follow these steps:

1) [Accept user conditions](https://huggingface.co/pyannote/segmentation) for the `pyannote/segmentation` model
2) [Accept user conditions](https://huggingface.co/pyannote/segmentation-3.0) for the newest `pyannote/segmentation-3.0` model
3) [Accept user conditions](https://huggingface.co/pyannote/embedding) for the `pyannote/embedding` model
4) Install [huggingface-cli](https://huggingface.co/docs/huggingface_hub/quick-start#install-the-hub-library) and [log in](https://huggingface.co/docs/huggingface_hub/quick-start#login) with your user access token (or provide it manually in diart CLI or API), as shown below.
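
For example:

```shell
pip install huggingface_hub
huggingface-cli login  # paste your user access token when prompted
```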

## 🧠 Models

You can use other models with the `--segmentation` and `--embedding` arguments, for example:
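
```shell
# Model names here are the pyannote checkpoints mentioned in the
# installation steps above, used purely as an illustration
diart.stream microphone \
  --segmentation pyannote/segmentation-3.0 \
  --embedding pyannote/embedding
```
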
Or in python:

```python
import diart.models as m

segmentation = m.SegmentationModel.from_pretrained("model_name")
embedding = m.EmbeddingModel.from_pretrained("model_name")
```

### Available pre-trained models

Below is a list of all the models currently supported by diart:

| Model Name | Model Type | CPU Time* | GPU Time* |
| ---------- | ---------- | --------- | --------- |

The latency of embedding models is measured in a diarization pipeline using `pya

\* CPU: AMD Ryzen 9 - GPU: RTX 4060 Max-Q

### Custom models

Third-party models can be integrated by providing a loader function:

```python
from diart import SpeakerDiarization, SpeakerDiarizationConfig
from diart.models import EmbeddingModel, SegmentationModel
from diart.sources import MicrophoneAudioSource
from diart.inference import StreamingInference


def segmentation_loader():
    # It should take a waveform and return a segmentation tensor
    return load_pretrained_model("my_model.ckpt")


def embedding_loader():
    # It should take (waveform, weights) and return per-speaker embeddings
    return load_pretrained_model("my_other_model.ckpt")


segmentation = SegmentationModel(segmentation_loader)
embedding = EmbeddingModel(embedding_loader)
config = SpeakerDiarizationConfig(
    segmentation=segmentation,
    embedding=embedding,
)
pipeline = SpeakerDiarization(config)
mic = MicrophoneAudioSource()
inference = StreamingInference(pipeline, mic)
prediction = inference()
```

If you have an ONNX model, you can use `from_onnx()`:
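
For example (a sketch only: the argument names and the ONNX input/output names below are assumptions, so check the `from_onnx()` signature in the API reference):

```python
from diart.models import EmbeddingModel

# Hypothetical ONNX checkpoint and tensor names
embedding = EmbeddingModel.from_onnx(
    "my_embedding_model.onnx",
    input_names=["waveform", "weights"],
    output_name="embedding",
)
```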
