Merge branch 'develop' into issue_942_wrong_index

kitzeslab · Sep 10, 2024 · 6f6198e
2 parents acdd7a3 + 70301d5

Showing 161 changed files with 17,640 additions and 5,566 deletions.
diff --git a/.github/workflows/poetry.yml b/.github/workflows/poetry.yml
@@ -37,33 +37,35 @@ jobs:
         run: /home/runner/.local/bin/poetry run pytest
       - name: Poetry run black check
         run: /home/runner/.local/bin/poetry run black . --check --diff
-  test_macos:
-    runs-on: macos-latest
-    strategy:
-      matrix:
-        python-version: ["3.9", "3.10", "3.11"]
-      # Ensure that all flavours are run to completion even if an other flavor failed
-      fail-fast: false
-    steps:
-      - uses: actions/checkout@v2
-      - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v2
-        with:
-          python-version: ${{ matrix.python-version }}
-      - name: Install libsndfile
-        run: brew install libsndfile
-      - name: Install ffmpeg v4
-        run: brew install ffmpeg
-      - name: Install poetry
-        run: curl -sSL https://install.python-poetry.org | python
-      - name: Poetry install
-        run: /Users/runner/.local/bin/poetry install
-      - name: Workaround for missing pytorch dependencies during poetry install
-        run: /Users/runner/.local/bin/poetry run pip install torch
-      - name: Poetry run pytest
-        run: /Users/runner/.local/bin/poetry run pytest
-      - name: Poetry run black check
-        run: /Users/runner/.local/bin/poetry run black . --check --diff
+  # temporarily disabling macos tests as they are failing due to mps out of memory errors
+  # that we can't seem to fix. - SL 2024-06-21
+  # test_macos:
+  #   runs-on: macos-latest
+  #   strategy:
+  #     matrix:
+  #       python-version: ["3.9", "3.10", "3.11"]
+  #     # Ensure that all flavours are run to completion even if an other flavor failed
+  #     fail-fast: false
+  #   steps:
+  #     - uses: actions/checkout@v2
+  #     - name: Set up Python ${{ matrix.python-version }}
+  #       uses: actions/setup-python@v2
+  #       with:
+  #         python-version: ${{ matrix.python-version }}
+  #     - name: Install libsndfile
+  #       run: brew install libsndfile
+  #     - name: Install ffmpeg v4
+  #       run: brew install ffmpeg
+  #     - name: Install poetry
+  #       run: curl -sSL https://install.python-poetry.org | python
+  #     - name: Poetry install
+  #       run: /Users/runner/.local/bin/poetry install
+  #     - name: Workaround for missing pytorch dependencies during poetry install
+  #       run: /Users/runner/.local/bin/poetry run pip install torch
+  #     - name: Poetry run pytest
+  #       run: /Users/runner/.local/bin/poetry run pytest
+  #     - name: Poetry run black check
+  #       run: /Users/runner/.local/bin/poetry run black . --check --diff
 ## ------------------------------------------------------------------------
 ## The below job installs opensoundscape on a windows machine using WSL.
 ## It is commented out, as it keeps hanging on the final step 'Poetry run pytest'

diff --git a/.gitignore b/.gitignore
@@ -7,5 +7,8 @@ doc/_build/
 *.ipynb_checkpoints/
 *.DS_Store
 docs/tutorials/woodcock_labeled_data
+docs/tutorials/annotated_data
 .vscode
 docs/tutorials/wandb
+untracked/
+docs/tutorials/annotated_data
diff --git a/README.md b/README.md
@@ -1,3 +1,5 @@
+# OpenSoundscape
+
 [![CI Status](https://github.com/kitzeslab/opensoundscape/workflows/CI/badge.svg)](https://github.com/kitzeslab/opensoundscape/actions?query=workflow%3ACI)
 [![Documentation Status](https://readthedocs.org/projects/opensoundscape/badge/?version=latest)](http://opensoundscape.org/en/latest/?badge=latest)
 
@@ -14,7 +16,7 @@ OpenSoundscape includes utilities which can be strung together to create data an
 * estimate the location of sound sources from synchronized recordings
 
 
-OpenSoundscape's source code can be found on [GitHub] (https://github.com/kitzeslab/opensoundscape) and its documentation can be found on [OpenSoundscape.org](https://opensoundscape.org).
+OpenSoundscape's documentation can be found on [OpenSoundscape.org](https://opensoundscape.org).
 
 ## Show me the code!
 
@@ -47,7 +49,7 @@ Details about installation are available on the OpenSoundscape documentation at
 
 #### How do I install OpenSoundscape?
 
-* Most users should install OpenSoundscape via pip, preferably within a virtual environment: `pip install opensoundscape==0.10.1`. 
+* Most users should install OpenSoundscape via pip, preferably within a virtual environment: `pip install opensoundscape==0.10.2`. 
 * To use OpenSoundscape in Jupyter Notebooks (e.g. for tutorials), follow the installation instructions for your operating system, then follow the "Jupyter" instructions.
 * Contributors and advanced users can also use Poetry to install OpenSoundscape using the "Contributor" instructions
 
@@ -59,7 +61,7 @@ Details about installation are available on the OpenSoundscape documentation at
 * Most computer cluster users should follow the Linux installation instructions
 
 
-### Use Audio and Spectrogram classes
+### Use Audio and Spectrogram classes to inspect audio data
 ```python
 from opensoundscape import Audio, Spectrogram
 
@@ -83,7 +85,27 @@ path = '/path/to/audiomoth_file.WAV' #an AudioMoth recording
 Audio.from_file(path, start_timestamp=start_time,duration=audio_length)
 ```
 
-### Use a pre-trained CNN to make predictions on long audio files
+### Load and use a model from the Bioacoustics Model Zoo
+The [Bioacoustics Model Zoo](https://github.com/kitzeslab/bioacoustics-model-zoo) hosts models in a repository that can be accessed via `torch.hub` and are compatible with OpenSoundscape. Load up a model and apply it to your own audio right away:
+
+```python
+from opensoundscape.ml import bioacoustics_model_zoo as bmz
+
+#list available models
+print(bmz.list_models())
+
+#generate class predictions and embedding vectors with Perch
+perch = bmz.load("Perch")
+scores = perch.predict(files)
+embeddings = perch.generate_embeddings(files)
+
+#...or BirdNET
+birdnet = bmz.load("BirdNET")
+scores = birdnet.predict(files)
+embeddings = birdnet.generate_embeddings(files)
+```
+
+### Load a pre-trained CNN from a local file, and make predictions on long audio files
 ```python
 from opensoundscape import load_model
 
@@ -111,7 +133,7 @@ all_annotations = BoxedAnnotations.from_raven_files(raven_file_paths,audio_file_
 class_list = ['IBWO','BLJA']
 
 # create labels for fixed-duration (2 second) clips 
-labels = all_annotations.one_hot_clip_labels(
+labels = all_annotations.multi_hot_clip_labels(
   clip_duration=2,
   clip_overlap=0,
   min_label_overlap=0.25,
@@ -123,20 +145,7 @@ train_df, validation_df = train_test_split(labels, test_size=0.3)
 
 # create a CNN and train on the labeled data
 model = CNN(architecture='resnet18', sample_duration=2, classes=class_list)
-model.train(train_df, validation_df, epochs=20, num_workers=8, batch_size=256)
-```
-
-### Train a CNN with labeled audio data (one label per audio file):
-```python
-from opensoundscape import CNN
-from sklearn.model_selection import train_test_split
-
-#load a DataFrame of one-hot audio clip labels
-df = pd.read_csv('my_labels.csv') #index: paths; columns: classes
-train_df, validation_df = train_test_split(df,test_size=0.2)
 
-#create a CNN and train on 2-second spectrograms for 20 epochs
-model = CNN('resnet18', classes=df.columns, sample_duration=2.0)
-model.train(train_df, validation_df, epochs=20)
-#the best model is automatically saved to a file `./best.model`
-```
+# train the model to recognize the classes of interest in audio data
+model.train(train_df, validation_df, epochs=20, num_workers=8, batch_size=256)
+```
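
Note on the `one_hot_clip_labels` to `multi_hot_clip_labels` rename in the README hunk above: the new name presumably reflects that a single clip can carry more than one class at once. The following is a minimal sketch in plain Python of what a multi-hot label table looks like. It is not OpenSoundscape's implementation; the helper `multi_hot` and the per-clip annotations are hypothetical and exist only for illustration.

```python
def multi_hot(clip_classes, class_list):
    """Return one 0/1 row per clip, with a 1 for every class present in that clip.

    Hypothetical helper for illustration; not part of OpenSoundscape.
    """
    return [
        [1 if cls in present else 0 for cls in class_list]
        for present in clip_classes
    ]


class_list = ["IBWO", "BLJA"]

# Hypothetical annotations: the first clip contains both classes,
# the second contains neither, and the third contains only BLJA.
clip_classes = [{"IBWO", "BLJA"}, set(), {"BLJA"}]

labels = multi_hot(clip_classes, class_list)
print(labels)  # [[1, 1], [0, 0], [0, 1]]
```

A one-hot row would allow at most a single 1 per clip; the first row here, `[1, 1]`, is only expressible in the multi-hot form.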