Commit 70a8507

Merge branch 'release/3.2.0'
2 parents 2a72067 + bb4dd2e commit 70a8507

89 files changed: +6018 -4135 lines changed

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+name: Bug report
+description: Report a bug in pyannote.audio
+body:
+
+- type: markdown
+  attributes:
+    value: |
+      When reporting bugs, please follow the guidelines in this template. This helps identify the problem precisely and thus enables contributors to fix it faster.
+      - Write a descriptive issue title above.
+      - The golden rule is to **always open *one* issue for *one* bug**. If you notice several bugs and want to report them, make sure to create one new issue for each of them.
+      - Search [open](https://github.com/pyannote/pyannote-audio/issues) and [closed](https://github.com/pyannote/pyannote-audio/issues?q=is%3Aissue+is%3Aclosed) issues to ensure it has not already been reported. If you don't find a relevant match or if you're unsure, don't hesitate to **open a new issue**. The bugsquad will handle it from there if it's a duplicate.
+      - Please always check if your issue is reproducible in the latest version – it may already have been fixed!
+      - If you use a custom build, please test if your issue is reproducible in official releases too.
+
+- type: textarea
+  attributes:
+    label: Tested versions
+    description: |
+      To properly fix a bug, we need to identify if the bug was recently introduced in the engine, or if it was always present.
+      - Please specify the pyannote.audio version you found the issue in, including the **Git commit hash** if using a development build.
+      - If you can, **please test earlier pyannote.audio versions** and, if applicable, newer versions (development branch). Mention whether the bug is reproducible or not in the versions you tested.
+      - The aim is for us to identify whether a bug is a **regression**, i.e. an issue that didn't exist in a previous version, but was introduced later on, breaking existing functionality. For example, if a bug is reproducible in 3.2 but not in 3.0, we would like you to test intermediate 3.1 to find which version is the first one where the issue can be reproduced.
+    placeholder: |
+      - Reproducible in: 3.1, 3.2, and later
+      - Not reproducible in: 3.0
+  validations:
+    required: true
+
+- type: input
+  attributes:
+    label: System information
+    description: |
+      - Specify the OS version, and when relevant hardware information.
+      - For issues that are likely OS-specific and/or GPU-related, please specify the GPU model and architecture.
+      - **Bug reports not including the required information may be closed at the maintainers' discretion.** If in doubt, always include all the requested information; it's better to include too much information than not enough information.
+    placeholder: macOS 13.6 - pyannote.audio 3.1.1 - M1 Pro
+  validations:
+    required: true
+
+- type: textarea
+  attributes:
+    label: Issue description
+    description: |
+      Describe your issue briefly. What doesn't work, and how do you expect it to work instead?
+      You can include audio, images or videos with drag and drop, and format code blocks or logs with <code>```</code> tags.
+  validations:
+    required: true
+
+- type: input
+  attributes:
+    label: Minimal reproduction example (MRE)
+    description: |
+      Having reproducible issues is a prerequisite for contributors to be able to solve them.
+      Include a link to minimal reproduction example using [this Google Colab notebook](https://colab.research.google.com/github/pyannote/pyannote-audio/blob/develop/tutorials/MRE_template.ipynb) as a starting point.
+  validations:
+    required: true
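
The "System information" field above asks for the OS version, the pyannote.audio version, and hardware details, in the style of its placeholder (`macOS 13.6 - pyannote.audio 3.1.1 - M1 Pro`). A minimal sketch of how a reporter might collect most of that automatically is shown below. It is not part of this commit: the helper name `system_summary` is hypothetical, and it only uses the Python standard library, so the GPU model still has to be added by hand.

```python
import platform
from importlib import metadata


def system_summary(package: str = "pyannote.audio") -> str:
    """Build a one-line summary in the spirit of the template's placeholder.

    `system_summary` is a hypothetical helper, not something shipped with
    pyannote.audio.
    """
    try:
        # Version of the installed distribution, as recorded by pip.
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = "not installed"

    # OS name and release, e.g. "Linux 6.5.0" or "Darwin 22.6.0".
    os_part = f"{platform.system()} {platform.release()}"

    # platform.machine() reports the CPU architecture (e.g. "arm64", "x86_64").
    # The GPU model is not available from the standard library.
    return f"{os_part} - {package} {version} - {platform.machine()}"


if __name__ == "__main__":
    print(system_summary())
```

The printed line can be pasted into the form as a starting point and completed with the GPU model and architecture when the issue is likely GPU-related.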

.github/ISSUE_TEMPLATE/config.yml

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+blank_issues_enabled: false
+
+contact_links:
+
+  - name: Feature request
+    url: https://github.com/pyannote/pyannote-audio/discussions
+    about: Suggest an idea for this project.
+
+  - name: Consulting
+    url: https://herve.niderb.fr/consulting
+    about: Using pyannote.audio in production? Make the most of it thanks to our consulting services.
+
+  - name: Premium models
+    url: https://forms.office.com/e/GdqwVgkZ5C
+    about: We are considering selling premium models, extensions, or services around pyannote.audio.

.github/ISSUE_TEMPLATE/feature_request.md

Lines changed: 0 additions & 20 deletions
This file was deleted.

.github/workflows/new_issue.yml

Lines changed: 0 additions & 29 deletions
This file was deleted.

.github/workflows/test.yml

Lines changed: 1 addition & 1 deletion
@@ -30,4 +30,4 @@ jobs:
           pip install -e .[dev,testing]
       - name: Test with pytest
         run: |
-          pytest
+          pytest -k "not test_cli.py"

.github/workflows/test_cli.yml

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+name: CLI tests
+
+on:
+  push:
+    branches: [develop]
+  pull_request:
+    branches: [develop]
+
+jobs:
+  build:
+    timeout-minutes: 20
+    runs-on: ${{ matrix.os }}
+    strategy:
+      matrix:
+        os: [ubuntu-latest]
+        python-version: ["3.10"]
+    steps:
+      - uses: actions/checkout@v2
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install libsndfile
+        if: matrix.os == 'ubuntu-latest'
+        run: |
+          sudo apt-get update
+          sudo apt-get install libsndfile1
+      - name: Install pyannote.audio
+        run: |
+          pip install -e .[dev,testing,cli]
+      - name: Test with pytest
+        run: |
+          pytest tests/test_cli.py

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ repos:
 
   # Sort imports
   - repo: https://github.com/PyCQA/isort
-    rev: 5.10.1
+    rev: 5.12.0
     hooks:
       - id: isort
         args: ["--profile", "black"]

CHANGELOG.md

Lines changed: 35 additions & 0 deletions
@@ -1,5 +1,40 @@
 # Changelog
 
+## Version 3.2.0 (2024-05-08)
+
+### New features
+
+- feat(task): add option to cache task training metadata to speed up training (with [@clement-pages](https://github.com/clement-pages/))
+- feat(model): add `receptive_field`, `num_frames` and `dimension` to models (with [@Bilal-Rahou](https://github.com/Bilal-Rahou))
+- feat(model): add `fbank_only` property to `WeSpeaker` models
+- feat(util): add `Powerset.permutation_mapping` to help with permutation in powerset space (with [@FrenchKrab](https://github.com/FrenchKrab))
+- feat(sample): add sample file at `pyannote.audio.sample.SAMPLE_FILE`
+- feat(metric): add `reduce` option to `diarization_error_rate` metric (with [@Bilal-Rahou](https://github.com/Bilal-Rahou))
+- feat(pipeline): add `Waveform` and `SampleRate` preprocessors
+
+### Fixes
+
+- fix(task): fix random generators and their reproducibility (with [@FrenchKrab](https://github.com/FrenchKrab))
+- fix(task): fix estimation of training set size (with [@FrenchKrab](https://github.com/FrenchKrab))
+- fix(hook): fix `torch.Tensor` support in `ArtifactHook`
+- fix(doc): fix typo in `Powerset` docstring (with [@lukasstorck](https://github.com/lukasstorck))
+
+### Improvements
+
+- improve(metric): add support for number of speakers mismatch in `diarization_error_rate` metric
+- improve(pipeline): track both `Model` and `nn.Module` attributes in `Pipeline.to(device)`
+- improve(io): switch to `torchaudio >= 2.2.0`
+- improve(doc): update tutorials (with [@clement-pages](https://github.com/clement-pages/))
+
+## Breaking changes
+
+- BREAKING(model): get rid of `Model.example_output` in favor of `num_frames` method, `receptive_field` property, and `dimension` property
+- BREAKING(task): custom tasks need to be updated (see "Add your own task" tutorial)
+
+## Community contributions
+
+- community: add tutorial for offline use of `pyannote/speaker-diarization-3.1` (by [@simonottenhauskenbun](https://github.com/simonottenhauskenbun))
+
 ## Version 3.1.1 (2023-12-01)
 
 ### TL;DR

MANIFEST.in

Lines changed: 2 additions & 0 deletions
@@ -1,4 +1,6 @@
 recursive-include pyannote *.py
 recursive-include pyannote *.yaml
+recursive-include pyannote *.wav
+recursive-include pyannote *.rttm
 global-exclude *.pyc
 global-exclude __pycache__

README.md

Lines changed: 18 additions & 14 deletions
@@ -70,26 +70,30 @@ for turn, _, speaker in diarization.itertracks(yield_label=True):
 - Videos
   - [Introduction to speaker diarization](https://umotion.univ-lemans.fr/video/9513-speech-segmentation-and-speaker-diarization/) / JSALT 2023 summer school / 90 min
   - [Speaker segmentation model](https://www.youtube.com/watch?v=wDH2rvkjymY) / Interspeech 2021 / 3 min
-  - [First releaase of pyannote.audio](https://www.youtube.com/watch?v=37R_R82lfwA) / ICASSP 2020 / 8 min
+  - [First release of pyannote.audio](https://www.youtube.com/watch?v=37R_R82lfwA) / ICASSP 2020 / 8 min
+- Community contributions (not maintained by the core team)
+  - 2024-04-05 > [Offline speaker diarization (speaker-diarization-3.1)](tutorials/community/offline_usage_speaker_diarization.ipynb) by [Simon Ottenhaus](https://github.com/simonottenhauskenbun)
 
 ## Benchmark
 
 Out of the box, `pyannote.audio` speaker diarization [pipeline](https://hf.co/pyannote/speaker-diarization-3.1) v3.1 is expected to be much better (and faster) than v2.x.
 Those numbers are diarization error rates (in %):
 
-| Benchmark | [v2.1](https://hf.co/pyannote/speaker-diarization-2.1) | [v3.1](https://hf.co/pyannote/speaker-diarization-3.1) | [Premium](https://forms.office.com/e/GdqwVgkZ5C) |
-| ---------------------- | ------------------------------------------------------ | ------------------------------------------------------ | ---------------------------------------------- |
-| AISHELL-4 | 14.1 | 12.3 | 11.9 |
-| AliMeeting (channel 1) | 27.4 | 24.5 | 22.5 |
-| AMI (IHM) | 18.9 | 18.8 | 16.6 |
-| AMI (SDM) | 27.1 | 22.6 | 20.9 |
-| AVA-AVD | 66.3 | 50.0 | 39.8 |
-| CALLHOME (part 2) | 31.6 | 28.4 | 22.2 |
-| DIHARD 3 (full) | 26.9 | 21.4 | 17.2 |
-| Ego4D (dev.) | 61.5 | 51.2 | 43.8 |
-| MSDWild | 32.8 | 25.4 | 19.8 |
-| REPERE (phase2) | 8.2 | 7.8 | 7.6 |
-| VoxConverse (v0.3) | 11.2 | 11.2 | 9.4 |
+| Benchmark | [v2.1](https://hf.co/pyannote/speaker-diarization-2.1) | [v3.1](https://hf.co/pyannote/speaker-diarization-3.1) | [Premium](https://forms.office.com/e/GdqwVgkZ5C) |
+| --------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------ | ------------------------------------------------------ | ------------------------------------------------ |
+| [AISHELL-4](https://arxiv.org/abs/2104.03603) | 14.1 | 12.2 | 11.9 |
+| [AliMeeting](https://www.openslr.org/119/) (channel 1) | 27.4 | 24.4 | 22.5 |
+| [AMI](https://groups.inf.ed.ac.uk/ami/corpus/) (IHM) | 18.9 | 18.8 | 16.6 |
+| [AMI](https://groups.inf.ed.ac.uk/ami/corpus/) (SDM) | 27.1 | 22.4 | 20.9 |
+| [AVA-AVD](https://arxiv.org/abs/2111.14448) | 66.3 | 50.0 | 39.8 |
+| [CALLHOME](https://catalog.ldc.upenn.edu/LDC2001S97) ([part 2](https://github.com/BUTSpeechFIT/CALLHOME_sublists/issues/1)) | 31.6 | 28.4 | 22.2 |
+| [DIHARD 3](https://catalog.ldc.upenn.edu/LDC2022S14) ([full](https://arxiv.org/abs/2012.01477)) | 26.9 | 21.7 | 17.2 |
+| [Earnings21](https://github.com/revdotcom/speech-datasets) | 17.0 | 9.4 | 9.0 |
+| [Ego4D](https://arxiv.org/abs/2110.07058) (dev.) | 61.5 | 51.2 | 43.8 |
+| [MSDWild](https://github.com/X-LANCE/MSDWILD) | 32.8 | 25.3 | 19.8 |
+| [RAMC](https://www.openslr.org/123/) | 22.5 | 22.2 | 18.4 |
+| [REPERE](https://www.islrn.org/resources/360-758-359-485-0/) (phase2) | 8.2 | 7.8 | 7.6 |
+| [VoxConverse](https://github.com/joonson/voxconverse) (v0.3) | 11.2 | 11.3 | 9.4 |
 
 [Diarization error rate](http://pyannote.github.io/pyannote-metrics/reference.html#diarization) (in %)
 
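
The README text in this hunk says v3.1 is "expected to be much better" than v2.x, and the updated table gives the diarization error rates behind that claim. As a rough illustration (not part of the commit), the sketch below computes the relative change between the v2.1 and v3.1 columns for three rows copied from the updated table. The row selection and the helper name `relative_change` are choices made for this example only.

```python
# Diarization error rates (in %) copied from three rows of the updated table.
DER = {
    "AVA-AVD": {"v2.1": 66.3, "v3.1": 50.0},
    "Earnings21": {"v2.1": 17.0, "v3.1": 9.4},
    "VoxConverse (v0.3)": {"v2.1": 11.2, "v3.1": 11.3},
}


def relative_change(old: float, new: float) -> float:
    """Relative change in percent; negative means the error rate went down."""
    return 100.0 * (new - old) / old


for name, rates in DER.items():
    change = relative_change(rates["v2.1"], rates["v3.1"])
    print(f"{name}: {rates['v2.1']} -> {rates['v3.1']} ({change:+.1f}%)")
```

On these rows the relative change ranges from a reduction of roughly 45% (Earnings21) to a slight increase (VoxConverse, 11.2 to 11.3), so the size of the improvement depends heavily on the benchmark.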
