wavlm

Here are 19 public repositories matching this topic...

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

text-to-speech deep-learning pytorch tts speech-synthesis gan speaker-adaptation adversarial-training diffusion-models wavlm latent-diffusion latent-diffusion-models

Updated Aug 10, 2024
Python

s3prl / s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Updated Jun 13, 2025
Python

wenet-e2e / wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Updated Feb 11, 2026
Python

lucadellalib / focalcodec

A low-bitrate single-codebook 16 / 24 kHz speech codec based on focal modulation

deep-learning pytorch speech-synthesis codec vector-quantization wavlm vocos focal-modulation neural-speech-coding

Updated Nov 30, 2025
Jupyter Notebook

lucadellalib / audiocodecs

A collection of audio codecs with a standardized API

text-to-speech pytorch speech-synthesis codec quantization mimi dac self-supervised-learning encodec wavlm speech-coding speechtokenizer speech-language-model

Updated May 27, 2025
Python

mjhydri / Singing-Vocal-Beat-Tracking

This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages the WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations. It then uses an HMM decoder to infer singing beats and t…

music music-information-retrieval beat-tracking self-supervised singing-voice hubert linear-transformer wavlm

Updated Sep 4, 2022
Python

lucadellalib / discrete-wavlm-codec

A neural speech codec based on discrete WavLM representations

clustering pytorch speech-synthesis codec k-means quantization self-supervised-learning hifi-gan wavlm token-extraction neural-speech-coding

Updated Aug 28, 2024
Python

Amir-Ivry / MAPSS-measures

The code for the MAPSS measures for source separation evaluation.

ai mos diffusion-maps source-separation psychoacoustics speech-separation perceptual-evaluation audio-quality mert hubert wav2vec2 wavlm music-sources-separation speech-measures

Updated Sep 17, 2025
Python

bunyaminergen / WavLMMSDD

This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach for speaker diarization from Nvidia.

microsoft speech embedding speaker-diarization diarization nvidia-nemo wavlm speech-embedding

Updated Jun 17, 2025
Jupyter Notebook

Sarasadeghii / Sharif-WavLM

In this repository, the WavLM model is applied to both high-quality and poor-quality data for the speaker verification task, and the PyCM library is used for evaluation.

confusion-matrix speaker-verification farsi-datasets wavlm pycm

Updated May 27, 2023
Jupyter Notebook

alessandropec / data_driven_ai_voice_cloning

This repository contains the code for the main part of my master's thesis in Data Science & Engineering at Politecnico di Torino.

machine-learning text-to-speech ai deep-learning speaker-verification zero-shot-learning speaker-embeddings voice-cloning tacotron2 fastspeech2 ecapa-tdnn wavlm generative-ai

Updated Mar 5, 2023
Python

sadPororo / L-TDNN

Layer-aware TDNN: Speaker Recognition Using Multi-Layer Features from Pre-Trained Models, to appear in ICAIIC 2026

pretrained-models speaker-recognition speaker-verification hubert wav2vec2 wavlm

Updated Dec 17, 2025
Python

theolepage / wavlm_ssl_sv

SOTA method for self-supervised speaker verification leveraging a large-scale pretrained ASR model.

pytorch speaker-recognition speaker-verification asr dino self-supervised-learning voxceleb wavlm

Updated Feb 19, 2025
Python

sadPororo / LAP

Rethinking Leveraging Pre-Trained Multi-Layer Representations for Speaker Verification, ISCA Interspeech 2025

pretrained-models speaker-verification voxceleb voxceleb2 hubert wav2vec2 wavlm

Updated Dec 30, 2025
Python

zhu00121 / Universal-representation-dynamics-of-deepfake-speech

This repo contains code used in the paper "Characterizing the temporal dynamics of universal speech representations for generalizable deepfake detection"

self-supervised deepfake-detection wav2vec2 wavlm modulation-transformation

Updated Oct 19, 2023
Python

bunyaminergen / WavLMRawNetXSVBase

WavLM Large + RawNetX Speaker Verification Base: End-to-End Speaker Verification Architecture

audio speech feature-extraction speaker-verification speech-processing rawnet wavlm

Updated Mar 10, 2025
Python

aitor-alvarez / acoustic-transformer-models

Acoustic Transformer Models for Audio Classification

classification acoustic transformer-models pytorch-lightning hubert wav2vec2 wavlm

Updated Feb 15, 2025
Python

andreacecchin / SpeechEmotionRecognition_Emozionalmente

This project investigates the performance of different machine learning pipelines for speech emotion recognition (SER) on the Italian Emozionalmente dataset. Pipelines: MFCC, Wav2Vec and WavLM (both as feature extractors and after fine-tuning), Audio Spectrogram Transformer, and cross-linguistic evaluation with fine-tuned WavLM on CREMA-D.

speech-emotion-recognition wav2vec2 wavlm crema-d-dataset emozionalmente-dataset

Updated Feb 17, 2026
Jupyter Notebook

lucadellalib / cryceleb2023

CryCeleb2023 experiments

metric-learning speaker-verification triplet-loss eer am-softmax ecapa-tdnn titanet wavlm cryceleb2023

Updated Jul 5, 2023
Jupyter Notebook
