Vaclis Tone Replication (arXiv paper):

An AI-powered system for tone replication in music production that predicts EQ settings directly from audio features.
Trained on a dataset of piano recordings with varied EQ parameters, our VTR neural network model estimates multi-band EQ values with an MSE of 0.0216.
This approach gives musicians practical, flexible, and automated tone matching, and paves the way for future support of more complex audio effects.
Feel free to scroll down the README, or read the report file, poster, or 5-minute presentation.

How to use it?

Step 01. Generate dataset batch

In my experience, ReaScript in Lua has memory-leak problems, so generating tens of thousands of data samples in a single run is not recommended. (I used a MacBook Pro with an M4 Pro chip and 24 GB of memory, and it still ran completely out of memory and crashed.)

So I wrote a script, gen_data_batch_seperate.py, that splits the for loop into smaller batches. Run it and it generates files such as batch_01.lua. In my case, 5 bands with gains of {-12, -8, -4, 0, +4, +8, +12} dB give 7^5 = 16,807 files, so it should create 17 batch files.
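
The arithmetic above can be checked with a short sketch. This is not the code in gen_data_batch_seperate.py; it only enumerates the gain combinations and chunks them. The batch size of 1000 is an assumption (the README does not state it), chosen because it is consistent with 17 batch files.

```python
from itertools import product

GAINS_DB = [-12, -8, -4, 0, 4, 8, 12]  # gain steps listed in the README
NUM_BANDS = 5                          # number of EQ bands in the README
BATCH_SIZE = 1000                      # assumption: not stated in the README


def make_batches(gains, num_bands, batch_size):
    """Enumerate every gain combination and split the list into fixed-size batches."""
    combos = list(product(gains, repeat=num_bands))
    return [combos[i:i + batch_size] for i in range(0, len(combos), batch_size)]


batches = make_batches(GAINS_DB, NUM_BANDS, BATCH_SIZE)
total = sum(len(batch) for batch in batches)
print(f"{total} combinations in {len(batches)} batches")
# prints "16807 combinations in 17 batches"
```

With this batch size the last batch holds the remaining 807 combinations.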

Step 02. Create dataset by REAPER

Open the REAPER app → Actions > Show Action List
In the filter field, search for ReaScript → click ReaScript: Run/edit ReaScript (EEL2 or Lua)
Open the file ./autoEQ.lua, replace YOUR_VTR_DIR with the directory where you placed the project, and replace batch_XX with the batch you want to run.

To check later how many files are done, you can comment out some lines in gen_data_batch_seperate.py. It will then list the number of finished and unfinished files.
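
As a rough illustration of this kind of progress check, the sketch below compares a list of expected file names against what exists in an output folder. The function name count_progress and the file names render_0001.wav and render_0002.wav are hypothetical; the actual naming scheme and logic live in gen_data_batch_seperate.py.

```python
import tempfile
from pathlib import Path


def count_progress(expected_names, output_dir):
    """Split the expected file names into finished and unfinished lists."""
    existing = {p.name for p in Path(output_dir).iterdir() if p.is_file()}
    finished = [name for name in expected_names if name in existing]
    unfinished = [name for name in expected_names if name not in existing]
    return finished, unfinished


# Demo in a temporary folder; the file names are hypothetical placeholders.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "render_0001.wav").touch()
    expected = ["render_0001.wav", "render_0002.wav"]
    done, todo = count_progress(expected, tmp)
    print(f"finished: {len(done)}, unfinished: {len(todo)}")
```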

Step 03. Extract features from audio files

Run extract_features.py and you will get two files:

audio_features.csv: Audio features extracted from the audio files with the Python package librosa. The features we collect are listed below. My report explains why I chose them, and feel free to add any features you want.

spectral_centroid,spectral_bandwidth,spectral_rolloff,mfcc_1,mfcc_2,mfcc_3,mfcc_4,mfcc_5,mfcc_6,mfcc_7,mfcc_8,mfcc_9,mfcc_10,mfcc_11,mfcc_12,mfcc_13,rms_energy

dataset_labels.csv: Ground truth for the supervised learning model. I chose the five frequencies below based on The Mixing Engineer's Handbook (2nd ed.); feel free to adjust them for your case.

EQ_80,EQ_240,EQ_2500,EQ_4000,EQ_10000

Step 04. Run the model

Run vtr_model.ipynb, which includes the full code for my baseline models (regression and random forest regressor) and the main model (an FFNN).
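
To make the shape of the learning problem concrete, the sketch below fits a plain least-squares regression that maps 17 features to 5 EQ gains and reports the MSE. It is not taken from vtr_model.ipynb: the data is random placeholder data with an exactly linear relationship, the 80/20 split is an assumption, and NumPy's lstsq stands in for whichever regression implementation the notebook uses.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n_bands = 200, 17, 5

# Placeholder data: random features, labels that are an exact linear function of them.
X = rng.normal(size=(n_samples, n_features))
true_W = rng.normal(size=(n_features, n_bands))
Y = X @ true_W

# Hold out the last 20% of the rows for evaluation (assumed split).
split = int(0.8 * n_samples)
X_train, X_test = X[:split], X[split:]
Y_train, Y_test = Y[:split], Y[split:]


def add_bias(A):
    """Append a column of ones so the model can learn an intercept."""
    return np.hstack([A, np.ones((A.shape[0], 1))])


# Solve the least-squares problem for all five bands at once.
W, *_ = np.linalg.lstsq(add_bias(X_train), Y_train, rcond=None)
Y_pred = add_bias(X_test) @ W
mse = float(np.mean((Y_pred - Y_test) ** 2))
print(f"predictions: {Y_pred.shape}, MSE: {mse:.3g}")
```

Because the placeholder labels are exactly linear in the features, the MSE here is near zero; real audio features would not behave this way, which is the motivation for the non-linear models in the notebook.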

