Straddled Matrix: Using linear initialisation to improve speed of convergence and fully-trained error in Autoencoders
Training neural networks is a complex task with many interdependent hyper-parameters, each of which has to undergo a non-intuitive tuning process. Weight initialisation is generally overlooked in this process but can have a drastic impact on performance. We propose the Straddled Matrix, a non-stochastic weight initialisation scheme that extends the standard identity matrix, and show that, when benchmarked with a simple autoencoder on various datasets, our initialiser outperforms the current state of the art as measured by both convergence time and loss reached.
The Straddled Matrix is a modification of the standard identity matrix where the zero padding is replaced by diagonally filling all rows. This ensures that every feature in the dataset receives equal weight, regardless of the network architecture.
The Straddled Matrix is defined simply by the function straddled_matrix() in autoencoder_model.py.
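As a rough illustration of the idea, the sketch below builds an identity-like matrix whose diagonal wraps around, so that a non-square matrix has no all-zero rows or columns. This is only one reading of the description above: the helper name straddled_sketch is hypothetical, NumPy is assumed to be available, and the actual straddled_matrix() in autoencoder_model.py may differ in layout or scaling.

```python
import numpy as np


def straddled_sketch(rows: int, cols: int) -> np.ndarray:
    """Hypothetical sketch: identity matrix whose diagonal wraps around.

    Assumption: "diagonally filling all rows" means continuing the diagonal
    past the edge of the shorter dimension instead of padding with zeros.
    """
    matrix = np.zeros((rows, cols))
    for k in range(max(rows, cols)):
        # Wrap the index in the shorter dimension so the diagonal repeats.
        matrix[k % rows, k % cols] = 1.0
    return matrix


# A 5x2 example: a plain identity would leave the last three rows all zero.
print(straddled_sketch(5, 2))
```

For a square shape this reduces to the ordinary identity matrix; for other shapes every row and every column contains at least one non-zero entry.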
If you are looking for an alternative weight initialisation to improve the performance of your autoencoder, then this is for you! Designed for mostly linear datasets, but also works well with high amounts of non-linearity.
- Create a local virtual environment (change python version as required):
python3.9 -m venv _venv
- Activate and run from the virtual environment:
source _venv/bin/activate
- Install the requirements:
pip install -r requirements.txt
- Synthetic: created locally
- MNIST: available via Keras
- Swarm Behaviour: can be downloaded from here, and then copied to autoencoder-paper/resources/swarmBehaviour.
To run all the experiments run the following command:
bash run.sh
Each experiment presented in the paper has a config file named X_experiments_config.json, where X is one of {swarm, mnist, synthetic}, with the following format:
{
    "X Experiment":
    [
        {
            "num_tests": 10,
            "num_epochs": 1500,
            "learning_rate": 0.1
        }
    ]
}
Note: num_tests is the number of times the experiment is run with different random seeds. The MNIST and Swarm Behaviour experiments take a significant amount of time to run. Consider reducing num_tests if you only want to try the experiments out.
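For a quick trial run, one option is to lower num_tests programmatically rather than editing each file by hand. The sketch below is not part of the repository: the helper with_num_tests is hypothetical, and it operates on an inline sample that mirrors the format shown above (with the placeholder key "X Experiment") rather than on a real config file.

```python
import json


def with_num_tests(config: dict, num_tests: int) -> dict:
    """Return a copy of the config with num_tests overridden in every entry."""
    return {
        name: [{**entry, "num_tests": num_tests} for entry in entries]
        for name, entries in config.items()
    }


# Inline sample in the documented format; a real run would load one of the
# X_experiments_config.json files with json.load() instead.
sample = json.loads(
    '{"X Experiment": [{"num_tests": 10, "num_epochs": 1500, "learning_rate": 0.1}]}'
)
quick = with_num_tests(sample, 2)
print(json.dumps(quick, indent=4))
```

The helper leaves the original dictionary untouched, so the full-length settings remain available for the final runs.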
All the figures and tables that appear in the paper are generated inside the folder experiments/plots. These are generated after running make_plots.py via the run.sh bash script.