Code implementation of the following paper:
Critical Initialization of Wide and Deep Neural Networks using Partial Jacobians: General Theory and Applications (NeurIPS 2023)
BibTeX:
@inproceedings{
doshi2023critical,
title={Critical Initialization of Wide and Deep Neural Networks using Partial Jacobians: General Theory and Applications},
author={Darshil Doshi and Tianyu He and Andrey Gromov},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=wRJqZRxDEX}
}
- Use the GradHook in utils/partialjaclib.py to compute the averaged partial Jacobian norm (APJN).
- See resnet_phase_diagram.py for an example computation of phase diagrams.
- All data arrays and plotting notebooks can be found in the Supplementary Material on OpenReview.
- The implementation used to compute APJN for FCN and ViT is more involved than necessary, because it records intermediate information.
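
For orientation, here is a minimal NumPy sketch of what the APJN measures, assuming a bias-free ReLU MLP with weights drawn from N(0, sigma_w2 / width). It does not use the repository's GradHook, whose interface is not described here; the function name `apjn_relu_mlp` and all of its parameters are hypothetical and chosen only for illustration. The sketch builds the partial Jacobian between layer 0 and layer `depth` analytically with the chain rule, then averages its squared Frobenius norm (divided by the width) over random initializations.

```python
import numpy as np


def apjn_relu_mlp(width, depth, sigma_w2, n_init=8, seed=0):
    """Estimate (1/width) * E ||d h^depth / d h^0||_F^2 for a bias-free ReLU MLP.

    Hypothetical helper for illustration only; not part of the repository.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_init):
        h = rng.standard_normal(width)  # preactivations at the base layer
        jac = np.eye(width)             # d h^0 / d h^0 is the identity
        for _ in range(depth):
            w = rng.standard_normal((width, width)) * np.sqrt(sigma_w2 / width)
            gate = (h > 0).astype(h.dtype)  # ReLU derivative at h
            jac = (w * gate) @ jac          # chain rule: W diag(gate) J
            h = w @ (h * gate)              # next preactivation: W relu(h)
        total += np.sum(jac ** 2) / width
    return total / n_init


if __name__ == "__main__":
    # Sweep the weight variance at fixed depth, as one would along one axis
    # of a phase diagram.
    for sigma_w2 in (1.0, 2.0, 4.0):
        print(sigma_w2, apjn_relu_mlp(width=256, depth=6, sigma_w2=sigma_w2))
```

In this toy setting each layer rescales the estimate by roughly sigma_w2 / 2, so the value shrinks with depth for sigma_w2 < 2, grows for sigma_w2 > 2, and stays of order one near sigma_w2 = 2. For real architectures, automatic differentiation replaces the explicit Jacobian product used here.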