Insights on Galaxy Evolution from Interpretable Sparse Feature Networks (SFNets)

We introduce sparse feature networks (SFNets), which contain a simple top-k sparsity constraint in their penultimate layers. We show that these SFNets can predict galaxy properties, such as gas metallicity or BPT line ratios, directly from image cutouts. SFNets produce interpretable feature activations, which can then be studied to better understand galaxy formation and evolution.
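To make the top-k constraint concrete, here is a minimal plain-Python sketch of the operation: keep the k largest activations in a feature vector and zero out the rest. It is an illustration only. The function name `top_k_sparsify` is hypothetical, it assumes "top-k" means the k largest values (not the largest magnitudes), and the implementation in src/model.py operates on PyTorch tensors and may differ in detail.

```python
def top_k_sparsify(activations, k):
    """Keep the k largest values in `activations`; set all others to zero."""
    if k <= 0:
        return [0.0] * len(activations)
    # Rank indices by activation value, largest first, and keep the first k.
    ranked = sorted(range(len(activations)),
                    key=lambda i: activations[i],
                    reverse=True)
    keep = set(ranked[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(activations)]


# Hypothetical penultimate-layer activations for one galaxy image.
features = [0.1, 2.5, -0.3, 1.7, 0.9]
sparse = top_k_sparsify(features, k=2)
print(sparse)  # [0.0, 2.5, 0.0, 1.7, 0.0]
```

With k=2, only two features are active for any given galaxy, which is what makes the activations easy to inspect one at a time.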

Requirements

This software uses fastai, which is built on PyTorch, along with a few other packages commonly found in the data science stack. We have tested this code with fastai==2.7.17 and torch==2.4.1 on both Linux and macOS.

Install requirements with:

pip install torch fastai numpy pandas matplotlib cmasher tqdm
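After installing, it can be useful to compare your installed versions with the tested ones. The following is a small sketch using only the standard library. The helper name `installed_versions` and the `TESTED_VERSIONS` dictionary are illustrative and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions this code was tested with, as stated in the Requirements section.
TESTED_VERSIONS = {"fastai": "2.7.17", "torch": "2.4.1"}


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, installed in installed_versions(TESTED_VERSIONS).items():
    tested = TESTED_VERSIONS[name]
    status = "matches" if installed == tested else "differs from"
    print(f"{name}: installed={installed}, {status} tested version {tested}")
```

A version mismatch does not necessarily mean the code will fail; it only indicates that the combination has not been tested.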

Directory Structure

./
├── data/
│   ├── images-sdss/
│   └── galaxies.csv
├── model/
├── results/
└── src/
    ├── config.py          
    ├── dataloader.py     
    ├── model.py         
    ├── main.py             
    └── trainer.py

Usage

Prepare your data:
- For convenience, the data can all be obtained via Zenodo. Simply download the images-sdss.tar.gz and unpack it (tar xzf images-sdss.tar.gz), and also download galaxies.csv.
- Alternatively, you can obtain the data directly from the source:
  - Construct galaxies.csv with the required columns (objID, oh_p50 for metallicity, or line flux measurements for BPT analysis). We used CASJobs to download galaxies using this query, and then enforced a signal-to-noise ratio (SNR) cut of 3 for all spectral lines.
  - Download SDSS galaxy images into data/images-sdss/. We used the DESI Legacy Viewer to download via the RESTful interface, e.g. http://legacysurvey.org/viewer/cutout.jpg?ra={ra}&dec={dec}&pixscale=0.262&layer=sdss&size=160.
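The cutout URL pattern above can be sketched in Python as follows. The query parameters are taken from the example URL. The function names, the default output directory, and the `{objID}.jpg` file naming are assumptions for illustration; check src/dataloader.py for the naming the code actually expects.

```python
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlretrieve

CUTOUT_ENDPOINT = "http://legacysurvey.org/viewer/cutout.jpg"


def cutout_url(ra, dec, pixscale=0.262, layer="sdss", size=160):
    """Build the cutout URL for a sky position given in degrees."""
    params = {"ra": ra, "dec": dec, "pixscale": pixscale,
              "layer": layer, "size": size}
    return f"{CUTOUT_ENDPOINT}?{urlencode(params)}"


def download_cutout(obj_id, ra, dec, out_dir="data/images-sdss"):
    """Download one cutout to out_dir/<obj_id>.jpg (assumed naming scheme)."""
    out_path = Path(out_dir) / f"{obj_id}.jpg"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(cutout_url(ra, dec), out_path)
    return out_path


# Building a URL needs no network access; downloading does.
print(cutout_url(150.0, 2.2))
```

To fetch the full sample, you would loop over the rows of galaxies.csv and call `download_cutout` for each object, ideally with a short pause between requests to avoid overloading the server.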
Run experiments:
- Modify main.py as needed, then run it with python main.py. For example:

from pathlib import Path  # needed for the model_dir and results_dir arguments below

from config import ExperimentConfig, DataConfig, TrainingConfig
from trainer import ModelTrainer

config = ExperimentConfig(
    name="metallicity_experiments",
    target="metallicity",
    k=2,
    model_dir=Path("../model"),
    results_dir=Path("../results"),
    data_config=DataConfig(),
    training_config=TrainingConfig()
)

# Train models
trainer = ModelTrainer(config)
trainer.train_model()

Models and results

The trained model weights can also be found on Zenodo.

Additionally, we have uploaded our trained model weights and sparse activation results here. The optimized ResNetTopK18 models should be able to reproduce the results shown in the paper.

Citation

This paper can be found on arXiv. For now, please use the following citation:

@ARTICLE{2025arXiv250100089W,
       author = {{Wu}, John F.},
        title = {Insights on Galaxy Evolution from Interpretable Sparse Feature Networks},
      journal = {arXiv e-prints},
     keywords = {Astrophysics - Astrophysics of Galaxies, Computer Science - Machine Learning},
         year = 2024,
        month = dec,
          eid = {arXiv:2501.00089},
        pages = {arXiv:2501.00089},
          doi = {10.48550/arXiv.2501.00089},
archivePrefix = {arXiv},
       eprint = {2501.00089},
 primaryClass = {astro-ph.GA},
       adsurl = {https://ui.adsabs.harvard.edu/abs/2025arXiv250100089W},
      adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

License

This project is licensed under the MIT License; please see the LICENSE file for details.
