Modeling and simulating queuing models such as M/M/1, M/M/k, G/G/1, G/M/1, and M/G/1 for Large Language Model (LLM) inference systems. These simulations bridge the gap between theoretical frameworks and practical workloads, providing insights for building scalable, energy-efficient AI infrastructure.
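As a minimal sketch of the kind of simulation described above, the snippet below runs a discrete-event M/M/1 queue (Poisson arrivals, exponential service, one server) and compares the simulated mean time in system against the closed-form result W = 1/(μ − λ). The arrival and service rates here are illustrative, not taken from the repository.

```python
import random

def simulate_mm1(lam, mu, n_jobs, seed=0):
    """Simulate an M/M/1 queue and return the mean time in system (W).

    lam: arrival rate (Poisson process), mu: service rate (exponential),
    single FIFO server. Requires lam < mu for the queue to be stable.
    """
    rng = random.Random(seed)
    arrival = 0.0       # time of the current job's arrival
    server_free = 0.0   # time at which the server next becomes idle
    total_sojourn = 0.0
    for _ in range(n_jobs):
        arrival += rng.expovariate(lam)    # exponential interarrival gap
        start = max(arrival, server_free)  # wait if the server is busy
        server_free = start + rng.expovariate(mu)  # service completion
        total_sojourn += server_free - arrival     # time in system
    return total_sojourn / n_jobs

# Illustrative rates: 8 requests/s arriving, server handles 10/s.
lam, mu = 8.0, 10.0
w_sim = simulate_mm1(lam, mu, n_jobs=200_000)
w_theory = 1.0 / (mu - lam)  # M/M/1 mean time in system
print(f"simulated W = {w_sim:.3f}, theoretical W = {w_theory:.3f}")
```

At this load (utilization ρ = λ/μ = 0.8), the simulated W should land near the theoretical value of 0.5 s; the same event loop generalizes to G/G/1 by swapping in other interarrival and service distributions.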
Updated Jan 5, 2025 - Jupyter Notebook