My research

Published

January 29, 2025

My long-term research goal is to improve stochastic models for complex systems, such as the weather, markets, and the human brain, that are determined by a large number of factors. I want to design interpretable mathematical models and sound statistical inference techniques to improve our scientific understanding of these systems.

Mathematical models for complex systems are typically intractable because they are governed by a large number of factors. However, one can often describe the limiting behavior of such systems as the number of factors grows large. This is similar to the idea from statistical mechanics that we can describe macroscopic behavior well (e.g., the ideal gas laws) even when the microscopic behavior is harder to study. You can find examples of this in my work on controlling a population of identical agents and on load balancing using randomized routing. In this direction, motivated by the search for a more realistic model for the design of data centers, I am currently studying a model for load balancing on networks.

Naturally, even our best mathematical models can only approximate real systems. This is particularly true of models that capture core mechanisms by making reasonable assumptions for mathematical tractability, such as Gaussianity, independence, or exponentially distributed event times. However, it is not clear how to do statistical inference for these models once we acknowledge that they may only be approximations. Motivated by the coarsened inference framework of Miller and Dunson, I proposed the principle of optimism in data analysis, which states that one should allow some degree of data re-interpretation to improve inference for approximate models. Along with the theory behind this principle, I am developing practical methods for statistical inference with approximate models and applying them to problems in economics and neuroscience. Apart from this, I am also developing principled statistical frameworks for novel exploratory data analysis tasks, such as clustering with uncertainty and extracting statistically informative networks from noisy data.