Research directions I've explored
A method for learning the quantization step size end-to-end during training, enabling low-precision neural network inference with minimal accuracy degradation.
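A minimal sketch of the core mechanism, assuming a PyTorch-style fake quantizer: the step size is an ordinary trainable parameter, and the non-differentiable round is bridged with a straight-through estimator so gradients reach both the input and the step. The class name and initialization below are illustrative, not the published method's code.

```python
import torch
import torch.nn as nn

class LearnedStepQuantizer(nn.Module):
    """Fake-quantize a tensor with a step size learned by backprop (sketch)."""

    def __init__(self, bits: int = 4, init_step: float = 0.1):
        super().__init__()
        self.qmin = -(2 ** (bits - 1))
        self.qmax = 2 ** (bits - 1) - 1
        self.step = nn.Parameter(torch.tensor(init_step))  # trained end-to-end

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale onto the integer grid and clamp to the representable range.
        q = torch.clamp(x / self.step, self.qmin, self.qmax)
        # Straight-through estimator: forward uses the rounded value,
        # backward treats round() as the identity.
        q = q + (torch.round(q) - q).detach()
        return q * self.step

quant = LearnedStepQuantizer(bits=4)
quant(torch.randn(8)).sum().backward()  # gradient now flows into quant.step
```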
An exploration of the fundamental limits of neural network inference along the axes of energy consumption, memory footprint, and computational time, with implications for hardware-aware model design.
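A back-of-envelope roofline estimate illustrates the kind of analysis involved: count the FLOPs and bytes moved for a layer, and whichever of compute or memory traffic dominates sets the time floor. The peak-throughput, bandwidth, and energy constants below are placeholder assumptions, not measured hardware numbers.

```python
# Roofline-style estimate for a single linear layer (placeholder hardware).
PEAK_FLOPS = 100e12       # assumed accelerator peak, FLOP/s
PEAK_BW = 1e12            # assumed memory bandwidth, bytes/s
ENERGY_PER_BYTE = 10e-12  # assumed DRAM access energy, joules/byte

def linear_layer_limits(batch, d_in, d_out, bytes_per_elem=2):
    flops = 2 * batch * d_in * d_out                    # multiply-accumulates
    weight_bytes = d_in * d_out * bytes_per_elem        # weights streamed in
    act_bytes = batch * (d_in + d_out) * bytes_per_elem # activations in/out
    total_bytes = weight_bytes + act_bytes
    intensity = flops / total_bytes                     # FLOPs per byte moved
    # Time is bounded by the slower of compute and memory traffic.
    time_s = max(flops / PEAK_FLOPS, total_bytes / PEAK_BW)
    energy_j = total_bytes * ENERGY_PER_BYTE            # memory-traffic term only
    return intensity, time_s, energy_j

# Small-batch inference tends to land on the memory-bound side of the roofline:
print(linear_layer_limits(batch=1, d_in=4096, d_out=4096))
```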
A simple and scalable quantization-aware training approach for large language models, enabling efficient deployment without significant loss in downstream task performance.
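A compressed sketch of what weight-only quantization-aware training can look like in PyTorch, assuming symmetric per-channel fake quantization with a straight-through estimator; `QATLinear` and the quantizer are illustrative stand-ins, not the approach's actual API.

```python
import torch
import torch.nn as nn

def fake_quantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Symmetric per-output-channel fake quantization with an STE (sketch)."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    w_q = torch.round(w / scale).clamp(-qmax - 1, qmax) * scale
    # Forward uses quantized weights; gradients pass straight through to w.
    return w + (w_q - w).detach()

class QATLinear(nn.Linear):
    """nn.Linear whose weights are fake-quantized on every forward pass."""

    def __init__(self, in_features, out_features, bits=4, **kw):
        super().__init__(in_features, out_features, **kw)
        self.bits = bits

    def forward(self, x):
        return nn.functional.linear(x, fake_quantize(self.weight, self.bits), self.bias)
```

Swapping `QATLinear` in for the model's linear layers lets the weights adapt to the quantization grid during fine-tuning, so the low-precision model is trained rather than merely converted.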
A principled approach to selecting which layers to quantize and at what precision, guided by an entropy-based sensitivity approximation that reduces the search cost over mixed-precision configurations.
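One way such a proxy could look, sketched in PyTorch: score each layer by the Shannon entropy of its weight histogram and keep the highest-scoring fraction at higher precision, avoiding a combinatorial search. The scoring rule and bit allocation here are illustrative assumptions, not the actual approximation.

```python
import torch
import torch.nn as nn

def weight_entropy(w: torch.Tensor, num_bins: int = 256) -> float:
    """Shannon entropy of the weight histogram, as a sensitivity proxy (sketch)."""
    hist = torch.histc(w.float().flatten(), bins=num_bins)
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins before taking the log
    return float(-(p * p.log2()).sum())

def assign_precisions(model: nn.Module, low_bits=4, high_bits=8, frac_high=0.25):
    """Keep the highest-entropy fraction of linear layers at high precision."""
    scores = {name: weight_entropy(m.weight)
              for name, m in model.named_modules() if isinstance(m, nn.Linear)}
    k = max(1, int(len(scores) * frac_high))
    cutoff = sorted(scores.values(), reverse=True)[k - 1]
    return {name: (high_bits if s >= cutoff else low_bits) for name, s in scores.items()}
```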
An investigation into how augmented feedback signals improve lateral knowledge transfer in progressive neural networks, drawing inspiration from cognitive science models of learning.
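A minimal sketch of the lateral-transfer substrate this investigation builds on, assuming a two-column PyTorch setup: the first column is frozen and its hidden activations feed the new column through an adapter layer, which is where augmented feedback signals would act. Module names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class ProgressiveColumn(nn.Module):
    """Second-task column with a lateral adapter from a frozen first column (sketch)."""

    def __init__(self, prev_column: nn.ModuleList, d_in=32, d_hid=64, d_out=10):
        super().__init__()
        self.prev = prev_column
        for p in self.prev.parameters():
            p.requires_grad = False             # earlier column stays frozen
        self.fc1 = nn.Linear(d_in, d_hid)
        self.fc2 = nn.Linear(d_hid, d_out)
        self.lateral = nn.Linear(d_hid, d_hid)  # adapter from column 1's hidden layer

    def forward(self, x):
        h_prev = torch.relu(self.prev[0](x))    # frozen column's hidden features
        h = torch.relu(self.fc1(x) + self.lateral(h_prev))
        return self.fc2(h)

prev = nn.ModuleList([nn.Linear(32, 64), nn.Linear(64, 10)])  # trained on task 1
col2 = ProgressiveColumn(prev)
out = col2(torch.randn(4, 32))
```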
An extension of world model architectures with attention mechanisms, improving the accuracy and generalization of learned dynamics models for physical prediction tasks.
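A sketch of the idea in PyTorch, assuming a latent-state world model: the next latent is predicted from the current action plus an attention summary over a window of recent latents, in place of a purely recurrent state update. Dimensions and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentiveDynamics(nn.Module):
    """Latent dynamics model that attends over a window of past states (sketch)."""

    def __init__(self, d_latent=32, d_action=4, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_latent, n_heads, batch_first=True)
        self.act_proj = nn.Linear(d_action, d_latent)
        self.predict = nn.Linear(2 * d_latent, d_latent)

    def forward(self, latents, action):
        # latents: (batch, time, d_latent); action: (batch, d_action)
        query = latents[:, -1:, :]                 # attend from the current state
        context, _ = self.attn(query, latents, latents)
        feats = torch.cat([context.squeeze(1), self.act_proj(action)], dim=-1)
        return self.predict(feats)                 # predicted next latent state

model = AttentiveDynamics()
next_z = model(torch.randn(8, 16, 32), torch.randn(8, 4))
```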