@codetalker7
Hello! I’m an incoming CS PhD student at the Kahlert School of Computing, University of Utah. On this website, you’ll find information about my work and interests.
Currently, I’m interested in building the theoretical foundations of large language models (LLMs). This spans the inner workings of these models, from their ability to retain long-term information (memory) and their reasoning capabilities (e.g., in-context learning) to making them more efficient (model compression). I’m also interested in other general problems in theoretical machine learning, in areas like high-dimensional statistics, online/statistical learning, and privacy-related problems in these areas.
My socials:
- Email: chaudhary@cmi.ac.in, urssidd@gmail.com.
- Google Scholar: Siddhant Chaudhary.
- Discord: Siddhant Chaudhary#8409.
- LinkedIn: Siddhant Chaudhary.
- Twitter: @codetalker07.
- Bluesky: @codetalker7.
- Keybase: @codetalker7.
- Instagram: @codetalker7.
Previous work and affiliations
As part of the Laboratory for Computational Social Systems (LCS2) @IIT Delhi (led by Dr. Tanmoy Chakraborty), I previously worked as a research assistant on problems revolving around LLMs, particularly model and inference efficiency. During this time, we developed a novel calibration-free model pruning technique, studied scaling laws for model pruning, and developed a new technique for compressing KV caches during LLM inference.
During my master’s (@CMI), I also worked as a research assistant at the Networks and Learning Group at TIFR Mumbai (led by Dr. Abhishek Sinha), where I worked on problems at the intersection of learning theory, convex optimization, and online algorithms. This work involved designing provably optimal online policies for several problems arising frequently in ML: the online subset selection problem, for which we introduced an efficient online policy (the first of its kind to incorporate hints when learning optimal subsets); the contextual bandits problem with a new concave $\alpha$-fairness objective, for which we developed the first such policy in the bandit-information setup; and problems in online meta-learning and multi-task learning. I try to keep my research up to date on this website, but it can also be found on my Google Scholar.
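For context, the concave $\alpha$-fairness utility mentioned above is usually defined as below; this is the standard definition from the fairness literature (Mo and Walrand, 2000), and the exact parameterization in our paper may differ slightly:

```latex
% Standard \alpha-fairness utility; the parameterization in our
% paper may differ slightly from this textbook form.
\[
  f_\alpha(x) =
  \begin{cases}
    \dfrac{x^{1-\alpha}}{1-\alpha}, & \alpha \ge 0,\ \alpha \neq 1,\\[6pt]
    \log x, & \alpha = 1.
  \end{cases}
\]
% Larger \alpha trades total utility for fairness: \alpha = 0 recovers
% the utilitarian (sum-of-rewards) objective, \alpha = 1 gives
% proportional fairness, and \alpha \to \infty approaches max-min fairness.
```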
Open source and writing
I also love contributing to and exploring open source software. My contributions can be found on my GitHub profile (@codetalker7) or here (but this page could be outdated). I’ve also described my configuration (though some parts of it could be a bit old). I used to speed-type (this is my old TypeRacer profile, which I’ve abandoned; I’m currently using this one). I’m also a huge NBA fan, and I love listening to and playing metal music (one of my favorite bands is Periphery)!
Occasionally, I write about random things (mostly related to tech, theory, or OSS). You can find all of my blog posts in the navigation bar on the left.
News
- [04/2025] My work on ColBERT.jl was accepted as a main talk at JuliaCon 2025!
- [01/2025] Our paper on $\texttt{PruneNet}$, a novel structured model compression technique, was accepted to ICLR 2025!
- [10/2024] Successfully completed my GSoC 2024 project and released v0.1.0 of ColBERT.jl. Check out a related blog post on the Julia Forem or on this page.
- [06/2024] Our paper on fair contextual bandits was accepted at FoRLaC@ICML2024!
- [05/2024] Joined LCS2@IITD as an RA!
- [05/2024] Accepted as a Google Summer of Code 2024 contributor (my second time doing GSoC)! I’ll be working on a GenAI project for The Julia Language.
- [10/2023] Our paper on fair contextual bandits is out.
- [02/2023] Our paper on the online subset selection problem is out.
Blog posts
- [06/2025] I wrote about my experiences with grad school applications.
- [05/2025] Started writing some posts on theory CS tools. This is a living list; I’ll add more things to it as I go.
- [08/2024] A blog post about my implementation of ColBERT in pure Julia.
- [07/2024] A small discussion of how I used to manage knowledge bases. Note: this is a bit outdated; I use a much simpler method now.
Upcoming:
Currently empty. Will update when I have something.