Since 2022 I have been working on the transformer models used by Leela Chess Zero to guide search.
My main contribution was a position encoding that enables models to play as though they were 2.5x larger, at the cost of a ~15% latency increase.
At 190 million parameters, the latest architecture iteration produces a policy competitive with grandmasters. Modern GPUs can evaluate this model thousands of times per second, allowing the engine to effectively emulate the strength of thousands of grandmasters.
A weaker version of this feat was achieved independently and concurrently in a 2024 paper by DeepMind, whose agent evaluates a 270-million-parameter model once for each legal move. Our strongest model outperforms DeepMind's with 30x less computation.
This architecture has also been the subject of a NeurIPS article and two preprints.
One of the most interesting discoveries was an attention head which transmits information from the "to" square of the player's predicted follow-up move to the "to" square of the player's next move.
Each of my testing runs took a week or two on an A100 GPU, and training of full models took on the order of months on a cluster of 8 A100s.
All my work was in TensorFlow, and models were trained in the supervised setting on datasets of billions to tens of billions of positions generated by prior reinforcement learning runs.
Some of the improvements I made to search include uncertainty weighting, first proposed for computer Go, which allows the engine to put more effort into positions the neural network reports uncertainty about, gaining 5 Elo; and a scheme for sometimes reusing evaluations across positions that differ only in the 50-move-rule count, gaining around 10 Elo.
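A minimal sketch of both ideas, with illustrative names and constants that are not Leela's actual tuning:

```python
def playout_weight(uncertainty, baseline=0.05, floor=0.02, cap=2.0):
    """Weight an evaluation inversely with the network's reported
    uncertainty, so confident evaluations settle a node's value faster.
    All constants here are illustrative, not the engine's real values."""
    return min(cap, baseline / max(uncertainty, floor))

class Node:
    """Search node keeping an uncertainty-weighted running average."""
    def __init__(self):
        self.weight_sum = 0.0
        self.value_sum = 0.0  # weighted sum of backed-up evaluations

    def backup(self, value, uncertainty):
        w = playout_weight(uncertainty)
        self.weight_sum += w
        self.value_sum += w * value

    @property
    def q(self):  # uncertainty-weighted mean value of this node
        return self.value_sum / self.weight_sum if self.weight_sum else 0.0

def eval_cache_key(position_hash, halfmove_clock, threshold=80):
    """Share one cache entry among positions that differ only in the
    50-move counter while the counter is far from the limit.
    The threshold is a made-up illustration, not the deployed rule."""
    return (position_hash, 0 if halfmove_clock < threshold else halfmove_clock)
```

Because uncertain evaluations carry less weight, a node whose evaluations disagree stays unsettled longer and attracts more search effort; the cache key collapses the 50-move counter only while it cannot yet affect the game's outcome.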
Versions of Leela equipped with a Chessformer model and these search improvements defeated the reigning champion, Stockfish, at the TCEC Cup 11 and TCEC Swiss 6 and 7 championships.
See my blog post and our preprint. We have a newer submission under review.
Since 2024 I have been involved in the development of Stockfish, which is widely considered the strongest chess engine in existence. I have committed to the official Stockfish repository 60 times, and several of my favorite contributions each gained measurable Elo.
In 2015 I created a distributed computing project in BOINC to calculate lower bounds for a combinatorial object called a van der Waerden number
using a construction based on the discrete logarithm modulo a prime. This started out as a way to get acquainted with Unix and C++ and to combine the computing power
of a few laptops into an interesting project, but the project soon grew to include over 500 users in 90 countries. We discovered around a dozen new bounds,
leading to a paper.
The infrastructure consisted of a server running a MySQL database that communicated through BOINC with clients running a 200-line C++ program.
The server assigned ranges of integers to clients, and the clients returned the length of the longest monochromatic arithmetic progression in the coloring corresponding to each prime in that range. Each result was computed by two clients, which validated each other's work before the server updated the table of bounds in real time.
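For two colors, the discrete-log construction reduces to coloring each integer by whether it is a quadratic residue modulo the prime, i.e. by the parity of its discrete logarithm. A sketch of the core computation each client performed, in illustrative Python rather than the project's C++:

```python
def quadratic_residue_coloring(p):
    """Color 1..p-1 by discrete-log parity modulo an odd prime p:
    quadratic residues get color 1, non-residues get color 0."""
    residues = {(i * i) % p for i in range(1, p)}
    return [1 if n in residues else 0 for n in range(1, p)]

def longest_mono_ap(colors):
    """Length of the longest monochromatic arithmetic progression
    in a list of colors (brute force over common differences)."""
    n = len(colors)
    best = 1 if n else 0
    for d in range(1, n):        # common difference
        for r in range(d):       # starting residue class mod d
            run = 1
            for i in range(r + d, n, d):
                run = run + 1 if colors[i] == colors[i - d] else 1
                best = max(best, run)
    return best
```

If the longest monochromatic progression for a prime p has length L, the coloring witnesses the lower bound W(2, L + 1) > p - 1. For p = 11 this sketch finds L = 3, so no 4-term monochromatic progression exists in [1, 10], consistent with the known value W(2, 4) = 35.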
I revamped the project in high school to focus on the two-color case of the problem, which allowed packing the coloring into single bits and cut the memory requirement by a factor of 8.
This allowed pushing the upper limit of the primes checked from a billion to 4 billion.
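The packing itself is the standard one-bit-per-color trick: store each color in a single bit instead of a whole byte. A sketch in illustrative Python, whereas the actual client was C++:

```python
class PackedColoring:
    """Two-coloring stored one bit per integer, 8x less memory than
    the naive one-byte-per-color layout."""
    def __init__(self, n):
        self.bits = bytearray((n + 7) // 8)  # ceil(n / 8) bytes

    def set(self, i, color):
        if color:
            self.bits[i >> 3] |= 1 << (i & 7)    # set bit i
        else:
            self.bits[i >> 3] &= ~(1 << (i & 7)) & 0xFF  # clear bit i

    def get(self, i):
        return (self.bits[i >> 3] >> (i & 7)) & 1
```

With four billion integers this is the difference between half a gigabyte and four gigabytes per in-flight coloring, which is what made the larger prime range feasible on volunteer hardware.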
I recently noticed a great deal of interest in engines like Stockfish and Lc0, with videos of games between them garnering hundreds of thousands of views; at the same time, misconceptions abound about how these engines work. I make videos about the development of these engines, with a focus on my own work. The channel has ~4 thousand subscribers and over a hundred thousand views, and these videos have led several new contributors to Stockfish.