Gemma 3 1B Instruct is an open-weight LLM. The 1B variant supports a 32K-token context window (the 128K window applies to the larger Gemma 3 models). This demo uses only a 2K-token context.
The BPP library implements matrix multiplication with far fewer scalar multiplications.
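How BPP reduces the multiplication count is not described here. As a generic illustration of the idea only, and not BPP's actual method, the sketch below uses Strassen's classic scheme, which multiplies two 2x2 matrices with 7 scalar multiplications instead of the usual 8. The function name `strassen_2x2` is hypothetical and chosen for this example.

```python
def strassen_2x2(a, b):
    """Multiply two 2x2 matrices using 7 multiplications (Strassen).

    Illustration only: this is NOT taken from the BPP library.
    """
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b

    # Seven products replace the eight of the schoolbook method.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Recombine the products using additions and subtractions only.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


if __name__ == "__main__":
    print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Applied recursively to matrix blocks, this kind of trade (fewer multiplications, more additions) is what lowers the overall multiplication count for large matrices.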