Headline:
Hacker Tops Hugging Face Open LLM Leaderboard by Duplicating Model Layers on Two Gaming GPUs
Key Facts
- What: Independent researcher David Noel Ng achieved the top spot on the Hugging Face Open LLM Leaderboard with "dnhkng/RYS-XLarge" by duplicating seven middle layers of an existing 72B-parameter model without modifying any weights.
- How: The technique, which Ng calls "LLM Neuroanatomy," involves copying specific layers responsible for abstract reasoning and stitching them back into the model architecture.
- Hardware: The project was completed using only two gaming GPUs in a home basement setup.
- When: The model reached #1 in mid-2024 on the Hugging Face Open LLM Leaderboard, which at that time evaluated models across six challenging benchmarks: IFEval, BBH, MATH Lvl 5, GPQA, MuSR, and MMLU-PRO.
- Discovery: Ng's approach stemmed from observations about how LLMs process Base64-encoded inputs and analysis of the "Goliath-120B" merged model.
Lead paragraph
An independent researcher topped the competitive Hugging Face Open LLM Leaderboard in mid-2024 without training a new model, merging weights, or performing any gradient updates. David Noel Ng achieved the feat by taking a 72-billion-parameter model, duplicating seven of its middle layers, and reassembling the architecture — a technique he describes as "LLM Neuroanatomy." The resulting model, dnhkng/RYS-XLarge, claimed the top position on the leaderboard's six-benchmark suite, demonstrating that strategic architectural changes alone can outperform heavily optimized and fine-tuned competitors.
Body
Ng detailed his unconventional journey in a March 10, 2026 blog post titled "LLM Neuroanatomy: How I Topped the AI Leaderboard Without Changing a Single Weight." According to the post, the project originated from two curious observations made in late 2023.
The first clue came from experimenting with Base64-encoded prompts. Ng found that sufficiently capable 2023-era LLMs could decode Base64 input, reason about the underlying question, and output a Base64-encoded response — all while performing genuine multi-step reasoning. This behavior suggested that early transformer layers function primarily as input "translators" that convert various encodings into an abstract internal representation, while later layers serve as "re-translators" that convert that representation back into the desired output format.
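The probing setup described above can be sketched in a few lines of standard-library Python. The encoding helpers are real; the `ask_model` call is a hypothetical placeholder for whichever chat API you use, not code from Ng's post.

```python
import base64

def to_b64(text: str) -> str:
    """Encode a prompt as Ng's experiment did: plain UTF-8 text -> Base64."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

def from_b64(blob: str) -> str:
    """Decode a Base64-encoded model response back into readable text."""
    return base64.b64decode(blob).decode("utf-8")

# Encode a reasoning question; a capable model receiving only this string
# (plus an instruction to answer in Base64) must decode it, reason about
# the content, and re-encode its answer.
question = "If a train leaves at 3pm travelling 60 km/h, how far has it gone by 5pm?"
encoded_prompt = to_b64(question)

# Hypothetical API call, not part of the original experiment's published code:
# reply_b64 = ask_model(f"Reply only in Base64: {encoded_prompt}")
# print(from_b64(reply_b64))

print(encoded_prompt[:24])  # the model sees only opaque Base64 text
assert from_b64(encoded_prompt) == question
```

If the decoded reply answers the question correctly, the model has done its reasoning on an internal representation rather than on the surface text, which is the observation that motivated the "translator / reasoner / re-translator" picture.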
This led Ng to speculate that the middle layers of the model are dedicated to pure, abstract reasoning in a format independent of human language or specific encodings.
The second clue emerged from examining "Goliath-120B," a Frankenmerged model released by Hugging Face user Alpindale in November 2023. Rather than simply stacking two fine-tuned Llama-2 70B models, Alpindale alternated layers between them and created feedback connections where outputs from later layers fed back into earlier ones. Ng found the performance interesting but not revolutionary. However, the unusual architecture prompted him to investigate how different layers contribute to model capabilities.
To explore these ideas further, Ng developed what he calls a "homebrew brain scanner" for transformer models. This tool allowed him to analyze the internal structure of LLMs and identify which specific layers appeared responsible for the core reasoning process. After months of experimentation, he determined that duplicating a particular block of seven middle layers from a 72B model produced significant performance gains when inserted back into the architecture.
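The splice itself reduces to list surgery on the decoder stack. The sketch below uses integer stand-ins for decoder blocks; the depth (80 layers) and the block boundaries are illustrative assumptions, since the exact indices Ng duplicated are not given here.

```python
# Hedged sketch of layer duplication: treat the decoder stack as a list,
# copy a contiguous block of seven middle layers, and splice the copy in
# directly after the originals. No weights are modified; the block simply
# runs twice during the forward pass.

NUM_LAYERS = 80               # typical depth for a 72B-class decoder (assumption)
DUP_START, DUP_END = 40, 47   # hypothetical seven-layer block, half-open [start, end)

layers = list(range(NUM_LAYERS))   # integer stand-ins for decoder blocks
block = layers[DUP_START:DUP_END]  # the seven layers to amplify

expanded = layers[:DUP_END] + block + layers[DUP_END:]

assert len(expanded) == NUM_LAYERS + 7
assert expanded[DUP_END:DUP_END + 7] == block  # the duplicated run
```

On a real Hugging Face checkpoint the same operation would mean rebuilding the model's decoder `ModuleList` with deep copies of the chosen blocks and raising `num_hidden_layers` in the config to match, at the cost of proportionally more memory and compute per token.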
The resulting RYS-XLarge model reached the top of the Hugging Face Open LLM Leaderboard, surpassing entries from well-funded labs and from experienced fine-tuning practitioners behind creatively named models such as Nous-Hermes, Dolphin, and NeuralBeagle14-7B.
Impact
Ng's work highlights an underexplored dimension of LLM optimization: architectural surgery rather than traditional training or fine-tuning. By demonstrating that simply providing additional copies of key reasoning layers can boost performance, the research suggests current models may contain redundant or bottlenecked computational pathways that can be amplified without expensive retraining.
For developers and researchers with limited resources, the approach is particularly significant because it was accomplished on consumer gaming hardware. This democratizes advanced model optimization techniques that have traditionally required massive compute clusters.
The findings also contribute to a growing body of work attempting to understand the "neuroanatomy" of transformer models — mapping which layers handle specific functions such as input processing, abstract reasoning, and output generation. This could lead to more efficient model designs and better interpretability tools.
The Hugging Face Open LLM Leaderboard itself had been overhauled with tougher benchmarks around the time of Ng's run, a change that produced more varied rankings and shifted some models' positions dramatically, according to reports on the leaderboard overhaul.
What's Next
Ng has not yet published his findings in a formal academic paper, explaining that he found blogging about the discovery process more engaging than drafting a scientific paper. He states that, to his knowledge, the core discovery about LLM internal structure had not been published elsewhere at the time of his post.
The technique raises questions about whether similar layer-duplication or "layer amplification" strategies could be applied systematically across different model families and sizes. It also suggests potential for hybrid approaches that combine architectural modifications with traditional fine-tuning or merging techniques.
As the AI community continues to explore more efficient ways to improve open models without massive training runs, methods like LLM Neuroanatomy may inspire new optimization paradigms that focus on understanding and enhancing existing model structures rather than simply scaling compute and data.
Sources
- LLM Neuroanatomy: How I Topped the AI Leaderboard Without Changing a Single Weight
- Hugging Face Open LLM Leaderboard
- Open LLM Leaderboard Space
- Hugging Face Overhauls Open LLM Leaderboard with Tougher Benchmarks

