Artificial Intelligence reaches its true value when it solves complex, real-world challenges. At WhiteBox, we see ourselves as data translators capable of solving problems across any industry, and recently, we decided to put this to the test in the high-stakes territory of molecular biology.
We are proud to share that we achieved 15th place out of 1,867 teams in the Stanford RNA 3D Folding competition on Kaggle. Ranking in the Global Top 1% validates that our methodology can compete with the world’s leading laboratories.
The Challenge: Mastering a Domain from Scratch
We didn’t approach this as a cold math problem. Having a Biomedical Engineer on the team was key to bridging the gap between abstract numbers and biological reality. She led the internal training, ensuring the technical squad understood the "why" behind the data before we even touched the code.
For a few weeks, we effectively became "data biologists," researching ligands, sequences, and the physics of how RNA folds so that our models respected the rules of nature. To stay precise, we even developed our own 3D visualization dashboard to inspect every prediction. This allowed us to spot structural flaws, such as steric clashes (atoms placed physically too close together), that traditional metrics might overlook.
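To make the idea of a steric clash concrete, here is a minimal sketch of the kind of check such a dashboard could run. It is an illustration only, not the dashboard's actual code: the function name `find_clashes`, the 3.0 Å threshold, and the choice of one representative atom per residue (for example C1') are all assumptions made for this example.

```python
import math
from itertools import combinations


def find_clashes(coords, min_dist=3.0, min_separation=2):
    """Return index pairs of residues that sit closer than min_dist.

    coords: list of (x, y, z) tuples in angstroms, one per residue
            (assumed here to be a single representative atom such as C1').
    min_dist: hypothetical clash threshold in angstroms.
    min_separation: skip pairs closer than this along the chain, since
            sequence neighbours are expected to be near each other.
    """
    clashes = []
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        if j - i < min_separation:
            continue  # adjacent residues are legitimately close
        if math.dist(a, b) < min_dist:
            clashes.append((i, j))
    return clashes


# Toy chain: residue 4 folds back almost on top of residue 0.
toy_chain = [
    (0.0, 0.0, 0.0),
    (6.0, 0.0, 0.0),
    (12.0, 0.0, 0.0),
    (12.5, 1.0, 0.0),
    (0.5, 0.5, 0.5),
]
clash_pairs = find_clashes(toy_chain)
print(clash_pairs)
```

A global score can still look acceptable when two distant parts of the chain overlap like this, which is why a pairwise distance check is a useful complement to visual inspection.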

The Engineering Behind the Solution
To climb the leaderboard, we designed an engineering strategy based on combining multiple layers of intelligence:
- Our Own Template-Based Modeling (TBM): Rather than predicting every structure from scratch, we built our own TBM system, a method that uses known biological structures as a scaffold or mold to map out new ones. By using Biopython to align unknown sequences with existing, proven structural data, we created a robust baseline that served as the backbone for our more complex architectures.
- State-of-the-Art Architectures (RNAPro & Protenix): We integrated and optimized cutting-edge models from industry leaders like NVIDIA and ByteDance. We fine-tuned their diffusion parameters and refinement cycles, an iterative process in which the AI repeatedly "polishes" a structure toward its most stable form. This allowed our system to handle everything from short sequences to very large RNA structures with high precision.
- In-house Innovation: Our main innovation involved fine-tuning diffusion models, advanced architectures that generate high-fidelity structures by gradually turning "noise" into clear and organized patterns. We took these models a step further by modifying their training process to include approximations of the TM-score. Because the TM-score is a standard metric in structural biology for measuring how accurately a predicted shape matches a real molecule, integrating it directly into the learning process was a major boost to our precision. The journey also involved hurdles such as catastrophic forgetting, where a model loses previously learned skills while learning new tasks, and every obstacle became a lesson in how to build more resilient AI.
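For readers unfamiliar with the TM-score, the sketch below shows the core formula in plain Python. It is a simplified illustration, not our training code: it assumes the predicted and reference structures are already superimposed with a one-to-one residue correspondence (the full metric also searches for the best superposition), and it assumes the RNA-specific distance scale d0 used by common RNA alignment tools. The function names `d0_rna` and `tm_score_fixed` are made up for this example.

```python
import math


def d0_rna(length):
    """Distance scale d0 in angstroms (assumed RNA convention)."""
    if length >= 30:
        return 0.6 * math.sqrt(length - 0.5) - 2.5
    # Short chains use fixed values: <12, 12-15, 16-19, 20-23, 24-29.
    for upper, d0 in ((12, 0.3), (16, 0.4), (20, 0.5), (24, 0.6)):
        if length < upper:
            return d0
    return 0.7


def tm_score_fixed(pred, ref):
    """TM-score for two already-superimposed coordinate lists.

    pred, ref: equal-length lists of (x, y, z) tuples, one per residue.
    Returns a value in (0, 1]; 1.0 means the structures coincide.
    """
    if len(pred) != len(ref) or not ref:
        raise ValueError("pred and ref must be non-empty and equal length")
    d0 = d0_rna(len(ref))
    total = sum(
        1.0 / (1.0 + (math.dist(p, r) / d0) ** 2)
        for p, r in zip(pred, ref)
    )
    return total / len(ref)


reference = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
shifted = [(x + 0.3, y, z) for x, y, z in reference]

perfect_score = tm_score_fixed(reference, reference)
shifted_score = tm_score_fixed(shifted, reference)
print(perfect_score, shifted_score)
```

Because every term is a smooth function of the per-residue distance, a quantity like `1 - tm_score_fixed(pred, ref)` can in principle serve as a loss term once it is rewritten in a tensor library, which is the general idea behind using a TM-score approximation during training.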
Talent That Transforms
Securing 15th place globally is a true reflection of our culture. We want to give a special shout-out to our squad of Data Scientists who led this initiative with remarkable drive. At WhiteBox, we don't believe in watching from the sidelines. Our team dives deep into every challenge, innovating and competing at the highest level alongside thousands of experts worldwide.
Want to dive into the technical details? Check out our full Solution Writeup on Kaggle.



