SimpleFold: Folding Proteins is Simpler than You Think

Protein folding models have achieved groundbreaking results since the introduction of AlphaFold2, and are typically built by
integrating domain expertise into their architectural designs and training pipelines. Nonetheless, given the
success of generative models across different but related problems, it is natural to ask whether these architectural
designs are necessary to build performant models. In this paper, we introduce SimpleFold, the first flow-matching based
protein folding model that uses only general-purpose transformer layers. Instead of relying on expensive modules
like triangle attention or pair representation biases, or on carefully crafted training objectives, SimpleFold employs standard
transformer blocks with adaptive layers and is trained via a generative flow-matching objective. We scale SimpleFold to
3B parameters and train it on more than 8.6M distilled protein structures together with experimental PDB data. To the
best of our knowledge, SimpleFold is the largest-scale folding model developed to date. On standard folding benchmarks,
the SimpleFold-3B model achieves competitive performance compared to state-of-the-art baselines. Due to its generative
training objective, SimpleFold also demonstrates strong performance in ensemble prediction. SimpleFold challenges the
reliance on complex domain-specific architecture designs in folding, highlighting an alternative yet important avenue of
progress in protein structure prediction.
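
To make the flow-matching training named above more concrete, the following is a minimal toy sketch of a generic flow-matching loss with a straight-line path between noise and data, plus Euler integration for sampling. It is not taken from SimpleFold: the linear path, the Gaussian noise source, the flat list of coordinates, and every function name (`interpolate`, `target_velocity`, `flow_matching_loss`, `euler_sample`) are assumptions chosen for illustration, and `predict_velocity` stands in for whatever network is being trained.

```python
import random

def interpolate(x0, x1, t):
    """Point at time t on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Time derivative of the straight path, which is the constant x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(predict_velocity, x1, rng):
    """Single-sample estimate of the loss for one data point x1.

    predict_velocity(x_t, t) is a hypothetical model callable that returns
    a velocity with the same length as x_t.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]   # assumed Gaussian noise sample
    t = rng.random()                          # time drawn uniformly from [0, 1)
    x_t = interpolate(x0, x1, t)
    v_pred = predict_velocity(x_t, t)
    v_true = target_velocity(x0, x1)
    # Mean squared error between predicted and target velocity.
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(x1)

def euler_sample(predict_velocity, x0, steps=10):
    """Generate a sample by integrating the learned velocity from t=0 to t=1."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = predict_velocity(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

Because sampling starts from random noise, running `euler_sample` from several different `x0` draws yields several outputs for the same input, which is the general mechanism by which a generative objective of this kind can support ensemble prediction.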
