The code from Brian Hie and Peter Kim’s lab at Stanford, described in this 2023 Nature Biotech paper, can recommend mutations in proteins. Those mutations have been shown experimentally to be beneficial for antigen binding affinity, thermal stability, poly-specific binding, immunogenicity, and viral neutralization. This is amazing and cool, right?
I was curious to test this myself. Using conda, git, and YAML, I had the Python scripts up and running in a short time. This is thanks to the professional code and documentation of the main author on the paper, Brian Hie, who is now an assistant professor at Stanford. He has a background in computer science and was trained at Stanford and MIT, with added stints at Meta, Google, Illumina, Salesforce and Microsoft. No wonder he can code! Good for me. Also noteworthy: Brian was a contributor to the ESM models published in this Meta AI Science paper from 2023. Some of those models (ESM-1b and ESM-1v_*1…5) were used here.
Besides the easy setup, I was surprised to see that the ESM models even suggest mutations on the best-selling antibody of all time, Humira. Output after ~30 seconds:

Sharing my insights on the paper’s results:
Three of the seven tested antibodies (MEDI8852, mAb114 and REGN10987) were in the clinic; in other words, they had already undergone thorough in vitro and in vivo evolution. This fact adds an extra layer of significance to the obtained results. Another intriguing observation, also highlighted by the authors, is that approximately half of the recommended mutations, such as the E65K mutation for Humira VH, occur in framework regions that typically aren’t associated with antigen binding.
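For readers unfamiliar with labels like E65K: the first letter is the original (wild-type) amino acid, the number is the position, and the last letter is the replacement. Here is a minimal sketch of reading and applying such a label. The helper names `parse_mutation` and `apply_mutation` are my own, the example sequence is made up (it is not Humira), and I assume plain 1-based counting along the sequence, which may differ from the antibody numbering scheme used in the paper.

```python
import re


def parse_mutation(label):
    """Split a label like 'E65K' into (wild_type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def apply_mutation(sequence, label):
    """Return a copy of `sequence` with the point mutation applied."""
    wild_type, position, mutant = parse_mutation(label)
    index = position - 1  # assumption: labels count from 1 along the sequence
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


toy_sequence = "MKTEAL"  # made-up example, not a real antibody sequence
print(apply_mutation(toy_sequence, "E4K"))  # MKTKAL
```

The wild-type check is worth keeping: if the letter in the label doesn't match the sequence, the numbering is probably off.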
I interpret this phenomenon as a reflection of how the models scrutinize the provided sequence data and propose alterations based on their extensive training. This function proves invaluable, considering that no human could possibly commit the sheer volume of sequences (nearly 100 million) to memory. It’s worth noting, however, that the enhancements achieved through these mutations are often modest, typically around a twofold improvement.
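To make the idea of "models proposing alterations" a bit more concrete, here is a toy sketch of how I picture it: each model assigns a probability to every amino acid at every position, a substitution counts as a vote when a model rates it above the wild-type residue, and only substitutions with enough votes are kept. This is my simplified reading, not the authors' code. The function `recommend`, the `min_votes` threshold, and all probability values are invented for illustration; none of it is real ESM output.

```python
from collections import Counter


def recommend(wild_type, model_tables, min_votes=2):
    """Return substitutions that at least `min_votes` models prefer over wild type.

    `model_tables` holds one table per model; each table is a list with one
    dict per sequence position, mapping amino acid -> probability.
    """
    votes = Counter()
    for table in model_tables:
        for i, probs in enumerate(table):
            wt_residue = wild_type[i]
            wt_prob = probs.get(wt_residue, 0.0)
            for residue, prob in probs.items():
                # A model "votes" for a substitution it rates above wild type.
                if residue != wt_residue and prob > wt_prob:
                    votes[f"{wt_residue}{i + 1}{residue}"] += 1
    return sorted(label for label, count in votes.items() if count >= min_votes)


# Invented numbers for a two-residue toy sequence and three pretend models.
wild_type = "AE"
model_a = [{"A": 0.9, "G": 0.1}, {"E": 0.3, "K": 0.6, "Q": 0.1}]
model_b = [{"A": 0.4, "G": 0.5}, {"E": 0.2, "K": 0.7}]
model_c = [{"A": 0.8, "G": 0.2}, {"E": 0.5, "K": 0.4, "Q": 0.1}]

print(recommend(wild_type, [model_a, model_b, model_c]))  # ['E2K']
```

In this toy run only E2K survives, because two of the three pretend models prefer K over E at position 2, while A1G gets a single vote. Requiring agreement between several models is one simple way to filter out the quirks of any single model.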
Despite this caveat, AI driven by unsupervised models exhibits remarkable predictive capabilities for identifying beneficial mutations across various parameters. Moreover, the profiling of 122 mutants, nearly all of which were successfully expressed, underscores the robustness of the approach.