At Align, we’re building a shared roadmap to unlock predictive models for protein engineering—starting with a grand challenge: predict the properties of a protein directly from its sequence. We aim to transform protein engineering from trial-and-error into a data-driven discipline. Our roadmap is co-developed with the scientific community and grounded in open data, reproducible methods, and scalable experimentation.
We partnered with the community to define where new data could most accelerate protein engineering. Through workshops and collaborative planning, two initial high-impact goals emerged: predicting protein function and predicting expression from sequence. These now anchor our protein project roadmap and shape the datasets we’re building with partners.
This workshop brought together experts in protein engineering, high-throughput experimentation, and ML who were interested in modeling protein expression across microbes. Through open discussion and roadmap co-design, we identified key challenges in linking DNA sequence to expression levels, aligned on target organisms and measurement approaches, and scoped an initial dataset. These conversations shaped the proposals and collaborations now in motion.
This workshop brought together scientists and ML researchers with expertise in protein function, from experimental design to prediction. The goal was collaborative roadmap-building: through open discussion and shared ideation, we surfaced key challenges in function prediction, explored target selection strategies, and aligned on measurement techniques. These conversations seeded the working groups and dataset proposals we’re now advancing.