Machine Learning Engineer
About This Role
Profluent is an AI-first protein design company. Founded in 2022, we develop deep generative models to design and validate novel, functional proteins to revolutionize biomedicine. Based in Emeryville, CA, we are backed by leading investors including Altimeter Capital, Bezos Expeditions, Spark Capital, Insight Partners, Air Street Capital, AIX Ventures, and Convergent Ventures, and have raised over $150M to date.
We're looking for an experienced Machine Learning Engineer to build and improve the models and ML systems that drive our protein design efforts. In this role, you'll deploy and optimize large-scale generative models for protein design, and develop the surrounding infrastructure and tooling that enable our ML and protein design scientists to work faster and more confidently. As an early member of a small, fast-moving engineering team, you'll have significant ownership over our ML stack and the opportunity to shape how our platform evolves.
Responsibilities
• Build robust, reproducible, and user-friendly pipelines for automated model fine-tuning, alignment, and evaluation
• Design and implement modular, easy-to-maintain, multi-model pipelines for protein design
• Develop highly scalable ETL pipelines to process petabyte-scale protein data for model pretraining
• Optimize model training and inference code to maximize throughput and resource utilization when deployed at scale
• Develop software and infrastructure that enable the ML team to work quickly and frictionlessly in distributed and multi-cloud environments
• Partner with ML and protein design scientists to prototype research ideas and bring them into production
Who You Are
• You're comfortable taking ownership and working independently in a fast-moving environment
• You're an execution-oriented engineer who maintains high standards and focuses on the highest-impact work
• You're comfortable owning the full stack of your work, from training code to the infrastructure it runs on
• You care deeply about model quality, efficiency, and reliability
• You're willing to step beyond your core responsibilities when the team needs it
Representative Projects
• Building hyperparameter search frameworks for SFT and alignment workflows
• Increasing protein language model throughput during long-context generation
• Updating existing model architectures to work and run efficiently on new GPU hardware
• Implementing a protein design pipeline that integrates prompt retrieval, sequence generation, attribute prediction, and structure prediction
• Establishing an ETL pipeline for sampling and tokenizing training datasets from an internal database of billions of sequences
• Developing a benchmarking and evaluation system for newly trained sequence generation models
• Contributing to the development of an internal service that provides transparent multi-node job submission for ML scientists
Qualifications
• BS or MS in Computer Science, Machine Learning, or a related field
• 3+ years of hands-on experience building and training ML models in PyTorch
• Strong Python and software engineering fundamentals, including testing, code quality, and version control
• Experience profiling, benchmarking, and optimizing ML model training and inference
• Experience implementing or optimizing transformer-based architectures
• Familiarity with cloud infrastructure and containerization (GCP, AWS, Azure, Kubernetes, Docker)
• Strong fundamentals in ML, statistics, and/or linear algebra
Preferred (but not required)
• Familiarity with protein language models or computational biology
• Experience with GPU-level optimization (CUDA, Triton)
• Experience with distributed training (DDP, FSDP, multi-node GPU clusters)
• Experience with databases and data processing pipelines
• Experience orchestrating multi-step ML workflows
• Experience building backend systems that serve ML models in production
• Contributions to open source ML projects or published research
What We Offer
• High-growth opportunity with meaningful impact on the future of protein design
• Competitive compensation package with equity participation
• 401(k) with a strong employer match
• Comprehensive benefits including health/dental/vision insurance
• Generous PTO policy and commitment to work-life balance
• Professional development opportunities in a cutting-edge field at the intersection of AI and biology
Profluent Bio, Inc. is an equal opportunity employer promoting diversity and inclusion in the workplace. We do not discriminate on the basis of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical conditions, veteran status, sexual orientation, gender (including gender identity and gender expression), sex (which includes pregnancy, childbirth, and breastfeeding), genetic information, taking or requesting statutorily protected leave, or any other basis protected by law.
Employment Eligibility Verification
Legal authorization to work in the United States is required. In compliance with federal law, all persons hired must verify their identity and work eligibility and complete the required employment verification form upon hire.
Hiring Salary Range
$180,000 to $250,000 USD