Generative AI De Novo Enzyme Engineering

Creative Enzymes combines machine learning with directed evolution to accelerate enzyme engineering while reducing screening burden. Our platform replaces random mutagenesis with intelligent library design, predicting high-impact mutations and prioritizing variants before they reach the bench. The result is faster convergence on optimized biocatalysts with fewer experimental cycles and lower resource consumption.

Why De Novo Design?

Natural enzymes have been shaped by billions of years of evolution for biological fitness, not industrial utility. This evolutionary legacy imposes fundamental constraints on what natural biocatalysts can achieve:

Natural Enzyme Limitations

The vast majority of natural enzymes remain undiscovered, and those that are characterized often lack the properties required for process chemistry: robustness under non-physiological conditions, tolerance of high substrate concentrations, or compatibility with manufacturing timelines.

Narrow Substrate Scope

Evolution optimizes enzymes for their native metabolic roles, not for the non-natural substrates that synthetic chemistry demands. Expanding scope through traditional engineering is slow and frequently encounters active-site geometry limits.

Insufficient Catalytic Properties

Turnover rates, selectivities, and stability profiles that suffice in vivo often fall short of the economic thresholds required for industrial adoption.

De novo design transcends these limitations by creating enzymes from first principles, unconstrained by evolutionary history. The design objective defines the enzyme, not the other way around.

Generative AI Design Platform

Our platform integrates five generative modules that transform enzyme design from modification to creation:

Sequence Generation

Diffusion and transformer-based models generate amino acid sequences conditioned on specified functional targets. Sequences are sampled from the learned distribution of proteins predicted to achieve the desired catalytic activity, not from existing natural homologs.

Structural Generation

Predicted structures are generated for each candidate sequence and evaluated for overall fold quality, active-site accessibility, and packing integrity. Designs with implausible architectures or steric clashes are filtered before experimental commitment.
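As a rough illustration of the structural filter described above, the sketch below counts steric clashes in a predicted model and rejects designs that clash or fall below a confidence threshold. The function names, the 2.0 Å cutoff, the 0.7 confidence threshold, and the input layout are all assumptions made for this example, not details of the platform.

```python
from itertools import combinations
from math import dist

# Assumed threshold: atoms closer than this (in angstroms) count as a clash.
CLASH_CUTOFF = 2.0


def count_clashes(atoms, cutoff=CLASH_CUTOFF, min_separation=2):
    """Count atom pairs that sit closer together than `cutoff`.

    `atoms` is a list of (residue_index, (x, y, z)) tuples. Pairs whose
    residues are fewer than `min_separation` positions apart in sequence
    are skipped, because bonded neighbours are legitimately close.
    """
    clashes = 0
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(atoms, 2):
        if abs(res_a - res_b) < min_separation:
            continue
        if dist(xyz_a, xyz_b) < cutoff:
            clashes += 1
    return clashes


def passes_structure_filter(atoms, mean_confidence,
                            min_confidence=0.7, max_clashes=0):
    """Keep a design only if it is confident enough and has few clashes."""
    if mean_confidence < min_confidence:
        return False
    return count_clashes(atoms) <= max_clashes
```

A real pipeline would use per-element radii and a spatial index instead of the all-pairs loop; the simple version is enough to show the filtering logic.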

Functional Motif Design

Catalytic residues, binding pockets, and cofactor coordination spheres are specified as design constraints and embedded into generated scaffolds. The model ensures that functional motifs are positioned within geometrically competent environments.
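One way to picture a "geometrically competent" motif is as a set of target distances between key atoms that a generated scaffold must reproduce within a tolerance. The sketch below checks such constraints; the atom labels, target distances, and the 0.5 Å tolerance are hypothetical values chosen for illustration, and the platform's actual constraint format is not described in this document.

```python
from math import dist


def motif_violations(coords, constraints, tolerance=0.5):
    """Return the constraints that a design fails to satisfy.

    coords:      dict mapping an atom label to its (x, y, z) position
    constraints: list of (label_a, label_b, target_distance) tuples
    tolerance:   allowed absolute deviation from the target, in angstroms
    """
    violations = []
    for label_a, label_b, target in constraints:
        actual = dist(coords[label_a], coords[label_b])
        if abs(actual - target) > tolerance:
            violations.append((label_a, label_b, target, actual))
    return violations


def motif_is_competent(coords, constraints, tolerance=0.5):
    """True when every distance constraint is met within tolerance."""
    return not motif_violations(coords, constraints, tolerance)


# Hypothetical three-atom motif; labels and distances are illustrative only.
example_coords = {
    "nucleophile": (0.0, 0.0, 0.0),
    "base": (3.0, 0.0, 0.0),
    "acid": (3.0, 4.0, 0.0),
}
example_constraints = [
    ("nucleophile", "base", 3.0),
    ("base", "acid", 4.0),
]
```

Distance checks alone do not capture angles or side-chain orientation, so this should be read as the simplest possible instance of a geometric constraint test.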

Scaffold Optimization

Generated scaffolds are refined for expressibility, solubility, and stability without compromising the catalytic motif. Surface charge distribution, loop flexibility, and core packing are optimized through iterative sequence adjustment.

Catalytic Residue Prediction

For target reactions with known mechanisms, the model predicts which residues should occupy catalytic positions based on transition-state analogs and reaction coordinate modeling. For novel reactions, exploratory designs sample diverse catalytic chemistries.

AI Design Workflow

1. Target Function: The design objective is specified as a catalyzed reaction, substrate class, desired selectivity, and operating environment. This functional specification serves as the generative constraint, directing model output toward biochemically feasible solutions.

2. Generative Modeling: AI models sample sequence space to generate candidate enzymes predicted to satisfy the target function. Hundreds to thousands of candidates are produced, representing diverse architectural solutions to the same catalytic problem.

3. Sequence Filtering: Generated sequences are filtered for predicted expressibility, folding propensity, and absence of aggregation-prone regions. Sequences with low predicted solubility or incompatible codon usage are removed.

4. Structure Evaluation: Predicted structures are assessed for active-site geometry, substrate pocket volume, catalytic residue positioning, and overall fold stability. Designs with geometrically implausible active sites are discarded.

5. Candidate Selection: Top-ranked candidates are selected for experimental synthesis based on composite scores combining predicted activity, stability, expressibility, and structural confidence. Selection balances exploration of diverse architectures with exploitation of highest-confidence designs.
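The selection step above can be sketched as a weighted composite score combined with a greedy pick that penalizes similarity to designs already chosen. Everything here is an assumption for illustration: the `Candidate` fields, the weights, the position-wise identity measure, and the `diversity_weight` parameter are hypothetical and do not describe the platform's actual scoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    sequence: str
    activity: float        # all scores assumed normalised to [0, 1]
    stability: float
    expressibility: float
    confidence: float


# Assumed weights; they sum to 1 so the composite also lies in [0, 1].
WEIGHTS = {"activity": 0.4, "stability": 0.2,
           "expressibility": 0.2, "confidence": 0.2}


def composite_score(candidate, weights=WEIGHTS):
    """Weighted sum of the four predicted properties."""
    return sum(w * getattr(candidate, field) for field, w in weights.items())


def identity(seq_a, seq_b):
    """Crude position-wise identity, normalised by the longer sequence."""
    longest = max(len(seq_a), len(seq_b))
    if longest == 0:
        return 0.0
    return sum(a == b for a, b in zip(seq_a, seq_b)) / longest


def select(candidates, k, diversity_weight=0.5):
    """Greedily pick k candidates, trading score against redundancy."""
    selected = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def adjusted(c):
            # Penalise similarity to the closest already-selected design.
            redundancy = max(
                (identity(c.sequence, s.sequence) for s in selected),
                default=0.0,
            )
            return composite_score(c) - diversity_weight * redundancy
        best = max(pool, key=adjusted)
        selected.append(best)
        pool.remove(best)
    return selected
```

Setting `diversity_weight` to zero reduces this to pure exploitation (top scores only), while larger values push the selection toward more dissimilar architectures.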

Potential Engineering Goals

Our generative AI platform supports diverse research and development objectives:

Novel Activity

Enzymes for reactions with no known biological catalyst, enabling biocatalytic routes previously inaccessible to enzymatic transformation.

Substrate Specificity

De novo binding pockets tailored to non-natural substrates that natural enzymes cannot accommodate.

Improved Stability

Scaffolds designed from inception for thermal tolerance, organic solvent resistance, or long operational half-lives.

Industrial Adaptation

Enzymes optimized for manufacturing constraints: high concentration operation, cofactor independence, and straightforward downstream processing.

Experimental Validation Services

AI-generated enzyme candidates can be experimentally evaluated through our recombinant protein production, structural characterization, enzyme activity analysis, and stability testing services to support de novo enzyme engineering projects.

Example Design Scenario

Generative Artificial Intelligence For De Novo Protein Design

Figure 1. a) Overview of computational protein design workflows. b) The SARS-CoV-2 spike protein folds into a trimeric structure and switches between closed and open conformations for ACE2 binding. (Winnifrith et al., 2024)

The review by Winnifrith et al. (2024) highlights how artificial intelligence is transforming de novo protein design by enabling the generation of novel proteins with desired structures and functions beyond those found in nature. Modern generative models, including language models, diffusion models, and graph neural networks, can explore plausible regions of protein space and design proteins with increasingly high experimental success rates, approaching 20%. Sequence-based approaches mainly rely on large language models, while structure-based methods emphasize diffusion models. These AI systems can generate proteins with specified properties using programmable constraints or natural language instructions. Despite major progress, challenges remain in predicting experimental success, designing dynamic proteins, and incorporating regulatory mechanisms such as post-translational modifications.

FAQs

  • Q: Can AI generate entirely new enzymes?

    A: Yes. Generative models create sequences that do not exist in natural databases and are not derived from known homologs. These are genuinely novel proteins, designed from learned principles of sequence-structure-function relationships rather than evolutionary descent.
  • Q: Does de novo design require existing templates?

    A: No. The defining feature of de novo design is independence from natural templates. Design constraints are specified functionally—reaction type, substrate characteristics, operating conditions—without reference to existing enzymes. However, available structural or mechanistic data for related reactions can improve design accuracy when incorporated as additional constraints.
  • Q: How are candidates evaluated?

    A: Candidates undergo a multi-stage filtering pipeline: sequence-level prediction of expressibility and solubility, structure-level assessment of fold quality and active-site geometry, and functional-level estimation of catalytic feasibility. Top-ranked candidates proceed to experimental synthesis and functional screening. Results from experimental validation feed back into model refinement, improving subsequent design rounds.
  • Q: What is the success rate for de novo designs?

    A: Success rates depend on design complexity, functional specificity, and the maturity of relevant generative models. For well-characterized reaction types with clear mechanistic requirements, a meaningful fraction of experimentally tested candidates show detectable activity. For truly novel activities, success rates are lower but improve with each iteration as models accumulate training data.
  • Q: What is the typical timeline?

    A: Computational generation and filtering require 2–3 months. Experimental synthesis, expression, and screening of top candidates add another 2–3 months. Full projects, from functional specification to validated novel enzyme, typically require 5–7 months.
  • Q: Can de novo enzymes be further optimized?

    A: Yes. Validated de novo designs serve as starting points for directed evolution, rational engineering, or stability optimization. Their novel sequence architecture often contains optimization trajectories unavailable from natural scaffolds.

References:

  1. Winnifrith A, Outeiral C, Hie BL. Generative artificial intelligence for de novo protein design. Current Opinion in Structural Biology. 2024;86:102794. doi:10.1016/j.sbi.2024.102794

For research and industrial use only. Not intended for personal medicinal use. Certain food-grade products are suitable for formulation development in food and related applications.
