1. Burgard AP, Pharkya P, Maranas CD. Optknock: a bilevel programming framework for
identifying gene knockout strategies for microbial strain optimization. Biotechnology and
bioengineering. 2003;84(6):647-57.
2. Pharkya P, Burgard AP, Maranas CD. OptStrain: a computational framework for redesign
of microbial production systems. Genome research. 2004;14(11):2367-76.
3. Pharkya P, Maranas CD. An optimization framework for identifying reaction
activation/inhibition or elimination candidates for overproduction in microbial systems.
Metabolic engineering. 2006;8(1):1-13.
4. Patil KR, Rocha I, F¨orster J, Nielsen J. Evolutionary programming as a platform for in
silico metabolic engineering. BMC bioinformatics. 2005;6(1):308.
5. Ranganathan S, Suthers PF, Maranas CD. OptForce: an optimization procedure for
identifying all genetic manipulations leading to targeted overproductions. PLoS Comput
Biol. 2010;6(4):e1000744.
6. Rocha I, Maia P, Evangelista P, Vilac¸a P, Soares S, Pinto JP, et al. OptFlux: an open-source
software platform for in silico metabolic engineering. BMC systems biology.
2010;4(1):1-12.
7. Toya Y, Shimizu H. Flux analysis and metabolomics for systematic metabolic engineering
of microorganisms. Biotechnology advances. 2013;31(6):818-26.
8. Orth JD, Thiele I, Palsson BØ. What is flux balance analysis? Nature biotechnology.
2010;28(3):245-8.
9. Vieira V, Maia P, Rocha M, Rocha I. Comparison of pathway analysis and constraint-based
methods for cell factory design. BMC bioinformatics. 2019;20(1):1-15.
10. Lun DS, Rockwell G, Guido NJ, Baym M, Kelner JA, Berger B, et al. Large-scale
identification of genetic design strategies using local search. molecular systems biology.
2009;5(1):296.
11. Rockwell G, Guido NJ, Church GM. Redirector: designing cell factories by reconstructing
the metabolic objective. PLoS Comput Biol. 2013;9(1):e1002882.
12. Yang L, Cluett WR, Mahadevan R. EMILiO: a fast algorithm for genome-scale strain
design. Metabolic engineering. 2011;13(3):272-81.
13. Egen D, Lun DS. Truncated branch and bound achieves efficient constraint-based genetic
design. Bioinformatics. 2012;28(12):1619-23.
14. Lewis NE, Hixson KK, Conrad TM, Lerman JA, Charusanti P, Polpitiya AD, et al. Omic
data from evolved E. coli are consistent with computed optimal growth from genome-scale
models. Molecular systems biology. 2010;6(1):390.
15. Gu D, Zhang C, Zhou S, Wei L, Hua Q. IdealKnock: a framework for efficiently
identifying knockout strategies leading to targeted overproduction. Computational biology
and chemistry. 2016;61:229-37.
16. Ohno S, Shimizu H, Furusawa C. FastPros: screening of reaction knockout strategies for
metabolic engineering. Bioinformatics. 2014;30(7):981-7.
17. Tamura T. Grid-based computational methods for the design of constraint-based
parsimonious chemical reaction networks to simulate metabolite production: GridProd.
BMC bioinformatics. 2018;19(1):325.
18. von Kamp A, Klamt S. Growth-coupled overproduction is feasible for almost all
metabolites in five major production organisms. Nature communications. 2017;8:15956.
19. Machado D, Herrg˚ard MJ, Rocha I. Stoichiometric representation of gene–protein–reaction
associations leverages constraint-based analysis from reaction to gene-level phenotype
prediction. PLoS computational biology. 2016;12(10):e1005140.
20. Razaghi-Moghadam Z, Nikoloski Z. GeneReg: A constraint-based approach for design of
feasible metabolic engineering strategies at the gene level. Bioinformatics. 2020.
21. Rocha I, Maia P, Rocha M, Ferreira EC. OptGene: a framework for in silico metabolic
engineering. 2008.
22. Monk JM, Lloyd CJ, Brunk E, Mih N, Sastry A, King Z, et al. iML1515, a knowledgebase
that computes Escherichia coli traits. Nature biotechnology. 2017;35(10):904-8.
23. Mo ML, Palsson BØ, Herrg˚ard MJ. Connecting extracellular metabolomic measurements
to intracellular flux states in yeast. BMC systems biology. 2009;3(1):37.
24. Orth JD, Fleming RM, Palsson BO. Reconstruction and use of microbial metabolic
networks: the core Escherichia coli metabolic model as an educational guide. EcoSal plus.
2010.
25. Heirendt L, Arreckx S, Pfau T, Mendoza SN, Richelle A, Heinken A, et al. Creation and
analysis of biochemical constraint-based models using the COBRA Toolbox v. 3.0. Nature
protocols. 2019;14(3):639-702.
26. King ZA, Dr¨ager A, Ebrahim A, Sonnenschein N, Lewis NE, Palsson BO. Escher: a web
application for building, sharing, and embedding data-rich visualizations of biological
pathways. PLoS Comput Biol. 2015;11(8):e1004321.
27. Kanehisa M, Sato Y. KEGG Mapper for inferring cellular functions from protein
sequences. Protein Science. 2020;29(1):28-35.
28. Apaolaza I, Valcarcel LV, Planes FJ. gMCS: fast computation of genetic minimal cut sets in
large networks. Bioinformatics. 2019;35(3):535-7.
29. Mori Y, Noda S, Shirai T, Kondo A. Direct 1, 3-butadiene biosynthesis in Escherichia coli
via a tailored ferulic acid decarboxylase mutant. Nature communications. 2021;12(1):1-12.
30. Shimizu Y, Kuruma Y, Kanamori T, Ueda T. The PURE system for protein production. In:
Cell-Free Protein Synthesis. Springer; 2014. p. 275-84.
31. Yousofshahi M, Orshansky M, Lee K, Hassoun S. Probabilistic strain optimization under
constraint uncertainty. BMC systems biology. 2013;7(1):1-13.
32. Deng X. Complexity issues in bilevel linear programming. In: Multilevel optimization:
Algorithms and applications. Springer; 1998. p. 149-64.
33. Kim J, Reed JL, Maravelias CT. Large-scale bi-level strain design approaches and
mixed-integer programming solution techniques. PLoS One. 2011;6(9):e24162.
34. Tepper N, Shlomi T. Predicting metabolic engineering knockout strategies for chemical
production: accounting for competing pathways. Bioinformatics. 2010;26(4):536-43.
35. R¨ohl A, Bockmayr A. A mixed-integer linear programming approach to the reduction of
genome-scale metabolic networks. BMC bioinformatics. 2017;18(1):1-10.
36. Acevedo-Rocha C, Gronenberg L, Mack M, Commichau F, Genee H. Microbial cell
factories for the sustainable manufacturing of B vitamins. Curr Opin Biotechnol.
2019;56:18-29.
37. Xiao F, Wang H, Shi Z, Huang Q, Huang L, Lian J, et al. Multi-level metabolic engineering
of Pseudomonas mutabilis ATCC31014 for efficient production of biotin. Metab Eng.
2020;61:406-15.
38. Booth I, Ferguson G, Miller S, Li C, Gunasekera B, Kinghorn S. Bacterial production of
methylglyoxal: a survival strategy or death by misadventure. Biochem Soc Trans.
2003;31(Pt 6):1406-8.
39. Niu W, Kramer L, Mueller J, Liu K, Guo J. Metabolic engineering of Escherichia coli for
the de novo stereospecific biosynthesis of 1,2-propanediol through lactic acid. Metab Eng
Commun. 2019;8:e00082.
Table 1. (A) The flux distribution for each gene deletion strategy when GR is maximized under
the condition with GR≥1 and PR≥1. (B) The priority of each gene deletion strategy candidate and
the resulting flux distribution for the pessimistic case of PR at GR maximization.
KO
𝑔1
𝑔2
𝑔3
𝑔4
𝑔5
𝑔3 , 𝑔4
𝑣1
10
KO
𝑔3 , 𝑔4
𝑔3
𝑔4
𝑣2
𝑣3
priority
𝑣4
𝑣1
10
𝑣5 𝑣6
(A)
𝑣2
10
(B)
𝑣3
𝑣7
𝑣4
reactions
cannot satisfy GR≥1
cannot satisfy GR≥1
r2
cannot satisfy PR≥1
r2 , 𝑟 5
𝑣5
𝑣6
10
𝑣7
Table 2. Variables used in the MILP formalization in gDel minRN for the example of Fig.2(A).
Variables
𝑥 1 to 𝑥 7
𝑥 8 to 𝑥 12
𝑥 13
𝑥 14 to 𝑥17
Type
binary
binary
binary
Object
reactions 𝑟 1 to 𝑟 7
genes 𝑔1 to 𝑔5
internal term(s) (𝑔3 ∨ 𝑔4 )
whether repressed or not for 𝑟 2 , 𝑟 3 , 𝑟 4 , 𝑟 5
Table 3. The linear constraints for the GPR rules in Fig.2.
Boolean functions
𝐾𝑂3 = 𝑔1
𝑥13 = 𝑔3 ∨ 𝑔4
−→
−→
𝐾𝑂2 = 𝑔1 ∧ 𝑔2 ∧ 𝑔3
−→
𝐾𝑂4 = 𝑔2 ∧ 𝑔5
−→
𝐾𝑂5 = 𝑥13 ∧ 𝑔5
−→
Linear constraints
−𝑥8 + 𝑥15 = 0
𝑥 10 + 𝑥11 − 2𝑥13 ≤ 0
−𝑥10 − 𝑥11 + 𝑥 13 ≤ 0
−𝑥8 − 𝑥 9 − 𝑥10 + 3𝑥14 ≤ 0
𝑥 8 + 𝑥9 + 𝑥10 − 𝑥 14 ≤ 2
−𝑥9 − 𝑥 12 + 2𝑥 16 ≤ 0
𝑥 9 + 𝑥12 − 𝑥16 ≤ 1
−𝑥12 − 𝑥13 + 2𝑥17 ≤ 0
𝑥 12 + 𝑥 13 − 𝑥 17 ≤ 1
Table 4. The methods for representing Boolean functions by linear constraints.
Boolean functions
𝑦 = 𝑥1 ∧ 𝑥2 ∧ · · · ∧ 𝑥 𝑘
−→
𝑦 = 𝑥1 ∨ 𝑥2 ∨ · · · ∨ 𝑥 𝑘
−→
Linear constraints
−𝑥1 − . . . − 𝑥 𝑘 + 𝑘 𝑦 ≤ 0
𝑥1 + · · · + 𝑥 𝑘 − 𝑦 ≤ 𝑘 − 1
𝑥1 + · · · + 𝑥 𝑘 − 𝑘 𝑦 ≤ 0
−𝑥1 − · · · − 𝑥 𝑘 + 𝑦 ≤ 0
Table 5. The purpose and type of variables used in MILP for the general case of gDel minRN.
Variables
𝑥1 to 𝑥 𝑛𝑟
𝑥 𝑛𝑟+1 to 𝑥 𝑛𝑟+𝑛𝑔
𝑥 𝑛𝑟+𝑛𝑔+1 to 𝑥 𝑛𝑟+𝑛𝑔+𝑛𝑡
𝑥 𝑛𝑟+𝑛𝑔+𝑛𝑡+1 to 𝑥 𝑛𝑟+𝑛𝑔+𝑛𝑡+𝑛𝑘𝑜
Type
real
binary
binary
binary
For
reaction fluxes
genes
internal terms
reaction repressions
Table 6. The constraint-based models that were used in the computational experiments.
Model
#genes
#reactions
#metabolites
#target metabolites
#essential genes
iML1515
1516
2712
1877
1085
196
iMM904
905
1577
1226
773
110
e coli core
137
95
72
48
Table 7. The performance comparison between gDel minRN, GDLS, and optGene. Each gene
deletion strategy was considered as successful when the minimum GR and PR were 0.001 or more
at GR maximization.
Model
#success
#success ratio
Avg. #genes
Max #genes
Min #genes
Range
Time
iML1515 iMM904
441
105
40.6%
13.6%
555.2
291.85
571
314
539
275
35-38%
30-35%
7m35s
49s
(A) gDel minRN
e coli core
47
97.9%
69.47
73
66
48-54%
0.40s
Model
#success
#success ratio
Avg. #genes
Max #genes
Min #genes
Range
Time
iML1515 iMM904
0%
0%
39s
0.83s
(B) GDLS
e coli core
10.4%
68.4
71
66
48-52%
0.812s
Model
#success
#success ratio
Avg. #genes
Max #genes
Min #genes
Range
Avg. time
iML1515 iMM904
30
0%
3.9%
897.4
895
901
- 98-100%
20m3s
20m20s
(C) optGene
e coli core
22
45.8%
130.3
136
127
92-100%
20m6s
Table 8. Case study of gDel minRN performance for three vitamins.
Target
Pantothenate
Biotin
Riboflavin
#used genes
555
540
544
PR
0.7444
0.1313
0.1198
GR
0.2485
0.1493
0.1212
time
4m24s
5m8s
3m34s
(YDOXDWH
WDUJHWPHWDEROLWH
SURGXFWLRQUDWH35
0D[LPL]H
FHOOJURZWKUDWH
*5
6XE
QHWZRUN
2ULJLQDO
QHWZRUN
&RQVWUDLQW
EDVHGPRGHO
(A)
&RQVWUDLQWEDVHGPRGHO
*5 ˢPD[LPL]HQG
1XWULHQW
0LQLPXPQXPEHU
RIUHDFWLRQVPLQ51
YLDJHQHGHOHWLRQV
J'HO
35ˀWKUHVKROG
7KHHIIHFWLYHSDUWWKDWDFKLHYHVJURZWKFRXSOHGSURGXFWLRQ
7KH XQQHFHVVDU\ SDUW WR EH GHOHWHG
ˢPD[LPL]HVWWKHQXPEHURIVXSSUHVVHGUHDFWLRQV
(B)
Figure 1. (A) Problem setting of this study. The minimum PR of the target metabolite is evaluated
when the GR is maximized. (B) The idea of gDel minRN algorithm. The maximum number of
reactions are repressed via gene deletions for growth-coupled production.
Substrate uptake
m1
[0,10]
r1
g1ҍg2ҍg3
[0,10]
Cell growth
[0,10]
m2
r2
r6
r3
m3
[0,10]
[0,5]
g2ҍg5
Target production
[0,10]
r4
g1
GRLB=1
[0,5]
r7
m4
PRLB=1
r5
(g3Ҏg4) ҍ g5
(A)
ID
10
11
12
13
Gene KO
none
g1
g2
g3
g4
g5
g1, g2
..
best
worst
best
worst
best
worst
best
worst
best
worst
both
best
worst
..
𝑣1
10
10
10
10
10
10
..
𝑣2
10
10
10
..
(B)
𝑣3
..
𝑣4
..
𝑣5
..
𝑣6
10
10
10
10
10
..
𝑣7
10
..
Figure 2. (A) A toy example of the constraint-based model. Circles and rectangles represent
metabolites and reactions, respectively. Black and white rectangles are external and internal
reactions. 𝑟 1 , 𝑟 6 , and 𝑟 7 are the substrate uptake, cell growth, and target metabolite production
reactions. [𝛼, 𝛽] represents the lower and upper bounds of the reaction speeds. (B) The optimistic
and pessimistic flux distributions from the viewpoints of PR for each gene deletion strategy
when GR is maximized. Deleting 𝑔3 achieves growth-coupled production since PR≥PRLB and
GR≥GRLB are satisfied even for the pessimistic case of PR.
$HT
QU
QJQW
EHT
QJQW
QU
QNR
%HT
$
QNR
%E
XE
XE
OE
OE
E
(A)
𝐵𝑒𝑞 =
𝐵=
x8
−1
x8 𝑥 9 𝑥 10 𝑥11 𝑥12 𝑥13 𝑥14 𝑥 15 𝑥16 𝑥17
−1 0 0 0 0 0 0 1 0 0
𝑥9
−1
−1
𝑥 10 𝑥 11 𝑥 12 𝑥13 𝑥14
0 −2 0
−1 −1 0
−1 0
0 −1
0 −1 0
0 −1 −1 0
𝑥15 𝑥16 𝑥17
0 0
0 0
0 ®
0 0
0 ®
0 0
0 ®
0 2
0 ®
0 −1 0 ®
0 0
2 ®
0 0 −1 ¬
(𝐵𝑏) 𝑇 = (0, 0, 0, 2, 0, 1, 0, 1)
(B)
Figure 3. (A) How to construct the components 𝐴𝑒𝑞, 𝑏𝑒𝑞, 𝐴, and 𝑏 for the MILP formalization
that gDel minRN searches a gene deletion strategy candidate. (B) 𝐵𝑒𝑞, 𝐵, 𝐵𝑏 for the example of
the network of Fig. 2(A).
Figure 4. The constructed pathway for biotin production. (A) Overview of the biotin synthesis
pathway from iML1515 classified into two pathways as upper and lower pathway. (B) Precise
flow of upper pathway, from glucose to malonyl-ACP. The number indicated with each arrows
shows the flux value of each reaction. The abbreviations are as follows; NADPH, Nicotinamide
adenine dinucleotide phosphate reduced form; UQ8, ubiquinone-8; ACP, acyl carrier protein; PEP,
phosphoenolpyruvate; OAA, oxaloacetate; PRPP, Phosphoribosyl diphosphate.
...