Optimization Modulo Non-Linear Arithmetic via Incremental Linearization (experimental evaluation)

Table of Contents

This webpage contains benchmarks and tools necessary for reproducing the experimental evaluation results for the paper Optimization Modulo Non-Linear Arithmetic via Incremental Linearization (experimental evaluation) that was submitted to FroCos 2021.

1 Tools

In the experimental evaluation, we compared the following two tools:

We used the following settings for the solvers:

  • OptiMathSAT binary search

    -optimization=true -model_generation=true --opt.verbose=true -stats -opt.strategy=bin
    
  • OptiMathSAT linear search

    -optimization=true -model_generation=true --opt.verbose=true -stats -opt.strategy=lin
    
  • Z3

    -st smt.arith.solver=2
    

2 Benchmarks

We have used two sets of benchmarks, which are available from the following archive: benchmarks.tar.bz2:

  • omtplan, which are OMT(NRA) benchmarks that were generated by a tool OMTPlan,
  • QF_NIA, which are OMT(NIA) benchmarks that were generated from SMT(NIA) benchmarks from the SMT-LIB repository.

Because vast majority of QF_NIA benchmarks comes from the benchmark family VeryMax, we did not used all the benchmarks in that family, but randomly picked 10% of its benchmarks. The file names of the used benchmarks are documented in the file benchmarks/nia_benchmarks_subset.txt within the archive benchmarks.tar.bz2.

3 Results

Because our experiments were conducted on a cluster with a specific infrastructure, there is no ready script that could be used by everybody to run the experiments. However, data from our experiments can be found in the following archive: results.tar.bz2

The archive contains the following CSV files:

  • results/omtplan_z3_pddl.csv
  • results/omtplan_oms_lin.csv
  • results/omtplan_oms_bin.csv
  • results/nia_z3_pddl.csv
  • results/nia_oms_lin.csv
  • results/nia_oms_bin.csv

4 R environment

To analyse the results, load necessary R libraries.

library(ggplot2)
library(reshape2)
library(tidyr)
library(readr)
library(stringr)
library(scales)
library(dplyr)
library(ggthemes)
options(scipen=999)
theme_set(theme_light(base_size = 20))

Load the csv files with the results.

omtplan_z3 <- read_csv(file="results/omtplan_z3.csv")
omtplan_oms_lin <- read_csv(file="results/omtplan_oms_lin.csv")
omtplan_oms_bin <- read_csv(file="results/omtplan_oms_bin.csv")

nia_z3 <- read_csv(file="results/nia_z3.csv")
nia_oms_lin <- read_csv(file="results/nia_oms_lin.csv")
nia_oms_bin <- read_csv(file="results/nia_oms_bin.csv")

Union all data for all solvers into a single data sets:

data_omtplan <- bind_rows(omtplan_z3, omtplan_oms_lin, omtplan_oms_bin) %>% arrange(solver)
data_nia <- bind_rows(nia_z3, nia_oms_lin, nia_oms_bin) %>% arrange(solver)

5 OMTPlan

5.1 Overall

data_omtplan %>%
  group_by(solver) %>%
  count(status) %>%
  pivot_wider(names_from = status, values_from = n, values_fill=0)
solver partial sat timeout unsat error
omsbin 226 13 367 146 0
omslin 225 14 367 146 0
z3 142 65 379 163 3

5.2 By family

5.2.1 Sat

data_omtplan %>%
  group_by(solver, family) %>%
  summarize(sat = sum(status == "sat")) %>%
  pivot_wider(names_from = solver, values_from = sat)
family omsbin omslin z3
nl/carnl 0 0 0
nl/convoysnl 0 0 0
nl/hvac 0 0 0
nl/nlcounters 2 3 4
nl/nlcounterssimple 11 11 32
nl/secclearancesdac 0 0 29

5.2.2 Partial

data_omtplan %>%
  group_by(solver, family) %>%
  summarize(partial = sum(status == "partial")) %>%
  pivot_wider(names_from = solver, values_from = partial)
family omsbin omslin z3
nl/carnl 7 7 3
nl/convoysnl 0 0 0
nl/hvac 2 2 0
nl/nlcounters 14 13 15
nl/nlcounterssimple 16 16 11
nl/secclearancesdac 187 187 113

5.2.3 Sat or partial

data_omtplan %>%
  group_by(solver, family) %>%
  summarize(sat_or_partial = sum(status == "sat" | status == "partial")) %>%
  pivot_wider(names_from = solver, values_from = sat_or_partial)
family omsbin omslin z3
nl/carnl 7 7 3
nl/convoysnl 0 0 0
nl/hvac 2 2 0
nl/nlcounters 16 16 19
nl/nlcounterssimple 27 27 43
nl/secclearancesdac 187 187 142

5.2.4 Unsat

data_omtplan %>%
  group_by(solver, family) %>%
  summarize(unsat = sum(status == "unsat")) %>%
  pivot_wider(names_from = solver, values_from = unsat)
family omsbin omslin z3
nl/carnl 0 0 0
nl/convoysnl 24 24 24
nl/hvac 122 122 124
nl/nlcounters 0 0 5
nl/nlcounterssimple 0 0 10
nl/secclearancesdac 0 0 0

5.3 Benchmark quality

Total benchmarks:

data_omtplan %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()
x
752

Total benchmarks by family:

data_omtplan %>%
  select(family, benchmark) %>%
  distinct() %>%
  count(family)
family n
nl/carnl 64
nl/convoysnl 24
nl/hvac 128
nl/nlcounters 88
nl/nlcounterssimple 160
nl/secclearancesdac 288

Unsat by at least one solver:

data_omtplan %>%
  filter(status == "unsat") %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()
x
163

Sat or partial by at least one solver

data_omtplan %>%
  filter(status == "sat" | status == "partial") %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()
x
262

Sat or partial by at least one solver by family

data_omtplan %>%
  filter(status == "sat" | status == "partial") %>%
  select(family, benchmark) %>%
  distinct() %>%
  count(family)
family n
nl/carnl 8
nl/hvac 2
nl/nlcounters 21
nl/nlcounterssimple 43
nl/secclearancesdac 188

5.4 Added linearization lemmas

Investigate which kinds of linearization lemmas were used by OMSLIN during the search:

Added lemmas by type for the sat results:

omtplan_oms_lin %>%
  filter(solver == "oms_lin" & status == "sat") %>%
  pivot_longer(
    cols = ends_with("lemmas"),
    names_to = c("type"),
    names_pattern = "(.*)_lemmas",
    values_to = "lemmas") %>%
filter(lemmas != 0) %>%
group_by(family, benchmark) %>%
summarize(lemma_types = toString(type)) %>%
ungroup() %>%
count(lemma_types)
lemmatypes n
zero 4
zero, neutral, proportionality 3
zero, neutral, proportionality, bound, tangent 2
zero, neutral, proportionality, bound, tangent, monotonicity 2
zero, neutral, proportionality, tangent 3

Added lemmas by type for the partial results:

omtplan_oms_lin %>%
  filter(solver == "oms_lin" & status == "partial") %>%
  pivot_longer(
    cols = ends_with("lemmas"),
    names_to = c("type"),
    names_pattern = "(.*)_lemmas",
    values_to = "lemmas") %>%
filter(lemmas != 0) %>%
group_by(family, benchmark) %>%
summarize(lemma_types = toString(type)) %>%
ungroup() %>%
count(lemma_types)
lemmatypes n
zero 2
zero, neutral 2
zero, neutral, proportionality 5
zero, neutral, proportionality, bound, tangent 37
zero, neutral, proportionality, bound, tangent, monotonicity 173
zero, neutral, proportionality, tangent 3
zero, proportionality 2
zero, proportionality, tangent 1

5.5 Comparison of individual results

Compute the table that has one row per benchmark that contains results for each solver.

merged <- data_omtplan %>%
  select(solver, family, benchmark, status, time, lower, upper) %>%
  group_by(family, benchmark) %>%
  pivot_wider(names_from = solver, values_from = c(status, time, lower, upper)) %>%
  ungroup()

write.csv(merged, file = "table.csv")

5.5.1 Cross comparison

Cross comparison of results returned by oms_bin (row) and z3 (column):

merged %>%
  count(status_z3, status_oms_bin) %>%
  pivot_wider(names_from = status_z3, values_from = n)
statusomsbin error partial sat timeout unsat
partial 3 127 45 51 nil
timeout nil 15 7 328 17
sat nil nil 13 nil nil
unsat nil nil nil nil 146

Cross comparison of results returned by oms_lin (row) and z3 (column):

merged %>%
  count(status_z3, status_oms_lin) %>%
  pivot_wider(names_from = status_z3, values_from = n)
statusomslin error partial sat timeout unsat
partial 3 127 44 51 nil
timeout nil 15 7 328 17
sat nil nil 14 nil nil
unsat nil nil nil nil 146

5.6 VBS

Unsat by at least one solver:

data_omtplan %>%
  filter(status == "unsat") %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()
x
163

Unsat by at least one OMS:

data_omtplan %>%
  filter(status == "unsat" & (solver == "oms_bin" | solver == "oms_lin")) %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()
x
146

Timeout by all solvers:

merged %>%
  filter(status_z3 == "timeout" & status_oms_bin == "timeout" & status_oms_lin == "timeout") %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()
x
327

Timeout by all OMS:

merged %>%
  filter(status_oms_bin == "timeout" & status_oms_lin == "timeout") %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()
x
366

5.7 OMS iterations

What is the number of iterations that oms_bin needed to find the optimum?

data_omtplan %>%
  filter(solver == "oms_bin" & status == "sat") %>%
  count(iterations)
iterations n
3 11
4 1
7 1

What is the number of iterations that oms_bin went through for the partial results?

data_omtplan %>%
  filter(solver == "oms_bin" & status == "partial") %>%
  count(iterations)
iterations n
1 128
2 18
3 16
4 13
5 20
6 11
7 11
8 3
9 2
11 2
13 1
16 1

What is the number of iterations that oms_lin needed to find the optimum?

data_omtplan %>%
  filter(solver == "oms_lin" & status == "sat") %>%
  count(iterations)
iterations n
2 10
3 1
4 1
5 2

What is the number of iterations that oms_lin went through for the partial results?

data_omtplan %>%
  filter(solver == "oms_lin" & status == "partial") %>%
  count(iterations)
iterations n
1 50
2 20
3 24
4 16
5 13
6 5
7 6
8 3
9 7
10 7
11 4
12 6
13 6
14 2
15 2
16 4
17 4
18 1
19 1
20 4
21 3
22 3
23 1
24 2
25 3
27 1
28 3
29 2
30 3
32 2
33 3
36 1
40 1
41 1
45 1
46 1
48 2
49 1
51 1
52 1
56 1
57 1
58 1
63 1

5.8 Plots

5.8.1 Upper bounds

data_omtplan %>%
    rowwise() %>%
    select(solver, family, benchmark, upper) %>%
    filter(upper != "None" & !grepl("oo", upper) & !grepl("inf", upper) & !grepl("epsilon", upper)) %>%
    mutate(upper = eval(parse(text=upper))) %>%
    group_by(family, benchmark) %>%
    pivot_wider(names_from = solver, values_from = c(upper)) %>%
    ggplot(mapping = aes(x = z3, y = oms_lin)) +
    geom_point() +
    scale_x_continuous("Z3", trans=scales::pseudo_log_trans(base = 10), lim=c(0, 10000), breaks=c(0, 0, 1, 10, 100, 1000, 10000), minor_breaks=c()) +
    scale_y_continuous("OptiMathSAT(LIN)", trans=scales::pseudo_log_trans(base = 10), lim=c(0, 10000), breaks=c(0, 0, 1, 10, 100, 1000, 10000), minor_breaks=c()) +
    geom_abline(slope = 1, color="lightgray")

upper_bounds_z3_oms_lin.png

data_omtplan %>%
    rowwise() %>%
    select(solver, family, benchmark, upper) %>%
    filter(upper != "None" & !grepl("oo", upper) & !grepl("inf", upper) & !grepl("epsilon", upper)) %>%
    mutate(upper = eval(parse(text=upper))) %>%
    group_by(family, benchmark) %>%
    pivot_wider(names_from = solver, values_from = c(upper)) %>%
    ggplot(mapping = aes(x = z3, y = oms_bin)) +
    geom_point() +
    scale_x_continuous("Z3", trans=scales::pseudo_log_trans(base = 10), lim=c(0, 10000), breaks=c(0, 0, 1, 10, 100, 1000, 10000), minor_breaks=c()) +
    scale_y_continuous("OptiMathSAT(BIN)", trans=scales::pseudo_log_trans(base = 10), lim=c(0, 10000), breaks=c(0, 0, 1, 10, 100, 1000, 10000), minor_breaks=c()) +
    geom_abline(slope = 1, color="lightgray")

upper_bounds_z3_oms_bin.png

6 NIA

6.1 Overall

data_nia %>%
  group_by(solver) %>%
  count(status) %>%
  pivot_wider(names_from = status, values_from = n, values_fill=0)
solver error partial sat timeout unknown
omsbin 92 1047 3258 1341 0
omslin 101 1019 3276 1348 0
z3 0 1105 1975 2589 75

6.2 By family

6.2.1 Sat

data_nia %>%
  group_by(solver, family) %>%
  summarize(sat = sum(status == "sat")) %>%
  pivot_wider(names_from = solver, values_from = sat)
family omsbin omslin z3
QFNIA/20170427-VeryMax 1031 1051 273
QFNIA/AProVE 1729 1722 1284
QFNIA/calypto 221 226 237
QFNIA/LassoRanker 13 14 6
QFNIA/leipzig 250 248 129
QFNIA/mcm 1 1 33
QFNIA/UltimateLassoRanker 13 14 13

6.2.2 Partial

data_nia %>%
  group_by(solver, family) %>%
  summarize(sat = sum(status == "partial")) %>%
  pivot_wider(names_from = solver, values_from = sat)
family omsbin omslin z3
QFNIA/20170427-VeryMax 981 948 935
QFNIA/AProVE 49 53 130
QFNIA/calypto 2 2 0
QFNIA/LassoRanker 2 1 4
QFNIA/leipzig 7 9 26
QFNIA/mcm 0 0 7
QFNIA/UltimateLassoRanker 6 6 3

6.2.3 Sat or partial

data_nia %>%
  group_by(solver, family) %>%
  summarize(sat = sum(status == "sat" | status == "partial")) %>%
  pivot_wider(names_from = solver, values_from = sat)
family omsbin omslin z3
QFNIA/20170427-VeryMax 2012 1999 1208
QFNIA/AProVE 1778 1775 1414
QFNIA/calypto 223 228 237
QFNIA/LassoRanker 15 15 10
QFNIA/leipzig 257 257 155
QFNIA/mcm 1 1 40
QFNIA/UltimateLassoRanker 19 20 16

6.3 Benchmark statisticts

Total benchmarks:

data_nia %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()
x
5744

6.4 VBS

Compute the data for the virtual best solver:

merged_nia <- data_nia %>%
  select(solver, family, benchmark, status, time, lower, upper) %>%
  group_by(family, benchmark) %>%
  pivot_wider(names_from = solver, values_from = c(status, time, lower, upper)) %>%
  ungroup()

Timeout/error by all solvers:

merged_nia %>%
  filter((status_z3 == "timeout" | status_z3 == "error" | status_z3 == "unknown") &
         (status_oms_lin == "timeout" | status_oms_lin == "error") &
         (status_oms_bin == "timeout" | status_oms_bin == "error")) %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()
x
1130

Partial by some solver, not solved by any:

merged_nia %>%
  filter((status_z3 == "partial" | status_oms_lin == "partial" | status_oms_bin == "partial") &
         (status_z3 != "sat" & status_oms_lin != "sat" & status_oms_bin != "sat")) %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()
x
1122

Timeout/error by all OMS:

merged_nia %>%
  filter((status_oms_lin == "timeout" | status_oms_lin == "error") &
         (status_oms_bin == "timeout" | status_oms_bin == "error")) %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()
x
1415

Partial by some OMS, not solved by any:

merged_nia %>%
  filter((status_oms_lin == "partial" | status_oms_bin == "partial") &
         (status_oms_lin != "sat" & status_oms_bin != "sat")) %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()
x
972

6.5 Added linearization lemmas

Investigate which kinds of linearization lemmas were used by OMSLIN during the search:

nia_oms_lin %>%
  filter(solver == "oms_lin" & status == "sat") %>%
  pivot_longer(
    cols = ends_with("lemmas"),
    names_to = c("type"),
    names_pattern = "(.*)_lemmas",
    values_to = "lemmas") %>%
filter(lemmas != 0) %>%
group_by(family, benchmark) %>%
summarize(lemma_types = toString(type)) %>%
ungroup() %>%
count(lemma_types)
lemmatypes n
neutral 1
proportionality, tangent 1
zero 893
zero, neutral 47
zero, neutral, proportionality 999
zero, neutral, proportionality, bound 273
zero, neutral, proportionality, bound, monotonicity 291
zero, neutral, proportionality, bound, tangent 142
zero, neutral, proportionality, bound, tangent, monotonicity 510
zero, neutral, proportionality, monotonicity 4
zero, neutral, proportionality, tangent 9
zero, neutral, proportionality, tangent, monotonicity 10
zero, neutral, tangent 1
zero, proportionality 45
zero, proportionality, bound, tangent 1

6.6 OMS iterations

What is the number of iterations that oms_bin needed to find the optimum?

data_nia %>%
  filter(solver == "oms_bin" & status == "sat") %>%
  count(iterations)
iterations n
1 400
2 186
3 2169
4 326
5 77
6 25
7 11
8 4
9 1
10 2
12 1
14 1
17 1
21 2
23 3
24 1
27 1
29 1
30 2
32 1
40 1
41 2
48 1
60 1
62 1
69 1
72 1
73 1
74 1
75 1
77 1
78 1
82 1
88 1
89 1
91 1
95 1
112 1
113 1
123 1
135 1
142 1
151 1
167 1
185 1
227 1
235 1
237 1
260 1
262 1
266 1
273 1
280 1
311 1
314 1
320 1
324 1
349 1
426 1
507 1
600 1
614 1

What is the number of iterations that oms_lin needed to find the optimum?

data_nia %>%
  filter(solver == "oms_lin" & status == "sat") %>%
  count(iterations)
iterations n
1 400
2 1806
3 679
4 238
5 36
6 20
7 5
8 3
9 1
10 4
11 3
12 2
13 3
15 3
16 3
17 1
18 2
19 1
23 1
24 1
26 1
27 1
34 2
37 1
41 1
43 1
44 2
49 2
55 1
58 2
65 2
76 1
77 1
80 2
82 1
89 1
95 1
97 1
102 1
103 1
126 1
129 1
135 1
137 1
143 1
148 1
150 1
152 1
163 1
165 1
172 1
186 1
189 1
192 1
199 1
213 1
227 1
228 1
255 1
272 2
305 1
306 1
315 1
320 1
330 1
349 1
351 1
366 1
395 1
448 1
486 1
533 1
644 1
749 1
862 1
877 1
1081 1

6.7 Upper bounds

Scatter plots for the upper bounds computed by the solvers (even for partial results):

data_nia %>%
  rowwise() %>%
  select(solver, family, benchmark, upper) %>%
  filter(upper != "None" & !grepl("oo", upper) & !grepl("inf", upper) & !grepl("epsilon", upper)) %>%
  mutate(upper = eval(parse(text=upper))) %>%
  mutate(upper = max(upper, -10000)) %>%
  group_by(family, benchmark) %>%
  pivot_wider(names_from = solver, values_from = c(upper)) %>%
  ggplot(mapping = aes(x = z3, y = oms_lin)) +
  geom_jitter(width = 0.05, height = 0.05) +
  scale_x_continuous("Z3", trans=scales::pseudo_log_trans(base = 10), lim=c(-10000, 1000),
                     breaks=c(-10000, -1000, -100, -10, -1, 0, 1, 10, 100, 1000), minor_breaks=c(),
                     labels=c("< -10000", "-1000", "-100", "-10", "-1", "0", "1", "10", "100", "1000")
                     ) +
  scale_y_continuous("OptiMathSAT(LIN)", trans=scales::pseudo_log_trans(base = 10), lim=c(-10000, 1000),
                     breaks=c(-10000, -1000, -100, -10, -1, 0, 1, 10, 100, 1000), minor_breaks=c(),
                     labels=c("< -10000", "-1000", "-100", "-10", "-1", "0", "1", "10", "100", "1000")) +
geom_abline(slope = 1, color="lightgray")

upper_bounds_z3_oms_lin_nia.png

data_nia %>%
  rowwise() %>%
  select(solver, family, benchmark, upper) %>%
  filter(upper != "None" & !grepl("oo", upper) & !grepl("inf", upper) & !grepl("epsilon", upper)) %>%
  mutate(upper = eval(parse(text=upper))) %>%
  mutate(upper = max(upper, -10000)) %>%
  group_by(family, benchmark) %>%
  pivot_wider(names_from = solver, values_from = c(upper)) %>%
  ggplot(mapping = aes(x = z3, y = oms_bin)) +
  geom_jitter(width = 0.05, height = 0.05) +
  scale_x_continuous("Z3", trans=scales::pseudo_log_trans(base = 10), lim=c(-10000, 1000),
                     breaks=c(-10000, -1000, -100, -10, -1, 0, 1, 10, 100, 1000), minor_breaks=c(),
                     labels=c("< -10000", "-1000", "-100", "-10", "-1", "0", "1", "10", "100", "1000")
                     ) +
  scale_y_continuous("OptiMathSAT(BIN)", trans=scales::pseudo_log_trans(base = 10), lim=c(-10000, 1000),
                     breaks=c(-10000, -1000, -100, -10, -1, 0, 1, 10, 100, 1000), minor_breaks=c(),
                     labels=c("< -10000", "-1000", "-100", "-10", "-1", "0", "1", "10", "100", "1000")) +
geom_abline(slope = 1, color="lightgray")

upper_bounds_z3_oms_bin_nia.png

Created: 2021-05-21 Fri 14:07