Optimization Modulo Non-Linear Arithmetic via Incremental Linearization (experimental evaluation)

1. Tools
2. Benchmarks
3. Results
4. R environment
5. OMTPlan
6. NIA

This webpage contains the benchmarks and tools needed to reproduce the experimental evaluation of the paper Optimization Modulo Non-Linear Arithmetic via Incremental Linearization, which was submitted to FroCoS 2021.

1 Tools

In the experimental evaluation, we compared the following two tools:

OptiMathSAT, available from optimathsat.
Z3 (version 4.8.10), available from https://github.com/Z3Prover/z3/releases/tag/z3-4.8.10.

We used the following settings for the solvers:

OptiMathSAT binary search

-optimization=true -model_generation=true --opt.verbose=true -stats -opt.strategy=bin

OptiMathSAT linear search

-optimization=true -model_generation=true --opt.verbose=true -stats -opt.strategy=lin

Z3
```
-st smt.arith.solver=2
```

2 Benchmarks

We used two sets of benchmarks, both available in the archive benchmarks.tar.bz2:

omtplan, which contains OMT(NRA) benchmarks generated by the tool OMTPlan,
QF_NIA, which contains OMT(NIA) benchmarks generated from the SMT(NIA) benchmarks in the SMT-LIB repository.

Because the vast majority of the QF_NIA benchmarks come from the benchmark family VeryMax, we did not use all benchmarks of that family but randomly picked 10% of them. The file names of the benchmarks used are listed in the file benchmarks/nia_benchmarks_subset.txt within the archive benchmarks.tar.bz2.
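The sampling procedure itself is not documented on this page. As a minimal sketch only, a reproducible 10% sample of one family could be drawn in R as follows; the file names and the seed are hypothetical placeholders, not the ones used in the evaluation.

```r
# Hypothetical list of benchmark file names in one family (placeholder names).
family_files <- sprintf("benchmark_%04d.smt2", 1:200)

# Fix the seed so that the same subset is drawn on every run (arbitrary value).
set.seed(2021)

# Draw 10% of the files without replacement (integer division avoids rounding issues).
subset_size <- length(family_files) %/% 10
family_subset <- sort(sample(family_files, size = subset_size))

# Print the first few selected file names, one per line.
writeLines(head(family_subset))
```

Recording the selected names in a text file, as done with nia_benchmarks_subset.txt, makes the experiment repeatable even if the seed or the sampling code is lost.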

3 Results

Because our experiments were run on a cluster with a specific infrastructure, we do not provide a general-purpose script for rerunning them. However, the data from our experiments can be found in the following archive: results.tar.bz2

The archive contains the following CSV files:

results/omtplan_z3.csv
results/omtplan_oms_lin.csv
results/omtplan_oms_bin.csv
results/nia_z3.csv
results/nia_oms_lin.csv
results/nia_oms_bin.csv

4 R environment

To analyse the results, first load the necessary R libraries.

library(ggplot2)
library(reshape2)
library(tidyr)
library(readr)
library(stringr)
library(scales)
library(dplyr)
library(ggthemes)
options(scipen=999)
theme_set(theme_light(base_size = 20))

Load the CSV files with the results.

omtplan_z3 <- read_csv(file="results/omtplan_z3.csv")
omtplan_oms_lin <- read_csv(file="results/omtplan_oms_lin.csv")
omtplan_oms_bin <- read_csv(file="results/omtplan_oms_bin.csv")

nia_z3 <- read_csv(file="results/nia_z3.csv")
nia_oms_lin <- read_csv(file="results/nia_oms_lin.csv")
nia_oms_bin <- read_csv(file="results/nia_oms_bin.csv")

Combine the data for all solvers into a single data set per benchmark set:

data_omtplan <- bind_rows(omtplan_z3, omtplan_oms_lin, omtplan_oms_bin) %>% arrange(solver)
data_nia <- bind_rows(nia_z3, nia_oms_lin, nia_oms_bin) %>% arrange(solver)
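The queries in the following sections rely on the columns solver, family, benchmark and status. The sketch below uses invented toy rows in place of the CSV files to show the combination step and a simple sanity check that no solver reports the same benchmark twice; the toy_* names and all values are hypothetical.

```r
library(dplyr)

# Toy stand-ins for two per-solver CSV files; the rows are invented for illustration.
toy_z3 <- tibble(solver = "z3", family = "nl/hvac",
                 benchmark = c("a", "b"), status = c("unsat", "timeout"))
toy_oms_lin <- tibble(solver = "oms_lin", family = "nl/hvac",
                      benchmark = c("a", "b"), status = c("unsat", "partial"))

# Stack the per-solver tables and sort by solver, as in the code above.
toy_data <- bind_rows(toy_z3, toy_oms_lin) %>% arrange(solver)

# Every (solver, family, benchmark) triple should occur exactly once.
duplicates <- toy_data %>%
  count(solver, family, benchmark) %>%
  filter(n > 1)
nrow(duplicates)
```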

5 OMTPlan

5.1 Overall

data_omtplan %>%
  group_by(solver) %>%
  count(status) %>%
  pivot_wider(names_from = status, values_from = n, values_fill=0)

solver	partial	sat	timeout	unsat	error
oms_bin	226	13	367	146	0
oms_lin	225	14	367	146	0
z3	142	65	379	163	3
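As a quick consistency check, the status counts of each solver can be added up. Assuming that every solver was run on every benchmark, each row should sum to the 752 distinct benchmarks reported in Section 5.3. The sketch below copies the counts from the table above into a data frame; the name overall is only illustrative.

```r
# Status counts copied from the table above.
overall <- data.frame(
  solver  = c("oms_bin", "oms_lin", "z3"),
  partial = c(226, 225, 142),
  sat     = c(13, 14, 65),
  timeout = c(367, 367, 379),
  unsat   = c(146, 146, 163),
  error   = c(0, 0, 3)
)

# Sum the status columns of each row (all columns except the solver name).
totals <- rowSums(overall[, -1])

# Each solver accounts for all 752 benchmarks.
stopifnot(all(totals == 752))
totals
```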

5.2 By family

5.2.1 Sat

data_omtplan %>%
  group_by(solver, family) %>%
  summarize(sat = sum(status == "sat")) %>%
  pivot_wider(names_from = solver, values_from = sat)

family	oms_bin	oms_lin	z3
nl/car_nl	0	0	0
nl/convoys_nl	0	0	0
nl/hvac	0	0	0
nl/nl_counters	2	3	4
nl/nl_counters_simple	11	11	32
nl/sec_clearance_sdac	0	0	29

5.2.2 Partial

data_omtplan %>%
  group_by(solver, family) %>%
  summarize(partial = sum(status == "partial")) %>%
  pivot_wider(names_from = solver, values_from = partial)

family	oms_bin	oms_lin	z3
nl/car_nl	7	7	3
nl/convoys_nl	0	0	0
nl/hvac	2	2	0
nl/nl_counters	14	13	15
nl/nl_counters_simple	16	16	11
nl/sec_clearance_sdac	187	187	113

5.2.3 Sat or partial

data_omtplan %>%
  group_by(solver, family) %>%
  summarize(sat_or_partial = sum(status == "sat" | status == "partial")) %>%
  pivot_wider(names_from = solver, values_from = sat_or_partial)

family	oms_bin	oms_lin	z3
nl/car_nl	7	7	3
nl/convoys_nl	0	0	0
nl/hvac	2	2	0
nl/nl_counters	16	16	19
nl/nl_counters_simple	27	27	43
nl/sec_clearance_sdac	187	187	142

5.2.4 Unsat

data_omtplan %>%
  group_by(solver, family) %>%
  summarize(unsat = sum(status == "unsat")) %>%
  pivot_wider(names_from = solver, values_from = unsat)

family	oms_bin	oms_lin	z3
nl/car_nl	0	0	0
nl/convoys_nl	24	24	24
nl/hvac	122	122	124
nl/nl_counters	0	0	5
nl/nl_counters_simple	0	0	10
nl/sec_clearance_sdac	0	0	0

5.3 Benchmark quality

Total benchmarks:

data_omtplan %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()

x
752

Total benchmarks by family:

data_omtplan %>%
  select(family, benchmark) %>%
  distinct() %>%
  count(family)

family	n
nl/car_nl	64
nl/convoys_nl	24
nl/hvac	128
nl/nl_counters	88
nl/nl_counters_simple	160
nl/sec_clearance_sdac	288

Unsat by at least one solver:

data_omtplan %>%
  filter(status == "unsat") %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()

x
163

Sat or partial by at least one solver:

data_omtplan %>%
  filter(status == "sat" | status == "partial") %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()

x
262

Sat or partial by at least one solver, by family:

data_omtplan %>%
  filter(status == "sat" | status == "partial") %>%
  select(family, benchmark) %>%
  distinct() %>%
  count(family)

family	n
nl/car_nl	8
nl/hvac	2
nl/nl_counters	21
nl/nl_counters_simple	43
nl/sec_clearance_sdac	188

5.4 Added linearization lemmas

We investigate which kinds of linearization lemmas oms_lin added during the search.

Added lemmas by type for the sat results:

omtplan_oms_lin %>%
  filter(solver == "oms_lin" & status == "sat") %>%
  pivot_longer(
    cols = ends_with("lemmas"),
    names_to = c("type"),
    names_pattern = "(.*)_lemmas",
    values_to = "lemmas") %>%
  filter(lemmas != 0) %>%
  group_by(family, benchmark) %>%
  summarize(lemma_types = toString(type)) %>%
  ungroup() %>%
  count(lemma_types)

lemma_types	n
zero	4
zero, neutral, proportionality	3
zero, neutral, proportionality, bound, tangent	2
zero, neutral, proportionality, bound, tangent, monotonicity	2
zero, neutral, proportionality, tangent	3

Added lemmas by type for the partial results:

omtplan_oms_lin %>%
  filter(solver == "oms_lin" & status == "partial") %>%
  pivot_longer(
    cols = ends_with("lemmas"),
    names_to = c("type"),
    names_pattern = "(.*)_lemmas",
    values_to = "lemmas") %>%
  filter(lemmas != 0) %>%
  group_by(family, benchmark) %>%
  summarize(lemma_types = toString(type)) %>%
  ungroup() %>%
  count(lemma_types)

lemma_types	n
zero	2
zero, neutral	2
zero, neutral, proportionality	5
zero, neutral, proportionality, bound, tangent	37
zero, neutral, proportionality, bound, tangent, monotonicity	173
zero, neutral, proportionality, tangent	3
zero, proportionality	2
zero, proportionality, tangent	1
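The two queries above reshape one column per lemma kind into one row per lemma kind and then concatenate the kinds that were actually used. The following sketch shows this step on toy data; the column names follow the *_lemmas pattern matched above, while the family name and the counts are invented.

```r
library(dplyr)
library(tidyr)

# Toy rows with one counter column per lemma kind (values are invented).
toy <- tibble(
  family = "nl/toy",
  benchmark = c("a", "b"),
  zero_lemmas = c(3, 1),
  neutral_lemmas = c(0, 2),
  tangent_lemmas = c(5, 0)
)

lemma_types <- toy %>%
  # One row per (benchmark, lemma kind); "zero_lemmas" becomes type "zero".
  pivot_longer(
    cols = ends_with("lemmas"),
    names_to = "type",
    names_pattern = "(.*)_lemmas",
    values_to = "lemmas") %>%
  # Keep only the lemma kinds that were added at least once.
  filter(lemmas != 0) %>%
  group_by(family, benchmark) %>%
  # Join the remaining kinds into a single comma-separated label.
  summarize(lemma_types = toString(type), .groups = "drop")

lemma_types
```

In this toy input, benchmark a ends up with the label "zero, tangent" and benchmark b with "zero, neutral", which is the kind of label counted in the tables above.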

5.5 Comparison of individual results

Compute a table with one row per benchmark that contains the results of every solver.

merged <- data_omtplan %>%
  select(solver, family, benchmark, status, time, lower, upper) %>%
  group_by(family, benchmark) %>%
  pivot_wider(names_from = solver, values_from = c(status, time, lower, upper)) %>%
  ungroup()

write.csv(merged, file = "table.csv")
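Passing several columns to values_from makes pivot_wider create one column per combination of value and solver, such as status_z3 or time_oms_lin. The sketch below illustrates this on a toy table with two solvers; the rows and the time values are invented.

```r
library(dplyr)
library(tidyr)

# Toy long-format results: one row per (solver, benchmark); values are invented.
toy <- tibble(
  solver = c("z3", "oms_lin", "z3", "oms_lin"),
  family = "nl/toy",
  benchmark = c("a", "a", "b", "b"),
  status = c("sat", "partial", "timeout", "partial"),
  time = c(1.5, 300, 300, 300)
)

# One row per benchmark, with <value>_<solver> columns.
toy_merged <- toy %>%
  pivot_wider(names_from = solver, values_from = c(status, time))

names(toy_merged)
```

The later queries on merged, for example the cross comparison, refer to these generated column names.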

5.5.1 Cross comparison

Cross comparison of results returned by oms_bin (row) and z3 (column):

merged %>%
  count(status_z3, status_oms_bin) %>%
  pivot_wider(names_from = status_z3, values_from = n)

status_oms_bin	error	partial	sat	timeout	unsat
partial	3	127	45	51	nil
timeout	nil	15	7	328	17
sat	nil	nil	13	nil	nil
unsat	nil	nil	nil	nil	146

Cross comparison of results returned by oms_lin (row) and z3 (column):

merged %>%
  count(status_z3, status_oms_lin) %>%
  pivot_wider(names_from = status_z3, values_from = n)

status_oms_lin	error	partial	sat	timeout	unsat
partial	3	127	44	51	nil
timeout	nil	15	7	328	17
sat	nil	nil	14	nil	nil
unsat	nil	nil	nil	nil	146
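In both tables, no benchmark is reported as unsat by one solver and as sat or partial by the other. A minimal sketch of such a contradiction check on a merged table is shown below; the toy rows are invented, and only the status_z3 and status_oms_bin column names are taken from the code above.

```r
library(dplyr)

# Toy merged table with one row per benchmark (rows are invented).
toy_merged <- tibble(
  benchmark = c("a", "b", "c"),
  status_z3 = c("sat", "unsat", "timeout"),
  status_oms_bin = c("partial", "unsat", "partial")
)

# Statuses that imply that a feasible solution was found.
feasible <- c("sat", "partial")

# Benchmarks on which one solver claims unsat while the other found a solution.
conflicts <- toy_merged %>%
  filter((status_z3 == "unsat" & status_oms_bin %in% feasible) |
         (status_oms_bin == "unsat" & status_z3 %in% feasible))

nrow(conflicts)
```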

5.6 VBS

Unsat by at least one solver:

data_omtplan %>%
  filter(status == "unsat") %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()

x
163

Unsat by at least one OMS:

data_omtplan %>%
  filter(status == "unsat" & (solver == "oms_bin" | solver == "oms_lin")) %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()

x
146

Timeout by all solvers:

merged %>%
  filter(status_z3 == "timeout" & status_oms_bin == "timeout" & status_oms_lin == "timeout") %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()

x
327

Timeout by all OMS:

merged %>%
  filter(status_oms_bin == "timeout" & status_oms_lin == "timeout") %>%
  select(family, benchmark) %>%
  distinct() %>%
  nrow()

x
366

5.7 OMS iterations

What is the number of iterations that oms_bin needed to find the optimum?

data_omtplan %>%
  filter(solver == "oms_bin" & status == "sat") %>%
  count(iterations)

iterations	n
3	11
4	1
7	1

What is the number of iterations that oms_bin went through for the partial results?

data_omtplan %>%
  filter(solver == "oms_bin" & status == "partial") %>%
  count(iterations)

iterations	n
1	128
2	18
3	16
4	13
5	20
6	11
7	11
8	3
9	2
11	2
13	1
16	1

What is the number of iterations that oms_lin needed to find the optimum?

data_omtplan %>%
  filter(solver == "oms_lin" & status == "sat") %>%
  count(iterations)

iterations	n
2	10
3	1
4	1
5	2

What is the number of iterations that oms_lin went through for the partial results?

data_omtplan %>%
  filter(solver == "oms_lin" & status == "partial") %>%
  count(iterations)

iterations	n
1	50
2	20
3	24
4	16
5	13
6	5
7	6
8	3
9	7
10	7
11	4
12	6
13	6
14	2
15	2
16	4
17	4
18	1
19	1
20	4
21	3
22	3
23	1
24	2
25	3
27	1
28	3
29	2
30	3
32	2
33	3
36	1
40	1
41	1
45	1
46	1
48	2
49	1
51	1
52	1
56	1
57	1
58	1
63	1

5.8 Plots

5.8.1 Upper bounds

data_omtplan %>%
    rowwise() %>%
    select(solver, family, benchmark, upper) %>%
    filter(upper != "None" & !grepl("oo", upper) & !grepl("inf", upper) & !grepl("epsilon", upper)) %>%
    mutate(upper = eval(parse(text=upper))) %>%
    group_by(family, benchmark) %>%
    pivot_wider(names_from = solver, values_from = c(upper)) %>%
    ggplot(mapping = aes(x = z3, y = oms_lin)) +
    geom_point() +
    scale_x_continuous("Z3", trans=scales::pseudo_log_trans(base = 10), lim=c(0, 10000), breaks=c(0, 0, 1, 10, 100, 1000, 10000), minor_breaks=c()) +
    scale_y_continuous("OptiMathSAT(LIN)", trans=scales::pseudo_log_trans(base = 10), lim=c(0, 10000), breaks=c(0, 0, 1, 10, 100, 1000, 10000), minor_breaks=c()) +
    geom_abline(slope = 1, color="lightgray")

data_omtplan %>%
    rowwise() %>%
    select(solver, family, benchmark, upper) %>%
    filter(upper != "None" & !grepl("oo", upper) & !grepl("inf", upper) & !grepl("epsilon", upper)) %>%
    mutate(upper = eval(parse(text=upper))) %>%
    group_by(family, benchmark) %>%
    pivot_wider(names_from = solver, values_from = c(upper)) %>%
    ggplot(mapping = aes(x = z3, y = oms_bin)) +
    geom_point() +
    scale_x_continuous("Z3", trans=scales::pseudo_log_trans(base = 10), lim=c(0, 10000), breaks=c(0, 0, 1, 10, 100, 1000, 10000), minor_breaks=c()) +
    scale_y_continuous("OptiMathSAT(BIN)", trans=scales::pseudo_log_trans(base = 10), lim=c(0, 10000), breaks=c(0, 0, 1, 10, 100, 1000, 10000), minor_breaks=c()) +
    geom_abline(slope = 1, color="lightgray")
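Both plots first drop bounds that are missing, infinite or involve epsilon, and then turn the remaining bound strings into numbers with eval(parse(text = upper)). The sketch below illustrates that conversion under the assumption that the bounds are stored as strings that R can evaluate, such as "7/2"; the example strings are hypothetical and the real CSV values may be formatted differently.

```r
library(scales)

# Hypothetical bound strings; the real CSV values may look different.
upper <- c("12", "7/2", "None", "oo", "3 + epsilon")

# Keep only strings without the markers that the plotting code filters out.
finite <- upper[upper != "None" & !grepl("oo|inf|epsilon", upper)]

# Evaluate each remaining string as an R expression, so "7/2" becomes 3.5.
values <- vapply(finite, function(s) eval(parse(text = s)), numeric(1))

# The pseudo-log transformation is defined at 0, unlike a plain log10 scale.
pseudo_log_trans(base = 10)$transform(c(0, values))
```

Because eval(parse()) executes arbitrary R code, this approach is only appropriate for trusted input such as one's own result files.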

6 NIA

6.1 Overall

data_nia %>%
  group_by(solver) %>%
  count(status) %>%
  pivot_wider(names_from = status, values_from = n, values_fill=0)

solver	error	partial	sat	timeout	unknown
oms_bin	92	1047	3258	1341	0
oms_lin	101	1019	3276	1348	0
z3	0	1105	1975	2589	75

6.2 By family

6.2.1 Sat

data_nia %>%
  group_by(solver, family) %>%
  summarize(sat = sum(status == "sat")) %>%
  pivot_wider(names_from = solver, values_from = sat)

family	oms_bin	oms_lin	z3
QF_NIA/20170427-VeryMax	1031	1051	273
QF_NIA/AProVE	1729	1722	1284
QF_NIA/calypto	221	226	237
QF_NIA/LassoRanker	13	14	6
QF_NIA/leipzig	250	248	129
QF_NIA/mcm	1	1	33
QF_NIA/UltimateLassoRanker	13	14	13

6.2.2 Partial

data_nia %>%
  group_by(solver, family) %>%
  summarize(partial = sum(status == "partial")) %>%
  pivot_wider(names_from = solver, values_from = partial)

family	oms_bin	oms_lin	z3
QF_NIA/20170427-VeryMax	981	948	935
QF_NIA/AProVE	49	53	130
QF_NIA/calypto	2	2	0
QF_NIA/LassoRanker	2	1	4
QF_NIA/leipzig	7	9	26
QF_NIA/mcm	0	0	7
QF_NIA/UltimateLassoRanker	6	6	3

6.2.3 Sat or partial

data_nia %>%
  group_by(solver, family) %>%
  summarize(sat_or_partial = sum(status == "sat" | status == "partial")) %>%
  pivot_wider(names_from = solver, values_from = sat_or_partial)

family	oms_bin	oms_lin	z3
QF_NIA/20170427-VeryMax	2012	1999	1208
QF_NIA/AProVE	1778	1775	1414
QF_NIA/calypto	223	228	237
QF_NIA/LassoRanker	15	15	10
QF_NIA/leipzig	257	257	155
QF_NIA/mcm	1	1	40
QF_NIA/UltimateLassoRanker	19	20	16

6.3 Benchmark statistics