moptipy.evaluation package

Components for parsing and evaluating log files generated by experiments.

Via the moptipy.api, it is possible to log the progress or end results of optimization algorithm runs in text-based log files. With the methods in this package, you can load and evaluate them. This usually follows a multi-step approach: for example, you can first extract the end results from several algorithms and instances into a single file via EndResult. This file can then be processed into per-algorithm or per-instance statistics using EndStatistics.
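For example, a typical evaluation session could look like the following sketch (the directory results and the output file name are placeholders for your own experiment layout, not part of the API):

>>> from moptipy.evaluation.end_results import EndResult
>>> from moptipy.evaluation.end_statistics import EndStatistics
>>> end_results = []  # will receive one EndResult per parsed log file
>>> EndResult.from_logs("results", end_results.append)
>>> end_stats = []  # will receive the aggregated per-setup statistics
>>> EndStatistics.from_end_results(end_results, end_stats.append)
>>> path = EndStatistics.to_csv(end_stats, "end_statistics.csv")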

Submodules

moptipy.evaluation.axis_ranger module

A utility to specify axis ranges.

class moptipy.evaluation.axis_ranger.AxisRanger(chosen_min=None, chosen_max=None, use_data_min=True, use_data_max=True, log_scale=False, log_base=None)[source]

Bases: object

An object for simplifying axis range computations.

apply(axes, which_axis)[source]

Apply this axis ranger to the given axis.

Parameters:
  • axes (Axes) – the axes object to which the ranger shall be applied

  • which_axis (str) – the axis to which it should be applied, either “x” or “y” or both (“xy”)

Return type:

None

static for_axis(name, chosen_min=None, chosen_max=None, use_data_min=None, use_data_max=None, log_scale=None, log_base=None)[source]

Create a default axis ranger based on the axis type.

The axis ranger will use the minimal values and log scaling options that usually make sense for the dimension, unless overridden by the optional arguments.

Parameters:
  • name (str) – the axis type name, supporting “ms”, “FEs”, “plainF”, “scaledF”, and “normalizedF”

  • chosen_min (Optional[float], default: None) – the chosen minimum

  • chosen_max (Optional[float], default: None) – the chosen maximum

  • use_data_min (Optional[bool], default: None) – should the data minimum be used

  • use_data_max (Optional[bool], default: None) – should the data maximum be used

  • log_scale (Optional[bool], default: None) – the log scale indicator

  • log_base (Optional[float], default: None) – the log base

Return type:

AxisRanger

Returns:

the AxisRanger

static for_axis_func(chosen_min=None, chosen_max=None, use_data_min=None, use_data_max=None, log_scale=None, log_base=None)[source]

Generate a function that provides the default per-axis ranger.

Parameters:
  • chosen_min (Optional[float], default: None) – the chosen minimum

  • chosen_max (Optional[float], default: None) – the chosen maximum

  • use_data_min (Optional[bool], default: None) – should the data minimum be used

  • use_data_max (Optional[bool], default: None) – should the data maximum be used

  • log_scale (Optional[bool], default: None) – the log scale indicator

  • log_base (Optional[float], default: None) – the log base

Return type:

Callable

Returns:

a function in the shape of for_axis() with the provided defaults

get_0_replacement()[source]

Get a reasonable positive finite value that can replace 0.

Return type:

float

Returns:

a reasonable finite value that can be used to replace 0

get_pinf_replacement()[source]

Get a reasonable finite value that can replace positive infinity.

Return type:

float

Returns:

a reasonable finite value that can be used to replace positive infinity

log_scale: Final[bool]

Should the axis be log-scaled?

pad_detected_range(pad_min=False, pad_max=False)[source]

Add some padding to the current detected range.

This function increases the current detected or chosen maximum value and/or decreases the current detected minimum by a small amount. This can be useful when we want to plot stuff that otherwise would become invisible because it would be directly located at the boundary of a plot.

This function works by computing a slightly smaller/larger value than the current detected minimum/maximum and then passing it to register_value(). It can only work if the end(s) chosen for padding are in “detect” mode and the other end is either in “detect” or “chosen” mode.

This method should be called only once and only after all data has been registered (via register_value() or register_array()) and before calling apply().

Parameters:
  • pad_min (bool, default: False) – should we pad the minimum?

  • pad_max (bool, default: False) – should we pad the maximum?

Raises:

ValueError – if this axis ranger is not configured to use a detected minimum/maximum or does not have a detected minimum/maximum or any other invalid situation occurs

Return type:

None

register_array(data)[source]

Register a data array.

Parameters:

data (ndarray) – the data to register

Return type:

None

register_value(value)[source]

Register a single value.

Parameters:

value (float) – the data to register

Return type:

None
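A minimal usage sketch of the lifecycle described above, i.e., create a ranger, register the data, optionally pad the detected range, and finally apply it to a matplotlib axes object (the figure and the data array are illustrative placeholders):

>>> import numpy as np
>>> from matplotlib import pyplot as plt
>>> from moptipy.evaluation.axis_ranger import AxisRanger
>>> fig, axes = plt.subplots()
>>> ranger = AxisRanger.for_axis("FEs")  # default ranger for an FE time axis
>>> ranger.register_array(np.array([1, 10, 100, 1000]))
>>> ranger.pad_detected_range(pad_max=True)  # only valid if the max is detected
>>> ranger.apply(axes, "x")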

moptipy.evaluation.base module

Some internal helper functions and base classes.

class moptipy.evaluation.base.EvaluationDataElement[source]

Bases: object

A base class for all the data classes in this module.

moptipy.evaluation.base.F_NAME_NORMALIZED: Final[str] = 'normalizedF'

The name of the normalized objective values data.

moptipy.evaluation.base.F_NAME_RAW: Final[str] = 'plainF'

The name of the raw objective values data.

moptipy.evaluation.base.F_NAME_SCALED: Final[str] = 'scaledF'

The name of the scaled objective values data.

moptipy.evaluation.base.KEY_ENCODING: Final[str] = 'encoding'

a key for the encoding name

moptipy.evaluation.base.KEY_N: Final[str] = 'n'

The key for the total number of runs.

moptipy.evaluation.base.KEY_OBJECTIVE_FUNCTION: Final[str] = 'objective'

a key for the objective function name

class moptipy.evaluation.base.MultiRun2DData(algorithm, instance, objective, encoding, n, time_unit, f_name)[source]

Bases: MultiRunData

Multi-run data based on one time dimension and one objective dimension.

>>> p = MultiRun2DData("a", "i", "f", None, 3,
...                    TIME_UNIT_FES, F_NAME_SCALED)
>>> p.instance
'i'
>>> p.algorithm
'a'
>>> p.objective
'f'
>>> print(p.encoding)
None
>>> p.n
3
>>> print(p.time_unit)
FEs
>>> print(p.f_name)
scaledF
>>> try:
...     MultiRun2DData("a", "i", "f", None, 3,
...                    3, F_NAME_SCALED)
... except TypeError as te:
...     print(te)
time_unit should be an instance of str but is int, namely '3'.
>>> try:
...     MultiRun2DData("a", "i", "f", None, 3,
...                    "sdfjsdf", F_NAME_SCALED)
... except ValueError as ve:
...     print(ve)
Invalid time unit 'sdfjsdf', only 'FEs' and 'ms' are permitted.
>>> try:
...     MultiRun2DData("a", "i", "f", None, 3,
...                    TIME_UNIT_FES, True)
... except TypeError as te:
...     print(te)
f_name should be an instance of str but is bool, namely 'True'.
>>> try:
...     MultiRun2DData("a", "i", "f", None, 3,
...                    TIME_UNIT_FES, "blablue")
... except ValueError as ve:
...     print(ve)
Invalid f name 'blablue', only 'plainF', 'scaledF', and 'normalizedF' are permitted.
f_name: str

the name of the objective value axis.

time_unit: str

The unit of the time axis.

class moptipy.evaluation.base.MultiRunData(algorithm, instance, objective, encoding, n)[source]

Bases: EvaluationDataElement

A class that represents statistics over a set of runs.

If data from only one algorithm*instance combination is aggregated, then both algorithm and instance are defined. Otherwise, only those parameters that are the same over all recorded runs are defined.

>>> p = MultiRunData("a", "i", "f", None, 3)
>>> p.instance
'i'
>>> p.algorithm
'a'
>>> p.objective
'f'
>>> print(p.encoding)
None
>>> p.n
3
>>> p = MultiRunData(None, None, None, "x", 3)
>>> print(p.instance)
None
>>> print(p.algorithm)
None
>>> print(p.objective)
None
>>> p.encoding
'x'
>>> p.n
3
>>> try:
...     MultiRunData(1, "i", "f", "e", 234)
... except TypeError as te:
...     print(te)
algorithm name should be an instance of any in {None, str} but is int, namely '1'.
>>> try:
...     MultiRunData("x x", "i", "f", "e", 234)
... except ValueError as ve:
...     print(ve)
Invalid algorithm name 'x x'.
>>> try:
...     MultiRunData("a", 5.5, "f", "e", 234)
... except TypeError as te:
...     print(te)
instance name should be an instance of any in {None, str} but is float, namely '5.5'.
>>> try:
...     MultiRunData("x", "a-i", "f", "e", 234)
... except ValueError as ve:
...     print(ve)
Invalid instance name 'a-i'.
>>> try:
...     MultiRunData("a", "i", True, "e", 234)
... except TypeError as te:
...     print(te)
objective name should be an instance of any in {None, str} but is bool, namely 'True'.
>>> try:
...     MultiRunData("xx", "i", "d'@f", "e", 234)
... except ValueError as ve:
...     print(ve)
Invalid objective name "d'@f".
>>> try:
...     MultiRunData("yy", "i", "f", -9.4, 234)
... except TypeError as te:
...     print(te)
encoding name should be an instance of any in {None, str} but is float, namely '-9.4'.
>>> try:
...     MultiRunData("xx", "i", "f", "e-{a", 234)
... except ValueError as ve:
...     print(ve)
Invalid encoding name 'e-{a'.
>>> try:
...     MultiRunData("x", "i", "f", "e", -1.234)
... except TypeError as te:
...     print(te)
n should be an instance of int but is float, namely '-1.234'.
>>> try:
...     MultiRunData("xx", "i", "f", "e", 1_000_000_000_000_000_000_000)
... except ValueError as ve:
...     print(ve)
n=1000000000000000000000 is invalid, must be in 1..1000000000000000.
algorithm: str | None

The algorithm that was applied, if the same over all runs.

encoding: str | None

the encoding, if any, or None if no encoding was used or if it was not the same over all runs

instance: str | None

The problem instance that was solved, if the same over all runs.

n: int

The number of runs over which the statistic information is computed.

objective: str | None

the name of the objective function, if the same over all runs

class moptipy.evaluation.base.PerRunData(algorithm, instance, objective, encoding, rand_seed)[source]

Bases: EvaluationDataElement

An immutable record of information over a single run.

>>> p = PerRunData("a", "i", "f", None, 234)
>>> p.instance
'i'
>>> p.algorithm
'a'
>>> p.objective
'f'
>>> print(p.encoding)
None
>>> p.rand_seed
234
>>> p = PerRunData("a", "i", "f", "e", 234)
>>> p.instance
'i'
>>> p.algorithm
'a'
>>> p.objective
'f'
>>> p.encoding
'e'
>>> p.rand_seed
234
>>> try:
...     PerRunData(3, "i", "f", "e", 234)
... except TypeError as te:
...     print(te)
algorithm name should be an instance of str but is int, namely '3'.
>>> try:
...     PerRunData("@1 2", "i", "f", "e", 234)
... except ValueError as ve:
...     print(ve)
Invalid algorithm name '@1 2'.
>>> try:
...     PerRunData("x", 3.2, "f", "e", 234)
... except TypeError as te:
...     print(te)
instance name should be an instance of str but is float, namely '3.2'.
>>> try:
...     PerRunData("x", "sdf i", "f", "e", 234)
... except ValueError as ve:
...     print(ve)
Invalid instance name 'sdf i'.
>>> try:
...     PerRunData("a", "i", True, "e", 234)
... except TypeError as te:
...     print(te)
objective name should be an instance of str but is bool, namely 'True'.
>>> try:
...     PerRunData("x", "i", "d-f", "e", 234)
... except ValueError as ve:
...     print(ve)
Invalid objective name 'd-f'.
>>> try:
...     PerRunData("x", "i", "f", 54.2, 234)
... except TypeError as te:
...     print(te)
encoding name should be an instance of any in {None, str} but is float, namely '54.2'.
>>> try:
...     PerRunData("y", "i", "f", "x  x", 234)
... except ValueError as ve:
...     print(ve)
Invalid encoding name 'x  x'.
>>> try:
...     PerRunData("x", "i", "f", "e", 3.3)
... except TypeError as te:
...     print(te)
rand_seed should be an instance of int but is float, namely '3.3'.
>>> try:
...     PerRunData("x", "i", "f", "e", -234)
... except ValueError as ve:
...     print(ve)
rand_seed=-234 is invalid, must be in 0..18446744073709551615.
algorithm: str

The algorithm that was applied.

encoding: str | None

the encoding, if any, or None if no encoding was used

instance: str

The problem instance that was solved.

objective: str

the name of the objective function

rand_seed: int

The seed of the random number generator.

moptipy.evaluation.base.TIME_UNIT_FES: Final[str] = 'FEs'

The unit of the time axis if time is measured in FEs.

moptipy.evaluation.base.TIME_UNIT_MILLIS: Final[str] = 'ms'

The unit of the time axis if time is measured in milliseconds.

moptipy.evaluation.base.check_f_name(f_name)[source]

Check whether an objective value name is valid.

Parameters:

f_name (Any) – the name of the objective function dimension

Return type:

str

Returns:

the name of the objective function dimension

>>> check_f_name("plainF")
'plainF'
>>> check_f_name("scaledF")
'scaledF'
>>> check_f_name("normalizedF")
'normalizedF'
>>> try:
...     check_f_name(1.0)
... except TypeError as te:
...     print(te)
f_name should be an instance of str but is float, namely '1.0'.
>>> try:
...     check_f_name("oops")
... except ValueError as ve:
...     print(ve)
Invalid f name 'oops', only 'plainF', 'scaledF', and 'normalizedF' are permitted.
moptipy.evaluation.base.check_time_unit(time_unit)[source]

Check that the time unit is OK.

Parameters:

time_unit (Any) – the time unit

Return type:

str

Returns:

the time unit string

>>> check_time_unit("FEs")
'FEs'
>>> check_time_unit("ms")
'ms'
>>> try:
...     check_time_unit(1)
... except TypeError as te:
...     print(te)
time_unit should be an instance of str but is int, namely '1'.
>>> try:
...     check_time_unit("blabedibla")
... except ValueError as ve:
...     print(ve)
Invalid time unit 'blabedibla', only 'FEs' and 'ms' are permitted.
moptipy.evaluation.base.get_algorithm(obj)[source]

Get the algorithm of a given object.

Parameters:

obj (PerRunData | MultiRunData) – the object

Return type:

str | None

Returns:

the algorithm string, or None if no algorithm is specified

>>> p1 = MultiRunData("a1", "i1", "f", "y", 3)
>>> get_algorithm(p1)
'a1'
>>> p2 = PerRunData("a2", "i2", "y", None, 31)
>>> get_algorithm(p2)
'a2'
moptipy.evaluation.base.get_instance(obj)[source]

Get the instance of a given object.

Parameters:

obj (PerRunData | MultiRunData) – the object

Return type:

str | None

Returns:

the instance string, or None if no instance is specified

>>> p1 = MultiRunData("a", "i1", None, "x", 3)
>>> get_instance(p1)
'i1'
>>> p2 = PerRunData("a", "i2", "f", "x", 31)
>>> get_instance(p2)
'i2'
moptipy.evaluation.base.sort_key(obj)[source]

Get the default sort key for the given object.

The sort key is a tuple with well-defined field elements that should allow for a default and consistent sorting over many different elements of the experiment evaluation data API. Sorting should work also for lists containing elements of different classes.

Parameters:

obj (PerRunData | MultiRunData) – the object

Return type:

tuple[Any, ...]

Returns:

the sort key

>>> p1 = MultiRunData("a1", "i1", "f", None, 3)
>>> p2 = PerRunData("a2", "i2", "f", None, 31)
>>> sort_key(p1) < sort_key(p2)
True
>>> sort_key(p1) >= sort_key(p2)
False
>>> p3 = MultiRun2DData("a", "i", "f", None, 3,
...                     TIME_UNIT_FES, F_NAME_SCALED)
>>> sort_key(p3) < sort_key(p1)
True
>>> sort_key(p3) >= sort_key(p1)
False

moptipy.evaluation.ecdf module

Approximate the ECDF to reach certain goals.

The empirical cumulative distribution function (ECDF for short) illustrates the fraction of runs that have reached a certain goal over time. Let’s say that you have performed 10 runs of a certain algorithm on a certain problem. As goal quality, you could define the globally optimal solution quality. For any point in time, the ECDF then shows how many of these runs have solved the problem to this goal, i.e., to optimality. Let’s say the first run solves the problem after 100 FEs. Then the ECDF is 0 until 99 FEs and becomes 1/10 at 100 FEs. The second-fastest run solves the problem after 200 FEs. The ECDF thus stays 0.1 until 199 FEs and jumps to 0.2 at 200 FEs. And so on. This means that the value of the ECDF is always between 0 and 1.

  1. Nikolaus Hansen, Anne Auger, Steffen Finck, Raymond Ros. Real-Parameter Black-Box Optimization Benchmarking 2010: Experimental Setup. Research Report RR-7215, INRIA. 2010. inria-00462481. https://hal.inria.fr/inria-00462481/document/

  2. Dave Andrew Douglas Tompkins and Holger H. Hoos. UBCSAT: An Implementation and Experimentation Environment for SLS Algorithms for SAT and MAX-SAT. In Revised Selected Papers from the Seventh International Conference on Theory and Applications of Satisfiability Testing (SAT’04), May 10-13, 2004, Vancouver, BC, Canada, pages 306-320. Lecture Notes in Computer Science (LNCS), volume 3542. Berlin, Germany: Springer-Verlag GmbH. ISBN: 3-540-27829-X. doi: https://doi.org/10.1007/11527695_24.

  3. Holger H. Hoos and Thomas Stützle. Evaluating Las Vegas Algorithms - Pitfalls and Remedies. In Gregory F. Cooper and Serafín Moral, editors, Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence (UAI’98), July 24-26, 1998, Madison, WI, USA, pages 238-245. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. ISBN: 1-55860-555-X.
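The following small, self-contained sketch (plain Python, not part of the moptipy API) illustrates the computation from the example above: given the FEs at which each of ten runs reached the goal, the ECDF at time t is the fraction of runs whose success time is at most t.

>>> success_fes = [100, 200, 450, 900]  # four of ten runs reached the goal
>>> n_runs = 10
>>> def ecdf_at(t):
...     return sum(1 for s in success_fes if s <= t) / n_runs
>>> ecdf_at(99)
0.0
>>> ecdf_at(100)
0.1
>>> ecdf_at(200)
0.2
>>> ecdf_at(1_000_000)
0.4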

class moptipy.evaluation.ecdf.Ecdf(algorithm, objective, encoding, n, n_insts, time_unit, f_name, goal_f, ecdf)[source]

Bases: MultiRun2DData

The ECDF data.

classmethod create(source, goal_f=None, use_default_goal_f=True)[source]

Create one single Ecdf record from an iterable of Progress records.

Parameters:
  • source (Iterable[Progress]) – the set of progress instances

  • goal_f (Union[int, float, Callable, None], default: None) – the goal objective value

  • use_default_goal_f (bool, default: True) – should we use the default lower bounds as goals?

Returns:

the Ecdf record

Return type:

Ecdf

ecdf: ndarray

The ECDF data function

classmethod from_progresses(source, consumer, f_goal=None, join_all_algorithms=False, join_all_objectives=False, join_all_encodings=False)[source]

Compute one or multiple ECDFs from a stream of progress data.

Parameters:
  • source (Iterable[Progress]) – the set of progress instances

  • f_goal (Union[int, float, Callable, Iterable[Union[int, float, Callable]], None], default: None) – one or multiple goal values

  • consumer (Callable[[Ecdf], Any]) – the destination to which the new records will be passed, can be the append method of a list

  • join_all_algorithms (bool, default: False) – should the Ecdf be aggregated over all algorithms

  • join_all_objectives (bool, default: False) – should the Ecdf be aggregated over all objective functions

  • join_all_encodings (bool, default: False) – should the Ecdf be aggregated over all encodings

Return type:

None

goal_f: int | float | None

The goal value, or None if different goals were used for different instances

n_insts: int

The number of instances over which the ERT-ECDF is computed.

time_label()[source]

Get the time label for x-axes.

Return type:

str

Returns:

the time key

to_csv(file, put_header=True)[source]

Store an Ecdf record in a CSV file.

Parameters:
  • file (str) – the file to generate

  • put_header (bool, default: True) – should we put a header with meta-data?

Return type:

Path

Returns:

the fully resolved file name

moptipy.evaluation.ecdf.KEY_F_NAME: Final[str] = 'fName'

The objective dimension name.

moptipy.evaluation.ecdf.KEY_N_INSTS: Final[str] = 'nInsts'

The number of instances.

moptipy.evaluation.ecdf.get_goal(ecdf)[source]

Get the goal value from the given ecdf instance.

Parameters:

ecdf (Ecdf) – the ecdf instance

Return type:

int | float | None

Returns:

the goal value

moptipy.evaluation.ecdf.goal_to_str(goal_f)[source]

Transform a goal to a string.

Parameters:

goal_f (int | float | None) – the goal value

Return type:

str

Returns:

the string representation

moptipy.evaluation.end_results module

Records for EndResult as well as their parsing and serialization.

When doing experiments with moptipy, you apply algorithm setups to problem instances. For each setup x instance combination, you may conduct a series of repetitions (so-called runs) with different random seeds. Each single run of an algorithm setup on a problem instance can produce a separate log file. From each log file, we can load an EndResult instance, which represents, well, the end result of the run, i.e., information such as the best solution quality reached, when it was reached, and the termination criterion. These end result records can then be the basis for, e.g., computing summary statistics via end_statistics or for plotting the end result distribution via plot_end_results.

class moptipy.evaluation.end_results.EndResult(algorithm, instance, objective, encoding, rand_seed, best_f, last_improvement_fe, last_improvement_time_millis, total_fes, total_time_millis, goal_f, max_fes, max_time_millis)[source]

Bases: PerRunData

An immutable end result record of one run of one algorithm on one problem.

This record provides the information of the outcome of one application of one algorithm to one problem instance in an immutable way.

best_f: int | float

The best objective value encountered.

static from_csv(file, consumer, filterer=<function EndResult.<lambda>>)[source]

Parse a given CSV file to get EndResult Records.

Parameters:
  • file (str) – the path to parse

  • consumer (Callable[[EndResult], Any]) – the collector, can be the append method of a list

  • filterer (Callable[[EndResult], bool], default: <function EndResult.<lambda>>) – an optional filter function

Return type:

None

static from_logs(path, consumer, max_fes=None, max_time_millis=None, goal_f=None)[source]

Parse a given path and pass all end results found to the consumer.

If path identifies a file with suffix .txt, then this file is parsed. The appropriate EndResult is created and appended to the collector. If path identifies a directory, then this directory is parsed recursively and, for each log file found, one record is passed to the consumer. As consumer, you could pass any callable that accepts instances of EndResult, e.g., the append method of a list.

Via the parameters max_fes, max_time_millis, and goal_f, you can set virtual limits for the objective function evaluations, the maximum runtime, and the objective value. The EndResult records will then not represent the actual final state of the runs but be synthesized from the logged progress information. This, of course, requires such information to be present. It will also raise a ValueError if the goals are invalid, e.g., if a runtime limit is specified that lies before the first logged point.

There is one caveat when specifying max_time_millis: Let’s say that the log files only log improvements. Then you might have a log point for 7000 FEs, 1000ms, and f=100. The next log point could be 8000 FEs, 1200ms, and f=90. Now if your time limit specified is 1100ms, we know that the end result is f=100 (because f=90 was reached too late) and that the total runtime is 1100ms, as this is the limit you specified and it was also reached. But we do not know the number of consumed FEs. We know you consumed at least 7000 FEs, but you did not consume 8000 FEs. It would be wrong to claim that 7000 FEs were consumed, since it could have been more. We therefore set a virtual end point at 7999 FEs. In terms of performance metrics such as the ERT, this is the most conservative choice in that it does not over-estimate the speed of the algorithm. It can, however, lead to very big deviations from the actual values. For example, if your algorithm quickly converged to a local optimum and there simply is no log point that exceeds the virtual time limit, but the original run had a huge FE-based budget while your virtual time limit was small, this could lead to an estimate of millions of FEs taking place within seconds…

Parameters:
  • path (str) – the path to parse

  • consumer (Callable[[EndResult], Any]) – the consumer

  • max_fes (Union[int, None, Callable[[str, str], int | None]], default: None) – the maximum FEs, a callable to compute the maximum FEs from the algorithm and instance name, or None if unspecified

  • max_time_millis (Union[int, None, Callable[[str, str], int | None]], default: None) – the maximum runtime in milliseconds, a callable to compute the maximum runtime from the algorithm and instance name, or None if unspecified

  • goal_f (Union[int, float, None, Callable[[str, str], int | float | None]], default: None) – the goal objective value, a callable to compute the goal objective value from the algorithm and instance name, or None if unspecified

Return type:

None
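As a sketch, loading end results under a virtual runtime limit could look as follows (the directory results and the 1100 ms limit are illustrative):

>>> from moptipy.evaluation.end_results import EndResult
>>> results = []
>>> EndResult.from_logs("results", results.append, max_time_millis=1100)
>>> # each record now reflects the best result found within 1100 ms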

static getter(dimension)[source]

Produce a function that obtains the given dimension from EndResults.

The following dimensions are supported:

  1. lastImprovementFE: last_improvement_fe

  2. lastImprovementTimeMillis: last_improvement_time_millis

  3. totalFEs: total_fes

  4. totalTimeMillis: total_time_millis

  5. goalF: goal_f

  6. plainF, bestF: best_f

  7. scaledF: best_f / goal_f

  8. normalizedF: (best_f - goal_f) / goal_f

  9. maxFEs: max_fes

  10. maxTimeMillis: max_time_millis

  11. fesPerTimeMilli: total_fes / total_time_millis

Parameters:

dimension (str) – the dimension

Return type:

Callable[[EndResult], int | float]

Returns:

a callable that returns the value corresponding to the dimension from its input value, which must be an EndResult
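Continuing the sketch from from_logs() above, the getter can, for instance, extract the scaled objective values from a list of end results (the list results is assumed to have been filled before):

>>> get_scaled = EndResult.getter("scaledF")  # computes best_f / goal_f
>>> scaled_values = [get_scaled(er) for er in results]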

goal_f: int | float | None

The goal objective value if provided

last_improvement_fe: int

The index of the function evaluation when best_f was reached.

last_improvement_time_millis: int

The time when best_f was reached.

max_fes: int | None

The (optional) maximum permitted FEs.

max_time_millis: int | None

The (optional) maximum runtime.

path_to_file(base_dir)[source]

Get the path that would correspond to the log file of this end result.

Obtain a path that would correspond to the log file of this end result, resolved from a base directory base_dir.

Parameters:

base_dir (str) – the base directory

Return type:

Path

Returns:

the path to a file corresponding to the end result record

success()[source]

Check if a run is successful.

This method returns True if and only if goal_f is defined and best_f <= goal_f (and False otherwise).

Return type:

bool

Returns:

True if and only if best_f<=goal_f

static to_csv(results, file)[source]

Write a sequence of end results to a file in CSV format.

Parameters:
  • results – the end results to write

  • file – the file to generate

Return type:

Path

Returns:

the path of the file that was written

total_fes: int

The total number of performed FEs.

total_time_millis: int

The total time consumed by the run.

moptipy.evaluation.end_statistics module

Statistics aggregated over multiple instances of EndResult.

The end_results records hold the final result of a run of an optimization algorithm on a problem instance. Often, we do not want to compare these single results directly, but instead analyze summary statistics, such as the mean best objective value found. For this purpose, EndStatistics exists. It summarizes the singular results from the runs into a record with the most important statistics.

class moptipy.evaluation.end_statistics.EndStatistics(algorithm, instance, objective, encoding, n, best_f, last_improvement_fe, last_improvement_time_millis, total_fes, total_time_millis, goal_f, best_f_scaled, n_success, success_fes, success_time_millis, ert_fes, ert_time_millis, max_fes, max_time_millis)[source]

Bases: MultiRunData

Statistics over end results of one or multiple algorithm*instance setups.

If data from only one algorithm*instance combination is aggregated, then both algorithm and instance are defined. Otherwise, only those parameters that are the same over all recorded runs are defined.

best_f: Statistics

The statistics about the best encountered result.

best_f_scaled: Statistics | None

best_f / goal_f if goal_f is consistently defined and always positive.

static create(source)[source]

Create an EndStatistics Record from an Iterable of EndResult.

Parameters:

source (Iterable[EndResult]) – the source

Returns:

the statistics

Return type:

EndStatistics

ert_fes: int | float | None

The ERT in FEs: it is inf if n_success=0, None if goal_f is None, and finite otherwise.

ert_time_millis: int | float | None

The ERT in milliseconds: it is inf if n_success=0, None if goal_f is None, and finite otherwise.

static from_csv(file, consumer)[source]

Parse a CSV file and collect all encountered EndStatistics.

Parameters:
  • file – the CSV file to parse

  • consumer – the collector receiving the parsed EndStatistics records, can be the append method of a list

Return type:

None

static from_end_results(source, consumer, join_all_algorithms=False, join_all_instances=False, join_all_objectives=False, join_all_encodings=False)[source]

Aggregate statistics over a stream of end results.

Parameters:
  • source (Iterable[EndResult]) – the stream of end results

  • consumer (Callable[[EndStatistics], Any]) – the destination to which the new records will be sent, can be the append method of a list

  • join_all_algorithms (bool, default: False) – should the statistics be aggregated over all algorithms

  • join_all_instances (bool, default: False) – should the statistics be aggregated over all instances

  • join_all_objectives (bool, default: False) – should the statistics be aggregated over all objectives?

  • join_all_encodings (bool, default: False) – should statistics be aggregated over all encodings

Return type:

None
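A sketch of aggregating loaded end results into one EndStatistics record per algorithm (the directory results is a placeholder):

>>> from moptipy.evaluation.end_results import EndResult
>>> from moptipy.evaluation.end_statistics import EndStatistics
>>> results = []
>>> EndResult.from_logs("results", results.append)
>>> per_algorithm = []
>>> EndStatistics.from_end_results(results, per_algorithm.append,
...                                join_all_instances=True)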

static getter(dimension)[source]

Create a function that obtains the given dimension from EndStatistics.

Parameters:

dimension (str) – the dimension

Return type:

Callable[[EndStatistics], int | float | None]

Returns:

a callable that returns the value corresponding to the dimension

goal_f: Statistics | int | float | None

The goal objective value.

last_improvement_fe: Statistics

The statistics about the last improvement FE.

last_improvement_time_millis: Statistics

The statistics about the last improvement time.

max_fes: Statistics | int | None

The budget in FEs, if every run had one; None otherwise.

max_time_millis: Statistics | int | None

The budget in milliseconds, if every run had one; None otherwise.

n_success: int | None

The number of successful runs, if goal_f != None, else None.

success_fes: Statistics | None

The FEs to success, if n_success > 0, None otherwise.

success_time_millis: Statistics | None

The time to success, if n_success > 0, None otherwise.

static to_csv(data, file)[source]

Store a set of EndStatistics in a CSV file.

Parameters:
  • data – the EndStatistics to store

  • file – the file to generate

Return type:

Path

Returns:

the path to the generated CSV file

total_fes: Statistics

The statistics about the total number of FEs.

total_time_millis: Statistics

The statistics about the total time.

moptipy.evaluation.end_statistics.KEY_BEST_F_SCALED: Final[str] = 'bestFscaled'

The key for the scaled best F.

moptipy.evaluation.end_statistics.KEY_ERT_FES: Final[str] = 'ertFEs'

The key for the ERT in FEs.

moptipy.evaluation.end_statistics.KEY_ERT_TIME_MILLIS: Final[str] = 'ertTimeMillis'

The key for the ERT in milliseconds.

moptipy.evaluation.end_statistics.KEY_N_SUCCESS: Final[str] = 'successN'

The key for the number of successful runs.

moptipy.evaluation.end_statistics.KEY_SUCCESS_FES: Final[str] = 'successFEs'

The key for the success FEs.

moptipy.evaluation.end_statistics.KEY_SUCCESS_TIME_MILLIS: Final[str] = 'successTimeMillis'

The key for the success time millis.

moptipy.evaluation.ert module

Approximate the expected running time to reach certain goals.

The (empirically estimated) Expected Running Time (ERT) tries to give an impression of how long an algorithm needs to reach a certain solution quality.

The ERT for a problem instance is estimated as the sum of all FEs that all the runs consumed until they either discovered a solution of a given goal quality or exhausted their budget, divided by the number of runs that discovered a solution of the goal quality. The ERT is the mean expected runtime under the assumption of independent restarts after failed runs, which then may either succeed (consuming the mean runtime of the successful runs) or fail again (with the observed failure probability, after consuming the available budget).

The ERT itself can be considered a function that associates the estimated runtime given above with every possible solution quality that can be attained by an algorithm for a given problem. For qualities/goals that an algorithm did not attain in any run, the ERT becomes infinite.

  1. Kenneth V. Price. Differential Evolution vs. The Functions of the 2nd ICEO. In Russ Eberhart, Peter Angeline, Thomas Back, Zbigniew Michalewicz, and Xin Yao, editors, IEEE International Conference on Evolutionary Computation, April 13-16, 1997, Indianapolis, IN, USA, pages 153-157. IEEE Computational Intelligence Society. ISBN: 0-7803-3949-5. doi: https://doi.org/10.1109/ICEC.1997.592287

  2. Nikolaus Hansen, Anne Auger, Steffen Finck, Raymond Ros. Real-Parameter Black-Box Optimization Benchmarking 2010: Experimental Setup. Research Report RR-7215, INRIA. 2010. inria-00462481. https://hal.inria.fr/inria-00462481/document/

class moptipy.evaluation.ert.Ert(algorithm, instance, objective, encoding, n, time_unit, f_name, ert)[source]

Bases: MultiRun2DData

Estimate the Expected Running Time (ERT).

static create(source, f_lower_bound=None, use_default_lower_bounds=True)[source]

Create one single Ert record from an iterable of Progress records.

Parameters:
  • source (Iterable[Progress]) – the set of progress instances

  • f_lower_bound (Union[int, float, Callable, None], default: None) – the lower bound for the objective value, or a callable that is applied to a progress object to get the lower bound

  • use_default_lower_bounds (bool, default: True) – should we use the default lower bounds

Return type:

Ert

Returns:

the Ert record

ert: ndarray

The ert function

static from_progresses(source, consumer, f_lower_bound=None, use_default_lower_bounds=True, join_all_algorithms=False, join_all_instances=False, join_all_objectives=False, join_all_encodings=False)[source]

Compute one or multiple ERTs from a stream of progress data.

Parameters:
  • source (Iterable[Progress]) – the set of progress instances

  • f_lower_bound (Optional[float], default: None) – the lower bound for the objective value

  • use_default_lower_bounds (bool, default: True) – should we use the default lower bounds

  • consumer (Callable[[Ert], Any]) – the destination to which the new records will be passed, can be the append method of a list

  • join_all_algorithms (bool, default: False) – should the Ert be aggregated over all algorithms

  • join_all_instances (bool, default: False) – should the Ert be aggregated over all instances

  • join_all_objectives (bool, default: False) – should the statistics be aggregated over all objective functions?

  • join_all_encodings (bool, default: False) – should the statistics be aggregated over all encodings?

Return type:

None

to_csv(file, put_header=True)[source]

Store an Ert record in a CSV file.

Parameters:
  • file (str) – the file to generate

  • put_header (bool, default: True) – should we put a header with meta-data?

Return type:

Path

Returns:

the fully resolved file name

moptipy.evaluation.ert.compute_single_ert(source, goal_f)[source]

Compute a single ERT.

The ERT is the sum of the time that the runs spend with a best-so-far quality greater than or equal to goal_f, divided by the number of runs that reached goal_f. The idea is that the unsuccessful runs spent their complete computational budget and, once they have terminated, we would immediately start a new, independent run.

Warning: source must only contain progress objects that contain monotonically improving points. It must not contain runs that may get worse over time.

Parameters:
  • source – an iterable of progress data

  • goal_f – the goal objective value

Return type:

float

Returns:

the ERT

>>> from moptipy.evaluation.progress import Progress as Pr
>>> from numpy import array as a
>>> f = "plainF"
>>> t = "FEs"
>>> r = [Pr("a", "i", "f", "e", 1, a([1, 4, 8]), t, a([10, 8, 5]), f),
...      Pr("a", "i", "f", "e", 2, a([1, 3, 6]), t, a([9, 7, 4]), f),
...      Pr("a", "i", "f", "e", 3, a([1, 2, 7, 9]), t, a([8, 7, 6, 3]), f),
...      Pr("a", "i", "f", "e", 4, a([1, 12]), t, a([9, 3]), f)]
>>> print(compute_single_ert(r, 11))
1.0
>>> print(compute_single_ert(r, 10))
1.0
>>> print(compute_single_ert(r, 9.5))  # (4 + 1 + 1 + 1) / 4 = 1.75
1.75
>>> print(compute_single_ert(r, 9))  # (4 + 1 + 1 + 1) / 4 = 1.75
1.75
>>> print(compute_single_ert(r, 8.5))  # (4 + 3 + 1 + 12) / 4 = 5
5.0
>>> print(compute_single_ert(r, 8))  # (4 + 3 + 1 + 12) / 4 = 5
5.0
>>> print(compute_single_ert(r, 7.3))  # (8 + 3 + 2 + 12) / 4 = 6.25
6.25
>>> print(compute_single_ert(r, 7))  # (8 + 3 + 2 + 12) / 4 = 6.25
6.25
>>> print(compute_single_ert(r, 6.1))  # (8 + 6 + 7 + 12) / 4 = 8.25
8.25
>>> print(compute_single_ert(r, 6))  # (8 + 6 + 7 + 12) / 4 = 8.25
8.25
>>> print(compute_single_ert(r, 5.7))  # (8 + 6 + 9 + 12) / 4 = 8.75
8.75
>>> print(compute_single_ert(r, 5))  # (8 + 6 + 9 + 12) / 4 = 8.75
8.75
>>> print(compute_single_ert(r, 4.2))  # (8 + 6 + 9 + 12) / 3 = 11.666...
11.666666666666666
>>> print(compute_single_ert(r, 4))  # (8 + 6 + 9 + 12) / 3 = 11.666...
11.666666666666666
>>> print(compute_single_ert(r, 3.8))  # (8 + 6 + 9 + 12) / 2 = 17.5
17.5
>>> print(compute_single_ert(r, 3))  # (8 + 6 + 9 + 12) / 2 = 17.5
17.5
>>> print(compute_single_ert(r, 2.9))
inf
>>> print(compute_single_ert(r, 2))
inf

moptipy.evaluation.ertecdf module

Approximate the ECDF over the ERT to reach certain goals.

The empirical cumulative distribution function (ECDF, see ecdf) is a function that shows the fraction of runs that were successful in attaining a certain goal objective value over time. The (empirically estimated) Expected Running Time (ERT, see ert) is a function that tries to estimate how long a given algorithm setup will need (y-axis) to achieve given solution qualities (x-axis). It uses a set of runs of the algorithm on the problem to make this estimate under the assumption of independent restarts.

Now in the ERT-ECDF we combine both concepts to join several different optimization problems or problem instances into one plot. The goal becomes “solving the problem”. For each problem instance, we compute the ERT, i.e., estimate how long a given algorithm will need to reach the goal. This becomes the time axis. Over this time axis, the ERT-ECDF displays the fraction of instances that were solved.

  1. Thomas Weise, Zhize Wu, Xinlu Li, and Yan Chen. Frequency Fitness Assignment: Making Optimization Algorithms Invariant under Bijective Transformations of the Objective Function Value. IEEE Transactions on Evolutionary Computation 25(2):307-319. April 2021. Preprint available at arXiv:2001.01416v5 [cs.NE] 15 Oct 2020. http://arxiv.org/abs/2001.01416. doi: https://doi.org/10.1109/TEVC.2020.3032090
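A small, self-contained sketch (plain Python, independent of the ErtEcdf class below) of this idea: compute one ERT per instance and then, for a time budget t, report the fraction of instances whose ERT is at most t.

>>> instance_erts = [500.0, 2000.0, float("inf")]  # one ERT per instance
>>> def ert_ecdf_at(t):
...     return sum(1 for e in instance_erts if e <= t) / len(instance_erts)
>>> ert_ecdf_at(400)
0.0
>>> ert_ecdf_at(500)
0.3333333333333333
>>> ert_ecdf_at(10 ** 9)
0.6666666666666666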

class moptipy.evaluation.ertecdf.ErtEcdf(algorithm, objective, encoding, n, n_insts, time_unit, f_name, goal_f, ecdf)[source]

Bases: Ecdf

The ERT-ECDF.

time_label()[source]

Get the time axis label.

Return type:

str

Returns:

the time key

moptipy.evaluation.frequency module

Load the encounter frequencies or the set of different objective values.

This tool can load the different objective values that exist or are encountered by optimization processes. This may be useful for statistical evaluations or fitness landscape analyses.

This tool is based on code developed by Mr. Tianyu LIANG (梁天宇), MSc student at the Institute of Applied Optimization (IAO, 应用优化研究所) of the School of Artificial Intelligence and Big Data (人工智能与大数据学院) of Hefei University (合肥学院).

moptipy.evaluation.frequency.aggregate_from_logs(path, consumer, per_instance=True, per_algorithm_instance=True, report_progress=True, report_lower_bound=False, report_upper_bound=False, report_goal_f=False, report_h=True, per_instance_known=<function <lambda>>)[source]

Parse a path, aggregate all discovered objective values to a consumer.

A version of from_logs() that aggregates results on a per-instance and/or per-algorithm-instance combination. The basic process of loading the data is described in from_logs().

Parameters:
  • path (str) – the path to parse

  • consumer (Callable[[MultiRunData, Counter[int | float]], Any]) – the consumer receiving the aggregated results

  • per_instance (bool, default: True) – pass results to the consumer that are aggregated over all algorithms and setups and runs for a given instance

  • per_algorithm_instance (bool, default: True) – pass results to the consumer that are aggregated over all runs and setups for a given algorithm-instance combination

  • report_progress (bool, default: True) – see from_logs()

  • report_lower_bound (bool, default: False) – see from_logs()

  • report_upper_bound (bool, default: False) – see from_logs()

  • report_h (bool, default: True) – see from_logs()

  • report_goal_f (bool, default: False) – see from_logs()

  • per_instance_known (Callable[[str], Iterable[int | float]], default: <function <lambda>>) – see from_logs()

Return type:

None

moptipy.evaluation.frequency.from_logs(path, consumer, report_progress=True, report_lower_bound=False, report_upper_bound=False, report_goal_f=False, report_h=True, per_instance_known=<function <lambda>>)[source]

Parse a path, pass all discovered objective values per-run to a consumer.

This function parses the log files in a directory recursively. For each log file, it produces a Counter filled with all encountered objective values and their “pseudo” encounter frequencies. “pseudo” because the values returned depend very much on how the function is configured.

First, if all other parameters are set to False, the function passes a Counter to the consumer where the best encountered objective value has frequency 1 and no other data is present. If report_progress is True, then each time any objective value is encountered in the PROGRESS section, its counter is incremented by 1 (provided that a PROGRESS section is present). The best encountered objective value will have a count of at least one either way. If report_goal_f, report_lower_bound, or report_upper_bound are True, then it is ensured that the goal objective value of the optimization process, the lower bound of the objective function, or the upper bound of the objective function will have a corresponding count of at least 1 if they are present in the log files (in the SETUP section). If report_h is True, then the frequency fitness assignment H section is parsed, if present (see fea1plus1). Such a section contains tuples of objective values and encounter frequencies. These encounter frequencies are added to the counter. This means that if you set both report_progress and report_h to True, you will get frequencies that are too high. Finally, the function per_instance_known may return a set of known objective values for a given instance (based on its parameter, the instance name). Each such objective value will have a frequency of at least 1.

Generally, if we want the actual encounter frequencies of objective values, we could log all FEs to the log files and set report_progress to True and everything else to False. Then we get correct encounter frequencies. Alternatively, if we have a purely FFA-based algorithm (see, again, fea1plus1), then we can set report_h to True and everything else to False to get a similar result, but the encounter frequencies then depend on the selection scheme. Alternatively, if we only care about whether an objective value was encountered or not, we can simply set both to True. Finally, if we want to get all possible objective values, then we may also set report_goal_f, report_lower_bound, or report_upper_bound to True if we are sure that the corresponding objective values do actually exist (and are not just bounds that can never be reached).

Parameters:
  • path (str) – the path to parse

  • consumer (Callable[[PerRunData, Counter[int | float]], Any]) – the consumer receiving, for each log file, an instance of PerRunData identifying the run and a dictionary with the objective values and lower bounds of their existence or encounter frequency. Warning: The dictionary will be cleared and re-used for all files.

  • report_progress (bool, default: True) – should all values in the PROGRESS section be reported, if such section exists?

  • report_lower_bound (bool, default: False) – should the lower bound be reported, if any lower bound for the objective function is listed?

  • report_upper_bound (bool, default: False) – should the upper bound be reported, if any upper bound for the objective function is listed?

  • report_h (bool, default: True) – should all values in the H section be reported, if such section exists?

  • report_goal_f (bool, default: False) – should we report the goal objective value, if it is specified?

  • per_instance_known (Callable[[str], Iterable[int | float]], default: <function <lambda>>) – a function that returns a set of known objective values per instance

Return type:

None
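A usage sketch: because the counter passed to the consumer is cleared and re-used across files (see the warning above), we copy it before storing it (the directory name results is a placeholder):

>>> from collections import Counter
>>> from moptipy.evaluation.frequency import from_logs
>>> per_run = {}
>>> def consume(run, counter):
...     key = (run.algorithm, run.instance, run.rand_seed)
...     per_run[key] = Counter(counter)  # copy, since the dict is re-used
>>> from_logs("results", consume)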

moptipy.evaluation.frequency.number_of_objective_values_to_csv(input_dir, output_file, per_instance=True, per_algorithm_instance=True, report_lower_bound=False, report_upper_bound=False, report_goal_f=False, per_instance_known=<function <lambda>>)[source]

Print the number of unique objective values to a CSV file.

A version of aggregate_from_logs() that collects the existing objective values and prints an overview to a file.

Parameters:
  • input_dir (str) – the path to parse

  • output_file (str) – the output file to generate

  • per_instance (bool, default: True) – pass results to the consumer that are aggregated over all algorithms and setups and runs for a given instance

  • per_algorithm_instance (bool, default: True) – pass results to the consumer that are aggregated over all runs and setups for a given algorithm-instance combination

  • report_lower_bound (bool, default: False) – see from_logs()

  • report_upper_bound (bool, default: False) – see from_logs()

  • report_goal_f (bool, default: False) – see from_logs()

  • per_instance_known (Callable[[str], Iterable[int | float]], default: <function <lambda>>) – see from_logs()

Return type:

None

moptipy.evaluation.ioh_analyzer module

Convert moptipy data to IOHanalyzer data.

The IOHanalyzer (https://iohanalyzer.liacs.nl/) is a tool that can analyze the performance of iterative optimization heuristics in a wide variety of ways. It is available both for local installation as well as online for direct and free use (see, again, https://iohanalyzer.liacs.nl/). The IOHanalyzer supports many of the diagrams that our evaluation utilities provide - and several more. Here we provide the function moptipy_to_ioh_analyzer() which converts the data generated by the moptipy experimentation function run_experiment() to the format that the IOHanalyzer understands, as documented at https://iohprofiler.github.io/IOHanalyzer/data/.

Notice that we here have implemented the meta data format version “0.3.2 and below”, as described at https://iohprofiler.github.io/IOHanalyzer/data/#iohexperimenter-version-032-and-below.

  1. Carola Doerr, Furong Ye, Naama Horesh, Hao Wang, Ofer M. Shir, and Thomas Bäck. Benchmarking Discrete Optimization Heuristics with IOHprofiler. Applied Soft Computing 88(106027):1-21. March 2020. doi: https://doi.org/10.1016/j.asoc.2019.106027

  2. Carola Doerr, Hao Wang, Furong Ye, Sander van Rijn, and Thomas Bäck. IOHprofiler: A Benchmarking and Profiling Tool for Iterative Optimization Heuristics. October 15, 2018. New York, NY, USA: Cornell University, Cornell Tech. arXiv:1810.05281v1 [cs.NE] 11 Oct 2018. https://arxiv.org/pdf/1810.05281.pdf

  3. Hao Wang, Diederick Vermetten, Furong Ye, Carola Doerr, and Thomas Bäck. IOHanalyzer: Detailed Performance Analyses for Iterative Optimization Heuristics. ACM Transactions on Evolutionary Learning and Optimization 2(1)[3]:1-29. March 2022. doi: https://doi.org/10.1145/3510426.

  4. Jacob de Nobel and Furong Ye and Diederick Vermetten and Hao Wang and Carola Doerr and Thomas Bäck. IOHexperimenter: Benchmarking Platform for Iterative Optimization Heuristics. November 2021. New York, NY, USA: Cornell University, Cornell Tech. arXiv:2111.04077v2 [cs.NE] 17 Apr 2022. https://arxiv.org/pdf/2111.04077.pdf

  5. Data Format: Iterative Optimization Heuristics Profiler. https://iohprofiler.github.io/IOHanalyzer/data/

moptipy.evaluation.ioh_analyzer.moptipy_to_ioh_analyzer(results_dir, dest_dir, inst_name_to_func_id=<function __prefix>, inst_name_to_dimension=<function __int_suffix>, inst_name_to_inst_id=<function <lambda>>, suite='moptipy', f_name='plainF', f_standard=None)[source]

Convert moptipy log data to IOHanalyzer log data.

Parameters:
  • results_dir (str) – the directory where we can find the results in moptipy format

  • dest_dir (str) – the directory where we would write the IOHanalyzer style data

  • inst_name_to_func_id (Callable[[str], str], default: <function __prefix>) – convert the instance name to a function ID

  • inst_name_to_dimension (Callable[[str], int], default: <function __int_suffix>) – convert an instance name to a function dimension

  • inst_name_to_inst_id (Callable[[str], int], default: <function <lambda>>) – convert the instance name to an instance ID, which must be a positive integer

  • suite (str, default: 'moptipy') – the suite name

  • f_name (str, default: 'plainF') – the objective name

  • f_standard (Optional[dict[str, int | float]], default: None) – a dictionary mapping instances to standard values

Return type:

None
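A minimal usage sketch (the directory names are placeholders): convert a moptipy results directory into an IOHanalyzer-compatible directory that can then be loaded into or uploaded to the IOHanalyzer.

>>> from moptipy.evaluation.ioh_analyzer import moptipy_to_ioh_analyzer
>>> moptipy_to_ioh_analyzer("results", "ioh_data")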

moptipy.evaluation.log_parser module

Parsers for structured log data produced by the moptipy experiment API.

The moptipy Execution and experiment-running facility (run_experiment()) uses the class Logger from module logger to produce log files complying with https://thomasweise.github.io/moptipy/#log-files.

Here we provide a skeleton for parsing such log files in form of the class LogParser. It works similarly to SAX XML parsing in that the data is read from files and methods that consume the data are invoked. By overwriting these methods, we can do useful things with the data.

For example, in module end_results, the method from_logs() can load EndResult records from the logs, and the method from_logs() in module progress reads the whole Progress that the algorithms made over time.
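A minimal sketch of this overriding pattern (the subclass and the collected data are illustrative only; the real parsers below, such as SetupAndStateParser, work in the same way):

>>> from moptipy.evaluation.log_parser import LogParser
>>> class SectionTitleCollector(LogParser):
...     """Collect the titles of all sections of all parsed log files."""
...     def __init__(self):
...         super().__init__()
...         self.titles = []
...     def start_section(self, title):
...         self.titles.append(title)  # remember the section title
...         return False  # skip the section body, move to the next section
>>> collector = SectionTitleCollector()
>>> _ = collector.parse("results")  # parse a file or a directory recursively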

class moptipy.evaluation.log_parser.ExperimentParser[source]

Bases: LogParser

A log parser following our pre-defined experiment structure.

algorithm: str | None

The name of the algorithm to which the current log file belongs.

end_file()[source]

Finalize parsing a file.

Return type:

bool

instance: str | None

The name of the instance to which the current log file belongs.

rand_seed: int | None

The random seed of the current log file.

start_file(path)[source]

Decide whether to start parsing a file and set up meta-data.

Parameters:

path (Path) – the file path

Return type:

bool

Returns:

True if the file should be parsed, False if it should be skipped (and parse_file() should return True).

class moptipy.evaluation.log_parser.LogParser(print_begin_end=True, print_file_start=False, print_file_end=False, print_dir_start=True, print_dir_end=True)[source]

Bases: object

A log parser can parse a log file and separate the sections.

The log parser is designed to load data from text files generated by FileLogger. It can also recursively parse directories.

end_dir(path)[source]

Finish parsing a directory.

This method is called by parse_dir() after all files and sub-directories inside the directory have been processed. Its return value becomes the return value of parse_dir().

Parameters:

path (Path) – the path of the directory

Return type:

bool

Returns:

the return value to be returned by parse_dir()

end_file()[source]

End a file.

This method is invoked when we have reached the end of the current file. Its return value, True or False, will then be returned by parse_file(), which is the entry point for the file parsing process.

Return type:

bool

Returns:

the return value to be returned by parse_file()

lines(lines)[source]

Consume all the lines from a section.

This method receives the complete text of a section, where all lines are separated and put into one list lines. Each line is stripped from whitespace and comments, empty lines are omitted. If this method returns True, we will continue parsing the file and move to the next section, if any, or directly to the end of the file parsing process (and call end_file()) if there is no more section in the file.

Parameters:

lines (list[str]) – the lines to consume

Return type:

bool

Returns:

True if further parsing is necessary and the next section should be fed to start_section(), False if the parsing process can be terminated, in which case we will fast-forward to end_file()

parse(path)[source]

Parse either a directory or a file.

If path identifies a file, parse_file() is invoked and its result is returned. If path identifies a directory, then parse_dir() is invoked and its result is returned.

Parameters:

path (str) – a path identifying either a directory or a file.

Return type:

bool

Returns:

the result of the appropriate parsing routing

Raises:

ValueError – if path does not identify a directory or file

parse_dir(path)[source]

Recursively parse the given directory.

Parameters:

path (str) – the path to the directory

Return type:

bool

Returns:

True either if start_dir() returned False or end_dir() returned True, False otherwise

parse_file(path)[source]

Parse the contents of a file.

This method first calls the function start_file() to see whether the file should be parsed. If start_file() returns True, then the file is parsed. If start_file() returns False, then this method returns False directly. If the file is parsed, then start_section() will be invoked for each section (until the parsing is finished) and lines() for each section content (if requested). At the end, end_file() is invoked.

This method can either be called directly or is called by parse_dir(). In the latter case, if parse_file() returned True, the next file in the current directory will be parsed. If it returns False, then no further file in the current directory will be parsed, while other directories and/or sub-directories will still be processed.

Parameters:

path (str) – the file to parse

Return type:

bool

Returns:

the return value received from invoking end_file()

start_dir(path)[source]

Enter a directory to parse all files inside.

This method is called by parse_dir(). If it returns True, every sub-directory inside of it will be passed to start_dir() and every file will be passed to start_file(). Only if True is returned, end_dir() will be invoked and its return value will be the return value of parse_dir(). If False is returned, then parse_dir() will return immediately and return True.

Parameters:

path (Path) – the path of the directory

Return type:

bool

Returns:

True if all the files and sub-directories inside the directory should be processed, False if this directory should be skipped and parsing should continue with the next sibling directory

start_file(path)[source]

Decide whether to start parsing a file.

This method is called by parse_file(). If it returns True, then we will open and parse the file. If it returns False, then the file will not be parsed and parse_file() will return True immediately.

Parameters:

path (Path) – the file path

Return type:

bool

Returns:

True if the file should be parsed, False if it should be skipped (and parse_file() should return True).

start_section(title)[source]

Start a section.

If this method returns True, then all the lines of text of the section with the given title will be read and passed together to lines(). If this method returns False, then the section will be skipped and we fast-forward to the next section, if any, or to the call of end_file().

Parameters:

title (str) – the section title

Return type:

bool

Returns:

True if the section data should be loaded and passed to lines(), False if the section can be skipped. In that case, we will fast-forward to the next start_section().

class moptipy.evaluation.log_parser.SetupAndStateParser[source]

Bases: ExperimentParser

A log parser which loads and processes the basic data from the logs.

This parser processes the SETUP and STATE sections of a log file and stores the performance-related information in member variables.

best_f: int | float | None

the best objective function value encountered

encoding: str | None

The name of the encoding to which the current log file belongs.

end_file()[source]

Finalize parsing a file and invoke the process() method.

This method invokes the process() method to process the parsed data.

Return type:

bool

Returns:

True if parsing should be continued, False otherwise

goal_f: int | float | None

the goal objective value, if any

last_improvement_fe: int | None

the index of the objective function evaluation (FE) at which the last improvement happened

last_improvement_time_millis: int | None

the time in milliseconds at which the last improvement happened

lines(lines)[source]

Process the lines loaded from a section.

If you want to process additional sections, you should override this method. Your overridden method can then parse the data if you are in the right section. It should end with return super().lines(lines).

Parameters:

lines (list[str]) – the lines that have been loaded

Return type:

bool

Returns:

True if parsing should be continued, False otherwise

max_fes: int | None

the maximum permitted number of objective function evaluations, if any

max_time_millis: int | None

the maximum runtime limit in milliseconds, if any

needs_more_lines()[source]

Check whether we need to process more lines.

You can override this method if your parser parses additional log sections. Your overridden method should return True if sections other than STATE and SETUP still need to be parsed and return super().needs_more_lines() otherwise. (A minimal subclass sketch is given at the end of this class description.)

Return type:

bool

Returns:

True if more data needs to be processed, False otherwise

objective: str | None

The name of the objective to which the current log file belongs.

process()[source]

Process the result of the log parsing.

This function is invoked by end_file() if the end of the parsing process is reached. By now, all the data should have been loaded and it can be passed on to wherever it should be passed to.

Return type:

None

setup_section(data)[source]

Parse the data from the setup section.

Parameters:

data (dict[str, str]) – the parsed data

Return type:

None

start_file(path)[source]

Begin parsing the file identified by path.

Parameters:

path (Path) – the path identifying the file

Return type:

bool

start_section(title)[source]

Begin a section.

Parameters:

title (str) – the section title

Return type:

bool

Returns:

True if the text of the section should be processed, False otherwise

state_section(lines)[source]

Process the data of the final state section.

Parameters:

lines (list[str]) – the lines of that section

Return type:

None

total_fes: int | None

the total consumed runtime, in objective function evaluations

total_time_millis: int | None

the total consumed runtime in milliseconds
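
As a minimal sketch of the override pattern described above, the hypothetical subclass below collects the best objective value of every parsed log file in a list. It assumes that SetupAndStateParser can be constructed without arguments; the directory name "results" is only an example.

>>> from moptipy.evaluation.log_parser import SetupAndStateParser
>>> class BestFCollector(SetupAndStateParser):
...     """Collect the best objective value of every parsed log file."""
...     def __init__(self):
...         super().__init__()  # assumed to require no arguments
...         self.best_fs = []   # the collected best objective values
...     def process(self):
...         # called by end_file() once the SETUP and STATE data is parsed
...         self.best_fs.append(self.best_f)
>>> collector = BestFCollector()
>>> ok = collector.parse_dir("results")  # hypothetical directory with logs
>>> # collector.best_fs now holds one value per parsed log file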

moptipy.evaluation.plot_ecdf module

Plot a set of ECDF or ERT-ECDF objects into one figure.

The empirical cumulative distribution function (ECDF, see ecdf) is a function that shows the fraction of runs that have attained a certain goal objective value as a function of time. The combination of ERT and ECDF is discussed in ertecdf.

  1. Nikolaus Hansen, Anne Auger, Steffen Finck, Raymond Ros. Real-Parameter Black-Box Optimization Benchmarking 2010: Experimental Setup. Research Report RR-7215, INRIA. 2010. inria-00462481. https://hal.inria.fr/inria-00462481/document/

  2. Dave Andrew Douglas Tompkins and Holger H. Hoos. UBCSAT: An Implementation and Experimentation Environment for SLS Algorithms for SAT and MAX-SAT. In Revised Selected Papers from the Seventh International Conference on Theory and Applications of Satisfiability Testing (SAT’04), May 10-13, 2004, Vancouver, BC, Canada, pages 306-320. Lecture Notes in Computer Science (LNCS), volume 3542. Berlin, Germany: Springer-Verlag GmbH. ISBN: 3-540-27829-X. doi: https://doi.org/10.1007/11527695_24.

  3. Holger H. Hoos and Thomas Stützle. Evaluating Las Vegas Algorithms - Pitfalls and Remedies. In Gregory F. Cooper and Serafín Moral, editors, Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence (UAI’98), July 24-26, 1998, Madison, WI, USA, pages 238-245. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. ISBN: 1-55860-555-X.

moptipy.evaluation.plot_ecdf.plot_ecdf(ecdfs, figure, x_axis=<function AxisRanger.for_axis>, y_axis=<function AxisRanger.for_axis>, legend=True, distinct_colors_func=<function distinct_colors>, distinct_line_dashes_func=<function distinct_line_dashes>, importance_to_line_width_func=<function importance_to_line_width>, importance_to_alpha_func=<function importance_to_alpha>, importance_to_font_size_func=<function importance_to_font_size>, x_grid=True, y_grid=True, x_label=<function <lambda>>, x_label_inside=True, y_label=<function Lang.translate_func.<locals>.__tf>, y_label_inside=True, algorithm_priority=5.0, goal_priority=0.333, algorithm_sort_key=<function <lambda>>, goal_sort_key=<function <lambda>>, algorithm_namer=<function <lambda>>, color_algorithms_as_fallback_group=True)[source]

Plot a set of ECDF functions into one chart.

Parameters:
  • ecdfs (Iterable[Ecdf]) – the iterable of ECDF functions

  • figure (Axes | Figure) – the figure to plot in

  • x_axis (Union[AxisRanger, Callable[[str], AxisRanger]], default: <function AxisRanger.for_axis at 0x7fd8a5bcf1c0>) – the x_axis ranger

  • y_axis (Union[AxisRanger, Callable[[str], AxisRanger]], default: <function AxisRanger.for_axis at 0x7fd8a5bcf1c0>) – the y_axis ranger

  • legend (bool, default: True) – should we plot the legend?

  • distinct_colors_func (Callable[[int], Any], default: <function distinct_colors at 0x7fd8a4e51ab0>) – the function returning the palette

  • distinct_line_dashes_func (Callable[[int], Any], default: <function distinct_line_dashes at 0x7fd8a4e51b40>) – the function returning the line styles

  • importance_to_line_width_func (Callable[[int], float], default: <function importance_to_line_width at 0x7fd8a4e51c60>) – the function converting importance values to line widths

  • importance_to_alpha_func (Callable[[int], float], default: <function importance_to_alpha at 0x7fd8a4e51cf0>) – the function converting importance values to alphas

  • importance_to_font_size_func (Callable[[int], float], default: <function importance_to_font_size at 0x7fd8a4e51e10>) – the function converting importance values to font sizes

  • x_grid (bool, default: True) – should we have a grid along the x-axis?

  • y_grid (bool, default: True) – should we have a grid along the y-axis?

  • x_label (Union[None, str, Callable[[str], str]], default: <function <lambda> at 0x7fd8a531dc60>) – a callable returning the label for the x-axis, a label string, or None if no label should be put

  • x_label_inside (bool, default: True) – put the x-axis label inside the plot (so that it does not consume additional vertical space)

  • y_label (Union[None, str, Callable[[str], str]], default: <function Lang.translate_func.<locals>.__tf at 0x7fd8a4e6c820>) – a callable returning the label for the y-axis, a label string, or None if no label should be put

  • y_label_inside (bool, default: True) – put the y-axis label inside the plot (so that it does not consume additional horizontal space)

  • algorithm_priority (float, default: 5.0) – the style priority for algorithms

  • goal_priority (float, default: 0.333) – the style priority for goal values

  • algorithm_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a4e6d2d0>) – the name function for algorithms receives an algorithm ID and returns an algorithm name; default=identity function

  • color_algorithms_as_fallback_group (bool, default: True) – if only a single group of data was found, use algorithms as group and put them in the legend

  • algorithm_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a4e6c790>) – the sort key function for algorithms

  • goal_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a4e6d240>) – the sort key function for goals

Return type:

Axes

Returns:

the axes object to allow you to add further plot elements
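
A usage sketch: it assumes that Ecdf objects can be constructed from Progress records via Ecdf.from_progresses in the ecdf module (an assumption, not documented here) and that matplotlib provides the figure; all path names are hypothetical.

>>> from matplotlib import pyplot as plt
>>> from moptipy.evaluation.progress import Progress
>>> from moptipy.evaluation.ecdf import Ecdf  # assumed module and class
>>> from moptipy.evaluation.plot_ecdf import plot_ecdf
>>> progresses = []
>>> Progress.from_logs("results", progresses.append)  # hypothetical directory
>>> ecdfs = []
>>> Ecdf.from_progresses(progresses, ecdfs.append)  # assumed constructor call
>>> fig = plt.figure()
>>> axes = plot_ecdf(ecdfs=ecdfs, figure=fig)
>>> fig.savefig("ecdf.svg")  # hypothetical output file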

moptipy.evaluation.plot_end_results module

Violin plots for end results.

moptipy.evaluation.plot_end_results.plot_end_results(end_results, figure, dimension='scaledF', y_axis=<function AxisRanger.for_axis>, distinct_colors_func=<function distinct_colors>, importance_to_line_width_func=<function importance_to_line_width>, importance_to_font_size_func=<function importance_to_font_size>, y_grid=True, x_grid=True, x_label=<function Lang.translate>, x_label_inside=True, x_label_location=1.0, y_label=<function Lang.translate>, y_label_inside=True, y_label_location=0.5, legend_pos='best', instance_sort_key=<function <lambda>>, algorithm_sort_key=<function <lambda>>, instance_namer=<function <lambda>>, algorithm_namer=<function <lambda>>)[source]

Plot a set of end result boxes/violins functions into one chart.

In this plot, we combine two visualizations of data distributions: box plots in the foreground and violin plots in the background.

The box plots show you the median, the 25% and 75% quantiles, the 95% confidence interval around the median (as notches), the 5% and 95% quantiles (as whiskers), the arithmetic mean (as triangle), and the outliers on both ends of the spectrum. This allows you also to compare data from different distributions rather comfortably, as you can, e.g., see whether the confidence intervals overlap.

The violin plots in the background are something like smoothed-out, vertical, and mirror-symmetric histograms. They give you a better impression about shape and modality of the distribution of the results.

Parameters:
  • end_results (Iterable[EndResult]) – the iterable of end results

  • figure (Axes | Figure) – the figure to plot in

  • dimension (str, default: 'scaledF') – the dimension to display

  • y_axis (Union[AxisRanger, Callable[[str], AxisRanger]], default: <function AxisRanger.for_axis at 0x7fd8a5bcf1c0>) – the y_axis ranger

  • distinct_colors_func (Callable[[int], Any], default: <function distinct_colors at 0x7fd8a4e51ab0>) – the function returning the palette

  • importance_to_line_width_func (Callable[[int], float], default: <function importance_to_line_width at 0x7fd8a4e51c60>) – the function converting importance values to line widths

  • importance_to_font_size_func (Callable[[int], float], default: <function importance_to_font_size at 0x7fd8a4e51e10>) – the function converting importance values to font sizes

  • y_grid (bool, default: True) – should we have a grid along the y-axis?

  • x_grid (bool, default: True) – should we have a grid along the x-axis?

  • x_label (Union[None, str, Callable[[str], str]], default: <function Lang.translate at 0x7fd8a63bb5b0>) – a callable returning the label for the x-axis, a label string, or None if no label should be put

  • x_label_inside (bool, default: True) – put the x-axis label inside the plot (so that it does not consume additional vertical space)

  • x_label_location (float, default: 1.0) – the location of the x-label

  • y_label (Union[None, str, Callable[[str], str]], default: <function Lang.translate at 0x7fd8a63bb5b0>) – a callable returning the label for the y-axis, a label string, or None if no label should be put

  • y_label_inside (bool, default: True) – put the y-axis label inside the plot (so that it does not consume additional horizontal space)

  • y_label_location (float, default: 0.5) – the location of the y-label

  • legend_pos (str, default: 'best') – the legend position

  • instance_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a4e6ec20>) – the sort key function for instances

  • algorithm_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a4e6ecb0>) – the sort key function for algorithms

  • instance_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a4e6ed40>) – the name function for instances receives an instance ID and returns an instance name; default=identity function

  • algorithm_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a4e6edd0>) – the name function for algorithms receives an algorithm ID and returns an algorithm name; default=identity function

Return type:

Axes

Returns:

the axes object to allow you to add further plot elements
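
A usage sketch, assuming that EndResult.from_logs(path, consumer) collects the end results from a directory of log files as outlined in the package description; all path names are hypothetical.

>>> from matplotlib import pyplot as plt
>>> from moptipy.evaluation.end_results import EndResult
>>> from moptipy.evaluation.plot_end_results import plot_end_results
>>> results = []
>>> EndResult.from_logs("results", results.append)  # hypothetical directory
>>> fig = plt.figure()
>>> axes = plot_end_results(end_results=results, figure=fig, dimension="scaledF")
>>> fig.savefig("end_results.svg")  # hypothetical output file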

moptipy.evaluation.plot_end_statistics_over_parameter module

Plot the end results over a parameter.

moptipy.evaluation.plot_end_statistics_over_parameter.plot_end_statistics_over_param(data, figure, x_getter, y_dim='scaledF.geom', algorithm_getter=<function <lambda>>, instance_getter=<function <lambda>>, x_axis=<class 'moptipy.evaluation.axis_ranger.AxisRanger'>, y_axis=<function __make_y_axis>, legend=True, legend_pos='upper right', distinct_colors_func=<function distinct_colors>, distinct_line_dashes_func=<function distinct_line_dashes>, importance_to_line_width_func=<function importance_to_line_width>, importance_to_font_size_func=<function importance_to_font_size>, x_grid=True, y_grid=True, x_label=None, x_label_inside=True, x_label_location=0.5, y_label=<function __make_y_label>, y_label_inside=True, y_label_location=1.0, instance_priority=0.666, algorithm_priority=0.333, stat_priority=0.0, instance_sort_key=<function <lambda>>, algorithm_sort_key=<function <lambda>>, instance_namer=<function <lambda>>, algorithm_namer=<function <lambda>>, stat_sort_key=<function <lambda>>, color_algorithms_as_fallback_group=True)[source]

Plot a series of end result statistics over a parameter.

Parameters:
  • data (Iterable[EndStatistics]) – the iterable of EndStatistics

  • figure (Axes | Figure) – the figure to plot in

  • x_getter (Callable[[EndStatistics], int | float]) – the function computing the x-value for each statistics object

  • y_dim (str, default: 'scaledF.geom') – the dimension to be plotted along the y-axis

  • algorithm_getter (Callable[[EndStatistics], str | None], default: <function <lambda> at 0x7fd8a4e6fb50>) – the algorithm getter

  • instance_getter (Callable[[EndStatistics], str | None], default: <function <lambda> at 0x7fd8a4e6fbe0>) – the instance getter

  • x_axis (Union[AxisRanger, Callable[[], AxisRanger]], default: <class 'moptipy.evaluation.axis_ranger.AxisRanger'>) – the x_axis ranger

  • y_axis (Union[AxisRanger, Callable[[str], AxisRanger]], default: <function __make_y_axis at 0x7fd8a4e6fac0>) – the y_axis ranger

  • legend (bool, default: True) – should we plot the legend?

  • legend_pos (str, default: 'upper right') – the legend position

  • distinct_colors_func (Callable[[int], Any], default: <function distinct_colors at 0x7fd8a4e51ab0>) – the function returning the palette

  • distinct_line_dashes_func (Callable[[int], Any], default: <function distinct_line_dashes at 0x7fd8a4e51b40>) – the function returning the line styles

  • importance_to_line_width_func (Callable[[int], float], default: <function importance_to_line_width at 0x7fd8a4e51c60>) – the function converting importance values to line widths

  • importance_to_font_size_func (Callable[[int], float], default: <function importance_to_font_size at 0x7fd8a4e51e10>) – the function converting importance values to font sizes

  • x_grid (bool, default: True) – should we have a grid along the x-axis?

  • y_grid (bool, default: True) – should we have a grid along the y-axis?

  • x_label (Optional[str], default: None) – the label for the x-axis or None if no label should be put

  • x_label_inside (bool, default: True) – put the x-axis label inside the plot (so that it does not consume additional vertical space)

  • x_label_location (float, default: 0.5) – the location of the x-axis label

  • y_label (Union[None, str, Callable[[str], str]], default: <function __make_y_label at 0x7fd8a4e6fa30>) – a callable returning the label for the y-axis, a label string, or None if no label should be put

  • y_label_inside (bool, default: True) – put the y-axis label inside the plot (so that it does not consume additional horizontal space)

  • y_label_location (float, default: 1.0) – the location of the y-axis label

  • instance_priority (float, default: 0.666) – the style priority for instances

  • algorithm_priority (float, default: 0.333) – the style priority for algorithms

  • stat_priority (float, default: 0.0) – the style priority for statistics

  • instance_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a4e6fc70>) – the sort key function for instances

  • algorithm_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a4e6fd00>) – the sort key function for algorithms

  • instance_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a4e6fd90>) – the name function for instances receives an instance ID and returns an instance name; default=identity function

  • algorithm_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a4e6fe20>) – the name function for algorithms receives an algorithm ID and returns an algorithm name; default=identity function

  • stat_sort_key (Callable[[str], str], default: <function <lambda> at 0x7fd8a4e6feb0>) – the sort key function for statistics

  • color_algorithms_as_fallback_group (bool, default: True) – if only a single group of data was found, use algorithms as group and put them in the legend

Return type:

Axes

Returns:

the axes object to allow you to add further plot elements
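
A usage sketch under several assumptions: EndResult.from_logs and EndStatistics.from_end_results exist as outlined in the package description, EndStatistics records expose the algorithm ID as the attribute algorithm, and the algorithm IDs encode the parameter of interest as a numeric suffix (e.g., a hypothetical "rls_32"); all path names are hypothetical.

>>> from matplotlib import pyplot as plt
>>> from moptipy.evaluation.end_results import EndResult
>>> from moptipy.evaluation.end_statistics import EndStatistics  # assumed class
>>> from moptipy.evaluation.plot_end_statistics_over_parameter import (
...     plot_end_statistics_over_param)
>>> results = []
>>> EndResult.from_logs("results", results.append)  # hypothetical directory
>>> stats = []
>>> EndStatistics.from_end_results(results, stats.append)  # assumed call
>>> fig = plt.figure()
>>> axes = plot_end_statistics_over_param(
...     data=stats, figure=fig,
...     x_getter=lambda es: int(es.algorithm.split("_")[-1]))
>>> fig.savefig("over_param.svg")  # hypothetical output file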

moptipy.evaluation.plot_ert module

Plot a set of Ert objects into one figure.

The (empirically estimated) Expected Running Time (ERT, see ert) is a function that tries to give an estimate how long a given algorithm setup will need (y-axis) to achieve given solution qualities (x-axis). It uses a set of runs of the algorithm on the problem to make this estimate under the assumption of independent restarts.

  1. Kenneth V. Price. Differential Evolution vs. The Functions of the 2nd ICEO. In Russ Eberhart, Peter Angeline, Thomas Back, Zbigniew Michalewicz, and Xin Yao, editors, IEEE International Conference on Evolutionary Computation, April 13-16, 1997, Indianapolis, IN, USA, pages 153-157. IEEE Computational Intelligence Society. ISBN: 0-7803-3949-5. doi: https://doi.org/10.1109/ICEC.1997.592287

  2. Nikolaus Hansen, Anne Auger, Steffen Finck, Raymond Ros. Real-Parameter Black-Box Optimization Benchmarking 2010: Experimental Setup. Research Report RR-7215, INRIA. 2010. inria-00462481. https://hal.inria.fr/inria-00462481/document/

moptipy.evaluation.plot_ert.plot_ert(erts, figure, x_axis=<function AxisRanger.for_axis>, y_axis=<function AxisRanger.for_axis>, legend=True, distinct_colors_func=<function distinct_colors>, distinct_line_dashes_func=<function distinct_line_dashes>, importance_to_line_width_func=<function importance_to_line_width>, importance_to_alpha_func=<function importance_to_alpha>, importance_to_font_size_func=<function importance_to_font_size>, x_grid=True, y_grid=True, x_label=<function Lang.translate>, x_label_inside=True, y_label=<function Lang.translate_func.<locals>.__tf>, y_label_inside=True, instance_sort_key=<function <lambda>>, algorithm_sort_key=<function <lambda>>, instance_namer=<function <lambda>>, algorithm_namer=<function <lambda>>, instance_priority=0.666, algorithm_priority=0.333)[source]

Plot a set of Ert functions into one chart.

Parameters:
  • erts (Iterable[Ert]) – the iterable of Ert functions

  • figure (Axes | Figure) – the figure to plot in

  • x_axis (Union[AxisRanger, Callable[[str], AxisRanger]], default: <function AxisRanger.for_axis at 0x7fd8a5bcf1c0>) – the x_axis ranger

  • y_axis (Union[AxisRanger, Callable[[str], AxisRanger]], default: <function AxisRanger.for_axis at 0x7fd8a5bcf1c0>) – the y_axis ranger

  • legend (bool, default: True) – should we plot the legend?

  • distinct_colors_func (Callable[[int], Any], default: <function distinct_colors at 0x7fd8a4e51ab0>) – the function returning the palette

  • distinct_line_dashes_func (Callable[[int], Any], default: <function distinct_line_dashes at 0x7fd8a4e51b40>) – the function returning the line styles

  • importance_to_line_width_func (Callable[[int], float], default: <function importance_to_line_width at 0x7fd8a4e51c60>) – the function converting importance values to line widths

  • importance_to_alpha_func (Callable[[int], float], default: <function importance_to_alpha at 0x7fd8a4e51cf0>) – the function converting importance values to alphas

  • importance_to_font_size_func (Callable[[int], float], default: <function importance_to_font_size at 0x7fd8a4e51e10>) – the function converting importance values to font sizes

  • x_grid (bool, default: True) – should we have a grid along the x-axis?

  • y_grid (bool, default: True) – should we have a grid along the y-axis?

  • x_label (Union[None, str, Callable[[str], str]], default: <function Lang.translate at 0x7fd8a63bb5b0>) – a callable returning the label for the x-axis, a label string, or None if no label should be put

  • x_label_inside (bool, default: True) – put the x-axis label inside the plot (so that it does not consume additional vertical space)

  • y_label (Union[None, str, Callable[[str], str]], default: <function Lang.translate_func.<locals>.__tf at 0x7fd8a4ed9480>) – a callable returning the label for the y-axis, a label string, or None if no label should be put

  • y_label_inside (bool, default: True) – put the y-axis label inside the plot (so that it does not consume additional horizontal space)

  • instance_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a4ed9510>) – the sort key function for instances

  • algorithm_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a4ed95a0>) – the sort key function for algorithms

  • instance_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a4ed9630>) – the name function for instances receives an instance ID and returns an instance name; default=identity function

  • algorithm_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a4ed96c0>) – the name function for algorithms receives an algorithm ID and returns an algorithm name; default=identity function

  • instance_priority (float, default: 0.666) – the style priority for instances

  • algorithm_priority (float, default: 0.333) – the style priority for algorithms

Return type:

Axes

Returns:

the axes object to allow you to add further plot elements
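
A usage sketch under the assumption that Ert objects can be constructed from Progress records via Ert.from_progresses in the ert module (not documented here); all path names are hypothetical.

>>> from matplotlib import pyplot as plt
>>> from moptipy.evaluation.progress import Progress
>>> from moptipy.evaluation.ert import Ert  # assumed module and class
>>> from moptipy.evaluation.plot_ert import plot_ert
>>> progresses = []
>>> Progress.from_logs("results", progresses.append)  # hypothetical directory
>>> erts = []
>>> Ert.from_progresses(progresses, erts.append)  # assumed constructor call
>>> fig = plt.figure()
>>> axes = plot_ert(erts=erts, figure=fig)
>>> fig.savefig("ert.svg")  # hypothetical output file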

moptipy.evaluation.plot_progress module

Plot a set of Progress or StatRun objects into one figure.

moptipy.evaluation.plot_progress.plot_progress(progresses, figure, x_axis=<function AxisRanger.for_axis>, y_axis=<function AxisRanger.for_axis>, legend=True, distinct_colors_func=<function distinct_colors>, distinct_line_dashes_func=<function distinct_line_dashes>, importance_to_line_width_func=<function importance_to_line_width>, importance_to_alpha_func=<function importance_to_alpha>, importance_to_font_size_func=<function importance_to_font_size>, x_grid=True, y_grid=True, x_label=<function Lang.translate>, x_label_inside=True, x_label_location=0.5, y_label=<function Lang.translate>, y_label_inside=True, y_label_location=1.0, instance_priority=0.666, algorithm_priority=0.333, stat_priority=0.0, instance_sort_key=<function <lambda>>, algorithm_sort_key=<function <lambda>>, stat_sort_key=<function <lambda>>, color_algorithms_as_fallback_group=True, instance_namer=<function <lambda>>, algorithm_namer=<function <lambda>>)[source]

Plot a set of progress or statistical run lines into one chart.

Parameters:
  • progresses (Iterable[Progress | StatRun]) – the iterable of progresses and statistical runs

  • figure (Axes | Figure) – the figure to plot in

  • x_axis (Union[AxisRanger, Callable[[str], AxisRanger]], default: <function AxisRanger.for_axis at 0x7fd8a5bcf1c0>) – the x_axis ranger

  • y_axis (Union[AxisRanger, Callable[[str], AxisRanger]], default: <function AxisRanger.for_axis at 0x7fd8a5bcf1c0>) – the y_axis ranger

  • legend (bool, default: True) – should we plot the legend?

  • distinct_colors_func (Callable[[int], Any], default: <function distinct_colors at 0x7fd8a4e51ab0>) – the function returning the palette

  • distinct_line_dashes_func (Callable[[int], Any], default: <function distinct_line_dashes at 0x7fd8a4e51b40>) – the function returning the line styles

  • importance_to_line_width_func (Callable[[int], float], default: <function importance_to_line_width at 0x7fd8a4e51c60>) – the function converting importance values to line widths

  • importance_to_alpha_func (Callable[[int], float], default: <function importance_to_alpha at 0x7fd8a4e51cf0>) – the function converting importance values to alphas

  • importance_to_font_size_func (Callable[[int], float], default: <function importance_to_font_size at 0x7fd8a4e51e10>) – the function converting importance values to font sizes

  • x_grid (bool, default: True) – should we have a grid along the x-axis?

  • y_grid (bool, default: True) – should we have a grid along the y-axis?

  • x_label (Union[None, str, Callable[[str], str]], default: <function Lang.translate at 0x7fd8a63bb5b0>) – a callable returning the label for the x-axis, a label string, or None if no label should be put

  • x_label_inside (bool, default: True) – put the x-axis label inside the plot (so that it does not consume additional vertical space)

  • x_label_location (float, default: 0.5) – the location of the x-axis label

  • y_label (Union[None, str, Callable[[str], str]], default: <function Lang.translate at 0x7fd8a63bb5b0>) – a callable returning the label for the y-axis, a label string, or None if no label should be put

  • y_label_inside (bool, default: True) – put the y-axis label inside the plot (so that it does not consume additional horizontal space)

  • y_label_location (float, default: 1.0) – the location of the y-axis label

  • instance_priority (float, default: 0.666) – the style priority for instances

  • algorithm_priority (float, default: 0.333) – the style priority for algorithms

  • stat_priority (float, default: 0.0) – the style priority for statistics

  • instance_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a4eda0e0>) – the sort key function for instances

  • algorithm_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a4eda7a0>) – the sort key function for algorithms

  • stat_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a4c181f0>) – the sort key function for statistics

  • color_algorithms_as_fallback_group (bool, default: True) – if only a single group of data was found, use algorithms as group and put them in the legend

  • instance_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a4c18160>) – the name function for instances receives an instance ID and returns an instance name; default=identity function

  • algorithm_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a4c183a0>) – the name function for algorithms receives an algorithm ID and returns an algorithm name; default=identity function

Return type:

Axes

Returns:

the axes object to allow you to add further plot elements
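
A sketch that overlays the raw progress curves of the runs with their arithmetic-mean statistic run, using the Progress and StatRun classes documented below; all path names are hypothetical.

>>> from matplotlib import pyplot as plt
>>> from moptipy.evaluation.progress import Progress
>>> from moptipy.evaluation.stat_run import StatRun, STAT_MEAN_ARITH
>>> from moptipy.evaluation.plot_progress import plot_progress
>>> progresses = []
>>> Progress.from_logs("results", progresses.append)  # hypothetical directory
>>> stat_runs = []
>>> StatRun.from_progress(progresses, STAT_MEAN_ARITH, stat_runs.append)
>>> fig = plt.figure()
>>> axes = plot_progress(progresses=progresses + stat_runs, figure=fig)
>>> fig.savefig("progress.svg")  # hypothetical output file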

moptipy.evaluation.progress module

Objects embodying the progress of a run over time.

An instance of Progress holds one time vector and an objective value (f) vector. The time dimension (stored in time_unit) can either be in FEs or in milliseconds and the objective value dimension (stored in f_name) can be raw objective values, standardized objective values, or normalized objective values. The two vectors together thus describe how a run of an optimization algorithm improves the objective value over time.

class moptipy.evaluation.progress.Progress(algorithm, instance, objective, encoding, rand_seed, time, time_unit, f, f_name, f_standard=None, only_improvements=True)[source]

Bases: PerRunData

An immutable record of progress information over a single run.

f: ndarray

The objective value data.

f_name: str

the name of the objective value axis.

f_standard: int | float | None

the standard value of the objective dimension. If f_name is F_NAME_SCALED or F_NAME_NORMALIZED, then this value has been used to normalize the data.

static from_logs(path, consumer, time_unit='FEs', f_name='plainF', f_standard=None, only_improvements=True)[source]

Parse a given path and pass all progress data found to the consumer.

If path identifies a file with suffix .txt, then this file is parsed. The appropriate Progress record is created and passed to the consumer. If path identifies a directory, then this directory is parsed recursively and, for each log file found, one record is passed to the consumer. The consumer is simply a callable function. You could pass in the append method of a list.

Parameters:
  • path (str) – the path to parse

  • consumer (Callable[[Progress], Any]) – the consumer, can be the append method of a list

  • time_unit (str, default: 'FEs') – the time unit

  • f_name (str, default: 'plainF') – the objective name

  • f_standard (Optional[dict[str, int | float]], default: None) – a dictionary mapping instances to standard values

  • only_improvements (bool, default: True) – enforce that f-values should be improving and time values increasing

Return type:

None
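
For example, the consumer can simply be the append method of a list (the directory name below is hypothetical):

>>> from moptipy.evaluation.progress import Progress
>>> progresses = []
>>> Progress.from_logs("results", progresses.append,  # hypothetical directory
...                    time_unit="FEs", f_name="plainF")
>>> # progresses now holds one Progress record per parsed log file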

time: ndarray

The time axis data.

time_unit: str

The unit of the time axis.

to_csv(file, put_header=True)[source]

Store a Progress record in a CSV file.

Parameters:
  • file (str) – the file to generate

  • put_header (bool, default: True) – should we put a header with meta-data?

Return type:

str

Returns:

the fully resolved file name

moptipy.evaluation.stat_run module

Statistic runs are time-dependent statistics over several runs.

moptipy.evaluation.stat_run.STAT_MAXIMUM: Final[str] = 'max'

The statistics key for the maximum

moptipy.evaluation.stat_run.STAT_MEAN_ARITH: Final[str] = 'mean'

The statistics key for the arithmetic mean.

moptipy.evaluation.stat_run.STAT_MEAN_GEOM: Final[str] = 'geom'

The statistics key for the geometric mean.

moptipy.evaluation.stat_run.STAT_MEAN_MINUS_STDDEV: Final[str] = 'mean-sd'

The key for the arithmetic mean minus the standard deviation.

moptipy.evaluation.stat_run.STAT_MEAN_PLUS_STDDEV: Final[str] = 'mean+sd'

The key for the arithmetic mean plus the standard deviation.

moptipy.evaluation.stat_run.STAT_MEDIAN: Final[str] = 'med'

The statistics key for the median.

moptipy.evaluation.stat_run.STAT_MINIMUM: Final[str] = 'min'

The statistics key for the minimum

moptipy.evaluation.stat_run.STAT_Q10: Final[str] = 'q10'

The key for the 10% quantile.

moptipy.evaluation.stat_run.STAT_Q159: Final[str] = 'q159'

The key for the 15.9% quantile. In a normal distribution, this quantile is where “mean - standard deviation” is located.

moptipy.evaluation.stat_run.STAT_Q841: Final[str] = 'q841'

The key for the 84.1% quantile. In a normal distribution, this quantile is where “mean + standard deviation” is located.

moptipy.evaluation.stat_run.STAT_Q90: Final[str] = 'q90'

The key for the 90% quantile.

moptipy.evaluation.stat_run.STAT_STDDEV: Final[str] = 'sd'

The statistics key for the standard deviation

class moptipy.evaluation.stat_run.StatRun(algorithm, instance, objective, encoding, n, time_unit, f_name, stat_name, stat)[source]

Bases: MultiRun2DData

A time-value statistic over a set of runs.

static create(source, statistics, consumer)[source]

Compute statistics from an iterable of Progress objects.

Parameters:
  • source (Iterable[Progress]) – the iterable of Progress objects over which the statistics should be computed

  • statistics (Union[str, Iterable[str]]) – the statistics that should be computed

  • consumer (Callable[[StatRun], Any]) – the destination to which the new stat runs will be passed, can be the append method of a list

Return type:

None

static from_progress(source, statistics, consumer, join_all_algorithms=False, join_all_instances=False, join_all_objectives=False, join_all_encodings=False)[source]

Aggregate statistic runs over a stream of progress data.

Parameters:
  • source (Iterable[Progress]) – the stream of progress data

  • statistics (Union[str, Iterable[str]]) – the statistics that should be computed per group

  • consumer (Callable[[StatRun], Any]) – the destination to which the new stat runs will be passed, can be the append method of a list

  • join_all_algorithms (bool, default: False) – should the statistics be aggregated over all algorithms?

  • join_all_instances (bool, default: False) – should the statistics be aggregated over all instances?

  • join_all_objectives (bool, default: False) – should the statistics be aggregated over all objective functions?

  • join_all_encodings (bool, default: False) – should the statistics be aggregated over all encodings?

Return type:

None
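
A usage sketch that computes the arithmetic-mean and median curves, aggregated over all instances (the directory name is hypothetical):

>>> from moptipy.evaluation.progress import Progress
>>> from moptipy.evaluation.stat_run import (
...     StatRun, STAT_MEAN_ARITH, STAT_MEDIAN)
>>> progresses = []
>>> Progress.from_logs("results", progresses.append)  # hypothetical directory
>>> stat_runs = []
>>> StatRun.from_progress(progresses, [STAT_MEAN_ARITH, STAT_MEDIAN],
...                       stat_runs.append, join_all_instances=True)
>>> # stat_runs now holds one StatRun per algorithm and statistic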

stat: ndarray

The time-dependent statistic.

stat_name: str

The name of this statistic.

moptipy.evaluation.stat_run.get_statistic(obj)[source]

Get the statistic of a given object.

Parameters:

obj (PerRunData | MultiRunData) – the object

Return type:

str | None

Returns:

the statistic string, or None if no statistic is specified

moptipy.evaluation.statistics module

A simple and immutable basic statistics record.

moptipy.evaluation.statistics.CSV_COLS: Final[int] = 6

The number of CSV columns.

moptipy.evaluation.statistics.EMPTY_CSV_ROW: Final[str] = ';;;;;'

The empty csv row of statistics

moptipy.evaluation.statistics.KEY_MAXIMUM: Final[str] = 'max'

The maximum value key.

moptipy.evaluation.statistics.KEY_MEAN_ARITH: Final[str] = 'mean'

The arithmetic mean value key.

moptipy.evaluation.statistics.KEY_MEAN_GEOM: Final[str] = 'geom'

The geometric mean value key.

moptipy.evaluation.statistics.KEY_MEDIAN: Final[str] = 'med'

The median value key.

moptipy.evaluation.statistics.KEY_MINIMUM: Final[str] = 'min'

The minimum value key.

moptipy.evaluation.statistics.KEY_STDDEV: Final[str] = 'sd'

The standard deviation value key.

class moptipy.evaluation.statistics.Statistics(n, minimum, median, mean_arith, mean_geom, maximum, stddev)[source]

Bases: object

An immutable record with statistics of one quantity.

static create(source)[source]

Create a statistics object from an iterable.

Parameters:

source (Iterable[int | float]) – the source

Returns:

a statistics representing the statistics over source

Return type:

Statistics

>>> from moptipy.evaluation.statistics import Statistics
>>> s = Statistics.create([3, 1, 2, 5])
>>> print(s.minimum)
1
>>> print(s.maximum)
5
>>> print(s.mean_arith)
2.75
>>> print(s.median)
2.5
>>> print(f"{s.mean_geom:.4f}")
2.3403
>>> print(f"{s.min_mean():.4f}")
2.3403
>>> print(f"{s.max_mean()}")
2.75
static csv_col_names(prefix)[source]

Make the column names suitable for a CSV-formatted file.

Parameters:

prefix (str) – the prefix name of the columns

Returns:

the column header strings

Return type:

Iterable[str]

static from_csv(n, row)[source]

Convert a CSV string (or separate CSV cells) to a Statistics object.

Parameters:
  • n (int) – the number of observations

  • row (Union[str, Iterable[str]]) – either the single string or the iterable of separate strings

Returns:

the Statistics instance

Return type:

Statistics

static getter(dimension)[source]

Produce a function that obtains the given dimension from Statistics.

Parameters:

dimension (str) – the dimension

Return type:

Callable[[Statistics], int | float | None]

Returns:

a callable that returns the value corresponding to the dimension
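
For example, assuming that the dimension names correspond to the KEY_* constants listed above:

>>> from moptipy.evaluation.statistics import Statistics
>>> s = Statistics.create([3, 1, 2, 5])
>>> get_mean = Statistics.getter("mean")  # "mean" is KEY_MEAN_ARITH
>>> print(get_mean(s))
2.75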

max_mean()[source]

Obtain the largest of the three mean values.

Returns:

the largest of mean_arith, mean_geom, and median

Return type:

Union[int, float]

maximum: int | float

The maximum.

mean_arith: int | float

The arithmetic mean value.

mean_geom: int | float | None

The geometric mean value, if defined.

median: int | float

The median.

min_mean()[source]

Obtain the smallest of the three mean values.

Returns:

the smallest of mean_arith, mean_geom, and median

Return type:

Union[int, float]

minimum: int | float

The minimum.

stddev: int | float

The standard deviation.

to_csv()[source]

Generate a string with the data of this record in CSV format.

Returns:

the string

Return type:

str

static value_to_csv(value)[source]

Expand a single value to a CSV row.

Parameters:

value (Union[int, float]) – the value

Returns:

the CSV row.

Return type:

str

moptipy.evaluation.styler module

Styler allows you to discover groups of data and to associate styles with them.

class moptipy.evaluation.styler.Styler(key_func=<function Styler.<lambda>>, namer=<class 'str'>, none_name='None', priority=0, name_sort_function=<function Styler.<lambda>>)[source]

Bases: object

A class for determining groups of elements and styling them.

add(obj)[source]

Add an object to the style collection.

Parameters:

obj – the object

Return type:

None

add_line_style(obj, style)[source]

Apply this styler’s contents based on the given object.

Parameters:
  • obj – the object for which the style should be created

  • style (dict[str, object]) – the dictionary to which the styles should be added

Return type:

None

add_to_legend(consumer)[source]

Add this styler to the legend.

Parameters:

consumer (Callable[[Artist], Any]) – the consumer to add to

Return type:

None

count: int

The number of registered keys.

finalize()[source]

Compile the styler collection.

Return type:

None

has_none: bool

Is there a None key? Valid after compilation.

has_style: bool

Does this styler have any style associated with it?

key_func: Final[Callable]

The key function of the grouper

keys: tuple[Any, ...]

The tuple with the keys becomes valid after compilation.

name_func: Final[Callable[[Any], str]]

The name function of the grouper

names: tuple[str, ...]

The tuple with the names becomes valid after compilation.

priority: float

The base priority of this grouper

set_line_alpha(line_alpha_func)[source]

Set that this styler should apply a line alpha.

Parameters:

line_alpha_func (Callable) – the line alpha function

Return type:

None

set_line_color(line_color_func)[source]

Set that this styler should apply line colors.

Parameters:

line_color_func (Callable) – a function returning the palette

Return type:

None

set_line_dash(line_dash_func)[source]

Set that this styler should apply line dashes.

Parameters:

line_dash_func (Callable) – a function returning the dashes

Return type:

None

set_line_width(line_width_func)[source]

Set that this styler should apply a line width.

Parameters:

line_width_func (Callable) – the line width function

Return type:

None

moptipy.evaluation.tabulate_end_results module

Provides function tabulate_end_results() to tabulate end results.

moptipy.evaluation.tabulate_end_results.DEFAULT_ALGORITHM_INSTANCE_STATISTICS: Final[tuple[str, str, str, str, str, str]] = ('bestF.min', 'bestF.mean', 'bestF.sd', 'bestFscaled.mean', 'lastImprovementFE.mean', 'lastImprovementTimeMillis.mean')

the default algorithm-instance statistics

moptipy.evaluation.tabulate_end_results.DEFAULT_ALGORITHM_SUMMARY_STATISTICS: Final[tuple[str, str, str, str, str, str]] = ('bestFscaled.min', 'bestFscaled.geom', 'bestFscaled.max', 'bestFscaled.sd', 'lastImprovementFE.mean', 'lastImprovementTimeMillis.mean')

the default algorithm summary statistics

moptipy.evaluation.tabulate_end_results.command_column_namer(col, put_dollars=True, summary_name=<function <lambda>>, setup_name=<function <lambda>>)[source]

Get the names for columns, but in LaTeX command format.

This function returns LaTeX-style commands for the column headers.

Parameters:
  • col (str) – the column identifier

  • put_dollars (bool, default: True) – surround the command with $

  • summary_name (Callable[[bool], str], default: <function <lambda> at 0x7fd8a49a0160>) – the name function for the key “summary”

  • setup_name (Callable[[bool], str], default: <function <lambda> at 0x7fd8a49a1240>) – the name function for the key KEY_ALGORITHM

Return type:

str

Returns:

the column name

moptipy.evaluation.tabulate_end_results.default_column_best(col)[source]

Get a function to compute the best value in a column.

The returned function can compute the best value in a column. If no value is best, it should return nan.

Parameters:

col (str) – the column name string

Return type:

Callable[[Iterable[int | float | None]], int | float]

Returns:

a function that can compute the best value per column

moptipy.evaluation.tabulate_end_results.default_column_namer(col)[source]

Get the default name for columns.

Parameters:

col (str) – the column identifier

Return type:

str

Returns:

the column name

moptipy.evaluation.tabulate_end_results.default_number_renderer(col)[source]

Get the number renderer for the specified column.

Time columns are rendered with less precision.

Parameters:

col (str) – the column name

Return type:

NumberRenderer

Returns:

the number renderer

moptipy.evaluation.tabulate_end_results.tabulate_end_results(end_results, file_name='table', dir_name='.', algorithm_instance_statistics=('bestF.min', 'bestF.mean', 'bestF.sd', 'bestFscaled.mean', 'lastImprovementFE.mean', 'lastImprovementTimeMillis.mean'), algorithm_summary_statistics=('bestFscaled.min', 'bestFscaled.geom', 'bestFscaled.max', 'bestFscaled.sd', 'lastImprovementFE.mean', 'lastImprovementTimeMillis.mean'), text_format_driver=<function Markdown.instance>, algorithm_sort_key=<function <lambda>>, instance_sort_key=<function <lambda>>, col_namer=<function default_column_namer>, col_best=<function default_column_best>, col_renderer=<function default_number_renderer>, put_lower_bound=True, lower_bound_getter=<function __getter.<locals>.__fixed>, lower_bound_name='lower_bound', use_lang=True, instance_namer=<function <lambda>>, algorithm_namer=<function <lambda>>)[source]

Tabulate the statistics about the end results of an experiment.

A two-part table is produced. In the first part, it presents summary statistics about each instance-algorithm combination, sorted by instance. In the second part, it presents summary statistics of the algorithms over all instances. The following default columns are provided:

  1. Part 1: Algorithm-Instance statistics
    • I: the instance name

    • lb(f): the lower bound of the objective value of the instance

    • setup: the name of the algorithm or algorithm setup

    • best: the best objective value reached by any run on that instance

    • mean: the arithmetic mean of the best objective values reached over all runs

    • sd: the standard deviation of the best objective values reached over all runs

    • mean1: the arithmetic mean of the best objective values reached over all runs, divided by the lower bound (or goal objective value)

    • mean(FE/ms): the arithmetic mean of objective function evaluations performed per millisecond, over all runs

    • mean(t): the arithmetic mean of the time in milliseconds when the last improving move of a run was applied, over all runs

  2. Part 2: Algorithm Summary Statistics
    • setup: the name of the algorithm or algorithm setup

    • best1: the minimum of the best objective values reached divided by the lower bound (or goal objective value) over all runs

    • gmean1: the geometric mean of the best objective values reached divided by the lower bound (or goal objective value) over all runs

    • worst1: the maximum of the best objective values reached divided by the lower bound (or goal objective value) over all runs

    • sd1: the standard deviation of the best objective values reached divided by the lower bound (or goal objective value) over all runs

    • gmean(FE/ms): the geometric mean of objective function evaluations performed per millisecond, over all runs

    • gmean(t): the geometric mean of the time in milliseconds when the last improving move of a run was applied, over all runs

You can freely configure which columns you want for each part and whether you want to have the second part included. Also, for each group of values, the best one is marked in bold face.

Depending on the parameter text_format_driver, the tables can be rendered in different formats, such as Markdown, LaTeX, and HTML.

Parameters:
  • end_results (Iterable[EndResult]) – the end results data

  • file_name (str, default: 'table') – the base file name

  • dir_name (str, default: '.') – the base directory

  • algorithm_instance_statistics (Iterable[str], default: ('bestF.min', 'bestF.mean', 'bestF.sd', 'bestFscaled.mean', 'lastImprovementFE.mean', 'lastImprovementTimeMillis.mean')) – the statistics to print

  • algorithm_summary_statistics (Optional[Iterable[str | None]], default: ('bestFscaled.min', 'bestFscaled.geom', 'bestFscaled.max', 'bestFscaled.sd', 'lastImprovementFE.mean', 'lastImprovementTimeMillis.mean')) – the summary statistics to print per algorithm

  • text_format_driver (Union[TextFormatDriver, Callable[[], TextFormatDriver]], default: <function Markdown.instance at 0x7fd8a4c1bc70>) – the text format driver

  • algorithm_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a49a1750>) – a function returning sort keys for algorithms

  • instance_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a49a17e0>) – a function returning sort keys for instances

  • col_namer (Callable[[str], str], default: <function default_column_namer at 0x7fd8a4c1a170>) – the column namer function

  • col_best (Callable[[str], Callable[[Iterable[int | float | None]], int | float]], default: <function default_column_best at 0x7fd8a49a1510>) – the column-best getter function

  • col_renderer (Callable[[str], NumberRenderer], default: <function default_number_renderer at 0x7fd8a49a1630>) – the number renderer for the column

  • put_lower_bound (bool, default: True) – should we put the lower bound or goal objective value?

  • lower_bound_getter (Optional[Callable[[EndStatistics], int | float | None]], default: <function __getter.<locals>.__fixed at 0x7fd8a49a1870>) – the getter for the lower bound

  • lower_bound_name (str | None, default: 'lower_bound') – the name key for the lower bound to be passed to col_namer

  • use_lang (bool, default: True) – should we use the language to define the filename?

  • instance_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a49a1900>) – the name function for instances receives an instance ID and returns an instance name; default=identity function

  • algorithm_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a49a1990>) – the name function for algorithms receives an algorithm ID and returns an algorithm name; default=identity function

Return type:

Path

Returns:

the path to the file with the tabulated end results
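
A minimal usage sketch, assuming that EndResult.from_logs(path, consumer) collects the end results as outlined in the package description; all path names are hypothetical.

>>> from moptipy.evaluation.end_results import EndResult
>>> from moptipy.evaluation.tabulate_end_results import tabulate_end_results
>>> results = []
>>> EndResult.from_logs("results", results.append)  # hypothetical directory
>>> table = tabulate_end_results(results, file_name="table", dir_name="report")
>>> # table now points to the generated file (Markdown with the default driver)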

moptipy.evaluation.tabulate_result_tests module

Provides tabulate_result_tests() creating statistical comparison tables.

The function tabulate_result_tests() can compare two or more algorithms on multiple problem instances by using the Mann-Whitney U test [1-3] with the Bonferroni correction [4].

  1. Daniel F. Bauer. Constructing Confidence Sets Using Rank Statistics. In Journal of the American Statistical Association. 67(339):687-690. September 1972. doi: https://doi.org/10.1080/01621459.1972.10481279.

  2. Sidney Siegel and N. John Castellan Jr. Nonparametric Statistics for The Behavioral Sciences. 1988 In the Humanities/Social Sciences/Languages series. New York, NY, USA: McGraw-Hill. ISBN: 0-07-057357-3.

  3. Myles Hollander and Douglas Alan Wolfe. Nonparametric Statistical Methods. 1973. New York, NY, USA: John Wiley and Sons Ltd. ISBN: 047140635X.

  4. Olive Jean Dunn. Multiple Comparisons Among Means. In Journal of the American Statistical Association. 56(293):52-64. March 1961. doi: https://doi.org/10.1080/01621459.1961.10482090.

moptipy.evaluation.tabulate_result_tests.tabulate_result_tests(end_results, file_name='tests', dir_name='.', alpha=0.02, text_format_driver=<function Markdown.instance>, algorithm_sort_key=<function <lambda>>, instance_sort_key=<function <lambda>>, instance_namer=<function <lambda>>, algorithm_namer=<function <lambda>>, use_lang=False, p_renderer=<moptipy.utils.number_renderer.NumberRenderer object>, value_getter=<function <lambda>>)[source]

Tabulate the results of statistical comparisons of end result qualities.

end_results contains a sequence of EndResult records, each of which represents the result of one run of one algorithm on one instance. This function performs a two-tailed Mann-Whitney U test for each algorithm pair on each problem instance to see if the performances are statistically significantly different. The results of these tests are tabulated, together with their p-values, i.e., the probabilities that the observed differences would occur if the two algorithms actually performed the same.

If p is sufficiently small, this means that it is unlikely that the observed difference in performance of the two compared algorithms stems from randomness. But what does “sufficiently small” mean? As a parameter, this function accepts a significance threshold 0<alpha<0.5. alpha is, so to speak, the upper limit of the “probability to be wrong” that we are willing to accept when we claim something like “algorithm A is better than algorithm B”. In other words, if the table says that algorithm A is better than algorithm B, the chance that this is wrong is not more than alpha.

However, if we do many such tests, our chance to make at least one mistake grows. If we do n_tests tests, then the chance that all of them are “right” would only be (1-alpha)^n_tests, i.e., the chance to make at least one mistake would be 1-[(1-alpha)^n_tests]. Since we are going to do multiple tests, the Bonferroni correction is therefore applied and alpha’=alpha/n_tests is computed. Then, the chance to have at least one wrong result among the n_tests tests is not higher than alpha.

The test results are presented as follows: The first column of the generated table denotes the problem instances. Each of the other columns represents a pair of algorithms. In each cell, the pair is compared based on the results on the instance of the row. The cell then holds the p-value of the two-tailed Mann-Whitney U test. If the first algorithm is significantly better (at p<alpha’) than the second algorithm, then the cell is marked with <. If the first algorithm is significantly worse (at p<alpha’) than the second algorithm, then the cell is marked with >. If the observed differences are not significant (p>=alpha’), then the cell is marked with ?.

However, there could also be a situation where a statistical comparison makes no sense as no difference could reliably be detected anyway. For example, if one algorithm has a smaller median result but a larger mean result, or if the medians are the same, or if the means are the same. Regardless of what outcome a test would have, we could not really claim that any of the algorithms was better or worse. In such cases, no test is performed and - is printed instead (signified by &mdash; in the markdown format).

Finally, the bottom row sums up the numbers of <, ?, and > outcomes for each algorithm pair.

Depending on the parameter text_format_driver, the tables can be rendered in different formats, such as Markdown, LaTeX, and HTML.

Parameters:
  • end_results (Iterable[EndResult]) – the end results data

  • file_name (str, default: 'tests') – the base file name

  • dir_name (str, default: '.') – the base directory

  • alpha (float, default: 0.02) – the threshold at which the two-tailed test result is accepted.

  • text_format_driver (Union[TextFormatDriver, Callable[[], TextFormatDriver]], default: <function Markdown.instance at 0x7fd8a4c1bc70>) – the text format driver

  • algorithm_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a49a2290>) – a function returning sort keys for algorithms

  • instance_sort_key (Callable[[str], Any], default: <function <lambda> at 0x7fd8a49a2830>) – a function returning sort keys for instances

  • instance_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a49a28c0>) – the name function for instances receives an instance ID and returns an instance name; default=identity function

  • algorithm_namer (Callable[[str], str], default: <function <lambda> at 0x7fd8a49a2950>) – the name function for algorithms receives an algorithm ID and returns an algorithm name; default=identity function

  • use_lang (bool, default: False) – should we use the language to define the filename

  • p_renderer (NumberRenderer, default: <moptipy.utils.number_renderer.NumberRenderer object at 0x7fd8a512ce20>) – the renderer for all probabilities

  • value_getter (Callable[[EndResult], int | float], default: <function <lambda> at 0x7fd8a6688ee0>) – the getter for the values that should be compared. By default, the best obtained objective values are compared. However, if you let the runs continue until they reach a certain goal quality, then you may want to compare the runtimes consumed until that quality is reached. Basically, you can use any of the getters provided by moptipy.evaluation.end_results.EndResult.getter(), but you must take care that the comparison makes sense, i.e., compare qualities under fixed-budget scenarios (the default behavior) or compare runtimes under scenarios with goal qualities - but do not mix up the scenarios.

Return type:

Path

Returns:

the path to the file with the tabulated test results
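
A minimal usage sketch along the same lines (all path names are hypothetical):

>>> from moptipy.evaluation.end_results import EndResult
>>> from moptipy.evaluation.tabulate_result_tests import tabulate_result_tests
>>> results = []
>>> EndResult.from_logs("results", results.append)  # hypothetical directory
>>> table = tabulate_result_tests(results, alpha=0.02, dir_name="report")
>>> # table points to the file with the rendered test table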