adastop.MultipleAgentsComparator#

class adastop.MultipleAgentsComparator(n=5, K=5, B=10000, comparisons=None, alpha=0.01, seed=None)[source]#

Bases: object

Compare agents sequentially, with possible early stopping. At most n times K fits are done per agent.

For now, only a two-sided test is implemented.

Parameters:
n: int, or array of ints of size self.n_agents, default=5

If int, number of fits before each early stopping check. If array of int, a different number of fits is used for each agent.

K: int, default=5

number of interim checks.

B: int, default=10000

Number of random permutations used to approximate permutation distribution.

comparisons: list of tuple of indices or None

if None, all pairwise comparisons are done. If set to [(0, 1), (0, 2)] for instance, compare only agent 0 vs agent 1 and agent 0 vs agent 2.

alpha: float, default=0.01

level of the test

seed: int or None, default=None

seed of the random number generator used for the permutations.
Attributes:
agent_names: list of str

list of the agents’ names.

decision: dict

decision of the tests for each comparison, keys are the comparisons and values are in {“equal”, “larger”, “smaller”}.

n_iters: dict

number of iterations (i.e. number of fits) used for each agent. Keys are the agents’ names and values are ints.

Methods

compute_mean_diffs(k, Z)

Compute the absolute value of the sum differences.

get_results()

Returns a dataframe with the results of the tests.

partial_compare(eval_values[, verbose])

Perform the test at the k-th interim.

plot_results([agent_names, axes])

Visual representation of results.

plot_results_sota([agent_names, axes])

Visual representation of results when the first agent is compared to all the others.

Examples

Adastop can be used with the following code, which is compatible with essentially any training framework:

>>> comparator = MultipleAgentsComparator(n=6, K=6, B=10000, alpha=0.05)
>>>
>>> eval_values = {agent.name: [] for agent in agents}
>>>
>>> for k in range(comparator.K):
>>>    for i, agent in enumerate(agents):
>>>        # If the agent is still in one of the comparisons considered, generate new evaluations.
>>>        if i in comparator.current_comparisons.ravel():
>>>            eval_values[agent.name] += train_evaluate(agent, n)
>>>    comparator.partial_compare(eval_values, verbose=True)
>>>    decisions = comparator.decisions  # results of the decisions for step k
>>>    if comparator.is_finished:
>>>        break

Where train_evaluate(agent, n) is a user-supplied function that trains n copies of agent and returns n evaluation values.
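The example above assumes such a helper exists. A minimal sketch of its expected signature and return value, with a dummy agent class and random scores standing in for real training (DummyAgent, mean_score, and the noise model are all hypothetical, not part of adastop):

```python
import numpy as np

class DummyAgent:
    """Stand-in for a real agent; attribute names here are assumptions."""
    def __init__(self, name, mean_score):
        self.name = name
        self.mean_score = mean_score

def train_evaluate(agent, n, seed=0):
    # Placeholder "training": draw n noisy scores around a per-agent mean.
    # A real implementation would train n independent copies of the agent
    # and evaluate each trained copy on the target task.
    rng = np.random.default_rng(seed)
    return list(rng.normal(loc=agent.mean_score, scale=1.0, size=n))

agents = [DummyAgent("PPO", 10.0), DummyAgent("SAC", 12.0)]
scores = train_evaluate(agents[0], n=6)  # list of 6 evaluation values
```

The only contract that matters to the comparator is that each call returns n scalar evaluation values for one agent.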

compute_mean_diffs(k, Z)[source]#

Compute the absolute value of the sum differences.
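As an illustration only (not the library's internals), the kind of statistic being permuted can be sketched with NumPy: pool the evaluations of two agents, and compare the observed absolute difference of sums against B random permutations of the pooled data:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy evaluation values for two agents, 6 fits each (made-up data).
x = np.array([1.0, 1.2, 0.9, 1.1, 1.3, 1.0])   # agent A
y = np.array([1.8, 2.1, 1.9, 2.2, 2.0, 1.7])   # agent B
pooled = np.concatenate([x, y])

# Observed statistic: absolute difference of the summed evaluations.
observed = abs(x.sum() - y.sum())

# Approximate the permutation distribution with B random relabelings.
B = 10000
perm_stats = np.empty(B)
for b in range(B):
    perm = rng.permutation(pooled)
    perm_stats[b] = abs(perm[:len(x)].sum() - perm[len(x):].sum())

# Fraction of permuted statistics at least as extreme as the observed one.
p_value = (perm_stats >= observed).mean()
```

With well-separated groups as above, almost no permutation reaches the observed statistic, so the p-value is small.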

get_results()[source]#

Returns a dataframe with the results of the tests.

partial_compare(eval_values, verbose=True)[source]#

Perform the test at the k-th interim.

Parameters:
eval_values: dict of agents and evaluations

keys are agent names and values are the concatenation of evaluations up to interim k, e.g. {"PPO": [1, 1, 1, 1, 1], "SAC": [42, 42, 42, 42, 42]}

verbose: bool

if True, print information at each step.

Returns:
decisions: dict with comparisons as keys and values str in {“equal”, “larger”, “smaller”, “continue”}

Decision of the test at this step.

is_finished: bool

Whether the test is finished or not.

T: float

Test statistic.

bk: float

Threshold of the test at this interim.
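The decision logic implied by these return values can be sketched as follows. This is a simplification, not the library's implementation: the real thresholds bk come from the permutation distribution, whereas here they are plain arguments:

```python
def interim_decision(T, bk, k, K):
    """Sketch of one two-sided interim decision for a single comparison.

    T  : signed test statistic at interim k (hypothetical value)
    bk : rejection threshold at interim k (hypothetical value)
    k  : current interim, 0-indexed; K : total number of interims
    """
    if abs(T) > bk:
        # Statistic exceeds the threshold: the agents differ.
        return "larger" if T > 0 else "smaller"
    if k == K - 1:
        return "equal"      # last interim reached without rejection
    return "continue"       # collect n more fits and test again
```

Early stopping happens as soon as any interim returns something other than "continue" for every remaining comparison.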

plot_results(agent_names=None, axes=None)[source]#

Visual representation of results.

Parameters:
agent_names: list of str or None
axes: tuple of two matplotlib axes or None

if None, use the following: fig, (ax1, ax2) = plt.subplots(2, 1, gridspec_kw={"height_ratios": [1, 2]}, figsize=(6, 5))

plot_results_sota(agent_names=None, axes=None)[source]#

Visual representation of results when the first agent is compared to all the others.

Parameters:
agent_names: list of str or None
axes: tuple of two matplotlib axes or None

if None, use the following: fig, (ax1, ax2) = plt.subplots(2, 1, gridspec_kw={"height_ratios": [1, 2]}, figsize=(6, 5))