FL Strategies

Federated Averaging

class flamby.strategies.FedAvg(training_dataloaders, model, loss, optimizer_class, learning_rate, num_updates, nrounds, dp_target_epsilon=None, dp_target_delta=None, dp_max_grad_norm=None, log=False, log_period=100, bits_counting_function=None, logdir='./runs', log_basename='fed_avg', seed=None)[source]

Federated Averaging Strategy class.

The Federated Averaging strategy is the simplest centralized FL strategy. Each client first trains its version of a global model locally on its data; the model states of all clients are then weighted-averaged and sent back to each client for further training.

References

https://arxiv.org/abs/1602.05629

Parameters:
  • training_dataloaders (List) – The list of training dataloaders from multiple training centers.

  • model (torch.nn.Module) – An initialized torch model.

  • loss (torch.nn.modules.loss._Loss) – The loss to minimize between the predictions of the model and the ground truth.

  • optimizer_class (torch.optim.Optimizer) – The class of the torch model optimizer to use at each step.

  • learning_rate (float) – The learning rate to be given to the optimizer_class.

  • num_updates (int) – The number of updates to do on each client at each round.

  • nrounds (int) – The number of communication rounds to do.

  • dp_target_epsilon (float) – The target epsilon for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_target_delta (float) – The target delta for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_max_grad_norm (float) – The maximum L2 norm of per-sample gradients; used to enforce differential privacy. Defaults to None.

  • log (bool, optional) – Whether or not to store logs in tensorboard. Defaults to False.

  • log_period (int, optional) – If log is True then log the loss every log_period batch updates. Defaults to 100.

  • bits_counting_function (Union[callable, None], optional) – A function making sure exchanges respect the rules; it can be obtained by decorating check_exchange_compliance in flamby.utils. Should have the signature List[Tensor] -> int. Defaults to None.

  • logdir (str, optional) – Where logs are stored. Defaults to ./runs.

  • log_basename (str, optional) – The basename of the created log file. Defaults to fed_avg.

  • seed (int, optional) – Seed to use for differential privacy. Defaults to None.

perform_round()[source]

Does a single federated averaging round. The following steps will be performed:

  • each model will be trained locally for num_updates batches.

  • the parameter updates will be collected and averaged. Averages will be weighted by the number of samples in each client.

  • the averaged updates will be used to update the local models.

run()[source]

This method performs self.nrounds rounds of averaging and returns the list of models.
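
Below is a minimal usage sketch. The synthetic dataloaders and the linear model are illustrative placeholders, not part of FLamby; only the FedAvg arguments follow the signature documented above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from flamby.strategies import FedAvg

# Two illustrative "centers" with synthetic regression data; real experiments
# would instead use the dataloaders of a FLamby dataset.
training_dataloaders = [
    DataLoader(
        TensorDataset(torch.randn(64, 10), torch.randn(64, 1)),
        batch_size=8,
        shuffle=True,
    )
    for _ in range(2)
]

model = torch.nn.Linear(10, 1)
loss = torch.nn.MSELoss()

strategy = FedAvg(
    training_dataloaders=training_dataloaders,
    model=model,
    loss=loss,
    optimizer_class=torch.optim.SGD,
    learning_rate=0.01,
    num_updates=10,
    nrounds=5,
)

# run() performs nrounds rounds of federated averaging and returns the
# list of models, one per client.
trained_models = strategy.run()
```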

FedAdam

class flamby.strategies.FedAdam(training_dataloaders, model, loss, optimizer_class, learning_rate, num_updates, nrounds, dp_target_epsilon=None, dp_target_delta=None, dp_max_grad_norm=None, seed=None, log=False, log_period=100, bits_counting_function=None, tau=0.001, server_learning_rate=0.01, beta1=0.9, beta2=0.999, logdir='./runs', log_basename='fed_adam')[source]

FedAdam Strategy class

References

https://arxiv.org/abs/2003.00295

Parameters:
  • training_dataloaders (List) – The list of training dataloaders from multiple training centers.

  • model (torch.nn.Module) – An initialized torch model.

  • loss (torch.nn.modules.loss._Loss) – The loss to minimize between the predictions of the model and the ground truth.

  • optimizer_class (torch.optim.Optimizer) – This is the client optimizer; it has to be SGD when FedAdam is used as the server optimizer. The adaptive logic sits with the server optimizer and is implemented alongside the aggregation.

  • learning_rate (float) – The learning rate to be given to the client optimizer_class.

  • num_updates (int) – The number of updates to do on each client at each round.

  • nrounds (int) – The number of communication rounds to do.

  • dp_target_epsilon (float) – The target epsilon for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_target_delta (float) – The target delta for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_max_grad_norm (float) – The maximum L2 norm of per-sample gradients; used to enforce differential privacy. Defaults to None.

  • seed (int) – Seed to use for differential privacy. Defaults to None.

  • log (bool, optional) – Whether or not to store logs in tensorboard. Defaults to False.

  • log_period (int, optional) – If log is True then log the loss every log_period batch updates. Defaults to 100.

  • bits_counting_function (callable, optional) – A function making sure exchanges respect the rules; it can be obtained by decorating check_exchange_compliance in flamby.utils. Should have the signature List[Tensor] -> int. Defaults to None.

  • tau (float, optional) – Adaptivity hyperparameter for the server-side Adam optimizer. Defaults to 1e-3.

  • server_learning_rate (float, optional) – The learning rate used by the server optimizer. Defaults to 1e-2.

  • beta1 (float, optional) – Momentum parameter, between 0 and 1. Defaults to 0.9.

  • beta2 (float, optional) – Second-moment parameter, between 0 and 1. Defaults to 0.999.

  • logdir (str, optional) – The path where to store the logs. Defaults to ./runs.

  • log_basename (str, optional) – The basename of the logs that are created. Defaults to fed_adam.

calc_aggregated_delta_weights()
perform_round()[source]

Does a single federated round. The following steps will be performed:

  • each model will be trained locally for num_updates batches.

  • the parameter updates will be collected and averaged. Averages will be weighted by the number of samples in each client.

  • the averaged updates will be processed the same way as Adam or Yogi algorithms do in a non-federated setting.

  • the averaged updates will be used to update the local models.

run()

This method performs self.nrounds rounds of averaging and returns the list of models.
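
The server-side step applied to the averaged client updates follows the Adam rule of the referenced paper (https://arxiv.org/abs/2003.00295). The sketch below is a schematic restatement of that rule with illustrative names, not FLamby's internal code.

```python
import torch

def server_adam_step(weights, delta, m, v,
                     beta1=0.9, beta2=0.999, server_lr=1e-2, tau=1e-3):
    """Schematic server-side Adam step on the averaged client update
    `delta` (one tensor per model parameter), per arXiv:2003.00295."""
    for i, (w, d) in enumerate(zip(weights, delta)):
        m[i] = beta1 * m[i] + (1 - beta1) * d        # first moment
        v[i] = beta2 * v[i] + (1 - beta2) * d ** 2   # second moment (Adam rule)
        weights[i] = w + server_lr * m[i] / (v[i].sqrt() + tau)
    return weights, m, v
```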

FedYogi

class flamby.strategies.FedYogi(training_dataloaders, model, loss, optimizer_class, learning_rate, num_updates, nrounds, dp_target_epsilon=None, dp_target_delta=None, dp_max_grad_norm=None, seed=None, log=False, log_period=100, bits_counting_function=None, tau=0.001, server_learning_rate=0.01, beta1=0.9, beta2=0.999, logdir='./runs', log_basename='fed_yogi')[source]

FedYogi Strategy class

References

https://arxiv.org/abs/2003.00295

Parameters:
  • training_dataloaders (List) – The list of training dataloaders from multiple training centers.

  • model (torch.nn.Module) – An initialized torch model.

  • loss (torch.nn.modules.loss._Loss) – The loss to minimize between the predictions of the model and the ground truth.

  • optimizer_class (torch.optim.Optimizer) – This is the client optimizer; it has to be SGD when FedYogi is used as the server optimizer. The adaptive logic sits with the server optimizer and is implemented alongside the aggregation.

  • learning_rate (float) – The learning rate to be given to the client optimizer_class.

  • num_updates (int) – The number of updates to do on each client at each round.

  • nrounds (int) – The number of communication rounds to do.

  • dp_target_epsilon (float) – The target epsilon for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_target_delta (float) – The target delta for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_max_grad_norm (float) – The maximum L2 norm of per-sample gradients; used to enforce differential privacy. Defaults to None.

  • seed (int) – Seed to use for differential privacy. Defaults to None.

  • log (bool, optional) – Whether or not to store logs in tensorboard. Defaults to False.

  • log_period (int, optional) – If log is True then log the loss every log_period batch updates. Defaults to 100.

  • bits_counting_function (callable, optional) – A function making sure exchanges respect the rules; it can be obtained by decorating check_exchange_compliance in flamby.utils. Should have the signature List[Tensor] -> int. Defaults to None.

  • tau (float, optional) – Adaptivity hyperparameter for the server-side Yogi optimizer. Defaults to 1e-3.

  • server_learning_rate (float, optional) – The learning rate used by the server optimizer. Defaults to 1e-2.

  • beta1 (float, optional) – Momentum parameter, between 0 and 1. Defaults to 0.9.

  • beta2 (float, optional) – Second-moment parameter, between 0 and 1. Defaults to 0.999.

  • logdir (str, optional) – The path where to store the logs. Defaults to ./runs.

  • log_basename (str, optional) – The basename of the logs that are created. Defaults to fed_yogi.

calc_aggregated_delta_weights()
perform_round()[source]

Does a single federated round. The following steps will be performed:

  • each model will be trained locally for num_updates batches.

  • the parameter updates will be collected and averaged. Averages will be weighted by the number of samples in each client.

  • the averaged updates will be processed the same way as Adam or Yogi algorithms do in a non-federated setting.

  • the averaged updates will be used to update the local models.

run()

This method performs self.nrounds rounds of averaging and returns the list of models.
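
FedYogi differs from FedAdam only in the server's second-moment update (https://arxiv.org/abs/2003.00295). Using the same illustrative names as the FedAdam sketch above, the replaced line would read:

```python
# Yogi second-moment rule (replaces the Adam line in the FedAdam sketch):
v[i] = v[i] - (1 - beta2) * d ** 2 * torch.sign(v[i] - d ** 2)
```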

FedAdagrad

class flamby.strategies.FedAdagrad(training_dataloaders, model, loss, optimizer_class, learning_rate, num_updates, nrounds, dp_target_epsilon=None, dp_target_delta=None, dp_max_grad_norm=None, seed=None, log=False, log_period=100, bits_counting_function=None, tau=0.001, server_learning_rate=0.01, beta1=0.9, beta2=0.999, logdir='./runs', log_basename='fed_adagrad')[source]

FedAdagrad Strategy class

References

https://arxiv.org/abs/2003.00295

Parameters:
  • training_dataloaders (List) – The list of training dataloaders from multiple training centers.

  • model (torch.nn.Module) – An initialized torch model.

  • loss (torch.nn.modules.loss._Loss) – The loss to minimize between the predictions of the model and the ground truth.

  • optimizer_class (torch.optim.Optimizer) – This is the client optimizer; it has to be SGD when FedAdagrad is used as the server optimizer. The adaptive logic sits with the server optimizer and is implemented alongside the aggregation.

  • learning_rate (float) – The learning rate to be given to the client optimizer_class.

  • num_updates (int) – The number of updates to do on each client at each round.

  • nrounds (int) – The number of communication rounds to do.

  • dp_target_epsilon (float) – The target epsilon for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_target_delta (float) – The target delta for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_max_grad_norm (float) – The maximum L2 norm of per-sample gradients; used to enforce differential privacy. Defaults to None.

  • seed (int) – Seed to use for differential privacy. Defaults to None.

  • log (bool, optional) – Whether or not to store logs in tensorboard. Defaults to False.

  • log_period (int, optional) – If log is True then log the loss every log_period batch updates. Defaults to 100.

  • bits_counting_function (callable, optional) – A function making sure exchanges respect the rules; it can be obtained by decorating check_exchange_compliance in flamby.utils. Should have the signature List[Tensor] -> int. Defaults to None.

  • tau (float, optional) – Adaptivity hyperparameter for the server-side Adagrad optimizer. Defaults to 1e-3.

  • server_learning_rate (float, optional) – The learning rate used by the server optimizer. Defaults to 1e-2.

  • beta1 (float, optional) – Momentum parameter, between 0 and 1. Defaults to 0.9.

  • beta2 (float, optional) – Second-moment parameter, between 0 and 1. Defaults to 0.999.

  • logdir (str, optional) – The path where to store the logs. Defaults to ./runs.

  • log_basename (str, optional) – The basename of the logs that are created. Defaults to fed_adagrad.

calc_aggregated_delta_weights()
perform_round()[source]

Does a single federated round. The following steps will be performed:

  • each model will be trained locally for num_updates batches.

  • the parameter updates will be collected and averaged. Averages will be weighted by the number of samples in each client.

  • the averaged updates will be processed the same way as the Adagrad algorithm does in a non-federated setting.

  • the averaged updates will be used to update the local models.

run()

This method performs self.nrounds rounds of averaging and returns the list of models.
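
FedAdagrad likewise changes only the server's second-moment accumulation (https://arxiv.org/abs/2003.00295). Using the same illustrative names as the FedAdam sketch above, the replaced line would read:

```python
# Adagrad second-moment rule (replaces the Adam line in the FedAdam sketch):
v[i] = v[i] + d ** 2
```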

Cyclic & RandomWalk

class flamby.strategies.Cyclic(training_dataloaders, model, loss, optimizer_class, learning_rate, num_updates, nrounds, dp_target_epsilon=None, dp_target_delta=None, dp_max_grad_norm=None, seed=None, log=False, log_period=100, bits_counting_function=None, deterministic_cycle=False, rng=None, log_basename='cyclic', logdir='./runs')[source]

Cyclic Weight Transfer Strategy Class.

Under the cyclic weight transfer training strategy, the model is passed from one client to the next in a cyclic manner, so that each client trains it more than once over the course of training.

References

https://pubmed.ncbi.nlm.nih.gov/29617797/

Parameters:
  • training_dataloaders (List[torch.utils.data.DataLoader]) – The list of training dataloaders from multiple training centers.

  • model (torch.nn.Module) – An initialized torch model.

  • loss (torch.nn.modules.loss._Loss) – The loss to minimize between the predictions of the model and the ground truth.

  • optimizer_class (callable torch.optim.Optimizer) – The class of the torch model optimizer to use at each step.

  • learning_rate (float) – The learning rate to be given to the optimizer_class.

  • num_updates (int) – The number of epochs to do on each client at each round.

  • nrounds (int) – The number of communication rounds to do.

  • dp_target_epsilon (float) – The target epsilon for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_target_delta (float) – The target delta for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_max_grad_norm (float) – The maximum L2 norm of per-sample gradients; used to enforce differential privacy. Defaults to None.

  • log (bool, optional) – Whether or not to store logs in tensorboard. Defaults to False.

  • log_period (int, optional) – If log is True then log the loss every log_period batch updates. Defaults to 100.

  • bits_counting_function (callable, optional) – A function making sure exchanges respect the rules; it can be obtained by decorating check_exchange_compliance in flamby.utils. Should have the signature List[Tensor] -> int. Defaults to None.

  • deterministic_cycle (bool, optional) – If True, the clients are cycled through in their original order; otherwise, they are reshuffled at the beginning of every cycle. Defaults to False.

  • rng (np.random._generator.Generator, optional) – used to reshuffle the clients. Defaults to None.

  • logdir (str, optional) – The path where to store the logs if there are some. Defaults to ./runs.

  • log_basename (str) – The basename of the created log file. Defaults to cyclic.

perform_round()[source]
run()[source]
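
The visiting order controlled by deterministic_cycle can be sketched as follows; this is an illustrative snippet, not FLamby's internal code.

```python
import numpy as np

def cycle_order(num_clients, deterministic_cycle=False, rng=None):
    """Illustrative order in which the clients are visited during one cycle:
    the original order when deterministic_cycle is True, a reshuffled order
    drawn from rng otherwise."""
    rng = rng if rng is not None else np.random.default_rng()
    order = np.arange(num_clients)
    return order if deterministic_cycle else rng.permutation(order)
```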

FedProx

class flamby.strategies.FedProx(training_dataloaders, model, loss, optimizer_class, learning_rate, num_updates, nrounds, mu, dp_target_epsilon=None, dp_target_delta=None, dp_max_grad_norm=None, seed=None, log=False, log_period=100, bits_counting_function=None, log_basename='fed_prox', logdir='./runs')[source]

FedProx Strategy class.

The FedProx strategy is a generalization and re-parametrization of FedAvg that adds a proximal term. Each client first trains its version of a global model locally with a proximal term added to the loss; the model states of all clients are then weighted-averaged and sent back to each client for further training.

References

https://arxiv.org/abs/1812.06127

Parameters:
  • training_dataloaders (List) – The list of training dataloaders from multiple training centers.

  • model (torch.nn.Module) – An initialized torch model.

  • loss (torch.nn.modules.loss._Loss) – The loss to minimize between the predictions of the model and the ground truth.

  • optimizer_class (torch.optim.Optimizer) – The class of the torch model optimizer to use at each step.

  • learning_rate (float) – The learning rate to be given to the optimizer_class.

  • num_updates (int) – The number of updates to do on each client at each round.

  • nrounds (int) – The number of communication rounds to do.

  • dp_target_epsilon (float) – The target epsilon for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_target_delta (float) – The target delta for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_max_grad_norm (float) – The maximum L2 norm of per-sample gradients; used to enforce differential privacy. Defaults to None.

  • mu (float) – The mu parameter involved in the proximal term. If mu = 0, FedProx reduces to FedAvg. It needs to be tuned; there is no default value of mu that works for all settings.

  • log (bool, optional) – Whether or not to store logs in tensorboard. Defaults to False.

  • log_period (int, optional) – If log is True then log the loss every log_period batch updates. Defaults to 100.

  • bits_counting_function (Union[callable, None], optional) – A function making sure exchanges respect the rules; it can be obtained by decorating check_exchange_compliance in flamby.utils. Should have the signature List[Tensor] -> int. Defaults to None.

  • log_basename (str, optional) – The basename of the created log file. Defaults to fed_prox.

  • logdir (str, optional) – The directory where to store the logs. Defaults to ./runs.

perform_round()

Does a single federated averaging round. The following steps will be performed:

  • each model will be trained locally for num_updates batches.

  • the parameter updates will be collected and averaged. Averages will be weighted by the number of samples in each client.

  • the averaged updates will be used to update the local models.

run()

This method performs self.nrounds rounds of averaging and returns the list of models.
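
The local objective with the proximal term can be sketched as follows; this is a schematic of the objective from the referenced paper (https://arxiv.org/abs/1812.06127) with illustrative names, not FLamby's internal code.

```python
import torch

def fedprox_local_loss(task_loss, local_params, global_params, mu):
    """Schematic FedProx local objective: the task loss plus the proximal
    term (mu / 2) * ||w - w_global||^2. With mu = 0 this reduces to the
    FedAvg local objective."""
    prox = sum(
        ((lp - gp.detach()) ** 2).sum()
        for lp, gp in zip(local_params, global_params)
    )
    return task_loss + 0.5 * mu * prox
```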

Scaffold

class flamby.strategies.Scaffold(training_dataloaders, model, loss, optimizer_class, learning_rate, num_updates, nrounds, dp_target_epsilon=None, dp_target_delta=None, dp_max_grad_norm=None, server_learning_rate=1, log=False, log_period=100, bits_counting_function=None, logdir='./runs', log_basename='scaffold')[source]

SCAFFOLD Strategy class

SCAFFOLD is a stateful algorithm which modifies the local update steps of FedAvg in order to provably correct for data heterogeneity across clients. If the data on each client is very different, their local updates via FedAvg will move in different directions. Each client therefore maintains a ‘correction’ which estimates the difference between its own updates and the global average update; this correction is added to every local update on the client. This is a more efficient implementation of SCAFFOLD whose communication and computation requirements exactly match those of FedAvg. The current implementation assumes that SGD is the local optimizer and that all clients participate in every round.

References

https://arxiv.org/abs/1910.06378

Parameters:
  • training_dataloaders (List) – The list of training dataloaders from multiple training centers.

  • model (torch.nn.Module) – An initialized torch model.

  • loss (torch.nn.modules.loss._Loss) – The loss to minimize between the predictions of the model and the ground truth.

  • optimizer_class (torch.optim.Optimizer) – The class of the torch model optimizer to use at each step. It has to be SGD.

  • learning_rate (float) – The learning rate to be given to the clients' optimizer_class.

  • num_updates (int) – The number of updates to do on each client at each round.

  • nrounds (int) – The number of communication rounds to do.

  • dp_target_epsilon (float) – The target epsilon for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_target_delta (float) – The target delta for (epsilon, delta)-differential private guarantee. Defaults to None.

  • dp_max_grad_norm (float) – The maximum L2 norm of per-sample gradients; used to enforce differential privacy. Defaults to None.

  • server_learning_rate (float) – The learning rate with which the server’s updates are aggregated. Defaults to 1.

  • log (bool) – Whether or not to store logs in tensorboard. Defaults to False.

  • log_period (int) – If log is True then log the loss every log_period batch updates. Defaults to 100.

  • bits_counting_function (Union[callable, None]) – A function making sure exchanges respect the rules; it can be obtained by decorating check_exchange_compliance in flamby.utils. Should have the signature List[Tensor] -> int. Defaults to None.

  • logdir (str) – Where to store the logs. Defaults to ./runs.

  • log_basename (str) – The basename of the created logfile. Defaults to scaffold.

perform_round()[source]

Does a single federated averaging round. The following steps will be performed:

  • each model will be trained locally for num_updates batches.

  • the parameter updates will be collected and averaged. Averages will be weighted by the number of samples in each client.

  • the averaged updates will be used to update the local models.

run()

This method performs self.nrounds rounds of averaging and returns the list of models.
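
The corrected local step described above can be sketched as follows; this is a schematic of the control-variate update from the referenced paper (https://arxiv.org/abs/1910.06378) with illustrative names, not FLamby's internal code.

```python
import torch

def scaffold_local_step(params, grads, c_client, c_server, lr):
    """Schematic SCAFFOLD-corrected SGD step: the difference between the
    server and client control variates is added to each gradient before
    stepping, i.e. w <- w - lr * (g - c_i + c)."""
    with torch.no_grad():
        for p, g, ci, c in zip(params, grads, c_client, c_server):
            p -= lr * (g - ci + c)
```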