0.22.0
Substra 0.22.0 - 2022-10-20
Main changes
- BREAKING CHANGE: the backend type is now set in the `Client`, the env variable `DEBUG_SPAWNER` is not used anymore. Default value is deployed.

  before:

  ```
  export DEBUG_SPAWNER=subprocess
  client = substra.Client(debug=True)
  ```

  after:

  ```python
  client = substra.Client(backend_type=substra.BackendType.LOCAL_SUBPROCESS)
  ```
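  For the Docker-based local mode previously selected through `DEBUG_SPAWNER`, the same parameter is used; a hedged sketch, assuming the sibling `LOCAL_DOCKER` member of the same enum:

  ```python
  import substra

  # Assumed sibling of LOCAL_SUBPROCESS on substra.BackendType: runs local
  # tasks inside Docker containers instead of plain subprocesses.
  client = substra.Client(backend_type=substra.BackendType.LOCAL_DOCKER)
  ```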
- BREAKING CHANGE: the `schemas.ComputePlanSpec.clean_models` property is now removed, the `transient` property on task outputs should be used instead.
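  A minimal sketch of marking an output as transient, assuming the `ComputeTaskOutputSpec` and `Permissions` schemas exposed by the SDK (import path and field names may differ slightly):

  ```python
  from substra.sdk import schemas

  # Assumed schema names: a task output flagged as transient so the backend
  # can delete the produced model once all downstream tasks have consumed it,
  # replacing the former ComputePlanSpec.clean_models flag.
  model_output = schemas.ComputeTaskOutputSpec(
      permissions=schemas.Permissions(public=False, authorized_ids=[]),
      transient=True,
  )
  ```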
- BREAKING CHANGE: the `Model.category` field has been removed.
- BREAKING CHANGE: the `train` and `predict` methods of all substrafl algos now take `datasamples` as argument instead of `X` and `y`. This only impacts user code that overrides those methods instead of using the `_local_train` and `_local_predict` methods.
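  A hedged sketch of what this means for code that overrides these methods directly (other parameters are omitted and may differ):

  ```python
  # Before: the opener output arrived as two separate arguments.
  def train(self, x, y):  # other parameters omitted
      ...

  # After: the opener output arrives as a single ``datasamples`` argument.
  def train(self, datasamples):  # other parameters omitted
      ...
  ```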
- BREAKING CHANGE: the result of the opener's `get_data` method is now automatically provided to the given dataset as a single `__init__` argument, instead of `x` and `y`, within the `train` and `predict` methods of all `TorchAlgo` classes. The user dataset should be adapted accordingly:
  ```python
  from torch.utils.data import Dataset

  class MyDataset(Dataset):
      def __init__(self, x, y, is_inference=False) -> None:
          ...

  class MyAlgo(TorchFedAvgAlgo):
      def __init__(
          self,
      ):
          torch.manual_seed(seed)
          super().__init__(
              model=my_model,
              criterion=criterion,
              optimizer=optimizer,
              index_generator=index_generator,
              dataset=MyDataset,
          )
  ```
  should be replaced with:
  ```python
  from torch.utils.data import Dataset

  class MyDataset(Dataset):
      def __init__(self, datasamples, is_inference=False) -> None:
          ...

  class MyAlgo(TorchFedAvgAlgo):
      def __init__(
          self,
      ):
          torch.manual_seed(seed)
          super().__init__(
              model=my_model,
              criterion=criterion,
              optimizer=optimizer,
              index_generator=index_generator,
              dataset=MyDataset,
          )
  ```
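  A hedged sketch of how the new dataset might unpack the opener output; the `datasamples` key names used here ("images", "labels") are illustrative only:

  ```python
  import torch
  from torch.utils.data import Dataset

  class MyDataset(Dataset):
      def __init__(self, datasamples, is_inference=False):
          # "images"/"labels" are hypothetical keys: use whatever structure
          # your opener's get_data method returns.
          self.x = datasamples["images"]
          self.y = datasamples["labels"]
          self.is_inference = is_inference

      def __getitem__(self, idx):
          x = torch.as_tensor(self.x[idx], dtype=torch.float32)
          if self.is_inference:
              return x
          return x, torch.as_tensor(self.y[idx], dtype=torch.float32)

      def __len__(self):
          return len(self.x)
  ```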
- BREAKING CHANGE: `Algo.category`: do not rely on categories anymore, all algo categories will be returned as `UNKNOWN`.
- BREAKING CHANGE: replaced `algo` with `algo_key` in `ComputeTask`.
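  A hedged sketch of resolving the algo from a task after this change, where `task` is a previously retrieved compute task and `client.get_algo` is assumed to be available to look the key up:

  ```python
  # The task no longer embeds the full algo object, only its key.
  algo = client.get_algo(task.algo_key)  # instead of reading task.algo
  print(algo.name)
  ```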
GUI
- Improved user management: the last admin cannot be deleted anymore.
Substra
- Algo categories are not checked anymore in local mode. Validations based on inputs and outputs are sufficient.
- Pass substra-tools arguments via a file instead of the command line. This fixes an issue where a compute plan would not run if there were too many data samples.
Substrafl
- NOTABLE CHANGES due to breaking changes in substra-tools:
  - The opener only exposes `get_data` and `fake_data` methods.
  - The result of the above methods is passed under the `datasamples` key within the `inputs` dict arg of all tools methods (`train`, `predict`, `aggregate`, `score`).
  - All methods (`train`, `predict`, `aggregate`, `score`) now take a `task_properties` argument (`dict`) in addition to `inputs` and `outputs`.
  - The rank of a task, previously passed under the `rank` key within the inputs, is now given in the `task_properties` dict under the `rank` key.
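A hedged sketch of the resulting method signature (shown for `train`; `predict`, `aggregate` and `score` follow the same pattern, class and registration details are omitted):

```python
def train(self, inputs, outputs, task_properties):
    datasamples = inputs["datasamples"]  # opener output, new location
    rank = task_properties["rank"]       # rank moved out of ``inputs``
    ...
```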
This means that all `opener.py` files should be changed from:
```python
import substratools as tools

class TestOpener(tools.Opener):
    def get_X(self, folders):
        ...

    def get_y(self, folders):
        ...

    def fake_X(self, n_samples=None):
        ...

    def fake_y(self, n_samples=None):
        ...
```
to:
```python
import substratools as tools

class TestOpener(tools.Opener):
    def get_data(self, folders):
        ...

    def fake_data(self, n_samples=None):
        ...
```
This also implies that metrics now have access to the result of `get_data` and not only `get_y` as previously. Users should adapt all of their metrics files accordingly, e.g.:
```python
import numpy as np
import substratools as tools

class AUC(tools.Metrics):
    def score(self, inputs, outputs):
        """AUC"""
        y_true = inputs["y"]
        ...

    def get_predictions(self, path):
        return np.load(path)

if __name__ == "__main__":
    tools.metrics.execute(AUC())
```
could be replaced with:
```python
import numpy as np
import substratools as tools

class AUC(tools.Metrics):
    def score(self, inputs, outputs, task_properties):
        """AUC"""
        datasamples = inputs["datasamples"]
        y_true = ...  # getting the target from the whole datasamples

    def get_predictions(self, path):
        return np.load(path)

if __name__ == "__main__":
    tools.metrics.execute(AUC())
```