Inter-cluster Model Migration¶

The model migration function allows a model file to be copied to a cluster with a different party id and remain usable there. Model migration is required in the following two scenarios.

The cluster of any of the participants that generated the model is redeployed, and the party id of the cluster changes after redeployment, e.g. the source participants arbiter-10000#guest-9999#host-10000 become arbiter-10000#guest-99#host-10000.
One or more participants copy the model file from the source cluster to a target cluster, where the model needs to be used.

Basics.
1. In both scenarios above, the participant party_id of the model changes, for example arbiter-10000#guest-9999#host-10000 -> arbiter-10000#guest-99#host-10000, or arbiter-10000#guest-9999#host-10000 -> arbiter-100#guest-99#host-100.
2. Because the participant party_id changes, the model_id and the contents of the model file that involve party_id need to be changed.
3. The overall process has three steps: copy and transfer the original model file, execute the model migration task on the original model file, and import the new model generated by the model migration task.
4. Executing the model migration task on the original model file actually makes a temporary copy of the original model file at execution time, then modifies the model_id and the contents of the model file that involve party_id according to the configuration, so that the model fits the new participant party_id.
5. All the above steps need to be performed on all new participants, even if the party_id of one of the target participants has not changed.
6. The new participant cluster version needs to be greater than or equal to 1.5.1.
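
As a rough illustration of point 2, the sketch below rebuilds a model_id from a role to party_id mapping. It assumes the naming pattern visible in the examples in this document (role-party_id pairs sorted by role name, joined with #, followed by #model). build_model_id is a hypothetical helper, not a FATE Flow function, and joining several party ids of one role with _ is an assumption.

```python
from typing import Dict, List


def build_model_id(role: Dict[str, List[int]]) -> str:
    """Build a model_id string from a role -> party_id list mapping.

    Assumption: the format follows the examples in this document,
    e.g. arbiter-10000#guest-9999#host-10000#model.
    """
    parts = []
    for name in sorted(role):
        # Assumption: several party ids of one role are joined with "_".
        ids = "_".join(str(party_id) for party_id in role[name])
        parts.append(f"{name}-{ids}")
    return "#".join(parts + ["model"])


source_role = {"guest": [9999], "arbiter": [10000], "host": [10000]}
target_role = {"guest": [99], "arbiter": [100], "host": [100]}

old_model_id = build_model_id(source_role)
new_model_id = build_model_id(target_role)
print(old_model_id, "->", new_model_id)
```

The migration task performs this renaming itself; the helper only shows why a change of party_id forces a change of model_id.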

The migration process is as follows.

Transfer the model file¶

Package the model files (including the directory named by model id) generated on the machine where the source participant's fate flow service runs, and transfer them to the machine where the target participant's fate flow service runs. Place the model files in the following fixed directory.

$FATE_PROJECT_BASE/model_local_cache

Instructions:
1. Transferring the folder is sufficient. If you compress and pack the files for the transfer, extract the model files into the model directory afterwards.
2. Transfer the model files one by one, for each source participant.
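
If you choose to pack the files, the two halves of the transfer could look roughly like the sketch below. pack_model_dir and unpack_model_dir are hypothetical helpers built on the Python standard library; the directory names passed to them are whatever exists under model_local_cache on your source machine, and copying the archive between machines (for example with scp) is left to you.

```python
import shutil
from pathlib import Path


def pack_model_dir(model_cache: Path, dir_name: str, out_dir: Path) -> Path:
    """Zip one model directory found under model_cache into out_dir."""
    out_dir = out_dir.resolve()
    out_dir.mkdir(parents=True, exist_ok=True)
    archive = shutil.make_archive(
        str(out_dir / dir_name),              # archive path without the .zip suffix
        "zip",
        root_dir=str(model_cache.resolve()),  # paths in the archive are relative to this
        base_dir=dir_name,                    # keep the model directory itself
    )
    return Path(archive)


def unpack_model_dir(archive: Path, target_cache: Path) -> None:
    """Extract the archive so the model directory sits under target_cache."""
    target_cache.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive), str(target_cache), "zip")
```

On the source machine, model_cache would be $FATE_PROJECT_BASE/model_local_cache; on the target machine, target_cache would be the same directory of the target deployment.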

Preparation work before migration¶

Instructions¶

Refer to the fate flow client documentation to install fate-client, the client that supports model migration. Only fate 1.5.1 and above are supported.

Execute the migration task¶

Description¶

The migration task modifies the model_id, the model_version and the contents of the source model file that involve role and party_id, according to the migration task configuration file.
The cluster submitting the task must have completed the migration preparation above.

1. Modify the configuration file¶

Modify the configuration file of the migration task on the new participant (machine) according to your actual situation. The following is an example migration task configuration file, migrate_model.json.

{
  "job_parameters": {
    "federated_mode": "SINGLE"
  },
  "role": {
    "guest": [9999],
    "arbiter": [10000],
    "host": [10000]
  },
  "migrate_initiator": {
    "role": "guest",
    "party_id": 99
  },
  "migrate_role": {
    "guest": [99],
    "arbiter": [100],
    "host": [100]
  },
  "execute_party": {
    "guest": [9999],
    "arbiter": [10000],
    "host": [10000]
  },
  "model_id": "arbiter-10000#guest-9999#host-10000#model",
  "model_version": "202006171904247702041",
  "unify_model_version": "202901_0001"
}

Save the above configuration to a location on the server and modify it there.

The following are explanatory notes for the parameters in this configuration.

job_parameters: The federated_mode in this parameter has two possible values, MULTIPLE and SINGLE. If set to SINGLE, the migration job is executed only in the party that submitted it, so the job needs to be submitted in all new participants separately. If set to MULTIPLE, the job is distributed to the participants specified in execute_party for execution, and only needs to be submitted in the new participant acting as migrate_initiator.
role: The roles of the participants that generated the original model and their corresponding party_id values.
migrate_initiator: The task initiator of the migrated model. Specify the initiator's role and party_id.
migrate_role: The roles and party_id values of the migrated model.
execute_party: The roles and party_id values of the parties that need to execute the migration, given as source cluster party_id values.
model_id: The model_id of the original model to be migrated.
model_version: The model_version of the original model to be migrated.
unify_model_version: Optional. The model_version of the new model. If it is not provided, the new model takes the job_id of the migration job as its model_version.
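
Before submitting, it can help to sanity-check the configuration against the parameter descriptions above. The sketch below is one possible check; check_migrate_config is a hypothetical helper, and its rules are assumptions derived from this document rather than the validation FATE Flow itself performs.

```python
from typing import Any, Dict, List

# unify_model_version is optional, so it is not listed here.
REQUIRED_KEYS = (
    "job_parameters", "role", "migrate_initiator",
    "migrate_role", "execute_party", "model_id", "model_version",
)


def check_migrate_config(conf: Dict[str, Any]) -> List[str]:
    """Return the problems found; an empty list means none were found."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in conf]
    if problems:
        return problems

    mode = conf["job_parameters"].get("federated_mode")
    if mode not in ("SINGLE", "MULTIPLE"):
        problems.append(f"federated_mode must be SINGLE or MULTIPLE, got {mode!r}")

    # Assumption: the new initiator must be one of the new participants.
    initiator = conf["migrate_initiator"]
    new_ids = conf["migrate_role"].get(initiator.get("role"), [])
    if initiator.get("party_id") not in new_ids:
        problems.append("migrate_initiator is not listed in migrate_role")

    # Assumption: every executing party is one of the source parties in role.
    for name, party_ids in conf["execute_party"].items():
        for party_id in party_ids:
            if party_id not in conf["role"].get(name, []):
                problems.append(f"execute_party {name}-{party_id} is not in role")
    return problems


example = {
    "job_parameters": {"federated_mode": "SINGLE"},
    "role": {"guest": [9999], "arbiter": [10000], "host": [10000]},
    "migrate_initiator": {"role": "guest", "party_id": 99},
    "migrate_role": {"guest": [99], "arbiter": [100], "host": [100]},
    "execute_party": {"guest": [9999], "arbiter": [10000], "host": [10000]},
    "model_id": "arbiter-10000#guest-9999#host-10000#model",
    "model_version": "202006171904247702041",
}
print(check_migrate_config(example))
```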

The example configuration above means the following.
1. The source model has participants guest: 9999, host: 10000, arbiter: 10000. The model is migrated to participants guest: 99, host: 100, arbiter: 100, with guest: 99 as the new initiator.
2. federated_mode: SINGLE means that each migration task is executed only in the cluster where it is submitted, so the task needs to be submitted in 99 and 100 respectively.
3. For example, if the task is executed in 99, execute_party is configured as "guest": [9999].
4. For example, if the task is executed in 100, execute_party is configured as "arbiter": [10000], "host": [10000].
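
The per-cluster execute_party values in points 3 and 4 can be derived mechanically, as in the sketch below. execute_party_for is a hypothetical helper, and it assumes that the party ids in role and migrate_role correspond position by position within each role.

```python
from typing import Dict, List


def execute_party_for(
    role: Dict[str, List[int]],
    migrate_role: Dict[str, List[int]],
    submitting_party_id: int,
) -> Dict[str, List[int]]:
    """Pick the source parties whose new party_id is the submitting cluster."""
    result: Dict[str, List[int]] = {}
    for name, new_ids in migrate_role.items():
        # Assumption: old and new party ids line up by position.
        picked = [
            old for old, new in zip(role[name], new_ids)
            if new == submitting_party_id
        ]
        if picked:
            result[name] = picked
    return result


role = {"guest": [9999], "arbiter": [10000], "host": [10000]}
migrate_role = {"guest": [99], "arbiter": [100], "host": [100]}

print(execute_party_for(role, migrate_role, 99))
print(execute_party_for(role, migrate_role, 100))
```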

2. Submit the migration task (separate operation in all target clusters)¶

The migration task needs to be submitted using FATE Flow CLI v2. A sample command is as follows.

flow model migrate -c $FATE_FLOW_BASE/examples/model/migrate_model.json
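
When the submission is scripted, the same command can be driven from Python, for instance as sketched below. The helper names are hypothetical, and the sketch assumes that the flow executable is on PATH and prints its result as JSON on standard output.

```python
import json
import subprocess
from pathlib import Path
from typing import Any, Dict, List


def write_config(conf: Dict[str, Any], path: Path) -> Path:
    """Write the migration configuration as JSON and return its path."""
    path.write_text(json.dumps(conf, indent=2))
    return path


def migrate_command(config_path: Path) -> List[str]:
    """Build the argument list for the command shown above."""
    return ["flow", "model", "migrate", "-c", str(config_path)]


def submit_migration(config_path: Path) -> Dict[str, Any]:
    """Run the CLI and parse its output.

    Assumption: `flow` is installed and prints a JSON document on stdout.
    """
    done = subprocess.run(
        migrate_command(config_path),
        capture_output=True, text=True, check=True,
    )
    return json.loads(done.stdout)
```

With federated_mode set to SINGLE, submit_migration would be called once on each target cluster, each time with that cluster's own configuration file.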

3. Task execution results¶

The following is the content of the configuration file for the actual migration task.

{
  "job_parameters": {
    "federated_mode": "SINGLE"
  },
  "role": {
    "guest": [9999],
    "host": [10000]
  },
  "migrate_initiator": {
    "role": "guest",
    "party_id": 99
  },
  "migrate_role": {
    "guest": [99],
    "host": [100]
  },
  "execute_party": {
    "guest": [9999],
    "host": [10000]
  },
  "model_id": "guest-9999#host-10000#model",
  "model_version": "202010291539339602784",
  "unify_model_version": "fate_migration"
}

This task takes the model generated by the clusters with party_id 9999 (guest) and 10000 (host), with model_id guest-9999#host-10000#model and model_version 202010291539339602784, and migrates it into a new model adapted to the clusters with party_id 99 (guest) and 100 (host).

The following is the return result of the successful migration.

{
    "data": {
        "detail": {
            "guest": {
                "9999": {
                    "retcode": 0,
                    "retmsg": "Migrating model successfully. the configuration of model has been modified automatically. new model id is: guest-99#host-100#model, Model files can be found at '/data/projects/fate/temp/fate_flow/guest#99#guest-99#host-100#model_fate_migration.zip'."
                }
            },
            "host": {
                "10000": {
                    "retcode": 0,
                    "retmsg": "Migrating model successfully. The configuration of model has been modified automatically, Model files can be found at '/data/projects/fate/temp/fate_flow/host#100#guest-99#host-100#model_fate_migration.zip'."
                }
            }
        },
        "guest": {
            "9999": 0
        },
        "host": {
            "10000": 0
        }
    },
    "jobId": "202010292152299793981",
    "retcode": 0,
    "retmsg": "success"
}

After the task is executed successfully, a compressed file of the migrated model is generated on each machine of the executing parties, and the path of this file can be obtained from the returned result. For example, the path of the migrated model file for the guest side (9999) is /data/projects/fate/temp/fate_flow/guest#99#guest-99#host-100#model_fate_migration.zip, and the path of the migrated model file for the host side (10000) is /data/projects/fate/temp/fate_flow/host#100#guest-99#host-100#model_fate_migration.zip. The new model_id and model_version can also be obtained from the returned result.
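
A script can pull the zip paths out of the returned result along the following lines. migrated_archives is a hypothetical helper, and the regular expression assumes that retmsg keeps the wording seen in the example result, which may differ between versions.

```python
import re
from typing import Any, Dict

# Assumption: the path is quoted after this phrase, as in the example result.
PATH_PATTERN = re.compile(r"Model files can be found at '([^']+)'")


def migrated_archives(response: Dict[str, Any]) -> Dict[str, str]:
    """Map "role-party_id" of each executing party to its migrated zip path."""
    if response.get("retcode") != 0:
        raise RuntimeError(f"migration failed: {response.get('retmsg')}")
    archives: Dict[str, str] = {}
    for role, parties in response["data"]["detail"].items():
        for party_id, result in parties.items():
            if result["retcode"] != 0:
                raise RuntimeError(f"{role}-{party_id} failed: {result['retmsg']}")
            match = PATH_PATTERN.search(result["retmsg"])
            if match:
                archives[f"{role}-{party_id}"] = match.group(1)
    return archives


base = "/data/projects/fate/temp/fate_flow/"
sample = {
    "data": {
        "detail": {
            "guest": {"9999": {
                "retcode": 0,
                "retmsg": "Model files can be found at '"
                          + base + "guest#99#guest-99#host-100#model_fate_migration.zip'.",
            }},
            "host": {"10000": {
                "retcode": 0,
                "retmsg": "Model files can be found at '"
                          + base + "host#100#guest-99#host-100#model_fate_migration.zip'.",
            }},
        }
    },
    "retcode": 0,
    "retmsg": "success",
}
archives = migrated_archives(sample)
print(archives)
```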

4. Transfer files and import (operate separately in all target clusters)¶

After the migration task succeeds, manually transfer the newly generated model zip files to the fateflow machines of the target clusters. For example, the new model zip file generated for the guest party (99) in step 3 needs to be transferred to the guest (99) machine. The zip file can be placed anywhere on that machine. Next, configure the model import task; see import_model.json (examples/import_model.json) for an example configuration file.

The following example describes the configuration file for importing the migrated model in guest (99).

{
  "role": "guest",
  "party_id": 99,
  "model_id": "guest-99#host-100#model",
  "model_version": "fate_migration",
  "file": "/data/projects/fate/python/temp/guest#99#guest-99#host-100#model_fate_migration.zip"
}

Fill in the role, the current party_id, the new model_id and model_version of the migrated model, and the path to the zip file of the migrated model according to your actual situation.
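
If several target clusters need an import configuration, the file can be generated rather than edited by hand. The sketch below uses a hypothetical helper, build_import_config, that only assembles the five fields shown in the example above.

```python
import json
from typing import Any, Dict


def build_import_config(
    role: str, party_id: int, model_id: str, model_version: str, zip_path: str
) -> Dict[str, Any]:
    """Assemble the import configuration for one target participant."""
    return {
        "role": role,
        "party_id": party_id,          # party_id of the current (target) cluster
        "model_id": model_id,          # new model_id after migration
        "model_version": model_version,
        "file": zip_path,              # where the zip was placed on this machine
    }


conf = build_import_config(
    "guest", 99, "guest-99#host-100#model", "fate_migration",
    "/data/projects/fate/python/temp/guest#99#guest-99#host-100#model_fate_migration.zip",
)
print(json.dumps(conf, indent=2))
```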

The following is a sample command to submit the model import task using FATE Flow CLI v2.

flow model import -c $FATE_FLOW_BASE/examples/model/import_model.json

The import is considered successful when it returns the following.

{
  "retcode": 0,
  "retmsg": "success"
}

The migration task is now complete. The new model_id and model_version can be used when submitting tasks to run predictions with the migrated model.