Inter-cluster Model Migration
The model migration function makes it possible to copy model files to a cluster with a different party_id and keep the model usable there. It applies in the following two scenarios:
- The cluster of any participant in model generation is redeployed, and the party_id of the cluster changes after the redeployment, e.g. the source participants are arbiter-10000#guest-9999#host-10000 and change to arbiter-10000#guest-99#host-10000.
- One or more of the participants copy the model file from the source cluster to a target cluster, and the model needs to be used in the target cluster.
Basics:
1. In both scenarios above, the participant party_id of the model changes, such as arbiter-10000#guest-9999#host-10000 -> arbiter-10000#guest-99#host-10000, or arbiter-10000#guest-9999#host-10000 -> arbiter-100#guest-99#host-100.
2. Because the model's participant party_id changes, model_id and the contents of the model file that involve party_id need to be changed.
3. The overall process has three steps: copy and transfer the original model file, execute the model migration task on the original model file, and import the new model generated by the model migration task.
4. Executing the model migration task on the original model file actually makes a temporary copy of the original model file at execution time, and then modifies model_id and the contents of the model file involving party_id in that copy according to the configuration, in order to adapt it to the new participant party_id.
5. All the above steps need to be performed on all new participants, even if the party_id of one of the target participants has not changed.
6. The new participant cluster version needs to be greater than or equal to 1.5.1.
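How model_id follows from the participants can be illustrated with a short sketch. It assumes, based only on the examples in this document, that a model_id consists of the role-party_id pairs sorted by role name, joined with #, and followed by the suffix model. The helper name build_model_id and the use of _ to join several party IDs within one role are hypothetical and not taken from FATE Flow.

```python
def build_model_id(roles, model_name="model"):
    """Build a model_id string such as 'guest-99#host-100#model'.

    Assumption: the layout is inferred from the examples in this document.
    Joining several party IDs of one role with '_' is also an assumption.
    """
    parts = []
    for role in sorted(roles):
        party_ids = "_".join(str(party_id) for party_id in roles[role])
        parts.append(f"{role}-{party_ids}")
    return "#".join(parts + [model_name])


# Participants before and after the migration, as in the scenarios above.
source_roles = {"guest": [9999], "arbiter": [10000], "host": [10000]}
target_roles = {"guest": [99], "arbiter": [100], "host": [100]}

print(build_model_id(source_roles))
print(build_model_id(target_roles))
```

The sketch only shows why a changed party_id forces a changed model_id; the actual rewrite of the model file is done by the migration task.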
The migration process is as follows.
Transfer the model file
Package the model files generated on the machine where the source participant's fate flow service runs (including the directory named by the model id) and transfer them to the machine where the target participant's fate flow service runs. Place the model files in the following fixed directory:
$FATE_PROJECT_BASE/model_local_cache
Instructions:
1. Transferring the folder is enough. If you compress and pack it for the transfer, extract the model files to the directory where the model is located after the transfer.
2. Transfer the model files one by one, per source participant.
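If the folder is packed for the transfer, the pack and extract steps might look like the following sketch. It uses only the Python standard library; the helper names pack_model and unpack_model and the directory names in the demonstration are hypothetical, and the copy between machines (for example with scp) is left out.

```python
import shutil
import tempfile
from pathlib import Path


def pack_model(model_dir: Path, out_dir: Path) -> Path:
    """Zip the model directory, keeping its directory name inside the archive."""
    archive = shutil.make_archive(
        str(out_dir / model_dir.name),
        "zip",
        root_dir=model_dir.parent,
        base_dir=model_dir.name,
    )
    return Path(archive)


def unpack_model(archive: Path, model_local_cache: Path) -> None:
    """Extract the archive into the target model_local_cache directory."""
    model_local_cache.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive), str(model_local_cache))


# Temporary directories stand in for the source and target machines.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # Hypothetical model directory with one version folder and one file.
    source_model = tmp / "source" / "model_local_cache" / "guest-9999#host-10000#model"
    (source_model / "202010291539339602784").mkdir(parents=True)
    (source_model / "202010291539339602784" / "placeholder.txt").write_text("model data")

    archive = pack_model(source_model, tmp)
    target_cache = tmp / "target" / "model_local_cache"
    unpack_model(archive, target_cache)

    for path in sorted(target_cache.rglob("*")):
        print(path.relative_to(target_cache))
```

Keeping the directory name inside the archive means the extracted tree has the same layout as on the source machine.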
Preparation work before migration
Instructions
- Install fate-client, the client that supports model migration, by referring to the fate flow client documentation. Only FATE 1.5.1 and above are supported.
Execute the migration task
Description
- Executing the migration task replaces model_id, model_version and the contents of the model involving role and party_id in the source model file according to the migration task configuration file.
- The cluster submitting the task must have completed the migration preparation above.
1. Modify the configuration file
Modify the configuration file of the migration task on the new participant (machine) according to the actual situation. The following is an example migration task configuration file, migrate_model.json:
{
    "job_parameters": {
        "federated_mode": "SINGLE"
    },
    "role": {
        "guest": [9999],
        "arbiter": [10000],
        "host": [10000]
    },
    "migrate_initiator": {
        "role": "guest",
        "party_id": 99
    },
    "migrate_role": {
        "guest": [99],
        "arbiter": [100],
        "host": [100]
    },
    "execute_party": {
        "guest": [9999],
        "arbiter": [10000],
        "host": [10000]
    },
    "model_id": "arbiter-10000#guest-9999#host-10000#model",
    "model_version": "202006171904247702041",
    "unify_model_version": "202901_0001"
}
Please save the above configuration content to a location in the server for modification.
The following are explanatory notes for the parameters in this configuration.
- job_parameters: federated_mode in this parameter has two possible values, MULTIPLE and SINGLE. If set to SINGLE, the migration job is executed only in the party that submitted it, so the job needs to be submitted in all new participants separately. If set to MULTIPLE, the job is distributed to the participants specified in execute_party for execution, and only needs to be submitted in the new participant that acts as migrate_initiator.
- role: the role of each participant that generated the original model and its corresponding party_id.
- migrate_initiator: the task initiator of the migrated model. The initiator's role and party_id must both be specified.
- migrate_role: the role and party_id information of the migrated model.
- execute_party: the role and party_id of the parties that need to execute the migration, which are the source cluster party_id values.
- model_id: the model_id of the original model to be migrated.
- model_version: the model_version of the original model to be migrated.
- unify_model_version: optional. It specifies the model_version of the new model. If this parameter is not provided, the new model takes the job_id of the migration job as its new model_version.
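Before submitting, the parameters described above can be sanity-checked with a small script. The following is a minimal sketch: the helper name check_migrate_config is hypothetical, and the consistency rules it applies are inferred from the parameter descriptions in this document, not from checks that FATE Flow itself is known to perform.

```python
REQUIRED_KEYS = {
    "job_parameters", "role", "migrate_initiator", "migrate_role",
    "execute_party", "model_id", "model_version",
}


def check_migrate_config(config):
    """Return a list of problems found in a migration config (empty if none)."""
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        return [f"missing keys: {sorted(missing)}"]

    problems = []
    mode = config["job_parameters"].get("federated_mode")
    if mode not in ("SINGLE", "MULTIPLE"):
        problems.append(f"federated_mode must be SINGLE or MULTIPLE, got {mode!r}")

    # Assumption: the migrated model keeps the same roles as the original.
    role, migrate_role = config["role"], config["migrate_role"]
    if set(role) != set(migrate_role):
        problems.append("role and migrate_role list different roles")

    # Assumption: the new initiator is one of the new participants.
    initiator = config["migrate_initiator"]
    if initiator["party_id"] not in migrate_role.get(initiator["role"], []):
        problems.append("migrate_initiator is not listed in migrate_role")

    # execute_party refers to source cluster party IDs.
    for role_name, party_ids in config["execute_party"].items():
        for party_id in party_ids:
            if party_id not in role.get(role_name, []):
                problems.append(f"execute_party {role_name}-{party_id} is not in role")
    return problems


example = {
    "job_parameters": {"federated_mode": "SINGLE"},
    "role": {"guest": [9999], "arbiter": [10000], "host": [10000]},
    "migrate_initiator": {"role": "guest", "party_id": 99},
    "migrate_role": {"guest": [99], "arbiter": [100], "host": [100]},
    "execute_party": {"guest": [9999], "arbiter": [10000], "host": [10000]},
    "model_id": "arbiter-10000#guest-9999#host-10000#model",
    "model_version": "202006171904247702041",
}
print(check_migrate_config(example))
```

unify_model_version is left out of the required keys because it is optional.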
The example configuration above means the following:
1. The source model has guest: 9999, host: 10000, arbiter: 10000 as participants. The model is migrated to have guest: 99, host: 100, arbiter: 100 as participants, with guest: 99 as the new initiator.
2. federated_mode: SINGLE means that each migration task is executed only in the cluster where the task is submitted, so the task needs to be submitted in 99 and 100 separately.
3. For example, if the task is executed at 99, then execute_party is configured as "guest": [9999].
4. If the task is executed at 100, then execute_party is configured as "arbiter": [10000], "host": [10000].
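The per-cluster execute_party values in SINGLE mode can be derived from role and migrate_role, as sketched below. The helper name execute_party_for is hypothetical, and the sketch assumes that role and migrate_role list party IDs in the same order, so that the entry at a given position in migrate_role replaces the entry at the same position in role.

```python
def execute_party_for(role, migrate_role, target_party_id):
    """Return the source parties that the given target cluster takes over.

    Assumption: role[r][i] is the source party that becomes
    migrate_role[r][i] after the migration.
    """
    execute_party = {}
    for role_name, new_ids in migrate_role.items():
        for index, new_id in enumerate(new_ids):
            if new_id == target_party_id:
                source_id = role[role_name][index]
                execute_party.setdefault(role_name, []).append(source_id)
    return execute_party


role = {"guest": [9999], "arbiter": [10000], "host": [10000]}
migrate_role = {"guest": [99], "arbiter": [100], "host": [100]}

print(execute_party_for(role, migrate_role, 99))
print(execute_party_for(role, migrate_role, 100))
```

The result for each target cluster would then be placed in the execute_party field of the configuration file submitted on that cluster.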
2. Submit migration tasks (separate operations in all target clusters)
Migration tasks need to be submitted using fate-client. A sample command is as follows.
flow model migrate -c $FATE_FLOW_BASE/examples/model/migrate_model.json
3. Task execution results
The following is the content of the configuration file for the actual migration task.
{
    "job_parameters": {
        "federated_mode": "SINGLE"
    },
    "role": {
        "guest": [9999],
        "host": [10000]
    },
    "migrate_initiator": {
        "role": "guest",
        "party_id": 99
    },
    "migrate_role": {
        "guest": [99],
        "host": [100]
    },
    "execute_party": {
        "guest": [9999],
        "host": [10000]
    },
    "model_id": "guest-9999#host-10000#model",
    "model_version": "202010291539339602784",
    "unify_model_version": "fate_migration"
}
This task migrates the model with model_id guest-9999#host-10000#model and model_version 202010291539339602784 from a cluster with party_id 9999 (guest) and 10000 (host) into a new model that fits a cluster with party_id 99 (guest) and 100 (host).
The following is the result of a successful migration.
{
    "data": {
        "detail": {
            "guest": {
                "9999": {
                    "retcode": 0,
                    "retmsg": "Migrating model successfully. the configuration of model has been modified automatically. new model id is: guest-99#host-100#model, Model files can be found at '/data/projects/fate/temp/fate_flow/guest#99#guest-99#host-100#model_fate_migration.zip'."
                }
            },
            "host": {
                "10000": {
                    "retcode": 0,
                    "retmsg": "Migrating model successfully. The configuration of model has been modified automatically, Model files can be found at '/data/projects/fate/temp/fate_flow/host#100#guest-99#host-100#model_fate_migration.zip'."
                }
            }
        },
        "guest": {
            "9999": 0
        },
        "host": {
            "10000": 0
        }
    },
    "jobId": "202010292152299793981",
    "retcode": 0,
    "retmsg": "success"
}
After the task is successfully executed, a zip file of the migrated model is generated on each executing party's machine, and the path to this file can be obtained from the returned result. For example, the post-migration model file for 9999 (guest) is /data/projects/fate/temp/fate_flow/guest#99#guest-99#host-100#model_fate_migration.zip, and the one for 10000 (host) is /data/projects/fate/temp/fate_flow/host#100#guest-99#host-100#model_fate_migration.zip. The new model_id and model_version can also be obtained from the returned result.
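Collecting the zip paths from the returned result can be scripted. The sketch below checks every retcode and extracts the path from each retmsg with a regular expression. The helper name migrated_files is hypothetical, and the message wording is taken only from the sample result in this document, so the pattern is an assumption that may not hold for other versions.

```python
import re

# Assumption: the path is quoted after this phrase, as in the sample result.
PATH_PATTERN = re.compile(r"Model files can be found at '([^']+)'")


def migrated_files(result):
    """Map (role, party_id) to the migrated zip path reported in the result."""
    if result.get("retcode") != 0:
        raise RuntimeError(f"migration failed: {result.get('retmsg')}")
    files = {}
    for role, parties in result["data"]["detail"].items():
        for party_id, info in parties.items():
            if info["retcode"] != 0:
                raise RuntimeError(f"{role}-{party_id} failed: {info['retmsg']}")
            match = PATH_PATTERN.search(info["retmsg"])
            files[(role, party_id)] = match.group(1) if match else None
    return files


# Shortened copy of the sample result shown in this section.
sample_result = {
    "data": {
        "detail": {
            "guest": {"9999": {"retcode": 0, "retmsg": (
                "Migrating model successfully. Model files can be found at "
                "'/data/projects/fate/temp/fate_flow/"
                "guest#99#guest-99#host-100#model_fate_migration.zip'.")}},
            "host": {"10000": {"retcode": 0, "retmsg": (
                "Migrating model successfully. Model files can be found at "
                "'/data/projects/fate/temp/fate_flow/"
                "host#100#guest-99#host-100#model_fate_migration.zip'.")}},
        }
    },
    "jobId": "202010292152299793981",
    "retcode": 0,
    "retmsg": "success",
}

for party, path in migrated_files(sample_result).items():
    print(party, path)
```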
4. Transferring files and importing (separate operation in all target clusters)
After the migration task succeeds, manually transfer the newly generated model zip file to the fate flow machine of the target cluster. For example, the new model zip file generated by 9999 (guest) in point 3 needs to be transferred to the 99 (guest) machine. The zip file can be placed anywhere on the corresponding machine. Next, configure the model import task; see examples/import_model.json for an example configuration file (this configuration file is included in the zip file; modify it according to the actual situation and do not use it directly).
The following is an example of the configuration file for importing the migrated model in guest (99).
{
    "role": "guest",
    "party_id": 99,
    "model_id": "guest-99#host-100#model",
    "model_version": "202010292152299793981",
    "file": "/data/projects/fate/python/temp/guest#99#guest-99#host-100#202010292152299793981.zip"
}
Fill in role, the party_id of the current party, the new model_id and model_version of the migrated model, and the path to the zip file of the migrated model according to the actual situation.
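The import configuration can also be generated rather than edited by hand. The following is a minimal sketch: the helper name build_import_config is hypothetical, the field names are those of the example configuration above, and the values are copied from that example.

```python
import json


def build_import_config(role, party_id, model_id, model_version, zip_path):
    """Assemble an import configuration with the fields shown above."""
    return {
        "role": role,
        "party_id": party_id,
        "model_id": model_id,
        "model_version": model_version,
        "file": str(zip_path),
    }


config = build_import_config(
    role="guest",
    party_id=99,
    model_id="guest-99#host-100#model",
    model_version="202010292152299793981",
    zip_path="/data/projects/fate/python/temp/"
             "guest#99#guest-99#host-100#202010292152299793981.zip",
)

# Print the JSON; in practice it would be saved as import_model.json.
print(json.dumps(config, indent=4))
```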
The following is a sample command to import the model using fate-client.
flow model import -c $FATE_FLOW_BASE/examples/model/import_model.json
The import is considered successful when it returns the following.
{
    "data": {
        "job_id": "202208261102212849780",
        "model_id": "arbiter-10000#guest-9999#host-10000#model",
        "model_version": "foobar",
        "party_id": "9999",
        "role": "guest"
    },
    "retcode": 0,
    "retmsg": "success"
}
The migration task is now complete. The user can submit tasks with the new model_id and model_version to perform prediction with the migrated model.