Problems encountered when deploying and running SecretPad #130

Open
Meng-xiangkun opened this issue Aug 30, 2024 · 26 comments

Comments

@Meng-xiangkun

Issue Type

Running

Have you searched for existing documents and issues?

Yes

OS Platform and Distribution

Linux centos7

All_in_one Version

Kuscia Version

0.10.0b0

What happened and what you expected to happen.

The SecretPad image was built from source. Kuscia 0.10.0b0 has already been deployed, with a master node plus the Alice and Bob nodes. Some problems came up when starting SecretPad.

Log output.

_                   _
                       | |                 | |      secretpad  https://www.secretflow.org.cn/
 ___  ___  ___ _ __ ___| |_ _ __   __ _  __| |      Running in ALL-IN-ONE mode, CENTER function modules
/ __|/ _ \/ __| '__/ _ \ __| '_ \ / _` |/ _` |      Port: 8080
\__ \  __/ (__| | |  __/ |_| |_) | (_| | (_| |      Pid: 1
|___/\___|\___|_|  \___|\__| .__/ \__,_|\__,_|      Console: http://127.0.0.1:8080/
                           | |
                           |_|

secretpad  version: 0.5.0b0
kuscia     version: 0.6.0b0
secretflow version: 1.4.0b0
2024-08-30T11:02:11.736+08:00  INFO 1 --- [           main] o.s.secretpad.web.SecretPadApplication   : Starting SecretPadApplication v0.0.1-SNAPSHOT using Java 17.0.11 with PID 1 (/app/secretpad.jar started by root in /app)
2024-08-30T11:02:11.740+08:00  INFO 1 --- [           main] o.s.secretpad.web.SecretPadApplication   : The following 1 profile is active: "dev"
2024-08-30T11:02:13.143+08:00  INFO 1 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2024-08-30T11:02:13.516+08:00  INFO 1 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 364 ms. Found 38 JPA repository interfaces.
2024-08-30T11:02:15.763+08:00  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port(s): 443 (https) 8080 (http) 9001 (http)
2024-08-30T11:02:15.881+08:00  INFO 1 --- [           main] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 4036 ms
2024-08-30T11:02:18.844+08:00  INFO 1 --- [           main] o.s.o.j.p.SpringPersistenceUnitInfo      : No LoadTimeWeaver setup: ignoring JPA class transformer
2024-08-30T11:02:21.442+08:00  INFO 1 --- [           main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
2024-08-30T11:02:21.904+08:00  INFO 1 --- [           main] o.s.d.j.r.query.QueryEnhancerFactory     : Hibernate is in classpath; If applicable, HQL parser will be used.
2024-08-30T11:02:23.288+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Init kuscia node, config=KusciaGrpcConfig(domainId=kuscia-system, host=10.233.74.6, port=8083, protocol=NOTLS, mode=MASTER, token=config/certs/token, certFile=config/certs/client.crt, keyFile=config/certs/client.pem)
2024-08-30T11:02:23.337+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Register kuscia node success, config=KusciaGrpcConfig(domainId=kuscia-system, host=10.233.74.6, port=8083, protocol=NOTLS, mode=MASTER, token=config/certs/token, certFile=config/certs/client.crt, keyFile=config/certs/client.pem)
2024-08-30T11:02:23.351+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Init kuscia node success, CHANNEL_FACTORIES={kuscia-system=org.secretflow.secretpad.kuscia.v1alpha1.factory.impl.GrpcKusciaApiChannelFactory@1b557402}
2024-08-30T11:02:23.351+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Init kuscia node, config=KusciaGrpcConfig(domainId=alice, host=10.233.74.238, port=8083, protocol=NOTLS, mode=LITE, token=config/certs/alice/token, certFile=config/certs/alice/client.crt, keyFile=config/certs/alice/client.pem)
2024-08-30T11:02:23.356+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Register kuscia node success, config=KusciaGrpcConfig(domainId=alice, host=10.233.74.238, port=8083, protocol=NOTLS, mode=LITE, token=config/certs/alice/token, certFile=config/certs/alice/client.crt, keyFile=config/certs/alice/client.pem)
2024-08-30T11:02:23.361+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Init kuscia node success, CHANNEL_FACTORIES={kuscia-system=org.secretflow.secretpad.kuscia.v1alpha1.factory.impl.GrpcKusciaApiChannelFactory@1b557402, alice=org.secretflow.secretpad.kuscia.v1alpha1.factory.impl.GrpcKusciaApiChannelFactory@41cfcbb5}
2024-08-30T11:02:23.361+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Init kuscia node, config=KusciaGrpcConfig(domainId=bob, host=10.233.74.149, port=8083, protocol=NOTLS, mode=LITE, token=config/certs/bob/token, certFile=config/certs/bob/client.crt, keyFile=config/certs/bob/client.pem)
2024-08-30T11:02:23.366+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Register kuscia node success, config=KusciaGrpcConfig(domainId=bob, host=10.233.74.149, port=8083, protocol=NOTLS, mode=LITE, token=config/certs/bob/token, certFile=config/certs/bob/client.crt, keyFile=config/certs/bob/client.pem)
2024-08-30T11:02:23.370+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Init kuscia node success, CHANNEL_FACTORIES={bob=org.secretflow.secretpad.kuscia.v1alpha1.factory.impl.GrpcKusciaApiChannelFactory@7f9083b4, kuscia-system=org.secretflow.secretpad.kuscia.v1alpha1.factory.impl.GrpcKusciaApiChannelFactory@1b557402, alice=org.secretflow.secretpad.kuscia.v1alpha1.factory.impl.GrpcKusciaApiChannelFactory@41cfcbb5}
2024-08-30T11:02:28.802+08:00  INFO 1 --- [           main] o.s.s.p.c.PersistenceConfiguration       : making sure database is WAL mode
2024-08-30T11:02:28.853+08:00  WARN 1 --- [           main] o.s.s.s.factory.CloudLogServiceFactory   : cloud service configuration is not available,please check your configuration,like ak,sk,host
2024-08-30T11:02:29.198+08:00  INFO 1 --- [           main] o.s.b.a.w.s.WelcomePageHandlerMapping    : Adding welcome page template: index
2024-08-30T11:02:29.866+08:00  INFO 1 --- [           main] o.s.b.a.e.web.EndpointLinksResolver      : Exposing 0 endpoint(s) beneath base path '/actuator'
2024-08-30T11:02:30.117+08:00  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 443 (https) 8080 (http) 9001 (http) with context path ''
2024-08-30T11:02:30.141+08:00  INFO 1 --- [           main] o.s.secretpad.web.SecretPadApplication   : Started SecretPadApplication in 19.666 seconds (process running for 21.177)
2024-08-30T11:02:30.334+08:00  INFO 1 --- [   scheduling-2] o.s.s.k.v.DynamicKusciaChannelProvider   : session UserContextDTO(token=null, name=null, platformType=null, platformNodeId=null, ownerType=null, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=null)
2024-08-30T11:02:30.425+08:00  INFO 1 --- [           main] o.s.s.web.init.DynamicBeanRegisterInit   : all mvc mapping [{POST [/api/v1alpha1/project/datatable/get]}, {POST [/api/v1alpha1/message/pending], consumes [application/json]}, {GET [/swagger-ui.html]}, {POST [/api/v1alpha1/model/status], consumes [application/json], produces [application/json]}, {POST [/api/v1alpha1/project/datasource/list], consumes [application/json]}, {POST [/api/v1alpha1/project/tee/list]}, {POST [/api/v1alpha1/graph/node/output]}, {POST [/api/v1alpha1/model/discard], consumes [application/json]}, {POST [/api/v1alpha1/model/detail], consumes [application/json]}, {POST [/api/v1alpha1/datatable/pushToTee], consumes [application/json]}, {POST [/api/v1alpha1/project/job/get]}, {POST [/api/v1alpha1/model/serving/create], consumes [application/json]}, {POST [/api/v1alpha1/p2p/node/create], consumes [application/json]}, {POST [/api/v1alpha1/user/node/resetPassword], consumes [application/json]}, {POST [/api/v1alpha1/node/token], consumes [application/json]}, {POST [/api/v1alpha1/graph/detail]}, {GET [/edge || / || /model-submission/** || /message/** || /my-node/** || /logout/** || /record/** || /guide/** || /login/** || /edge/** || /home/** || /node/** || /dag/**]}, {POST [/api/v1alpha1/model/serving/delete], consumes [application/json]}, {POST [/api/v1alpha1/p2p/node/delete], consumes [application/json]}, {POST [/api/v1alpha1/message/reply], consumes [application/json]}, {POST [/api/v1alpha1/graph/node/update]}, {POST [/api/v1alpha1/project/job/stop]}, {POST [/api/v1alpha1/graph/node/max_index]}, {POST [/api/v1alpha1/project/datatable/delete]}, {POST [/api/v1alpha1/user/get]}, {POST [/api/v1alpha1/project/inst/add]}, {GET [/v3/api-docs/swagger-config], produces [application/json]}, {GET [/v3/api-docs], produces [application/json]}, {POST [/api/v1alpha1/datasource/list]}, {GET [/v3/api-docs.yaml], produces [application/vnd.oai.openapi]}, {POST [/api/v1alpha1/p2p/project/create], consumes [application/json]}, {POST [/api/v1alpha1/data/upload], consumes [multipart/form-data]}, {POST [/api/v1alpha1/graph/create]}, {POST [/api/v1alpha1/project/update/tableConfig], consumes [application/json]}, {GET [/sync], produces [text/event-stream]}, {POST [/api/v1alpha1/vote_sync/create], consumes [application/json]}, {POST [/api/v1alpha1/model/delete], consumes [application/json]}, {POST [/api/v1alpha1/project/job/list]}, {POST [/api/v1alpha1/project/job/task/logs]}, {POST [/api/v1alpha1/p2p/project/update], consumes [application/json]}, {POST [/api/v1alpha1/graph/update]}, {POST [/api/v1alpha1/nodeRoute/get], consumes [application/json]}, {POST [/api/v1alpha1/graph/meta/update]}, {POST [/api/v1alpha1/graph/delete]}, {POST [/api/v1alpha1/datasource/detail]}, {POST [/api/v1alpha1/node/refresh], consumes [application/json]}, {POST [/api/v1alpha1/model/pack], consumes [application/json], produces [application/json]}, {POST [/api/v1alpha1/component/i18n]}, {POST [/api/v1alpha1/message/list], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/refresh], consumes [application/json]}, {POST [/api/v1alpha1/p2p/project/list], consumes [application/json]}, {POST [/api/v1alpha1/project/job/task/output]}, {POST [/api/login], consumes [application/json]}, {POST [/api/v1alpha1/project/node/add]}, {POST [/api/v1alpha1/p2p/project/archive], consumes [application/json]}, {POST [/api/v1alpha1/user/updatePwd]}, {POST [/api/v1alpha1/graph/stop]}, {POST [/api/logout], consumes [application/json]}, {POST [/api/v1alpha1/message/detail], consumes 
[application/json]}, {POST [/api/v1alpha1/node/get], consumes [application/json]}, {POST [/api/v1alpha1/graph/node/status]}, {POST [/api/v1alpha1/cloud_log/sls]}, {POST [/api/v1alpha1/datasource/create]}, {POST [/api/v1alpha1/model/modelPartyPath], consumes [application/json], produces [application/json]}, {POST [/api/v1alpha1/node/result/detail], consumes [application/json]}, {POST [/api/v1alpha1/feature_datasource/auth/list], consumes [application/json]}, { [/error], produces [text/html]}, {POST [/api/v1alpha1/version/list]}, {POST [/api/v1alpha1/datasource/delete]}, {POST [/api/v1alpha1/datatable/create], consumes [application/json]}, {POST [/api/v1alpha1/project/getOutTable], consumes [application/json]}, {POST [/api/v1alpha1/data/create], consumes [application/json]}, {POST [/api/v1alpha1/node/result/list], consumes [application/json]}, {POST [/api/v1alpha1/data/sync]}, {POST [/api/v1alpha1/datatable/get], consumes [application/json]}, {POST [/api/v1alpha1/graph/node/logs]}, {POST [/api/v1alpha1/graph/list]}, {POST [/api/v1alpha1/model/page], consumes [application/json]}, {POST [/api/v1alpha1/model/info], consumes [application/json]}, {POST [/api/v1alpha1/project/get], consumes [application/json]}, {POST [/api/v1alpha1/datatable/delete], consumes [application/json]}, {POST [/api/v1alpha1/node/create], consumes [application/json]}, {POST [/api/v1alpha1/node/update], consumes [application/json]}, {POST [/api/v1alpha1/datatable/list], consumes [application/json]}, {POST [/api/v1alpha1/project/datatable/add]}, {POST [/api/v1alpha1/nodeRoute/update], consumes [application/json]}, {POST [/api/v1alpha1/component/batch]}, {POST [/api/v1alpha1/approval/create], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/delete], consumes [application/json]}, {POST [/api/v1alpha1/node/delete], consumes [application/json]}, {POST [/api/v1alpha1/project/create], consumes [application/json]}, {POST [/api/v1alpha1/graph/start]}, {POST [/api/v1alpha1/project/update], consumes [application/json]}, {POST [/api/v1alpha1/node/page], consumes [application/json]}, {POST [/api/v1alpha1/user/remote/resetPassword], consumes [application/json]}, {POST [/api/v1alpha1/model/serving/detail], consumes [application/json]}, {POST [/api/v1alpha1/data/download]}, {POST [/api/v1alpha1/node/newToken], consumes [application/json]}, {POST [/api/v1alpha1/project/delete], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/page], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/listNode]}, {POST [/api/v1alpha1/approval/pull/status], consumes [application/json]}, {POST [/api/v1alpha1/project/list], consumes [application/json]}, {POST [/api/v1alpha1/node/list]}, { [/error]}, {POST [/api/v1alpha1/feature_datasource/create], consumes [application/json]}, {POST [/api/v1alpha1/component/list]}]
2024-08-30T11:02:30.427+08:00  INFO 1 --- [           main] o.s.s.web.init.DynamicBeanRegisterInit   : after unregister all mvc mapping [{POST [/api/v1alpha1/project/datatable/get]}, {POST [/api/v1alpha1/message/pending], consumes [application/json]}, {GET [/swagger-ui.html]}, {POST [/api/v1alpha1/model/status], consumes [application/json], produces [application/json]}, {POST [/api/v1alpha1/project/datasource/list], consumes [application/json]}, {POST [/api/v1alpha1/project/tee/list]}, {POST [/api/v1alpha1/graph/node/output]}, {POST [/api/v1alpha1/model/discard], consumes [application/json]}, {POST [/api/v1alpha1/model/detail], consumes [application/json]}, {POST [/api/v1alpha1/datatable/pushToTee], consumes [application/json]}, {POST [/api/v1alpha1/project/job/get]}, {POST [/api/v1alpha1/model/serving/create], consumes [application/json]}, {POST [/api/v1alpha1/p2p/node/create], consumes [application/json]}, {POST [/api/v1alpha1/user/node/resetPassword], consumes [application/json]}, {POST [/api/v1alpha1/node/token], consumes [application/json]}, {POST [/api/v1alpha1/graph/detail]}, {GET [/edge || / || /model-submission/** || /message/** || /my-node/** || /logout/** || /record/** || /guide/** || /login/** || /edge/** || /home/** || /node/** || /dag/**]}, {POST [/api/v1alpha1/model/serving/delete], consumes [application/json]}, {POST [/api/v1alpha1/p2p/node/delete], consumes [application/json]}, {POST [/api/v1alpha1/message/reply], consumes [application/json]}, {POST [/api/v1alpha1/graph/node/update]}, {POST [/api/v1alpha1/project/job/stop]}, {POST [/api/v1alpha1/graph/node/max_index]}, {POST [/api/v1alpha1/project/datatable/delete]}, {POST [/api/v1alpha1/user/get]}, {POST [/api/v1alpha1/project/inst/add]}, {GET [/v3/api-docs/swagger-config], produces [application/json]}, {GET [/v3/api-docs], produces [application/json]}, {POST [/api/v1alpha1/datasource/list]}, {GET [/v3/api-docs.yaml], produces [application/vnd.oai.openapi]}, {POST [/api/v1alpha1/p2p/project/create], consumes [application/json]}, {POST [/api/v1alpha1/data/upload], consumes [multipart/form-data]}, {POST [/api/v1alpha1/graph/create]}, {POST [/api/v1alpha1/project/update/tableConfig], consumes [application/json]}, {GET [/sync], produces [text/event-stream]}, {POST [/api/v1alpha1/vote_sync/create], consumes [application/json]}, {POST [/api/v1alpha1/model/delete], consumes [application/json]}, {POST [/api/v1alpha1/project/job/list]}, {POST [/api/v1alpha1/project/job/task/logs]}, {POST [/api/v1alpha1/p2p/project/update], consumes [application/json]}, {POST [/api/v1alpha1/graph/update]}, {POST [/api/v1alpha1/nodeRoute/get], consumes [application/json]}, {POST [/api/v1alpha1/graph/meta/update]}, {POST [/api/v1alpha1/graph/delete]}, {POST [/api/v1alpha1/datasource/detail]}, {POST [/api/v1alpha1/node/refresh], consumes [application/json]}, {POST [/api/v1alpha1/model/pack], consumes [application/json], produces [application/json]}, {POST [/api/v1alpha1/component/i18n]}, {POST [/api/v1alpha1/message/list], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/refresh], consumes [application/json]}, {POST [/api/v1alpha1/p2p/project/list], consumes [application/json]}, {POST [/api/v1alpha1/project/job/task/output]}, {POST [/api/login], consumes [application/json]}, {POST [/api/v1alpha1/project/node/add]}, {POST [/api/v1alpha1/p2p/project/archive], consumes [application/json]}, {POST [/api/v1alpha1/user/updatePwd]}, {POST [/api/v1alpha1/graph/stop]}, {POST [/api/logout], consumes [application/json]}, {POST 
[/api/v1alpha1/message/detail], consumes [application/json]}, {POST [/api/v1alpha1/node/get], consumes [application/json]}, {POST [/api/v1alpha1/graph/node/status]}, {POST [/api/v1alpha1/cloud_log/sls]}, {POST [/api/v1alpha1/datasource/create]}, {POST [/api/v1alpha1/model/modelPartyPath], consumes [application/json], produces [application/json]}, {POST [/api/v1alpha1/node/result/detail], consumes [application/json]}, {POST [/api/v1alpha1/feature_datasource/auth/list], consumes [application/json]}, { [/error], produces [text/html]}, {POST [/api/v1alpha1/version/list]}, {POST [/api/v1alpha1/datasource/delete]}, {POST [/api/v1alpha1/datatable/create], consumes [application/json]}, {POST [/api/v1alpha1/project/getOutTable], consumes [application/json]}, {POST [/api/v1alpha1/data/create], consumes [application/json]}, {POST [/api/v1alpha1/node/result/list], consumes [application/json]}, {POST [/api/v1alpha1/data/sync]}, {POST [/api/v1alpha1/datatable/get], consumes [application/json]}, {POST [/api/v1alpha1/graph/node/logs]}, {POST [/api/v1alpha1/graph/list]}, {POST [/api/v1alpha1/model/page], consumes [application/json]}, {POST [/api/v1alpha1/model/info], consumes [application/json]}, {POST [/api/v1alpha1/project/get], consumes [application/json]}, {POST [/api/v1alpha1/datatable/delete], consumes [application/json]}, {POST [/api/v1alpha1/node/create], consumes [application/json]}, {POST [/api/v1alpha1/node/update], consumes [application/json]}, {POST [/api/v1alpha1/datatable/list], consumes [application/json]}, {POST [/api/v1alpha1/project/datatable/add]}, {POST [/api/v1alpha1/nodeRoute/update], consumes [application/json]}, {POST [/api/v1alpha1/component/batch]}, {POST [/api/v1alpha1/approval/create], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/delete], consumes [application/json]}, {POST [/api/v1alpha1/node/delete], consumes [application/json]}, {POST [/api/v1alpha1/project/create], consumes [application/json]}, {POST [/api/v1alpha1/graph/start]}, {POST [/api/v1alpha1/project/update], consumes [application/json]}, {POST [/api/v1alpha1/node/page], consumes [application/json]}, {POST [/api/v1alpha1/user/remote/resetPassword], consumes [application/json]}, {POST [/api/v1alpha1/model/serving/detail], consumes [application/json]}, {POST [/api/v1alpha1/data/download]}, {POST [/api/v1alpha1/node/newToken], consumes [application/json]}, {POST [/api/v1alpha1/project/delete], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/page], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/listNode]}, {POST [/api/v1alpha1/approval/pull/status], consumes [application/json]}, {POST [/api/v1alpha1/project/list], consumes [application/json]}, {POST [/api/v1alpha1/node/list]}, { [/error]}, {POST [/api/v1alpha1/feature_datasource/create], consumes [application/json]}, {POST [/api/v1alpha1/component/list]}]
2024-08-30T11:02:30.463+08:00  INFO 1 --- [           main] o.s.secretpad.web.init.MasterRouteInit   : kuscia protocol: https://
2024-08-30T11:02:30.478+08:00  INFO 1 --- [           main] o.s.secretpad.web.init.MasterRouteInit   : update node router id: 1, srcNetAddress is:https://127.0.0.1:28080, dstNetAddress is:https://127.0.0.1:38080
2024-08-30T11:02:30.487+08:00  INFO 1 --- [           main] o.s.secretpad.web.init.MasterRouteInit   : update node router id: 2, srcNetAddress is:https://127.0.0.1:38080, dstNetAddress is:https://127.0.0.1:28080
2024-08-30T11:02:30.487+08:00  INFO 1 --- [           main] o.s.secretpad.web.init.TeeResourceInit   : init tee node ALL-IN-ONE CENTER
2024-08-30T11:02:30.503+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : session UserContextDTO(token=null, name=null, platformType=null, platformNodeId=null, ownerType=null, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=null)
2024-08-30T11:02:30.716+08:00  INFO 1 --- [   scheduling-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system  Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainRouteService/BatchQueryDomainRouteStatus
2024-08-30T11:02:30.717+08:00  INFO 1 --- [           main] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system  Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainService/QueryDomain
2024-08-30T11:02:30.738+08:00  INFO 1 --- [ault-executor-2] o.s.s.k.v.l.ManagedChannelStateListener  : [kuscia] kuscia-system Channel state changed from IDLE to CONNECTING
2024-08-30T11:02:30.738+08:00  INFO 1 --- [ault-executor-3] o.s.s.k.v.l.ManagedChannelStateListener  : [kuscia] kuscia-system Channel state changed from IDLE to CONNECTING
2024-08-30T11:02:30.748+08:00  INFO 1 --- [   scheduling-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request: 
2024-08-30T11:02:30.748+08:00  INFO 1 --- [           main] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request: domain_id: "tee"

2024-08-30T11:02:31.005+08:00  INFO 1 --- [ault-executor-0] o.s.s.k.v.l.ManagedChannelStateListener  : [kuscia] kuscia-system Channel state changed from CONNECTING to READY
2024-08-30T11:02:31.005+08:00  INFO 1 --- [ault-executor-1] o.s.s.k.v.l.ManagedChannelStateListener  : [kuscia] kuscia-system Channel state changed from CONNECTING to READY
2024-08-30T11:02:31.097+08:00  INFO 1 --- [   scheduling-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
  code: 11100
  message: "DomainRoute keys can not be empty"
}

2024-08-30T11:02:31.115+08:00  INFO 1 --- [           main] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
  message: "success"
}
data {
  domain_id: "tee"
  deploy_token_statuses {
    token: "kYdZaMHb8FYkCKJNgAetz9KQhvZbFzyq"
    state: "unused"
    last_transition_time: "2024-08-30T02:58:45Z"
  }
  auth_center {
    authentication_type: "Token"
    token_gen_method: "UID-RSA-GEN"
  }
}

2024-08-30T11:02:31.139+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** before EntityChangeListener.DbChangeEvent(dstNode=null, action=update, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[alice, bob], source=NodeRouteDO(srcNodeId=alice, dstNodeId=bob, routeId=1, srcNetAddress=https://127.0.0.1:28080, dstNetAddress=https://127.0.0.1:38080)) will be send to [alice, bob]
2024-08-30T11:02:31.139+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [alice, bob] will be send
2024-08-30T11:02:31.140+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=update, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[alice, bob], source=NodeRouteDO(srcNodeId=alice, dstNodeId=bob, routeId=1, srcNetAddress=https://127.0.0.1:28080, dstNetAddress=https://127.0.0.1:38080)) , 4 wait to sync
2024-08-30T11:02:31.141+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** before EntityChangeListener.DbChangeEvent(dstNode=null, action=update, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[bob, alice], source=NodeRouteDO(srcNodeId=bob, dstNodeId=alice, routeId=2, srcNetAddress=https://127.0.0.1:38080, dstNetAddress=https://127.0.0.1:28080)) will be send to [bob, alice]
2024-08-30T11:02:31.141+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [bob, alice] will be send
2024-08-30T11:02:31.141+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=update, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[bob, alice], source=NodeRouteDO(srcNodeId=bob, dstNodeId=alice, routeId=2, srcNetAddress=https://127.0.0.1:38080, dstNetAddress=https://127.0.0.1:28080)) , 3 wait to sync
2024-08-30T11:02:31.141+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [] will be send
2024-08-30T11:02:31.142+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeDO, projectId=null, nodeIds=[tee], source=NodeDO(nodeId=tee, name=tee, auth=tee, description=tee, masterNodeId=null, controlNodeId=tee, netAddress=127.0.0.1:48080, token=null, type=embedded, mode=0)) , 2 wait to sync
2024-08-30T11:02:31.142+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** before EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[alice, tee], source=NodeRouteDO(srcNodeId=alice, dstNodeId=tee, routeId=3, srcNetAddress=127.0.0.1:28080, dstNetAddress=127.0.0.1:48080)) will be send to [alice, tee]
2024-08-30T11:02:31.142+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [alice, tee] will be send
2024-08-30T11:02:31.143+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[alice, tee], source=NodeRouteDO(srcNodeId=alice, dstNodeId=tee, routeId=3, srcNetAddress=127.0.0.1:28080, dstNetAddress=127.0.0.1:48080)) , 1 wait to sync
2024-08-30T11:02:31.143+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** before EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[tee, alice], source=NodeRouteDO(srcNodeId=tee, dstNodeId=alice, routeId=4, srcNetAddress=127.0.0.1:48080, dstNetAddress=127.0.0.1:28080)) will be send to [tee, alice]
2024-08-30T11:02:31.143+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [tee, alice] will be send
2024-08-30T11:02:31.143+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[tee, alice], source=NodeRouteDO(srcNodeId=tee, dstNodeId=alice, routeId=4, srcNetAddress=127.0.0.1:48080, dstNetAddress=127.0.0.1:28080)) , 0 wait to sync
2024-08-30T11:02:31.146+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** before EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[bob, tee], source=NodeRouteDO(srcNodeId=bob, dstNodeId=tee, routeId=5, srcNetAddress=127.0.0.1:38080, dstNetAddress=127.0.0.1:48080)) will be send to [bob, tee]
2024-08-30T11:02:31.147+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [bob, tee] will be send
2024-08-30T11:02:31.148+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[bob, tee], source=NodeRouteDO(srcNodeId=bob, dstNodeId=tee, routeId=5, srcNetAddress=127.0.0.1:38080, dstNetAddress=127.0.0.1:48080)) , 0 wait to sync
2024-08-30T11:02:31.152+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** before EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[tee, bob], source=NodeRouteDO(srcNodeId=tee, dstNodeId=bob, routeId=6, srcNetAddress=127.0.0.1:48080, dstNetAddress=127.0.0.1:38080)) will be send to [tee, bob]
2024-08-30T11:02:31.152+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [tee, bob] will be send
2024-08-30T11:02:31.152+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[tee, bob], source=NodeRouteDO(srcNodeId=tee, dstNodeId=bob, routeId=6, srcNetAddress=127.0.0.1:48080, dstNetAddress=127.0.0.1:38080)) , 0 wait to sync
2024-08-30T11:02:31.155+08:00  INFO 1 --- [           main] o.s.secretpad.web.init.TeeResourceInit   : push alice-table datatable to tee node
2024-08-30T11:02:31.156+08:00  INFO 1 --- [           main] o.s.s.service.impl.DatatableServiceImpl  : Push datatable to teeNode with node id = alice, datatable id = alice-table
2024-08-30T11:02:31.196+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : session UserContextDTO(token=null, name=null, platformType=null, platformNodeId=null, ownerType=null, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=null)
2024-08-30T11:02:31.201+08:00  INFO 1 --- [           main] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system  Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainDataGrantService/CreateDomainDataGrant
2024-08-30T11:02:31.202+08:00  INFO 1 --- [           main] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request: domaindata_id: "alice-table"
grant_domain: "tee"
domain_id: "alice"

2024-08-30T11:02:31.226+08:00  INFO 1 --- [           main] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
  code: 11100
  message: "domaindata [alice-table] not exists"
}

2024-08-30T11:02:31.229+08:00 ERROR 1 --- [           main] o.s.s.m.i.d.DatatableGrantManager        : create domain grant from kusciaapi failed: code=11100, message=domaindata [alice-table] not exists, nodeId=alice, grantNodeId=tee, domainDataId=alice-table
2024-08-30T11:02:31.236+08:00  INFO 1 --- [           main] .s.b.a.l.ConditionEvaluationReportLogger : 

Error starting ApplicationContext. To display the condition evaluation report re-run your application with 'debug' enabled.
2024-08-30T11:02:31.257+08:00 ERROR 1 --- [           main] o.s.boot.SpringApplication               : Application run failed

org.secretflow.secretpad.common.exception.SecretpadException: 
	at org.secretflow.secretpad.common.exception.SecretpadException.of(SecretpadException.java:58)
	at org.secretflow.secretpad.manager.integration.datatablegrant.DatatableGrantManager.createDomainGrant(DatatableGrantManager.java:96)
	at org.secretflow.secretpad.service.impl.DatatableServiceImpl.pushDatatableToTeeNode(DatatableServiceImpl.java:359)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:196)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:751)
	at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:123)
	at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:391)
	at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:119)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:184)
	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:751)
	at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:703)
	at org.secretflow.secretpad.service.impl.DatatableServiceImpl$$SpringCGLIB$$0.pushDatatableToTeeNode(<generated>)
	at org.secretflow.secretpad.web.init.TeeResourceInit.initAliceBobDatableToTee(TeeResourceInit.java:148)
	at org.secretflow.secretpad.web.init.TeeResourceInit.run(TeeResourceInit.java:85)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
	at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:699)
	at org.secretflow.secretpad.web.init.TeeResourceInit$$SpringCGLIB$$0.run(<generated>)
	at org.springframework.boot.SpringApplication.lambda$callRunner$5(SpringApplication.java:774)
	at org.springframework.util.function.ThrowingConsumer$1.acceptWithException(ThrowingConsumer.java:83)
	at org.springframework.util.function.ThrowingConsumer.accept(ThrowingConsumer.java:60)
	at org.springframework.util.function.ThrowingConsumer$1.accept(ThrowingConsumer.java:88)
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:782)
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:773)
	at org.springframework.boot.SpringApplication.lambda$callRunners$3(SpringApplication.java:758)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
	at java.base/java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:357)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:510)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
	at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:758)
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:331)
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1317)
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1306)
	at org.secretflow.secretpad.web.SecretPadApplication.main(SecretPadApplication.java:61)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49)
	at org.springframework.boot.loader.Launcher.launch(Launcher.java:95)
	at org.springframework.boot.loader.Launcher.launch(Launcher.java:58)
	at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:65)

2024-08-30T11:02:31.311+08:00 ERROR 1 --- [   scheduling-3] o.s.s.s.TaskUtils$LoggingErrorHandler    : Unexpected error occurred in scheduled task

java.lang.InterruptedException: null
	at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1640)
	at java.base/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435)
	at org.secretflow.secretpad.persistence.datasync.buffer.center.CenterDataSyncDataBufferTemplate.peek(CenterDataSyncDataBufferTemplate.java:48)
	at org.secretflow.secretpad.service.listener.DbChangeEventListener.sync(DbChangeEventListener.java:110)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
	at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)

2024-08-30T11:02:31.322+08:00  INFO 1 --- [ault-executor-0] o.s.s.k.v.l.ManagedChannelStateListener  : [kuscia] kuscia-system Channel state changed from READY to SHUTDOWN
2024-08-30T11:02:31.327+08:00  INFO 1 --- [           main] j.LocalContainerEntityManagerFactoryBean : Closing JPA EntityManagerFactory for persistence unit 'default'
@zimu-yuxi

Check inside the kuscia container whether a domaindata named alice-table exists; if not, create one first.
Also, which script did you use to deploy kuscia?
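
A minimal sketch of how that check might look on this K8s deployment. It assumes the kuscia master image ships kubectl wired to its internal apiserver and exposes DomainData as a namespaced CRD; the namespace and workload names below are taken from the config later in this thread and are illustrative only:

```sh
# Open a shell in the kuscia master pod (names assumed, adjust to your cluster)
kubectl -n data-develop-operate-dev exec -it deploy/kuscia-master -- bash

# Inside the master container: list alice's domaindata and look for "alice-table"
kubectl get domaindata -n alice

# If it is missing, create it first (for example through the kusciaapi
# DomainDataService/CreateDomainData endpoint) and then restart SecretPad.
```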

@Meng-xiangkun
Author

Check inside the kuscia container whether a domaindata named alice-table exists; if not, create one first. Also, which script did you use to deploy kuscia?

It was deployed on K8s in RunP mode, following this guide: https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.10.0b0/deployment/K8s_deployment_kuscia/K8s_master_lite_cn

@zimu-yuxi

Please confirm whether the domaindata alice-table exists inside the kuscia container; if not, create one first. If the same error still occurs after creating it, set a breakpoint in org.secretflow.secretpad.manager.integration.datatablegrant.DatatableGrantManager#createDomainGrant and check how the builder is constructed.

@Meng-xiangkun
Author

Meng-xiangkun commented Sep 6, 2024

Please confirm whether the domaindata alice-table exists inside the kuscia container; if not, create one first. If the same error still occurs after creating it, set a breakpoint in org.secretflow.secretpad.manager.integration.datatablegrant.DatatableGrantManager#createDomainGrant and check how the builder is constructed.

The SecretPad service is up now, but the web page on port 8080 opens blank. What could be causing this? Please help take a look.
(screenshots attached)

@1139763082 1139763082 assigned 1139763082 and unassigned 1139763082 Sep 9, 2024
@zimu-yuxi

1. Press F12 and check the front-end requests.
2. Increase the SecretPad container's memory, e.g. docker update <container-id> --memory=8g --memory-swap=8g (see the example below).
3. If neither helps, try running the front-end code locally; see here.
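
For reference, the memory bump from step 2 as runnable commands (the container id is a placeholder; docker update is only relevant if SecretPad runs as a plain Docker container rather than a K8s pod):

```sh
# Raise the memory limit of the running SecretPad container, then restart it
docker update <secretpad-container-id> --memory=8g --memory-swap=8g
docker restart <secretpad-container-id>
```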

@Meng-xiangkun
Author

1. Press F12 and check the front-end requests. 2. Increase the SecretPad container's memory, e.g. docker update <container-id> --memory=8g --memory-swap=8g. 3. If neither helps, try running the front-end code locally; see here.

(screenshot attached)

@wangzul

wangzul commented Sep 10, 2024

Please share the SecretPad configuration file. Also, the error reported in the WeChat group was a missing nodeId; open the F12 developer tools and check the parameters being sent.

@Meng-xiangkun
Author

Meng-xiangkun commented Sep 10, 2024

Please share the SecretPad configuration file. Also, the error reported in the WeChat group was a missing nodeId; open the F12 developer tools and check the parameters being sent.

The SecretPad configuration file:

server:
  tomcat:
    accesslog:
      enabled: true
      directory: /var/log/secretpad
  servlet:
    session:
      timeout: 30m
  http-port: 8080
  http-port-inner: 9001
  port: 443
  ssl:
    enabled: true
    key-store: "file:./config/server.jks"
    key-store-password: ${KEY_PASSWORD:secretpad}
    key-alias: secretpad-server
    key-password: ${KEY_PASSWORD:secretpad}
    key-store-type: JKS
  compression:
    enabled: true
    mime-types:
      - application/javascript
      - text/css
    min-response-size: 1024
spring:
  task:
    scheduling:
      pool:
        size: 10
  application:
    name: secretpad
  jpa:
    database-platform: org.hibernate.community.dialect.SQLiteDialect
    show-sql: false
    properties:
      hibernate:
        format_sql: false
    open-in-view: false
  datasource:
    driver-class-name: org.sqlite.JDBC
    url: jdbc:sqlite:./db/secretpad.sqlite
    hikari:
      idle-timeout: 60000
      maximum-pool-size: 1
      connection-timeout: 6000
  flyway:
    baseline-on-migrate: true
    locations:
      - filesystem:./config/schema/center

  #datasource used for mysql
  #spring:
  #  task:
  #    scheduling:
  #      pool:
  #        size: 10
  #  application:
  #    name: secretpad
  #  jpa:
  #    database-platform: org.hibernate.dialect.MySQLDialect
  #    show-sql: false
  #    properties:
  #      hibernate:
  #        format_sql: false
  #  datasource:
  #    driver-class-name: com.mysql.cj.jdbc.Driver
  #    url: your mysql url
  #    username:
  #    password:
  #    hikari:
  #      idle-timeout: 60000
  #      maximum-pool-size: 10
  #      connection-timeout: 5000
  jackson:
    deserialization:
      fail-on-missing-external-type-id-property: false
      fail-on-ignored-properties: false
      fail-on-unknown-properties: false
    serialization:
      fail-on-empty-beans: false
  web:
    locale: zh_CN # default locale, overridden by request "Accept-Language" header.
  cache:
    jcache:
      config:
        classpath:ehcache.xml
springdoc:
  api-docs:
    enabled: true
management:
  endpoints:
    web:
      exposure:
        include: health,info,readiness,prometheus
    enabled-by-default: false
kusciaapi:
  protocol: ${KUSCIA_PROTOCOL:notls}

kuscia:
  nodes:
    - domainId: kuscia-system
      mode: master
      host: ${KUSCIA_API_ADDRESS:kuscia-master.data-develop-operate-dev.svc.cluster.local}
      port: ${KUSCIA_API_PORT:8083}
      protocol: ${KUSCIA_PROTOCOL:notls}
      cert-file: config/certs/client.crt
      key-file: config/certs/client.pem
      token: config/certs/token

    - domainId: alice
      mode: lite
      host: ${KUSCIA_API_LITE_ALICE_ADDRESS:kuscia-lite-alice.data-develop-operate-dev.svc.cluster.local}
      port: ${KUSCIA_API_PORT:8083}
      protocol: ${KUSCIA_PROTOCOL:notls}
      cert-file: config/certs/alice/client.crt
      key-file: config/certs/alice/client.pem
      token: config/certs/alice/token

    - domainId: bob
      mode: lite
      host: ${KUSCIA_API_LITE_BOB_ADDRESS:kuscia-lite-bob.data-develop-operate-dev.svc.cluster.local}
      port: ${KUSCIA_API_PORT:8083}
      protocol: ${KUSCIA_PROTOCOL:notls}
      cert-file: config/certs/bob/client.crt
      key-file: config/certs/bob/client.pem
      token: config/certs/bob/token


job:
  max-parallelism: 1

secretpad:
  logs:
    path: ${SECRETPAD_LOG_PATH:../log}
  deploy-mode: ${DEPLOY_MODE:ALL-IN-ONE} # MPC TEE ALL-IN-ONE
  platform-type: CENTER
  node-id: kuscia-system
  center-platform-service: secretpad.master.svc
  gateway: ${KUSCIA_GW_ADDRESS:127.0.0.1:80}
  auth:
    enabled: true
    pad_name: ${SECRETPAD_USER_NAME}
    pad_pwd: ${SECRETPAD_PASSWORD}
  response:
    extra-headers:
      Content-Security-Policy: "base-uri 'self';frame-src 'self';worker-src blob: 'self' data:;object-src 'self';"
  upload-file:
    max-file-size: -1    # -1 means not limit, e.g.  200MB, 1GB
    max-request-size: -1 # -1 means not limit, e.g.  200MB, 1GB
  data:
    dir-path: /app/data/
  datasync:
    center: true
    p2p: false
  version:
    secretpad-image: ${SECRETPAD_IMAGE:0.5.0b0}
    kuscia-image: ${KUSCIA_IMAGE:0.6.0b0}
    secretflow-image: ${SECRETFLOW_IMAGE:1.4.0b0}
    secretflow-serving-image: ${SECRETFLOW_SERVING_IMAGE:0.2.0b0}
    tee-app-image: ${TEE_APP_IMAGE:0.1.0b0}
    tee-dm-image: ${TEE_DM_IMAGE:0.1.0b0}
    capsule-manager-sim-image: ${CAPSULE_MANAGER_SIM_IMAGE:0.1.2b0}

  component:
    hide:
      - secretflow/io/read_data:0.0.1
      - secretflow/io/write_data:0.0.1
      - secretflow/io/identity:0.0.1
      - secretflow/model/model_export:0.0.1
      - secretflow/ml.train/slnn_train:0.0.1
      - secretflow/ml.predict/slnn_predict:0.0.2

sfclusterDesc:
  deviceConfig:
    spu: "{\"runtime_config\":{\"protocol\":\"SEMI2K\",\"field\":\"FM128\"},\"link_desc\":{\"connect_retry_times\":60,\"connect_retry_interval_ms\":1000,\"brpc_channel_protocol\":\"http\",\"brpc_channel_connection_type\":\"pooled\",\"recv_timeout_ms\":1200000,\"http_timeout_ms\":1200000}}"
    heu: "{\"mode\": \"PHEU\", \"schema\": \"paillier\", \"key_size\": 2048}"
  rayFedConfig:
    crossSiloCommBackend: "brpc_link"

tee:
  capsule-manager: capsule-manager.#.svc

data:
  sync:
    - org.secretflow.secretpad.persistence.entity.ProjectDO
    - org.secretflow.secretpad.persistence.entity.ProjectNodeDO
    - org.secretflow.secretpad.persistence.entity.NodeDO
    - org.secretflow.secretpad.persistence.entity.NodeRouteDO
    - org.secretflow.secretpad.persistence.entity.ProjectJobDO
    - org.secretflow.secretpad.persistence.entity.ProjectTaskDO
    - org.secretflow.secretpad.persistence.entity.ProjectDatatableDO
    - org.secretflow.secretpad.persistence.entity.VoteRequestDO
    - org.secretflow.secretpad.persistence.entity.VoteInviteDO
    - org.secretflow.secretpad.persistence.entity.TeeDownLoadAuditConfigDO
    - org.secretflow.secretpad.persistence.entity.NodeRouteApprovalConfigDO
    - org.secretflow.secretpad.persistence.entity.TeeNodeDatatableManagementDO
    - org.secretflow.secretpad.persistence.entity.ProjectModelServingDO
    - org.secretflow.secretpad.persistence.entity.ProjectGraphNodeKusciaParamsDO
    - org.secretflow.secretpad.persistence.entity.ProjectModelPackDO
    - org.secretflow.secretpad.persistence.entity.FeatureTableDO
    - org.secretflow.secretpad.persistence.entity.ProjectFeatureTableDO
    - org.secretflow.secretpad.persistence.entity.ProjectGraphDomainDatasourceDO

inner-port:
  path:
    - /api/v1alpha1/vote_sync/create
    - /api/v1alpha1/user/node/resetPassword
    - /sync
    - /api/v1alpha1/data/sync
# ip block config (None of them are allowed in the configured IP list)
ip:
  block:
    enable: true
    list:
      - 0.0.0.0/32
      - 127.0.0.1/8
      - 10.0.0.0/8
      - 11.0.0.0/8
      - 30.0.0.0/8
      - 100.64.0.0/10
      - 172.16.0.0/12
      - 192.168.0.0/16
      - 33.0.0.0/8

@Meng-xiangkun
Author

Please share the SecretPad configuration file. Also, the error reported in the WeChat group was a missing nodeId; open the F12 developer tools and check the parameters being sent.

(screenshots attached)

@wangzul

wangzul commented Sep 10, 2024

SecretPad image built from source: which branch or tag did you use?

@Meng-xiangkun
Author

SecretPad image built from source: which branch or tag did you use?

The tag v0.9.0b0 was used.

@wangzul

wangzul commented Sep 10, 2024

SecretPad image built from source: which branch or tag did you use?

The tag v0.9.0b0 was used.

(screenshot attached)
The request parameter you are sending is initiatorId; only the 0.10 and main branches use that parameter, while 0.9 and earlier use nodeId. Please double-check which image you are actually running.
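
A quick way to double-check which build is actually running, based on the version banner SecretPad prints at startup (the container id is a placeholder):

```sh
# Image tag of the running container
docker inspect --format '{{.Config.Image}}' <secretpad-container-id>
# Version banner from the startup log
docker logs <secretpad-container-id> 2>&1 | grep -m1 'secretpad.*version'
```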

@Meng-xiangkun
Author

Meng-xiangkun commented Sep 10, 2024

(screenshots attached)
Changing the communication address does not take effect, and the node is still unavailable.

@wangzul

wangzul commented Sep 10, 2024

Try modifying the data inside the container at /app/db/secretpad.sqlite.
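
A possible sketch of that edit, assuming the sqlite3 CLI is available in the container; the table and column names are guesses derived from the NodeRouteDO fields shown in the startup log and should be verified with .tables / .schema first:

```sh
# Open the SecretPad database inside the container (names are placeholders)
docker exec -it <secretpad-container-id> sqlite3 /app/db/secretpad.sqlite

-- Inside the sqlite3 shell: inspect the schema, then fix the route addresses
.tables
.schema node_route
UPDATE node_route SET dst_net_address = 'http://<new-address>:<port>' WHERE route_id = 1;
.quit
```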

@Meng-xiangkun
Author

Meng-xiangkun commented Sep 11, 2024

Try modifying the data inside the container at /app/db/secretpad.sqlite.

(screenshot attached)
I modified the data, trying both the IP and the service name, but the node is still unavailable. How can I troubleshoot what is making it unavailable?

@wangzul

wangzul commented Sep 11, 2024

Try modifying the data inside the container at /app/db/secretpad.sqlite.

(screenshot) I modified the data, trying both the IP and the service name, but the node is still unavailable. How can I troubleshoot what is making it unavailable?

  1. Did you restart the Docker container after making the change?
  2. Please provide the SecretPad logs via docker logs (see the example below).
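
For step 2, the logs can be pulled like this (the container id is a placeholder):

```sh
# Tail the most recent SecretPad output
docker logs --tail 200 <secretpad-container-id>
```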

@Meng-xiangkun
Author

Meng-xiangkun commented Sep 11, 2024

Try modifying the data inside the container at /app/db/secretpad.sqlite.

(screenshot) I modified the data, trying both the IP and the service name, but the node is still unavailable. How can I troubleshoot what is making it unavailable?

  1. Did you restart the Docker container after making the change?
  2. Please provide the SecretPad logs via docker logs.

I restarted it, but it still does not work.
SecretPad logs:

2024-09-11T14:45:06.681+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
code: 11404
message: "clusterdomainroutes.kuscia.secretflow \"tee-alice\" not found"
}
2024-09-11T14:45:06.681+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.m.i.noderoute.NodeRouteManager : DomainRoute.RouteStatus response status {
code: 11404
message: "clusterdomainroutes.kuscia.secretflow \"tee-alice\" not found"
}
2024-09-11T14:45:06.686+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.DynamicKusciaChannelProvider : session UserContextDTO(token=880bbbcbd83b485fb79cd581a9594a99, name=zdsc, platformType=CENTER, platformNodeId=kuscia-system, ownerType=CENTER, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=ALL-IN-ONE)
2024-09-11T14:45:06.686+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainService/QueryDomain
2024-09-11T14:45:06.686+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request: domain_id: "bob"
2024-09-11T14:45:06.697+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
message: "success"
}
data {
domain_id: "bob"

cert: "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURBVENDQWVtZ0F3SUJBZ0lCQVRBTkJna3Foa2lHOXcwQkFRc0ZBREFZTVJZd0ZBWURWUVFERXcxcmRYTmoKYVdFdGMzbHpkR1Z0TUI0WERUY3dNREV3TVRBd01EQXdNRm9YRFRnd01ERXdNVEF3TURBd01Gb3dEakVNTUFvRwpBMVVFQXhNRFltOWlNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQXlWKzAyT052Ck9SKy8xVE9IYjl3N0hRRlNiRmxUNUtkeHhLN3ZwU3MwWjdXcnRjeld0ZXBjcmsrVUhTWHREdUhpV0tBcTJpQksKK3drWGhBUzA0WDNySWxHQjhtRDVwbEMrMWlaaFg4NnV4eUFFZzB5MkdicCtrajVRamhBWC9LbDBsL1liSTQyaQpOWmV0SENvdDJQbXhFV2k5SHdabmNNTkEzNDFsQVl0RjVDOUswVkFaTkh2SHRHSzN2S1dTQjZ6Mk83ekY3NXJ0CkY2YlkwNms3c05vNm84bzBScWxrdjhnQmlybnpqa0RIeHlwY0VjZ3ZXTDBoTVkxUTVualN5OW5uV1JpMmFnc0kKLzJVUUlIMWJxSVo5Z1V1VE5KNFhmZnVhQ0sxWktLRmN3UUorZkxnTGFWMG5zekFrSEgxRkxmdWFZbHA0MjV4ZAp5eThNUU1pUGtTZGFXUUlEQVFBQm8yQXdYakFPQmdOVkhROEJBZjhFQkFNQ0FvUXdIUVlEVlIwbEJCWXdGQVlJCkt3WUJCUVVIQXdJR0NDc0dBUVVGQndNQk1Bd0dBMVVkRXdFQi93UUNNQUF3SHdZRFZSMGpCQmd3Rm9BVXNneU8KOHRqeThaREpLVU5uYjE3dU00U3c4THd3RFFZSktvWklodmNOQVFFTEJRQURnZ0VCQUFwalRtMS82MDlrYml6MAp6c0NvSDZmK3FLNmdLaldYWFpsdFZPM1Z6aFNnL2RSMVpnL1RuczJqdVpvMWpMVzhyMGtLZ3RYZFF5SnRRT2xSCkdlUlRKQ0x1Um1UYTd0ems2QW5ZUkcrSnhSM05tWUJ5NEg5UTJMM0JTZU90TTl5cFVjUlpjcHhiR0NLL1phdlQKZlpTWHJ6NEFnRW9SN1lwb3lUNFZaYlhXR3gzdmlucUF6dWZsekk0Y0JQOHA3YmQrbTNOZERXUlBmNlJ6UmhSSQpncklPK1M4UGZad2ZWUmJTZXVFYkRHSUppNlV0Mzlid3dOTXFuVllxV1czN3k3ZnVRQXVJVCtIK3ZXMXQwd0lyClFORnBPQnB2WTFyRjRuYmx0YkVaYnJrNk1Zc01Rc0ltQlkrcVRXOURFZ1ZkM2thOGgzUmx0TnM1QXlJa21ZenIKUVdKMzdFRT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo="

node_statuses {
name: "kuscia-lite-bob-545c476bd7-rbkbr"
status: "Ready"
version: "v0.10.0b0"
last_heartbeat_time: "2024-09-11T06:44:55Z"
last_transition_time: "2024-09-06T06:34:32Z"
}
deploy_token_statuses {
token: "XEzJjnQqFmQB2zSZlTaRAsZFjpvGkqVF"
state: "used"
last_transition_time: "2024-09-06T06:33:27Z"
}
deploy_token_statuses {
token: "Hz3UmnfNp2uAEYlPW2mt2E3EvFZlvuDD"
state: "unused"
last_transition_time: "2024-09-06T06:34:27Z"
}
annotations {
key: "domain/bob"
value: "kuscia.secretflow/domain-type=embedded"
}
annotations {
key: "kubectl.kubernetes.io/last-applied-configuration"
value: "{\"apiVersion\":\"kuscia.secretflow/v1alpha1\",\"kind\":\"Domain\",\"metadata\":{\"annotations\":{\"domain/bob\":\"kuscia.secretflow/domain-type=embedded\"},\"name\":\"bob\"},\"spec\":{\"authCenter\":{\"authenticationType\":\"Token\",\"tokenGenMethod\":\"UID-RSA-GEN\"},\"cert\":null,\"master\":null,\"role\":null}}\n"
}
auth_center {
authentication_type: "Token"
token_gen_method: "UID-RSA-GEN"
}
}
2024-09-11T14:45:06.702+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.DynamicKusciaChannelProvider : session UserContextDTO(token=880bbbcbd83b485fb79cd581a9594a99, name=zdsc, platformType=CENTER, platformNodeId=kuscia-system, ownerType=CENTER, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=ALL-IN-ONE)
2024-09-11T14:45:06.702+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainService/QueryDomain
2024-09-11T14:45:06.703+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request: domain_id: "alice"
2024-09-11T14:45:06.714+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
message: "success"
}
data {
domain_id: "alice"

cert: "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURBekNDQWV1Z0F3SUJBZ0lCQVRBTkJna3Foa2lHOXcwQkFRc0ZBREFZTVJZd0ZBWURWUVFERXcxcmRYTmoKYVdFdGMzbHpkR1Z0TUI0WERUY3dNREV3TVRBd01EQXdNRm9YRFRnd01ERXdNVEF3TURBd01Gb3dFREVPTUF3RwpBMVVFQXhNRllXeHBZMlV3Z2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLQW9JQkFRREpYN1RZCjQyODVINy9WTTRkdjNEc2RBVkpzV1ZQa3AzSEVydStsS3pSbnRhdTF6TmExNmx5dVQ1UWRKZTBPNGVKWW9DcmEKSUVyN0NSZUVCTFRoZmVzaVVZSHlZUG1tVUw3V0ptRmZ6cTdISUFTRFRMWVp1bjZTUGxDT0VCZjhxWFNYOWhzagpqYUkxbDYwY0tpM1krYkVSYUwwZkJtZHd3MERmaldVQmkwWGtMMHJSVUJrMGU4ZTBZcmU4cFpJSHJQWTd2TVh2Cm11MFhwdGpUcVR1dzJqcWp5alJHcVdTL3lBR0t1Zk9PUU1mSEtsd1J5QzlZdlNFeGpWRG1lTkxMMmVkWkdMWnEKQ3dqL1pSQWdmVnVvaG4yQlM1TTBuaGQ5KzVvSXJWa29vVnpCQW41OHVBdHBYU2V6TUNRY2ZVVXQrNXBpV25qYgpuRjNMTHd4QXlJK1JKMXBaQWdNQkFBR2pZREJlTUE0R0ExVWREd0VCL3dRRUF3SUNoREFkQmdOVkhTVUVGakFVCkJnZ3JCZ0VGQlFjREFnWUlLd1lCQlFVSEF3RXdEQVlEVlIwVEFRSC9CQUl3QURBZkJnTlZIU01FR0RBV2dCU3kKREk3eTJQTHhrTWtwUTJkdlh1NHpoTER3dkRBTkJna3Foa2lHOXcwQkFRc0ZBQU9DQVFFQVJxMW1DNm5lZEV1Zgp5cVd5L0J5STgwbDhiMU8vOFg3T3BUdDJ5SXZwUG9WaFdMV3RnSi9BM2JCa2R3L3VmNFczMkJoWlkweVg0ZE9sCjVBVXkvRGtGY3VIeHhpcm9UeEFMc1lNYWpMd0pBdmVUbFlSb080Rm16Z2FXVHVSN1lZUUVQUXVQNWhZRFZEMXcKaTJKYWJ5T2kyMTJMdUJvMVlzcmNhcy9pV0FhTi9jYWNWS010eThCSnV6a0t5dy9WZ1RjVXRIcERPTWdiY3o0MwpQZ21KbDY1bENlRTNjQWhoQ2pTYTV0M1JmWHBxN2VSNjQzT2Y5SzJCT3pRenVvc0ZoS0h2azdTWWV0dldnMTBFCldCc28yYnFZS2luRHlzak1wbkVHQ0RyMC9YaWtnSUFvS3gyeFhJZXRScG50MDIzc3Q4b01KUFd3Uk9Id0J5aGMKRE92aUZvcFVUUT09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K"

node_statuses {
name: "kuscia-lite-alice-6dd464f48-b5rmm"
status: "Ready"
version: "v0.10.0b0"
last_heartbeat_time: "2024-09-11T06:44:41Z"
last_transition_time: "2024-09-06T06:31:18Z"
}
deploy_token_statuses {
token: "dFMdqgbbpPiAwnuqKwuRZMAA5VJ6hfcv"
state: "used"
last_transition_time: "2024-09-06T06:29:35Z"
}
deploy_token_statuses {
token: "zIUGEgeayul3Shz9rv6pGXcPMIekm9Dr"
state: "unused"
last_transition_time: "2024-09-06T06:31:13Z"
}
annotations {
key: "domain/alice"
value: "kuscia.secretflow/domain-type=embedded"
}
annotations {
key: "kubectl.kubernetes.io/last-applied-configuration"
value: "{\"apiVersion\":\"kuscia.secretflow/v1alpha1\",\"kind\":\"Domain\",\"metadata\":{\"annotations\":{\"domain/alice\":\"kuscia.secretflow/domain-type=embedded\"},\"name\":\"alice\"},\"spec\":{\"authCenter\":{\"authenticationType\":\"Token\",\"tokenGenMethod\":\"UID-RSA-GEN\"},\"cert\":null,\"master\":null,\"role\":null}}\n"
}
auth_center {
authentication_type: "Token"
token_gen_method: "UID-RSA-GEN"
}
}
2024-09-11T14:45:06.715+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.DynamicKusciaChannelProvider : session UserContextDTO(token=880bbbcbd83b485fb79cd581a9594a99, name=zdsc, platformType=CENTER, platformNodeId=kuscia-system, ownerType=CENTER, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=ALL-IN-ONE)
2024-09-11T14:45:06.715+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainRouteService/QueryDomainRoute
2024-09-11T14:45:06.715+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request: destination: "alice"
source: "bob"
2024-09-11T14:45:06.726+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
message: "success"
}
data {
name: "bob-alice"
authentication_type: "Token"
destination: "alice"
endpoint {
host: "10.233.74.148"
ports {
name: "http"
port: 1080
protocol: "HTTP"
5: "/"
}
}
source: "bob"
token_config {

destination_public_key: "LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXlWKzAyT052T1IrLzFUT0hiOXc3SFFGU2JGbFQ1S2R4eEs3dnBTczBaN1dydGN6V3RlcGMKcmsrVUhTWHREdUhpV0tBcTJpQksrd2tYaEFTMDRYM3JJbEdCOG1ENXBsQysxaVpoWDg2dXh5QUVnMHkyR2JwKwprajVRamhBWC9LbDBsL1liSTQyaU5aZXRIQ290MlBteEVXaTlId1puY01OQTM0MWxBWXRGNUM5SzBWQVpOSHZICnRHSzN2S1dTQjZ6Mk83ekY3NXJ0RjZiWTA2azdzTm82bzhvMFJxbGt2OGdCaXJuemprREh4eXBjRWNndldMMGgKTVkxUTVualN5OW5uV1JpMmFnc0kvMlVRSUgxYnFJWjlnVXVUTko0WGZmdWFDSzFaS0tGY3dRSitmTGdMYVYwbgpzekFrSEgxRkxmdWFZbHA0MjV4ZHl5OE1RTWlQa1NkYVdRSURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K"

rolling_update_period: 86400

source_public_key: "LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXlWKzAyT052T1IrLzFUT0hiOXc3SFFGU2JGbFQ1S2R4eEs3dnBTczBaN1dydGN6V3RlcGMKcmsrVUhTWHREdUhpV0tBcTJpQksrd2tYaEFTMDRYM3JJbEdCOG1ENXBsQysxaVpoWDg2dXh5QUVnMHkyR2JwKwprajVRamhBWC9LbDBsL1liSTQyaU5aZXRIQ290MlBteEVXaTlId1puY01OQTM0MWxBWXRGNUM5SzBWQVpOSHZICnRHSzN2S1dTQjZ6Mk83ekY3NXJ0RjZiWTA2azdzTm82bzhvMFJxbGt2OGdCaXJuemprREh4eXBjRWNndldMMGgKTVkxUTVualN5OW5uV1JpMmFnc0kvMlVRSUgxYnFJWjlnVXVUTko0WGZmdWFDSzFaS0tGY3dRSitmTGdMYVYwbgpzekFrSEgxRkxmdWFZbHA0MjV4ZHl5OE1RTWlQa1NkYVdRSURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K"

token_gen_method: "RSA-GEN"

}

status {

status: "Failed"

}

}

2024-09-11T14:45:06.727+08:00 INFO 1 --- [nio-8080-exec-2] o.s.s.m.i.noderoute.NodeRouteManager : DomainRoute.RouteStatus response status {

message: "success"

}

data {

name: "bob-alice"

authentication_type: "Token"

destination: "alice"

endpoint {

host: "10.233.74.148"

ports {

name: "http"

port: 1080

protocol: "HTTP"

5: "/"

}

}

source: "bob"

token_config {

destination_public_key: "LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXlWKzAyT052T1IrLzFUT0hiOXc3SFFGU2JGbFQ1S2R4eEs3dnBTczBaN1dydGN6V3RlcGMKcmsrVUhTWHREdUhpV0tBcTJpQksrd2tYaEFTMDRYM3JJbEdCOG1ENXBsQysxaVpoWDg2dXh5QUVnMHkyR2JwKwprajVRamhBWC9LbDBsL1liSTQyaU5aZXRIQ290MlBteEVXaTlId1puY01OQTM0MWxBWXRGNUM5SzBWQVpOSHZICnRHSzN2S1dTQjZ6Mk83ekY3NXJ0RjZiWTA2azdzTm82bzhvMFJxbGt2OGdCaXJuemprREh4eXBjRWNndldMMGgKTVkxUTVualN5OW5uV1JpMmFnc0kvMlVRSUgxYnFJWjlnVXVUTko0WGZmdWFDSzFaS0tGY3dRSitmTGdMYVYwbgpzekFrSEgxRkxmdWFZbHA0MjV4ZHl5OE1RTWlQa1NkYVdRSURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K"

rolling_update_period: 86400

source_public_key: "LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXlWKzAyT052T1IrLzFUT0hiOXc3SFFGU2JGbFQ1S2R4eEs3dnBTczBaN1dydGN6V3RlcGMKcmsrVUhTWHREdUhpV0tBcTJpQksrd2tYaEFTMDRYM3JJbEdCOG1ENXBsQysxaVpoWDg2dXh5QUVnMHkyR2JwKwprajVRamhBWC9LbDBsL1liSTQyaU5aZXRIQ290MlBteEVXaTlId1puY01OQTM0MWxBWXRGNUM5SzBWQVpOSHZICnRHSzN2S1dTQjZ6Mk83ekY3NXJ0RjZiWTA2azdzTm82bzhvMFJxbGt2OGdCaXJuemprREh4eXBjRWNndldMMGgKTVkxUTVualN5OW5uV1JpMmFnc0kvMlVRSUgxYnFJWjlnVXVUTko0WGZmdWFDSzFaS0tGY3dRSitmTGdMYVYwbgpzekFrSEgxRkxmdWFZbHA0MjV4ZHl5OE1RTWlQa1NkYVdRSURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K"

token_gen_method: "RSA-GEN"

}

status {

status: "Failed"

}

}

2024-09-11T14:45:07.369+08:00 INFO 1 --- [ scheduling-1] o.s.s.k.v.DynamicKusciaChannelProvider : session UserContextDTO(token=null, name=null, platformType=null, platformNodeId=null, ownerType=null, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=null)

2024-09-11T14:45:07.370+08:00 INFO 1 --- [ scheduling-1] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainRouteService/BatchQueryDomainRouteStatus

2024-09-11T14:45:07.370+08:00 INFO 1 --- [ scheduling-1] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request:

2024-09-11T14:45:07.373+08:00 INFO 1 --- [ scheduling-1] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {

code: 11100

message: "DomainRoute keys can not be empty"

}


@wangzul

wangzul commented Sep 11, 2024

Try modifying the data inside the container at /app/db/secretpad.sqlite

I modified the data and tried both the IP and the service name, but it is still shown as unavailable. How should I troubleshoot what is making it unavailable?

Go into the master node and check the route configuration with kubectl get cdr, then view the specific configuration in this format: kubectl get cdr alice-bob -oyaml

@Meng-xiangkun
Author

Try modifying the data inside the container at /app/db/secretpad.sqlite

I modified the data and tried both the IP and the service name, but it is still shown as unavailable. How should I troubleshoot what is making it unavailable?

Go into the master node and check the route configuration with kubectl get cdr, then view the specific configuration in this format: kubectl get cdr alice-bob -oyaml


sh-5.2# kubectl get cdr
NAME                  SOURCE   DESTINATION     HOST            AUTHENTICATION   READY
tee-kuscia-system     tee      kuscia-system                   Token            False
bob-alice             bob      alice           10.233.74.148   Token            False
alice-bob             alice    bob             10.233.37.70    Token            False
bob-kuscia-system     bob      kuscia-system                   Token            True
alice-kuscia-system   alice    kuscia-system                   Token            True
sh-5.2# kubectl get cdr alice-bob -oyaml
apiVersion: kuscia.secretflow/v1alpha1
kind: ClusterDomainRoute
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"kuscia.secretflow/v1alpha1","kind":"ClusterDomainRoute","metadata":{"annotations":{},"name":"alice-bob"},"spec":{"authenticationType":"Token","destination":"bob","endpoint":{"host":"10.233.37.70","ports":[{"isTLS":false,"name":"http","pathPrefix":"/","port":1080,"protocol":"HTTP"}]},"interConnProtocol":"kuscia","requestHeadersToAdd":{"Authorization":"Bearer {{.TOKEN}}"},"source":"alice","tokenConfig":{"rollingUpdatePeriod":86400,"tokenGenMethod":"RSA-GEN"}}}
  creationTimestamp: "2024-09-06T06:36:12Z"
  generation: 4
  labels:
    kuscia.secretflow/clusterdomainroute-destination: bob
    kuscia.secretflow/clusterdomainroute-source: alice
  name: alice-bob
  resourceVersion: "943787"
  uid: 4d17f638-d7d3-44b7-83da-c99998e87b90
spec:
  authenticationType: Token
  destination: bob
  endpoint:
    host: 10.233.37.70
    ports:
    - isTLS: false
      name: http
      pathPrefix: /
      port: 1080
      protocol: HTTP
  interConnProtocol: kuscia
  requestHeadersToAdd:
    Authorization: Bearer {{.TOKEN}}
  source: alice
  tokenConfig:
    destinationPublicKey: LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXlWKzAyT052T1IrLzFUT0hiOXc3SFFGU2JGbFQ1S2R4eEs3dnBTczBaN1dydGN6V3RlcGMKcmsrVUhTWHREdUhpV0tBcTJpQksrd2tYaEFTMDRYM3JJbEdCOG1ENXBsQysxaVpoWDg2dXh5QUVnMHkyR2JwKwprajVRamhBWC9LbDBsL1liSTQyaU5aZXRIQ290MlBteEVXaTlId1puY01OQTM0MWxBWXRGNUM5SzBWQVpOSHZICnRHSzN2S1dTQjZ6Mk83ekY3NXJ0RjZiWTA2azdzTm82bzhvMFJxbGt2OGdCaXJuemprREh4eXBjRWNndldMMGgKTVkxUTVualN5OW5uV1JpMmFnc0kvMlVRSUgxYnFJWjlnVXVUTko0WGZmdWFDSzFaS0tGY3dRSitmTGdMYVYwbgpzekFrSEgxRkxmdWFZbHA0MjV4ZHl5OE1RTWlQa1NkYVdRSURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K
    rollingUpdatePeriod: 86400
    sourcePublicKey: LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXlWKzAyT052T1IrLzFUT0hiOXc3SFFGU2JGbFQ1S2R4eEs3dnBTczBaN1dydGN6V3RlcGMKcmsrVUhTWHREdUhpV0tBcTJpQksrd2tYaEFTMDRYM3JJbEdCOG1ENXBsQysxaVpoWDg2dXh5QUVnMHkyR2JwKwprajVRamhBWC9LbDBsL1liSTQyaU5aZXRIQ290MlBteEVXaTlId1puY01OQTM0MWxBWXRGNUM5SzBWQVpOSHZICnRHSzN2S1dTQjZ6Mk83ekY3NXJ0RjZiWTA2azdzTm82bzhvMFJxbGt2OGdCaXJuemprREh4eXBjRWNndldMMGgKTVkxUTVualN5OW5uV1JpMmFnc0kvMlVRSUgxYnFJWjlnVXVUTko0WGZmdWFDSzFaS0tGY3dRSitmTGdMYVYwbgpzekFrSEgxRkxmdWFZbHA0MjV4ZHl5OE1RTWlQa1NkYVdRSURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K
    tokenGenMethod: RSA-GEN
status:
  conditions:
  - lastTransitionTime: "2024-09-11T08:46:45Z"
    lastUpdateTime: "2024-09-11T08:46:45Z"
    message: TokenNotGenerate
    reason: DestinationIsNotAuthrized
    status: "False"
    type: Ready
  tokenStatus: {}

@wangzul

wangzul commented Sep 11, 2024

  1. Take a look at the Configmap configuration files for alice and bob
  2. Re-configure the routing according to the documentation: https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.10.0b0/deployment/K8s_deployment_kuscia/K8s_master_lite_cn#lite-alicelite-bob

For the IP, curl -kvvv http://xxxx:1080/ should return 401.
From the master node you can ping the xxx in the lite node's DNS route entry above, and have alice ping bob, to test whether the nodes can communicate normally.
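A minimal sketch of those connectivity checks, using the endpoint hosts from the route dump above (10.233.74.148 for alice, 10.233.37.70 for bob); substitute your own addresses:

# Gateway reachability: a 401 from curl means the peer gateway is reachable
# but the bare request is unauthenticated, which is the expected result here.
curl -kvvv http://10.233.74.148:1080/   # alice gateway (host from the bob-alice route)
curl -kvvv http://10.233.37.70:1080/    # bob gateway (host from the alice-bob route)

# Basic network connectivity, run inside the alice and bob pods respectively.
ping -c 3 10.233.37.70                  # alice -> bob
ping -c 3 10.233.74.148                 # bob -> alice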

@Meng-xiangkun
Author

  1. Take a look at the Configmap configuration files for alice and bob
  2. Re-configure the routing according to the documentation: https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.10.0b0/deployment/K8s_deployment_kuscia/K8s_master_lite_cn#lite-alicelite-bob

For the IP, curl -kvvv http://xxxx:1080/ should return 401. From the master node you can ping the xxx in the lite node's DNS route entry above, and have alice ping bob, to test whether the nodes can communicate normally.

alice's Configmap:

# Startup mode
mode: lite
# Node ID
# Example: domainID: alice
domainID: alice
# Node private key, used for authenticating inter-node communication (the identity token for communication is generated from the two parties' certificates) and for issuing certificates to node applications (to strengthen communication security, kuscia assigns an MTLS certificate to every task engine; whether the engine accesses other modules (including external ones) or other modules access the engine, all traffic goes over MTLS, so that the engine cannot be compromised from inside)
# Note: currently the node private key only supports the pkcs#1 format: "BEGIN RSA PRIVATE KEY/END RSA PRIVATE KEY"
# Run "docker run -it --rm secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia scripts/deploy/generate_rsa_key.sh" to generate the private key
domainKeyData: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2UUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktjd2dnU2pBZ0VBQW9JQkFRREpYN1RZNDI4NUg3L1YKTTRkdjNEc2RBVkpzV1ZQa3AzSEVydStsS3pSbnRhdTF6TmExNmx5dVQ1UWRKZTBPNGVKWW9DcmFJRXI3Q1JlRQpCTFRoZmVzaVVZSHlZUG1tVUw3V0ptRmZ6cTdISUFTRFRMWVp1bjZTUGxDT0VCZjhxWFNYOWhzamphSTFsNjBjCktpM1krYkVSYUwwZkJtZHd3MERmaldVQmkwWGtMMHJSVUJrMGU4ZTBZcmU4cFpJSHJQWTd2TVh2bXUwWHB0alQKcVR1dzJqcWp5alJHcVdTL3lBR0t1Zk9PUU1mSEtsd1J5QzlZdlNFeGpWRG1lTkxMMmVkWkdMWnFDd2ovWlJBZwpmVnVvaG4yQlM1TTBuaGQ5KzVvSXJWa29vVnpCQW41OHVBdHBYU2V6TUNRY2ZVVXQrNXBpV25qYm5GM0xMd3hBCnlJK1JKMXBaQWdNQkFBRUNnZ0VBSDkwVy9xS3VQTG03WHY3eVZVN2h3NnNyNFowWTJ6dHJreFdqTWQxdVEyTEoKc3RDZ3dOUStxZzVKZjNzNjBYb0ltTUZ2Um1pSnRNTXhoMkEvUnRibjE5eFIxWXBtdGx4Y2RnSklzaUpBSVozOQpXTkZRbHkyZFRZS3l1R2Z2ZzdsRWk2OFRpRUtuQWhmbittYnFMa1VFTVo4REhkK2ppb0k2eDZUVjhMS2E4b29KCkx2QWNDWkY5dlEvVHlQYlFBRUF0MGNBOXJFNmxTRExQc3hWTWR5VUtzN2FhYk5mS29RUzdKSEJ1eFVZSkZJcWsKcGUwdGJUK3pOaHBzT2I0LzJYS2VxY0RSdzdudFNBaFV0ck5RZ1diRzV5SG1YQ1JWS1pCQ3NrckMvQjdtME9tQwpsTVRHSUxiU1U2Z2xRY2NUSkZrQVFBV3JkU2FWUjNOK09QTjhXOVZ4YVFLQmdRRHhGMkZCQVN0dHhDa2Q2Q1ArCmgvMzZvNEpWc3h3V3RLU1Z0WFNqYTZ5Zk1WNS9MYXVZdmRsaTZoMVE5QjAwVVdhU0tQYjhNeGgybE94dFNCNTIKbG0vcVBqdGJyY1hHaWJxaVpXcFJ1b0d3a3c5V2JVZDdPQkdvb2pyV29BS2hKVzM4TlFCUlFNYWVaSEFCdzNvUwoyTjVLd0IvbVJXVVB4Nm83SnBPb3JoNlZod0tCZ1FEVjA1TTdzZ1JpRWtEOGFLa05CNEUyVFJTdW9XZ0poRHdVCnFSRk4ycGYxK285TlZDODdoWWNIM0xXak02dHhPdXMxWVgxVXFUSHBhMXp4aWFka2RpRjA3S29FcWh2Y0tNMGUKbkFTWGtGTitiZkdscFhPQ3pKR2JvQlJHT2lzNXoybjJNNWJmTTNuZnpESTJpeEdYUS9wOCszOWN2KzkweFZiQwplaGk2RXFLSkh3S0JnRUw5UGhhejNuOVhmQjFGUFlzaCtsNUVSSmpQZGNTUldSSUlJMnF0Sm4vdFZkWjh1Q3R1CnhSS0kvckJaeEN1ZldxTE9JeUtjaC9XYkY3NmR4V2txRDlyRWcvWExhU0xyYmlKbGo0ODZCWU1zdVp4SUxRNTkKMjlwQmladk5SaTNFbXJUemZTMFdsSm02U3EwU3hiNnE1OGxaYlFPczBKSDc1cjhjenZhVnV3WE5Bb0dBWHVBawo2UXpnNHY4RWRMcWZuOWRmbnM5dXlObDNSeG0wYXRwbGdpem0xazdadk04SXNobGFROFBMbUdGNXhhRUY4a2FTCmpMa1NHMmIyODNsSG04ektwWTNKRm83QUU5ekt2clV0V0c3Q2pVdU5PQm1FZWxuNGxadmV3eFpXVGExWmI5T08KTXZVdE0zN3dITUZ5Q2JNdzlybkUxa3VYblRGZWdLWWFTSjJ5SHJNQ2dZRUF1U2wyeWZ0UWwxUStESjRBV0JIOQpmSElvMGJ6SzFwZkt6Rzl5RHluRkFtS1c5aTNvYVBHZjlYQW5NVFhhaW9iem1sdy9zWWozTmpoeUlVT3p6VDVJCmVmT1d5NWMvRmNERDZweXFGRFhnSUNkSjg2TmwyajFmU0RaaXpvNCtMVXJXNnBMSHNrTVk0L0dJeGwyRWpGYjAKVFhscHZMYlBSOFExUHdvOWR1elRvWFU9Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K

# Log level: INFO, DEBUG, WARN
logLevel: INFO

# master
# Deploy token the node uses to connect to the master and register its certificate; it is only valid the first time the node registers its certificate with the master
liteDeployToken: dFMdqgbbpPiAwnuqKwuRZMAA5VJ6hfcv
# Address the node uses to connect to the master
# Example: http://kuscia-master.kuscia-master.svc.cluster.local:1080
masterEndpoint: http://kuscia-master.data-develop-operate-dev.svc.cluster.local:1080
    
# runc or runk
runtime: runp

# Capacity the node can use for scheduling applications; with runc, leaving this empty auto-detects the current container's resources; in runk mode it must be configured manually
capacity:
  cpu: 4
  memory: 4Gi
  pods: 500
  storage: 100Gi

# Communication protocol used by the KusciaAPI and the node's external gateway: NOTLS/TLS/MTLS
protocol: NOTLS

# Agent image configuration; set this when images are stored in a private registry (no configuration needed by default)
image:
  pullPolicy: # use an image registry | use local images
  defaultRegistry: ""
  registries:
    - name: ""
      endpoint: ""
      username: ""
      password: ""

bob's Configmap:

# Startup mode
mode: lite
# Node ID
# Example: domainID: bob
domainID: bob
# Node private key, used for authenticating inter-node communication (the identity token for communication is generated from the two parties' certificates) and for issuing certificates to node applications (to strengthen communication security, kuscia assigns an MTLS certificate to every task engine; whether the engine accesses other modules (including external ones) or other modules access the engine, all traffic goes over MTLS, so that the engine cannot be compromised from inside)
# Note: currently the node private key only supports the pkcs#1 format: "BEGIN RSA PRIVATE KEY/END RSA PRIVATE KEY"
# Run "docker run -it --rm secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia scripts/deploy/generate_rsa_key.sh" to generate the private key
domainKeyData: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2UUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktjd2dnU2pBZ0VBQW9JQkFRREpYN1RZNDI4NUg3L1YKTTRkdjNEc2RBVkpzV1ZQa3AzSEVydStsS3pSbnRhdTF6TmExNmx5dVQ1UWRKZTBPNGVKWW9DcmFJRXI3Q1JlRQpCTFRoZmVzaVVZSHlZUG1tVUw3V0ptRmZ6cTdISUFTRFRMWVp1bjZTUGxDT0VCZjhxWFNYOWhzamphSTFsNjBjCktpM1krYkVSYUwwZkJtZHd3MERmaldVQmkwWGtMMHJSVUJrMGU4ZTBZcmU4cFpJSHJQWTd2TVh2bXUwWHB0alQKcVR1dzJqcWp5alJHcVdTL3lBR0t1Zk9PUU1mSEtsd1J5QzlZdlNFeGpWRG1lTkxMMmVkWkdMWnFDd2ovWlJBZwpmVnVvaG4yQlM1TTBuaGQ5KzVvSXJWa29vVnpCQW41OHVBdHBYU2V6TUNRY2ZVVXQrNXBpV25qYm5GM0xMd3hBCnlJK1JKMXBaQWdNQkFBRUNnZ0VBSDkwVy9xS3VQTG03WHY3eVZVN2h3NnNyNFowWTJ6dHJreFdqTWQxdVEyTEoKc3RDZ3dOUStxZzVKZjNzNjBYb0ltTUZ2Um1pSnRNTXhoMkEvUnRibjE5eFIxWXBtdGx4Y2RnSklzaUpBSVozOQpXTkZRbHkyZFRZS3l1R2Z2ZzdsRWk2OFRpRUtuQWhmbittYnFMa1VFTVo4REhkK2ppb0k2eDZUVjhMS2E4b29KCkx2QWNDWkY5dlEvVHlQYlFBRUF0MGNBOXJFNmxTRExQc3hWTWR5VUtzN2FhYk5mS29RUzdKSEJ1eFVZSkZJcWsKcGUwdGJUK3pOaHBzT2I0LzJYS2VxY0RSdzdudFNBaFV0ck5RZ1diRzV5SG1YQ1JWS1pCQ3NrckMvQjdtME9tQwpsTVRHSUxiU1U2Z2xRY2NUSkZrQVFBV3JkU2FWUjNOK09QTjhXOVZ4YVFLQmdRRHhGMkZCQVN0dHhDa2Q2Q1ArCmgvMzZvNEpWc3h3V3RLU1Z0WFNqYTZ5Zk1WNS9MYXVZdmRsaTZoMVE5QjAwVVdhU0tQYjhNeGgybE94dFNCNTIKbG0vcVBqdGJyY1hHaWJxaVpXcFJ1b0d3a3c5V2JVZDdPQkdvb2pyV29BS2hKVzM4TlFCUlFNYWVaSEFCdzNvUwoyTjVLd0IvbVJXVVB4Nm83SnBPb3JoNlZod0tCZ1FEVjA1TTdzZ1JpRWtEOGFLa05CNEUyVFJTdW9XZ0poRHdVCnFSRk4ycGYxK285TlZDODdoWWNIM0xXak02dHhPdXMxWVgxVXFUSHBhMXp4aWFka2RpRjA3S29FcWh2Y0tNMGUKbkFTWGtGTitiZkdscFhPQ3pKR2JvQlJHT2lzNXoybjJNNWJmTTNuZnpESTJpeEdYUS9wOCszOWN2KzkweFZiQwplaGk2RXFLSkh3S0JnRUw5UGhhejNuOVhmQjFGUFlzaCtsNUVSSmpQZGNTUldSSUlJMnF0Sm4vdFZkWjh1Q3R1CnhSS0kvckJaeEN1ZldxTE9JeUtjaC9XYkY3NmR4V2txRDlyRWcvWExhU0xyYmlKbGo0ODZCWU1zdVp4SUxRNTkKMjlwQmladk5SaTNFbXJUemZTMFdsSm02U3EwU3hiNnE1OGxaYlFPczBKSDc1cjhjenZhVnV3WE5Bb0dBWHVBawo2UXpnNHY4RWRMcWZuOWRmbnM5dXlObDNSeG0wYXRwbGdpem0xazdadk04SXNobGFROFBMbUdGNXhhRUY4a2FTCmpMa1NHMmIyODNsSG04ektwWTNKRm83QUU5ekt2clV0V0c3Q2pVdU5PQm1FZWxuNGxadmV3eFpXVGExWmI5T08KTXZVdE0zN3dITUZ5Q2JNdzlybkUxa3VYblRGZWdLWWFTSjJ5SHJNQ2dZRUF1U2wyeWZ0UWwxUStESjRBV0JIOQpmSElvMGJ6SzFwZkt6Rzl5RHluRkFtS1c5aTNvYVBHZjlYQW5NVFhhaW9iem1sdy9zWWozTmpoeUlVT3p6VDVJCmVmT1d5NWMvRmNERDZweXFGRFhnSUNkSjg2TmwyajFmU0RaaXpvNCtMVXJXNnBMSHNrTVk0L0dJeGwyRWpGYjAKVFhscHZMYlBSOFExUHdvOWR1elRvWFU9Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K

# Log level: INFO, DEBUG, WARN
logLevel: INFO

# master
# Deploy token the node uses to connect to the master and register its certificate; it is only valid the first time the node registers its certificate with the master
liteDeployToken: XEzJjnQqFmQB2zSZlTaRAsZFjpvGkqVF
# Address the node uses to connect to the master
# Example: http://kuscia-master.kuscia-master.svc.cluster.local:1080
masterEndpoint: http://kuscia-master.data-develop-operate-dev.svc.cluster.local:1080
    
# runc or runk
runtime: runp

# Capacity the node can use for scheduling applications; with runc, leaving this empty auto-detects the current container's resources; in runk mode it must be configured manually
capacity:
  cpu: 4
  memory: 4Gi
  pods: 500
  storage: 100Gi

# Communication protocol used by the KusciaAPI and the node's external gateway: NOTLS/TLS/MTLS
protocol: NOTLS

# Agent image configuration; set this when images are stored in a private registry (no configuration needed by default)
image:
  pullPolicy: # use an image registry | use local images
  defaultRegistry: ""
  registries:
    - name: ""
      endpoint: ""
      username: ""
      password: ""

@Meng-xiangkun
Author


Communication is also working normally.

@wangzul

wangzul commented Sep 11, 2024


I see that your route configuration uses the concrete IP 10.233.74.148.
Verify it with curl -kvvv http://10.233.74.148:1080. That said, I recommend re-configuring the route authorization according to the documentation.
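If you do re-do the authorization, one possible sketch is to point the routes at stable Service DNS names instead of pod IPs. The Service names below are assumptions, not taken from this thread; replace them with the actual Service names of your lite deployments from the linked documentation:

# Hypothetical example: switch the bob -> alice route endpoint from the pod IP
# 10.233.74.148 to an assumed Service DNS name of the alice lite deployment.
kubectl patch cdr bob-alice --type merge -p \
  '{"spec":{"endpoint":{"host":"kuscia-lite-alice.data-develop-operate-dev.svc.cluster.local"}}}'

# Same idea for the alice -> bob route (Service name again assumed).
kubectl patch cdr alice-bob --type merge -p \
  '{"spec":{"endpoint":{"host":"kuscia-lite-bob.data-develop-operate-dev.svc.cluster.local"}}}'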

@Meng-xiangkun
Author


2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: each job status

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status task_id: "emzj-ubryppxk-node-3"

state: "Pending"

create_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status emzj-ubryppxk-node-3 INITIALIZED task_id: "emzj-ubryppxk-node-3"

state: "Pending"

create_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: sync result ProjectTaskDO(upk=ProjectTaskDO.UPK(projectId=irnyogit, jobId=emzj, taskId=emzj-ubryppxk-node-3), parties=[bob, alice], status=INITIALIZED, errMsg=, graphNodeId=ubryppxk-node-3, graphNode=ProjectGraphNodeDO(upk=ProjectGraphNodeDO.UPK(projectId=irnyogit, graphId=ubryppxk, graphNodeId=ubryppxk-node-3), codeName=data_prep/psi, label=隐私求交, x=-260, y=-100, inputs=[ubryppxk-node-1-output-0, ubryppxk-node-2-output-0], outputs=[ubryppxk-node-3-output-0], nodeDef={attrPaths=[input/receiver_input/key, input/sender_input/key, protocol, sort_result, allow_duplicate_keys, allow_duplicate_keys/no/skip_duplicates_check, fill_value_int, ecdh_curve], attrs=[{is_na=false, ss=[id1]}, {is_na=false, ss=[id2]}, {is_na=false, s=PROTOCOL_RR22}, {b=true, is_na=false}, {is_na=false, s=no}, {is_na=true}, {is_na=true}, {is_na=false, s=CURVE_FOURQ}], domain=data_prep, name=psi, version=0.0.5}))

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status emzj-ubryppxk-node-4 INITIALIZED task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: sync result ProjectTaskDO(upk=ProjectTaskDO.UPK(projectId=irnyogit, jobId=emzj, taskId=emzj-ubryppxk-node-4), parties=[bob, alice], status=INITIALIZED, errMsg=, graphNodeId=ubryppxk-node-4, graphNode=ProjectGraphNodeDO(upk=ProjectGraphNodeDO.UPK(projectId=irnyogit, graphId=ubryppxk, graphNodeId=ubryppxk-node-4), codeName=stats/table_statistics, label=全表统计, x=-260, y=20, inputs=[ubryppxk-node-3-output-0], outputs=[ubryppxk-node-4-output-0], nodeDef={attrPaths=[input/input_data/features], attrs=[{is_na=false, ss=[contact_cellular]}], domain=stats, name=table_statistics, version=0.0.2}))

2024-09-12T14:34:28.170+08:00 INFO 1 --- [lt-executor-190] o.s.s.s.l.JobTaskLogEventListener : *** JobTaskLogEventListener emzj-ubryppxk-node-3 INITIALIZED INITIALIZED

2024-09-12T14:34:28.170+08:00 INFO 1 --- [lt-executor-190] o.s.s.s.l.JobTaskLogEventListener : *** JobTaskLogEventListener emzj-ubryppxk-node-4 INITIALIZED INITIALIZED

2024-09-12T14:34:28.270+08:00 INFO 1 --- [lt-executor-190] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: type: MODIFIED

object {

job_id: "emzj"

status {

state: "Failed"

create_time: "2024-09-12T06:34:27Z"

start_time: "2024-09-12T06:34:28Z"

tasks {

task_id: "emzj-ubryppxk-node-3"

state: "Failed"

err_msg: "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found"

create_time: "2024-09-12T06:34:28Z"

start_time: "2024-09-12T06:34:28Z"

end_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

}

tasks {

task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

}

stage_status_list {

domain_id: "alice"

state: "JobCreateStageSucceeded"

}

stage_status_list {

domain_id: "bob"

state: "JobCreateStageSucceeded"

}

approve_status_list {

domain_id: "alice"

state: "JobAccepted"

}

approve_status_list {

domain_id: "bob"

state: "JobAccepted"

}

}

}

2024-09-12T14:34:28.271+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : starter jobEvent ... type: MODIFIED

object {

job_id: "emzj"

status {

state: "Failed"

create_time: "2024-09-12T06:34:27Z"

start_time: "2024-09-12T06:34:28Z"

tasks {

task_id: "emzj-ubryppxk-node-3"

state: "Failed"

err_msg: "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found"

create_time: "2024-09-12T06:34:28Z"

start_time: "2024-09-12T06:34:28Z"

end_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

}

tasks {

task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

}

stage_status_list {

domain_id: "alice"

state: "JobCreateStageSucceeded"

}

stage_status_list {

domain_id: "bob"

state: "JobCreateStageSucceeded"

}

approve_status_list {

domain_id: "alice"

state: "JobAccepted"

}

approve_status_list {

domain_id: "bob"

state: "JobAccepted"

}

}

}

2024-09-12T14:34:28.271+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: jobId=emzj, jobState=Failed, task=[taskId=emzj-ubryppxk-node-3,alias=emzj-ubryppxk-node-3,state=Failed|taskId=emzj-ubryppxk-node-4,alias=emzj-ubryppxk-node-4,state=Pending], endTime=

2024-09-12T14:34:28.282+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: update job: it={

"type": "MODIFIED",

"object": {

"job_id": "emzj",

"status": {

"state": "Failed",

"create_time": "2024-09-12T06:34:27Z",

"start_time": "2024-09-12T06:34:28Z",

"tasks": [{

"task_id": "emzj-ubryppxk-node-3",

"state": "Failed",

"err_msg": "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found",

"create_time": "2024-09-12T06:34:28Z",

"start_time": "2024-09-12T06:34:28Z",

"end_time": "2024-09-12T06:34:28Z",

"alias": "emzj-ubryppxk-node-3"

}, {

"task_id": "emzj-ubryppxk-node-4",

"state": "Pending",

"alias": "emzj-ubryppxk-node-4"

}],

"stage_status_list": [{

"domain_id": "alice",

"state": "JobCreateStageSucceeded"

}, {

"domain_id": "bob",

"state": "JobCreateStageSucceeded"

}],

"approve_status_list": [{

"domain_id": "alice",

"state": "JobAccepted"

}, {

"domain_id": "bob",

"state": "JobAccepted"

}]

}

}

}

How do I integrate secretflow into this setup?

@wangzul

wangzul commented Sep 12, 2024

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: each job status

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status task_id: "emzj-ubryppxk-node-3"

state: "Pending"

create_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status emzj-ubryppxk-node-3 INITIALIZED task_id: "emzj-ubryppxk-node-3"

state: "Pending"

create_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: sync result ProjectTaskDO(upk=ProjectTaskDO.UPK(projectId=irnyogit, jobId=emzj, taskId=emzj-ubryppxk-node-3), parties=[bob, alice], status=INITIALIZED, errMsg=, graphNodeId=ubryppxk-node-3, graphNode=ProjectGraphNodeDO(upk=ProjectGraphNodeDO.UPK(projectId=irnyogit, graphId=ubryppxk, graphNodeId=ubryppxk-node-3), codeName=data_prep/psi, label=隐私求交, x=-260, y=-100, inputs=[ubryppxk-node-1-output-0, ubryppxk-node-2-output-0], outputs=[ubryppxk-node-3-output-0], nodeDef={attrPaths=[input/receiver_input/key, input/sender_input/key, protocol, sort_result, allow_duplicate_keys, allow_duplicate_keys/no/skip_duplicates_check, fill_value_int, ecdh_curve], attrs=[{is_na=false, ss=[id1]}, {is_na=false, ss=[id2]}, {is_na=false, s=PROTOCOL_RR22}, {b=true, is_na=false}, {is_na=false, s=no}, {is_na=true}, {is_na=true}, {is_na=false, s=CURVE_FOURQ}], domain=data_prep, name=psi, version=0.0.5}))

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status emzj-ubryppxk-node-4 INITIALIZED task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: sync result ProjectTaskDO(upk=ProjectTaskDO.UPK(projectId=irnyogit, jobId=emzj, taskId=emzj-ubryppxk-node-4), parties=[bob, alice], status=INITIALIZED, errMsg=, graphNodeId=ubryppxk-node-4, graphNode=ProjectGraphNodeDO(upk=ProjectGraphNodeDO.UPK(projectId=irnyogit, graphId=ubryppxk, graphNodeId=ubryppxk-node-4), codeName=stats/table_statistics, label=全表统计, x=-260, y=20, inputs=[ubryppxk-node-3-output-0], outputs=[ubryppxk-node-4-output-0], nodeDef={attrPaths=[input/input_data/features], attrs=[{is_na=false, ss=[contact_cellular]}], domain=stats, name=table_statistics, version=0.0.2}))

2024-09-12T14:34:28.170+08:00 INFO 1 --- [lt-executor-190] o.s.s.s.l.JobTaskLogEventListener : *** JobTaskLogEventListener emzj-ubryppxk-node-3 INITIALIZED INITIALIZED

2024-09-12T14:34:28.170+08:00 INFO 1 --- [lt-executor-190] o.s.s.s.l.JobTaskLogEventListener : *** JobTaskLogEventListener emzj-ubryppxk-node-4 INITIALIZED INITIALIZED

2024-09-12T14:34:28.270+08:00 INFO 1 --- [lt-executor-190] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: type: MODIFIED

object {

job_id: "emzj"

status {

state: "Failed"

create_time: "2024-09-12T06:34:27Z"

start_time: "2024-09-12T06:34:28Z"

tasks {

task_id: "emzj-ubryppxk-node-3"

state: "Failed"

err_msg: "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found"

create_time: "2024-09-12T06:34:28Z"

start_time: "2024-09-12T06:34:28Z"

end_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

}

tasks {

task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

}

stage_status_list {

domain_id: "alice"

state: "JobCreateStageSucceeded"

}

stage_status_list {

domain_id: "bob"

state: "JobCreateStageSucceeded"

}

approve_status_list {

domain_id: "alice"

state: "JobAccepted"

}

approve_status_list {

domain_id: "bob"

state: "JobAccepted"

}

}

}

2024-09-12T14:34:28.271+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : starter jobEvent ... type: MODIFIED

object {

job_id: "emzj"

status {

state: "Failed"

create_time: "2024-09-12T06:34:27Z"

start_time: "2024-09-12T06:34:28Z"

tasks {

task_id: "emzj-ubryppxk-node-3"

state: "Failed"

err_msg: "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found"

create_time: "2024-09-12T06:34:28Z"

start_time: "2024-09-12T06:34:28Z"

end_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

}

tasks {

task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

}

stage_status_list {

domain_id: "alice"

state: "JobCreateStageSucceeded"

}

stage_status_list {

domain_id: "bob"

state: "JobCreateStageSucceeded"

}

approve_status_list {

domain_id: "alice"

state: "JobAccepted"

}

approve_status_list {

domain_id: "bob"

state: "JobAccepted"

}

}

}

2024-09-12T14:34:28.271+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: jobId=emzj, jobState=Failed, task=[taskId=emzj-ubryppxk-node-3,alias=emzj-ubryppxk-node-3,state=Failed|taskId=emzj-ubryppxk-node-4,alias=emzj-ubryppxk-node-4,state=Pending], endTime=

2024-09-12T14:34:28.282+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: update job: it={

"type": "MODIFIED",

"object": {

"job_id": "emzj",

"status": {

"state": "Failed",

"create_time": "2024-09-12T06:34:27Z",

"start_time": "2024-09-12T06:34:28Z",

"tasks": [{

"task_id": "emzj-ubryppxk-node-3",

"state": "Failed",

"err_msg": "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found",

"create_time": "2024-09-12T06:34:28Z",

"start_time": "2024-09-12T06:34:28Z",

"end_time": "2024-09-12T06:34:28Z",

"alias": "emzj-ubryppxk-node-3"

}, {

"task_id": "emzj-ubryppxk-node-4",

"state": "Pending",

"alias": "emzj-ubryppxk-node-4"

}],

"stage_status_list": [{

"domain_id": "alice",

"state": "JobCreateStageSucceeded"

}, {

"domain_id": "bob",

"state": "JobCreateStageSucceeded"

}],

"approve_status_list": [{

"domain_id": "alice",

"state": "JobAccepted"

}, {

"domain_id": "bob",

"state": "JobAccepted"

}]

}

}

}

请问怎么把secretflow集成进来啊

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: each job status

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status task_id: "emzj-ubryppxk-node-3"

state: "Pending"

create_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status emzj-ubryppxk-node-3 INITIALIZED task_id: "emzj-ubryppxk-node-3"

state: "Pending"

create_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: sync result ProjectTaskDO(upk=ProjectTaskDO.UPK(projectId=irnyogit, jobId=emzj, taskId=emzj-ubryppxk-node-3), parties=[bob, alice], status=INITIALIZED, errMsg=, graphNodeId=ubryppxk-node-3, graphNode=ProjectGraphNodeDO(upk=ProjectGraphNodeDO.UPK(projectId=irnyogit, graphId=ubryppxk, graphNodeId=ubryppxk-node-3), codeName=data_prep/psi, label=隐私求交, x=-260, y=-100, inputs=[ubryppxk-node-1-output-0, ubryppxk-node-2-output-0], outputs=[ubryppxk-node-3-output-0], nodeDef={attrPaths=[input/receiver_input/key, input/sender_input/key, protocol, sort_result, allow_duplicate_keys, allow_duplicate_keys/no/skip_duplicates_check, fill_value_int, ecdh_curve], attrs=[{is_na=false, ss=[id1]}, {is_na=false, ss=[id2]}, {is_na=false, s=PROTOCOL_RR22}, {b=true, is_na=false}, {is_na=false, s=no}, {is_na=true}, {is_na=true}, {is_na=false, s=CURVE_FOURQ}], domain=data_prep, name=psi, version=0.0.5}))

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status emzj-ubryppxk-node-4 INITIALIZED task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: sync result ProjectTaskDO(upk=ProjectTaskDO.UPK(projectId=irnyogit, jobId=emzj, taskId=emzj-ubryppxk-node-4), parties=[bob, alice], status=INITIALIZED, errMsg=, graphNodeId=ubryppxk-node-4, graphNode=ProjectGraphNodeDO(upk=ProjectGraphNodeDO.UPK(projectId=irnyogit, graphId=ubryppxk, graphNodeId=ubryppxk-node-4), codeName=stats/table_statistics, label=全表统计, x=-260, y=20, inputs=[ubryppxk-node-3-output-0], outputs=[ubryppxk-node-4-output-0], nodeDef={attrPaths=[input/input_data/features], attrs=[{is_na=false, ss=[contact_cellular]}], domain=stats, name=table_statistics, version=0.0.2}))

2024-09-12T14:34:28.170+08:00 INFO 1 --- [lt-executor-190] o.s.s.s.l.JobTaskLogEventListener : *** JobTaskLogEventListener emzj-ubryppxk-node-3 INITIALIZED INITIALIZED

2024-09-12T14:34:28.170+08:00 INFO 1 --- [lt-executor-190] o.s.s.s.l.JobTaskLogEventListener : *** JobTaskLogEventListener emzj-ubryppxk-node-4 INITIALIZED INITIALIZED

2024-09-12T14:34:28.270+08:00 INFO 1 --- [lt-executor-190] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: type: MODIFIED

object {

job_id: "emzj"

status {

state: "Failed"

create_time: "2024-09-12T06:34:27Z"

start_time: "2024-09-12T06:34:28Z"

tasks {

task_id: "emzj-ubryppxk-node-3"

state: "Failed"

err_msg: "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found"

create_time: "2024-09-12T06:34:28Z"

start_time: "2024-09-12T06:34:28Z"

end_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

}

tasks {

task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

}

stage_status_list {

domain_id: "alice"

state: "JobCreateStageSucceeded"

}

stage_status_list {

domain_id: "bob"

state: "JobCreateStageSucceeded"

}

approve_status_list {

domain_id: "alice"

state: "JobAccepted"

}

approve_status_list {

domain_id: "bob"

state: "JobAccepted"

}

}

}

2024-09-12T14:34:28.271+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : starter jobEvent ... type: MODIFIED

object {

job_id: "emzj"

status {

state: "Failed"

create_time: "2024-09-12T06:34:27Z"

start_time: "2024-09-12T06:34:28Z"

tasks {

task_id: "emzj-ubryppxk-node-3"

state: "Failed"

err_msg: "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found"

create_time: "2024-09-12T06:34:28Z"

start_time: "2024-09-12T06:34:28Z"

end_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

}

tasks {

task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

}

stage_status_list {

domain_id: "alice"

state: "JobCreateStageSucceeded"

}

stage_status_list {

domain_id: "bob"

state: "JobCreateStageSucceeded"

}

approve_status_list {

domain_id: "alice"

state: "JobAccepted"

}

approve_status_list {

domain_id: "bob"

state: "JobAccepted"

}

}

}

2024-09-12T14:34:28.271+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: jobId=emzj, jobState=Failed, task=[taskId=emzj-ubryppxk-node-3,alias=emzj-ubryppxk-node-3,state=Failed|taskId=emzj-ubryppxk-node-4,alias=emzj-ubryppxk-node-4,state=Pending], endTime=
2024-09-12T14:34:28.282+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: update job: it={
  "type": "MODIFIED",
  "object": {
    "job_id": "emzj",
    "status": {
      "state": "Failed",
      "create_time": "2024-09-12T06:34:27Z",
      "start_time": "2024-09-12T06:34:28Z",
      "tasks": [{
        "task_id": "emzj-ubryppxk-node-3",
        "state": "Failed",
        "err_msg": "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found",
        "create_time": "2024-09-12T06:34:28Z",
        "start_time": "2024-09-12T06:34:28Z",
        "end_time": "2024-09-12T06:34:28Z",
        "alias": "emzj-ubryppxk-node-3"
      }, {
        "task_id": "emzj-ubryppxk-node-4",
        "state": "Pending",
        "alias": "emzj-ubryppxk-node-4"
      }],
      "stage_status_list": [{
        "domain_id": "alice",
        "state": "JobCreateStageSucceeded"
      }, {
        "domain_id": "bob",
        "state": "JobCreateStageSucceeded"
      }],
      "approve_status_list": [{
        "domain_id": "alice",
        "state": "JobAccepted"
      }, {
        "domain_id": "bob",
        "state": "JobAccepted"
      }]
    }
  }
}
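
A side note on reading these watch events: the job-level state only says "Failed"; the root cause sits in the per-task err_msg. A minimal sketch in plain Python (the event below is trimmed to the relevant fields and the message shortened for readability) of pulling the failed task's message out of such an event:

```python
import json

# The "update job: it={...}" payload that JobManager logs above,
# trimmed to the fields used here.
raw = '''
{
  "type": "MODIFIED",
  "object": {
    "job_id": "emzj",
    "status": {
      "state": "Failed",
      "tasks": [
        {"task_id": "emzj-ubryppxk-node-3", "state": "Failed",
         "err_msg": "failed to get appImage \\"secretflow-image\\" from cache"},
        {"task_id": "emzj-ubryppxk-node-4", "state": "Pending"}
      ]
    }
  }
}
'''

event = json.loads(raw)
status = event["object"]["status"]
print("job", event["object"]["job_id"], "state:", status["state"])
for task in status.get("tasks", []):
    if task.get("state") == "Failed":
        # err_msg is where the kuscia task controller reports the root cause
        print(task["task_id"], "->", task.get("err_msg"))
```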

How do I get secretflow integrated here?

When deploying with RunP, did you follow the instructions below?
[screenshot: RunP deployment instructions]

Meng-xiangkun commented Sep 12, 2024

When running a private set intersection (PSI) job with the kuscia-secretflow image, I get this error:
Failed to process object: error handling "dppm-qvxgwzap-node-35", failed to process kusciaTask "dppm-qvxgwzap-node-35", failed to build domain bob kit info, failed to get appImage "secretflow-image" from cache, appimage.kuscia.secretflow "secretflow-image" not found, retry
Failed to update kuscia job "dppm" status, Operation cannot be fulfilled on kusciajobs.kuscia.secretflow "dppm": the object has been modified; please apply your changes to the latest version and try again
[screenshot of the error]
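
That err_msg means the kuscia master has no AppImage custom resource named "secretflow-image", so it cannot build the task pods for alice/bob; this is independent of Secretpad. A quick way to confirm is to list the AppImage objects in the master's k3s. A minimal sketch using the standard kubernetes Python client, assuming you have exported a kubeconfig for the master's embedded k3s (the file path is only an illustration); the group/kind comes straight from the error text (appimage.kuscia.secretflow), while the v1alpha1 version and cluster scope are assumptions based on kuscia's CRD layout:

```python
from kubernetes import client, config

# Assumption: a kubeconfig for the kuscia master's embedded k3s has been
# copied out of the master container to this path.
config.load_kube_config(config_file="./kuscia-master.kubeconfig")

api = client.CustomObjectsApi()

# List the AppImage custom resources (cluster-scoped, group kuscia.secretflow).
images = api.list_cluster_custom_object(
    group="kuscia.secretflow", version="v1alpha1", plural="appimages"
)
names = [item["metadata"]["name"] for item in images.get("items", [])]
print("registered AppImages:", names)

if "secretflow-image" not in names:
    print('AppImage "secretflow-image" is missing, which matches the '
          "KusciaTask failure above.")
```

If kubectl is available inside the master container, `kubectl get appimage` gives the same answer. If "secretflow-image" is not in the list, the secretflow engine image still needs to be registered on the master per the RunP deployment instructions referenced above.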

2024-09-12 18:30:34.303 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.317 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.317 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (13.420693ms)
2024-09-12 18:30:34.317 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (13.470899ms)
2024-09-12 18:30:34.317 INFO resources/kusciajob.go:82 update kuscia job dppm
2024-09-12 18:30:34.329 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (12.672843ms)
2024-09-12 18:30:34.330 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.343 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.343 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (13.248207ms)
2024-09-12 18:30:34.343 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (13.29884ms)
2024-09-12 18:30:34.345 INFO handler/job_scheduler.go:323 Create kuscia tasks: dppm-qvxgwzap-node-35
2024-09-12 18:30:34.357 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.369 WARN kusciatask/controller.go:424 Error handling "dppm-qvxgwzap-node-35", re-queuing
2024-09-12 18:30:34.369 ERROR kusciatask/controller.go:435 Failed to process object: error handling "dppm-qvxgwzap-node-35", failed to process kusciaTask "dppm-qvxgwzap-node-35", failed to build domain bob kit info, failed to get appImage "secretflow-image" from cache, appimage.kuscia.secretflow "secretflow-image" not found, retry
2024-09-12 18:30:34.370 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.370 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (25.113735ms)
2024-09-12 18:30:34.370 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (25.15742ms)
2024-09-12 18:30:34.370 INFO handler/job_scheduler.go:661 jobStatusPhaseFrom readyTasks={}, tasks={{taskId=dppm-qvxgwzap-node-35, dependencies=[], tolerable=false, phase=}}, kusciaJobId=dppm
2024-09-12 18:30:34.370 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.383 WARN kusciatask/controller.go:424 Error handling "dppm-qvxgwzap-node-35", re-queuing
2024-09-12 18:30:34.383 ERROR kusciatask/controller.go:435 Failed to process object: error handling "dppm-qvxgwzap-node-35", failed to process kusciaTask "dppm-qvxgwzap-node-35", failed to build domain bob kit info, failed to get appImage "secretflow-image" from cache, appimage.kuscia.secretflow "secretflow-image" not found, retry
2024-09-12 18:30:34.385 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.386 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (15.795756ms)
2024-09-12 18:30:34.386 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (15.879731ms)
2024-09-12 18:30:34.388 INFO handler/job_scheduler.go:661 jobStatusPhaseFrom readyTasks={}, tasks={{taskId=dppm-qvxgwzap-node-35, dependencies=[], tolerable=false, phase=}}, kusciaJobId=dppm
2024-09-12 18:30:34.388 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (488.279µs)
2024-09-12 18:30:34.399 WARN kusciatask/controller.go:424 Error handling "dppm-qvxgwzap-node-35", re-queuing
2024-09-12 18:30:34.399 ERROR kusciatask/controller.go:435 Failed to process object: error handling "dppm-qvxgwzap-node-35", failed to process kusciaTask "dppm-qvxgwzap-node-35", failed to build domain bob kit info, failed to get appImage "secretflow-image" from cache, appimage.kuscia.secretflow "secretflow-image" not found, retry
2024-09-12 18:30:34.423 WARN kusciatask/controller.go:424 Error handling "dppm-qvxgwzap-node-35", re-queuing
2024-09-12 18:30:34.424 ERROR kusciatask/controller.go:435 Failed to process object: error handling "dppm-qvxgwzap-node-35", failed to process kusciaTask "dppm-qvxgwzap-node-35", failed to build domain bob kit info, failed to get appImage "secretflow-image" from cache, appimage.kuscia.secretflow "secretflow-image" not found, retry
2024-09-12 18:30:34.472 INFO resources/kusciatask.go:69 Start updating kuscia task "dppm-qvxgwzap-node-35" status
2024-09-12 18:30:34.488 INFO resources/kusciatask.go:71 Finish updating kuscia task "dppm-qvxgwzap-node-35" status
2024-09-12 18:30:34.488 INFO kusciatask/controller.go:521 Finished syncing kusciatask "dppm-qvxgwzap-node-35" (24.193535ms)
2024-09-12 18:30:34.490 INFO handler/job_scheduler.go:661 jobStatusPhaseFrom readyTasks={}, tasks={{taskId=dppm-qvxgwzap-node-35, dependencies=[], tolerable=false, phase=Failed}}, kusciaJobId=dppm
2024-09-12 18:30:34.490 INFO handler/job_scheduler.go:679 jobStatusPhaseFrom failed readyTasks={}, tasks={{taskId=dppm-qvxgwzap-node-35, dependencies=[], tolerable=false, phase=Failed}}, kusciaJobId=dppm
2024-09-12 18:30:34.491 WARN handler/failed_handler.go:62 Get task resource group dppm-qvxgwzap-node-35 failed, skip setting its status to failed, taskresourcegroup.kuscia.secretflow "dppm-qvxgwzap-node-35" not found
2024-09-12 18:30:34.491 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.491 INFO resources/kusciatask.go:69 Start updating kuscia task "dppm-qvxgwzap-node-35" status
2024-09-12 18:30:34.505 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.505 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (14.950352ms)
2024-09-12 18:30:34.505 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (14.972553ms)
2024-09-12 18:30:34.510 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.510 INFO resources/kusciatask.go:71 Finish updating kuscia task "dppm-qvxgwzap-node-35" status
2024-09-12 18:30:34.510 INFO kusciatask/controller.go:521 Finished syncing kusciatask "dppm-qvxgwzap-node-35" (19.491329ms)
2024-09-12 18:30:34.510 INFO kusciatask/controller.go:489 KusciaTask "dppm-qvxgwzap-node-35" was finished, skipping
2024-09-12 18:30:34.523 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.523 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (13.33302ms)
2024-09-12 18:30:34.523 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (13.376915ms)
2024-09-12 18:30:34.523 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.534 WARN resources/kusciajob.go:122 Failed to update kuscia job "dppm" status, Operation cannot be fulfilled on kusciajobs.kuscia.secretflow "dppm": the object has been modified; please apply your changes to the latest version and try again
2024-09-12 18:30:34.542 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.554 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.555 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (31.853225ms)
2024-09-12 18:30:34.555 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (31.901265ms)
2024-09-12 18:30:34.555 INFO handler/job_scheduler.go:700 KusciaJob dppm was finished, skipping
2024-09-12 18:30:34.555 INFO kusciajob/controller.go:266 KusciaJob "dppm" should not reconcile again, skipping
2024-09-12 18:30:34.555 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (111.519µs)
