Andre Merzky edited this page Feb 20, 2020 · 4 revisions

Problem Statement

RP supports remote and local data staging. Remote data staging transfers data between resources; local data staging performs copy, move, and link operations on a local, shared file system. Data staging directives accept two types of source and target URLs: regular URLs, which reference a data item by schema, host, and path, and custom URLs, which use one of the following schemas:

  • client:/// - refers to the application's pwd
  • resource:/// - refers to the RP sandbox
  • pilot:/// - refers to the pilot sandbox
  • unit:/// - refers to the task sandbox

As use cases with more tightly and dynamically coupled tasks become more common, these schemas turn out to be inefficient: for one task to use data items of another task, the first task needs to stage the data to a global sandbox, and the second task then needs to stage the data from there into its own task sandbox. For large numbers of tasks this two-step staging is costly, and it requires global coordination of file names to avoid conflicts.
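The two-step pattern can be sketched with plain dictionaries. This is only an illustration: the function name, the directive keys (`source`, `target`, `action`), the `'copy'` action string, and the choice of the pilot sandbox as the shared location are assumptions made for this example, not RP's confirmed directive format.

```python
def two_step_staging(producer_id, filename):
    """Build the two directives needed to pass one file between two tasks.

    Hypothetical helper: directive keys and the 'copy' action string are
    illustrative assumptions, not taken from RP itself.
    """
    # The shared sandbox is a flat namespace for all tasks, so the file
    # name must be made globally unique, here by prefixing the producer id.
    shared_name = f"{producer_id}.{filename}"

    # Step 1: the producing task stages its output into the shared sandbox.
    producer_output = {
        'source': f"unit:///{filename}",
        'target': f"pilot:///{shared_name}",
        'action': 'copy',
    }

    # Step 2: the consuming task stages the file into its own sandbox.
    consumer_input = {
        'source': f"pilot:///{shared_name}",
        'target': f"unit:///{filename}",
        'action': 'copy',
    }
    return producer_output, consumer_input


out_sd, in_sd = two_step_staging('unit.000001', 'traj.dat')
```

Both tasks have to agree on `shared_name`, which is the file name coordination the text refers to.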

Proposal

RP should introduce explicit references to task sandboxes. To simplify the naming scheme, we propose to use

  • sandbox://<entity_id>/

as a uniform URL schema. For example:

  • sandbox://client/ - refers to the application pwd
  • sandbox://pilot.0000/ - refers to the pilot sandbox for pilot.0000
  • sandbox://ornl.summit/ - refers to the resource sandbox for ornl.summit
  • sandbox://unit.123456/ - refers to the task sandbox of unit.123456
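A minimal sketch of how such URLs could be resolved to file system paths, assuming a lookup table from entity ID to sandbox path is available. The function name `resolve_sandbox_url`, the table `SANDBOXES`, and all paths in it are hypothetical and serve only as illustration.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical lookup table: entity id -> absolute sandbox path.
SANDBOXES = {
    'client':      '/home/user/app',
    'ornl.summit': '/scratch/rp',
    'pilot.0000':  '/scratch/rp/session.0001/pilot.0000',
    'unit.123456': '/scratch/rp/session.0001/pilot.0000/unit.123456',
}


def resolve_sandbox_url(url, sandboxes):
    """Map sandbox://<entity_id>/<path> to a path below that entity's sandbox."""
    parts = urlsplit(url)
    if parts.scheme != 'sandbox':
        raise ValueError(f"not a sandbox URL: {url}")

    # The entity id takes the place of the host name in the URL.
    entity_id = parts.netloc
    if not entity_id:
        raise ValueError(f"missing entity id: {url}")
    if entity_id not in sandboxes:
        raise LookupError(f"unknown sandbox entity: {entity_id}")

    # Everything after the entity id is relative to the sandbox root.
    relative = parts.path.lstrip('/')
    base = sandboxes[entity_id]
    return posixpath.join(base, relative) if relative else base


path = resolve_sandbox_url('sandbox://unit.123456/out/traj.dat', SANDBOXES)
```

With this kind of lookup, a consuming task could name a producer's file directly, for example `sandbox://unit.123456/out/traj.dat`, without an intermediate copy in a shared sandbox.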

Impact

  • The translation from old URLs to new ones is straightforward, and backward compatibility can be maintained for a time.
  • Additional use cases become possible and much simpler. For example, the Repex layer would no longer need two-step data staging, which greatly simplifies replica orchestration.
  • Sandboxes are currently assigned in the UMGR scheduler: after UMGR scheduling, all sandbox paths are resolvable, and all data staging happens after that state. The same approach should work for the new schemas once an additional cache for task sandboxes is introduced (a prototype exists).
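The translation from old URLs to new ones could look roughly like the following sketch. It assumes the IDs of the resource, pilot, and task that a staging directive belongs to are known at translation time; the function name `translate_legacy_url` and the `entity_ids` mapping are hypothetical.

```python
from urllib.parse import urlsplit

LEGACY_SCHEMAS = ('client', 'resource', 'pilot', 'unit')


def translate_legacy_url(url, entity_ids):
    """Rewrite client:///, resource:///, pilot:/// and unit:/// URLs as
    sandbox://<entity_id>/ URLs; any other URL is returned unchanged.

    entity_ids maps each legacy schema to the entity the directive refers
    to (hypothetical structure, assumed for this sketch).
    """
    parts = urlsplit(url)
    if parts.scheme not in LEGACY_SCHEMAS:
        return url
    if parts.netloc:
        raise ValueError(f"legacy sandbox URL must not name a host: {url}")

    entity_id = entity_ids[parts.scheme]

    # Keep the path as is, but make sure it starts with a slash.
    path = parts.path if parts.path.startswith('/') else '/' + parts.path
    return f"sandbox://{entity_id}{path}"


ids = {
    'client':   'client',
    'resource': 'ornl.summit',
    'pilot':    'pilot.0000',
    'unit':     'unit.123456',
}
new_url = translate_legacy_url('unit:///out/traj.dat', ids)
```

Because regular URLs pass through untouched, a shim like this could sit in front of the existing staging code while both schemas are accepted.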