Skip to content

Commit

Permalink
[SYCL][Graph] Add Graphs printing API (intel#11796)
Browse files Browse the repository at this point in the history
Updates Sycl-Graph API with a graph print function outputting a dot
graph that represents the created modifiable graph.
Updates the Spec with the proposed API for debug print graph. 
Adds e2e tests that check the behaviour of this function.
  • Loading branch information
mfrancepillois committed Nov 10, 2023
1 parent 1957f75 commit 7736c42
Show file tree
Hide file tree
Showing 12 changed files with 563 additions and 0 deletions.
28 changes: 28 additions & 0 deletions sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -382,6 +382,8 @@ public:
node add(T cgf, const property_list& propList = {});
void make_edge(node& src, node& dest);
void print_graph(std::string path, bool verbose = false) const;
};
template<>
Expand Down Expand Up @@ -727,6 +729,32 @@ Parameters:

Returns: A new executable graph object which can be submitted to a queue.

|
[source,c++]
----
void
print_graph(std::string path, bool verbose = false) const;
----

|Synchronous operation that writes a DOT formatted description of the graph to the
provided path. By default, this includes the graph topology, node types, node id,
and kernel names.
Verbose can be set to true to write more detailed information about each node type
such as kernel arguments, copy source, and destination addresses.
At the moment DOT format is the only supported format. The name of hte output file
must therefore match this extension, i.e. "<filename>.dot".

Parameters:

* `path` - The path to write the DOT file to.
* `verbose` - If true, print additional information about the nodes such as kernel args
or memory access where applicable.

Exceptions:

* Throws synchronously with error code `invalid` if the path is invalid or
the file extension is not supported or if the write operation failed.

|===

Table {counter: tableNumber}. Member functions of the `command_graph` class for queue recording.
Expand Down
8 changes: 8 additions & 0 deletions sycl/include/sycl/ext/oneapi/experimental/graph.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -221,6 +221,14 @@ class __SYCL_EXPORT modifiable_command_graph {
/// executing.
bool end_recording(const std::vector<queue> &RecordingQueues);

/// Synchronous operation that writes a DOT formatted description of the graph
/// to the provided path. By default, this includes the graph topology, node
/// types, node id and kernel names.
/// @param path The path to write the DOT file to.
/// @param verbose If true, print additional information about the nodes such
/// as kernel args or memory access where applicable.
void print_graph(const std::string path, bool verbose = false) const;

protected:
/// Constructor used internally by the runtime.
/// @param Impl Detail implementation class to construct object with.
Expand Down
12 changes: 12 additions & 0 deletions sycl/source/detail/graph_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -832,6 +832,18 @@ bool modifiable_command_graph::end_recording(
return QueueStateChanged;
}

void modifiable_command_graph::print_graph(std::string path,
bool verbose) const {
graph_impl::ReadLock Lock(impl->MMutex);
if (path.substr(path.find_last_of(".") + 1) == "dot") {
impl->printGraphAsDot(path, verbose);
} else {
throw sycl::exception(
sycl::make_error_code(errc::invalid),
"DOT graph is the only format supported at the moment.");
}
}

executable_command_graph::executable_command_graph(
const std::shared_ptr<detail::graph_impl> &Graph, const sycl::context &Ctx)
: impl(std::make_shared<detail::exec_graph_impl>(Ctx, Graph)) {
Expand Down
242 changes: 242 additions & 0 deletions sycl/source/detail/graph_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@

#include <cstring>
#include <deque>
#include <fstream>
#include <functional>
#include <list>
#include <set>
Expand Down Expand Up @@ -228,7 +229,229 @@ class node_impl {
}
}

/// Recursive Depth first traversal of linked nodes.
/// to print node information and connection to Stream.
/// @param Stream Where to print node information.
/// @param Visited Vector of the already visited nodes.
/// @param Verbose If true, print additional information about the nodes such
/// as kernel args or memory access where applicable.
void printDotRecursive(std::fstream &Stream,
std::vector<node_impl *> &Visited, bool Verbose) {
// if Node has been already visited, we skip it
if (std::find(Visited.begin(), Visited.end(), this) != Visited.end())
return;

Visited.push_back(this);

printDotCG(Stream, Verbose);
for (const auto &Dep : MPredecessors) {
auto NodeDep = Dep.lock();
Stream << " \"" << NodeDep->MCommandGroup.get() << "\" -> \""
<< MCommandGroup.get() << "\"" << std::endl;
}

for (std::weak_ptr<node_impl> Succ : MSuccessors) {
Succ.lock()->printDotRecursive(Stream, Visited, Verbose);
}
}

private:
/// Prints Node information to Stream.
/// @param Stream Where to print the Node information
/// @param Verbose If true, print additional information about the nodes such
/// as kernel args or memory access where applicable.
void printDotCG(std::ostream &Stream, bool Verbose) {
sycl::detail::CG::CGTYPE CGType = MCommandGroup->getType();

Stream << "\"" << MCommandGroup.get() << "\" [style=bold, label=\"";

Stream << "ID = " << MCommandGroup.get() << "\\n";
Stream << "TYPE = ";

switch (CGType) {
case sycl::detail::CG::CGTYPE::None:
Stream << "None \\n";
break;
case sycl::detail::CG::CGTYPE::Kernel: {
Stream << "CGExecKernel \\n";
sycl::detail::CGExecKernel *Kernel =
static_cast<sycl::detail::CGExecKernel *>(MCommandGroup.get());
Stream << "NAME = " << Kernel->MKernelName << "\\n";
if (Verbose) {
Stream << "ARGS = \\n";
for (size_t i = 0; i < Kernel->MArgs.size(); i++) {
auto Arg = Kernel->MArgs[i];
std::string Type = "Undefined";
if (Arg.MType == sycl::detail::kernel_param_kind_t::kind_accessor) {
Type = "Accessor";
} else if (Arg.MType ==
sycl::detail::kernel_param_kind_t::kind_std_layout) {
Type = "STD_Layout";
} else if (Arg.MType ==
sycl::detail::kernel_param_kind_t::kind_sampler) {
Type = "Sampler";
} else if (Arg.MType ==
sycl::detail::kernel_param_kind_t::kind_pointer) {
Type = "Pointer";
} else if (Arg.MType == sycl::detail::kernel_param_kind_t::
kind_specialization_constants_buffer) {
Type = "Specialization Constants Buffer";
} else if (Arg.MType ==
sycl::detail::kernel_param_kind_t::kind_stream) {
Type = "Stream";
} else if (Arg.MType ==
sycl::detail::kernel_param_kind_t::kind_invalid) {
Type = "Invalid";
}
Stream << i << ") Type: " << Type << " Ptr: " << Arg.MPtr << "\\n";
}
}
break;
}
case sycl::detail::CG::CGTYPE::CopyAccToPtr:
Stream << "CGCopy Device-to-Host \\n";
if (Verbose) {
sycl::detail::CGCopy *Copy =
static_cast<sycl::detail::CGCopy *>(MCommandGroup.get());
Stream << "Src: " << Copy->getSrc() << " Dst: " << Copy->getDst()
<< "\\n";
}
break;
case sycl::detail::CG::CGTYPE::CopyPtrToAcc:
Stream << "CGCopy Host-to-Device \\n";
if (Verbose) {
sycl::detail::CGCopy *Copy =
static_cast<sycl::detail::CGCopy *>(MCommandGroup.get());
Stream << "Src: " << Copy->getSrc() << " Dst: " << Copy->getDst()
<< "\\n";
}
break;
case sycl::detail::CG::CGTYPE::CopyAccToAcc:
Stream << "CGCopy Device-to-Device \\n";
if (Verbose) {
sycl::detail::CGCopy *Copy =
static_cast<sycl::detail::CGCopy *>(MCommandGroup.get());
Stream << "Src: " << Copy->getSrc() << " Dst: " << Copy->getDst()
<< "\\n";
}
break;
case sycl::detail::CG::CGTYPE::Fill:
Stream << "CGFill \\n";
if (Verbose) {
sycl::detail::CGFill *Fill =
static_cast<sycl::detail::CGFill *>(MCommandGroup.get());
Stream << "Ptr: " << Fill->MPtr << "\\n";
}
break;
case sycl::detail::CG::CGTYPE::UpdateHost:
Stream << "CGCUpdateHost \\n";
if (Verbose) {
sycl::detail::CGUpdateHost *Host =
static_cast<sycl::detail::CGUpdateHost *>(MCommandGroup.get());
Stream << "Ptr: " << Host->getReqToUpdate() << "\\n";
}
break;
case sycl::detail::CG::CGTYPE::CopyUSM:
Stream << "CGCopyUSM \\n";
if (Verbose) {
sycl::detail::CGCopyUSM *CopyUSM =
static_cast<sycl::detail::CGCopyUSM *>(MCommandGroup.get());
Stream << "Src: " << CopyUSM->getSrc() << " Dst: " << CopyUSM->getDst()
<< " Length: " << CopyUSM->getLength() << "\\n";
}
break;
case sycl::detail::CG::CGTYPE::FillUSM:
Stream << "CGFillUSM \\n";
if (Verbose) {
sycl::detail::CGFillUSM *FillUSM =
static_cast<sycl::detail::CGFillUSM *>(MCommandGroup.get());
Stream << "Dst: " << FillUSM->getDst()
<< " Length: " << FillUSM->getLength()
<< " Pattern: " << FillUSM->getFill() << "\\n";
}
break;
case sycl::detail::CG::CGTYPE::PrefetchUSM:
Stream << "CGPrefetchUSM \\n";
if (Verbose) {
sycl::detail::CGPrefetchUSM *Prefetch =
static_cast<sycl::detail::CGPrefetchUSM *>(MCommandGroup.get());
Stream << "Dst: " << Prefetch->getDst()
<< " Length: " << Prefetch->getLength() << "\\n";
}
break;
case sycl::detail::CG::CGTYPE::AdviseUSM:
Stream << "CGAdviseUSM \\n";
if (Verbose) {
sycl::detail::CGAdviseUSM *AdviseUSM =
static_cast<sycl::detail::CGAdviseUSM *>(MCommandGroup.get());
Stream << "Dst: " << AdviseUSM->getDst()
<< " Length: " << AdviseUSM->getLength() << "\\n";
}
break;
case sycl::detail::CG::CGTYPE::CodeplayHostTask:
Stream << "CGHostTask \\n";
break;
case sycl::detail::CG::CGTYPE::Barrier:
Stream << "CGBarrier \\n";
break;
case sycl::detail::CG::CGTYPE::Copy2DUSM:
Stream << "CGCopy2DUSM \\n";
if (Verbose) {
sycl::detail::CGCopy2DUSM *Copy2DUSM =
static_cast<sycl::detail::CGCopy2DUSM *>(MCommandGroup.get());
Stream << "Src:" << Copy2DUSM->getSrc()
<< " Dst: " << Copy2DUSM->getDst() << "\\n";
}
break;
case sycl::detail::CG::CGTYPE::Fill2DUSM:
Stream << "CGFill2DUSM \\n";
if (Verbose) {
sycl::detail::CGFill2DUSM *Fill2DUSM =
static_cast<sycl::detail::CGFill2DUSM *>(MCommandGroup.get());
Stream << "Dst: " << Fill2DUSM->getDst() << "\\n";
}
break;
case sycl::detail::CG::CGTYPE::Memset2DUSM:
Stream << "CGMemset2DUSM \\n";
if (Verbose) {
sycl::detail::CGMemset2DUSM *Memset2DUSM =
static_cast<sycl::detail::CGMemset2DUSM *>(MCommandGroup.get());
Stream << "Dst: " << Memset2DUSM->getDst() << "\\n";
}
break;
case sycl::detail::CG::CGTYPE::ReadWriteHostPipe:
Stream << "CGReadWriteHostPipe \\n";
break;
case sycl::detail::CG::CGTYPE::CopyToDeviceGlobal:
Stream << "CGCopyToDeviceGlobal \\n";
if (Verbose) {
sycl::detail::CGCopyToDeviceGlobal *CopyToDeviceGlobal =
static_cast<sycl::detail::CGCopyToDeviceGlobal *>(
MCommandGroup.get());
Stream << "Src: " << CopyToDeviceGlobal->getSrc()
<< " Dst: " << CopyToDeviceGlobal->getDeviceGlobalPtr() << "\\n";
}
break;
case sycl::detail::CG::CGTYPE::CopyFromDeviceGlobal:
Stream << "CGCopyFromDeviceGlobal \\n";
if (Verbose) {
sycl::detail::CGCopyFromDeviceGlobal *CopyFromDeviceGlobal =
static_cast<sycl::detail::CGCopyFromDeviceGlobal *>(
MCommandGroup.get());
Stream << "Src: " << CopyFromDeviceGlobal->getDeviceGlobalPtr()
<< " Dst: " << CopyFromDeviceGlobal->getDest() << "\\n";
}
break;
case sycl::detail::CG::CGTYPE::ExecCommandBuffer:
Stream << "CGExecCommandBuffer \\n";
break;
default:
Stream << "Other \\n";
break;
}
Stream << "\"];" << std::endl;
}

/// Creates a copy of the node's CG by casting to it's actual type, then using
/// that to copy construct and create a new unique ptr from that copy.
/// @tparam CGT The derived type of the CG.
Expand Down Expand Up @@ -410,6 +633,25 @@ class graph_impl {
MInorderQueueMap[QueueWeakPtr] = Node;
}

/// Prints the contents of the graph to a text file in DOT format.
/// @param FilePath Path to the output file.
/// @param Verbose If true, print additional information about the nodes such
/// as kernel args or memory access where applicable.
void printGraphAsDot(const std::string FilePath, bool Verbose) const {
/// Vector of nodes visited during the graph printing
std::vector<node_impl *> VisitedNodes;

std::fstream Stream(FilePath, std::ios::out);
Stream << "digraph dot {" << std::endl;

for (std::weak_ptr<node_impl> Node : MRoots)
Node.lock()->printDotRecursive(Stream, VisitedNodes, Verbose);

Stream << "}" << std::endl;

Stream.close();
}

/// Make an edge between two nodes in the graph. Performs some mandatory
/// error checks as well as an optional check for cycles introduced by making
/// this edge.
Expand Down
34 changes: 34 additions & 0 deletions sycl/test-e2e/Graph/Explicit/debug_print_graph.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
// REQUIRES: level_zero || cuda, gpu
// RUN: %{build} -o %t.out
// RUN: %if linux %{ %{run} %t.out ; FileCheck %s --input-file graph.dot %}
// RUN: %if windows %{ %{run} %t.out %}
// Windows output format differs from linux format.
// The filecheck-based output checking is suited to linux standards.
// On Windows, we only test that printing takes place correctly and does not
// trigger errors or throw execeptions.
//
// CHECK: digraph dot {
// CHECK-NEXT: "0x[[#%x,NODE1:]]"
// CHECK-SAME: [style=bold, label="ID = 0x[[#NODE1]]\nTYPE = CGExecKernel \nNAME = _ZTSZZ11add_kernelsItEN4sycl3_V13ext6oneapi12experimental4nodeENS4_13command_graphILNS4_11graph_stateE0EEEmNS1_6bufferIT_Li1ENS1_6detail17aligned_allocatorINSt12remove_constISA_E4typeEEEvEESH_SH_ENKUlRNS1_7handlerEE_clESJ_EUlNS1_4itemILi1ELb1EEEE_\n"];
// CHECK-NEXT: "0x[[#%x,NODE2:]]"
// CHECK-SAME: [style=bold, label="ID = 0x[[#NODE2]]\nTYPE = CGExecKernel \nNAME = _ZTSZZ11add_kernelsItEN4sycl3_V13ext6oneapi12experimental4nodeENS4_13command_graphILNS4_11graph_stateE0EEEmNS1_6bufferIT_Li1ENS1_6detail17aligned_allocatorINSt12remove_constISA_E4typeEEEvEESH_SH_ENKUlRNS1_7handlerEE0_clESJ_EUlNS1_4itemILi1ELb1EEEE_\n"];
// CHECK-NEXT: "0x[[#NODE1]]" -> "0x[[#NODE2]]"
// CHECK-NEXT: "0x[[#%x,NODE3:]]"
// CHECK-SAME: [style=bold, label="ID = 0x[[#NODE3]]\nTYPE = CGExecKernel \nNAME = _ZTSZZ11add_kernelsItEN4sycl3_V13ext6oneapi12experimental4nodeENS4_13command_graphILNS4_11graph_stateE0EEEmNS1_6bufferIT_Li1ENS1_6detail17aligned_allocatorINSt12remove_constISA_E4typeEEEvEESH_SH_ENKUlRNS1_7handlerEE1_clESJ_EUlNS1_4itemILi1ELb1EEEE_\n"];
// CHECK-NEXT: "0x[[#NODE2]]" -> "0x[[#NODE3]]"
// CHECK-NEXT: "0x[[#%x,NODE4:]]"
// CHECK-SAME: [style=bold, label="ID = 0x[[#NODE4]]\nTYPE = CGExecKernel \nNAME = _ZTSZZ11add_kernelsItEN4sycl3_V13ext6oneapi12experimental4nodeENS4_13command_graphILNS4_11graph_stateE0EEEmNS1_6bufferIT_Li1ENS1_6detail17aligned_allocatorINSt12remove_constISA_E4typeEEEvEESH_SH_ENKUlRNS1_7handlerEE2_clESJ_EUlNS1_4itemILi1ELb1EEEE_\n"];
// CHECK-DAG: "0x[[#NODE3]]" -> "0x[[#NODE4]]"
// CHECK-DAG: "0x[[#NODE2]]" -> "0x[[#NODE4]]"
// CHECK-NEXT: "0x[[#%x,NODE5:]]"
// CHECK-SAME: [style=bold, label="ID = 0x[[#NODE5]]\nTYPE = CGCopy Device-to-Device \n"];
// CHECK-DAG: "0x[[#NODE3]]" -> "0x[[#NODE5]]"
// CHECK-DAG: "0x[[#NODE4]]" -> "0x[[#NODE5]]
// CHECK-NEXT: "0x[[#%x,NODE6:]]"
// CHECK-SAME: [style=bold, label="ID = 0x[[#NODE6]]\nTYPE = CGCopy Device-to-Host \n"];
// CHECK-NEXT: "0x[[#%x,NODE7:]]"
// CHECK-SAME: [style=bold, label="ID = 0x[[#NODE7]]\nTYPE = None \n"];

#define GRAPH_E2E_EXPLICIT

#include "../Inputs/debug_print_graph.cpp"
Loading

0 comments on commit 7736c42

Please sign in to comment.