LeNetCUDA

The purpose of this project is to accelerate the forward propagation of a simple Convolutional Neural Network on an NVIDIA GPU and to show which architectural choices can be used to speed up code running on a GPU.

Project overview

The following folders and files contain the source code:
header files (inc)
source files (src)
main application (LeNet.cu)
The profiling script (profile_app.sh) is used to profile the application. It lets you select which metrics and events to collect with nvprof, the command-line profiler included in the NVIDIA CUDA Toolkit. The script generates three files (_exhautive, _medium, _light) containing profiling information about the application at different levels of detail. Here and Here you can find some examples.
More details about the project can be found in the report.

How to compile, run & profile

To use the application you need an NVIDIA GPU and the NVIDIA CUDA Toolkit installed on your machine.
Instructions on how to install the Toolkit can be found here.

compile

You can compile the source files using make

make

and clean the compilation files using

make clean

I suggest changing the flag GPU_ARCHITECTURE=sm_53 in the makefile to match your GPU. The value above is suited to my NVIDIA Jetson Nano board, whose Tegra X1 GPU is based on the Maxwell architecture (compute capability 5.3).
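If you are not sure which value to use for GPU_ARCHITECTURE, the CUDA runtime can report the compute capability of each installed device. The program below is a minimal sketch and not part of this project: the helper name archFlag and the file name query_arch.cu are hypothetical, and it assumes the flag takes the usual sm_XY form where X.Y is the compute capability.

```cuda
// query_arch.cu (hypothetical file name, not part of this repository)
// Prints a candidate GPU_ARCHITECTURE value for every visible CUDA device.
#include <cstdio>
#include <string>
#include <cuda_runtime.h>

// Hypothetical helper: formats a compute capability such as 5.3
// as the string "sm_53".
std::string archFlag(int major, int minor) {
    return "sm_" + std::to_string(major) + std::to_string(minor);
}

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                         cudaGetErrorString(err));
            return 1;
        }
        // prop.major and prop.minor hold the compute capability.
        std::printf("device %d: %s -> GPU_ARCHITECTURE=%s\n", dev, prop.name,
                    archFlag(prop.major, prop.minor).c_str());
    }
    return 0;
}
```

It can be built with nvcc query_arch.cu -o query_arch and run on the target machine. On a Jetson Nano it should report sm_53, matching the default in the makefile.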
I highly suggest taking a look at the references I used for the CNN documentation; you can find them in the report.
