Commit
Merge pull request #6 from LKedward/v1.0rc
v1.0.0 release candidate
LKedward authored Mar 16, 2020
2 parents 8c2af70 + cbd0265 commit a108d3d
Showing 40 changed files with 2,730 additions and 562 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -4,6 +4,8 @@
*.smod
*.a
doc/*
bin/*
obj/*
*fclKernels.cl
test/testSummary
test/test_outputs*
15 changes: 15 additions & 0 deletions .travis.yml
@@ -0,0 +1,15 @@
language: c
dist: bionic
before_install:
- sudo apt-get update -qq
- sudo apt-get install -qq gfortran
- wget -q http://registrationcenter-download.intel.com/akdlm/irc_nas/vcp/16284/intel_sdk_for_opencl_applications_2020.0.270.tar.gz -O /tmp/opencl_runtime.tgz
- tar -xzf /tmp/opencl_runtime.tgz -C /tmp
- sed 's/decline/accept/g' -i /tmp/intel_sdk_for_opencl_applications_2020.0.270/silent.cfg
- sudo /tmp/intel_sdk_for_opencl_applications_2020.0.270/install.sh -s /tmp/intel_sdk_for_opencl_applications_2020.0.270/silent.cfg

script:
- make -j test

after_success:
- bash <(curl -s https://codecov.io/bash)
61 changes: 42 additions & 19 deletions README.md
@@ -2,29 +2,45 @@
## *A modern Fortran abstraction layer for openCL*
Focal is a module library which wraps calls to the openCL runtime API (using [clfortran](https://github.com/cass-support/clfortran)) with a higher abstraction level appropriate to the Fortran language.

__Project status:__ *Beta*
The goal of Focal is to provide a concise and accessible Fortran interface to the OpenCL API while retaining the full functionality thereof.
This is desirable in Fortran which as a language provides a higher level of abstraction than C; importantly this allows scientists and engineers to focus on their domain specific problem rather than details of low-level implementation.

__Key features:__

* Removes use of c pointers to call OpenCL API
* Provides a level of type safety using typed buffer objects
* Decreases verbosity of OpenCL API calls while still providing the same functionality
* Abstracts away low level details, such as size in bytes
* Contains built-in customisable error handling for all OpenCL API calls
* Contains built-in 'debug' mode for checking program correctness
* Contains built-in routines for collecting and presenting profiling information

__Project status:__ v1.0.0 stable release

__Documentation__: [lkedward.github.io/focal-docs](https://lkedward.github.io/focal-docs/)

__License__: [MIT](./LICENSE)

__Key features:__
__Prerequisites:__

- [GNU make](https://www.gnu.org/software/make/) utility
- Fortran compiler supporting the 2008 standard (tested regularly with `gfortran` 7.4.0 & 9.1.0 and `ifort` 19.1.0 )
- An OpenCL development library (One of:
[Intel OpenCL SDK](https://software.intel.com/en-us/opencl-sdk),
[NVIDIA CUDA Toolkit](https://developer.nvidia.com/cuda-downloads),
[AMD Radeon Software](https://www.amd.com/en/support) )

- Focal removes the need to use c pointers in Fortran to call the OpenCL API;
- provides a level of type-safety through the use of typed buffer objects;
- decreases the verbosity of OpenCL API calls while still providing the same functionality;
- abstracts away low-level details, such as buffer size in bytes, not appropriate to Fortran;
- makes it easier to write and debug OpenCL programs with customisable built-in runtime error checking.

## Getting started

* [Building the Focal Library](https://lkedward.github.io/focal-docs/build)
* [Using and linking Focal](https://lkedward.github.io/focal-docs/linking/)
* [Quickstart programming guide](https://lkedward.github.io/focal-docs/quickstart/)
* [Example programs](./examples)
* [Lattice Boltzmann demo](https://github.com/LKedward/lbm2d_opencl)

## Simple example
The following fortran program calculates the sum of two large arrays using an openCL kernel.
## Quick example
The following Fortran program calculates the sum of two large arrays using an OpenCL kernel.

```fortran
program sum
@@ -38,29 +54,26 @@ real, parameter :: sumVal = 10.0 ! Target value for array sum
integer :: i ! Counter variable
character(:), allocatable :: kernelSrc ! Kernel source string
type(fclDevice), allocatable :: devices(:) ! List of focal devices
type(fclDevice) :: device ! Device object
type(fclProgram) :: prog ! Focal program object
type(fclKernel) :: sumKernel ! Focal kernel object
real(c_float) :: array1(Nelem) ! Host array 1
real(c_float) :: array2(Nelem) ! Host array 2
real :: array1(Nelem) ! Host array 1
real :: array2(Nelem) ! Host array 2
type(fclDeviceFloat) :: array1_d ! Device array 1
type(fclDeviceFloat) :: array2_d ! Device array 2
! Create context with nvidia platform
call fclSetDefaultContext(fclCreateContext(vendor='nvidia'))
! Select device with most cores and create command queue
devices = fclFindDevices(sortBy='cores')
call fclSetDefaultCommandQ(fclCreateCommandQ(devices(1),enableProfiling=.true.))
device = fclInit(vendor='nvidia',sortBy='cores')
call fclSetDefaultCommandQ(fclCreateCommandQ(device,enableProfiling=.true.))
! Load kernel from file and compile
call fclSourceFromFile('examples/sum.cl',kernelSrc)
prog = fclCompileProgram(kernelSrc)
sumKernel = fclGetProgramKernel(prog,'sum')
! Initialise device arrays
array1_d = fclBufferFloat(Nelem,read=.true.,write=.false.)
array2_d = fclBufferFloat(Nelem,read=.true.,write=.true.)
call fclInitBuffer(array1_d,Nelem)
call fclInitBuffer(array2_d,Nelem)
! Initialise host array data
do i=1,Nelem
@@ -90,3 +103,13 @@ __kernel void sum(const int nElem, const __global float * v1, __global float * v
if(i < nElem) v2[i] += v1[i];
}
```
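
As a cross-check on the kernel above, its arithmetic can be mirrored on the host. The following is a minimal sketch in plain C, not part of Focal or of this commit; the names `sum_reference` and `max_abs_diff` are hypothetical helpers introduced here for illustration. It assumes only what the kernel shows: `v2[i] += v1[i]` for every `i < nElem`.

```c
#include <stddef.h>

/* Hypothetical host-side reference for the `sum` kernel:
 * adds v1 into v2 element by element, for indices 0 .. nElem-1. */
void sum_reference(int nElem, const float *v1, float *v2)
{
    for (int i = 0; i < nElem; ++i) {
        v2[i] += v1[i];
    }
}

/* Hypothetical helper: largest absolute difference between two arrays,
 * e.g. a device result copied back to the host and the reference result. */
float max_abs_diff(int n, const float *a, const float *b)
{
    float worst = 0.0f;
    for (int i = 0; i < n; ++i) {
        float d = a[i] - b[i];
        if (d < 0.0f) {
            d = -d;  /* absolute value without needing <math.h> */
        }
        if (d > worst) {
            worst = d;
        }
    }
    return worst;
}
```

After the device array has been read back to the host, comparing it against the output of `sum_reference` with `max_abs_diff` gives a quick correctness check; the two results should agree to within floating-point rounding.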
## Bundled third-party sources
The following open source libraries are used as dependencies and bundled in the repository ([./external](https://github.com/LKedward/focal/tree/master/external)):
* [fortran-utils](https://github.com/certik/fortran-utils)/[sorting](https://github.com/certik/fortran-utils/blob/master/src/sorting.f90) (MIT license)
* [clfortran](https://github.com/cass-support/clfortran) (LGPL)
* [M_strings](https://github.com/urbanjost/M_strings) (Unlicense/Public domain)
43 changes: 20 additions & 23 deletions examples/nbody.f90
@@ -20,11 +20,11 @@ program nbody
character(*), parameter :: cl_vendor = 'nvidia,amd,intel' ! Vendors for which to create OpenCL context in order of preference

! ---------Program variables ---------
integer :: i, nBlock
integer :: kern1T, kern2T
integer :: i
integer(c_size_t) :: kern1T, kern2T
real :: Tavg, perf
character(:), allocatable :: kernelSrc ! Kernel source string
type(fclDevice), allocatable :: devices(:) ! List of focal devices
type(fclDevice) :: device ! OpenCL device on which to run
type(fclProgram) :: prog ! Focal program object
type(fclKernel) :: kern1, kern2 ! Focal kernel object
type(fclEvent) :: e
@@ -40,32 +40,29 @@ program nbody
write(*,*) ('-',i=1,72)
write(*,*)

! Create context with nvidia platform
call fclSetDefaultContext(fclCreateContext(vendor=cl_vendor))
! Initialise OpenCL context and select device with most cores
device = fclInit(vendor=cl_vendor,sortBy='cores')

! Select device with most cores and create command queue
devices = fclFindDevices(sortBy='cores')
call fclSetDefaultCommandQ(fclCreateCommandQ(devices(1),enableProfiling=.true., &
call fclSetDefaultCommandQ(fclCreateCommandQ(device,enableProfiling=.true., &
outOfOrderExec=.true.,blockingWrite=.false.))

write(*,*) ' Created OpenCL command queue on device: "',devices(1)%name,'"'
write(*,'(A,I6,A,I6,A,I4,A,A,A)') ' (', devices(1)%nComputeUnits,' cores, ', &
devices(1)%global_memory/1024/1024,'MB, ', &
devices(1)%clock_freq, 'MHz, ',&
devices(1)%version,')'
write(*,*) ' Created OpenCL command queue on device: "',device%name,'"'
write(*,'(A,I6,A,I6,A,I4,A,A,A)') ' (', device%nComputeUnits,' cores, ', &
device%global_memory/1024/1024,'MB, ', &
device%clock_freq, 'MHz, ',&
device%version,')'
write(*,*) ''

! Set profiler device
profiler%device = devices(1)
profiler%device = device

! Load kernels from file and compile
call fclGetKernelResource(kernelSrc)
prog = fclCompileProgram(kernelSrc)

! Get kernel objects and set local/global work sizes
nBlock = (N+blockSize-1)/blockSize
kern1 = fclGetProgramKernel(prog,'bodyForces',[nBlock*blockSize],[blockSize])
kern2 = fclGetProgramKernel(prog,'integrateBodies',[nBlock*blockSize],[blockSize])
kern1 = fclGetProgramKernel(prog,'bodyForces',[N],[blockSize])
kern2 = fclGetProgramKernel(prog,'integrateBodies',[N],[blockSize])

call fclProfilerAdd(profiler,Niter,kern1,kern2)

@@ -79,12 +76,12 @@ program nbody
call random_number(pz)

! Initialise device arrays
pxd = fclBufferFloat(N,read=.true.,write=.true.,profileName='pxd')
pyd = fclBufferFloat(N,read=.true.,write=.true.,profileName='pyd')
pzd = fclBufferFloat(N,read=.true.,write=.true.,profileName='pzd')
vxd = fclBufferFloat(N,read=.true.,write=.true.,profileName='vxd')
vyd = fclBufferFloat(N,read=.true.,write=.true.,profileName='vyd')
vzd = fclBufferFloat(N,read=.true.,write=.true.,profileName='vzd')
call fclInitBuffer(pxd,N,profileName='pxd')
call fclInitBuffer(pyd,N,profileName='pyd')
call fclInitBuffer(pzd,N,profileName='pzd')
call fclInitBuffer(vxd,N,profileName='vxd')
call fclInitBuffer(vyd,N,profileName='vyd')
call fclInitBuffer(vzd,N,profileName='vzd')

call fclProfilerAdd(profiler,1,pxd,pyd,pzd,vxd,vyd,vzd)

1 change: 1 addition & 0 deletions examples/platform_query.f90
@@ -37,6 +37,7 @@ program platform_query
platforms(i)%devices(j)%global_memory/1024/1024,'MB, ', &
platforms(i)%devices(j)%clock_freq, 'MHz, ',&
platforms(i)%devices(j)%version
! write(*,*) platforms(i)%devices(j)%extensions

end do

15 changes: 7 additions & 8 deletions examples/sum.f90
@@ -9,21 +9,20 @@ program sum

integer :: i ! Counter variable
character(:), allocatable :: kernelSrc ! Kernel source string
type(fclDevice), allocatable :: devices(:) ! List of focal devices
type(fclDevice) :: device ! OpenCL device on which to run
type(fclProgram) :: prog ! Focal program object
type(fclKernel) :: sumKernel ! Focal kernel object
real(c_float) :: array1(Nelem) ! Host array 1
real(c_float) :: array2(Nelem) ! Host array 2
type(fclDeviceFloat) :: array1_d ! Device array 1
type(fclDeviceFloat) :: array2_d ! Device array 2

! Create context with nvidia platform
call fclSetDefaultContext(fclCreateContext(vendor='nvidia,amd,intel'))
! Initialise OpenCL context and select device with most cores
device = fclInit(vendor='nvidia,amd,intel',sortBy='cores')

! Select device with most cores and create command queue
devices = fclFindDevices(sortBy='cores') !,type='cpu')
write(*,*) 'Using device: ',devices(1)%name
call fclSetDefaultCommandQ(fclCreateCommandQ(devices(1),enableProfiling=.true.))
write(*,*) 'Using device: ',device%name
call fclSetDefaultCommandQ(fclCreateCommandQ(device,enableProfiling=.true.))

! Load kernel from file and compile
! call fclSourceFromFile('examples/sum.cl',kernelSrc)
@@ -32,8 +31,8 @@ program sum
sumKernel = fclGetProgramKernel(prog,'sum')

! Initialise device arrays
array1_d = fclBufferFloat(Nelem,read=.true.,write=.false.)
array2_d = fclBufferFloat(Nelem,read=.true.,write=.true.)
call fclInitBuffer(array1_d,Nelem,access='r')
call fclInitBuffer(array2_d,Nelem,access='rw')

! Initialise host array data
do i=1,Nelem
110 changes: 0 additions & 110 deletions external/Quicksort.f90

This file was deleted.
