The goal of this lab work is to become familiar with a simple way to harness the power of GPUs for image processing: GLSL shaders in WebGL.
We will be using a restricted and easy-to-learn subset of GLSL features, sufficient to implement algorithms operating on a neighborhood of a given output pixel.
We do not focus on performance and code optimization in this lab work.
We will be writing small programs (shaders) responsible for constructing output images pixel by pixel. The managing code providing inputs to these programs and calling them will be in JavaScript.
Here is a bottom-up introduction to GLSL shaders.
```glsl
precision highp float;

// input image texture
uniform sampler2D inputImage;

// texture coordinates corresponding to the current output pixel position
varying vec2 texCoords;

void main() {
    // sample the input image
    vec4 color = texture2D(inputImage, texCoords);
    // shuffle color channels and write out
    gl_FragColor = color.bgra;
}
```
- GLSL is much like C. There are functions, variables, `if`s and loops (`while` and `for`). `float foo;` declares a variable of the fractional scalar type `float`.
- We will manipulate `float`-based quantities using types like `float`, `vec2`, `vec3`, and `vec4`, representing scalar, 2-, 3-, and 4-component vector data respectively.
- Vector components can be referred to as `r`, `g`, `b`, `a`. For example, if we have a `vec4 stuff`, then `stuff.r` is its first component, `stuff.g` is its second component, and so on. `stuff.rg`, `stuff.gb` and `stuff.gg` are valid `vec2` constructs consisting of the corresponding components of the variable `stuff`, while `stuff.rgb`, `stuff.bgr`, `stuff.bbb`, etc. are all valid `vec3` values.
- For vectors storing coordinates rather than colors, `x`, `y`, `z` and `w` can be used instead of `r`, `g`, `b`, `a`. This is only for readability and has no impact on the behavior of the resulting program. As we mainly operate on colors, we stick to the `rgba` notation in what follows.
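For instance, a few swizzling expressions (a minimal sketch; the values are arbitrary):

```glsl
vec4 stuff = vec4(0.2, 0.4, 0.6, 1.0);
vec3 reversed = stuff.bgr;  // (0.6, 0.4, 0.2)
vec2 doubled  = stuff.gg;   // (0.4, 0.4)
float alpha   = stuff.a;    // 1.0
```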
- Common binary arithmetic operations (`+`, `-`, `*`, `/`) are defined for vector expressions of the same length, as well as for vector/scalar pairs. `*` and `/` perform pointwise multiplication and division. `dot(foo, bar)` returns the dot product of vectors `foo` and `bar`.
- A special global variable `texCoords` of type `varying vec2` specifies the current output pixel position in 0..1 normalized coordinates. This variable is not a part of the GLSL specification: it comes from a predefined vertex shader written specifically for this lab work.
- The `uniform` specifier defines variables that are read-only during shader execution. Their values are set before running the shader program. A `uniform sampler2D` variable refers to an input texture image. A shader may have multiple such variables, to read from multiple textures. `texture2D()` is a function used to read pixels (`vec4` values) from a texture, and is a part of the GLSL standard. The following code is idiomatic: `vec4 inputColor = texture2D(inputTexture, texCoords);`
- The output sampling grid and the values `texCoords` may take are fully determined by the size of the output texture bound to the shader. When sampling an input image of a different size, the `texture2D()` function is set up to perform bilinear interpolation of the input pixels.
- You may declare and use the `uniform vec2 pixelStep` variable containing the pixel displacement to add to (or subtract from) the `texCoords` variable to get a neighbor pixel coordinate. If declared in your shader, `pixelStep` will be automatically filled with the right value. This is not a part of the standard.
- The `void main()` function is the entrypoint (a shader may have other functions too). It is run for every pixel of the output texture. `gl_FragColor` is a `vec4` variable defining the current output pixel color. In this work, this variable is the only way for programs to provide output. Assigning a value to this variable is the primary responsibility of `main`. `gl_FragColor` is a part of the GLSL specification.
- In this lab work we will use 8-bit images, implying that every component of the output texture is internally stored as an 8-bit value. Consequently, `gl_FragColor` is automatically clamped to the 0..1 range before the pixel color value is written to the output texture.
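Putting several of these pieces together, here is a sketch of a 3x3 box blur. The uniform name `inputImage` is our choice; `texCoords` and `pixelStep` are the lab-specific variables described above:

```glsl
precision highp float;
uniform sampler2D inputImage;
uniform vec2 pixelStep;   // filled automatically by the lab framework
varying vec2 texCoords;

void main() {
    // average the 3x3 neighborhood centered on the current pixel
    vec4 sum = vec4(0.0);
    for (int dy = -1; dy <= 1; dy++) {
        for (int dx = -1; dx <= 1; dx++) {
            sum += texture2D(inputImage, texCoords + pixelStep * vec2(float(dx), float(dy)));
        }
    }
    gl_FragColor = sum / 9.0;
}
```

Note that GLSL ES loops must have bounds known at compile time, as they do here.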
For more details refer to the GLSL ES 1.0 quick reference card.
To execute GLSL programs, a host environment is required. We use WebGL in this work: it is widely available, OS- and hardware-independent, and does not require any specific software such as a compilation toolchain. Any common modern web browser supports WebGL.
The programs are embedded into an HTML document. Check `example_basic.html` and `example_with_controls.html` for reference.
- Input images are loaded using `<img>` tags. They are referenced using the `id` attribute (required to be unique).
- GLSL source code is placed into `<script>` tags with `type="x-shader/x-fragment"` and unique `id` attributes.
- The managing code, situated within a large `<script>` block, constructs programs, assigns inputs, and launches their execution.
- `<input>` elements can be used to add knobs to the page, in order to control uniform variables in shader programs. They are referenced by their `id`s.
WebGL intricacies are abstracted through predefined functions:
- `buildProgram(shaderScriptTagId)` constructs and returns a GLSL program from the source code contained in the `<script>` tag with the given id.
- `makeTextureFromImage(imgTagId)` loads the image with the given id into a texture usable in programs.
- `makeEmptyTexture(width, height)` allocates an empty texture of the given size in pixels. Such textures can be used as outputs of programs.
- `runProgram(program, inputs, output)` executes a `program`, taking the `inputs` object and storing the result to the `output` texture. `inputs` is a mapping of uniform variable names to their values. The values can be textures (for `sampler2D` variables), numbers (for `float`s) or arrays (for `vec*` variables). The last argument is optional; if omitted, the program output is displayed on the screen.
- `displayTexture(texture)` can be used to display a given texture on the screen.
- `getKnobValue(inputId)` can be used to retrieve the value of a knob added with an `<input>` tag with the given `inputId`.
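A minimal usage sketch combining these helpers (the tag ids and the uniform name `inputImage` are hypothetical and must match your page and your shader):

```javascript
var program = buildProgram("myShader");        // id of a <script type="x-shader/x-fragment"> tag
var texture = makeTextureFromImage("myPhoto"); // id of an <img> tag
// no third argument: the result is displayed directly on the screen
runProgram(program, { inputImage: texture });
```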
- Write a program converting a colorful image into a grayscale one.
- Try out different ways to map colors to gray values.
- Take inspiration from `example_with_controls.html` to make the conversion parameters tunable.
- What is the "best" way to map colors to gray? Share your observations.
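As a starting point, one common mapping weights the channels by Rec. 601 luma coefficients (a sketch; equal weights of 1/3 each are another option to compare):

```glsl
precision highp float;
uniform sampler2D inputImage;  // assumed uniform name
varying vec2 texCoords;

void main() {
    vec4 color = texture2D(inputImage, texCoords);
    // weighted sum of the color channels; Rec. 601 luma weights
    float gray = dot(color.rgb, vec3(0.299, 0.587, 0.114));
    gl_FragColor = vec4(vec3(gray), 1.0);
}
```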
Decomposing images into "base" and "details" is a common operation in many imaging algorithms. We first generate a blurred version of an input image by applying a low-pass filter. It can then be subsampled with no risk of aliasing, providing a coarse-scale "base" image (a low-resolution representation of the input). Taking the difference between the upsampled "base" and the input image provides a "details" image, containing the high-frequency component of the input.
The original input image can be perfectly reconstructed from its "base" and "details" counterparts.
The decomposition into base/details can then be applied in a recursive fashion, taking "base" as a new input at every iteration. This leads to a multiscale pyramidal representation of the input, with "details" at multiple scales and a "base" image which can be as small as 1x1 pixel.
Likewise, we can perfectly reconstruct the original input from its pyramidal representation, starting from the coarse scale "base", upsampling it, adding "details", upsampling again, etc. Also, if we apply a multiplicative factor to the details at every scale during the reconstruction, we can amplify or attenuate certain frequencies in the reconstructed image.
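As a sketch, the "details" image can be computed in a single shader (the uniform names are ours; the 0.5 offset is one way to fit signed differences into the 0..1 texture range):

```glsl
precision highp float;
uniform sampler2D inputImage;  // full-resolution input
uniform sampler2D baseImage;   // blurred, subsampled base
varying vec2 texCoords;

void main() {
    vec4 inColor   = texture2D(inputImage, texCoords);
    vec4 baseColor = texture2D(baseImage, texCoords);  // bilinear upsampling is implicit
    // offset by 0.5 so that negative differences survive the 0..1 texture range
    gl_FragColor = inColor - baseColor + 0.5;
}
```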
Exercise: make an "equalizer": a set of programs computing the multiscale representation of the input image and applying gains to the corresponding scales to control the amount of "basses", "mids" and "trebles" in the reconstructed image.
- Implement the base/details decomposition and reconstruction. Make sure the reconstructed image matches the input (a visual assessment is sufficient; no need to compare images numerically). When debugging, you may need to visually inspect the details image. Keep in mind that textures can only hold values in the 0..1 range.
- Perform the decomposition process recursively until the 1x1 base image is reached, then perform the reconstruction in the same way. Ensure again the reconstructed image matches the input.
- Take inspiration from `example_with_controls.html` to apply gains to the details at different scales during the reconstruction.
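One reconstruction step with a per-scale gain could look like this (a sketch assuming the details were stored with the 0.5 offset; `gain` would come from a knob via `getKnobValue()`):

```glsl
precision highp float;
uniform sampler2D baseImage;     // coarse-scale reconstruction so far
uniform sampler2D detailsImage;  // details at the current scale
uniform float gain;              // amplification factor for this scale
varying vec2 texCoords;

void main() {
    vec4 baseColor = texture2D(baseImage, texCoords);        // implicitly upsampled
    vec4 details   = texture2D(detailsImage, texCoords) - 0.5;  // undo the storage offset
    gl_FragColor = baseColor + gain * details;
}
```

With `gain` equal to 1.0 at every scale, this reduces to the plain reconstruction.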