History and Development Trivia

This page covers the history and development of ABS-Fractal-Explorer and previous Fractal rendering engines.

Previous Fractal Engines

Scratch

I created my first fractal on May 21st 2021 in Scratch. I later developed the Scratch project into a Mandelbrot Fractal Explorer. It has 10 different fractals to explore. Scratch, however, is not the fastest for rendering Fractals; there was a reason I included a digit for hours in the timer.

JavaScript

JavaScript was significantly faster than Scratch, but was a bit more complicated. Since I couldn't figure out how to get an image to display, I printed a long list of numbers for the image data instead. I would then use Scratch to convert the list of numbers into an image as shown in this video.

Later on I was able to get the fractal to render into a canvas, where I rendered some fractals at 8192x6144 with 1024 iterations.

Java

Java was used to render the 75 Mandelbrot Variants video using CPU multi-threading. I did not know how to save images at the time, so I took screenshots of the output window. Over time I added more features, such as saving images and rendering video sequences. The Java Fractal Engine was also used to test experimental rendering techniques.

C++

I started to learn how to code in C++/C by porting the Fractal rendering code from Java. The code was roughly 1.4x faster than Java if I recall correctly.

OpenGL

C++ and OpenGL were used to render the 330 Cubic Fractals video. The GPU was significantly faster at rendering Fractals, however it was a bit painful to code. It took me a very long time before I could get OpenGL set up properly in Visual Studio 2019. After a while I found a "Ray Tracing" demo that I was able to repurpose into a Fractal renderer. I understood zero of the OpenGL code, and the tutorials didn't really help at all. Like, why do I need to render triangles to display a 2D image?

I had to split the high quality renders into 144 pieces, otherwise OpenGL would crash. The morphs did not have to be split since they were rendered at lower quality. I predicted the video would take anywhere from 20 to 120 hours to render, so I let my PC run during the night. I would then wake up to the surprise that OpenGL had crashed during the night. A few of the morph transitions effectively blew up to infinity, which meant that the portion of the screen taken up by the interior parts of the fractal went up too. That meant more pixels had to be rendered at the full 4096 iterations, causing OpenGL to crash. So now I had to split the image into 2 before going to bed, which slowed down rendering by 30%. All the frames of the video rendered in a combined 28.8 hours.

I gave up on linking libraries and headers in C++, so I did the text overlays in Java where importing stuff was comparatively trivial. Other than a few renders, the OpenGL engine was not used often, as it was too hard to modify OpenGL back-end code that I did not understand.

OpenCL

I wanted to make a proper Fractal Exploration application that anyone could download. So I coded up a CPU renderer, recycling the Engine used for Super Sweeper 0.77.0 Windows Edition, which already had SDL2 set up to display a byte array image. I then wanted more speed, so I looked back into GPU rendering, where I learned about OpenCL. OpenCL was straightforward and easy compared to OpenGL.

ABS-Fractal-Explorer


FracExp files

FracExp files used to just be a Struct binary file. At first it was just a 64bit hash to check for corruption followed by the parameters, but later it also included the ABS-Fractal-Explorer version number for future proofing. Afterwards it included a version number for the FracExp file itself, because it is possible that it may change and update at a different rate to the software itself. The first versions of the FracExp file were used between versions 1.0.3 - 1.0.5 before major changes to ABS-Fractal-Explorer broke their functionality.

The hashing function was "extremely sophisticated", performing a few rounds of an algorithm similar to hash += number; hash ^= number; hash = CircularShift(hash,5);.
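The rounds described above can be sketched roughly as follows. This is only an illustration, not the original code: the function names, the seed parameter, the left rotation direction, and treating the parameters as an array of 64bit words are all assumptions.

```cpp
#include <cstddef>
#include <cstdint>

// Rotate a 64-bit value left by `shift` bits (a circular shift).
// Assumption: the original CircularShift() rotated left; it may have rotated right.
inline uint64_t circularShiftLeft(uint64_t value, unsigned shift) {
    shift &= 63u;
    if (shift == 0) return value;
    return (value << shift) | (value >> (64u - shift));
}

// Hypothetical reconstruction of the add / XOR / rotate mixing loop.
// `seed` is an assumed starting value and is not taken from the original code.
uint64_t legacyHashSketch(const uint64_t* data, size_t count, unsigned rounds, uint64_t seed) {
    uint64_t hash = seed;
    for (unsigned r = 0; r < rounds; r++) {
        for (size_t i = 0; i < count; i++) {
            hash += data[i];
            hash ^= data[i];
            hash = circularShiftLeft(hash, 5);
        }
    }
    return hash;
}
```

In this sketch a seed of 0 always produces 0, because (0 + x) ^ x is 0 and rotating 0 gives 0, which hints at why ad hoc mixing like this is fragile.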

In the transition from version 1.0.8 to version 1.0.9 (2023/10/06), FracExp files were rewritten to be human readable. The revision to the file format added the ability to store multiple fractals and additional metadata such as the file's creation time or author. Multiple 256bit hashes are used to detect different types of manual changes to the file. Prior to 1.0.9, there were plans to open multiple Fractals by passing multiple FracExp files into the terminal.

FracExpKB files follow a similar structure and format to FracExp files with some slight changes.

Version 1.0.4 Code Map

I mainly coded in features without thinking about how everything would connect together. So now I have to refactor a bunch of things. The diagram below is from August 27th 2023 when I was coding ABS-Fractal-Explorer version 1.0.4. Based on version 1.0.8, version 1.1.0 will probably release with a similar structure. Current goals for version 1.0.9 are to get asynchronous threads (so I/O and rendering can happen independently), and to properly integrate Dear ImGui into the UI instead of having it in a separate window.

Version 1.0.31 Code Map

Here is roughly how the code works in Version 1.0.31. It's not at all like 1.0.4 because of all the rewritten code. Currently, render.cpp is the longest file at 2607 lines, because it contains lots of functions and code that should probably be put into separate files. Some of the functions in render.cpp handle the GUI (Dear ImGui), fractal parameters, monitor/display list, and a few other things not directly related to the original purpose of displaying a uint8_t* onto a window.

render.cpp is almost as long as the 2772 line 99KB main.c file from Super-Sweeper 0.58 (2022/12/31) before I started using header files. That said, having the entire game (which compiled to ~32KB) in a single main.c file made it easy to port version 0.47.0 to Windows.

On October 23rd 2023, I decided to rewrite a lot of things. Now the code has asynchronous threading in the works along with a proper GUI instead of the one that used a graphics library ported from a graphing calculator. I switched over from using Makefile to CMake, which makes compilation faster as object files are only recompiled when needed instead of being thrown out after every compilation. I am working bit by bit to re-implement features using better ways and methods.

Support for 32bit Windows Vista

On March 4th 2024, I got ABS-Fractal-Explorer v1.0.2 to compile and run on 32bit Windows Vista. It was a bit tricky since MinGW64 and MSYS2 don't support 32bit Windows Vista, but I was able to get the older MinGW and msys.bat to work. Read how it's all done here.

ABS-Fractal-Explorer v1.0.2 running on 32bit Windows Vista, next to ABS-Fractal-Explorer v1.1.5 running on 64bit Windows 10.

Getting rid of globals and macros

(2024/03/11 - 2024/03/25) One can compare v1.1.7 rev-0 to v1.1.7 rev-10 and not see much difference when running the program, since each revision was only a minor change. However, in the source code, many things were refactored over these revisions: macros and global variables were avoided, patternMemcpy() was inserted in a few places, and several other things were cleaned up. Lots of deprecated old code was updated/replaced/removed. Nearly every file has been touched and reviewed in some part to reduce other bad programming habits from previous commits.

A lot of it was utilizing more C++ features in the code, such as references, inline functions, and adding more namespaces for enums to reduce magic number constants. One of the new functions I added was a "proper" hashing function to replace the random and arbitrary operations I did to generate hashes beforehand. The 64bit fnv1a hash is a non-cryptographic hash function, but is otherwise pretty fast and simple to implement in the code.
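For reference, a minimal sketch of 64bit FNV-1a is shown below. It uses the standard published offset basis and prime; the function and constant names are illustrative and not necessarily what ABS-Fractal-Explorer uses.

```cpp
#include <cstddef>
#include <cstdint>

// Standard 64-bit FNV-1a parameters.
constexpr uint64_t FNV1A_64_OFFSET_BASIS = 0xcbf29ce484222325ULL;
constexpr uint64_t FNV1A_64_PRIME = 0x00000100000001b3ULL;

// Hashes `length` bytes starting at `data`.
// FNV-1a XORs each byte into the hash first, then multiplies by the prime.
uint64_t fnv1a64(const void* data, size_t length) {
    const uint8_t* bytes = static_cast<const uint8_t*>(data);
    uint64_t hash = FNV1A_64_OFFSET_BASIS;
    for (size_t i = 0; i < length; i++) {
        hash ^= bytes[i];
        hash *= FNV1A_64_PRIME;
    }
    return hash;
}
```

Hashing zero bytes returns the offset basis unchanged, which is a quick sanity check for an implementation.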

SIMD Optimization

(2024/04/17 - 2024/04/18)

After thinking about it for so long, I finally added in SIMD (Single Instruction Multiple Data) Vectorization to speed up and optimize CPU rendering. It was a rather interesting experience writing the SIMD code. Accounting for asynchronous and independent events was a bit tricky since operations are applied to all variables at once. One of these situations was different values escaping at different iteration counts, where I used masks to keep track of which variables had escaped. One nice optimization was being able to remove the ternaries/branching used to apply fabs() based on the formula; it can now be done by setting a mask beforehand and using a bitwise AND to perform fabs().
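The two mask tricks described above can be sketched with SSE2 intrinsics roughly as follows. This is an illustration under assumptions, not the actual renderer code: the function name iterate4, the escape radius of 2, and the formula z = (|zr| + i|zi|)^2 + c for the fabs() case are all chosen for the example.

```cpp
#include <emmintrin.h> // SSE2 intrinsics
#include <cstdint>

// Iterates four points at once and writes each lane's iteration count to iterOut[0..3].
// When useAbs is true, fabs() is applied to both components of z before squaring.
void iterate4(const float* cr, const float* ci, int maxIter, bool useAbs, int32_t* iterOut) {
    const __m128 cReal = _mm_loadu_ps(cr);
    const __m128 cImag = _mm_loadu_ps(ci);
    const __m128 four = _mm_set1_ps(4.0f);
    // 0x7FFFFFFF clears the sign bit (fabs); all bits set leaves the value unchanged.
    const __m128 absMask = _mm_castsi128_ps(_mm_set1_epi32(useAbs ? 0x7FFFFFFF : -1));
    // Lanes that have not escaped yet; starts with every lane set.
    __m128 alive = _mm_castsi128_ps(_mm_set1_epi32(-1));
    __m128 zr = _mm_setzero_ps();
    __m128 zi = _mm_setzero_ps();
    __m128i count = _mm_setzero_si128();
    for (int i = 0; i < maxIter; i++) {
        zr = _mm_and_ps(zr, absMask); // branch-free fabs()
        zi = _mm_and_ps(zi, absMask);
        const __m128 zr2 = _mm_mul_ps(zr, zr);
        const __m128 zi2 = _mm_mul_ps(zi, zi);
        // A lane stays alive only while |z|^2 <= 4; once cleared it never comes back.
        alive = _mm_and_ps(alive, _mm_cmple_ps(_mm_add_ps(zr2, zi2), four));
        if (_mm_movemask_ps(alive) == 0) break; // every lane has escaped
        // An alive lane's mask is -1 as an integer, so subtracting it adds 1.
        count = _mm_sub_epi32(count, _mm_castps_si128(alive));
        const __m128 newZi = _mm_add_ps(_mm_mul_ps(_mm_add_ps(zr, zr), zi), cImag);
        zr = _mm_add_ps(_mm_sub_ps(zr2, zi2), cReal);
        zi = newZi;
    }
    _mm_storeu_si128(reinterpret_cast<__m128i*>(iterOut), count);
}
```

Escaped lanes keep being computed here and may overflow to infinity or NaN, but the sticky alive mask stops them from being counted, so no per-lane branch is needed.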

I started out with the SSE2 instruction set since it is supported by pretty much every x86 CPU released since the early 2000s, and is a system requirement for Windows 8 onwards. The SSE2 instruction set features 128bit registers, which can store 4 32bit floats, or 2 64bit floats. Overall, using SSE2 made 32bit floats about twice as fast, and 64bit floats around 30% faster.

More modern CPUs (from about 2017 onwards) support more advanced SIMD instruction sets, such as AVX512, which uses 512bit registers that can perform operations on 16 32bit floats or 8 64bit floats. According to the March 2024 Steam Hardware Survey, only about 11.2% of computers support AVX512. From the same survey: 100% of computers support SSE2 and SSE3 (Windows 10 requires SSE3); 99.6% support SSSE3, SSE4.1, and SSE4.2; 97.1% support AVX; 93.5% support AVX2; and 31.6% support SSE4a.

(2024/04/21)

I added in AVX rendering, however it did not seem to be that much faster than SSE2. It was around 36% faster for 64bit floats and 34% faster for 32bit floats. It seems like I am leaving a lot of performance on the table, and I would guess that there is still a lot to learn about SIMD instructions and optimization. One thing I do know for certain is that the type punning used to access individual floats ((fp32*)((void*)(&value)))[i] is rather inefficient. To my understanding, accessing floats in this way moves them out of the SIMD registers, creating slowdowns. If I recall correctly, the x87 math instructions share the same registers with the SIMD instructions, which could potentially cause some values to be reloaded.
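One common alternative to reading lanes through a casted pointer is to store the whole register into a plain array with a single instruction and read from that. The sketch below assumes fp32 is a typedef for float and uses hypothetical helper names; whether it is actually faster depends on the compiler and surrounding code, so treat it as something to benchmark rather than a guaranteed fix.

```cpp
#include <xmmintrin.h> // SSE intrinsics

// Copies the four lanes of an SSE register into a plain array in one store,
// instead of reading them individually through a casted pointer.
inline void storeLanes(__m128 value, float out[4]) {
    _mm_storeu_ps(out, value);
}

// Example use: add up the four lanes after a vector computation.
float horizontalSum(__m128 value) {
    float lanes[4];
    storeLanes(value, lanes);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

Keeping such extraction outside the inner iteration loop, so that values stay in SIMD registers while iterating, is likely to matter more than the exact method used to read the lanes.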

It is easy to dismiss what little performance gains we get from SIMD instructions on the CPU when compared to the incredible speeds offered by the GPU. However, while the GPU is the king of 32bit floats, this may not always be true for 64bit floats. Most graphics cards compute 64bit floats at a small fraction of the speed of 32bit floats. Sometimes it's 1:4, other times it's 1:64. In some cases, the GPU might not even support 64bit floats natively. This speed difference is sometimes enough for the CPU to be faster at computing 64bit floats.

Miscellaneous

The Complete Keyboard

In version 1.0.12 (2023/10/29), the Keyboard graphic was expanded to include all 242 SDL2 Scancodes. I grabbed every Keyboard in the house (Although my brother had some resistance) and tested how many Scancodes I would get. I was surprised the AC (Application Control) keys triggered, but otherwise, most Scancodes didn't trigger, and some only worked on half of the keyboards I tested.

While I hoped for more Scancodes to trigger, most of the special keys did not trigger a Scancode. One keyboard had two different keys that sent a "Media Select" Scancode. I did notice that some function keys would briefly send Ctrl/Shift/Meta Scancodes to perform some functions such as taking screenshots. The weirdest result was my laptop, where the Mic Mute button oddly triggered the "Z" and "Minus" Scancodes.

Game Controller Key-binds

A while ago (2023/08/18), I created a concept for Game Controller key-binds, although it will probably be a while before Game Controllers are supported in ABS-Fractal-Explorer. Google Slides is excellent for drawing things!

(2024/01/30)

After thinking it over, it would be better if the bumpers and the d-pad were switched. Most controllers have analog L2 and R2 buttons which would be great for zooming in/out. R1 and R2 could be used to adjust iterations, then the d-pad for incrementing the fractal id/family.
