The high-priority CUDA stream causes libOBS to run at a lower priority than the tracking, which is not what we want. In those cases we would rather have tracking be incomplete than slow down encoding and other work.
Geometry updates are also now done once per frame instead of once per tracking update, which should improve smoothness without affecting performance too much. Additionally, all tracking info is now in the 0..1 range, which drastically simplifies some math, especially with texture coordinates.
To deal with tracking and geometry updates being asynchronous, a very simple approximation of movement velocity has been added. The approximation is mostly inaccurate, but it can bridge the gaps when tracking updates arrive slower than frames, since all values are filtered anyway.
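As a rough illustration of the velocity approximation described above, the following is a minimal sketch and not the actual StreamFX code. The names `tracked_point`, `motion_estimator`, `on_tracking_update` and `predict` are hypothetical, and the sketch assumes positions in the 0..1 range and times in seconds.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical tracked position in normalized 0..1 coordinates.
struct tracked_point {
	float x;
	float y;
};

// Hypothetical helper: estimates velocity from the two most recent
// tracking results and extrapolates the position for frames in between.
class motion_estimator {
	tracked_point _last{0.5f, 0.5f};
	float         _vx         = 0.0f;
	float         _vy         = 0.0f;
	bool          _has_sample = false;

	public:
	// Called whenever a (slow, asynchronous) tracking result arrives.
	// dt is the time in seconds since the previous tracking result.
	void on_tracking_update(tracked_point p, float dt)
	{
		if (_has_sample && dt > 0.0f) {
			_vx = (p.x - _last.x) / dt;
			_vy = (p.y - _last.y) / dt;
		}
		_last       = p;
		_has_sample = true;
	}

	// Called once per frame. elapsed is the time in seconds since the
	// last tracking result. The result is clamped to the 0..1 range.
	tracked_point predict(float elapsed) const
	{
		tracked_point out;
		out.x = std::min(1.0f, std::max(0.0f, _last.x + _vx * elapsed));
		out.y = std::min(1.0f, std::max(0.0f, _last.y + _vy * elapsed));
		return out;
	}
};
```

The per-frame geometry update would call `predict()` with the time since the last tracking result and then pass the value through the existing filters, which smooths over the error of the linear guess.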
* Fixed Linux distros not being able to load the plugin.
* Fixed vertex buffers not being zero initialized.
* Removed all unused mipmapping options and drastically optimized mipmapping.
* Added lots and lots of optional performance profiling.
* Optimized Dual Filtering Blur by re-using render targets.
* Optimized everything to use a single fullscreen triangle instead of quads.
* Removed broken effects.
Adds a new CMake option "ENABLE_PROFILING" which enables all CPU and GPU performance profiling available in StreamFX for tracking what's actually causing things to be slow.
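A minimal sketch of how such an option could gate profiling code, assuming the CMake option is forwarded to the compiler as a preprocessor definition of the same name (this forwarding is an assumption, as are the hypothetical names `profile_stats` and `scoped_profile`; the real StreamFX profiler may look different):

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Assumption: the build system defines ENABLE_PROFILING when the option is on.
#ifdef ENABLE_PROFILING
constexpr bool profiling_enabled = true;
#else
constexpr bool profiling_enabled = false;
#endif

// Hypothetical accumulator for one profiled code section.
struct profile_stats {
	std::uint64_t            samples = 0;
	std::chrono::nanoseconds total{0};
};

// Hypothetical RAII timer: measures the lifetime of the object and adds
// it to the given stats, but only if profiling was enabled at build time.
class scoped_profile {
	profile_stats*                        _stats;
	std::chrono::steady_clock::time_point _start;

	public:
	explicit scoped_profile(profile_stats& stats) : _stats(profiling_enabled ? &stats : nullptr), _start()
	{
		if (_stats)
			_start = std::chrono::steady_clock::now();
	}

	~scoped_profile()
	{
		if (_stats) {
			auto elapsed = std::chrono::steady_clock::now() - _start;
			_stats->total += std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
			_stats->samples++;
		}
	}

	scoped_profile(const scoped_profile&) = delete;
	scoped_profile& operator=(const scoped_profile&) = delete;
};
```

With profiling disabled, the timer does no clock reads and records nothing, so the instrumentation can stay in the code without a measurable cost.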
Asynchronous rendering allows the GPU to perform work while the CPU performs other work, and is significantly faster than lockstep immediate rendering. By reusing existing render targets we can see a performance improvement of up to 500%, while still doing the same things.
The new libOBS API allows us to access the underlying API directly instead of having to mess around in memory. By using it we avoid crashes in case libOBS was built with a different compiler, or in case the actual back-end structure changes.
Additionally the mostly unimplemented and unused options have also been removed, which streamlines the use of this class even further and reduces both shader and code complexity.
Finally, by optimizing the use of the internal render target we can achieve a speedup of up to 3000% over the old way, allowing for many more mipmapped filters.
Q_INIT_RESOURCE and Q_CLEANUP_RESOURCE can't be called from within a namespace and instead have to be called from outside of it, so by moving them into small inline functions at global scope we can fulfill this restriction.
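The pattern can be sketched without Qt by using a stand-in macro. `DEMO_INIT_RESOURCE`, `demo_init_resources_streamfx` and `demo_init_resource` are hypothetical names for illustration only. The stand-in assumes a macro that expands to a block-scope `extern` declaration followed by a call, which is why it only finds the global function when expanded outside of any namespace.

```cpp
#include <cassert>

// Hypothetical stand-in for an initializer generated at global scope.
static int demo_init_calls = 0;
int        demo_init_resources_streamfx()
{
	return ++demo_init_calls;
}

// Hypothetical stand-in macro: declares the initializer and calls it.
// Expanded inside a namespace, the declaration would refer to
// <namespace>::demo_init_resources_<name>, which does not exist.
#define DEMO_INIT_RESOURCE(name)                 \
	do {                                         \
		extern int demo_init_resources_##name(); \
		demo_init_resources_##name();            \
	} while (false)

// Small inline wrapper at global scope, so the macro expands correctly.
inline void demo_init_resource()
{
	DEMO_INIT_RESOURCE(streamfx);
}

namespace streamfx {
	// Code inside the namespace only calls the global wrapper.
	inline void initialize()
	{
		demo_init_resource();
	}
} // namespace streamfx
```

Calling `streamfx::initialize()` then reaches the global initializer through the wrapper, and a matching wrapper for cleanup would follow the same shape.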
Related: #192, #155
* #153, #167, #178 Update localization.
* #157 Fix a few example shaders and add some new ones.
* #160 Fix Look Ahead setting.
* #162 Increase Direct3D11 texture eviction priority.
* #163 Asynchronous Nvidia Face Tracking.
* #164 Add clang-tidy support to CMake.
* #165 Fix some clang-10 errors.
* #168 Updated StreamFX logo.
* #169 Redesigned version string generation.
* #170 Fix compiler error with C++17 when using libobs.
* #171 Add global configuration handler.
* #172 Move Windows exclusive library handler into its own file.
* #174 Refactor everything into the streamfx:: namespace.
* #176 Implement a new UI/UX experience for StreamFX users.
* #177 Link to stdc++fs on GNU and Clang.
* #179 Fix incorrect render sizes and performance with gfx::shader.
* #181, #183 Fix locale strings missing info or no longer being used.
* #182 Add a default path for gfx::shader based on the selected file or the examples directory.
Crowdin commits each language in its own commit instead of merging multiple languages into one commit. This results in very spammy builds, to the point of several hundred being spawned in the same second.
Fixes rendering at unexpected sizes by first rendering to a render target and then rendering the contents of that render target to the frame buffer instead. This also prevents rendering twice or more, which might cause a severe FPS impact.