Load additional functions from CUDA and add new enumerations to support them:
* cuDevicePrimaryCtxSetFlags allows us to sched scheduling mode for the GPU.
* cuCtxgetStreamPriorityRange allows us to check which priority levels are supported.
* cuStreamCreateWithPriority allows us to create streams with non-default priority.
The scheduler mode is now set to yield so that other threads can do work when we hit an eventual stalling problem. Streams can also now be created with higher priority and different flags, if necessary. In most cases this should allow CUDA resources to execute even while the GPU is under heavy load.
Due to the 'nvcuda' library being part of the driver, it falls in a clause of the GPL which allows us to load and interface with system drivers. Since we can't rely on Nvidias headers here (incompatible license), most of this was pulled from FFmpeg and other things were found out via testing.