While profiling a very simple ebiten project, I noticed that thread.go was making a number of unnecessary allocations to do its work. This change removes them to reduce GC work.
input.go was also making some unnecessary allocations.
The thread.go change reduces the total number of allocations per frame from 1342 to 852 (~36% reduction). The input.go change reduces it further to 752 (~44% total reduction). Perf tests were done on Windows.
Now that the thread object is created at (*UserInterface).Run, we don't
have to care whether the (main) thread is started or not when
Call is called. Allow queueing the functions.
Fixes #884
This makes thread usable not only for the main thread but also for
any other thread.
This is in preparation for iOS Metal, which runs drawing functions on
a particular thread.
Updates #737