Make Your Code Slower With Multithreading

With the performance of individual CPU cores plateauing, most performance gains now come from multiple cores and multithreaded applications. A fast GPU, for example, is only so mind-bogglingly quick because thousands of cores operate in parallel on the same set of tasks. So it would seem prudent to write our applications in a multithreaded fashion to take advantage of this parallelism. But as [Marc Brooker] illustrates, it’s not as simple as one would assume: it’s very easy to end up with far worse overall performance and no easy way to fix it.

[Marc] was rerunning an old experiment to calculate, by brute force, the expected number of shared birthdays in a group of people. The experiment was essentially a tight loop running a pseudorandom number generator, the standard libc rand() function. [Marc] profiled single-threaded and multithreaded versions of the code and noted that the runtime increased dramatically beyond two threads. Something fishy was going on. Running perf, [Marc] noted that there were significant L1 cache misses, but the real killer for performance was the increase in expensive context switches. Perf indicated that for four threads, there was an overhead of nearly 50% servicing spin locks. There were no locks in the code itself, so after more perf magic, the syscalls taking all the time were identified. Something in there was using a futex (or fast userspace mutex) a whole lot.
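To make the shape of the experiment concrete, here is a minimal sketch in C. It is not [Marc]'s code: the names `simulate`, `worker`, and `struct job`, the group size, and the seeds are all hypothetical. It rests on one assumption worth stating loudly: on glibc, rand() keeps a single shared state behind an internal lock, so threads calling it in a tight loop can end up queueing on that lock, while rand_r() works on a caller-supplied seed that each thread can keep to itself.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes rand_r() under strict -std= modes */
#include <pthread.h>
#include <stdlib.h>

#define DAYS 365
#define MAX_THREADS 16

/* Hypothetical job description; not taken from the original experiment. */
struct job {
    unsigned int seed;  /* starting seed for this thread's rand_r() stream */
    long trials;        /* number of groups this thread simulates */
    int group_size;     /* people per group */
    int use_shared;     /* 1: rand() with shared state, 0: rand_r() with private state */
    long collisions;    /* result: people whose birthday was already taken */
};

static void *worker(void *arg)
{
    struct job *j = arg;
    unsigned int seed = j->seed;  /* local copy, so threads do not share a cache line */
    long collisions = 0;

    for (long t = 0; t < j->trials; t++) {
        unsigned char seen[DAYS] = {0};
        for (int p = 0; p < j->group_size; p++) {
            /* Assumption: glibc's rand() takes an internal lock on every call.
             * POSIX does not even require rand() to be thread-safe. */
            int r = j->use_shared ? rand() : rand_r(&seed);
            int day = r % DAYS;
            collisions += seen[day];  /* counts a hit if the day was already used */
            seen[day] = 1;
        }
    }
    j->collisions = collisions;
    return NULL;
}

/* Returns the average number of people per group who share an earlier birthday. */
double simulate(int nthreads, long trials_per_thread, int group_size, int use_shared)
{
    pthread_t tid[MAX_THREADS];
    struct job jobs[MAX_THREADS];
    int started[MAX_THREADS];
    long total = 0;

    if (nthreads < 1) nthreads = 1;
    if (nthreads > MAX_THREADS) nthreads = MAX_THREADS;
    if (trials_per_thread < 1) return 0.0;

    for (int i = 0; i < nthreads; i++) {
        jobs[i] = (struct job){ 1234u + (unsigned int)i, trials_per_thread,
                                group_size, use_shared, 0 };
        started[i] = pthread_create(&tid[i], NULL, worker, &jobs[i]) == 0;
        if (!started[i])
            worker(&jobs[i]);  /* fall back to running the job inline */
    }
    for (int i = 0; i < nthreads; i++) {
        if (started[i])
            pthread_join(tid[i], NULL);
        total += jobs[i].collisions;
    }
    return (double)total / ((double)nthreads * (double)trials_per_thread);
}
```

Built with `-pthread` and called from a small `main` as, say, `simulate(4, 1000000, 23, 1)` and then `simulate(4, 1000000, 23, 0)`, the two variants can be compared under `time` or `perf stat -e context-switches`. Whether the shared-state version actually slows down, and by how much, depends on the libc and the machine, so treat any difference as something to measure rather than assume.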


A 6502 Retrocomputer In A Very Tidy Package

One of the designers whose work we see constantly in the world of retrocomputing is [Grant Searle]. His minimal chip count microcomputers have spawned a host of implementations across several processor families.

Often a retrocomputer is by necessity quite large, as an inevitable consequence of having integrated circuits in the period-correct dual-in-line packages with 0.1″ spaced pins. Back in the day there were few micros whose PCBs were smaller than a Eurocard (100 mm x 160 mm, 4″ x 6.3″), and many boasted PCBs much larger.

[Mark Feldman], though, has taken a [Grant Searle] 6502 design and fitted it into a much smaller footprint through ingenious use of two stacked Perf+ prototyping boards. This is a stripboard product that features horizontal traces on one side and vertical on the other, which lends itself to compactness.

Evaluating The Unusual And Innovative Perf+ Protoboard

Back in 2015, [Ben Wang] attempted to re-invent the protoboard with the Perf+. Not long afterward, some improvements (a more convenient hole size and better solder mask, among others) yielded an updated version, which I purchased. It’s an interesting concept, and after making my first board with it, here are my thoughts on what it does well, what it’s like to use, and what place it might have in a workshop.

Perf+ Overview