Convert Any Book To A DIY Audiobook?

If the idea of reading a physical book sounds like hard work, [Nick Bild’s] latest project, the PageParrot, might be for you. While AI gets a lot of flak these days, one thing modern multimodal models do exceptionally well is image interpretation, and PageParrot demonstrates just how accessible that’s become.

[Nick] demonstrates quite clearly how little code is needed to get from those cryptic black and white glyphs to sounds the average human can understand, specifically a paltry 80 lines of Python. Admittedly, many of those lines are pulling in libraries, and some are just blank, so functionally speaking, it’s even shorter than that. Of course, the whole application is mostly glue code, stitching together other people’s hard work, but it’s still instructive and fun to play with.

The hardware required is a Raspberry Pi Zero 2 W, a camera (in this case, a USB webcam), and something to hold it above the book. With just a little configuration, any Pi that can connect to a camera should also work.

On the software side, [Nick] pulls in the cv2 module (the Python interface to OpenCV) to handle the camera, setting it to capture at full HD resolution. Google’s GenAI SDK is used to talk to the Gemini 2.5 Flash LLM via an API endpoint. This takes a captured image and a trivial prompt, and returns the whole page of text, quick as a flash.
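To give a feel for how little glue is involved, here is a minimal sketch of the capture-and-transcribe step. It is not [Nick]’s actual code: the function names, the prompt wording, and the camera index are assumptions made for illustration, and it assumes the opencv-python and google-genai packages are installed with an API key available in the GEMINI_API_KEY environment variable.

```python
# Sketch only: names and prompt are hypothetical, not taken from PageParrot.
MODEL = "gemini-2.5-flash"
PROMPT = "Transcribe all of the text on this book page. Return only the text."


def capture_page(device_index=0, width=1920, height=1080):
    """Grab one full HD frame from the webcam and return it as JPEG bytes."""
    import cv2  # imported here so the module loads without a camera stack

    cam = cv2.VideoCapture(device_index)
    try:
        # Request full HD; the driver may fall back to what the camera supports.
        cam.set(cv2.CAP_PROP_FRAME_WIDTH, width)
        cam.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
        ok, frame = cam.read()
    finally:
        cam.release()
    if not ok:
        raise RuntimeError("could not read a frame from the camera")

    ok, jpeg = cv2.imencode(".jpg", frame)
    if not ok:
        raise RuntimeError("could not encode the frame as JPEG")
    return jpeg.tobytes()


def page_to_text(jpeg_bytes, prompt=PROMPT, model=MODEL):
    """Send the page image plus a short prompt to Gemini and return the text."""
    from google import genai
    from google.genai import types

    client = genai.Client()  # assumes GEMINI_API_KEY is set in the environment
    response = client.models.generate_content(
        model=model,
        contents=[
            types.Part.from_bytes(data=jpeg_bytes, mime_type="image/jpeg"),
            prompt,
        ],
    )
    return response.text


if __name__ == "__main__":
    print(page_to_text(capture_page()))
```

The third-party imports sit inside the functions so the file can be loaded and inspected on a machine without a camera or network access.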

Finally, the script hands that text over to Piper, which turns it into a speech file in WAV format. This can then be played on an audio device by calling out to the command-line aplay tool. It’s all very simple at this level of abstraction.
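The text-to-speech step can be sketched in a few lines as well. This version assumes the Piper command-line tool is on the PATH as piper and that a voice model file has already been downloaded; the model filename, the output path, and the function names are placeholders, and the original script may well drive Piper differently.

```python
import subprocess


def piper_command(model_path, wav_path):
    """Build the Piper CLI call; Piper reads the text to speak from stdin."""
    return ["piper", "--model", model_path, "--output_file", wav_path]


def aplay_command(wav_path):
    """Build the ALSA aplay call that plays the finished WAV file."""
    return ["aplay", wav_path]


def speak(text, model_path="en_US-lessac-medium.onnx", wav_path="page.wav"):
    """Synthesize text to a WAV file with Piper, then play it with aplay."""
    # model_path and wav_path are hypothetical defaults for illustration.
    subprocess.run(
        piper_command(model_path, wav_path),
        input=text.encode("utf-8"),
        check=True,
    )
    subprocess.run(aplay_command(wav_path), check=True)


if __name__ == "__main__":
    speak("It was the best of times, it was the worst of times.")
```

Passing the commands as lists rather than shell strings means the page text never goes through a shell, so quotes or odd punctuation in a book cannot break the call.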


2025 Pet Hacks Contest: Fort Bawks Is Guarded By Object Detection

One of the difficult things about raising chickens is that you aren’t the only thing that finds them tasty. Foxes, raccoons, hawks: if it can eat meat, it probably wants a bite of your flock. [donutsorelse] wanted to protect his flock and to know when predators were about without staying up all night next to the hen-house. What to do but outsource the role of Chicken Guardian to a Raspberry Pi?

Object detection is done using a YOLOv8 model trained on images of the various predators local to [donutsorelse]. The model runs on a Raspberry Pi and gets its images from a standard webcam. Since the webcam has no low-light capability, the system also has a motion-activated light that arguably goes a long way towards spooking predators away by itself. To help with the spooking, a speaker module plays a specific sound file for each detected predator; presumably different sounds work better at scaring off different predators.
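As a rough illustration of the detect-then-play loop described here, the sketch below uses the ultralytics package to run a YOLOv8 model on webcam frames and maps each detected class to a sound file. Everything specific in it is an assumption: the weights filename, the class names, the sound paths, the confidence threshold, and the use of aplay for playback are placeholders rather than details from [donutsorelse]’s build.

```python
import subprocess

# Hypothetical mapping from detected class name to a deterrent sound file.
SOUNDS = {
    "fox": "sounds/fox.wav",
    "raccoon": "sounds/raccoon.wav",
    "hawk": "sounds/hawk.wav",
}


def sound_for(label, sounds=SOUNDS):
    """Return the sound file for a predator label, or None if there is none."""
    return sounds.get(label)


def detect_labels(frame, model, min_conf=0.5):
    """Run the model on one frame and return the set of confident class names."""
    labels = set()
    for result in model(frame, verbose=False):
        for box in result.boxes:
            if float(box.conf[0]) >= min_conf:
                labels.add(result.names[int(box.cls[0])])
    return labels


def guard(weights="predators.pt", device_index=0):
    """Watch the webcam forever and play a sound for each predator seen."""
    import cv2
    from ultralytics import YOLO

    model = YOLO(weights)  # assumed: custom-trained YOLOv8 weights file
    cam = cv2.VideoCapture(device_index)
    try:
        while True:
            ok, frame = cam.read()
            if not ok:
                continue
            for label in detect_labels(frame, model):
                path = sound_for(label)
                if path:
                    # aplay is an assumption; any audio player would do.
                    subprocess.run(["aplay", path], check=False)
    finally:
        cam.release()


if __name__ == "__main__":
    guard()
```

Keeping the label-to-sound lookup in a plain dictionary makes it easy to add a new predator after retraining: one new class in the model, one new entry in the table.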