Tools & Scripts

(April 12, 2013)

In recent years, I’ve written dozens of small one-file tools for my specific purposes, some of which might be interesting for other users as well, so I’m going to publish them here. In most cases, it’s really just a small tool, but there are also some grown-up applications among them, some of which even have a graphical interface. Most of the tools are written in Python 2.x; exceptions will be marked as such.

The descriptions on this page are going to be very short, which doesn’t do justice to the more complex programs in here. The individual scripts usually contain a little more documentation right at the beginning of the source code file, so clicking the link should help a bit.

However, that’s still not enough for some of the more complex beasts in here. Some of these already come with a link to a blog article that describes them in more detail. I will try to write up more complete descriptions in the future. You can control and accelerate this process by dropping me a note on what tool you would like to know more about in particular.

bargraph.py — generate bar graphs in SVG format

This tool creates (somewhat) fancy colorful bar graphs in SVG format, using a simple custom text file format as input. It has been used to create the graphs for my old H.264 Decoder Benchmark articles, for example.

blurip.py — semi-automatic Blu-ray/AVCHD to MKV conversion

I often deal with directory dumps of Blu-ray discs or AVCHD media that have been encoded with inefficient encoders at excessive bitrates and want to re-encode these with x264 into Matroska files. There are tools that do this automatically, but they don’t offer the desired level of control; on the other hand, running the »tools of the trade« manually (x264 for video, the excellent eac3to for audio and subtitle extraction, and finally MKVToolNix for multiplexing) is a bit tedious, especially when the source material has additional letterboxing that needs to be cropped off. blurip.py takes care of all this: Using MPlayer as an additional external tool, the black bars are autodetected and there’s even an encoding preview.
Note that this program is not able to rip commercial Blu-ray discs.

cal.py — PostScript calendar generator

First, this is a Python library that can be used to generate calendars. It’s a bit biased towards Germany, as that’s the only country for which it comes with a full set of holiday data.
More interesting than the library itself is the example program that is contained in it, as this can be used to generate two types of calendars in PostScript format. The first one is a single sheet of paper containing an overview of a full year. The other one is a special format that contains a view of three months on each quarter page, suitable for table top flip-over calendars.

cgrep.py — advanced grep for C source files

This is an older tool that acts like a simplified UNIX grep, but with two additional features: first, it tries to extract the name of the C function where each hit is found and output it (in another color) as well. Second, if the output of the program is redirected into a file, it will write HTML instead of plain text.

ensure_folder_jpg.py — clean up folder artwork in music collections

For historical reasons, many operating systems and audio players expect album artwork in one of two locations: embedded into the file or in an external file called Folder.jpg. This script scans whole directory trees and renames (and possibly converts) single images in folders with music files.

extract_gdepth.py — extract depth image from Google Camera »Lens Blur« photos

This is a simple script that extracts the original (sharp) image and the computed depth map from JPEG images that have been shot with Google’s Android Camera app in »Lens Blur« mode.

flatcopy.py — flatten a directory tree

This tool helped me in a specific situation where I needed to take files that were sprawled across a directory tree (including some symlinks) and copy them into a flat directory.
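
The idea can be illustrated with a minimal Python 3 sketch. This is not the code of flatcopy.py itself: the function name `flat_copy` and the policy of appending `_1`, `_2`, ... to colliding names are assumptions made for illustration. `shutil.copy2` copies the content a symlinked file points to, but symlinked directories are not followed here.

```python
import os
import shutil

def flat_copy(src_root, dest_dir):
    """Copy every file found below src_root directly into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for folder, _subdirs, files in os.walk(src_root):
        for name in sorted(files):
            target = os.path.join(dest_dir, name)
            stem, ext = os.path.splitext(name)
            counter = 1
            # Assumed collision policy: append _1, _2, ... to the base name.
            while os.path.exists(target):
                target = os.path.join(dest_dir, "%s_%d%s" % (stem, counter, ext))
                counter += 1
            shutil.copy2(os.path.join(folder, name), target)
            copied.append(os.path.basename(target))
    return copied
```

The destination should live outside the source tree; otherwise the walk would pick up the freshly copied files as well.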

gallery.py — generate HTML5+JavaScript image galleries

There are hundreds of HTML image gallery generators out there, but none of them suited all my needs: It should work without server-side scripting, have thumbnails, not dismantle the »open in new tab« functionality in the browser and it should be possible to copy and paste the page URL while viewing an image to get a link that starts the gallery with that exact image. As usual, I decided to write my own solution for this. The result is a Python script that scales down the images and generates an index.html file that contains all the plumbing in HTML5, CSS and JavaScript. Optionally, links to download the original (unscaled) images can be generated as well. If the server supports PHP, it can also generate a PHP script that generates a ZIP file of all original images on the fly, so that users can download the originals individually or all at once without storing them twice on the server.

gliss.cpp — advanced slideshow viewer

While Impressive can be used for image slideshows, it’s clearly better suited for PDF presentations: When showing photos, I often want to zoom and scroll around in them quickly, and Impressive simply isn’t up to the task. I could have improved Impressive to solve this, but instead I went down the easier route: write a fancy picture viewer from scratch. The result is a single-file program that displays JPEG images using OpenGL 1.1, provides ultra-smooth panning and zooming (without sacrificing image quality in simple full-screen display mode), transitions, EXIF information and the option to call an external movie player (MPC-HC, MPlayer, VLC) for video files. All this is put into a single C++ file, runs on Win32 or SDL and has only OpenGL and libjpeg-turbo as dependencies.
Win32 binary available: gliss.exe (163k)

gpx2kml.py — GPS track log processing and conversion

When I travel, I usually take my GPS track logger with me and when back at home, I want to import the track into Google Earth. No problem so far. However, I’d like to split the track into multiple sub-tracks, colored by way of transportation, and this is nearly impossible to do with Google Earth’s abysmal polygon editor alone. This tool takes care of this: It converts GPX track logs into KML, but it can split tracks by times specified in a text file and style them individually. It can also simplify tracks and geo-reference photos in the KML file – at least if their time stamps are accurate, but that’s the job of PhotoJoin.

hexdump.py — yet another hexdump tool for the console

This program simply generates hex dumps in the traditional offset / hex / ASCII format with 16 bytes per line, as it’s used by various other tools. In addition to that, it can search for hexadecimal patterns and dump the few bytes directly following them.
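
For readers unfamiliar with the format, here is a minimal Python 3 sketch of both features. It is not taken from hexdump.py: the function names, the exact column layout and the `following` parameter are assumptions chosen for illustration.

```python
def hexdump_lines(data, width=16):
    """Yield one 'offset  hex  ASCII' line per `width` bytes of `data`."""
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("%02x" % b for b in chunk)
        # Non-printable bytes are shown as '.' in the ASCII column.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the ASCII column lines up on short last lines.
        yield "%08x  %-*s  %s" % (offset, width * 3 - 1, hex_part, ascii_part)

def find_pattern(data, hex_pattern, following=8):
    """Yield (position, bytes directly after the match) for each match."""
    needle = bytes.fromhex(hex_pattern)
    pos = data.find(needle)
    while pos >= 0:
        end = pos + len(needle)
        yield pos, data[end:end + following]
        pos = data.find(needle, pos + 1)
```

A file could then be dumped with something like `for line in hexdump_lines(open(name, "rb").read()): print(line)`.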

hindex.py — HTML index file generation

When files are uploaded onto a webserver, the server is usually able to create HTML directory listings on the fly. If such an HTML-based browsable directory structure is desired on a local disk, this script can generate the necessary HTML index files.

htshare.py — web server for file sharing

This script makes sharing large files over the internet easier. It is a simple web server with the additional functionality that one HTShare server (an »uplink«) can connect to another one (a »hub«) so that its files are accessible over the hub as well, using the hub as a relay. Since all connections originate from the uplink, this even works behind firewalls and NATs; only the hub needs to be accessible from outside.

icsmaker.py — simple creation of multiple calendar events

This script is very helpful if a longer event (in my typical use case, a demoparty) is split into several sub-events (like competitions) and each of these sub-events should have its own calendar entry. The user creates a text file with a simple syntax, describing all the events to generate (with optional description texts, which is useful for e.g. seminars). The script then creates a bunch of .ics files that can be imported into a calendar application, or a single .ics file for applications supporting that.

jpegcrop.py — interactive lossless JPEG cropping tool

When cropping JPEG images in typical image editors like GIMP, they are decoded and re-compressed when saving them, resulting in a slight loss of quality. It is, however, possible to crop JPEG files losslessly, with some constraints. The command-line tool jpegtran does exactly that (and a few other tricks too), but it is a bit cumbersome to use. For Windows, there’s a graphical application called jpegcrop.exe, but it is a bit outdated, doesn’t allow cropping with a fixed aspect ratio, and has brain-dead default settings that produce (almost) unusable results.
The jpegcrop.py script is a simple Tk-based front-end for jpegtran that displays the source image and allows the user to specify a crop rectangle interactively (including fixed aspect ratio cropping).
Win32 binary available: JPEGcrop.exe (6.2M)
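
The main constraint of lossless cropping is that the top-left corner of the crop rectangle has to sit on a block (MCU) boundary of the JPEG file. The following Python 3 sketch shows how a front-end might prepare a jpegtran call under that constraint. It is an illustration only, not the code of jpegcrop.py: the function names are hypothetical, and the MCU size of 16 pixels is an assumed default (the real value depends on the chroma subsampling of the file and may be 8 in one or both directions).

```python
def aligned_crop(x, y, width, height, mcu=16):
    """Move the top-left corner up and left onto an MCU boundary and grow
    the rectangle so that the requested area stays inside it."""
    ax = x - x % mcu
    ay = y - y % mcu
    return ax, ay, width + (x - ax), height + (y - ay)

def jpegtran_command(src, dst, x, y, width, height, mcu=16):
    """Build the argument list for a lossless crop with jpegtran."""
    ax, ay, w, h = aligned_crop(x, y, width, height, mcu)
    geometry = "%dx%d+%d+%d" % (w, h, ax, ay)
    # -copy all keeps EXIF and other metadata in the output file.
    return ["jpegtran", "-copy", "all", "-crop", geometry,
            "-outfile", dst, src]
```

The resulting list can be passed to `subprocess.call()`.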

kill_cr_inplace.py — Microsoft-to-POSIX line end conversion

This script removes all »carriage return« characters from one or more files, overwriting the input files. Very useful when multiple files are contaminated with DOS/Windows-style line endings.
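
The core of such a conversion fits into a few lines. The following Python 3 sketch is an illustration under assumptions, not the script itself: the function name is hypothetical, and the whole file is read into memory, which may not be what the original does.

```python
def strip_carriage_returns(path):
    """Remove every CR byte from the file in place; return how many were removed."""
    with open(path, "rb") as f:
        data = f.read()
    cleaned = data.replace(b"\r", b"")
    # Only rewrite the file if something actually changed.
    if cleaned != data:
        with open(path, "wb") as f:
            f.write(cleaned)
    return len(data) - len(cleaned)
```

Working in binary mode matters here: in text mode, Python would translate line endings on its own and hide the characters that are supposed to be removed.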

kill_id3v2_inplace.py — remove ID3v2 tags from MP3 files

I never liked the ID3v2 standard and I try to avoid it whenever possible. In the (perhaps unlikely) case that you feel the same, this tool may be useful for you: It removes the ID3v2 tags from one or multiple files without mercy, overwriting the original file.

kmltrackjoin.py — join tracks in a KML file

Google Earth unfortunately lacks an option to join multiple tracks (»line strings« in KML parlance) together. This tool implements that externally: It reads a KML or KMZ file, joins all tracks inside each folder together and writes a new KML file where those folders have been replaced by the joined tracks.

kmltracksplit.py — split a GPS track by placemarks

This tool is similar to gpx2kml.py in that its purpose is splitting GPS tracks, but that’s where the similarities end. Instead of GPX, it takes KML as input; it does not split by time, but by location of the closest placemark that is already present in the input KML file, and it generates sub-tracks that are named by the two placemarks they connect.

lametool.py — MP3 encoding of whole albums

This is a console-mode dialog driven application targeted at encoding whole directories of WAV files into MP3 format using the LAME encoder. It doesn’t have any fancy features except multithreading: LAME itself isn’t capable of exploiting multi-core CPUs, but this tool can simply run multiple instances of LAME, each encoding one track of an album. This makes it possible to encode whole CDs in a mere minute on modern systems.
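
The parallelization idea can be sketched in Python 3 as follows. This is not lametool.py's code: the function names are hypothetical, the LAME call uses only the plain `lame input.wav output.mp3` form without quality options, and the `run` parameter exists purely so that the sketch can be tried without LAME installed.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

def lame_command(wav_path):
    """Build a plain 'lame input.wav output.mp3' command line."""
    mp3_path = os.path.splitext(wav_path)[0] + ".mp3"
    return ["lame", wav_path, mp3_path]

def encode_all(wav_paths, workers=None, run=subprocess.call):
    """Encode all tracks, running up to `workers` LAME processes at once."""
    workers = workers or os.cpu_count() or 1
    commands = [lame_command(path) for path in wav_paths]
    # Threads are sufficient: each one just waits for an external process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, commands))
```

The returned list holds one exit code per track, in the order of the input files.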

pcf2fon.py — convert X11 bitmap fonts into Windows bitmap fonts

This tool converts X11 bitmap fonts in .pcf format into Windows bitmap fonts in .fon format, using either the ANSI or DOS codepage.

pdfgen.py — generate PDF documents from images

This tool takes images in any format, puts them onto pages of a defined size and generates a PDF file from that. No modifications (like rescaling) are made to the images; JPEG images are even pasted into the PDF file in their original compressed format.
(detailed description here)

photojoin.py — synchronize photos from multiple cameras

When multiple cameras are used to take photos of a single event, it’s hard to put them together on a common timeline: Usually, the internal clocks of the cameras are not synchronized together, making photos that have been taken at the same time appear at wildly different points in the timeline. This tool proposes a solution to the problem.
(detailed description here)
Win32 binary available: PhotoJoin.exe (5.5M)

project_generator.py — generate build environment for simple C projects

Many of my experiments start as single C files, but creating and maintaining a working Makefile or Visual Studio project setup is a big hassle. This tool takes care of that: It creates a directory with a »hello world« template C file and the necessary boilerplate for building around it.

reimage.py — shrink JPEG images to a specified size

This program takes an image as input, scales it down so that it doesn’t exceed a specified maximum resolution and then compresses it into a JPEG file with a specified, fixed file size.
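
How reimage.py arrives at the target file size is not described here. One common approach, shown below as a Python 3 sketch, is a binary search over the JPEG quality setting. Everything in it is an assumption for illustration: the function name, the quality range of 1 to 95, and the premise that the encoded size grows with the quality value. It finds the best quality that stays within the limit rather than hitting the size exactly.

```python
def best_quality(encode, max_bytes, lo=1, hi=95):
    """Return (quality, data) for the highest quality in lo..hi whose encoded
    size does not exceed max_bytes, or None if even `lo` is too large.

    `encode` is any callable that maps a quality value to encoded bytes."""
    best = None
    while lo <= hi:
        quality = (lo + hi) // 2
        data = encode(quality)
        if len(data) <= max_bytes:
            best = (quality, data)   # fits: try a higher quality next
            lo = quality + 1
        else:
            hi = quality - 1         # too large: try a lower quality
    return best
```

With an imaging library, `encode` would typically save the scaled-down image into an in-memory buffer at the given quality and return the buffer's contents.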

rename_helper.py — replicate file rename operations

If two people have a copy of the same files, then one of them renames them to better fit into some scheme and the other one wants to rename the files in the same way, the only sane solution is often to copy the files again. This tool circumvents that by generating a list of the new file names together with each file’s (abbreviated) hash. The other person can then run the tool with this list to apply the new names to his or her own copy of the files.
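
A minimal Python 3 sketch of the principle follows. It is not the actual tool: the function names, the choice of SHA-1, the 16-digit abbreviation and the use of an in-memory dictionary instead of a list file are all assumptions, and name collisions are not handled.

```python
import hashlib
import os

def short_hash(path, digits=16):
    """Hash a file's content and return an abbreviated hex digest."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    return h.hexdigest()[:digits]

def make_list(directory):
    """First person: map each file's hash to its (already renamed) name."""
    names = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            names[short_hash(path)] = name
    return names

def apply_list(directory, names_by_hash):
    """Second person: rename local files whose content matches the list."""
    renamed = 0
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        new_name = names_by_hash.get(short_hash(path))
        if new_name and new_name != name:
            os.rename(path, os.path.join(directory, new_name))
            renamed += 1
    return renamed
```

Because files are matched by content rather than by name, the second copy may use any naming scheme as long as the data is identical.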

tailor.py — Unix »tail« replacement with extras

This tool acts like the standard Unix command »tail -f«, but it adds two features: First, all lines can be prepended with a timestamp, and second, lines matching specific regular expressions can be marked by writing them in color.
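
The two extras can be pictured with the following Python 3 sketch of the per-line processing. It does not come from tailor.py: the function name, the rule list, the ANSI color codes and the timestamp layout are assumptions for illustration, and the file-following loop is left out.

```python
import re

RESET = "\033[0m"
# Hypothetical rules: the first matching pattern decides the color.
RULES = [
    (re.compile(r"error", re.IGNORECASE), "\033[31m"),    # red
    (re.compile(r"warning", re.IGNORECASE), "\033[33m"),  # yellow
]

def decorate(line, rules=RULES, timestamp=None):
    """Color a line according to the rules and optionally prefix a timestamp."""
    text = line.rstrip("\n")
    for pattern, color in rules:
        if pattern.search(text):
            text = color + text + RESET
            break
    if timestamp is not None:
        text = "[%s] %s" % (timestamp, text)
    return text
```

In a live setting, the timestamp would typically be something like `time.strftime("%H:%M:%S")`, taken at the moment the line shows up.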

truncate.py — truncate files

This script simply truncates a file at a specified position in bytes (or KiB, MiB, GiB).
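
As an illustration, a Python 3 sketch of the two steps involved could look like this. The function names and the accepted size syntax (a number, optionally followed by KiB, MiB or GiB) are assumptions; the command-line syntax of the actual script may differ.

```python
import re

UNITS = {None: 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_size(text):
    """Turn strings like '512', '2KiB' or '1 MiB' into a byte count."""
    match = re.fullmatch(r"(\d+)\s*(?:([KMG])iB)?", text.strip())
    if not match:
        raise ValueError("unrecognized size: %r" % text)
    return int(match.group(1)) * UNITS[match.group(2)]

def truncate_file(path, size):
    """Cut the file off at `size` bytes, keeping the beginning."""
    with open(path, "r+b") as f:
        f.truncate(size)
```

Opening the file in "r+b" mode is essential, since "wb" would already empty the file before the truncation takes place.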

tscut.c — cut MPEG-2 Transport Streams

This command-line tool cuts MPEG-2 Transport Streams with MPEG-2, H.264/AVC or H.265/HEVC video on I/IDR frame boundaries. It doesn’t do any fancy modifications to the streams: It just detects suitable positions in the file where it can be cut so that the file remains playable and copies everything verbatim.
It can also be used to recover missing PATs and PMTs for single-program Transport Streams.
Win32 binary available: tscut.exe (98k)

untabify.py — convert tabs to spaces

This simple script converts ASCII tabulator codes into runs of spaces and optionally removes whitespace at the end of lines.
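
In Python 3, the core of this operation is close to a one-liner thanks to `str.expandtabs()`. The sketch below is an illustration, not the script's code: the function name and the default tab width of 8 are assumptions, and tabs are expanded up to the next tab stop rather than to a fixed number of spaces.

```python
def untabify(line, tab_width=8, strip_trailing=False):
    """Expand tabs in a single line (given without its line ending)."""
    line = line.expandtabs(tab_width)
    if strip_trailing:
        line = line.rstrip()
    return line
```

A whole text can be processed with `"\n".join(untabify(l, 4, True) for l in text.split("\n"))`.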

urlqueue.py — maintain a queue of URLs

This program implements a kind of »read later« list of URLs: It serves a list (or rather, a queue) of URLs via a local webserver. Using specific bookmarklets, the user can put URLs he/she visits into the queue and then, later, recall the URLs one after another.

wikkit.py — parallel HTTP downloader

This tool can download multiple files via HTTP at once, which is often faster than downloading them one after another. It has a wget-like recursive mode for mirroring whole (parts of) websites; furthermore, it can grab numbered ranges of URLs as they are often found in image galleries.

webvideohelper.py — web video conversion helper

This tool helps with converting videos for publishing on the internet in MP4 (H.264) and WebM (VP8) format. It can downscale and convert audio on the fly, and it can create »poster images« (still image previews of the video with a »click to play« button on them) and template HTML code as well. Note that this tool needs FFmpeg installed on the system to work.

zipstream.py — on-the-fly ZIP file generation

This is both a Python library as well as a console application that can be used to generate ZIP files. The special thing here is that it is possible to use the program in streaming mode, i.e. both the input files and the resulting ZIP file can be piped to and from the program on the fly.
