Compiling WordNet on Windows to use with Emacs

WordNet running on MSYS2 on Windows

As a non-native English speaker who reads, writes and reviews lots of English texts, I frequently look up definitions as well as synonyms of words. Of course there are numerous online sources available to do this, but I like to decrease my online 'footprint' for privacy reasons. It also takes extra time to switch to a browser window and enter a search query.

Fortunately the fine folks at Princeton University compiled WordNet [1], a large lexical database of English, which can be used offline - together with a tool to search that database. Even better, somebody wrote a package to use WordNet inside my favorite editor Emacs [2]. This means that just by hovering the cursor over a word inside Emacs, the definition as well as synonyms can be shown. The source code [3] is kindly provided by Princeton University.

Compiling WordNet using MSYS2 on/for Windows

As is usually the case, compiling on/for Windows using the MSYS2 subsystem [4] can be done, with a few minor tweaks.

First, start an MSYS2 shell and install the required dependencies (build tools, as well as the programming language Tcl and its widget toolkit Tk):

pacman -Sy --noconfirm base-devel mingw-w64-x86_64-tcl mingw-w64-x86_64-tk

Then …

more ...

VirtualBox Does Not Automatically Resize Disk Image

I use VirtualBox [1] a lot as (local) virtualization software. It is a full-featured virtualization host, and supports multiple underlying disk image file types for guests.

One of those is VirtualBox' native Virtual Disk Image or VDI file type. An advantage of this type is that one can create a dynamically allocated image. This image will initially be very small and not occupy any space for unused virtual disk sectors, but will grow when a disk sector is written to for the first time. VirtualBox does this by checking for unused sectors.
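The "grow on first write" behavior can be illustrated with a toy model. This is only a sketch of the idea and not VirtualBox's actual VDI format; the class name DynamicImage and the 512-byte sector size are assumptions made for the example.

```python
class DynamicImage:
    """Toy model of a dynamically allocated disk image (not the real VDI format)."""

    SECTOR_SIZE = 512  # assumed sector size, for illustration only

    def __init__(self, virtual_sectors):
        self.virtual_sectors = virtual_sectors  # size the guest sees
        self.blocks = {}                        # only sectors written so far

    def write(self, sector, data):
        if not 0 <= sector < self.virtual_sectors:
            raise IndexError("sector outside the virtual disk")
        # The first write to a sector allocates it; a rewrite reuses it.
        self.blocks[sector] = data

    def read(self, sector):
        # Unwritten sectors occupy no space and read back as zeros.
        return self.blocks.get(sector, b"\x00" * self.SECTOR_SIZE)

    def allocated_sectors(self):
        return len(self.blocks)


image = DynamicImage(virtual_sectors=1_000_000)
image.write(0, b"boot".ljust(512, b"\x00"))
image.write(42, b"data".ljust(512, b"\x00"))
image.write(42, b"more".ljust(512, b"\x00"))  # rewrite: no extra growth
print(image.allocated_sectors())
```

The image only tracks sectors that have been written at least once, which is why a freshly created dynamic image is very small.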

However, this poses issues for disks with multiple partitions. If the last partition is, say, an (unused) swap partition, then VirtualBox does not automatically grow the underlying image. Even though the first partition is full, VirtualBox will not grow the image, and therefore the disk will be full without having reached its full potential.

To solve this issue, the machine needs to be partitioned using one big happy partition. Then VirtualBox will dynamically resize according to expectations.

I use packer [2] to prepare disk images for Debian, together with a preseed [3] file. Using preseeding to partition the disk is limited to what is supported by the partition tool …

more ...

Setting Up a New Sphinx Documentation Framework

When I have to write documentation for different formats, I always use the reStructuredText [1] (or reST) format. As this is something that happens quite often, it made sense to put some effort into automating the setup of a new documentation framework with a reusable setup script.

Setting up a framework

The standard documentation framework that I use consists of Sphinx [2], which takes care of converting source pages written in reST into several formats: HTML, for example, but also PDF or something more exotic like EPUB files. Note that Sphinx already comes with a setup script, sphinx-quickstart [3] - but this doesn't take care of deploying files.

In order to be able to create a reusable framework, I split the necessary files into three groups:

  • The Sphinx configuration itself,
  • version information, and
  • a LaTeX formatting template.

The Sphinx configuration

This part consists of two different files: a generic Makefile [4] to build the different artifact types, and a Sphinx configuration file (conf.py [5]) containing basic information about the project and plugin details. These files rarely change after the framework has been initialized.
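As an illustration, a minimal conf.py could look like the sketch below. All values are placeholders picked for this example and are not taken from the actual template; the setting names themselves (project, author, copyright, extensions, exclude_patterns, html_theme) are standard Sphinx configuration values.

```python
# conf.py: minimal Sphinx configuration sketch (all values are placeholders)

# Basic information about the project
project = "Example Documentation"  # hypothetical project name
author = "Jane Doe"                # hypothetical author
copyright = f"2018, {author}"      # Sphinx expects this exact variable name

# Plugin details: extensions to load (this one ships with Sphinx)
extensions = [
    "sphinx.ext.todo",
]

# Files and directories to ignore when looking for source pages
exclude_patterns = ["_build"]

# Theme used for the HTML output (alabaster is Sphinx's default theme)
html_theme = "alabaster"
```

Because conf.py is plain Python, values such as the copyright line can be derived from other variables instead of being repeated.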

Version information

The version information (version, or build number) can change per release, and is therefore contained in a separate …

more ...

Hacker Summer Camp: BSides Las Vegas and DEF CON 2018 review

BSides Las Vegas 2018

Time flies... It has already been a few months since BSides Las Vegas and DEF CON 2018 were held.

BSides Las Vegas was nice, although the overall quality of talks seemed to be a little higher in previous editions. This of course can be completely due to me picking exactly the wrong talks: There is simply too much to see.

DEF CON 2018 was also different than previous editions - mostly because it was now so spread out (Caesars Palace as well as the Flamingo Las Vegas). In practice, this meant a lot of walking between the two locations. When you were attending a talk in one location, it physically wasn't possible to attend the next one unless it was held in the same room or an adjacent one.

Fortunately DEF CON is all about learning, doing and networking - and in that aspect it didn't disappoint.

Especially for Hacker Summer Camp, I designed and built my own badge - consisting of an ESP32 [1] microcontroller running MicroPython [2], an e-Ink display, some custom Python code, and a retro cassette case. The display rotated through numerous fitting images. The current image stayed visible even when the power was disconnected, thanks to the 3-color e-Ink display.

ESP32 and an eInk display

A follow-up …

more ...

Customize and theme tmux the easy way

tmux

Terminal multiplexers allow you to view multiple separate terminal sessions within a single terminal window. Tmux is my terminal multiplexer of choice, as it has more features than the 'original multiplexer' GNU Screen. The default setup gives you some information, but its appearance is, well...

tmux-default

Fortunately you can theme or customize pretty much everything, from the colors to the information being shown in the status bar.

tmux-dracula

In order to make it easier to theme tmux, I split the tmux configuration file into two separate files. One file contains the main configuration (~/.tmux.conf), and another file contains only theming (visual) variables (~/.tmux.THEMENAME.theme). This setup makes it easier to switch between different themes, without changing the main tmux configuration file.

As I wanted to automatically load a theme based on a shell environment variable, I added a small piece of code to the main tmux configuration file. This executes a shell command, which in turn loads the correct theme file.

run-shell "tmux source-file ~/.tmux.\${TMUX_THEME:-default}.theme"

The theme file is loaded dynamically, based on the environment variable $TMUX_THEME. If the environment variable is not set or empty, then the default theme is loaded: ~/.tmux.default.theme.
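The ${TMUX_THEME:-default} expansion falls back to the default for an unset variable as well as for an empty one. The small Python sketch below mirrors that logic for illustration only; the function name theme_path is hypothetical, and tmux itself leaves the expansion to the shell.

```python
import os


def theme_path(env=None):
    """Return the theme file that the run-shell line would load.

    Mirrors the shell expansion ${TMUX_THEME:-default}: the fallback is
    used when the variable is unset and also when it is empty.
    """
    env = os.environ if env is None else env
    name = env.get("TMUX_THEME") or "default"
    return f"~/.tmux.{name}.theme"


print(theme_path({}))                         # variable not set
print(theme_path({"TMUX_THEME": ""}))         # variable set but empty
print(theme_path({"TMUX_THEME": "dracula"}))  # variable set to a theme name
```

The first two calls resolve to the default theme file, the third to ~/.tmux.dracula.theme.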

Loading a different …

more ...

Improving cross-subsystem git workflow: The different git configuration files

Cross-platform

Git configuration settings can be stored in three different files: The system configuration file, the global configuration file and the repository's local configuration file. See git on Windows - location of configuration files [1] for their locations.
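When the same setting appears in more than one of these files, the more specific file wins: local overrides global, which overrides system. A simplified sketch of that precedence is shown below. It ignores multi-valued keys, and the settings and values are made up for the example.

```python
def effective_config(system, global_, local):
    """Merge the three configuration levels; later levels override earlier ones."""
    merged = {}
    for level in (system, global_, local):
        merged.update(level)
    return merged


# Hypothetical settings, for illustration only
system = {"core.autocrlf": "true", "core.editor": "vim"}
global_ = {"user.name": "Jane Doe", "core.autocrlf": "input"}
local = {"user.email": "jane@example.com"}

config = effective_config(system, global_, local)
print(config["core.autocrlf"])  # the global value overrides the system value
print(config["core.editor"])    # only set at system level, so it is kept
```

Settings that are defined at only one level simply carry through, which is what makes it possible to keep each setting in exactly one file.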

When you use multiple subsystems on Windows (like MSYS2, Cygwin or any of the Windows Subsystem for Linux distributions) it can be a chore to keep the git configurations synchronized. In other words: The fewer configuration files to maintain, the better.

Whether it's git for Windows, or one of the subsystem-specific git binaries:

Each of the git binaries that runs on Windows expands the tilde (~) to the home directory, and the path separator is always a slash (/).

These features can be used to our advantage in order to simplify the git configuration files between all subsystems.

Re-defining the system

The system configuration file is meant to store all system-specific configuration settings, which will be applied to all users and git repositories on the system.

If you're the only user of your workstation, it makes sense to re-define system as subsystem:

All subsystem-dependent git configuration settings should be set in the system git configuration file.

This means that settings depending on underlying binaries, like …

more ...

When to sharpen, and when to cut

Cut down a tree

When performing a task for the first time, I think about whether it's a one-off or whether it will become a recurring thing. Python scripts, for example, can be developed blazingly fast, and a little bit of automation can go a long way.

However...

...sometimes, developing an automated solution that looked so simple beforehand becomes a wild ride from one rabbit hole into the next. Missing dependencies, compile errors, functions that don't lend themselves very well to automation: everything that can go wrong will go wrong.

That's why I like The Pomodoro Technique [1] so much, where you work in discrete time chunks of, say, 25 or 30 minutes. You decide upon the maximum cost for the implementation beforehand. Given the expected return, what is a sane investment? If the time is up, then it's back to the original task at hand.

I have learned the hard way to always budget some time for documenting the (partial) solution, so that at least there's the profit of knowledge gained. Or, another record of a failed attempt...

[1] https://francescocirillo.com/pages/pomodoro-technique
more ...

Rebase OpenSSL 1.0.2-chacha to use TLS 1.3

the-road-ahead

Since its inception in 2014, the OpenSSL 1.0.2-chacha fork [1] has been used as the standard OpenSSL distribution for numerous SSL/TLS pentesting tools. In comparison with the vanilla 1.0.2 branch, it includes default support for ciphers that are deemed insecure, and has extensive starttls support.

However, even though 1.0.2 is deemed a Long Term Support (LTS) version, no new ciphers or functionality will be added to it.

The initial reason to start the fork was a lack of ChaCha20 / Poly1305 support in the 1.0.2 branch. After that, more and more features and insecure ciphers were added or ported back in from other branches.

As ChaCha20 / Poly1305 support has been added to the 1.1.1 branch, which also contains (preliminary) TLS 1.3 support, it might be time for the insecure OpenSSL version to be rebased onto a new branch. The initial goals will still be the same:

  • Add as many ciphers and as much functionality as possible
  • Keep the source aligned as much as possible to the vanilla version
  • Keep the patches atomic, transparent and maintainable
  • Write as little custom code as possible

This will be quite the challenge, as the architecture and …

more ...

Tools for setting, tracking and achieving long term goals

planner2018

Immediately after reading an article on David Allen and his brainchild Getting Things Done, I started implementing his methodology. I loved it. I still love it - especially the Getting Things Done concepts of inbox ZERO, maintaining lists, and periodic reviews.

Inbox ZERO for me is not so much about having empty email inboxes as it is about making sure that input is collected from multiple locations and stored in one dedicated location. An inbox can also be a notebook, or note-taking software like Google Keep.

Electronically stored lists have the benefit of being available on a multitude of devices, the ability to synchronize between them, backups, and their biggest advantage - providing dynamic views.

emacs

Both tools that I have been using so far (the open source Java application ThinkingRock [1], and Emacs in Org mode [2]) for maintaining lists of actionable items and projects were great in that respect. Using those tools for periodic reviews was a different story. After trying numerous configurations I never got the hang of using ThinkingRock and Emacs for that purpose. Items became abstract letters on a screen. Views never fully captured what was important or which project served which goal.

Periodically reviewing projects and …

more ...

Diff binary files like docx, odt and pdf with git

conversion_tools

Working with binary file types like the Microsoft Word XML Format Document (docx), the OpenDocument Text format (odt) and the Portable Document Format (pdf) in combination with git has its difficulties. Out of the box, git only provides diffing for plain text formats. Comparing binary files in textual format is not supported.

With a simple configuration change and some open source, cross-platform tools, git can be adapted to diff those formats as well.

Installing the tools

First, one needs the tools which can convert the binary files to plain text formats. For most formats like docx and odt, the open source tool Pandoc [1] will do the trick. It can even export those files to Markdown format, or (my personal choice) reStructuredText [2]. A markup language like reStructuredText makes it possible to make a detailed comparison between structured documents, for instance when the heading level changed.
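The conversion step can be sketched in Python as follows. This is an illustration under assumptions and not the exact setup described in this article: it assumes that pandoc is installed and on the PATH, and the function names pandoc_command and to_text are hypothetical. Pandoc infers the input format from the file extension and writes text formats to standard output by default.

```python
import subprocess


def pandoc_command(path, to="rst"):
    """Build the pandoc invocation that converts a document to a text format."""
    return ["pandoc", "--to", to, path]


def to_text(path, to="rst"):
    """Convert a docx or odt file and return the result as a string.

    Requires pandoc to be installed; raises CalledProcessError on failure.
    """
    result = subprocess.run(
        pandoc_command(path, to),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Show the command that would be run for a hypothetical file name
print(" ".join(pandoc_command("report.docx")))
```

Two versions of a document converted this way can then be compared like any other pair of plain text files.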

For PDF, there's the open source tool pdftotext, which is part of the Poppler [3] utils package and available for (almost) all operating systems. This can convert a PDF file to plain text.

There's a tiny catch with pdftotext, as it has issues using stdout as output, instead of writing to files. This is …

more ...