Now playing: Bad Apple!!

Problem

Being a polyglot can be painful sometimes.

Particularly, it is extremely common for me to have multiple windows open at the same time, each using a different language: at this very moment, I have my messenger open, where I take part in conversations in Chinese, while writing this post in English in my VSCode window.

As a result, with OS defaults, I have to switch between input sources (a.k.a. input methods) so often that I have even picked up the bad habit of continuously pressing the fn key for… nothing at all. Admittedly, fn is more convenient than Windows PCs’ win + space, but wouldn’t it be much better if I didn’t have to press any key?

You might ask: doesn’t System Settings have that feature built right in? Well, kind of. The problem here is that this feature of automatically switching input source is based on the current document, and thus macOS can get it wrong quite frequently.

Wouldn’t it be much better if macOS could switch input sources on a per-app basis, just like Windows does? 1 In fact, there are already many non-free solutions out there that prove this point, like Input Source Pro and KeyboardHolder.

Now that it is theoretically possible, why not make my own “lite version” just for fun?

Whence comes Claveilleur, my own open-source macOS input source switching daemon, whose name comes from the French words for keyboard (clavier) and watchman (veilleur).

Workflow

Just one caveat before we actually begin: I’m not your regular Apple developer.

On the one hand, I do appreciate the speed of ARM-based Macs and the abundance of well-made GUI apps on macOS, but as a random dev mostly doing cross-platform app development, I haven’t used Xcode that much, and I’m just a bit reluctant to leave my polyglot-friendly VSCode…

On the other hand, the lure of Swift does seem irresistible this time (despite the fact that I’m still new to Swift and more at ease with Rust): it has nearly seamless Objective-C interoperability support and I have heard of it even having some exclusive high-level macOS API bindings.

Fortunately, Apple also provides SourceKit-LSP that allows me to code in Swift using any LSP-compatible editor. Combined with the SwiftPM CLI to build the project in the terminal, this does seem to provide the level of VSCode support that I would expect from a popular programming language. Thus, Claveilleur was made entirely without launching Xcode. 2

Solution

I want Claveilleur to be a CLI app in the style of skhd and yabai. All you need to do as a regular user is to download the all-in-one binary, use its CLI to tell launchd that you have a new daemon, and then you’re good to go!

Detecting Input Source Changes

It should be clear by now that the core of Claveilleur relies on observing certain macOS desktop events such as the change of the current input source and of the frontmost app.

Let’s first try to detect changes of the current input source. The problem is, what APIs should I use to achieve that?

It did take me quite some time to find the right search engine keywords, navigate to this very StackOverflow answer, and get the slightest clue about what this Objective-C snippet is doing:

[[NSDistributedNotificationCenter defaultCenter]
           addObserver:self
              selector:@selector(myMethod:)
                  name:(__bridge NSString*)
                           kTISNotifySelectedKeyboardInputSourceChanged
                object:nil
    suspensionBehavior:NSNotificationSuspensionBehaviorDeliverImmediately];

I haven’t written a single line of Objective-C, but it does look like C extended with Smalltalk-style messaging.

Aha! So here we are calling the addObserver method of the default NSDistributedNotificationCenter with myMethod as the callback function.

Let’s see how this might translate to Swift (*typing in VSCode*)…

DistributedNotificationCenter.default.

Wait. This looks interesting…

func publisher(
  for name: Notification.Name,
  object: AnyObject? = nil
) -> NotificationCenter.Publisher

It turns out that Publisher allows the use of FRP (Functional Reactive Programming) on NotificationCenters!

From here, things should become much easier…

The code is as readable as it gets if you are familiar with FRP:

  • .publisher(): We are processing the stream of input source change events. Whenever the input source changes, a kTISNotifySelectedKeyboardInputSourceChanged notification is posted to DistributedNotificationCenter.default.
  • .map(): Whenever a message arrives, we immediately get the current (new) input method.
  • .removeDuplicates(): We also need to ensure that the user is indeed switching to a different input method.
  • .sink(): We get the current (frontmost) app, and associate it with the current input method. Later, when the current app changes, we can then use this information to restore the input method to that associated value.
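The four steps above can be sketched roughly as follows. This is a minimal sketch rather than Claveilleur's actual code: getCurrentInputSourceID and savedInputSources are hypothetical names, and getCurrentAppBundleID is only a naive placeholder that the rest of the post goes on to refine.

```swift
import AppKit
import Carbon
import Combine

/// Hypothetical helper: returns the ID of the current keyboard input source.
func getCurrentInputSourceID() -> String? {
  let source = TISCopyCurrentKeyboardInputSource().takeRetainedValue()
  guard let raw = TISGetInputSourceProperty(source, kTISPropertyInputSourceID) else {
    return nil
  }
  // The property value is a CFString that we do not own.
  return Unmanaged<CFString>.fromOpaque(raw).takeUnretainedValue() as String
}

/// Naive placeholder (assumption): asks NSWorkspace for the frontmost app.
func getCurrentAppBundleID() -> String? {
  NSWorkspace.shared.frontmostApplication?.bundleIdentifier
}

/// Hypothetical store: app bundle ID -> last input source ID used in that app.
var savedInputSources: [String: String] = [:]

let inputSourceChanged = Notification.Name(
  kTISNotifySelectedKeyboardInputSourceChanged as String)

// Keep the returned cancellable alive for as long as we want to observe.
let inputSourceObserver = DistributedNotificationCenter.default()
  .publisher(for: inputSourceChanged)
  // A notification arrived: read the (new) current input source.
  .map { _ in getCurrentInputSourceID() }
  // Ignore notifications that did not actually change the input source.
  .removeDuplicates()
  // Remember the new input source for the current app.
  .sink { sourceID in
    guard let sourceID = sourceID, let app = getCurrentAppBundleID() else { return }
    savedInputSources[app] = sourceID
  }
```

Distributed notifications are only delivered while the process has a running run loop, so a command-line daemon would typically end with something like RunLoop.main.run().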

Yes, it’s just that simple… Or is it?

Detecting Current App Changes

The previous snippet assumes that getCurrentAppBundleID is correctly implemented, but how on earth should we get the current app and detect its change?

NSWorkspace.shared has a frontmostApplication field, the changes of which are even observable. Thus, it is very tempting to get the app bundle ID directly from there:
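A minimal sketch of this tempting approach, assuming Combine's key-value observing publisher is used (frontmostEvents is a hypothetical counter that only makes the events visible):

```swift
import AppKit
import Combine

/// Hypothetical counter, only here to make the events visible.
var frontmostEvents = 0

// `frontmostApplication` is key-value observable, so Combine can publish
// its initial value and every later change.
let frontmostAppObserver = NSWorkspace.shared
  .publisher(for: \.frontmostApplication)
  .map { $0?.bundleIdentifier }
  .removeDuplicates()
  .sink { bundleID in
    frontmostEvents += 1
    print("frontmost app:", bundleID ?? "<none>")
  }
```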

… except it doesn’t work all the time.

The corner case is encountered when, for example, the user activates Spotlight with cmd + space: the frontmostApplication simply doesn’t change at all!

According to this StackOverflow question, this is because floating windows (such as those of Spotlight and Alfred) are somewhat special:

[They] aren’t really active because they use the NSNonactivatingPanelMask style mask, but they still can be focused.

… but there has to be another way, right?

Yes! Right in that question thread, Ryan H has proposed what could very likely be the way forward:

[…] get pids for all the apps you want, and use AXObserver and AXObserverAddNotification to get notifications on them.

The AX prefix here seems to stand for Carbon Accessibility. This makes perfect sense, since the aforementioned proprietary apps also need accessibility privileges to run in the first place!

At this point, the plan for the next step is quite clear: I need to detect every single kAXFocusedWindowChangedNotification or kAXApplicationHiddenNotification message in order to correctly find out the current app! Sounds tedious, doesn’t it?

To make things worse, using AXObserver* APIs for this purpose poses a few difficulties of its own:

  • Those APIs are old C-style ones, which require another kind of dance to call, completely different from what we have seen previously.
  • Those APIs are called on a per-PID basis, but I am not sure what PIDs will be useful to me.
  • The business logic of getting the current app is now disconnected from the way of detecting current app changes.

Receiving kAX* Messages for a Single PID

Now, let’s implement the detection and handling of kAX* messages in the WindowChangeObserver class.

Sadly, we cannot write this part of the code directly in the FRP style: the Carbon Accessibility APIs, which belong to Apple’s C-based Carbon framework, seem to date from as early as Mac OS X v10.2, but most Carbon APIs have already been deprecated since v10.8, so Apple obviously didn’t take the time to provide high-level abstractions for them.

However, I did find just the right amount of info in this Chinese blog post to make my WindowChangeObserver work. As it turns out, the way of calling AXObserver* APIs looks quite like the Objective-C snippet in the Detecting Input Source Changes section, but this time, I’m writing all of this in Swift rather than in C.

The first thing to do is to declare WindowChangeObserver as a subclass of NSObject:

The notifNames mapping here not only gives the two types of messages that we care about, but also helps convert kAX* messages to regular Notification.Name, so that we can send them to a NotificationCenter and handle them in the “old” way.

Next, we declare the callback for the observer:

We need to remember that Carbon is a C-based framework, so naturally we are declaring a C-compatible callback above. That is to say, despite being written as a Swift closure, the callback is not allowed to capture any variable from the environment, and the self reference has to be passed in through the refcon parameter.

However, apart from this restriction, what the callback does is still quite clear: it simply sends the converted Notification.Name to localNotificationCenter.

The implementation of init and deinit methods is quite boring, since they do almost nothing other than initializing and deinitializing rawObserver respectively:

… where .unwrap() is just a convenience method to convert AXErrors to exceptions and throw them.
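Putting the pieces of this section together, a WindowChangeObserver could look roughly like the sketch below. It is not the actual Claveilleur source: the notification name strings, the AXFailure error type, the choice of the main run loop and the use of the PID as the notification's object are all assumptions made for illustration.

```swift
import ApplicationServices
import Foundation

// Hypothetical names for the local notification center and the converted names.
let localNotificationCenter = NotificationCenter()
let focusedWindowChangedNotification = Notification.Name("focusedWindowChanged")
let appHiddenNotification = Notification.Name("appHidden")

/// Hypothetical error type carrying the raw AXError code.
struct AXFailure: Error { let code: Int32 }

extension AXError {
  /// Throws unless the AXError is `.success`.
  func unwrap() throws {
    if self != .success { throw AXFailure(code: rawValue) }
  }
}

final class WindowChangeObserver: NSObject {
  /// The kAX* messages we care about, mapped to regular notification names.
  static let notifNames: [String: Notification.Name] = [
    kAXFocusedWindowChangedNotification: focusedWindowChangedNotification,
    kAXApplicationHiddenNotification: appHiddenNotification,
  ]

  /// C-compatible callback: it captures nothing and recovers `self` from `refcon`.
  static let callback: AXObserverCallback = { _, _, notif, refcon in
    guard let refcon = refcon,
      let name = WindowChangeObserver.notifNames[notif as String]
    else { return }
    let this = Unmanaged<WindowChangeObserver>.fromOpaque(refcon).takeUnretainedValue()
    localNotificationCenter.post(name: name, object: this.pid)
  }

  let pid: pid_t
  private let element: AXUIElement
  private var rawObserver: AXObserver?

  init(pid: pid_t) throws {
    self.pid = pid
    self.element = AXUIElementCreateApplication(pid)
    super.init()

    var observer: AXObserver?
    try AXObserverCreate(pid, WindowChangeObserver.callback, &observer).unwrap()
    guard let observer = observer else {
      throw AXFailure(code: AXError.failure.rawValue)
    }
    rawObserver = observer

    // Hand `self` to the C callback without retaining it.
    let refcon = Unmanaged.passUnretained(self).toOpaque()
    for name in WindowChangeObserver.notifNames.keys {
      try AXObserverAddNotification(observer, element, name as CFString, refcon).unwrap()
    }
    // Assumption: events are delivered on the main run loop.
    CFRunLoopAddSource(
      CFRunLoopGetMain(), AXObserverGetRunLoopSource(observer), .defaultMode)
  }

  deinit {
    guard let observer = rawObserver else { return }
    CFRunLoopRemoveSource(
      CFRunLoopGetMain(), AXObserverGetRunLoopSource(observer), .defaultMode)
    for name in WindowChangeObserver.notifNames.keys {
      AXObserverRemoveNotification(observer, element, name as CFString)
    }
  }
}
```

Creating an instance with try WindowChangeObserver(pid: somePID) only yields events once the process has been granted accessibility privileges and the main run loop is running.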

Tracking the Interesting PIDs

This time, let’s maintain a collection of WindowChangeObservers, one per useful PID, in the RunningAppsObserver class.

As usual, RunningAppsObserver should be declared as a subclass of NSObject:

Here, rawObserver will be initialized to an Objective-C key-value observation of currentWorkSpace.runningApplications, which is responsible for maintaining the windowChangeObservers collection by repeatedly calculating the latest changes in the observed collection of running apps:
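The "calculating the latest changes" step boils down to a set difference between the PIDs we already observe and the PIDs that are currently running. A minimal sketch of that bookkeeping, with a hypothetical reconcile helper that is generic over the observer type (in the daemon, the observer type would be WindowChangeObserver, and the function would be called from the key-value observation handler):

```swift
import Foundation

/// Brings `observers` in line with `currentPIDs`: drops entries for PIDs that
/// are gone and creates entries for PIDs that are new. `makeObserver` may
/// return nil (e.g. when an accessibility call fails); such PIDs are skipped.
func reconcile<Observer>(
  _ observers: inout [pid_t: Observer],
  with currentPIDs: Set<pid_t>,
  makeObserver: (pid_t) -> Observer?
) {
  let knownPIDs = Set(observers.keys)
  // Apps that have quit: releasing the observer tears it down.
  for pid in knownPIDs.subtracting(currentPIDs) {
    observers[pid] = nil
  }
  // Apps that have just launched: start observing them.
  for pid in currentPIDs.subtracting(knownPIDs) {
    observers[pid] = makeObserver(pid)
  }
}

// Usage with strings standing in for real observers:
var observers: [pid_t: String] = [1: "a", 2: "b"]
reconcile(&observers, with: [2, 3]) { "observer for \($0)" }
// `observers` is now [2: "b", 3: "observer for 3"]
```

Existing entries are left untouched on purpose, so an app that keeps running never has its observer torn down and recreated.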

Finally, thanks to the directions of this Python snippet, we can use Quartz Window Services 3 to obtain the collection of “interesting” PIDs (“interesting” as in “having a GUI”):
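A minimal sketch of such a lookup, assuming that "having a GUI" is approximated by "owning at least one window known to the window server" (getWindowedPIDs is a hypothetical name, and the .optionAll filter is a guess rather than the author's choice):

```swift
import CoreGraphics
import Foundation

/// Hypothetical helper: the PIDs of all processes owning at least one window.
func getWindowedPIDs() -> Set<pid_t> {
  // Each entry is a dictionary describing one window.
  guard
    let windows = CGWindowListCopyWindowInfo(.optionAll, kCGNullWindowID)
      as? [[String: Any]]
  else { return [] }
  // Keep only the owner PID of each window, deduplicated by the Set.
  return Set(windows.compactMap { $0[kCGWindowOwnerPID as String] as? pid_t })
}
```

Intersecting this set with the PIDs of NSWorkspace.shared.runningApplications would then give the range of PIDs worth observing.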

At this point, we have finally obtained an observer that can detect current window changes from an automatically-adjusted range of PIDs and send Claveilleur.focusedWindowChangedNotification or Claveilleur.appHiddenNotification messages to localNotificationCenter accordingly:

Consuming the Messages

After all the above efforts, we can finally return to the familiar FRP-style APIs.

First, we have a bunch of different bundle ID-generating publishers, all likely to indicate a new current app:

Then, all we need to do is to declare another observer that consumes all those publishers, and saves or loads input sources according to the current app:
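The two steps above can be sketched as follows. To keep the example self-contained, the current app and the current input source are plain variables here (hypothetical stand-ins for the real lookups and for the input source setter), and only two publishers are merged; the save-or-load policy is likewise a simplification:

```swift
import Combine
import Foundation

// Hypothetical names, mirroring the notifications described earlier.
let localNotificationCenter = NotificationCenter()
let focusedWindowChangedNotification = Notification.Name("focusedWindowChanged")
let appHiddenNotification = Notification.Name("appHidden")

// Stand-ins (assumptions) for the real "current app" and "input source" APIs.
var currentAppBundleID: String? = nil
var currentInputSourceID = "com.apple.keylayout.US"
var savedInputSources: [String: String] = [:]

// Each publisher turns an event into the bundle ID of the (new) current app.
let focusedWindowChangedPublisher = localNotificationCenter
  .publisher(for: focusedWindowChangedNotification)
  .compactMap { _ in currentAppBundleID }

let appHiddenPublisher = localNotificationCenter
  .publisher(for: appHiddenNotification)
  .compactMap { _ in currentAppBundleID }

let appChangedObserver = focusedWindowChangedPublisher
  .merge(with: appHiddenPublisher)
  // Several events may point at the same app; react only to real switches.
  .removeDuplicates()
  .sink { bundleID in
    if let saved = savedInputSources[bundleID] {
      // Known app: restore its input source.
      currentInputSourceID = saved
    } else {
      // New app: remember whatever input source is active right now.
      savedInputSources[bundleID] = currentInputSourceID
    }
  }
```

Because NotificationCenter posts synchronously, setting currentAppBundleID and then posting focusedWindowChangedNotification on localNotificationCenter is enough to drive this pipeline by hand.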

Getting the Current App

Now, the core functionality of Claveilleur is complete. With so much time being spent detecting current app changes, the only missing piece in the puzzle 4 seems to be the way of actually getting the current app.

Long story short: I haven’t completely figured that part out. I have found different ways of doing this, but it seems to me that every single one of them might fail in one way or another under certain circumstances.

At the time of writing, this is achieved by combining AXUIElementGetPid results and NSWorkspace ones, which seems to yield the correct result over 90% of the time.
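One possible way of combining the two sources is sketched below. The order of the lookups and the NSWorkspace fallback are assumptions, not necessarily what Claveilleur does, and this version shares the same failure modes:

```swift
import AppKit
import ApplicationServices

/// Sketch: prefer the app owning the focused UI element (this also covers
/// non-activating panels), and fall back to NSWorkspace otherwise.
func getCurrentAppBundleID() -> String? {
  let systemWide = AXUIElementCreateSystemWide()
  var focused: CFTypeRef?
  let result = AXUIElementCopyAttributeValue(
    systemWide, kAXFocusedUIElementAttribute as CFString, &focused)

  if result == .success, let focused = focused,
    CFGetTypeID(focused) == AXUIElementGetTypeID()
  {
    // The type ID check above makes this forced cast safe.
    let element = focused as! AXUIElement
    var pid: pid_t = 0
    if AXUIElementGetPid(element, &pid) == .success,
      let app = NSRunningApplication(processIdentifier: pid),
      let bundleID = app.bundleIdentifier
    {
      return bundleID
    }
  }
  // Fallback, e.g. when accessibility privileges have not been granted.
  return NSWorkspace.shared.frontmostApplication?.bundleIdentifier
}
```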

Correctly Getting Accessibility Privileges

When configuring the CI builds for Claveilleur, a natural idea is to build a Universal 2 (a.k.a. “fat”) binary that supports both x64 and ARM64 architectures:

swift build -c release --arch arm64 --arch x86_64

However, when running this build on ARM-based Macs, it seemed that the Accessibility Privileges could never be granted (Claveilleur/#2).

That is, the following function always returns false:
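A check of this kind is typically written with AXIsProcessTrustedWithOptions. The sketch below assumes that API is the one involved, and hasAccessibilityPrivileges is a hypothetical name:

```swift
import ApplicationServices

/// Returns whether this process is trusted for accessibility.
/// With `prompt` set to true, macOS may show the system dialog
/// asking the user to grant the privilege.
func hasAccessibilityPrivileges(prompt: Bool) -> Bool {
  let promptKey = kAXTrustedCheckOptionPrompt.takeUnretainedValue() as String
  let options = [promptKey: prompt] as CFDictionary
  return AXIsProcessTrustedWithOptions(options)
}
```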

As it turns out, this has something to do with the code signing rules that macOS is enforcing. It just so happens that when the app is not a macOS bundle, it could be very hard to get it signed correctly.

I solved this problem by first creating a minimal bundle manifest under Supporting/Info.plist:

… then embedding the manifest into the executable 5 by adding linkerSettings to the .executableTarget() section in Package.swift like so:
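A common way to do this is to ask the linker to create an __info_plist section in the __TEXT segment. The Package.swift sketch below assumes that technique; the package name, tools version, platform and target layout are placeholders rather than Claveilleur's real manifest:

```swift
// swift-tools-version:5.7
import PackageDescription

let package = Package(
  name: "Claveilleur",  // placeholder package name
  platforms: [.macOS(.v11)],  // assumption
  targets: [
    .executableTarget(
      name: "claveilleur",
      linkerSettings: [
        // Embed Supporting/Info.plist as the __TEXT,__info_plist section.
        // The path is resolved relative to where the build is invoked
        // (assumed here to be the package root).
        .unsafeFlags([
          "-Xlinker", "-sectcreate",
          "-Xlinker", "__TEXT",
          "-Xlinker", "__info_plist",
          "-Xlinker", "Supporting/Info.plist",
        ])
      ]
    )
  ]
)
```

Note that SwiftPM refuses unsafeFlags in packages pulled in as versioned dependencies, which is not a concern for a top-level executable like this one.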

… and just to be sure, signing the CI build again before publishing:

codesign -dvvv --force --sign - \
  "$(swift build --show-bin-path -c release --arch arm64 --arch x86_64)/claveilleur"

Conclusion

This is in fact only my second time doing Swift (after Ouverture), and my feelings towards this overall experience are still quite complicated at this moment.

On the one hand, Swift does seem like a beautifully-designed programming language to me, which, just like Rust, cares a lot about bringing modern features and patterns into a traditional procedural/object-oriented context.

On the other hand, it seems to me that even for a somewhat experienced developer, having to use quite a few under-documented and under-maintained APIs is still a major issue when getting started with macOS desktop development.

I wish Apple could realize this issue and… change things for the better in the future, maybe?


  1. However, turning on this feature on Windows will lead to another issue where the task bar is also considered as an app. My friend Icecovery’s IMEIndicator provides more details on it and a (hacky) workaround. ↩︎

  2. I still had to download Xcode from the Mac App Store to get the full macOS SDK :( ↩︎

  3. Quartz is the name of the macOS graphics layer, which includes the window server (Quartz Compositor). ↩︎

  4. Apart from the part of getting and setting the current input source (which fits nicely into ~30 lines of Objective-C), that is. ↩︎

  5. If you are using Rust for macOS desktop development, the embed_plist crate might do the work for you. ↩︎