Problem
Being a polyglot can be painful sometimes.
In particular, it is extremely common for me to have multiple windows open at the same time, each using a different language: at this very moment, I have my messenger open, where I participate in conversations in Chinese, and I am writing this post in English at the same time in my VSCode window.
As a result, with OS defaults, I have to switch between input sources (a.k.a. input methods) so often that I have even developed the bad habit of continuously pressing the fn key for… nothing at all. Admittedly, fn is more convenient than Windows PCs’ win + space, but wouldn’t it be much better if I didn’t have to press any key?
You might ask: doesn’t System Settings have that feature built right in? Well, kind of. The problem here is that this feature of automatically switching input sources is based on the current document, and thus macOS can get it wrong quite frequently.
Wouldn’t it be much better if macOS could switch input sources on a per-app basis, just like Windows does? 1 In fact, there are already many non-free solutions out there that prove this point, like Input Source Pro and KeyboardHolder.
Now that it is theoretically possible, why not make my own “lite version” just for fun?
Whence comes Claveilleur, my own open-source macOS input source switching daemon, whose name comes from the French words for keyboard (clavier) and watchman (veilleur).
Workflow
Just one caveat before we actually begin: I’m not your regular Apple developer.
On the one hand, I do appreciate the speed of ARM-based Macs and the abundance of well-made GUI apps on macOS, but as a random dev mostly doing cross-platform app development, I haven’t used Xcode that much, and I’m just a bit reluctant to leave my polyglot-friendly VSCode…
On the other hand, the lure of Swift does seem irresistible this time (despite the fact that I’m still new to Swift and more at ease with Rust): it has nearly seamless Objective-C interoperability support and I have heard of it even having some exclusive high-level macOS API bindings.
Fortunately, Apple also provides SourceKit-LSP, which allows me to code in Swift using any LSP-compatible editor. Combined with the SwiftPM CLI to build the project in the terminal, this does seem to provide the level of VSCode support that I would expect from a popular programming language. Thus, Claveilleur was made entirely without launching Xcode. 2
Solution
I want Claveilleur to be a CLI app in the style of skhd and yabai. All you need to do as a regular user is to download the all-in-one binary, use its CLI to tell launchd that you have a new daemon, and then you’re good to go!
Detecting Input Source Changes
It should be clear by now that the core of Claveilleur relies on observing certain macOS desktop events, such as changes of the current input source and of the frontmost app.
Let’s first try to detect changes of the current input source. The problem is, what APIs should I use to achieve that?
It did take me quite some time to find the right search engine keywords, navigate to this very StackOverflow answer, and get the slightest clue about what this Objective-C snippet is doing:
[[NSDistributedNotificationCenter defaultCenter]
    addObserver:self
    selector:@selector(myMethod:)
    name:(__bridge NSString*)kTISNotifySelectedKeyboardInputSourceChanged
    object:nil
    suspensionBehavior:NSNotificationSuspensionBehaviorDeliverImmediately];
I haven’t written a single line of Objective-C, but it does look like C extended with Smalltalk-style messaging…
Aha! So here we are calling the addObserver method of the default NSDistributedNotificationCenter with myMethod as the callback function.
Let’s see how this might translate to Swift (*typing in VSCode*)…
DistributedNotificationCenter.default.
Wait. This looks interesting…
func publisher(
for name: Notification.Name,
object: AnyObject? = nil
) -> NotificationCenter.Publisher
It turns out that Publisher allows the use of FRP (Functional Reactive Programming) on NotificationCenters!
From here, things should become much easier…
The code is as readable as it gets if you are familiar with FRP:
- .publisher(): We are processing the stream of input source change events. Whenever such a change happens, a kTISNotifySelectedKeyboardInputSourceChanged message will be posted in DistributedNotificationCenter.default.
- .map(): Whenever a message arrives, we immediately get the current (new) input method.
- .removeDuplicates(): We also need to ensure that the user is indeed switching to a different input method.
- .sink(): We get the current (frontmost) app, and associate it with the current input method. Later, when the current app changes, we can then use this information to restore the input method to that associated value.
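The four steps above might be sketched as follows. This is only an illustration under stated assumptions, not necessarily the code that Claveilleur ships: `currentInputSourceID`, `savedInputSources`, and the placeholder body of `getCurrentAppBundleID` are hypothetical names introduced here.

```swift
import AppKit
import Carbon
import Combine

/// Hypothetical helper: the ID of the current keyboard input source.
func currentInputSourceID() -> String? {
  let source = TISCopyCurrentKeyboardInputSource().takeRetainedValue()
  guard let raw = TISGetInputSourceProperty(source, kTISPropertyInputSourceID) else {
    return nil
  }
  // The property is returned as an untyped pointer to a CFString.
  return Unmanaged<CFString>.fromOpaque(raw).takeUnretainedValue() as String
}

/// Placeholder (assumption): the naive way of getting the current app.
func getCurrentAppBundleID() -> String? {
  NSWorkspace.shared.frontmostApplication?.bundleIdentifier
}

/// Hypothetical in-memory mapping from bundle ID to input source ID.
var savedInputSources: [String: String] = [:]

let inputSourceObserver = DistributedNotificationCenter.default()
  // 1. The stream of input source change events.
  .publisher(for: Notification.Name(kTISNotifySelectedKeyboardInputSourceChanged as String))
  // 2. Read the new input source as soon as a message arrives.
  .map { _ in currentInputSourceID() }
  // 3. Ignore messages that do not actually change the input source.
  .removeDuplicates()
  // 4. Remember the input source for the current app.
  .sink { inputSource in
    guard let inputSource = inputSource, let app = getCurrentAppBundleID() else { return }
    savedInputSources[app] = inputSource
  }
```

Nothing is delivered until the process enters a run loop (for example via `RunLoop.main.run()`), and the returned cancellable has to be kept alive for as long as the observation should last.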
Yes, it’s just that simple… Or is it?
Detecting Current App Changes
The previous snippet assumes that getCurrentAppBundleID is correctly implemented, but how on earth should we get the current app and detect its change?
NSWorkspace.shared has a frontmostApplication property, whose changes are even observable. Thus, it is very tempting to get the app bundle ID directly from there:
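A minimal sketch of this tempting approach, assuming a key-value-observing publisher on `NSWorkspace.shared` (the name `frontmostAppObserver` is made up for illustration):

```swift
import AppKit
import Combine

// KVO-backed publisher: emits the current value first, then every change.
let frontmostAppObserver = NSWorkspace.shared
  .publisher(for: \.frontmostApplication)
  // Keep only apps that actually have a bundle ID.
  .compactMap { $0?.bundleIdentifier }
  .removeDuplicates()
  .sink { bundleID in
    print("Frontmost app is now \(bundleID)")
  }
```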
… except it doesn’t work all the time.
The corner case is encountered when, for example, the user activates Spotlight with cmd + space: the frontmostApplication simply doesn’t change at all!
According to this StackOverflow question, this is because floating windows (such as those of Spotlight and Alfred) are somewhat special:
[They] aren’t really active because they use the NSNonactivatingPanelMask style mask, but they still can be focused.
… but there has to be another way, right?
Yes! Right in that question thread, Ryan H has proposed what could very likely be the way forward:
[…] get pids for all the apps you want, and use AXObserver and AXObserverAddNotification to get notifications on them.
The AX prefix here seems to stand for Carbon Accessibility. This makes perfect sense, since the aforementioned proprietary apps also need accessibility privileges to run in the first place!
At this point, the plan for the next step is quite clear: I need to detect every single kAXFocusedWindowChangedNotification or kAXApplicationHiddenNotification message in order to correctly find out the current app! Sounds tedious, doesn’t it?
To make things worse, using AXObserver* APIs for this purpose comes with a few difficulties of its own:
- Those APIs are old C-style ones, which require another kind of dance to call, completely different from what we have seen previously.
- Those APIs are called on a per-PID basis, but I am not sure what PIDs will be useful to me.
- The business logic of getting the current app is now disconnected from the way of detecting current app changes.
Receiving kAX* Messages for a Single PID
Now, let’s implement the detection and handling of kAX* messages in the WindowChangeObserver class.
Sadly, we cannot write this part of the code directly in the FRP style. The Carbon Accessibility APIs, which belong to Apple’s C-based Carbon framework, seem to date from as early as Mac OS X v10.2, and most Carbon APIs have been deprecated since v10.8, so Apple apparently never took the time to provide high-level abstractions for them.
However, I did find just the right amount of info in this Chinese blog post to make my WindowChangeObserver work. As it turns out, the way of calling AXObserver* APIs looks quite like the Objective-C snippet in the Detecting Input Source Changes section, but this time, I’m writing all of this in Swift rather than in C.
The first thing to do is to declare WindowChangeObserver as a subclass of NSObject:
The notifNames mapping here not only gives the two types of messages that we care about, but also helps convert kAX* messages to regular Notification.Names, so that we can send them to a NotificationCenter and handle them in the “old” way.
Next, we declare the callback for the observer:
We need to remember that Carbon is a C-based framework, so naturally we are declaring a C-compatible callback above. That is to say, despite being written as a Swift closure, the callback is not allowed to capture any variable from its environment, and the self reference is hidden in the refcon parameter.
However, apart from this restriction, what the callback does is still quite clear: it simply sends the converted Notification.Name to localNotificationCenter.
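The refcon trick can be shown in isolation, without any Accessibility API involved. In the sketch below, `CCallback` and `Counter` are hypothetical stand-ins for `AXObserverCallback` and `WindowChangeObserver`: the closure captures nothing, and recovers its owner from the opaque pointer instead.

```swift
/// Stand-in for a C callback type: it only receives an opaque
/// context pointer (the "refcon").
typealias CCallback = @convention(c) (UnsafeMutableRawPointer?) -> Void

final class Counter {
  var count = 0

  /// A non-capturing closure, hence convertible to a C function pointer.
  static let callback: CCallback = { refcon in
    guard let refcon = refcon else { return }
    // Recover the instance from the opaque pointer without
    // changing its retain count.
    let this = Unmanaged<Counter>.fromOpaque(refcon).takeUnretainedValue()
    this.count += 1
  }

  /// The opaque pointer that would be handed to the C API as `refcon`.
  var refcon: UnsafeMutableRawPointer {
    Unmanaged.passUnretained(self).toOpaque()
  }
}

let counter = Counter()
// Simulate the C side invoking the callback twice.
Counter.callback(counter.refcon)
Counter.callback(counter.refcon)
print(counter.count)  // 2
```

Since the pointer is passed unretained, the owner has to outlive every possible invocation of the callback.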
The implementation of init and deinit methods is quite boring, since they do almost nothing other than initializing and deinitializing rawObserver respectively:
… where .unwrap() is just a convenience method that converts a failing AXError into a thrown error.
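Putting the pieces of this section together, the class might look roughly like the sketch below. It is an approximation under assumptions rather than the actual implementation: the `AXFailure` wrapper, the notification name strings, and the choice of the main run loop are all made up for illustration.

```swift
import ApplicationServices
import Foundation

/// Hypothetical error wrapper used by `unwrap()`.
struct AXFailure: Error { let code: AXError }

extension AXError {
  /// Throws unless the call succeeded.
  func unwrap() throws {
    guard self == .success else { throw AXFailure(code: self) }
  }
}

let localNotificationCenter = NotificationCenter()

final class WindowChangeObserver: NSObject {
  /// kAX* message -> regular notification name (names are assumptions).
  static let notifNames: [String: Notification.Name] = [
    kAXFocusedWindowChangedNotification: Notification.Name("FocusedWindowChanged"),
    kAXApplicationHiddenNotification: Notification.Name("AppHidden"),
  ]

  let pid: pid_t
  let element: AXUIElement
  private var rawObserver: AXObserver?

  /// C-compatible callback: no captures, `self` travels in `refcon`.
  static let callback: AXObserverCallback = { _, _, notif, refcon in
    guard let refcon = refcon,
      let name = WindowChangeObserver.notifNames[notif as String]
    else { return }
    let this = Unmanaged<WindowChangeObserver>.fromOpaque(refcon).takeUnretainedValue()
    localNotificationCenter.post(name: name, object: this.pid)
  }

  init(pid: pid_t) throws {
    self.pid = pid
    self.element = AXUIElementCreateApplication(pid)
    super.init()

    try AXObserverCreate(pid, WindowChangeObserver.callback, &rawObserver).unwrap()
    guard let observer = rawObserver else { return }
    let refcon = Unmanaged.passUnretained(self).toOpaque()
    for name in WindowChangeObserver.notifNames.keys {
      try AXObserverAddNotification(observer, element, name as CFString, refcon).unwrap()
    }
    // Messages are only delivered once the source is scheduled on a run loop.
    CFRunLoopAddSource(CFRunLoopGetMain(), AXObserverGetRunLoopSource(observer), .defaultMode)
  }

  deinit {
    guard let observer = rawObserver else { return }
    CFRunLoopRemoveSource(CFRunLoopGetMain(), AXObserverGetRunLoopSource(observer), .defaultMode)
    for name in WindowChangeObserver.notifNames.keys {
      _ = AXObserverRemoveNotification(observer, element, name as CFString)
    }
  }
}
```

Creating an instance is expected to throw when the process has not been granted accessibility privileges.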
Tracking the Interesting PIDs
This time, let’s maintain a collection of WindowChangeObservers, one for each useful PID, in the RunningAppsObserver class.
As usual, RunningAppsObserver should be declared as a subclass of NSObject:
Here, rawObserver will be initialized to an Objective-C key-value observation of currentWorkSpace.runningApplications, which is responsible for maintaining the windowChangeObservers collection by repeatedly calculating the latest changes in the observed collection of running apps:
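The bookkeeping itself boils down to a set difference between the old and the new PIDs. A minimal sketch, in which `FakeObserver` is a hypothetical stand-in for `WindowChangeObserver` so that the snippet runs without any macOS API:

```swift
/// Hypothetical stand-in for `WindowChangeObserver`.
final class FakeObserver {
  let pid: Int32
  init(pid: Int32) { self.pid = pid }
}

var windowChangeObservers: [Int32: FakeObserver] = [:]

/// Brings `windowChangeObservers` in line with the latest set of PIDs.
func syncObservers(with newPIDs: Set<Int32>) {
  let oldPIDs = Set(windowChangeObservers.keys)
  // Drop the observers of apps that are gone...
  for pid in oldPIDs.subtracting(newPIDs) {
    windowChangeObservers[pid] = nil
  }
  // ...and create observers for the newly seen ones.
  for pid in newPIDs.subtracting(oldPIDs) {
    windowChangeObservers[pid] = FakeObserver(pid: pid)
  }
}

syncObservers(with: [100, 200])
syncObservers(with: [200, 300])
print(windowChangeObservers.keys.sorted())  // [200, 300]
```

Observers of PIDs present in both sets are left untouched, so an existing observation is never torn down and recreated needlessly.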
Finally, thanks to the directions of this Python snippet, we can use Quartz Window Services 3 to obtain the collection of “interesting” PIDs (“interesting” as in “having a GUI”):
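One possible way of doing this in Swift is sketched below. The function name `getWindowOwnerPIDs` and the `.optionAll` listing option are assumptions; a stricter filter may well be preferable in practice.

```swift
import CoreGraphics
import Foundation

/// PIDs of all processes that own at least one window known to
/// the window server.
func getWindowOwnerPIDs() -> Set<pid_t> {
  guard
    let infos = CGWindowListCopyWindowInfo([.optionAll], kCGNullWindowID)
      as? [[String: Any]]
  else { return [] }
  // Each entry describes one window; we only keep its owner's PID.
  return Set(infos.compactMap { $0[kCGWindowOwnerPID as String] as? pid_t })
}
```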
At this point, we have finally obtained an observer that can detect current window changes from an automatically-adjusted range of PIDs and send Claveilleur.focusedWindowChangedNotification or Claveilleur.appHiddenNotification messages to localNotificationCenter accordingly:
Consuming the Messages
After all the above efforts, we can finally return to the familiar FRP-style APIs.
First, we have a bunch of different bundle ID-generating publishers, all likely to indicate a new current app:
Then, all we need to do is to declare another observer that consumes all those publishers, and saves or loads input sources according to the current app:
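The save-or-load part can be pictured as a small per-app lookup table. The sketch below is a simplified assumption (a hypothetical in-memory `InputSourceStore`, with illustrative bundle and input source IDs) rather than the actual storage used:

```swift
/// Hypothetical per-app memory of input sources, keyed by bundle ID.
struct InputSourceStore {
  private var saved: [String: String] = [:]

  /// Called when the input source changes while `bundleID` is current.
  mutating func save(_ inputSource: String, for bundleID: String) {
    saved[bundleID] = inputSource
  }

  /// Called when `bundleID` becomes current; `nil` means "never seen,
  /// leave the input source alone".
  func load(for bundleID: String) -> String? {
    saved[bundleID]
  }
}

var store = InputSourceStore()
store.save("com.apple.keylayout.US", for: "com.microsoft.VSCode")
store.save("com.apple.inputmethod.SCIM.ITABC", for: "com.tencent.xinWeChat")

print(store.load(for: "com.microsoft.VSCode") ?? "unchanged")  // com.apple.keylayout.US
print(store.load(for: "com.apple.Safari") ?? "unchanged")  // unchanged
```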
Getting the Current App
Now, the core functionality of Claveilleur is complete. With so much time being spent detecting current app changes, the only missing piece in the puzzle 4 seems to be the way of actually getting the current app.
Long story short: I haven’t completely figured that part out. I have found different ways of doing this, but it seems to me that every single one of them might fail in one way or another under certain circumstances.
At the time of writing, this is achieved by combining AXUIElementGetPid results and NSWorkspace ones, which seems to yield the correct result over 90% of the time.
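One plausible way of combining the two sources is sketched below; the exact order and fallback logic are assumptions, and this sketch makes no claim of covering every corner case mentioned earlier.

```swift
import AppKit
import ApplicationServices

func getCurrentAppBundleID() -> String? {
  // First, ask the Accessibility API which application is focused.
  let systemWide = AXUIElementCreateSystemWide()
  var focused: CFTypeRef?
  let result = AXUIElementCopyAttributeValue(
    systemWide, kAXFocusedApplicationAttribute as CFString, &focused)
  if result == .success, let focused = focused,
    CFGetTypeID(focused) == AXUIElementGetTypeID()
  {
    var pid: pid_t = 0
    let app = focused as! AXUIElement
    if AXUIElementGetPid(app, &pid) == .success,
      let bundleID = NSRunningApplication(processIdentifier: pid)?.bundleIdentifier
    {
      return bundleID
    }
  }
  // Otherwise, fall back to what NSWorkspace reports.
  return NSWorkspace.shared.frontmostApplication?.bundleIdentifier
}
```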
Correctly Getting Accessibility Privileges
When configuring the CI builds for Claveilleur, a natural idea is to build a Universal 2 (a.k.a. “fat”) binary that supports both x64 and ARM64 architectures:
swift build -c release --arch arm64 --arch x86_64
However, when running this build on ARM-based Macs, it seemed that the Accessibility Privileges could never be granted (Claveilleur/#2).
That is, the following function always returned false:
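Such a check is typically written around `AXIsProcessTrustedWithOptions`. A minimal sketch, in which the name `hasAXPrivilege` and its `prompt` parameter are assumptions:

```swift
import ApplicationServices

/// Whether this process is trusted as an accessibility client.
/// With `prompt` set, macOS may also show the system dialog that
/// asks the user to grant the privilege.
func hasAXPrivilege(prompt: Bool = true) -> Bool {
  let key = kAXTrustedCheckOptionPrompt.takeUnretainedValue() as String
  let options = [key: prompt] as CFDictionary
  return AXIsProcessTrustedWithOptions(options)
}
```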
As it turns out, this has something to do with the code signing rules that macOS is enforcing. It just so happens that when the app is not a macOS bundle, it can be very hard to get it signed correctly.
I solved this problem by first creating a minimal bundle manifest under Supporting/Info.plist:
… then embedding the manifest into the executable 5 by adding linkerSettings to the .executableTarget() section in Package.swift like so:
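A commonly used way of doing this is to ask the linker to create an `__info_plist` section from the manifest file. The manifest below is a sketch under assumptions: the tools version, package name, platform, and target name are all guesses made for illustration.

```swift
// swift-tools-version:5.7
import PackageDescription

let package = Package(
  name: "Claveilleur",  // assumption
  platforms: [.macOS(.v11)],  // assumption
  targets: [
    .executableTarget(
      name: "claveilleur",  // assumption
      linkerSettings: [
        // Embed the manifest as the `__TEXT,__info_plist` section
        // of the resulting executable.
        .unsafeFlags([
          "-Xlinker", "-sectcreate",
          "-Xlinker", "__TEXT",
          "-Xlinker", "__info_plist",
          "-Xlinker", "Supporting/Info.plist",
        ])
      ]
    )
  ]
)
```

Note that SwiftPM refuses to build targets with unsafe flags when the package is consumed as a versioned dependency, which should not matter for a root package that only produces an executable.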
… and just to be sure, signing the CI build again before publishing:
codesign -dvvv --force --sign - \
"$(swift build --show-bin-path -c release --arch arm64 --arch x86_64)/claveilleur"
Conclusion
This is in fact only my second time doing Swift (after Ouverture), and my feelings towards this overall experience are still quite complicated at this moment.
On the one hand, Swift does seem like a beautifully-designed programming language to me, which, just like Rust, cares a lot about bringing modern features and patterns into a traditional procedural/object-oriented context.
On the other hand, it seems to me that even as a somewhat experienced developer, having to use quite a bunch of under-documented and under-maintained APIs is still a major issue while getting my hands on macOS desktop development.
I wish Apple could realize this issue and… change things for the better in the future, maybe?
1. However, turning on this feature on Windows will lead to another issue where the taskbar is also considered an app. My friend Icecovery’s IMEIndicator provides more details on it and a (hacky) workaround.
2. I still had to download Xcode from the Mac App Store to get the full macOS SDK :(
3. Quartz is the name of the macOS window server.
4. Apart from the part of getting and setting the current input source (which fits nicely into ~30 lines of Objective-C), that is.
5. If you are using Rust for macOS desktop development, the embed_plist crate might do the work for you.