Problem
Being a polyglot can be painful sometimes.
In particular, I very often have multiple windows open at the same time, each using a different language: at this very moment, I have my messenger open, where I am taking part in conversations in Chinese, while writing this post in English in my VSCode window.
As a result, with OS defaults, I have to switch between input sources (a.k.a. input methods) so often that I have even developed the bad habit of repeatedly pressing the fn key for… nothing at all.
Admittedly, fn is more convenient than win + space on Windows PCs, but wouldn’t it be much better if I didn’t have to press any key at all?
You might ask: doesn’t System Settings have that feature built right in?
Well, kind of.
The problem here is that this feature of automatically switching input sources is based on the current document, and thus macOS can get it wrong quite frequently.
Wouldn’t it be much better if macOS could switch input sources on a per-app basis, just like Windows does? 1
In fact, there are already many non-free solutions out there that prove this point, like Input Source Pro and KeyboardHolder.
Since it is clearly possible, why not make my own “lite version” just for fun?
Hence Claveilleur, my own open-source macOS input source switching daemon, whose name comes from the French words for keyboard (clavier) and watchman (veilleur).
Workflow
Just one caveat before we actually begin: I’m not your regular Apple developer.
On the one hand, I do appreciate the speed of ARM-based Macs and the abundance of well-made GUI apps on macOS, but as a random dev mostly doing cross-platform app development, I haven’t used Xcode that much, and I’m just a bit reluctant to leave my polyglot-friendly VSCode…
On the other hand, the lure of Swift does seem irresistible this time (despite the fact that I’m still new to Swift and more at ease with Rust): it has nearly seamless Objective-C interoperability support and I have heard of it even having some exclusive high-level macOS API bindings.
Fortunately, Apple also provides SourceKit-LSP, which allows me to code in Swift using any LSP-compatible editor.
Combined with the SwiftPM CLI to build the project in the terminal, this does seem to provide the level of VSCode support that I would expect from a popular programming language.
Thus, Claveilleur was made entirely without launching Xcode. 2
Solution
I want Claveilleur to be a CLI app in the style of skhd and yabai.
All you need to do as a regular user is to download the all-in-one binary, use its CLI to tell launchd that you have a new daemon, and then you’re good to go!
Detecting Input Source Changes
It should be clear by now that the core of Claveilleur relies on observing certain macOS desktop events, such as changes of the current input source and of the frontmost app.
Let’s first try to detect input source changes. The problem is, which APIs should I use to achieve that?
It took me quite some time to find the right search engine keywords, navigate to this very StackOverflow answer, and get the slightest clue about what this Objective-C snippet is doing:
[[NSDistributedNotificationCenter defaultCenter]
    addObserver:self
    selector:@selector(myMethod:)
    name:(__bridge NSString*)kTISNotifySelectedKeyboardInputSourceChanged
    object:nil
    suspensionBehavior:NSNotificationSuspensionBehaviorDeliverImmediately];
I haven’t written a single line of Objective-C, but it does look like C extended with Smalltalk-style messaging…
Aha! So here we are calling the addObserver method of the default NSDistributedNotificationCenter with myMethod as the callback function.
Let’s see how this might translate to Swift (*typing in VSCode*)…
DistributedNotificationCenter.default.
Wait. This looks interesting…
func publisher(
    for name: Notification.Name,
    object: AnyObject? = nil
) -> NotificationCenter.Publisher
It turns out that Publisher allows the use of FRP (Functional Reactive Programming) on NotificationCenters!
From here, things should become much easier…
The code is as readable as it gets if you are familiar with FRP:
- .publisher(): We are processing the stream of input source change events. Whenever this happens, a kTISNotifySelectedKeyboardInputSourceChanged message will be posted in DistributedNotificationCenter.default.
- .map(): Whenever a message arrives, we immediately get the current (new) input method.
- .removeDuplicates(): We also need to ensure that the user is indeed switching to a different input method.
- .sink(): We get the current (frontmost) app, and associate it with the current input method. Later, when the current app changes, we can then use this information to restore the input method to that associated value.
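For reference, here is a minimal sketch of how these four steps could fit together. It is an illustration under assumptions rather than Claveilleur’s actual code: `getInputSource`, the `savedInputSources` dictionary and the naive `getCurrentAppBundleID` placeholder are hypothetical.

```swift
import AppKit
import Carbon
import Combine

// Hypothetical helper: the ID of the current keyboard input source,
// e.g. "com.apple.keylayout.ABC".
func getInputSource() -> String? {
  guard
    let source = TISCopyCurrentKeyboardInputSource()?.takeRetainedValue(),
    let raw = TISGetInputSourceProperty(source, kTISPropertyInputSourceID)
  else { return nil }
  return Unmanaged<CFString>.fromOpaque(raw).takeUnretainedValue() as String
}

// Naive placeholder (assumption): simply trusts NSWorkspace.
func getCurrentAppBundleID() -> String? {
  NSWorkspace.shared.frontmostApplication?.bundleIdentifier
}

// Hypothetical in-memory store: app bundle ID -> input source ID.
var savedInputSources: [String: String] = [:]

let inputSourceObserver = DistributedNotificationCenter.default()
  // Stream of "input source changed" events.
  .publisher(
    for: Notification.Name(kTISNotifySelectedKeyboardInputSourceChanged as String)
  )
  // Replace each event with the new input source.
  .map { _ in getInputSource() }
  // Ignore events that do not actually change the input source.
  .removeDuplicates()
  // Associate the new input source with the current app.
  .sink { inputSource in
    guard let inputSource = inputSource, let app = getCurrentAppBundleID() else {
      return
    }
    savedInputSources[app] = inputSource
  }
```

A long-running daemon would also need to keep a run loop alive (for example with `RunLoop.main.run()`) for the notifications to be delivered, and to hold on to `inputSourceObserver` for as long as the subscription should last.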
Yes, it’s just that simple… Or is it?
Detecting Current App Changes
The previous snippet assumes that getCurrentAppBundleID is correctly implemented, but how on earth should we get the current app and detect its change?
NSWorkspace.shared has a frontmostApplication property, changes to which are even observable.
Thus, it is very tempting to get the app bundle ID directly from there:
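A rough sketch of this tempting approach, assuming the Combine-based key-value observing publisher on NSObject is used (`seenBundleIDs` and `frontmostAppObserver` are hypothetical names, kept only for illustration):

```swift
import AppKit
import Combine

// Bundle IDs emitted so far (for illustration only).
var seenBundleIDs: [String?] = []

let frontmostAppObserver = NSWorkspace.shared
  // Key-value observation; by default this also emits the initial value.
  .publisher(for: \.frontmostApplication)
  .map { $0?.bundleIdentifier }
  .removeDuplicates()
  .sink { bundleID in
    seenBundleIDs.append(bundleID)
    print("frontmost app:", bundleID ?? "<unknown>")
  }
```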
… except it doesn’t work all the time.
The corner case is encountered when, for example, the user activates Spotlight with cmd + space: the frontmostApplication simply doesn’t change at all!
According to this StackOverflow question, this is because floating windows (such as those of Spotlight and Alfred) are somewhat special:
[They] aren’t really active because they use the NSNonactivatingPanelMask style mask, but they still can be focused.
… but there has to be another way, right?
Yes! Right in that question thread, Ryan H has proposed what could very likely be the way forward:
[…] get pids for all the apps you want, and use AXObserver and AXObserverAddNotification to get notifications on them.
The AX prefix here seems to stand for Carbon Accessibility.
This makes perfect sense, since the aforementioned proprietary apps also need accessibility privileges to run in the first place!
At this point, the plan for the next step is quite clear: I need to detect every single kAXFocusedWindowChangedNotification or kAXApplicationHiddenNotification message in order to correctly find out the current app!
Sounds tedious, doesn’t it?
To make things worse, using the AXObserver* APIs for this purpose comes with a few difficulties:
- Those APIs are old C-style ones, which require another kind of dance to call, completely different from what we have seen previously.
- Those APIs are called on a per-PID basis, but I am not sure what PIDs will be useful to me.
- The business logic of getting the current app is now disconnected from the way of detecting current app changes.
Receiving kAX* Messages for a Single PID
Now, let’s implement the detection and handling of kAX* messages in the WindowChangeObserver class.
Sadly, we cannot write this part of the code directly in the FRP style. The Carbon Accessibility APIs, which belong to Apple’s C-based Carbon framework, seem to date from as early as Mac OS X v10.2, and most Carbon APIs have been deprecated since v10.8, so Apple apparently never took the time to provide high-level abstractions for them.
However, I did find just the right amount of info in this Chinese blog post to make my WindowChangeObserver work.
As it turns out, the way of calling AXObserver* APIs looks quite like the Objective-C snippet in the Detecting Input Source Changes section, but this time, I’m writing all of this in Swift rather than in C.
The first thing to do is to declare WindowChangeObserver as a subclass of NSObject:
The notifNames mapping here not only gives the two types of messages that we care about, but also helps convert kAX* messages to regular Notification.Name values, so that we can send them to a NotificationCenter and handle them in the “old” way.
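Such a mapping might look roughly like the sketch below. The two notification name strings are taken from the names used later in this post; the static property names `focusedWindowChanged` and `appHidden` are hypothetical.

```swift
import ApplicationServices
import Foundation

extension Notification.Name {
  // Hypothetical property names for the two "converted" notifications.
  static let focusedWindowChanged =
    Notification.Name("Claveilleur.focusedWindowChangedNotification")
  static let appHidden =
    Notification.Name("Claveilleur.appHiddenNotification")
}

// kAX* message name -> regular Notification.Name.
let notifNames: [String: Notification.Name] = [
  kAXFocusedWindowChangedNotification: .focusedWindowChanged,
  kAXApplicationHiddenNotification: .appHidden,
]
```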
Next, we declare the callback for the observer:
We need to remember that Carbon is a C-based framework, so naturally we are declaring a C-compatible callback above.
That is to say, despite being written as a Swift closure, the callback is not allowed to capture any variable from its environment, and the self reference is instead passed in through the refcon parameter.
However, apart from this restriction, what the callback does is still quite clear: it simply sends the converted Notification.Name to localNotificationCenter.
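The refcon trick is independent of the Accessibility APIs, so it can be illustrated in isolation. The sketch below uses a made-up callback type (`CStyleCallback`) and a made-up `Receiver` class purely to show how an object reference can travel through an untyped pointer into a capture-free closure:

```swift
import Foundation

// Hypothetical receiver standing in for the observer object.
final class Receiver {
  var received: [String] = []
}

// A C-compatible function type: closures of this type cannot capture context.
typealias CStyleCallback =
  @convention(c) (UnsafePointer<CChar>, UnsafeMutableRawPointer?) -> Void

let callback: CStyleCallback = { message, refcon in
  // Recover the object reference from the opaque pointer.
  guard let refcon = refcon else { return }
  let receiver = Unmanaged<Receiver>.fromOpaque(refcon).takeUnretainedValue()
  receiver.received.append(String(cString: message))
}

let receiver = Receiver()
// Pass `receiver` as an unretained opaque pointer, as a C API would expect.
let refcon = Unmanaged.passUnretained(receiver).toOpaque()
"AXFocusedWindowChanged".withCString { callback($0, refcon) }
```

Since the pointer is unretained, the object must be kept alive by someone else for as long as the callback may fire.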
The implementation of the init and deinit methods is quite boring, since they do almost nothing other than initializing and deinitializing rawObserver respectively:
… where .unwrap() is just a convenience method to convert AXErrors to exceptions and throw them.
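Putting the pieces of this section together, a self-contained version of such a class could look like the sketch below. It is a reconstruction under assumptions, not the actual Claveilleur source: the stored `element` property, the use of the main run loop, posting the PID as the notification’s `object`, and wrapping failures in an `NSError` are all choices made for illustration.

```swift
import ApplicationServices
import Foundation

extension AXError {
  /// Turns any non-success AX status code into a thrown Swift error.
  func unwrap() throws {
    guard self == .success else {
      throw NSError(domain: "AXError", code: Int(rawValue))
    }
  }
}

// Assumed to be a process-local notification center shared by all observers.
let localNotificationCenter = NotificationCenter()

final class WindowChangeObserver: NSObject {
  static let notifNames: [String: Notification.Name] = [
    kAXFocusedWindowChangedNotification:
      Notification.Name("Claveilleur.focusedWindowChangedNotification"),
    kAXApplicationHiddenNotification:
      Notification.Name("Claveilleur.appHiddenNotification"),
  ]

  let pid: pid_t
  private let element: AXUIElement
  private var rawObserver: AXObserver?

  // C-compatible callback: `self` travels through `refcon`.
  private static let callback: AXObserverCallback = { _, _, notification, refcon in
    guard let refcon = refcon,
      let name = WindowChangeObserver.notifNames[notification as String]
    else { return }
    let this = Unmanaged<WindowChangeObserver>.fromOpaque(refcon)
      .takeUnretainedValue()
    localNotificationCenter.post(name: name, object: this.pid)
  }

  init(pid: pid_t) throws {
    self.pid = pid
    self.element = AXUIElementCreateApplication(pid)
    super.init()

    try AXObserverCreate(pid, Self.callback, &rawObserver).unwrap()
    guard let observer = rawObserver else {
      throw NSError(domain: "AXError", code: Int(AXError.failure.rawValue))
    }
    let refcon = Unmanaged.passUnretained(self).toOpaque()
    for name in Self.notifNames.keys {
      try AXObserverAddNotification(observer, element, name as CFString, refcon)
        .unwrap()
    }
    // Messages are only delivered while this run loop is running.
    CFRunLoopAddSource(
      CFRunLoopGetMain(), AXObserverGetRunLoopSource(observer), .defaultMode)
  }

  deinit {
    guard let observer = rawObserver else { return }
    CFRunLoopRemoveSource(
      CFRunLoopGetMain(), AXObserverGetRunLoopSource(observer), .defaultMode)
    for name in Self.notifNames.keys {
      _ = AXObserverRemoveNotification(observer, element, name as CFString)
    }
  }
}
```

Creating an instance with `try WindowChangeObserver(pid:)` is expected to fail unless the process has been granted accessibility privileges.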
Tracking the Interesting PIDs
This time, let’s maintain a collection of WindowChangeObservers, one for each useful PID, in the RunningAppsObserver class.
As usual, RunningAppsObserver should be declared as a subclass of NSObject:
Here, rawObserver will be initialized to an Objective-C key-value observation of currentWorkSpace.runningApplications, which is responsible for maintaining the windowChangeObservers collection by repeatedly calculating the latest changes in the observed collection of running apps:
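One way this bookkeeping could be sketched is shown below. It is an approximation under assumptions: the per-PID observers are created through an injected `makeObserver` closure so that the example stays self-contained, the changes are computed with a hypothetical set-difference helper `pidChanges`, and no filtering of “interesting” PIDs is done yet.

```swift
import AppKit

/// Hypothetical helper: which PIDs appeared and which disappeared.
func pidChanges(
  old: Set<pid_t>, new: Set<pid_t>
) -> (added: Set<pid_t>, removed: Set<pid_t>) {
  (added: new.subtracting(old), removed: old.subtracting(new))
}

final class RunningAppsObserver: NSObject {
  private let currentWorkSpace = NSWorkspace.shared
  // Assumption: builds the per-PID observer (e.g. a WindowChangeObserver).
  private let makeObserver: (pid_t) -> AnyObject?
  private(set) var windowChangeObservers: [pid_t: AnyObject] = [:]
  private var rawObserver: NSKeyValueObservation?

  init(makeObserver: @escaping (pid_t) -> AnyObject?) {
    self.makeObserver = makeObserver
    super.init()
    // Key-value observation of the running apps; `.initial` triggers a first
    // update right away.
    rawObserver = currentWorkSpace.observe(
      \.runningApplications, options: [.initial, .new]
    ) { [weak self] workspace, _ in
      let pids = Set(workspace.runningApplications.map(\.processIdentifier))
      self?.update(with: pids)
    }
  }

  private func update(with newPIDs: Set<pid_t>) {
    let changes = pidChanges(old: Set(windowChangeObservers.keys), new: newPIDs)
    // Drop observers of terminated apps, then add observers for new ones.
    for pid in changes.removed { windowChangeObservers[pid] = nil }
    for pid in changes.added { windowChangeObservers[pid] = makeObserver(pid) }
  }
}
```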
Finally, thanks to the directions of this Python snippet, we can use Quartz Window Services 3 to obtain the collection of “interesting” PIDs (“interesting” as in “having a GUI”):
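A possible Swift counterpart of that idea, assuming that any process owning at least one window counts as “interesting” (the function names and the exact window list options are illustrative guesses, not necessarily what Claveilleur uses):

```swift
import CoreGraphics
import Foundation

/// Extracts the owner PIDs from Quartz window info dictionaries.
func pids(fromWindowList windows: [[String: Any]]) -> Set<pid_t> {
  Set(
    windows.compactMap {
      ($0[kCGWindowOwnerPID as String] as? NSNumber)?.int32Value
    }
  )
}

/// PIDs of all processes that currently own at least one window.
func getWindowedPIDs() -> Set<pid_t> {
  let options: CGWindowListOption = [.optionAll, .excludeDesktopElements]
  guard
    let windows = CGWindowListCopyWindowInfo(options, kCGNullWindowID)
      as? [[String: Any]]
  else { return [] }
  return pids(fromWindowList: windows)
}
```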
At this point, we have finally obtained an observer that can detect current window changes from an automatically-adjusted range of PIDs and send Claveilleur.focusedWindowChangedNotification or Claveilleur.appHiddenNotification messages to localNotificationCenter accordingly:
Consuming the Messages
After all the above efforts, we can finally return to the familiar FRP-style APIs.
First, we have a bunch of different bundle ID-generating publishers, all likely to indicate a new current app:
Then, all we need to do is to declare another observer that consumes all those publishers, and saves or loads input sources according to the current app:
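The overall shape of this step can be sketched with Combine alone. In the example below, `AppChangeHandler` and its closure parameters are hypothetical: the publishers of bundle IDs are merged into one stream, and each new current app either restores its saved input source or, if none is known yet, records the current one.

```swift
import Combine

final class AppChangeHandler {
  private(set) var savedInputSources: [String: String] = [:]
  private var cancellable: AnyCancellable?

  init(
    appChanges: [AnyPublisher<String, Never>],
    getInputSource: @escaping () -> String,
    setInputSource: @escaping (String) -> Void
  ) {
    cancellable = Publishers.MergeMany(appChanges)
      // Several publishers may report the same app in a row.
      .removeDuplicates()
      .sink { [weak self] bundleID in
        guard let self = self else { return }
        if let saved = self.savedInputSources[bundleID] {
          setInputSource(saved)  // load
        } else {
          self.savedInputSources[bundleID] = getInputSource()  // save
        }
      }
  }

  /// Called when the user switches input sources inside the current app.
  func remember(_ inputSource: String, for bundleID: String) {
    savedInputSources[bundleID] = inputSource
  }
}

// Usage with two fake publishers and a fake input source.
var currentInputSource = "en"
let workspaceChanges = PassthroughSubject<String, Never>()
let windowChanges = PassthroughSubject<String, Never>()
let handler = AppChangeHandler(
  appChanges: [
    workspaceChanges.eraseToAnyPublisher(),
    windowChanges.eraseToAnyPublisher(),
  ],
  getInputSource: { currentInputSource },
  setInputSource: { currentInputSource = $0 }
)
workspaceChanges.send("com.example.chat")  // first visit: remembers "en"
currentInputSource = "zh"
windowChanges.send("com.example.editor")  // first visit: remembers "zh"
workspaceChanges.send("com.example.chat")  // back again: restores "en"
```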
Getting the Current App
Now, the core functionality of Claveilleur is complete.
With so much time spent on detecting current app changes, the only missing piece of the puzzle 4 seems to be the way of actually getting the current app.
Long story short: I haven’t completely figured that part out. I have found different ways of doing this, but it seems to me that every single one of them might fail in one way or another under certain circumstances.
At the time of writing, this is achieved by combining AXUIElementGetPid results with NSWorkspace ones, which seems to yield the correct result over 90% of the time.
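One plausible way of combining the two sources is sketched below. The order of preference (Accessibility first, NSWorkspace as a fallback) and the use of the system-wide focused application attribute are assumptions for illustration; the actual heuristics in Claveilleur may differ.

```swift
import AppKit
import ApplicationServices

/// PID of the focused application according to the Accessibility API, if any.
func getFocusedAppPID() -> pid_t? {
  var value: CFTypeRef?
  let status = AXUIElementCopyAttributeValue(
    AXUIElementCreateSystemWide(),
    kAXFocusedApplicationAttribute as CFString,
    &value
  )
  guard status == .success, let focused = value,
    CFGetTypeID(focused) == AXUIElementGetTypeID()
  else { return nil }

  var pid: pid_t = 0
  guard AXUIElementGetPid(focused as! AXUIElement, &pid) == .success else {
    return nil
  }
  return pid
}

/// Prefers the Accessibility result and falls back to NSWorkspace.
func getCurrentAppBundleID() -> String? {
  if let pid = getFocusedAppPID(),
    let bundleID = NSRunningApplication(processIdentifier: pid)?.bundleIdentifier
  {
    return bundleID
  }
  return NSWorkspace.shared.frontmostApplication?.bundleIdentifier
}
```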
Correctly Getting Accessibility Privileges
When configuring the CI builds for Claveilleur, a natural idea is to build a Universal 2 (a.k.a. “fat”) binary that supports both x64 and ARM64 architectures:
swift build -c release --arch arm64 --arch x86_64
However, when running this build on ARM-based Macs, it seemed that the Accessibility Privileges could never be granted (Claveilleur/#2).
That is, the following function always returned false:
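A check of this kind is typically written along the following lines (the function name `hasAXPrivilege` and its `prompt` parameter are hypothetical):

```swift
import ApplicationServices

/// Returns whether this process is trusted for accessibility access,
/// optionally asking macOS to show the permission prompt.
func hasAXPrivilege(prompt: Bool = true) -> Bool {
  let promptKey = kAXTrustedCheckOptionPrompt.takeUnretainedValue() as String
  let options = [promptKey: prompt] as CFDictionary
  return AXIsProcessTrustedWithOptions(options)
}
```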
As it turns out, this has something to do with the code signing rules that macOS enforces. It just so happens that when the app is not a macOS bundle, it can be very hard to get it signed correctly.
I solved this problem by first creating a minimal bundle manifest under Supporting/Info.plist:
… then embedding the manifest into the executable 5 by adding linkerSettings to the .executableTarget() section in Package.swift like so:
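A common way of doing this is to ask the linker to create an `__info_plist` section in the `__TEXT` segment. The manifest below is a sketch of that technique; the tools version and the package and target names are assumptions, and only the `linkerSettings` part matters here.

```swift
// swift-tools-version:5.7
import PackageDescription

let package = Package(
  name: "claveilleur",
  targets: [
    .executableTarget(
      name: "claveilleur",
      linkerSettings: [
        // Embed Supporting/Info.plist as the __TEXT,__info_plist section.
        .unsafeFlags([
          "-Xlinker", "-sectcreate",
          "-Xlinker", "__TEXT",
          "-Xlinker", "__info_plist",
          "-Xlinker", "Supporting/Info.plist",
        ])
      ]
    )
  ]
)
```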
… and just to be sure, signing the CI build again before publishing:
codesign -dvvv --force --sign - \
"$(swift build --show-bin-path -c release --arch arm64 --arch x86_64)/claveilleur"
Conclusion
This is in fact only my second time doing Swift (after Ouverture), and my feelings towards this overall experience are still quite complicated at this moment.
On the one hand, Swift does seem like a beautifully-designed programming language to me, which, just like Rust, cares a lot about bringing modern features and patterns into a traditional procedural/object-oriented context.
On the other hand, it seems to me that even for a somewhat experienced developer, having to rely on quite a few under-documented and under-maintained APIs is still a major obstacle when getting into macOS desktop development.
I wish Apple could realize this issue and… change things for the better in the future, maybe?
1. However, turning on this feature on Windows will lead to another issue where the task bar is also considered as an app. My friend Icecovery’s IMEIndicator provides more details on it and a (hacky) workaround.
2. I still had to download Xcode from the Mac App Store to get the full macOS SDK :(
3. Quartz is the name of the macOS window server.
4. Apart from the part of getting and setting the current input source (which fits nicely into ~30 lines of Objective-C), that is.
5. If you are using Rust for macOS desktop development, the embed_plist crate might do the work for you.