Vendor Security

A few weeks ago I had to have a conversation with a vendor about credentials. Despite some pushback from our side, they insisted that their Bearer Token style authentication key for HTTP requests was safe from MitM + Replay attacks. The token was to be used from a user's device (their phone). They claimed it must be safe because the API access that uses this token is protected by HTTPS/TLS/SSL. In short: yes, it's protected from external snooping, but it's not safe unless you fully trust all of your users and their devices, which you should not. (aside: that talk was from 2015? that's surprising).
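
To make the replay concern concrete: TLS protects the token on the wire, but anyone who can read it on the device (the user, or software the user runs) can present it again from anywhere, and the server can't tell the difference. A rough sketch, with a placeholder endpoint and token:

# Sketch only: the URL and token below are placeholders. The point is that
# possession of the bearer token is the whole credential; HTTPS doesn't change that.
import requests

API_URL = "https://api.example.com/v1/account"   # hypothetical endpoint
LIFTED_TOKEN = "eyJhbGciOi..."                    # placeholder: a token read off the device

resp = requests.get(API_URL, headers={"Authorization": f"Bearer {LIFTED_TOKEN}"})
print(resp.status_code)  # the server can't distinguish this replay from a legitimate call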

This reminded me of both an inadequacy in how we generally speak with vendors, and also a Kafkaesque conference call…

It is not uncommon for vendors to send credentials by email. For a long time I pushed back on this, but the practice became so common (and the security concerns such a foreign idea to vendors) that I've mostly stopped bothering, unless it's a credential that my team needs to care about. If you broadcast your key in a place others can see it, that's on you. We might mention it, but because there's no great universal way to communicate secrets securely with vendors (this is the first problem: the inadequacy of secure communication with vendors), it's usually enough of a hassle that I don't feel strongly about calling out this kind of misstep when it's solely someone else's problem if/when it leaks. That might make me complicit in the worsening-of-everything, but honestly, these days I'd spend so much of my/their time chasing this kind of thing down that it seems less worth it than in the past, and I've got my own work to do.

There was one client, though, where this happened all the time with their vendors (and themselves). It happened so often that we had to speak up about it whenever someone delivered a long-lasting security credential in an insecure manner. It seemed like every week we had to send a "please invalidate this credential and send it via the secure method we established at the start of this relationship" message—which, in most cases, meant Keybase.

There's really not a great way to get normals to send secure messages with the tools they already have.

Anyway, this one client, and this one conference call went something like this: we get about 30 people together on a giant "go/no-go" call. These are mostly the big client, but in this case we're one of the vendors, and there are at least 3 other vendors on the call. When it gets to be our turn I say "we're go on our stuff, but we'd like Vendor X to invalidate the key they sent Vendor Y by email earlier today and generate a new one; even if we consider this email secure (which we don't), we don't want to have this key and you sent it to us in the big email thread."

Vendors on this project were used to us saying this kind of thing. They didn't care. We were—in part—getting paid to care, so we brought it up. There was non-visible eyerolling, but we eventually all agreed that Vendor X would regenerate and send to Vendor Y.

Next thing we know, and still on the conference call, the representative from Vendor X says "ok, Dave, I regenerated the key. It's a7b 38…" I break in as soon as I realize what's going on and I say "STOP. PLEASE. The whole point of regenerating is that we should exercise Least Privilege and there are a lot of people on this call that don't need this key—they should not have it." More eye rolling, but Vendor X's person says "ok, Dave; I'll call you directly then."

Slight pause in the big conference call and we hear Dave (from Vendor Y) say "Hello?" then we hear Vendor X say "Hey Dave. It's Kathy. Okay, the key is a7b 38…" Sure enough, Kathy called from her mobile phone to Dave's mobile phone, and neither of them muted themselves. We heard both sides of the conversation and everyone got the key yet again.

I think we made them regenerate a 3rd one, but this kind of complete lack of diligence is a main factor in me noping out of pestering vendors about regenerating credentials that are compromised by neglect, unless we have a specific mandate to do so (or if the credentials are ours to regenerate).

Matter and Privacy

As of October 1, 2024, I've stepped away from my role as VP of Technology at Matter. This means that I can no longer personally vouch for the privacy aspects of the app.

When I was still working at Faculty, we took on a new client that was not yet named Matter. We eventually transitioned from an agency-client relationship to a startup relationship, where I became the VP of Technology. This is what I've been doing for the past two-ish years.

Chris wrote some good background on founding Matter, so I won't repeat all of those details, but I've been wanting to write a bit about the origin story from my perspective.

When we were trying to figure out how to turn some of the neuroscience, existing material, and lots of our CEO Axel's ideas into a product, we started discussing the idea of building an app around the concept of allowing users to log memories that they could later recall to improve their happiness. As a natural skeptic, it took me a little while to come around to believing that this was even possible and wasn't just hand-wavy wellness stuff. I've since been convinced that we have technology that—when employed correctly—can actually improve happiness by having our users recall positive experiences. And we have actual science (which I will link in the future) that proves that their brains will create/release neurotransmitters ("happiness chemicals" in the context of what we're working on) in line with their original positive experience, making them feel happier. For real.

So, as a very bare concept, we landed on the idea of allowing users to store photos of these positive experiences, as well as logging ratings of the emotions they experienced so they could recall them later to stimulate these neurotransmitters.

At one point Axel asked me "what do you think of that?" I said "To be honest, I have to ask: why would I ever send you a private photo and a positive rating for a sexual desire emotion? I would never send something like that to another party like Facebook, so why would I do it for us?"

This led us down an interesting—and mostly unique—path to how we handle private user content, and how we model threats against this private data. We adopted user privacy as a core value, and how we think about this informs many other decisions we make with the app and the whole ecosystem handling our users' data. This became so important to us that we knew it needed to be one of the foundational aspects of how we work and this decision informed the product, not the inverse. We knew it was not something we could bolt on later—that trying to add this once we'd already exposed user data (to even ourselves) would be error-prone at best, and impossible at worst.

Early on, we set out some core principles:

  • we need to build trust with our users so they can believe what we say when it comes to how we handle their data (advanced users can audit traffic over the network to help build this trust, if they desire)
  • we need to protect our users from mistakes we might make (we shouldn't be able to suffer a data leak of our users' data if we have a weak password or our code has a bug)
  • even if we are competent enough to prevent a leak from ever happening, and even if our users trust us to do what we say, we must be resilient to being strong-armed by a future controlling power (e.g. if someone we don't trust buys us)

We also had some extremely unusual conversations related to parameters around how far we can go with this:

  • "should we build our own datacentre for this?" "no, probably not. We can use an existing host if we're careful about what data we collect and how we collect it." "but would our things be safer if we did?" "honestly, someone much larger than us will do a much better job with the physical layer… I don't think we want to spend our funding on hollowing out a mountain and hiring a militia."
  • "we can have the best intentions, but we can't always rely on those intentions. If one of our users' data became valuable to an evil nation state and they kidnapped my family, I'll be honest, I'd probably have to hand over the data."

Given these criteria and extremes, we decided that our best course of action is to just never have our users' private data.

This means that when you give something a high "pride" rating in Matter, we can't tell that you've done that. We've intentionally set up our metrics system to refuse to collect this kind of personal data, and we (the people and systems at Matter) simply never get it (only the app running on your device gets this data). We don't store the data on our servers (outside of analytics—and even then never the data we consider private, like emotion ratings); it always stays on your device and within your device's datastore. (Matter is an iPhone app, so we store data on your phone with Core Data, and in a private database that syncs within your iCloud account but is set up in a way that even we can't access it. The app code that runs under our developer credentials, on your device, can read and write this data, but it is never sent to us and we have no way of accessing it through Apple's tooling. It's truly private to you.)

We (again, the people and systems at Matter) do get the product of some of those inputs, but never in a way that we can reverse it. A very simple version of this: if we were to collect the product of a multiplication operation with the value "600", we don't know if the inputs were "1 × 600", "100 × 6", "30 × 20", "12 × 50", etc. We don't know what went into a Matter Score for a memory, but we do know the score. We know the "600" but not the "8" or the "75". We don't even know how you described a memory or what's in a photo you attached. All we know is that there is a memory, it has a score of 600, and it belongs to a numbered account.
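
To make the one-way idea concrete, here's a minimal sketch in Python (this is not our actual scoring code; the function and field names are made up for illustration):

# Minimal sketch of one-way reporting; not Matter's real code.
# The scoring function and payload fields here are hypothetical.

def derived_score(a: int, b: int) -> int:
    """Many different private input pairs collapse to the same derived value."""
    return a * b

# All of these produce 600, so the reported score can't be inverted back
# to the private inputs that produced it:
assert derived_score(1, 600) == derived_score(100, 6) == derived_score(30, 20) == 600

def metrics_payload(score: int, account_number: int) -> dict:
    # Only the opaque score and a numbered (anonymous) account leave the device;
    # the ratings, descriptions, and photos never do.
    return {"account": account_number, "memory_score": score}

print(metrics_payload(derived_score(8, 75), 48217))  # {'account': 48217, 'memory_score': 600}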

Numbered account? Well, we also don't know who—specifically—our users are, and this is maybe the most controversial of our decisions. We don't have accounts; we don't even currently have email addresses, outside of our mailing list. There is no forced association between our mailing list users and our app users. In the future, we may allow users to opt in to self-identifying, but even then we'll continue to be careful about never collecting private data.

When users add memories to the app, they'll usually add content such as images. We don't want to (and we don't) hold these, either—at least not in a way we can see them. We primarily store these images on your device, but because the size of this storage is limited, we do have a system for storing assets such as images that have been encrypted on-device, and the actual photo contents or the decryption keys are never sent to us. We store data for users, here, but to us it looks like random noise (the binary ciphertext), never like a photo of whatever it is you're storing a photo of. I intend to write more about this in the future, since we expect to publish some open source tooling related to this.
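
The general shape is client-side encryption, something like this minimal Python sketch (an illustration only, not our actual iOS implementation; it assumes the cryptography package):

# Illustration only: encrypt-before-upload, with the key kept on-device.
from cryptography.fernet import Fernet

def encrypt_for_upload(content: bytes) -> tuple[bytes, bytes]:
    key = Fernet.generate_key()               # would live only in the device's keystore
    ciphertext = Fernet(key).encrypt(content)
    return key, ciphertext

key, blob = encrypt_for_upload(b"pretend these bytes are a photo")

# Only `blob` would ever be uploaded; to the server it looks like random noise.
# Without `key` (which never leaves the device), the server can't recover the content.
print(len(blob), blob[:16])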

So, we don't have your data in a database that we can access in any way (again, beyond collecting metrics on user-driven events that we don't consider private, so that we can know number of active users, performance in the system, etc.).

This poses a couple serious problems. The main problem is: if I lose my data, how can I recover it?

Well, the short answer here is: we can't. We can't identify you by email to reset your password. We don't have your email address (associated with your data, at least), and you don't have a password. Even if we did have those things, we don't have your data so we can't restore it.

Right now the app has backup/restore functionality and we expect users to use that to protect themselves from data loss. We've put a lot of thought into storing these backups for a user, but having that user identify themselves is a difficult problem. Storing that data on behalf of the user, in a way that we can't get to it is also a problem. But a very interesting problem. I think we have good solutions to these problems that we expect to build into the app before we're out of beta, and I also hope to post about this in the future.

There's a bit more info in our Privacy Policy, which we're bound by.

I've been working on lots of technology things at Matter, but overseeing our privacy implementation has been one of the most rewarding.

One time, almost a year into working on this stuff, Axel said to me "I love that you're thinking about this stuff and coming up with these kinds of solutions" to which I barely held back a tear and replied "I've been thinking about this stuff very seriously for over a decade, and I love that you're the first person who's really let me implement it."

Exists is the enemy of good

We've all heard the adage "perfect is the enemy of good." I take this to mean: if you only accept perfect, you might miss a good-enough solution. I still try to strive for perfect, but have learned to accept that good might sometimes be enough.

However, I've been saying something to friends and colleagues—for probably a decade now—that is closely related: exists is the enemy of good.

This idea is pretty simple, in principle: sometimes we miss a good-enough solution because a not-quite-good-enough solution is already out there and in use.

When I came up with this, it was specifically about bad software that seems good enough. Usually, stakeholders are less willing to invest in a good solution when one already exists.

Which of us hasn't hacked together a prototype that somehow became the code we released to production and has been doing a not-quite-good-enough job of serving clients since that fateful deadline? In my experience, these things often become absolute nightmares when it comes to maintenance (and security, and performance, and stability…), but they exist, so the urgency to replace them with a good version gets traded for the thousand papercuts of unexpected "well, I didn't think this would happen, so it's going to take me a few more hours than expected" or "oops; we didn't expect this kind of vulnerability…"

Or, we've cut corners: "Yeah, we've been meaning to fix that wildcarded security policy, but it's working for now, so we haven't made time to tighten things down." It exists though, so making it actually good doesn't get priority; it gets kicked down the road—sometimes forever.

This concept applies more broadly, too.

Imagine a community that is served by a single bridge. The bridge was built in 1958 and doesn't have safe walkways, or even a reasonable path for bicycles. Despite this, pedestrians and cyclists attempt to use the bridge daily—dangerously. The bridge exists, though. Replacing it with a better bridge is something the local council talks about every year or two, but no one is willing to risk their position by trying to justify the expense. Even worse: replacing the bridge would leave the community without a convenient link to the next town (they'd have to go the long way around) for a period of time while the replacement is being deployed, so even if someone were to get it into the budget, the citizens wouldn't stand for the downtime. Sure, everyone wants a better bridge—a good bridge—but the bridge that exists is the enemy of this. Now imagine how quickly (relatively, of course—this is local government after all) action would be taken if the bridge were to become unusable. A flash flood knocks out the bridge, and there's an opportunity—no, a downright necessity—to prop up a good bridge, because the old one practically no longer exists.

My home automation has been a bit neglected for the past year or so. Fundamentally, I'm still firmly in the position I mentioned (my lightbulbs can't join a botnet to DDoS you if they physically can't connect to the Internet (without a gateway)), but I'd like to tidy up a few things; recategorize, and maybe even explore a different system. But it exists and it mostly works, so it's hard to justify the time and hassle of naming/grouping rooms, or switching control platforms to something that is potentially more good.

When I've complained about Electron apps on my desktop, saying things like "This would be so much better if it used the native toolkit. It's super annoying that the tab key doesn't work right and this scrollbar is janky. It destroys my battery life," I've sometimes been met with the response "well, the Electron app is better than no app right?" Is it, though? If this bad app experience didn't exist, there might be some reason for them to build a good app. The motivation is squelched by "what we have is good enough" when usually it isn't good enough for me. My oven has what I would consider a terrible app. I have little hope that Anova will replace it with something good, though, because they have an app that exists.

Now, I'm not saying we need to greenfield everything, and we should usually avoid full rewrites—or at least approach them with a healthy dose of caution—but I've been thinking about this exists situation a lot and it's a real mess.

We need to be very careful about justifying bad experiences with "perfect is the enemy of good" when we should be striving harder for good itself. The popularity of this expression is responsible for a lot of garbage. Sometimes we just shouldn't do the quick and easy thing—even if we've tricked ourselves into thinking it's temporary. Exists is also the enemy of good.

A Secret Career Code

or: How, 25 years ago, I went viral and met King (Prince) Charles

Someone has been paying me to work on technology stuff for over 20 years, now. And there’s one weird trick that I’ve learned that I think is worth passing on.

This isn’t really a secret code (and the one weird trick bit is a joke, of course), but it was so non-obvious to me in my early career, and it’s so valuable to me in my mid career that I wanted to share it—and also tell the story about going viral before anyone would have ever called it going viral.

The revelation is at the very end of this post, so feel free to skip ahead if you don’t want to hear the stories of how I learned it, but I think they’re fun.

Also, this turned out longer than I expected…

1998

In 1998, I graduated from high school. That last year, especially, I spent a good deal of my time really refining my would-be professional interests. I’d been programming for years, but multimedia was the hot thing, and I had a genuine interest in audio and video.

Our school had a really interesting (and maybe unique) setup where the 3 anglophone high schools in my city all shared the technical programs space, which we school-bussed to and from between classes, as needed.

This meant that instead of each of the high schools having to stock its own specialty labs, we shared resources with the other 2 main schools at what had formerly been a 4th school. This Science and Technology Centre campus held well-equipped labs for all kinds of chronically-underfunded subjects:

  • electronics (I saw my first Tesla coil there)
  • home economics (like cooking, well-equipped kitchens)
  • automotive tech (a fully equipped garage with a lift where students could learn to work on cars and even fix their own)
  • control technologies (electropneumatics, actuators, PLCs, etc.)
  • traditional machine shop and CAD/CAM (with actual mini manufacturing/machining spaces)
  • wood shop (carpentry, but also a full shop with planers, jointers, lathes, cabinetry facilities, etc.)
  • a computer programming lab that was set up with actual compilers, not just typing and Microsoft Office classes
  • its own well-equipped theatre for drama, live music, and video presentations
  • likely many other spaces that I didn’t participate in or really even notice
  • and where I spent most of my time: the media lab

The 4th school was turned back into a normal high school years ago, so students there no longer have the same shared-resources opportunities that we were so privileged—yet unthankful for the most part, since we were kids—to participate in. It was called the MacNaughton Science and Technology Centre, for future web archaeologists searching for references (see my own archaeology woes below).

The media lab was a suite that contained a main classroom that was set up—in addition to regular classroom stuff—for viewing videos and teaching media literacy (we learned about Manufacturing Consent, and talked about 1984), but it also contained a small-but-equipped recording studio (I spent hundreds of hours in there recording friends, learning to mix, mic, bounce tracks, EQ…), and a video production suite that had an expensive-at-the-time near-broadcast quality camera (a Canon XL-1), and a few workstations for editing video, photos, digital painting, CD ROM production (hey, this was big at the time), and all of the related things.

Aside from the unbelievable opportunity that we had with all of this equipment, as high school students, the teacher who ran the lab—the only teacher we called by a first name, by his request—Andrew Campbell was one of those special teachers you look back on and recognize how much time and effort—how much care—they put into us. Weekends and evenings, he was often available to help us get set up, or at least unlock the doors (and secretly share his keys if he couldn’t be there for a Saturday recording session or to kick off a multi-hour video compile). I’m forever grateful for being a part of that program with Andrew back then.

Anyway, because of this experience and the time I was able to spend in the lab, I got pretty good at stringing videos together and producing them into something that was worth watching. There were a few of us from each of the 3 schools that had above-average abilities to run these System 7 SCSI-based Macs with Adobe Premiere.

In Canada, in the 90s—at least where I lived—it seemed like everyone (unless your family was exceptionally religious or you lived on a rural farm or—I guess—just didn’t have cable or access to a friend’s basement that had cable) watched MuchMusic. This was roughly the same as MTV in the ’States. Many of us just kind of ambiently turned it on if we weren’t actually watching something else on the TV—I fell asleep to the overnight VJs, many nights.

One recurring public service campaign, every year on MuchMusic, which was partly funded by the Canadian Government, was “Stop Racism!” which promoted March 21: the international day for the elimination of racial discrimination. If you grew up in Canada in the ’90s, you might remember the hand logo from this campaign.

Racism: Stop It hand logo

Each year, as part of this public service, they ran a competition that called on students from all over the country to make a short (90 second) video. A panel of judges would choose the best high-schooler-produced videos, and these would be cut down and aired every few hours on MuchMusic over the course of a month or so. The winners would also be honoured in a ceremony with musical guests and dignitaries.

A few days before the deadline to submit videos for this, a couple of my not-even-very-close friends from school asked me if we could put something together. I said sure. We made up a script (in hindsight, it was probably very cringey, with eggs spray-painted different colours, and the message was that they were all the same inside). We shot and edited the video, and submitted it barely on time (I think we had to postal-mail a VHS tape). We certainly did not expect to win.

As you may have guessed if you’re still reading: we did win. There were several groups of winners, but we were the only winners from our region. They flew us to Vancouver (this was a really big deal to me; I’d only ever been on one plane trip before at this point) to participate in an awards ceremony hosted by Prince Charles, with several musical guests that we cared much more about, and we got to meet them all at the reception afterwards. I honestly don’t remember what we talked about, but we definitely had an audience with the not-yet-king. (I do remember chatting with Great Big Sea, Matthew Good, and Chantal Kreviazuk, though.)

Our video aired on Much every few hours for the next month or so. We weren’t famous, but if this kind of thing had happened 10 years later, and if it was driven by social networks, I’m pretty sure we’d have called it viral. This was as close to viral as you could get (without already being a celebrity, or doing something horribly newsworthy) in 1998.

There’s not much online about these events. I kept poor records back then. (-; I did find someone’s portfolio site about the event, and a video from another group that was entered in 1998 but didn’t win. Here are some newspaper clippings and a print + scan from our school’s Geocities page (thanks to my mom for clipping these way back then). Last thing: I found my official recognition certificate.

Certificate of Recognition … presented to Sean Coates … for Your winning entry in the 1998 Stop Racism National Video Competition

I learned a lot from this experience, but I think the biggest lesson was: if you’re any good at all, just try because you might succeed. I didn’t yet know the second part.

2001-2005

The second part of the lesson was revealed to me in 2004 or 2005, but let’s step back a little bit in time.

There are a few main events that put me on the path to learning what I learned. The first was when I took a new job in mid-2001, and I met Kevin Yank (who was living in Montreal at the time, and was just finishing up working at the place that had hired me—we became friends for a while there, too, until he moved away and we lost touch other than an occasional encounter in the Fediverse these days). Kev had just published a book with SitePoint: Build Your Own Database Driven Website Using PHP & MySQL. I learned a lot of PHP & MySQL from that book (I was working with Coldfusion at the time), and I still have my copy.

My copy of the aforementioned book.

What really struck me, though, was that he was close to my age, and wanted something more from his career, so he wrote this book. I thought I wanted that—or at least something like that—for my own career.

A few months later, I signed up to volunteer with the PHP documentation team and I remember it being a really big deal to me when they gave me my sean@php.net email address.

In 2003 (I think), I attended the first Conférence PHP Québec where I met Rasmus and many other people who would become peers in the PHP community. This conference eventually became ConFoo.

In late 2003 I decided I wanted to write for php|architect Magazine. I had a topic I liked and really wanted to see if I could—and still wanted to—build on that idea Kevin had imprinted on me. I did not expect it to be accepted—it seemed so prestigious. But my idea and draft/outline were accepted, and I was told I needed to write 4000 words, which—for a 23 year old non-academic-person—was a LOT of words (even this far-too-long blog post is “only” around 2000 words). But I did it. I was ecstatic to have my piece published in the January 2004 issue.

It was either later that year or in 2005 that I ran into the publisher of the magazine, Marco Tabini on IRC where we’d both been hanging out with PHP people for some time. He’d just lost his Editor-in-Chief, and was venting about having to pick up the editing duties in addition to his regular work. I—oh so naïvely—suggested that “I like editing” and he asked if I wanted to do it. After he reviewed an editing sample exercise he gave me, I started learning how to become the EiC and picked up the role pretty quickly.

So here’s where all of this has been going: when I started editing the magazine, I got to see our content pipeline. We had to run four 4000 word main articles per month, in addition to the columns, and what I saw blew my mind. I’d come into this believing that it was the cream of the crop that wrote for this trade magazine. It was really the best people who got published. That’s what I thought. I was so proud of my own accomplishment of writing for this prestigious magazine. And you know what? Some of the articles were excellent, but more often than not, I had to scrape together barely enough content to make each month’s issue great (and—admittedly—sometimes not great or even all that good). I had to rewrite whole articles. We had to beg past authors to write again. The illusion of prestige was completely revealed to me.

And this… this is the secret I’ve learned: if you’re good at something, you’re probably better than most everyone else who does that something. There’s always going to be the top tier, sure, and you might be in that top tier, or you might not, but the average is shockingly average. It’s really not that hard for you to accomplish many of these things if you set reasonable goals, and it turns out the bar for some of those goals is much lower than I expected early in my career.

--

TL;DR: If you want to do something and think you can do it: just do it. If you’re any good at all, you’re probably better than most people who’ve done it, and far better than those who won’t even try.

Modified Microphone

I've owned a Blue Yeti microphone for a little over five years, now. It's a pretty decent microphone for video calls, and I really like that it has its own audio interface for (pass-through) earphones, and a dedicated hardware mute feature. I've used it on probably 1500 calls, with only one complaint: the mute button makes noise.

For most of that five years, my colleagues could usually tell when I wanted into the conversation because the mute button has a satisfying ka-chunk tactile feedback that—unfortunately—transfers vibration into the thing to which it is attached, and that thing is a highly-sensitive vibration transducing machine… a microphone!

The ka-chunk noise has bothered me for years. Not so much for the signifier that I'm about to speak, but that I couldn't unmute myself without people expecting me to speak. Plus, it's kind of fundamentally annoying to me that a microphone has a button that makes noise.

So, I set out to fix this. I've been playing with these ESP32 microcontrollers from Espressif. These inexpensive chips (and development boards) are full of features and are way overpowered compared to the first-generation Arduinos I was playing with 15 years ago. They have WiFi and Bluetooth built in, come with enough RAM to do some basic stuff (and off-chip serial-bus-accessible RAM for more intensive work), and are easy to program (both from a software and hardware standpoint).

One of the cool built-in features of the ESP32 is a capacitive touch sensor. There are actually several of these sensors on board, and they're often used to sense touch for emulating buttons… silent buttons. You can see where I'm going with this.

I laid out a test circuit on a breadboard, and wrote a little bit of firmware (including some rudimentary debouncing) using the Arduino framework, then tested:

(This is on an ESP32 WROOM development board that's not ideal for some of my other projects, where I prefer the WROVER for the extra RAM, but is ideal—if not serious overkill—for this project.)

Not bad. But now I had to do the hard part: figure out how the microphone handles that button press. I looked around the Internet a little for someone who'd already done something similar, and I found some teardown videos, but couldn't track down a schematic.

I took the microphone apart, dug into the Yeti's board, and found the button. It was a bit more complicated than I'd imagined, mostly because the button is both illuminated (the built-in and light-piped LED flashes when muted, and is lit solidly when unmuted), and mode-less (it's a momentary button). With some creative probing with a voltmeter and some less-than-ideal hotwiring of the +5V/GND provided by the USB interface, I tracked the button press down to a single pin on the switch, which is pulled low when the button is pressed. I soldered on a wire to use with my project:

I also soldered on a way-too-big barrel connector to tap into the USB interface's +5V and Ground lines. (Use what you have on-hand, right? Try not to get the connector genders backwards like I did… and also maybe solder better than me.)

My code would need to be modified to simulate this button "press". In addition to the debouncing, I'd have to pretend to press and release the button, and also instead of providing +5 volts to an output pin (the normal way to signal something like this), I'd actually have to sink the 5V to ground. Here's the (Arduino framework) code I ended up with (including some Serial debuggery):

#include <Arduino.h>

#define TOUCH_PIN 4
#define LED_PIN 2
#define EXT_LED_PIN 15

#define PULSE_DELAY 500

unsigned int threshold = 20;

// the last time the output pin was toggled:
unsigned long lastDebounceTime = 0;
// the debounce time
unsigned long debounceDelay = 500;

unsigned long pulseStartTime = 0;
bool toggledLow = false;

void gotTouch() {
  if (millis() < (lastDebounceTime + debounceDelay)) {
    return;
  }
  lastDebounceTime = millis();
  Serial.println("Touched.");

  // pulse the button
  digitalWrite(LED_PIN, LOW);
  digitalWrite(EXT_LED_PIN, LOW);
  Serial.println("(low)");
  pulseStartTime = millis();
  toggledLow = true;

}

void setup() {
  Serial.begin(9600);
  delay(100);
  Serial.println("Started.");
  pinMode(LED_PIN, OUTPUT);
  pinMode(EXT_LED_PIN, OUTPUT);
  digitalWrite(LED_PIN, HIGH);
  digitalWrite(EXT_LED_PIN, HIGH);
  touchAttachInterrupt(T0, gotTouch, threshold);
}

void loop() {
  // Touch0, T0, is on GPIO4
  Serial.println(touchRead(T0));  // get value using T0
  Serial.println("");
  delay(100);
  if (toggledLow && (millis() > (pulseStartTime + PULSE_DELAY))) {
    digitalWrite(LED_PIN, HIGH);
    digitalWrite(EXT_LED_PIN, HIGH);
    toggledLow = false;
    Serial.println("(high)");
  }
}

Please at least attempt to refrain from making fun of my weak C++ skills… but this seems to be surprisingly stable code in practice.

Now, I'd need to attach the ESP32 dev board, and reassemble the microphone part-way. The case of the Yeti is cast aluminum (or another softish metal, but I assume aluminum). This means that I could maybe—hopefully—use the case of the Yeti itself as the touch sensor. I rigged up a sensor wire to loosely connect to the mounting hole (which gets a thumb-screwed bolt, and will, by force, make a good connection to the case), since it's a huge pain (at best) to solder to aluminum:

Then, some bench testing before really putting it back together: it works! (You can see the blinking light in the middle of the Yeti's board go solid and back to blinking when I touch the unassembled-but-connected case.)

Great! Success! I managed to do that mostly in a single evening! I put the microphone back together, including putting the case-mounting bolts back in and… suddenly it no longer worked. I disassembled, hooked up the serial monitor via USB, and… well… it works! Maybe I just pinched a connection or shorted a pin or something. More Kapton tape! Reassembled, and… failed again. So I ran a cable through the volume knob hole and reassembled, and tested it in-situ. Weird. The capacitance numbers are all very low. In fact, they might always be just (very near) 0, plus some occasional noise. What?

After a day or two of head-scratching, and then some measuring to confirm the hypothesis, I realized that when the bolts go into the case, the case gets connected to the chassis, and the chassis is grounded to the board, then through to the USB ground. So the case itself gets grounded. And that's bad for a floating capacitance sensor. Turns out it didn't work after all.

This led to experimentation with some insulating enamel paint for transformers, and me certainly burning through a few too many brain cells with the fumes from said paint. I gave up on isolating the case from ground (which is probably good, anyway, all things considered), and made a little touch pad out of some aluminum ducting tape, some solderable copper tape, and a chunk of cardboard, that I happened to have on hand (back to using what you have here).

Actual successful hack.

As you can see in the video, I also added a little toggle switch to the back of the microphone that could allow me to completely detach the switching line from the microphone, just in case my hack started to fail, and I was on an important call—the stock mute button still works, of course. But, I'm happy to report that it's been nothing but stable for the past few weeks—it didn't even overflow the millis() checks, somehow, which still surprises me—and I use the new ka-chunk-free touch-to-mute-unmute feature of my microphone every day.

Anova Precision Oven (after a year)

This post doesn't exactly fit the normal theme of my blog, but over the past few weeks, several people have asked me about this, so I thought it was worth jotting down a few thoughts.

In January 2021, after eyeballing the specs and possibilities for the past few months, I splurged and ordered the Anova Precision Oven. I've owned it for over a year, now, and I use it a lot. But I wish it was quite a bit better.

There were a few main features of the APO that had me interested.

First, we have a really nice Wolf stove that came with our house. The range hood is wonderful, and the burners are great. The oven is also good when we actually need it (and we do still need it, sometimes; see below), but it's propane, so there are a few drawbacks. It takes a while to heat up because there's a smart safety feature (basically a glow plug that won't let gas flow until it has built up enough heat to ignite it), preventing a situation where the oven has an ideal gas-air mix and is ready to explode. It's also big. And while I love propane for the burners, it's mostly unnecessary for the oven: not only is it relatively expensive to run (we have a good price on electricity in Quebec because of past investments in giant hydro-electric projects), it also measurably reduces the air quality in the house if the hood fan isn't running (and running the fan in the dead of winter or summer cools/heats the house in opposition to our preference).

The second feature that had me really interested in the APO is the steam. I've tried and mostly-failed many times to get my big oven (this gas one and my previous electric oven) to act like a steam oven. Despite trying the tricks like a pan of water to act as a hydration reservoir, and spraying the walls with a mist of water, it never really steamed like I'd hoped—especially when making baguette.

I'm happy to say that the APO meets both of these needs very well: it's pretty quick to heat up—mostly because it's smaller; I do think it's under-powered (see below)—and the steam works great.

There are, however, a bunch of things wrong with the APO.

The first thing I noticed, after unpacking it and setting it up the first time, is that it doesn't fit a half sheet pan. It almost fits. I'm sure there was a design or logistics restriction (like maybe these things fit significantly more on a pallet or container when shipping), but sheet pans come in standard sizes, and it's a real bummer that I not only can't use the pans (and silicone mats) I already owned, but finding the right sized pan for the APO is also difficult (I bought some quarter and eighth sheet pans, but they don't fill up the space very well).

Speaking of the pan: the oven comes with one. That one, however, was unusable. It's made in such a way that it warps when it gets hot. Not just a little bit—a LOT. So much that if there happens to be liquid on the pan, it will launch that liquid off of the pan and onto the walls of the oven when the pan deforms abruptly. Even solids are problematic on the stock pan. I noticed other people complaining online about this and that they had Anova Support send them a new pan. I tried this. Support was great, but the pan they sent is unusable in a different way: they "solved" the warping problem by adding rigidity to the flat bottom part of the pan by pressing ribs into it. This makes the pan impossible to use for baking anything flat like bread or cookies.

I had to contact Support again a few months later when the water tank cracked (the oven uses this tank for steam, but also, even when steam mode is 0%, to improve the temperature reading by feeding some of the water to the thermometer in order to read the "wet bulb" temperature). The tank didn't leak, but the clear plastic cracked in quite a large pattern, threatening to dump several litres of water all over my kitchen at any moment. Support sent me a new tank without asking many questions. Hopefully the new one holds up; it hasn't cracked yet, after ~3 months.

Let's talk about the steam for a moment: it's great. I can get a wonderful texture on my breads by cranking it up, and it's perfect for reheating foods that are prone to drying out, such as mac & cheese—it's even ideal to run a small amount of steam for reheating pizza that might be a day or two too old. I rarely use our microwave oven for anything non-liquid (melting butter, reheating soups), and the APO is a great alternative way to reheat leftovers (slower than the microwave, sure, but it doesn't turn foods into rubber, so it's worth trading time for texture).

So it's good for breads? Well, sort of. The steam is great for the crust, definitely. However, it has a couple problems. I mentioned above that it's under-powered, and what I mean by that is two-fold: it has a maximum temperature of 250°C (482°F), and takes quite a long time to recover from the door opening—like, 10 minutes long. Both of these are detrimental to making an ideal bread. I'd normally bake bread at a much higher temperature—I do 550°F in the big oven, and pizza even hotter (especially in the outdoor pizza oven which easily gets up to >800°F). 482°F is—at least in my casual reasoning—pretty bad for "oven spring". My baguettes look (and taste) great, but they're always a bit too flat. The crust forms, but the steam bubbles don't expand quite fast enough to get the loaf to inflate how I'd like. The recovery time certainly doesn't help with this, either. I've managed to mitigate the slow-reheat problem by stacking a bunch of my cast iron pans in the oven to act as a sort of thermal ballast, and help the oven recover more quickly.

Also on the subject of bread: the oven is great for proofing/rising yeast doughs. Well, mostly great. It does a good job of holding the oven a bit warmer than my sometimes-cold-in-winter kitchen, and even without turning on the steam, it seems to avoid drying out the rising dough. I say "mostly" because one of the oven's fans turns on whenever the oven is "on", even at low temperatures. The oven has a pretty strong convection fan which is great, but this one seems to be the fan that cools the electronics. I realize this is necessary when running the oven itself, but it's pretty annoying for the kitchen to have a fairly-loud fan running for 24-48+ hours while baguette dough is rising at near-ambient temperatures.

The oven has several "modes" where you can turn on different heating elements inside the oven. The main element is the "rear" one, which requires convection, but there's a lower-power bottom element that's best for proofing, and a top burner that works acceptably (it's much less powerful than my big gas oven, for example) for broiling. One huge drawback to the default rear+convection mode, though, is that the oven blows a LOT of bubbling liquid all over the place when it's operating. This means that it gets really dirty, really quickly (see the back wall in the photo with the warped pan, above). Much faster than my big oven (even when running the convection fan over there). This isn't the end of the world, but it can be annoying.

The oven has controls on the door, as well as an app that works over WiFi (locally, and even when remote). I normally don't want my appliances to be on the Internet (see Internet-Optional Things), but the door controls are pretty rough. The speed-up/slow-down algorithm they use when holding the buttons for temperature changes is painful. It always overshoots or goes way too slow. They've improved this slightly, with a firmware update, but it's still rough.

The app is a tiny bit better, but it has all of the problems you might expect from a platform-agnostic mobile app that's clearly built on a questionable web framework. The UI is rough. It always defaults to the wrong mode for me (I rarely use the sous-vide mode), and doesn't seem to allow things like realtime temperature changes without adding a "stage" and then telling the oven to go to that stage. It's also dangerous: you can tell the app to turn the oven on, without any sort of "did one of the kids leave something that's going to catch fire inside the oven" interlock. I'd much prefer (even as optional configuration) a mode where I'm required to open and close the door within 90 seconds of turning the oven on, or it will turn off, or something like that.

Speaking of firmware… one night last summer, while I was sitting outside doing some work, my partner sent me a message "did you just do something to the oven? it keeps making the sound like it's just turned on." I checked the app and sure enough, it just did a firmware update. I told her "it's probably just restarted after the firmware update." When I went inside a little while later, I could hear it making the "ready" chime over and over. Every 10-15 seconds or so. I didn't realize this is what she'd meant. I tried everything to get it to stop, but it was in a reboot loop. We had to unplug it to save our sanity. Again, I looked online to see if others were having this issue, and sure enough, there were thousands of complaints about how everyone's ovens were doing this same thing. Some people were about to cook dinner, others had been rising bread for cooking that night, but we all had unusable ovens. They'd just reboot over and over, thanks to a botched (and automatic!) firmware update. Anova fixed this by the next morning, but it was a good reminder that software is terrible, and maybe our appliances shouldn't be on the Internet. (I've since put it back online because of the aforementioned door controls and the convenience of the—even substandard—app. I wish we could just use the door better, though.)

So, should you buy it? Well, I don't know. Truthfully, I'm happy we have this in our house. It's definitely become our main oven, and it fits well in our kitchen (it's kind of big, but we had a part of the counter top that turned out perfect for this). It needs its own circuit, really, and is still underpowered at 120V (~1800W). However, I very very often feel like I paid a lot of money to beta test a product for Anova (it was around the same price as I paid for my whole slide-in stove (oven + burners, "range"), at the previous house), and that's a bummer.

If they announce a Version 2 that fixes the problems, I'd definitely suggest getting that, or even V1 if you need it sooner, and are willing to deal with the drawbacks—I just wish you didn't have to.

A USB Mystery

TL;DR: got some hardware that didn't have a driver I could use. Dived into some packet captures, learned some USB, wrote some code.

I've been on a mini quest to instrument a bunch of real-life things, lately.

One thing that's been on my mind is "noise" around my house. So, after a very small amount of research, I bought a cheap USB Sound Pressure Level ("SPL") meter. (I accidentally bought the nearly-same model that was USB-powered only (it had no actual USB connection), before returning it for this one, so be careful if you happen to find yourself on this path.) Why not use a regular microphone attached to a regular sound device? Calibration.

When the package arrived, and I connected it, I found that it was not the same model from my "research" above. I was hoping that many/most/all of these meters had the same chipset. So now I had a little bit of a mystery: how do I get data from this thing?

I managed to find it in my system devices (on Mac; I'd have used lsusb on Linux—this thing will eventually end up on a Raspberry Pi, though):

❯ system_profiler SPUSBDataType
(snip)
WL100:

  Product ID: 0x82cd
  Vendor ID: 0x10c4  (Silicon Laboratories, Inc.)
  Version: 0.00
  Speed: Up to 12 Mb/s
  Manufacturer: SLAB
  Location ID: 0x14400000 / 8
  Current Available (mA): 500
  Current Required (mA): 64
  Extra Operating Current (mA): 0
(snip)

So, I at least know it actually connects to the computer and identifies itself. But I really had no idea where to go from there. I found that Python has a PyUSB library, but even with that set up, my Mac was unhappy about me accessing USB devices from userspace (non-sudo). I found there was also another way to connect to devices like this, over "HID", which is the protocol normally used for things like keyboards and mice, but is overall a simpler way to connect things.
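
For what it's worth, once the Python hidapi bindings are installed, listing HID devices to find something like this is only a few lines (a quick sketch):

# Quick sketch: list the HID devices hidapi can see; the meter shows up by the
# vendor/product IDs from the system_profiler output above (0x10c4 / 0x82cd).
import hid

for dev in hid.enumerate():
    print(hex(dev["vendor_id"]), hex(dev["product_id"]), dev.get("product_string"))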

The vendor supplied software on a mini-CD. Hilarious. There was also a very sketchy download link for Windows-only software. I have a Windows box in the networking closet for exactly this kind of thing (generally: testing of any sort). So, I went looking for some USB sniffing software, and a friend remembered that he thought Wireshark could capture USB. Perfect! I'd used Wireshark many times to debug networking problems, but never for USB. This was a lead nonetheless.

I fired up the vendor's software and connected the SPL meter:

Okay. It's ugly, but it seems to work. This app looks like it's from the Win32 days, and I thought that was no longer supported… but it works—or at least seems to. I asked Wireshark to capture on USBPcap1, and waited until I saw it update a few times. Disconnected the capture, saved the pcap session file, and loaded it into Wireshark on my main workstation. Unfortunately, I didn't have much of an idea what I was looking at.

I could, however, see what looked like the conversation between the computer (host), and the SPL meter (1.5.0). This traffic was marked USBHID (as opposed to some other packets, marked only USB), so it was a great clue:

This led to some searches around GET_REPORT and USB/HID/hidapi. Turns out that USB HID devices have "endpoints", "reports", and a lexicon of other terms I could only guess about. I didn't plan to become a full USB engineer, and was hoping I could squeeze by with a bit of mostly-naïve-about-USB-itself-but-otherwise-experienced analysis.

Eventually, I figured out that I can probably get the data I want by asking for a "feature report". Then I found get_feature_report in the Python hidapi bindings.

This function asks for a report_num and max_length:

def get_feature_report(self, int report_num, int max_length):
    """Receive feature report.

    :param report_num:
    :type report_num: int
    :param max_length:
    :type max_length: int
    :return: Incoming feature report
    :rtype: List[int]
    :raises ValueError: If connection is not opened.
    :raises IOError:
    """
    

These two values sound familiar. From the Wireshark capture:

Now I was getting somewhere. Let's use that decoded ReportID of 5 and a max_length (wLength) of 61.

import hid
import time

h = hid.device()
# these are from lsusb/system_profiler
h.open(0x10C4, 0x82CD)

while True:
    rpt = h.get_feature_report(5, 61)
    print(rpt)
    time.sleep(1)

This gave me something like:

[5, 97, 239, 60, 245, 0, 0, 1, 85, 0, 0, 1, 44, 5, 20, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[5, 97, 239, 60, 246, 0, 0, 1, 99, 0, 0, 1, 44, 5, 20, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[5, 97, 239, 60, 247, 0, 0, 1, 172, 0, 0, 1, 44, 5, 20, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[5, 97, 239, 60, 248, 0, 0, 3, 63, 0, 0, 1, 44, 5, 20, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[5, 97, 239, 60, 249, 0, 0, 2, 168, 0, 0, 1, 44, 5, 20, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[5, 97, 239, 60, 250, 0, 0, 1, 149, 0, 0, 1, 44, 5, 20, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[5, 97, 239, 60, 251, 0, 0, 1, 71, 0, 0, 1, 44, 5, 20, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

I played around with this data for a bit, and eventually noticed that the 8th and 9th (rpt[7:9]) values were changing. Sure enough, if I made a noise, the 9th value would change, and if it was a loud noise, the 8th value would also change:

1, 85
1, 99
1, 172
3, 63
2, 168

I was about to start throwing data into a spreadsheet when I made a guess: what if that's a 16 (or 12…) bit number? So, if I shift the first byte over 8 bits and add the second byte…

(1 << 8) + 85 == 341
(1 << 8) + 99 == 355
(1 << 8) + 172 == 428
(3 << 8) + 63 == 831
(2 << 8) + 168 == 680

The meter claims to have a range of 30dBA to 130dBA, and it sits around 35dBA when I'm not intentionally making any noise, in my office with the heat fan running. Now I'm worried that it's not actually sharing dBA numbers and maybe they're another unit or… wait… those ARE dBA numbers, just multiplied by 10 to avoid the decimal! 34.1, 35.5, 42.8, 83.1, 68.0

Got it!
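
So the decode is just those two bytes as a big-endian value, divided by 10. Something like this (byte offsets as observed on my unit):

# Decode one feature report into dB(A): bytes 8 and 9 (rpt[7], rpt[8]) form a
# big-endian value that is the reading multiplied by 10.
def decode_dba(rpt: list[int]) -> float:
    return ((rpt[7] << 8) | rpt[8]) / 10.0

sample = [5, 97, 239, 60, 245, 0, 0, 1, 85, 0, 0, 1, 44, 5, 20, 0,
          0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(decode_dba(sample))  # 34.1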

Anyway, I wrote some (better) code to help read this data, in Python: scoates/wl100 on GitHub. Let me know if you use it!

IoT: Internet-Optional Things

I both love and hate the idea of "smart" devices in my home. It's tough to balance the convenience of being able to turn lights on and off automatically, and adjust thermostats with my phone, with the risk that all of my devices are doing evil things to my fellow Internet citizens. But, I think I've landed on a compromise that works.

I've had Internet-connected devices for a long time now. I've even built devices that can go online. At some point a year or two ago, I realized that I could do better than what I had. Here's a loose list of requirements I made up for my own "IoT" setup at home:

  • works locally as a primary objective
  • works when my Internet connection is down or slow
  • avoids phoning home to the vendor's (or worse: a third party's) API or service
  • can be fully firewalled off from the actual Internet, ideally through a physical limitation
  • isn't locked up in a proprietary platform that will either become expensive, limited, or will cease to exist when it's no longer profitable

My setup isn't perfect; it doesn't fully meet all of these criteria, but it's close, and it's been working well for me.

At the core of my home IoT network is a device called Hubitat Elevation. It works as a bridge between the actual Internet and my devices, which are (for the most part) incapable of connecting to the Internet directly. My devices, which range from thermostats, to lights, to motion sensors, to switchable outlets, and more, use either Zigbee or Z-Wave to communicate with each other (they form repeating mesh networks automatically) and with the hub. Again, they don't have a connection to my WiFi or my LAN, except through the hub, because they're physically incapable of connecting to my local network (they don't have ethernet ports, nor do they have WiFi radios). The hub brokers all of these connections and helps me control and automate these devices.

The hub—the Hubitat Elevation—is fairly inexpensive, and is not fully "open" (as I'd like), but has good integration abilities, is well-maintained, is compatible with many devices (many of them are devices compatible with the more-proprietary but similar SmartThings hub), and has an active community of people answering questions, coming up with new ideas, and maintaining add-ons. These add-ons are written in Groovy, which I hadn't really used in earnest before working with the Hubitat, but you can write and modify them to suit your needs.

The hub itself is mostly controlled through a web UI, which I'll admit is clunky, or through a mobile app. The mobile app adds capabilities like geo-fencing, presence, and notifications. The hub can also be connected to other devices; I have mine connected to my Echo, for example, so I can say "Alexa turn off the kitchen lights."

The devices themselves are either mains-powered (such as my Hue lightbulbs, baseboard thermostats, and switching outlets), or are battery powered (such as motion sensors, door contact switches, and buttons). Many of these devices also passively measure things like local temperature, and relay this data, along with battery health to the hub.

I'm going to get into some examples of how I have things set up here (though not a full getting-started tutorial), but first I wanted to mention a few things that were not immediately obvious to me, and that will get you off on the right foot if you choose to follow a similar path to mine.

  • Third-party hub apps are a bit weird in their structure (there are usually parent and child apps), and keeping them up to date can be a pain. Luckily, Hubitat Package Manager exists, and many add-ons can be maintained through this useful tool.
  • There's a built-in app called "Maker API" which provides a REST interface to your various devices. This technically goes against one of my loose requirements above, but I have it limited to the LAN and requiring authentication, and that feels like a fair trade-off for when I want this kind of connection (there's a rough sketch of calling it just after this list).
  • There's an app that will send measured data to InfluxDB, a timeseries database that I have running locally on my NAS (as a Docker container on my Synology DSM), and it works well as a data source for Grafana (the graphs in this post come from Grafana).
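As an example of the kind of LAN-only access Maker API gives you, here's a rough sketch in Python of listing devices and sending a command. The hub address, app ID, device IDs, and access token below are placeholders, and the exact paths can vary; Maker API shows you the real URLs (including your app ID and token) when you install it, so treat those as the source of truth.

import requests

# Placeholders; substitute the values Maker API shows you.
HUB = "http://192.168.1.50"  # the hub's LAN address
APP_ID = "123"               # Maker API app instance ID
TOKEN = "your-access-token"

BASE = f"{HUB}/apps/api/{APP_ID}"


def list_devices():
    # Ask Maker API for the devices it exposes, as JSON.
    resp = requests.get(f"{BASE}/devices", params={"access_token": TOKEN})
    resp.raise_for_status()
    return resp.json()


def send_command(device_id, command, value=None):
    # e.g. send_command(42, "on") or send_command(42, "setLevel", 30)
    url = f"{BASE}/devices/{device_id}/{command}"
    if value is not None:
        url = f"{url}/{value}"
    resp = requests.get(url, params={"access_token": TOKEN})
    resp.raise_for_status()
    return resp.json()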

Programmable Thermostats

My house is heated primarily through a centralized heat pump (which also provides cooling in the summer), but many rooms have their own baseboard heaters + independent thermostats. Before automation, these thermostats were either completely manual, or had a hard-to-manage on-device scheduling function.

I replaced many of these thermostats with connected versions. My main heat pump's thermostat (low voltage) is the Honeywell T6 Pro Z-Wave, and my baseboard heaters (line voltage) are now controlled with Zigbee thermostats from Sinopé.

Managing these through the hub is much better than the very limited UI available directly on programmable thermostats. The Hubitat has a built-in app called "Thermostat Scheduler." Here's my office, for example (I don't like cold mornings (-: ):

Lighting

An often-touted benefit of IoT is lighting automation, and I have several lights I control with my setup. Much of this is through cooperation with the Hue bridge, which I do still have on my network but could remove at some point, since the bulbs speak Zigbee. The connected lights that are not Hue bulbs are mostly controlled by Leviton Decora dimmers, switches, and optional dimmer remotes for 3-way circuits. Most of this is boring/routine stuff such as "turn on the outdoor lighting switch at dusk and off at midnight," configured on the hub with the "Simple Automation Rules" app, but I have a couple of more interesting applications.

Countertop

My kitchen counter is long down one side—a "galley" style. There's under-cabinet countertop lighting the whole length of the counter, but it's split into two separate switched/dimmed circuits of LED fixtures—one to the left of the sink and one to the right. I have these set to turn on in the morning and off at night. It's kind of annoying that there are two dimmers that work independently, though, and I find it aesthetically displeasing when half of the kitchen is lit up bright and the other half is dim.

Automation to the rescue, though. I found an app called Switch Bindings that allows me to gang these two dimmers together. Now, when I adjust the one on the left, the dimmer on the right matches the new brightness, and vice versa. A mild convenience, but it sure is nice to be able to effectively rewire these circuits in software.
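If you didn't have Switch Bindings, you could approximate the same idea from outside the hub. Here's a tiny sketch that reuses the hypothetical send_command() helper from the Maker API example above, with made-up device IDs; the real app does this properly with event subscriptions on the hub itself.

LEFT_DIMMER = 11   # made-up device IDs
RIGHT_DIMMER = 12


def mirror_level(changed_device_id, new_level):
    # When one dimmer reports a new level, push the same level to its twin.
    twin = RIGHT_DIMMER if changed_device_id == LEFT_DIMMER else LEFT_DIMMER
    send_command(twin, "setLevel", new_level)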

Cellar

I have an extensive beer cellar that I keep cool and dark most of the time. I found myself sometimes forgetting to turn off the lights next to the bottles, and—as someone who is highly sensitive to mercaptans/thiols (the products of lightstruck beer: a "skunky" smell/fault)—I don't want my beer to see any more light than is necessary.

With my setup, the outlet that my shelf lighting is plugged into turns on and off when the door is opened or closed. There's also a temperature sensor and a moisture sensor on the floor, so I can keep track of cellar temperature over time and find out quickly, via the notification system, if the floor drain backs up or a bottle breaks/leaks enough for the sensor to notice.

these lights turn on and off when the door is opened and closed, respectively

I also receive an alert on my phone when the door is opened/closed, which is increasingly useful as the kids get older.

Foyer

Our house has an addition built onto the front, and there's an entrance room that is kind of separated off from the rest of the living space. The lighting in here has different needs from elsewhere because of this. Wouldn't it be nice if the lights in here could automatically turn on when they need to?

Thanks to Simple Automation Rules (the built-in app), and a combination of the SmartThings motion sensor and the DarkSky Device Driver (which will need to be replaced at some point, but it still works for now), I can have the lights in there—in addition to being manually controllable from the switch panels—turn on when there's motion, but only if it's dark enough outside for this to be needed. The lights will turn themselves off when there's no more motion.
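The rule itself boils down to a couple of conditions. Here's the shape of it in Python, purely as illustration; the lux threshold is a made-up number, and on the hub this is all configured in the Simple Automation Rules UI rather than in code.

DARK_ENOUGH_LUX = 400  # illustrative threshold, not a real default


def foyer_lights_should_turn_on(motion_active, outdoor_lux):
    # Lights come on only when there's motion *and* it's dim outside;
    # turning them back off is handled by a no-motion timeout.
    return motion_active and outdoor_lux < DARK_ENOUGH_LUX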

Ice Melting

We have a fence gate that we keep closed most of the time so Stanley can safely hang out in our backyard. We need to use it occasionally, and during the winter this poses a problem because that side of the house has a bit of water runoff that is not normally a big deal, but in the winter, it sometimes gets dammed up by the surrounding snow/ice and freezes, making the gate impossible to open.

In past winters, I've used ice melting chemicals to help free the gate, but it's a pain to keep these on hand, and they corrode the fence posts where the powder coating has chipped off. Plus, it takes time for the melting to work and bags of this stuff are sometimes hard to find (cost aside).

This year, I invested in a snow melting mat. Electricity is relatively cheap here in Quebec, thanks to our extensive hydroelectric investment, but it's still wasteful to run this thing when it's not needed (arguably still less wasteful than bag after bag of ice melter). I'm still tweaking the settings on this one, but I have the mat turn on when the temperature drops and off when the ambient temperature is warmer. It's working great so far:
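As for those settings: the rule amounts to something like two temperature thresholds, and keeping a gap between them stops the outlet from flapping on and off while the temperature hovers around a single set point. Roughly, in Python (the numbers are placeholders, not my actual settings):

ON_BELOW_C = -2.0   # placeholder thresholds
OFF_ABOVE_C = 2.0


def mat_outlet_action(currently_on, outdoor_temp_c):
    # Returns "on", "off", or None (leave it alone). The dead band between
    # the thresholds prevents rapid toggling near a single set point.
    if not currently_on and outdoor_temp_c <= ON_BELOW_C:
        return "on"
    if currently_on and outdoor_temp_c >= OFF_ABOVE_C:
        return "off"
    return None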

Desk foot-warming mat

My office is in the back corner of our house. The old part. I suspect it's poorly insulated, and the floor gets especially cold. I bought a warming mat on which to rest my feet (similar to this one). It doesn't need to be on all of the time, but I do like to be able to call for heat on demand, and have it turn itself off after a few minutes.

I have the mat plugged into a switchable outlet. In the hub, I have rules set up to turn this mat on when I press a button on my desk. The mat turns itself off after 15 minutes, thanks to a second rule in the built-in app "Rule Machine". Warm toes!

When I first set this up, I found myself wondering if the mat was already on. If I pressed the button and didn't hear a click from the outlet's relay, I guessed it was already on. But the hub allows me to get a bit more insight. I didn't want something as distracting (and redundant) as an alert on my phone; I wanted more of an ambient signifier. I have a lamp with a Hue bulb on my desk that's set up to tint red when the mat is on and, when the mat turns off, to revert to the current colour and brightness of another similar lamp in my office. Now I have a passive reminder of the mat's state.
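Pieced together, the mat automation looks something like this sketch, again with made-up device IDs and the hypothetical send_command() helper from the Maker API example above; on the hub it lives in Rule Machine, and the 15-minute off is a scheduled action, not a script.

MAT_OUTLET = 21  # made-up device IDs
DESK_LAMP = 22


def on_desk_button_pushed():
    # Turn the mat's outlet on; the same rule also tints the desk lamp red
    # as the "mat is on" signal (the exact colour command depends on the bulb).
    send_command(MAT_OUTLET, "on")


def scheduled_auto_off():
    # Rule Machine runs this 15 minutes after the button press; it also
    # reverts the lamp to match the other lamp in the office.
    send_command(MAT_OUTLET, "off")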

Graphs

Another interesting aspect of all of this (to me, as someone who uses this kind of data in my actual work, anyway) is that I can get a visual representation of the different sensors in my house, now that we have these non-intrusive devices.

For example, you can see here that I used my office much less over the past two weeks (both in presence and in the amount I used the foot mat), since we took a much-needed break (ignore the CO2 bits for now, that's maybe a separate post):

As I mentioned on Twitter a while back, a graph helped me notice that a heating/cooling vent was unintentionally left open when we switched from cooling to heating:

Or, want to see how well that outdoor mat on/off switching based on temperature is working?

An overview of the various temperatures in my house (and outside; the coldest line) over the past week:

Tools

What's really nice about having all of this set up, aside from the aforementioned relief that it can't be compromised directly from the Internet, is that I now have tools I can use within this infrastructure. For example, when we plugged in the Christmas tree lights this year, I had the outlet's schedule match the living room lighting, so it never gets accidentally left on overnight.

Did it now

I originally wrote this one to publish on Reddit, but also didn't want to lose it.

Many many years ago, I worked at a company in Canada that ran some financial services.

The owner was the kind of guy who drove race cars on weekends, and on weekdays would come into the programmers' room to complain that our fingers weren't typing fast enough.

On a particularly panicky day, one of the web servers in the pool that served our app became unresponsive. We had these servers hosted in a managed rack at a hosting provider offsite. After several hours of trying to bring it back, our hosting partner admitted defeat and declared that they couldn't revive WEB02. It had a hardware failure of some sort. We only had a few servers back then, and they were named according to their roles in our infrastructure: WEB01, WEB02, CRON01, DB03, etc.

Traffic and backlog started piling up with WEB02 out of the cluster, despite our efforts to mitigate the loss (which we considered temporary). Our head of IT was on the phone with our hosting provider trying to come up with a plan to replace the server. This was before "cloud" was a thing and each of our resources was a physically present piece of hardware. The agreed-upon solution was to replace WEB02 with a new box, which they were rushing into place from their reserve of hardware, onsite.

By this point, the race-car-driving, finger-typing-speed-complaining owner of the company was absolutely losing it. It seemed like he was screaming at anyone and everyone who dared make eye contact, even if they had truly nothing to do with the server failure or its replacement.

Our teams worked together to get the new box up and running in record time, and were well into configuring the operating system and necessary software when they realized that no one wanted to go out on a limb and give the new machine a name. President Screamy was very particular about these names for some reason and this had been the target of previous rage fests, so neither the hosting lead nor our internal soldiers wanted to make a decision that they knew could be deemed wrong and end up the target of even more yelling. So, they agreed that the hosting provider would call the CEO and ask him what he'd like to name the box.

But before that call could be made, the CEO called our hosting provider to tear into them. He was assured that the box was almost ready, and that the only remaining thing was whether to name it WEB02 to replace the previous box or to give it a whole new name like WEB06. Rage man did not like this at all, and despite being at the other end of the office floor from his office, we could all hear him lay fully into the otherwise-innocent phone receiver on the other end: "I just need that box up NOW. FIX IT. I don't care WHAT you call it! It just needs to be live! DO IT NOW!"

And that, friends, is how we ended up with a web pool of servers named WEB01, WEB03, WEB04, WEB05, and (the new server) DOITNOW. It also served well as a cautionary tale for new hires who happened to notice.

Cache-Forever Assets

I originally wrote this to help Stoyan out with Web Performance Calendar; republishing here.

A long time ago, we had a client with a performance problem. Their entire web app was slow. The situation with this client's app was a bit tricky; this client was a team within a very large company, and often—in my experience, anyway—large companies mean that there are a lot of different people/teams who exert control over deployed apps and there's a lot of bureaucracy in order to get anything done.

The client's team that had asked us to help with slow page loads only had passive access to logs (they couldn't easily add new logging), and they were mostly powerless to do things like optimize SQL queries (of which there were already logs); they really only controlled the web app itself, which was a very heavy Java/Spring-based app. The team we were working with knew just enough to maintain the user-facing parts of the app.

We, a contracted team brought in to help with guidance (and we did eventually build some interesting technology for this client), had no direct ability to modify the deployed app, nor did we even get access to the server-side source code. But we still wanted to help, and the client wanted us to help, given all of these constraints. So, we did a bit of what-we-can-see analysis, and came up with a number of simple, but unimplemented optimizations. "Low-hanging fruit" if you will.

These optimizations included things like "reduce the size of these giant images (and here's how to do it without losing any quality)", "concatenate and minify these CSS and JavaScript assets" (the app was fronted by an HTTP 1.x reverse proxy), and "improve user-agent caching". It's the last of these that I'm going to discuss here.

Now, before we get any deeper into this, I want to make it clear that the strategy we implemented (or, more specifically: advised the client to implement) is certainly not ground-breaking—far from it. This client, whether due to geographic location, or perhaps being shielded from outside influence within their large corporate infrastructure, had not implemented even the most basic of browser-facing optimizations, so we had a great opportunity to teach them things we'd been doing for years—maybe even decades—at this point.

We noticed that all requests were slow. Even the smallest requests. Static pages, pages dynamically rendered for the logged-in user, images, CSS, even redirects were slow. And we knew that we were not in a position to do much about this slowness, other than to identify it and hope the team we were in contact with could request that the controlling team look into the more-general problem. "Put the assets on a CDN and avoid the stack/processing entirely" was something we recommended, but it wasn't something we could realistically expect to be implemented given the circumstances.

"Reduce the number of requests" was already partially covered in the "concatenate and minify" recommendation I mentioned above, but we noticed that because all requests were slow, the built-in strategy of using the stack's HTTP handler to return 304 not modified if a request could be satisfied via Last-Modified or ETag was, itself, sometimes taking several seconds to respond.

A little background: normally (lots of considerations like cache visibility glossed over here), when a user agent makes a request for an asset that it already has in its cache, it tells the server "I have a copy of this asset that was last modified at this specific time" and the server, once it sees that it doesn't have a newer copy, will say "you've already got the latest version, so I'm not going to bother sending it to you" via a 304 Not Modified response. Alternatively, a browser might say "I've got a copy of this asset that you've identified to have unique properties based on this ETag you sent me; here's the ETag back so we can compare notes" and the server will—again, if the asset is already current—send back a 304 response. In both cases, if the server has a newer version of the asset it will (likely) send back a 200 and the browser will use and cache a new version.
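To make that conversation concrete, here's what it looks like from Python with the requests library. The URL is just a placeholder; a browser does the equivalent automatically using the validators it has cached.

import requests

url = "https://example.com/assets/app.css"  # placeholder asset URL

# First fetch: the server returns the asset along with its validators.
first = requests.get(url)
etag = first.headers.get("ETag")
last_modified = first.headers.get("Last-Modified")

# Revalidation: send the validators back. A 304 means "your cached copy is
# still good" and comes with no body.
conditional_headers = {}
if etag:
    conditional_headers["If-None-Match"] = etag
if last_modified:
    conditional_headers["If-Modified-Since"] = last_modified

second = requests.get(url, headers=conditional_headers)
if second.status_code == 304:
    print("Not Modified: reuse the cached copy")
else:
    print(f"Got a (possibly new) copy with status {second.status_code}")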

It's these 304 responses that were slow on the server side, like all other requests. The browser was still making the request and waiting a (relatively) long time for the confirmation that it already had the right version in its cache, which it usually did.

The strategy we recommended (remember: because we were extremely limited in what we expected to be able to change) was to avoid this Not Modified conversation altogether.

With a little work at "build" time, we were able to give each of these assets not only a unique ETag (as determined by the HTTP dæmon itself), but a fully unique URL based on its content. By doing so, and setting appropriate HTTP headers (more on the specifics of this below), we could tell the browser "you never even need to ask the server if this asset is up to date." We could cache "forever" (in practice: a year in most cases, but that was close enough for the performance gain we needed here).

Fast forward to present time. For our own apps, we do use a CDN, but I still like to use this cache-forever strategy. We now often deploy our main app code to AWS Lambda, and find ourselves uploading static assets to S3, to be served via CloudFront (Amazon Web Services' CDN service).

We have code that collects (via either a pre-set lookup, or by filesystem traversal) the assets we want to upload. We do whatever preprocessing we need to do to them, and when it's time to upload to S3, we're careful to set certain HTTP headers that indicate unconditional caching for the browser:

# Module-level setup (the method below comes from a larger uploader class,
# so the self.* attributes and helpers are defined elsewhere in that class):
import os
from datetime import datetime, timedelta

import boto3
import botocore.exceptions

# boto3 S3 resource used for the existence check below
s3 = boto3.resource("s3")


def upload_collected_files(self, force=False):
    for f, dat in self.collected_files.items():

        key_name = os.path.join(
            self.bucket_prefix, self.versioned_hash(dat["hash"]), f
        )

        if not force:
            try:
                s3.Object(self.bucket, key_name).load()
            except botocore.exceptions.ClientError as e:
                if e.response["Error"]["Code"] == "404":
                    # key doesn't exist, so don't interfere
                    pass
                else:
                    # Something else has gone wrong.
                    raise
            else:
                # The object does exist.
                print(
                    f"Not uploading {key_name} because it already exists, and not in FORCE mode"
                )
                continue

        # RFC 2616:
        # "HTTP/1.1 servers SHOULD NOT send Expires dates more than one year in the future"
        headers = {
            "CacheControl": "public,max-age=31536000,immutable",
            "Expires": datetime.today() + timedelta(days=365),
            "ContentType": dat["mime"],
            "ACL": "public-read",
        }

        self.upload_file(
            dat["path"],
            key_name,
            self.bucket,
            headers,
            dry_run=os.environ.get("DRY_RUN") == "1",
        )

The key name (which extends to the URL) is a shortened representation of a file's contents, plus a "we need to bust the cache without changing the contents" version on our app's side, followed by the asset's natural filename, such as (the full URL): https://static.production.site.faculty.net/c7a1f31f4ed828cbc60271aee4e4f301708662e8a131384add7b03e8fd305da82f53401cfd883d8b48032fb78ef71e5f-2020101000/images/topography-overlay.png
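For illustration, here's roughly how a key like that can be derived. The hash algorithm, the version format, and the function name are assumptions for the sketch, not necessarily what our versioned_hash() actually does.

import hashlib
import posixpath


def content_key(rel_path, cache_version, bucket_prefix=""):
    # Hash the file's bytes so the key (and therefore the URL) changes
    # whenever the content changes. SHA-384 here is illustrative.
    with open(rel_path, "rb") as f:
        digest = hashlib.sha384(f.read()).hexdigest()

    # Append an app-side version so we can bust caches without changing the
    # file's contents (the "2020101000" part of the example URL above).
    versioned = f"{digest}-{cache_version}"

    # Keep the natural relative path at the end so the URL stays readable.
    return posixpath.join(bucket_prefix, versioned, rel_path)


# e.g. content_key("images/topography-overlay.png", "2020101000")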

The Cache-Control and Expires headers set in the upload code effectively tell S3 to relay them to the browser (via CloudFront), so the asset doesn't expire for a year. Because of this, the browser doesn't even make a request for the asset if it's got it cached.

We control cache busting (such as a new version of a CSS, JS, image, etc.) completely via the URL; our app has access (via a lookup dictionary) to the uploaded assets, and can reference the full URL to always be the latest version.
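That lookup can be as simple as a dictionary written out at upload time and loaded by the app; the names and URL here are made up just to show the shape of it.

# Generated at upload time: logical asset path -> full content-addressed URL.
ASSET_MANIFEST = {
    "images/topography-overlay.png": (
        "https://static.example.com/<content-hash>-2020101000/"
        "images/topography-overlay.png"
    ),
}


def asset_url(name):
    # Templates always go through this helper, so they automatically pick
    # up the latest uploaded URL and never reference a stale asset.
    return ASSET_MANIFEST[name]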

The real beauty of this approach is that the browser can entirely avoid even asking the server if it's got the latest version—it just knows it does—as illustrated here:

Developer tools showing "cached" requests for assets on faculty.com