Monolith: Real Time Recommendation System With Collisionless Embedding Table

I didn’t get that much from this paper, probably because it’s pretty high level and I don’t have a strong background in recommendation systems.

The heart of their approach is a collisionless embedding table built on cuckoo hashing, from which they can update parameters on the fly using existing data engineering pipeline technology.

Instead of reading mini-batch examples from the storage, a training worker consumes realtime data on-the-fly and updates the training PS. The training PS periodically synchronizes its parameters to the serving PS, which will take effect on the user side immediately. This enables our model to interactively adapt itself according to a user’s feedback in realtime.
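The cuckoo-hashing part is straightforward enough to sketch. Here is a minimal collisionless table in Rust, under the assumption of fixed-size embeddings; the paper’s real table adds frequency filtering and expiry of stale IDs, which I leave out, and the 8-float embedding is an arbitrary illustration:

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const SLOTS: usize = 1024; // per table; a real system would grow or expire entries
const MAX_KICKS: usize = 32;

type Embedding = [f32; 8]; // illustrative size

#[derive(Clone)]
struct Entry {
    key: u64,
    value: Embedding,
}

// Two-table cuckoo hash: every key lives in one of exactly two candidate
// slots, so a lookup never walks a collision chain.
struct CuckooTable {
    tables: [Vec<Option<Entry>>; 2],
}

impl CuckooTable {
    fn new() -> Self {
        Self { tables: [vec![None; SLOTS], vec![None; SLOTS]] }
    }

    fn slot(&self, which: usize, key: u64) -> usize {
        let mut h = DefaultHasher::new();
        (which as u64, key).hash(&mut h);
        (h.finish() as usize) % SLOTS
    }

    fn get(&self, key: u64) -> Option<&Embedding> {
        for which in 0..2 {
            let idx = self.slot(which, key);
            if let Some(e) = &self.tables[which][idx] {
                if e.key == key {
                    return Some(&e.value);
                }
            }
        }
        None
    }

    fn insert(&mut self, key: u64, value: Embedding) -> Result<(), Entry> {
        // Update in place if the key is already present.
        for which in 0..2 {
            let idx = self.slot(which, key);
            if let Some(e) = &mut self.tables[which][idx] {
                if e.key == key {
                    e.value = value;
                    return Ok(());
                }
            }
        }
        // Otherwise evict whatever occupies a candidate slot and re-home
        // the evictee in its other table, up to MAX_KICKS displacements.
        let mut cur = Entry { key, value };
        let mut which = 0;
        for _ in 0..MAX_KICKS {
            let idx = self.slot(which, cur.key);
            match self.tables[which][idx].replace(cur) {
                None => return Ok(()),
                Some(evicted) => {
                    cur = evicted;
                    which = 1 - which;
                }
            }
        }
        Err(cur) // table too full; a real implementation would rehash or grow
    }
}

fn main() {
    let mut table = CuckooTable::new();
    table.insert(42, [0.1; 8]).expect("table full");
    assert!(table.get(42).is_some());
}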

Eight Things to Know about Large Language Models

A bunch of stuff that maybe was somewhat surprising a year ago but by now should be common knowledge for anybody even half following the developments in this field.

Some interesting bits in there, but otherwise it’s a bit rah-rah because the author works at Anthropic.

In particular, models can misinterpret ambiguous prompts or incentives in unreasonable ways, including in situations that appear unambiguous to humans, leading them to behave unexpectedly.

Our techniques for controlling systems are weak and are likely to break down further when applied to highly capable models. Given all this, it is reasonable to expect a substantial increase and a substantial qualitative change in the range of misuse risks and model misbehaviors that emerge from the development and deployment of LLMs.

The recent trend toward limiting access to LLMs and treating the details of LLM training as proprietary information is also an obstacle to scientific study.

Here’s Simon Willison writing about how he approaches his link blog.

I do something along similar lines here. I share links to things I find interesting and try to add what I think is notable about them. From there I schedule posts to Bluesky, Mastodon and LinkedIn using Buffer.

I’m not sure who reads my stuff here, but I know for sure that people see the exhaust on those platforms. The main reason I blog links here is to have my own repository of knowledge in case I ever have to refer back to something. For that, I annotate things in a way that I can hopefully find again, so site search will turn up ‘that one link about X I shared a while back’.

So yes, WordPress works just fine as a personal knowledge management system.

The Digital Patient Record system in Germany is built on smart cards and hardware that make it impossible to update and keep secure.

Of course a company like Gematik can’t update algorithms and keys across such a widespread, heterogeneous system. This is a competency that is impossible to organise except at the largest scales, and even then companies like Microsoft routinely leak their root keys.

The ‘hackers’ who made this presentation can’t build anything better than this either, and their culture is what led us to this point in the first place. It’s the same story with the German digital ID card, which nobody uses.

The recipe is simple:

  • Demand absurd levels of security for threat models that are outlandish and paranoid
  • Have those demands complicate your architecture with security measures that look good but are impossible to maintain
  • Reap the exploits that you can run against that architecture and score publicity
  • <repeat>

It’s a great way to make sure that everybody loses in the German IT landscape.

Solution: Simplify the architecture to a server model with a normal 2FA login and keep that server secure. Done.

https://www.golem.de/news/elektronische-patientenakte-so-laesst-sich-auf-die-epa-aller-versicherten-zugreifen-2412-192003.html

Witches’ Kitchen

From the Grothendieck biography, funny to see that the legend would express himself in German.

Riemann-Roch theorem: the latest craze: the diagram […] is commutative!
To give this statement about f:X->Y an approximate meaning, I had to abuse the patience of my audience for nearly two hours. In black and white (in Springer Lecture Notes) it takes probably some 400–500 pages.
A gripping example of how our drive for knowledge and discovery indulges itself more and more in a logical delirium far removed from life, while life itself goes to the devil a thousandfold, and is threatened with final annihilation. High time to change our course!

—Alexander Grothendieck

The low-latency user wants Bigtable’s request queues to be (almost always) empty so that the system can process each outstanding request immediately upon arrival. (Indeed, inefficient queuing is often a cause of high tail latency.) The user concerned with offline analysis is more interested in system throughput, so that user wants request queues to never be empty. To optimize for throughput, the Bigtable system should never need to idle while waiting for its next request.

This is also the source of my suffering at the moment: we have lots of shared resources that need to stay available but that can also get hammered by various parties.

Good to see that spelled out in this piece by Dan Slimmon: The Latency/Throughput Tradeoff: Why Fast Services Are Slow And Vice Versa. I read the SRE book as well, but that part did not register with me back then.
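The textbook way to see the conflict is the M/M/1 queue: with arrival rate λ and service rate μ, the mean time a request spends in the system is W = 1/(μ − λ). Latency blows up exactly as utilization λ/μ approaches 1, and near-full utilization is where throughput peaks, so every service really does have to pick a point on that curve per workload.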

I use AI tools to help me program despite them being mostly very disappointing. They save me some typing once in a while.

At least, now that I have switched from Perplexity to Cursor, I can ask my questions in my editor directly without having to open a browser search tab. I pass through a lot of different technologies in a given workday, so I have a lot of questions to ask.

For my use cases, it’s rare that Cursor can make even a halfway decent code change, including in domains where there is a bunch of prior art (“convert this file from using alpine.js to htmx”). I know people who say they have generated thousands of LoC with LLMs that they actively use, but then the old adage applies: “We can generate as much code as you want, if only all the code is allowed to be shit.”

The position below is one of the more charitable takes on how AI can help a programmer, and even that one I don’t find particularly convincing.

https://www.geoffreylitt.com/2024/12/22/making-programming-more-fun-with-an-ai-generated-debugger.html

Attention Is All You Need

I thought I’d dive back into history and read the original paper that started it all. It’s somewhat technical about encoder/decoder layouts and matrix multiplications. None of the components are super exciting for somebody who’s been looking at neural networks for the past decade.

What’s exciting is that such a simplification produces results that are that much better. I’d love to know how they came up with it, but unfortunately the paper doesn’t say.

The paper itself is a bit too abstract so I’m going to look for some of those YouTube videos that explain what is actually going on here and why it’s such a big deal. I’ll update this later.

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

I came across this paper after the recent o3 high score on the ARC-AGI-PUB test. It’s a quick read and details how to scale LLMs at inference time by generating new states at every node, building a tree over which to run DFS/BFS search algorithms.

A specific instantiation of ToT involves answering four questions: 1. How to decompose the intermediate process into thought steps; 2. How to generate potential thoughts from each state; 3. How to heuristically evaluate states; 4. What search algorithm to use.

For each of these steps they can deploy the LLM itself to generate the desired results, which, scaled over the search space, balloons the number of calls that need to be made (costing almost 200x the compute).
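To make the shape of that concrete, here is a minimal sketch of the BFS variant in Rust, with the two LLM roles stubbed out. propose and score stand in for the paper’s thought-generation and state-evaluation prompts; the names and the string-based state are mine, not the paper’s:

// A state is the partial solution built up so far.
type State = String;

// Stand-in for the LLM generating k candidate next thoughts from a state.
fn propose(state: &State, k: usize) -> Vec<State> {
    (0..k).map(|i| format!("{state} -> thought{i}")).collect()
}

// Stand-in for the LLM heuristically scoring a state.
fn score(state: &State) -> f64 {
    state.len() as f64 // placeholder; really another model call
}

// Breadth-first Tree of Thoughts: expand every frontier state into k
// candidates, score them all, and keep only the best `beam` states.
fn tot_bfs(root: State, depth: usize, k: usize, beam: usize) -> State {
    let mut frontier = vec![root];
    for _ in 0..depth {
        let mut candidates: Vec<State> =
            frontier.iter().flat_map(|s| propose(s, k)).collect();
        // Every candidate costs one generation call plus one evaluation
        // call, which is where the ~200x compute multiplier comes from.
        candidates.sort_by(|a, b| score(b).partial_cmp(&score(a)).unwrap());
        candidates.truncate(beam);
        frontier = candidates;
    }
    frontier.into_iter().next().unwrap()
}

fn main() {
    println!("{}", tot_bfs("problem".into(), 3, 5, 2));
}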

This isn’t your normal LLM stochastic parrot anymore. We’ve gone one level up the abstraction chain: here we have a computer science algorithm running with LLM calls as its basic atoms.

December Adventure

So I couldn’t really bring myself to do Advent of Code this year: I have more than enough other things to do (and watch and play), and with work and the kids it’s always pretty miserable to keep up.

I saw this thing called December Adventure though and that fits in nicely with my current push to release a major update for Cuppings. If I’m going to be programming until late this month, then I’d prefer it to be on something that I can release.

I can’t promise that I won’t do any AoC (Factor is looking mighty cool) but I won’t force myself to do anything. With that, let’s get going.

1/12

I started working on the map view, which from clicking around looked like it could be really annoying to build. I hit some dead ends and was afraid I’d have to hack in Leaflet support myself, but then I found a dioxus example hidden in the leaflet-rs repository.

Yes, I’m writing this website in Rust/WASM, why do you ask?

That example required a bunch of fiddling with the configuration and a couple of false starts, but now I have a vanilla map view.

I can say that I’m amazed that in this ecosystem 1. an example exists, 2. that example works, 3. it works in my project with a bit of diffing, and 4. it seems to do what I need.

I raised a PR to advertise this example in the project’s README just like the others, so people won’t have to search like I did. That PR got merged:

https://github.com/slowtec/leaflet-rs/pull/36

2/12

Today I’ll see if I can tweak the map view to show the location of the cafe we tapped and get things to a point where I can commit the change.

To do this I need to figure out how to pass information along through the router when we tap a venue. That should be easy enough, but the Dioxus documentation is between 0.5 and 0.6 right now and a lot of it is broken.

A tip from the Discord said I need to put the data into a context from a parent and then get it out again in a child. It’s a bit roundabout and required some refactoring, but it works.
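Roughly the pattern, assuming Dioxus 0.6’s use_context_provider/use_context pair; SelectedVenue is an illustrative type, not what Cuppings actually passes around:

use dioxus::prelude::*;

// Illustrative payload; the real app carries its own venue data.
#[derive(Clone, Copy)]
struct SelectedVenue(i64);

#[component]
fn Parent() -> Element {
    // Make the value available to every descendant of this component.
    use_context_provider(|| Signal::new(SelectedVenue(176)));
    rsx! { Child {} }
}

#[component]
fn Child() -> Element {
    // Pull the value back out in the child, no prop-drilling required.
    let selected = use_context::<Signal<SelectedVenue>>();
    let id = selected.read().0;
    rsx! { "venue {id}" }
}

fn main() {
    dioxus::launch(Parent);
}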

Done on time even for a reasonable bed time.

3/12

Turns out my changes from yesterday did not make it to the staging server. I’ll fix that and manually run the job again.

Those are the annoying wasm-bindgen version errors that keep happening and that require a reinstall of the CLI: cargo install -f wasm-bindgen-cli --version 0.2.97, plus the dioxus-cli. Dioxus, by the way, is preparing its long-awaited 0.6.0 release.

Yes, I build this on the same Hetzner box that hosts it. So here you go: https://staging.cuppin.gs

Other than that, not much will happen today since I spent most of the evening noodling around with Factor (despite my intention not to do any weird programming). It’s a nice language, very similar to Uiua, which I tried out a while back, but not being an array programming language it feels somewhat more ergonomic.

4/12

I can’t describe how nice it is to wake up and not have to deal with a mediocre story line involving elves and try to find time to attack a programming problem.

After today, I’m going to need that quiet morning, because I spent until 01:30 debugging an issue: Going to a detail view from the frontpage worked, but loading a detail view directly would throw an error.

There were two issues at play here:

Leaflet maps don’t deal well with being created multiple times, so either we have to call map.remove() or we have to check whether the map has already been created and keep a reference to it somehow.

I solved it by pushing the map into a global variable:

thread_local!(static MAP: RefCell<Option<Map>> = RefCell::new(None));

These are Rust constructs I would normally never use so that’s interesting. More interesting is that they work in one go and that they work on the WASM target.
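For the record, the usage pattern looks roughly like this; Map and create_map below are hypothetical stand-ins for the leaflet-rs type and its actual setup code:

use std::cell::RefCell;

// Stand-in for leaflet::Map.
struct Map;

// Hypothetical placeholder for the real Leaflet initialisation.
fn create_map() -> Map {
    Map
}

thread_local!(static MAP: RefCell<Option<Map>> = RefCell::new(None));

// Create the Leaflet map at most once; later mounts reuse the existing
// instance instead of re-initialising the same container.
fn ensure_map() {
    MAP.with(|slot| {
        let mut slot = slot.borrow_mut();
        if slot.is_none() {
            *slot = Some(create_map());
        }
    });
}

fn main() {
    ensure_map();
    ensure_map(); // second call is a no-op
}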

Then the error was gone but the page was blank. Not entirely sure what was happening, I poked at the DOM and saw that all the map elements were there, just not visible. Turns out that because of the different path, the stylesheet reference was being resolved relative to the route, like this: http://127.0.0.1:8080/venue/176/main.css

It just has these two lines:

#map {
    width: 100%;
    height: 100vh;
}

But without a height the map is invisible.

Both issues are solved but not committed. I’ll see tomorrow whether I’m happy with the solution and how to package this up. Also I’m not sure how main.css is being served on production and whether the same fix will work there.

5/12

I couldn’t help but noodle on Advent of Code a bit. Here’s my day 1 part 1 in Factor: https://github.com/alper/advent-of-code/blob/main/2024/day-01/day-01.factor

I like Factor the programming language. It’s like Lisp or Haskell but without all the annoying bits.

The environment that comes with it I’m not so keen on. It’s annoying to use and has lots of weird conventions that aren’t very ergonomic.

6/12

I’ve been bad and I’ve finished part 2 of day 1 of the Advent of Code: https://github.com/alper/advent-of-code/blob/main/2024/day-01/day-01.factor#L27

Not so December Adventure after all, maybe. I promise I’ll finish the mapping improvements I was working on tomorrow.

7/12

Went on my weekly long bike ride. Then in the evening I didn’t have that much energy for programming other than finishing Advent of Code day 3 part 1: https://github.com/alper/advent-of-code/commit/0a74c38e7641141e10b4c48203c9e414cc492e1c

(I looked at day 2 part 2 but that just looked very tedious.)

8/12

Got in a ton of commits on Cuppin.gs today. After fixing the map, I wanted to see what would happen if I added all 2000 markers to the map.

Performance seems doable, but this is probably not ideal for a webpage. Dynamically rendering the venues is something for later. For now I can probably get away with filtering for the 100-200 nearest locations by distance and dumping those into the map view.
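That filtering is cheap enough to do naively. A sketch of what I mean, with illustrative types: sort the full set by haversine distance and keep the nearest n, which is nothing at 2000 venues:

#[derive(Clone)]
struct Venue {
    name: String,
    lat: f64,
    lon: f64,
}

// Great-circle distance in kilometres between two lat/lon points.
fn haversine_km(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (p1, p2) = (lat1.to_radians(), lat2.to_radians());
    let dlat = (lat2 - lat1).to_radians();
    let dlon = (lon2 - lon1).to_radians();
    let a = (dlat / 2.0).sin().powi(2)
        + p1.cos() * p2.cos() * (dlon / 2.0).sin().powi(2);
    2.0 * 6371.0 * a.sqrt().asin()
}

// O(n log n) over the whole set; no spatial index needed at this size.
fn nearest(venues: &[Venue], lat: f64, lon: f64, n: usize) -> Vec<Venue> {
    let mut v = venues.to_vec();
    v.sort_by(|a, b| {
        haversine_km(lat, lon, a.lat, a.lon)
            .total_cmp(&haversine_km(lat, lon, b.lat, b.lon))
    });
    v.truncate(n);
    v
}

fn main() {
    let venues = vec![Venue { name: "Example".into(), lat: 52.52, lon: 13.40 }];
    println!("{}", nearest(&venues, 52.50, 13.41, 100).len());
}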

Now I’m back to debugging GitHub Actions. I’m splitting the build and deploy of the backend and the frontend into separate actions. Compiling dioxus-cli takes forever, a step I hope I can skip with cargo-binstall.

Iterating on GitHub Actions takes forever, and there really doesn’t seem to be a better way to develop this, or a better CI solution that everybody is willing to use.

10/12

Spent some hours massaging the data that goes into the app. I had to add all the new venues, and after that I wanted to check whether any places in our 2k venue set had closed so we can take them off the display. This is a somewhat tedious multi-step process.

I have an admin binary that calls the Google Maps API for each venue to check the venue data and the business status (CLOSED_TEMPORARILY and the like). To be able to do that you have to feed each place ID into the API. The only issue with place IDs is that they expire from time to time; there’s a free API call that you can use to refresh them.
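The shape of that check, sketched with reqwest and serde against the Place Details endpoint. The field list is trimmed to what matters here, and this is my reconstruction rather than the actual admin binary:

use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct DetailsResult {
    place_id: Option<String>,        // may come back refreshed
    business_status: Option<String>, // e.g. "CLOSED_TEMPORARILY"
}

#[derive(Deserialize, Debug)]
struct DetailsResponse {
    status: String, // "OK", "NOT_FOUND", ...; NOT_FOUND is the
                    // place-has-disappeared case mentioned below
    result: Option<DetailsResult>,
}

async fn check_venue(
    client: &reqwest::Client,
    place_id: &str,
    api_key: &str,
) -> Result<DetailsResponse, reqwest::Error> {
    client
        .get("https://maps.googleapis.com/maps/api/place/details/json")
        .query(&[
            ("place_id", place_id),
            ("fields", "place_id,business_status"),
            ("key", api_key),
        ])
        .send()
        .await?
        .json::<DetailsResponse>()
        .await
}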

That expiration does not happen that often. What happens more often, I found, is that a place disappears from Google Maps entirely; for some reason it just gets deleted. I don’t handle that case yet, so there my updaters break entirely, and the quickest fix is to delete the venue from the database and restart.

The only data issue I still have outstanding is venues that move to a different address. There’s a place around here that I think is still showing at its old spot.

11/12

Tried to run Cuppings in Xcode only to be met with some weird compilation errors. Turns out there’s now an Expression type in Foundation that collides with my SQLite.swift Expression. It’s a pretty silly reason for code to break: Expression – name space conflict with Xcode 16/iOS 18

Also still fighting with the frontend deployments, which seem to need a --frozen passed to them so they don’t proactively go and update package versions.

14/12

Lovely to have a crash on startup in the Cuppings TestFlight build, then to sit down today, bake a new one, upload it, and have that one work. No clue what the issue was, even though I took a look at the crash log (which I sent in myself).

I’ve also automated building the iOS app with Xcode Cloud, which should make shipping new versions (whenever the database is updated) a lot easier.

16/12

Upgraded the frontend to Dioxus 0.6.0, which just came out with lots of quality-of-life improvements. In my case I did not need to change a single line of code, just bump some version numbers and build a new dioxus-cli.

Nice TUI for serving the frontend

I hope that maybe solves the wasm-bindgen issues on the frontend deploy. The annoying part about the build is that it takes so long that it’s very hard to iterate on.

It’s too late even for me to see what this does. I’m off to bed. You may or may not get a new version of the website by tomorrow morning.

18/12

Spent some iterations running and rerunning the frontend deploy, but now it should be working.

22/12

I spent the evening doing manual data munging and correcting some venue locations that hadn’t been updated correctly through my data life cycle.

That forced me to clarify the two name fields the venues table has.

  • name was the original name field and was pulled from the Foursquare metadata
  • google_name is the name field pulled from Google Maps; it was effectively the authoritative one, but wasn’t yet being updated correctly when refreshing the data

So to figure that out I did a bunch of auditing in the list, looking for venues with a large discrepancy between the two names. One thing that happens is that a place changes its name but keeps the same location and Google Maps place.
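That audit is the kind of thing a quick string-similarity pass gets you most of the way on. A sketch assuming the strsim crate, which is not necessarily what I actually ran:

use strsim::normalized_levenshtein;

struct VenueNames {
    id: i64,
    name: String,        // original Foursquare name
    google_name: String, // name pulled from Google Maps
}

// Flag venues whose two name fields disagree badly; those are the
// candidates for "renamed but same place" or plain bad data.
fn suspicious(venues: &[VenueNames], threshold: f64) -> Vec<&VenueNames> {
    venues
        .iter()
        .filter(|v| normalized_levenshtein(&v.name, &v.google_name) < threshold)
        .collect()
}

fn main() {
    let venues = vec![VenueNames {
        id: 1,
        name: "Café Example".into(),
        google_name: "Completely Different Name".into(),
    }];
    for v in suspicious(&venues, 0.5) {
        println!("{}: {:?} vs {:?}", v.id, v.name, v.google_name);
    }
}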

I also added a label to the iOS app to indicate whether it’s a DEBUG build, but that messed up the layout, so I guess I might as well remove it. Sometimes I get confused about what I’m running, but since it’s just me running DEBUG builds on my phone, I think I can do without.

I also started a rewrite that I’m not sure I’m going to pull over the line: removing the search dependency on Alpine.js and replacing it with htmx. I asked Cursor to do the translation; it took a stab at it but ultimately failed to manage even the basic steps. Then I did it myself, and while htmx is super easy to set up, the data juggling I have to do with what I get from Google Maps is very fragile and needs to be cleaned up (which I may or may not do, given that things are working right now).

23/12

Working on the backend was very annoying because every time the server restarted, it would log me out. To fix that I changed the persistence of tower-sessions from MemoryStore to FileSessionStorage, and that fixed it without issues. There is now a .sessions folder in the backend that needs to be ignored by cargo watch, but other than that it’s a drop-in replacement.
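For reference, the swap happens where the session layer gets built. A sketch with tower-sessions’ MemoryStore; the file-backed store drops into the same spot since stores share the SessionStore trait (I’m not showing its exact constructor, which may differ from what I’d guess):

use tower_sessions::{MemoryStore, SessionManagerLayer, SessionStore};

// Building the layer over any store makes persistence a one-line swap
// at the call site.
fn session_layer<S: SessionStore + Clone>(store: S) -> SessionManagerLayer<S> {
    SessionManagerLayer::new(store)
}

fn main() {
    // In-memory: sessions die with every restart, hence the logouts.
    let _layer = session_layer(MemoryStore::default());
    // File-backed: swap in the FileSessionStorage mentioned above in the
    // same way, e.g. session_layer(FileSessionStorage::new(".sessions")),
    // with the caveat that I haven't checked its constructor.
}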

That means I will need to write a logout view at some point.