Standard Notes Sync Protocol, and SFRS, a Rust implementation

24 Feb, 2020

As you may know, I have been a user of Standard Notes for a long time. Being a self-hosting nerd, I have been running my own Standard Notes server this whole time, based on the Go implementation of the Standard File protocol (the former name of the Standard Notes sync protocol). There is just one slight problem: that implementation appears to have been abandoned since mid-2019.

Of course, Standard File (and thus the Standard Notes sync protocol) is not a particularly complicated protocol that needs frequent maintenance, but things have happened since the Go implementation was last updated. For example, the Standard File name was dropped entirely and the protocol officially became the synchronization protocol of Standard Notes. The protocol itself was also updated, at the very least introducing a different conflict resolution algorithm, which the Go implementation does not support; it relies entirely on the backwards-compatibility of the client side.

As a programmer, my first instinct after encountering such a situation was to rewrite the thing and maintain it myself. And so I did. To be completely fair, I could have just switched to the official Ruby implementation and called it a day, but I am not a big fan of the official server-side code. I think their client-side software, including the UI and the code itself, is fantastic, and I may not be able to produce something at the same level of competence, but the Ruby on Rails backend is really not something I like that much. Similar to the Go server, it depends on timestamps from the system clock (which may not be strictly monotonic) for synchronization tokens. This should be totally fine 99.9% of the time, but it could produce unexpected behavior when multiple clients synchronize at once, as one or two bug reports may indicate (this is purely speculation), especially since there is no lock in place to limit how many synchronization requests can run in parallel for each user. Their conflict-detection logic also ignores conflicts whose timestamps are within some arbitrary interval of each other, for a reason I have not yet been able to understand. Again, this is far from a disaster and should be fine for most use-cases, and I am probably just overthinking (I cannot even give a specific reason why those designs are bad ideas), but still, it is not something I love. So I set out to rewrite a synchronization server in the language I like, Rust, and named it SFRS.

During implementation, I noticed that none of the aforementioned updates is obvious from what you can see directly. The new Standard Notes website includes some documentation, but its description of the synchronization protocol is still identical to the old one that the Go implementation follows. The explanation on the official website is also a bit too vague and general, which left me confused while trying to implement it myself. I have reported these discrepancies to Standard Notes, and it seems that they are in the middle of a big refactoring of the client side (probably to get rid of AngularJS and use React instead), with the documentation to be updated after they finish. For anyone who, like me, tries to re-invent this wheel before the official documentation catches up, I have decided to list all of these discrepancies in this blog article.

Please keep in mind that I am not the designer of the protocol and all of these are based on the official documentation plus what I can extract from the source code of the official implementation. I cannot guarantee that these will be correct, but they at least worked for me and I will try my best to explain what I discovered during my trial-and-error process of getting things to work.

Requests and Responses

This is not a discrepancy, but it is something not stated clearly in the documentation. At least for the official client, request bodies are always sent as application/json, and responses are expected to be application/json as well. The only exception is GET requests: since they have no body, the parameters are passed in the query string instead.

I am not sure if other client-side software may use other formats for requests, like application/x-www-form-urlencoded, but at least using JSON everywhere makes the official clients work perfectly.
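
To make the shapes concrete, here is a minimal sketch of how these parameters could be modeled on the server side with serde. The struct names are my own and the sign-in body is reduced to its essential fields; only the field names come from the protocol itself.

use serde::Deserialize;

// Body of POST /auth/sign_in, delivered as application/json.
#[derive(Deserialize)]
struct SignInParams {
    email: String,
    password: String,
}

// Parameters of GET /auth/params, delivered in the query string
// (e.g. /auth/params?email=foo@example.com) since a GET has no body.
#[derive(Deserialize)]
struct AuthParamsQuery {
    email: String,
}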

Authorization Endpoints

The most unexpected discrepancy comes from the authorization endpoints, i.e. /auth*. The official documentation says that the response of endpoints that return a token (the registration endpoint /auth and the sign-in endpoint /auth/sign_in) should be

{"token" : "..."}

...which is incompatible with their client-side implementation. The actual Standard Notes client expects an extra field, user, with email and uuid as its attributes. The full response should look like

{
  "token": "...",
  "user": {
    "email": "...",
    "uuid": "..."
  }
}

I am not sure why the client expects such an object, or why the user must have a UUID even though it does not seem to be used anywhere in the client; my initial implementation did not even need a UUID to identify users. Whatever the case, simply adding these fields made the client happy and stopped it from crashing, which is a good sign.
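
For reference, a response of this shape can be produced with a serde sketch as simple as the following; the struct names are mine, and only the JSON field names matter.

use serde::Serialize;

// The full response of /auth and /auth/sign_in that the client accepts.
#[derive(Serialize)]
struct AuthResponse {
    token: String, // the authentication token the client attaches to later requests
    user: UserInfo,
}

#[derive(Serialize)]
struct UserInfo {
    email: String,
    uuid: String,
}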

Synchronization Tokens

The description of the /items/sync endpoint is, to me, the most confusing piece of information in the documentation. The confusion starts with the basis of synchronization -- sync_token and cursor_token -- which the documentation describes as

sync_token: the sync token returned from the previous sync call. Leave empty if first sync.
limit: (optional) the number of results to return. cursor_token is returned if more results are available.

I don't know about others, but to me it was not at all clear what these are supposed to be. I know that to synchronize, you need something that records where the client was last time, so the server can send whatever the client does not yet have on the next request. But here we have two similarly-named entities with seemingly identical purposes -- both appear to record where the client was last time -- yet they are apparently treated as totally different things.

At first, I assumed that only one of cursor_token and sync_token is needed depending on the circumstance, i.e. that when limit is set, sync_token is no longer needed. This was not the case, and it caused the client to misbehave in my testing. I then tried several different approaches, in a process too messy to recount in any organized manner.

Finally, after several failures and some digging into the source code of the official implementation and the Go implementation, I ended up with something that works. In this working configuration, sync_token and cursor_token are defined as follows:

sync_token: a marker of the newest change the server knows about. The server always returns one, pointing at the latest item on the server even when limit prevented some items from being included in the response, and the client sends it back on the next sync so the server only has to consider changes newer than that point.
cursor_token: a marker of where a paginated response actually stopped. It is returned only when there were more results than limit allowed, and it points at the last item that was actually included. The client keeps sending it back (alongside sync_token) on subsequent requests, and the server continues from the cursor instead of the sync token, until no cursor_token is returned any more -- at which point the client has caught up.
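
To make this more concrete, here is a simplified sketch of how a server could produce the two tokens. It assumes every change receives a server-assigned, strictly increasing sequence number; this is not literally what SFRS or the official server does internally, but the token semantics come out the same.

#[derive(Clone)]
struct Item {
    uuid: String,
    content: String,
    seq: u64, // server-assigned, strictly increasing change counter
}

struct SyncResult {
    retrieved_items: Vec<Item>,
    sync_token: String,           // always points at the newest change on the server
    cursor_token: Option<String>, // set only when `limit` cut the result short
}

// `from_seq` is decoded from cursor_token if the client sent one,
// otherwise from sync_token (or 0 on the very first sync).
fn sync(all_items: &[Item], from_seq: u64, limit: usize) -> SyncResult {
    // Everything the client has not seen yet, oldest first.
    let mut pending: Vec<Item> = all_items
        .iter()
        .filter(|i| i.seq > from_seq)
        .cloned()
        .collect();
    pending.sort_by_key(|i| i.seq);

    let truncated = pending.len() > limit;
    // sync_token always advances to the newest known change, even if this
    // response could not include everything.
    let newest_seq = pending.last().map(|i| i.seq).unwrap_or(from_seq);
    let page: Vec<Item> = pending.into_iter().take(limit).collect();

    SyncResult {
        sync_token: newest_seq.to_string(),
        // Only when truncated: tell the client where this page stopped, so it
        // can continue from there on the next request.
        cursor_token: if truncated {
            page.last().map(|i| i.seq.to_string())
        } else {
            None
        },
        retrieved_items: page,
    }
}

In short, sync_token always races ahead to the newest state, while cursor_token exists only to let the client page through a backlog that did not fit into a single response.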

Conflict Types and Detection

The official documentation describes an unsaved field in the response of /items/sync that contains items which conflicted during synchronization. This is already obsolete as of the latest client-side implementation. Instead, the conflicted items should now be returned in a field called conflicts, with the following structure:

{
  "type": "sync_conflict|uuid_conflict",
  "unsaved_item": { ... },
  "server_item": { ... }
}

where | means OR. If type is set to sync_conflict, then unsaved_item should be set to null and server_item should be set to the conflicting item that exists on the server. If type is set to uuid_conflict, then server_item should be set to null and unsaved_item should be set to the conflicting item sent by the client.
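
Serialized with serde, one entry of that array could be represented roughly like this; the struct and field names other than the JSON keys are my own.

use serde::Serialize;
use serde_json::Value;

// One entry of the `conflicts` array in the /items/sync response.
#[derive(Serialize)]
struct ConflictEntry {
    #[serde(rename = "type")]
    conflict_type: String,       // "sync_conflict" or "uuid_conflict"
    unsaved_item: Option<Value>, // serialized as null for a sync_conflict
    server_item: Option<Value>,  // serialized as null for a uuid_conflict
}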

The distinction between sync_conflict and uuid_conflict is not clear from the official client-side source code. It only says uuid_conflict could happen if a user imports old backups, which is better than nothing but still confusing. It turned out that this distinction is specific to the official server implementation:

The whole reason uuid_conflict exists is a design choice of the official server: it uses the client-provided uuid field as the primary key in its database. This is fine, because collisions can be handled through uuid_conflict, but it can be avoided entirely by not using that field as the primary key. That is exactly what I did in my Rust implementation, although I was not aware of this issue beforehand -- it was only after spending a whole night trying to understand why uuid_conflict exists at all that I realized its purpose.
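
Put as a sketch, the decision in an official-style server (one that uses uuid as the primary key) boils down to roughly the following when an incoming item is processed. The names and exact conditions are my own simplification, not the official code.

// Minimal item representation for the purpose of this sketch.
struct Item {
    owner: u64,         // which user the stored copy belongs to
    last_modified: u64, // when this copy was last changed
}

enum Conflict {
    // The uuid is already taken by an item we cannot simply overwrite --
    // with uuid as the primary key, typically an item owned by another user.
    Uuid,
    // The item belongs to this user, but the server's copy has changed since
    // the version the client is trying to save.
    Sync,
}

fn check_conflict(client_copy: &Item, server_copy: Option<&Item>, user: u64) -> Option<Conflict> {
    match server_copy {
        None => None, // brand-new item, nothing to conflict with
        Some(existing) if existing.owner != user => Some(Conflict::Uuid),
        Some(existing) if existing.last_modified > client_copy.last_modified => Some(Conflict::Sync),
        _ => None,
    }
}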

SFRS, Rust for Standard Notes

The above is about what I can recall from dealing with the protocol. Although I faced a few challenges, I am happy to say that I ended up with something that works, and the protocol itself is relatively simple and concise overall. The source code of my implementation is on GitHub, and I am already dogfooding it to see if there is anything else I missed. That said, I have to warn you that this is still at a very early stage: documentation is still missing (though I think my comments on the synchronization part of the code are better than the official documentation), and I might decide to make breaking changes in case anything critical comes up, but I am pretty confident that the likelihood of such an event is pretty low.

There are just a few things left that I would like to mention. The first is my choice to use a per-user mutex to limit concurrent calls to /items/sync to one and only one per user. Although I have simulated and examined some possible scenarios in my head, I was afraid that, due to the non-atomicity of the synchronization operation, unexpected things could happen if two parallel synchronizations were processed in just the right order. This should be fine for most users, as I do not believe anybody will be synchronizing from a bazillion devices at the same time, which is what it would take to notice any performance hit caused by the mutex.
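
The idea is roughly the following -- a sketch of the concept rather than the actual SFRS code: one mutex per user, handed out from a shared map, so that two /items/sync requests from the same user are serialized while different users never block each other.

use std::collections::HashMap;
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct UserLocks {
    locks: Mutex<HashMap<u64, Arc<Mutex<()>>>>,
}

impl UserLocks {
    // Hand out the shared mutex belonging to one user, creating it on demand.
    fn for_user(&self, user_id: u64) -> Arc<Mutex<()>> {
        let mut map = self.locks.lock().unwrap();
        map.entry(user_id)
            .or_insert_with(|| Arc::new(Mutex::new(())))
            .clone()
    }
}

// Inside the /items/sync handler, the whole operation runs while holding
// the guard:
//     let lock = user_locks.for_user(current_user);
//     let _guard = lock.lock().unwrap();
//     // read items, detect conflicts, write changes, compute tokens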

Also, I have opted to use the Rocket web framework to write this implementation. This is far from the optimal choice, since Rocket does not support async/await yet (though I can see great progress in that direction, and it looks like support is coming very soon) and uses a thread pool to handle connections. However, I am really attracted by the API design of Rocket: it is very elegant and abstracts away a lot of the verbosity of Rust in web development. Considering that I will need a thread pool (or a queue on another thread) anyway to handle SQLite transactions, the performance hit should not be too severe, and I never intended SFRS to be something suitable for a million simultaneous users anyway.