Models & Relations
Modeling Data
TrailBase gives you full, untethered access to SQLite, as such data is modeled on top of ISO SQL and SQLite concepts. This means data is organized as rows or records across columns of a table as defined by their schema. Relationships between records are expressed by simply referencing other records via their primary key, which works across table boundaries. Data can then be joined together within the same database at access or query time. If you’re new to SQL and this sounds abstract, don’t worry it will become clear very soon and SQL is an evergreen useful well beyond TrailBase.
One of the main benefits of SQL databases is that you can define your models based on intrinsic properties of the underlying data and their relations, rather than having to worry about how the data is or will be used in the future. Instead, SQL queries let you flexibly define what data is being accessed, how it’s transformed and what’s returned. How data is accessed is never explicitly defined and is derived by the query optimizer. This means, if you discover new use-cases in your data that require combining data in ways that would be slow, you can optimize it after the fact by adding or removing indexes without having to touch the models, the queries, or the downstream code consuming the data 🎉.
Tables, Schemas & Data Types
When creating a new table to hold your data, you define a table schema telling the database what columns there are, what kind of data they contain, and potential constraints. For example1,
This creates a table to hold posts in a blog storing an integer creation timestamp, the author, the title and lastly the contents. We also set it up such that an author deleting their account, will cascade deleting their posts as well.
Coming from other SQL databases, it may come as a surprise that despite the
data types above SQLite isn’t strictly typed by default, e.g. the title
column above may hold values other than TEXT
.
SQLite interprets types merely as affinities to judge how literals and
parameters should be interpreted on insert or or update but will happily accept
other values.
While “flexible”, this has far reaching consequences on downstream code, now
has to deal explicitly with unexpected data types.
We therefore encourage working with STRICT
schemas whenever possible.
In fact, TrailBase APIs are type-safe and thus require the underlying tables to
be STRICT
, i.e.:
At the fundamental storage-level SQLite supports the following data types:
NULL
, INTEGER
, REAL
, TEXT
, BLOB
, which all type affinities boil down
to.
For example, columns with JSON
or JSONB
type affinities are stored as
TEXT
or BLOB
, respectively2.
When defining STRICT
tables there’s no affinity, limiting you to SQLite’s
fundamental types.
However, TrailBase provides you with some extensions to enforce strict schema
compatibility on a sub-column-level, e.g.
Constraints
Using TrailBase you can make use of any of SQLite’s column and table
constraints.
We’ve already encountered some of the former, e.g. NOT NULL
, REFERENCES
or
CHECK
, which all constrain the values a column may contain.
Similarly, one can define more complex table constraints constraining tuples of values, e.g.
For a complete list of constraints, check out the SQLite manual.
Generated Columns
SQLite’s generated columns allow you to either materialized derived columns at modification-time or compute values for virtual columns on the fly. Check out SQLite manual for comprehensive guidance.
Relations
SQL typically distinguishes between three types of relations:
- 1:1 relations, e.g. each user has exactly one user profile.
- 1:M or one-to-many relations, e.g. users may have many blog posts but each post belongs to exactly one user.
- N:M or many-to-many relations, e.g. a blog post may be tagged with many tags, and each tag can be assigned to many posts.
In practice, all relations are simply edges, i.e. tuples of the shape
(parent id, child id)
.
There’s some freedom as to where these edges are stored. In case of 1:1 and 1:M
relations they can be denormalized into the child record effectively becoming
(primary key, foreign key)
.
Alternatively, edges can always be stored in a separate “bridge” table as
(foreign key, foreign key)
edge records.
In case of N:M relations, this is even necessary to achieve the required
cardinality, since foreign keys can only reference a single record.
This is not a limitation of TrailBase but rather common SQL practice.
Linking children to their parents individually2 with foreign keys exposes
the relationship to the database allowing actions like ON DELETE
or ON UPDATE
to propagate.
For example, a user deletion may trigger related data to be deleted
automatically.
Let’s look at the following data schema for a blog example,
Each block represents a table schema. We can see:
- a 1:1 relation between users and user profiles,
- a 1:M relationship between users and posts,
- and an N:M relationship between posts and tags using the
post_tag
bridge table.
We could have implemented the 1:1 user-profile and 1:M user-post relationships via separate bridge tables with appropriate uniqueness constraints, however pulling the parent key into the child record leads to less indirection.
In order to combine related data we can simply join on the keys. For example to get a list of all users with profiles:
To connect posts and tags we have to do two joins across the post_tag
bridge
table:
Record APIs
There’s an elephant in the room: while this is all pretty standard fair, at least in SQL land, connecting relations across TrailBase’s record APIs without joins is a lot more painful. For N:M relations, one would have to expose 3 different APIs including the bridge table, manage their ACLs and then traverse the edges manually on the client side.
This is an area where we’re actively exploring how to expand API capabilities,
however the most general approach is to push more responsibility to the
server.
Concretely, we can expose a single API tailored for specific client use-cases
implementing a server-side join using VIEW
s:
More generally, views can be useful to decouple an API definition from the underlying data model. For example, you may want to restructure your data model or APIs while keeping the other stable.
Alternatively, custom JS/TS handlers can provide a more free-form approach to push joining edges and other responsibilities to the server.
Footnotes
-
This is not meant as full SQL tutorial. We’ll keep it simple so hopefully it is clear from context what any statement intends to do. Note further that SQL keywords are case-insensitive. We’ll use all-caps to highlight them. ↩
-
This also means than one can use SQLite’s builtin JSON operators on any TEXT column as long as it parses as JSON. ↩ ↩2