(core) Add documentation to grist-core

Summary: Adds some documentation about Grist's components and infrastructure.

Test Plan: N/A

Reviewers: paulfitz

Reviewed By: paulfitz

Subscribers: jarek

Differential Revision: https://phab.getgrist.com/D3941
George Gevoian 2023-07-14 17:21:03 -04:00
parent a26eef05b0
commit e208f827af
5 changed files with 325 additions and 1 deletions


@ -130,5 +130,7 @@ Check out this repository: https://github.com/gristlabs/grist-widget#readme
## Documentation
Some documentation to help you start developing:
- [Grainjs](https://github.com/gristlabs/grainjs/) (The library used to build the DOM)
- [Overview of Grist Components](./overview.md)
- [GrainJS & Grist Front-End Libraries](./grainjs.md)
- [GrainJS Documentation](https://github.com/gristlabs/grainjs/) (The library used to build the DOM)
- [The user support documentation](https://support.getgrist.com/)

documentation/grainjs.md Normal file

@ -0,0 +1,119 @@
# GrainJS & Grist Front-End Libraries
When we began working on Grist, we chose to build DOM using pure JavaScript, and used Knockout.js to tie DOM elements and properties to variables, called “observables”. This allowed us to describe the DOM structure in one place, using JS, and to keep the dynamic aspects of it separated into observables. These observables served as the model of the UI; other code could update these observables to cause the UI to update, without knowing the details of the DOM construction.
Over time, we used the lessons we learned to make a new library implementing these same ideas, which we called GrainJS. It is open-source, written in TypeScript, and available at https://github.com/gristlabs/grainjs.
## [GrainJS documentation](https://github.com/gristlabs/grainjs#documentation)
GrainJS documentation is available at https://github.com/gristlabs/grainjs#documentation. It's the best place to start, since most Grist code is now based on GrainJS, and new code should be written using it too.
## Older Grist Code
Before GrainJS, Grist code was based on a combination of Knockout and custom DOM-building functions.
### Knockout Observables
You can find the full Knockout documentation at https://knockoutjs.com/documentation/introduction.html, but you shouldn't need it. If you've read the GrainJS documentation, here are the main differences.
Creating and using observables:
```
import * as ko from 'knockout';
const kObs = ko.observable(17);
kObs(); // Returns 17
kObs(8);
kObs(); // Returns 8
kObs.peek(); // Returns 8
```
```
import {Computed, Observable} from 'grainjs';
const gObs = Observable.create(null, 17);
gObs.get(); // Returns 17
gObs.set(8);
gObs.get(); // Returns 8
```
Creating and using computed observables:
```
ko.computed(() => kObs() * 10);
```
```
Computed.create(null, use => use(gObs) * 10);
```
Note that in Knockout, the dependency on `kObs()` is created implicitly, because `kObs()` was called in the context of the computed's callback. In the case of GrainJS, the dependency is created because the `gObs` observable was examined using the callback's `use()` function.
In Knockout, the `.peek()` method allows looking at an observable's value without creating any dependency. So technically, `kObs.peek()` is the equivalent of `gObs.get()`.
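To make the difference between implicit and explicit dependency tracking concrete, here is a minimal self-contained sketch. It uses neither Knockout nor GrainJS: `ToyObservable`, `implicitRead`, `implicitComputed`, and `explicitComputed` are hypothetical names invented for illustration, and the real libraries are considerably more careful (for example about unsubscribing and disposal).

```typescript
// Toy observable used only to illustrate dependency tracking.
// This is NOT Knockout or GrainJS code; all names here are hypothetical.
type Listener = () => void;

class ToyObservable<T> {
  private value: T;
  private listeners = new Set<Listener>();
  constructor(initial: T) { this.value = initial; }

  // Reads the value without registering any dependency.
  peek(): T { return this.value; }

  set(value: T): void {
    this.value = value;
    // Iterate over a copy, since listeners may re-subscribe while running.
    for (const listener of [...this.listeners]) { listener(); }
  }

  subscribe(listener: Listener): void { this.listeners.add(listener); }
}

// Implicit style: a read made while a computed is evaluating registers the
// dependency through a shared "currently evaluating" slot.
let activeListener: Listener | null = null;

function implicitRead<T>(obs: ToyObservable<T>): T {
  if (activeListener) { obs.subscribe(activeListener); }
  return obs.peek();
}

function implicitComputed<T>(compute: () => T): () => T {
  const evaluate = (): T => {
    const previous = activeListener;
    activeListener = recompute;
    try { return compute(); } finally { activeListener = previous; }
  };
  const recompute = () => { cached = evaluate(); };
  let cached = evaluate();
  return () => cached;
}

// Explicit style: only observables passed to use() become dependencies.
type Use = <T>(obs: ToyObservable<T>) => T;

function explicitComputed<T>(compute: (use: Use) => T): () => T {
  const use: Use = (obs) => { obs.subscribe(recompute); return obs.peek(); };
  const recompute = () => { cached = compute(use); };
  let cached = compute(use);
  return () => cached;
}

const source = new ToyObservable(17);
const viaImplicit = implicitComputed(() => implicitRead(source) * 10);
const viaExplicit = explicitComputed((use) => use(source) * 10);
// Reading with peek() creates no dependency, so this one never recomputes.
const untracked = explicitComputed(() => source.peek() * 10);

source.set(8);
```

After `source.set(8)`, both tracked computeds reflect the new value, while `untracked` keeps the value it computed at creation, which is the behavior the `peek()` and `get()` comparison above describes.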
### Building DOM
Older Grist code builds DOM using the `dom()` function defined in `app/client/lib/dom.js`. It is entirely analogous to [dom() in GrainJS](https://github.com/gristlabs/grainjs/blob/master/docs/basics.md#dom-construction).
The method `dom.on('click', (ev) => { ... })` allows attaching an event listener during DOM construction. It is similar to the same-named method in GrainJS ([dom.on](https://github.com/gristlabs/grainjs/blob/master/docs/basics.md#dom-events)), but is actually implemented using jQuery.
The methods `dom.onDispose` and `dom.autoDispose` are analogous to their GrainJS counterparts, but rely on Knockout's cleanup.
For DOM bindings, which allow tying DOM properties to observable values, there is the `app/client/lib/koDom.js` module. For example:
```
import * as dom from 'app/client/lib/dom';
import * as kd from 'app/client/lib/koDom';
// isActiveObs and vm.nameObs are assumed to be Knockout observables defined elsewhere.
dom(
  'div',
  kd.toggleClass('active', isActiveObs),
  kd.text(() => vm.nameObs().toUpperCase()),
);
```
Note that `koDom` methods work only with Knockout observables. Most dom-methods are very similar to GrainJS, but there are a few differences.
In place of GrainJS's `dom.cls`, older code uses `kd.toggleClass` to toggle a constant class name, and `kd.cssClass` to set a class named by an observable value.
What GrainJS calls `dom.domComputed` is called `kd.scope` in older code; and `dom.forEach` is called `kd.foreach` (all lowercase).
Observable arrays, primarily needed for `kd.foreach`, are implemented in `app/client/lib/koArray.js`. There is an assortment of tools around them, not particularly well organized.
### Old Disposables
We had to dispose resources before GrainJS, and the tools to simplify that live in `app/client/lib/dispose.js`. In particular, it provides a `Disposable` class, with a similar `this.autoDispose()` method to that of GrainJS.
What GrainJS calls `this.onDispose()` is called `this.autoDisposeCallback()` in older code.
The older `Disposable` class also provides a static `create()` method, but that one does NOT take an `owner` as the first argument, as it pre-dates that idea. This makes it awkward to use classes that extend the older and the newer `Disposable` side by side.
### Saving Observables
The module `app/client/models/modelUtil.js` provides some very Grist-specific tools that don't exist in GrainJS at all. In particular, it allows extending observables (regular or computed) with something it calls a “save interface”: `addSaveInterface(observable, saveFunc)` adds the following methods to an observable:
* `.saveOnly(value)` — calls `saveFunc(value)`.
* `.save()` — calls `saveFunc(obs.peek())`.
* `.setAndSave(value)` — calls `obs(value); saveFunc(value)`.
In practice, these are used for observables that represent pieces of data in a Grist document, such as metadata values or cells in user tables. In these cases, `saveFunc` is arranged to send a UserAction to Grist to update the stored value in the document.
This should help you understand what you see, and you may use it in new code if it uses existing old-style “saveable” observables. But in new code, there is no reason to package up this functionality with an observable. For example, if some UI component allows changing a value, have it accept a callback to call with the new value. Depending on what you need, this callback could set an observable, or it could send an action to the server.
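As a rough illustration of the three save-interface methods listed above, here is a self-contained sketch. It does not use Knockout or the real `modelUtil.js`: `toyObservable` and `addToySaveInterface` are hypothetical stand-ins, and the save function is assumed to be synchronous to keep the example short.

```typescript
// Toy stand-in for a Knockout observable: call with no argument to read,
// with one argument to write. All names here are hypothetical.
interface ToyKoObservable<T> {
  (): T;
  (value: T): void;
  peek(): T;
}

function toyObservable<T>(initial: T): ToyKoObservable<T> {
  let current = initial;
  function access(): T;
  function access(value: T): void;
  function access(...args: [] | [T]): T | void {
    if (args.length === 0) { return current; }
    current = args[0];
  }
  return Object.assign(access, {peek: () => current});
}

interface SaveInterface<T> {
  saveOnly(value: T): void;
  save(): void;
  setAndSave(value: T): void;
}

// Mirrors the three methods described above; saveFunc is assumed to be
// synchronous here, which may differ from the real module.
function addToySaveInterface<T>(
  obs: ToyKoObservable<T>, saveFunc: (value: T) => void
): ToyKoObservable<T> & SaveInterface<T> {
  return Object.assign(obs, {
    saveOnly: (value: T) => saveFunc(value),
    save: () => saveFunc(obs.peek()),
    setAndSave: (value: T) => { obs(value); saveFunc(value); },
  });
}

// Stand-in for "send a UserAction to the server": record what would be sent.
const sentToServer: string[] = [];
const title = addToySaveInterface(toyObservable('Draft'), (value) => { sentToServer.push(value); });

title('Edited locally');         // Changes the observable only.
title.save();                    // Saves the current value.
title.saveOnly('Server only');   // Saves without touching the observable.
const afterSaveOnly = title.peek();
title.setAndSave('Final');       // Sets the observable, then saves.
```

In new code, the same effect is usually simpler to get by passing the component a plain callback, as recommended above.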
### DocModel
The metadata of a Grist document, which drives the UI of the Grist application, is organized into a `DocModel`, which contains tables, each table with rows, and each row with a set of observables for each field:
* `DocModel` — in `app/client/models/DocModel`
* `MetaTableModel` — in `app/client/models/MetaTableModel` (for metadata tables, which Grist frontend understands and uses)
* `MetaRowModel` — in `app/client/models/MetaRowModel`. These have particular typed fields, and are enhanced with helpful computeds, according to the table to which they belong, using classes in `app/client/models/entities`.
* `DataTableModel` — in `app/client/models/DataTableModel` (for user-data tables, which Grist can only treat generically)
* `DataRowModel` — in `app/client/models/DataRowModel`.
* `BaseRowModel` — base class for `MetaRowModel` and `DataRowModel`.
A RowModel contains an observable for each field. As long as there is old-style code that uses these observables, they all remain Knockout observables.
Note that new code can use these Knockout observables fairly seamlessly. For instance, a Knockout observable can be used with GrainJS dom-methods, or as a dependency of a GrainJS computed.
Eventually, it would be nice to convert old-style code to use the newer libraries (and convert to TypeScript in the process), and to drop the need for old-style code entirely.


documentation/overview.md Normal file

@ -0,0 +1,203 @@
# Overview of Grist Components
## Example Setup
Grist can be run as a single server, or as a composition of several. Here we describe a scalable setup used by Grist Labs. A single server would work fine for most individual organizations running Grist, but the concepts are still useful when working on the codebase.
![Infrastructure](images/infrastructure.png)
Grist Labs runs Grist as two primary kinds of servers: Home Servers and Doc Workers.
* **Home Servers:** handle most user requests around documents, such as listing documents, checking access and sharing, handling API requests (forwarding to Doc Worker if needed), and more.
* **Doc Workers:** responsible for in-document interactions. Each open document is assigned to a single Doc Worker. This Doc Worker has a local copy of the corresponding SQLite file containing the document data, and spawns a sandboxed Python interpreter responsible for evaluating formulas in the document (the “data engine”).
For load balancing, an Application Load Balancer (ALB) is used.
* **ALB:** Application Load Balancer handles SSL, and forwards HTTP requests to Home Servers and in some cases (mainly for websockets) to Doc Workers.
For storage, Home Servers and Doc Workers rely on HomeDB, Redis, and S3.
* **Home DB:** Postgres database (AWS RDS) containing information about users, orgs, workspaces, documents, sharing, and billing.
* **S3:** used for long-term storage of documents. It is primarily used by Doc Workers, which fetch SQLite files from S3 when a doc gets opened, and sync back SQLite files regularly as the document is modified by the user. Grist also supports other S3-compatible stores through [MinIO](https://support.getgrist.com/install/cloud-storage/#s3-compatible-stores-via-minio-client).
* **Redis:** Redis instance (AWS ElastiCache) keeps track of which Doc Workers are available, and which open documents are assigned to which Doc Workers.
### Home Servers
Home Servers have just one Node process. They communicate with HomeDB, Redis, and Doc Workers. They don't open documents or start subprocesses. Browser clients interact with them using plain HTTP requests.
Home Servers also serve static files (e.g. the JS bundle with the entirety of the client-side app), and can be configured to serve assets from an external source (e.g. AWS CloudFront and S3).
### Doc Workers
Doc Workers deal with documents. They bring documents (SQLite files) from S3 to local disk, start a sandboxed Python interpreter for each one, and communicate with it to apply changes to a document. Browsers interact with Doc Workers directly via websockets. When browsers make doc-specific HTTP API requests, those get forwarded to Doc Workers via Home Servers.
![Doc Worker](images/doc-worker.png)
## Loading Documents
When a user opens a document, the Home Server is responsible for picking a Doc Worker. Once a document is assigned to a Doc Worker, other users (browser tabs) opening the same document will be serviced by the same worker.
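The "sticky" assignment described above can be sketched as follows. This is only an illustration: `ToyWorkerRegistry` is a hypothetical in-memory stand-in for the assignment state that this overview says is kept in Redis, and choosing the least-loaded worker is an assumption, since the selection policy is not described here.

```typescript
// Hypothetical in-memory stand-in for the doc-to-worker assignment state.
class ToyWorkerRegistry {
  private workers: string[] = [];
  private assignments = new Map<string, string>();  // docId -> workerId

  addWorker(workerId: string): void { this.workers.push(workerId); }

  // Returns the worker already serving docId, or assigns a new one.
  getOrAssign(docId: string): string {
    const existing = this.assignments.get(docId);
    if (existing !== undefined) { return existing; }

    // Assumed policy: pick the worker with the fewest assigned documents.
    const load = new Map<string, number>();
    for (const worker of this.workers) { load.set(worker, 0); }
    for (const worker of this.assignments.values()) {
      load.set(worker, (load.get(worker) ?? 0) + 1);
    }
    let chosen: string | undefined;
    for (const worker of this.workers) {
      if (chosen === undefined || (load.get(worker) ?? 0) < (load.get(chosen) ?? 0)) {
        chosen = worker;
      }
    }
    if (chosen === undefined) { throw new Error('no Doc Workers available'); }
    this.assignments.set(docId, chosen);
    return chosen;
  }
}

const registry = new ToyWorkerRegistry();
registry.addWorker('worker-a');
registry.addWorker('worker-b');

const firstOpen = registry.getOrAssign('doc-1');    // A worker gets picked.
const secondOpen = registry.getOrAssign('doc-1');   // Same doc, same worker.
const otherDoc = registry.getOrAssign('doc-2');     // Goes to the idle worker.
```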
When a Doc Worker is assigned a document, it brings a copy of the backing SQLite file (also known as `.grist` file) from S3 to local disk. It instantiates an ActiveDoc object in Node, which connects various components while the document is open.
On open, Doc Worker starts up a data engine process. This is a Python process, executed in a sandboxed environment because it runs user formulas (which can be arbitrary Python code). This process remains active as long as the doc is open on this Doc Worker.
On open, all document data gets read from the SQLite file and loaded into the Python data engine. The data engine doesn't have direct access to the SQLite file. (Loading data fully into memory is limiting and not great, but that's what happens today. There is one exception: tables marked as [“on-demand” tables](https://support.getgrist.com/on-demand-tables/) are not loaded into the data engine.)
A typical client, from Grist's point of view, is a browser tab or an API client. A browser tab differs from an API client by communicating via a websocket, and receiving updates when data changes. Calls from either kind of client are translated into method calls on [ActiveDoc](#server-side).
When a doc is opened, the browser requests the full content of metadata tables (these are all the tables starting with the `_grist_` prefix). Other tables are fetched in full when needed to display data. Once fetched, data is maintained in memory in the browser using the updates that come via websocket. In practice, this means that data is fully in memory, once in the Python data engine, and also in the JavaScript environment of each tab that has the document open. (Again, on-demand tables are an exception.)
## Changes to documents
A user-initiated change to a document is sent to the server as a **User Action**. Most of these are simple data changes such as `UpdateRecord`, `AddRecord`, `RenameTable`, etc. There are some fancier ones like `CreateViewSection`. Here is a typical life of a User Action:
* Created in the frontend client as the effect of some actual action by the user, like typing into a cell.
* Sent from the client to Node via WebSocket.
* Forwarded from Node to Python data engine.
* Converted by Python data engine into a series of **Doc Actions**. Doc Actions are only simple data or schema changes. These include results of formula calculations. These Doc Actions update the in-memory representation of the doc inside the data engine, and get returned to Node.
* Node translates Doc Actions to SQL to update the local SQLite file. (It also periodically syncs this file to S3.)
* Node forwards the Doc Actions to all browsers connected via websocket, including the client that sent the action originally. All clients update their in-memory representation of the doc using these Doc Actions.
* Node responds to the original client with return value of the action (e.g. rowId of an added record).
The authoritative list of available User Actions is the list of all the methods in `sandbox/grist/useractions.py` with the `@useraction` decorator.
Doc Actions are handled both in Python and Node. Here is the full list of Doc Actions:
```
// Data Actions
export type AddRecord = ['AddRecord', string, number, ColValues];
export type BulkAddRecord = ['BulkAddRecord', string, number[], BulkColValues];
export type RemoveRecord = ['RemoveRecord', string, number];
export type BulkRemoveRecord = ['BulkRemoveRecord', string, number[]];
export type UpdateRecord = ['UpdateRecord', string, number, ColValues];
export type BulkUpdateRecord = ['BulkUpdateRecord', string, number[], BulkColValues];
export type ReplaceTableData = ['ReplaceTableData', string, number[], BulkColValues];
// This is the format in which data comes when we fetch a table from the sandbox.
export type TableDataAction = ['TableData', string, number[], BulkColValues];
// Schema Actions
export type AddColumn = ['AddColumn', string, string, ColInfo];
export type RemoveColumn = ['RemoveColumn', string, string];
export type RenameColumn = ['RenameColumn', string, string, string];
export type ModifyColumn = ['ModifyColumn', string, string, ColInfo];
export type AddTable = ['AddTable', string, ColInfoWithId[]];
export type RemoveTable = ['RemoveTable', string];
export type RenameTable = ['RenameTable', string, string];
```
Data actions take a numeric `rowId` (or a list of them, for “bulk” actions) and a set of values:
```
export interface ColValues { [colId: string]: CellValue; }
export interface BulkColValues { [colId: string]: CellValue[]; }
```
In the case of “bulk” actions, note that the values are column-oriented.
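A small sketch may help show what column-oriented means in practice. The `People` table, its columns, and the helper `toRowValues` are hypothetical, and `CellValue` is simplified here to plain scalars, which is an assumption rather than the real definition.

```typescript
// Simplified assumption: the real CellValue type may allow more shapes.
type CellValue = number | string | boolean | null;
interface ColValues { [colId: string]: CellValue; }
interface BulkColValues { [colId: string]: CellValue[]; }
type BulkAddRecord = ['BulkAddRecord', string, number[], BulkColValues];

// Each column holds one value per rowId, in the same order as the rowIds.
const action: BulkAddRecord = ['BulkAddRecord', 'People', [1, 2, 3], {
  Name: ['Ann', 'Bob', 'Cy'],
  Age: [31, 45, 27],
}];

// Hypothetical helper: turns column-oriented values into one ColValues per row.
function toRowValues(
  rowIds: number[], columns: BulkColValues
): Array<{id: number, values: ColValues}> {
  return rowIds.map((id, index) => {
    const values: ColValues = {};
    for (const colId of Object.keys(columns)) {
      values[colId] = columns[colId][index];
    }
    return {id, values};
  });
}

const [, , rowIds, columns] = action;
const rows = toRowValues(rowIds, columns);
// rows[1] is {id: 2, values: {Name: 'Bob', Age: 45}}
```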
Note also that all Doc Actions are themselves valid User Actions, i.e. User Actions are a superset of Doc Actions. User Actions, however, are less strict. For example, the User Action `AddRecord` is typically used with a rowId of `null`; on processing it, the data engine picks the next unused rowId, and produces a similar `AddRecord` Doc Action, with an actual number for rowId.
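The `AddRecord` example can be sketched like this. The function `resolveAddRecord` is hypothetical, the real logic lives in the Python data engine, and "next unused rowId" is assumed here to mean one more than the largest existing rowId.

```typescript
// Simplified assumption: the real CellValue type may allow more shapes.
type CellValue = number | string | boolean | null;
interface ColValues { [colId: string]: CellValue; }

// The User Action form tolerates a null rowId; the Doc Action form does not.
type AddRecordUserAction = ['AddRecord', string, number | null, ColValues];
type AddRecordDocAction = ['AddRecord', string, number, ColValues];

// Hypothetical helper showing how a null rowId could be resolved.
function resolveAddRecord(
  action: AddRecordUserAction, usedRowIds: Set<number>
): AddRecordDocAction {
  const [name, tableId, requestedId, values] = action;
  let rowId = requestedId;
  if (rowId === null) {
    // Assumption: the next unused id is one past the largest id in use.
    rowId = Math.max(0, ...usedRowIds) + 1;
  }
  usedRowIds.add(rowId);
  return [name, tableId, rowId, values];
}

const usedRowIds = new Set([1, 2, 5]);
const docAction = resolveAddRecord(['AddRecord', 'People', null, {Name: 'Dee'}], usedRowIds);
// docAction is ['AddRecord', 'People', 6, {Name: 'Dee'}]
const explicit = resolveAddRecord(['AddRecord', 'People', 9, {Name: 'Eli'}], usedRowIds);
// An explicit rowId is passed through unchanged.
```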
## Codebase Overview
### Server Side
Most server code lives in **`app/server`**.
* `app/server/lib/`**`FlexServer.ts`**
Sets up Express endpoints and initializes all other components to run the home server or doc worker or to serve static files. Hence the “Flex” in the name. The home servers and doc workers run using the same code, and parameters and environment variables determine which type of server it will be. It's possible to run all servers in the same process.
* `app/server/lib/`**`ActiveDoc.ts`**
The central dispatcher for everything related to an open document — it connects NSandbox, DocStorage, GranularAccess components (described below), as well as connected clients, and shuttles user actions and doc actions between them.
* `app/server/lib/`**`GranularAccess.ts`**
Responsible for granular access control. It checks user actions coming from clients before they are sent to the data engine, then again after the data engine translates them (reversing them if needed), and filters what gets sent to the clients based on what they have permission to see.
* `app/server/lib/`**`NSandbox.ts`**
Starts a subprocess with a sandboxed Python process running the data engine, and sets up pipes to and from it to allow making RPC-like calls to the data engine.
* `app/server/lib/`**`DocStorage.ts`**
Responsible for storing Grist data in a SQLite file. It satisfies fetch requests by retrieving data from SQLite, and knows how to translate every Doc Action into suitable SQL updates.
* `app/server/lib/`**`HostedStorageManager.ts`**
Responsible for getting files to and from storage, syncing docs to S3 (or an S3-compatible store) when they change locally, and creating and pruning snapshots.
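To give a feel for the Doc Action to SQL translation attributed to `DocStorage` above, here is a minimal sketch for two record-level actions. It is not the actual implementation: `toSql`, `quoteIdent`, and `SqlStatement` are hypothetical names, and the sketch assumes each user table has an `id` column holding the rowId.

```typescript
// Simplified assumption: the real CellValue type may allow more shapes.
type CellValue = number | string | boolean | null;
interface ColValues { [colId: string]: CellValue; }
type UpdateRecord = ['UpdateRecord', string, number, ColValues];
type RemoveRecord = ['RemoveRecord', string, number];

interface SqlStatement { sql: string; params: CellValue[]; }

// Quotes an identifier for SQLite by doubling any embedded double quotes.
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Hypothetical translation of a Doc Action into a parameterized statement.
function toSql(action: UpdateRecord | RemoveRecord): SqlStatement {
  if (action[0] === 'RemoveRecord') {
    return {sql: `DELETE FROM ${quoteIdent(action[1])} WHERE id = ?`, params: [action[2]]};
  }
  const [, tableId, rowId, values] = action;
  const colIds = Object.keys(values);
  if (colIds.length === 0) { throw new Error('UpdateRecord has no values to set'); }
  const assignments = colIds.map((colId) => `${quoteIdent(colId)} = ?`).join(', ');
  return {
    sql: `UPDATE ${quoteIdent(tableId)} SET ${assignments} WHERE id = ?`,
    // Values are passed as parameters, never spliced into the SQL text.
    params: [...colIds.map((colId) => values[colId]), rowId],
  };
}

const update = toSql(['UpdateRecord', 'People', 2, {Name: 'Bob', Age: 46}]);
const remove = toSql(['RemoveRecord', 'People', 1]);
```

Here `update.sql` is `UPDATE "People" SET "Name" = ?, "Age" = ? WHERE id = ?` with parameters `['Bob', 46, 2]`.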
Some code related to the Home DB lives in **`app/gen-server`**.
* `app/gen-server/lib/`**`HomeDBManager.ts`**
Responsible for dealing with HomeDB: it handles everything related to sharing, as well as finding, listing, updating docs, workspaces, and orgs (aka “team sites”). It also handles authorization needs — checking what objects a user is allowed to access, looking up users by email, etc.
### Common
The **`app/common`** directory contains files that are included both in the server-side, and in the client-side JS bundle. It's an assortment of utilities, libraries, and typings.
* `app/common/`**`TableData.ts`**
Maintains data of a Grist table in memory, and knows how to apply Doc Actions to it to keep it up-to-date.
* `app/common/`**`DocData.ts`**
Maintains a set of TableData objects, in other words all data for a Grist document, including the logic for applying Doc Actions to keep the in-memory data up-to-date.
* `app/common/`**`gutil.ts`**
Assorted functions and helpers like `removePrefix`, `countIf`, or `sortedIndex`.
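The idea behind `TableData`, keeping an in-memory table up to date by applying Doc Actions, can be sketched as below. `ToyTableData` is a hypothetical, row-oriented simplification that handles only three action types; the real class may store data differently and supports the full list of Doc Actions.

```typescript
// Simplified assumption: the real CellValue type may allow more shapes.
type CellValue = number | string | boolean | null;
interface ColValues { [colId: string]: CellValue; }
type AddRecord = ['AddRecord', string, number, ColValues];
type UpdateRecord = ['UpdateRecord', string, number, ColValues];
type RemoveRecord = ['RemoveRecord', string, number];
type ToyDocAction = AddRecord | UpdateRecord | RemoveRecord;

// Hypothetical, simplified in-memory table.
class ToyTableData {
  readonly tableId: string;
  private rows = new Map<number, ColValues>();
  constructor(tableId: string) { this.tableId = tableId; }

  // Applies one Doc Action; actions addressed to other tables are ignored.
  applyDocAction(action: ToyDocAction): void {
    if (action[1] !== this.tableId) { return; }
    switch (action[0]) {
      case 'AddRecord':
        this.rows.set(action[2], {...action[3]});
        break;
      case 'UpdateRecord': {
        const row = this.rows.get(action[2]);
        if (row) { Object.assign(row, action[3]); }
        break;
      }
      case 'RemoveRecord':
        this.rows.delete(action[2]);
        break;
    }
  }

  getValue(rowId: number, colId: string): CellValue | undefined {
    return this.rows.get(rowId)?.[colId];
  }

  getRowIds(): number[] { return [...this.rows.keys()]; }
}

const people = new ToyTableData('People');
const incoming: ToyDocAction[] = [
  ['AddRecord', 'People', 1, {Name: 'Ann', Age: 31}],
  ['AddRecord', 'People', 2, {Name: 'Bob', Age: 45}],
  ['UpdateRecord', 'People', 2, {Age: 46}],
  ['AddRecord', 'Pets', 1, {Name: 'Rex'}],   // Different table: ignored.
  ['RemoveRecord', 'People', 1],
];
for (const docAction of incoming) { people.applyDocAction(docAction); }
```

The same stream of Doc Actions can drive any number of such copies, which is how the server-side and browser-side representations described earlier stay in step.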
### Client Side
Much of the application is on the browser side. The code for that all lives in `app/client`. It uses some lower-level libraries for working with DOM, specifically GrainJS (https://github.com/gristlabs/grainjs#documentation). Older code uses Knockout and some library files that are essentially a precursor to GrainJS. These live in `app/client/lib`. See also [GrainJS & Grist Front-End Libraries](grainjs.md).
* **`app/client/models`**
Contains modules responsible for maintaining client-side data.
* `app/client/models/`**`TableData.ts`**, `app/client/models/`**`DocData.ts`**
Enhancements of same-named classes in `app/common` (see above) which add some client-side functionality like helpers to send User Actions.
* `app/client/models/`**`DocModel.ts`**
Maintains *observable* data models, for all metadata and user data tables in a document. For metadata tables, the individual records are enhanced to be specific to each type of metadata, using classes in `app/client/models/entities`. For example, `docModel.columns` is a `MetaTableModel` containing records of type `ColumnRec` (from `app/client/models/entities/ColumnRec.ts`) which are derived from `MetaRowModel`.
* `app/client/models/`**`TableModel.js`**
Base class for `MetaTableModel` and `DataTableModel`. It wraps `TableData` to make the data observable, i.e. to make it possible to subscribe to changes in it. This is the basis for how we build most UI.
* `app/client/models/`**`MetaTableModel.js`**
Maintains data for a metadata table, making it available as observable arrays of `MetaRowModel`s. The difference between metadata tables and user tables is that the Grist app knows what's in metadata, and relies on it for its functionality. We also assume that metadata tables are small enough that we can instantiate [observables](https://github.com/gristlabs/grainjs/blob/master/docs/basics.md#observables) for all fields of all rows.
* `app/client/models/`**`DataTableModel.js`**
Maintains data for a user table, making it available as `LazyArrayModel`s (defined in the same file), which are used as the basis for `koDomScrolly` (see `app/client/lib/koDomScrolly.js` below).
* `app/client/models/`**`BaseRowModel.js`**
An observable model for a record (aka row) of a user-data or metadata table. It takes a reference to the containing TableModel, a rowId, and a list of column names, and creates an observable for each field.
* `app/client/models/`**`MetaRowModel.ts`**
Extends BaseRowModel for built-in (metadata) tables. It has an observable for every field, and in addition gets enhanced with various table-specific [computeds](https://github.com/gristlabs/grainjs/blob/master/docs/basics.md#computed-observables) and methods. Each module in `app/client/models/entities/` becomes an extension of a `MetaRowModel`.
* `app/client/models/`**`DataRowModel.ts`**
Extends BaseRowModel for user tables. There are few assumptions we can make about those, so it adds little, and is mainly used for the observables it creates for each field. These observables are extended with a “save interface”, so that calling `field.save()` will translate to sending an action to the server. Note that `BaseRowModel`s are not instantiated for *all* rows of a table, but only for the visible ones. As a table is scrolled, the same `BaseRowModel` gets updated to reflect a new row, so that the associated DOM gets updated rather than rebuilt (and is moved around to where it's expected to be in the scrolled position).
* **`app/client/models/entities/`**
Table-specific extensions of `MetaRowModel`, such as `ColumnRec`, `ViewFieldRec`, `ViewSectionRec`, etc.
* **`app/client/ui`, `app/client/components`, `app/client/ui2018`**
For obscure reasons, client-side components are largely shuffled between these three directories. There isn't a clear rule where to put things, but most new components are placed into `app/client/ui`.
* `app/client/components/`**`GristDoc.ts`**
The hub for everything related to an open document, similar to ActiveDoc on the server side. It contains the objects for communicating with the server and the objects containing the in-memory data, and it knows the currently active page, cursor, etc.
* `app/client/components/`**`GridView.js`**
The component for the most powerful “page widget” we have: the mighty grid. It's one of the oldest pieces of code in Grist. And biggest. In code, we often refer to “page widgets” (like grid) as “view sections”, and sometimes also as just “views” (hence “GridView”).
* `app/client/components/`**`BaseView.js`**
Base component for all page widgets: GridView, DetailView (used for Cards and Card Lists), ChartView, and CustomView. It takes care of setting up various data-related features, such as column filtering and link-filtering, and has some other state and methods shared by different types of page widgets.
* `app/client/components/`**`Comm.ts`** and `app/client/components/`**`DocComm.ts`**
Implement communication with the NodeJS Doc Worker via websocket; specifically, they implement an RPC-like interface, so that client-side code can call methods such as `applyUserActions`.
* `app/client/components/`**`GristWSConnection.ts`**
Implements the lower-level websocket communication, including finding the Doc Worker's address, connecting the websocket, and reconnecting on disconnects.
* `app/client/ui/`**`UserManager.ts`**
Implements the UI component for managing the access of users to a document, workspace, or team site.
* **`app/client/aclui`**
Contains the pieces of the UI component for editing granular access control rules.
* **`app/client/lib`**
Contains lower-level utilities and widgets. Some highlights:
* `app/client/lib/`**`autocomplete.ts`**
The latest of the several autocomplete-like dropdowns we've used. It's what's used for the dropdowns in Reference columns, for example.
* `app/client/lib/`**`TokenField.ts`**
Our own token-field library, used for ChoiceList columns.
* `app/client/lib/`**`dom.js`**, **`koDom.js`**, **`koArray.js`**
Utilities superseded by GrainJS but still used by a bunch of code.
* `app/client/lib/`**`koDomScrolly.js`**
A special beast used for scrolling a very long list of rows by limiting rendering to those that are visible, and trying to reuse the rendered DOM as much as possible. It is a key component of grid and card-list views that allows them to list tens of thousands of rows fairly easily.
* `app/client/widgets`
Contains code for cell widgets, such as `TextBox`, `CheckBox`, or `DateTextBox`, and for the corresponding editors, such as `TextEditor`, `DateEditor`, `ReferenceEditor`, `FormulaEditor`, etc.
* `app/client/widgets/`**`FieldBuilder.ts`**
A FieldBuilder is created for each column to render the cells in it (using a widget like `TextBox`), as well as to render the column-specific configuration UI, and to instantiate the editor for this cell when the user starts to edit it.
* `app/client/widgets/`**`FieldEditor.ts`**
Instantiated when the user starts editing a cell. It creates the actual editor (like `TextEditor`), and takes care of various logic that's shared between editors, such as handling Enter/Escape commands, and actually saving the updated value.
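The row-reuse idea mentioned above for `DataRowModel` and `koDomScrolly` (render only visible rows, and re-point existing row models as the view scrolls) can be sketched roughly as follows. This is not how `koDomScrolly` is implemented: `visibleRange`, `ToyRowPool`, the fixed row height, and the modulo slot assignment are all assumptions made for illustration.

```typescript
// Returns the [start, end) range of row indexes that intersect the viewport,
// assuming every row has the same height.
function visibleRange(
  scrollTop: number, viewportHeight: number, rowHeight: number, rowCount: number
): [number, number] {
  const start = Math.max(0, Math.floor(scrollTop / rowHeight));
  const end = Math.min(rowCount, Math.ceil((scrollTop + viewportHeight) / rowHeight));
  return [start, Math.max(start, end)];
}

interface ToyRowModel { rowIndex: number | null; }

// Hypothetical fixed pool of row models that are re-pointed at visible rows.
class ToyRowPool {
  readonly models: ToyRowModel[] = [];
  constructor(size: number) {
    for (let i = 0; i < size; i++) { this.models.push({rowIndex: null}); }
  }

  // Points pooled models at rows [start, end); returns how many models changed.
  show(start: number, end: number): number {
    if (end - start > this.models.length) { throw new Error('pool too small'); }
    let changed = 0;
    for (let index = start; index < end; index++) {
      // Consecutive indexes map to distinct slots while the range fits the pool.
      const model = this.models[index % this.models.length];
      if (model.rowIndex !== index) { model.rowIndex = index; changed++; }
    }
    return changed;
  }
}

const pool = new ToyRowPool(6);
const initialRange = visibleRange(0, 100, 20, 50000);
const initialChanges = pool.show(initialRange[0], initialRange[1]);
const scrolledRange = visibleRange(20, 100, 20, 50000);
const scrolledChanges = pool.show(scrolledRange[0], scrolledRange[1]);
```

With 50,000 rows, only a handful of models exist at any time, and scrolling by one row re-points a single model instead of rebuilding the whole view.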
### Python Data Engine
User-created formulas are evaluated by Python in a process we call the “data engine”, or the “sandbox” (since it runs in a sandboxed environment). Its job is to evaluate formulas and also keep track of dependencies, so that when a cell changes, all affected formulas can be automatically recalculated.
* **`sandbox/grist/`**
Contains all data engine code.
* `sandbox/grist/`**`engine.py`**
Central class for the document's data engine. It has the implementation of most methods that Node can call, and is responsible for dispatching User Actions, evaluating formulas, and collecting the resulting Doc Actions.
* `sandbox/grist/`**`useractions.py`**
Contains the implementation of all User Actions. Even simple ones require some work (e.g. a user should not manually set values to a formula column). Actions to metadata tables often trigger other work; e.g. updating metadata for a column may produce an additional schema action such as `RenameColumn` for the user table that corresponds to the metadata. Other complex User Actions (such as `CreateViewSection`) are implemented here because it's easier and allows for simple single-step undos.
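The dependency tracking described at the start of this section can be illustrated with a toy model. The real data engine is written in Python and is far more elaborate; this sketch is in TypeScript only to match the other examples, `ToyEngine` and its cell names are hypothetical, and it deliberately ignores cycles, stale dependencies, and redundant recalculation.

```typescript
// A formula reads other cells through the get() function it is given.
type Formula = (get: (cell: string) => number) => number;

// Hypothetical toy engine: records which formulas read which cells, and
// recalculates dependents when a cell changes.
class ToyEngine {
  private values = new Map<string, number>();
  private formulas = new Map<string, Formula>();
  private dependents = new Map<string, Set<string>>();  // cell -> formula cells reading it
  recalcLog: string[] = [];

  setValue(cell: string, value: number): void {
    this.values.set(cell, value);
    this.recalcDependents(cell);
  }

  setFormula(cell: string, formula: Formula): void {
    this.formulas.set(cell, formula);
    this.recalc(cell);
  }

  getValue(cell: string): number { return this.values.get(cell) ?? 0; }

  private recalc(cell: string): void {
    const formula = this.formulas.get(cell);
    if (!formula) { return; }
    const value = formula((dep) => {
      // Each read records that `cell` depends on `dep`.
      let readers = this.dependents.get(dep);
      if (!readers) { readers = new Set(); this.dependents.set(dep, readers); }
      readers.add(cell);
      return this.getValue(dep);
    });
    this.values.set(cell, value);
    this.recalcLog.push(cell);
    this.recalcDependents(cell);
  }

  private recalcDependents(cell: string): void {
    for (const dependent of [...(this.dependents.get(cell) ?? [])]) {
      this.recalc(dependent);
    }
  }
}

const engine = new ToyEngine();
engine.setValue('A1', 2);
engine.setValue('A2', 3);
engine.setFormula('B1', (get) => get('A1') + get('A2'));
engine.setFormula('C1', (get) => get('B1') * 10);
engine.recalcLog.length = 0;   // Only log what the next change triggers.
engine.setValue('A1', 7);
```

Changing `A1` recalculates `B1` and then `C1`, in that order, without the caller naming either of them.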