Commit Graph

18 Commits

Author SHA1 Message Date
Paul Fitzpatrick
05d1cdf140 (core) limit retries of uploads to external store in tests
Summary:
If an external store fails completely, Grist will continue to
retry uploading to it. This diff updated the HostedStorageManager
test to limit the extent of these retries to the test itself -
otherwise they continue for all other tests in the same process,
potentially disrupting those that read logs. There are other tests
that use s3, but they aren't run in the same process with delicate
log-reading tests, and it isn't quite as clear what improvement
to make there.

Test Plan:
artificially made external store fail, and checked that
test contamination seen previously no longer occurs.

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3469
2022-06-06 16:19:41 -04:00
George Gevoian
f48d579f64 (core) Add API endpoint to get site usage summary
Summary:
The summary includes a count of documents that are approaching
limits, in grace period, or delete-only. The endpoint is only accessible
to site owners, and is currently unused. A follow-up diff will add usage
banners to the site home page, which will use the response from the
endpoint to communicate usage information to owners.

Test Plan: Browser and server tests.

Reviewers: alexmojaki

Reviewed By: alexmojaki

Differential Revision: https://phab.getgrist.com/D3420
2022-05-16 11:16:19 -07:00
Paul Fitzpatrick
e6983e9209 (core) add machinery for self-managed flavor of Grist
Summary:
Currently, we have two ways that we deliver Grist. One is grist-core,
which has simple defaults and is relatively easy for third parties to
deploy. The second is our internal build for our SaaS, which is the
opposite. For self-managed Grist, a planned paid on-premise version
of Grist, I adopt the following approach:

 * Use the `grist-core` build mechanism, extending it to accept an
   overlay of extra code if present.
 * Extra code is supplied in a self-contained `ext` directory, with
   an `ext/app` directory that is of same structure as core `app`
   and `stubs/app`.
 * The `ext` directory also contains information about extra
   node dependencies needed beyond that of `grist-core`.
 * The `ext` directory is contained within our monorepo rather than
   `grist-core` since it may contain material not under the Apache
   license.

Docker builds are achieved in our monorepo by using the `--build-context`
functionality to add in `ext` during the regular `grist-core` build:

```
docker buildx build --load -t gristlabs/grist-ee --build-context=ext=../ext .
```

Incremental builds in our monorepo are achieved with the `build_core.sh` helper,
like:

```
buildtools/build_core.sh /tmp/self-managed
cd /tmp/self-managed
yarn start
```

The initial `ext` directory contains material for snapshotting to S3.
If you build the docker image as above, and have S3 access, you can
do something like:

```
docker run -p 8484:8484 --env GRIST_SESSION_SECRET=a-secret \
  --env GRIST_DOCS_S3_BUCKET=grist-docs-test \
  --env GRIST_DOCS_S3_PREFIX=self-managed \
  -v $HOME/.aws:/root/.aws -it gristlabs/grist-ee
```

This will start a version of Grist that is like `grist-core` but with
S3 snapshots enabled. To release this code to `grist-core`, it would
just need to move from `ext/app` to `app` within core.

I tried a lot of ways of organizing self-managed Grist, and this was
what made me happiest. There are a lot of trade-offs, but here is what
I was looking for:

 * Only OSS-code in grist-core. Adding mixed-license material there
   feels unfair to people already working with the repo. That said,
   a possible future is to move away from our private monorepo to
   a public mixed-licence repo, which could have the same relationship
   with grist-core as the monorepo has.
 * Minimal differences between self-managed builds and one of our
   existing builds, ideally hewing as close to grist-core as possible
   for ease of documentation, debugging, and maintenance.
 * Ideally, docker builds without copying files around (the new
   `--build-context` functionality made that possible).
 * Compatibility with monorepo build.

Expressing dependencies of the extra code in `ext` proved tricky to
do in a clean way. Yarn/npm fought me every step of the way - everything
related to optional dependencies was unsatisfactory in some respect.
Yarn2 is flexible but smells like it might be overreach. In the end,
organizing to install non-core dependencies one directory up from the
main build was a good simple trick that saved my bacon.

This diff gets us to the point of building `grist-ee` images conveniently,
but there isn't a public repo people can go look at to see its source. This
could be generated by taking `grist-core`, adding the `ext` directory
to it, and pushing to a distinct repository. I'm not in a hurry to do that,
since a PR to that repo would be hard to sync with our monorepo and
`grist-core`. Also, we don't have any licensing text ready for the `ext`
directory. So leaving that for future work.

Test Plan: manual

Reviewers: georgegevoian, alexmojaki

Reviewed By: georgegevoian, alexmojaki

Differential Revision: https://phab.getgrist.com/D3415
2022-05-12 12:39:52 -04:00
Alex Hall
4408315f2e (core) Add AzureExternalStorage
Summary:
Adds a new implementation of the interface ExternalStorage that works for Azure Blob Storage as an alternative to S3, for a specific self-hosting case.

Tweaks HostedStorageManager and ICreate to allow configuring different core implementations of ExternalStorage.

Followup tasks:

- Make this code available to self hosters, possibly by making it open source.
- Add an env var or other config option to specify the preferred type of storage. Currently using the var `AZURE_STORAGE_CONNECTION_STRING` to know how to connect to Azure when requested, but that choice still only lives in test code.

Test Plan: Generalized HostedStorageManager and ExternalStorage tests to test the new AzureExternalStorage alongside S3ExternalStorage. The HostedStorageManager tests also now test the 'cached' in-memory test storage in a way that's closer to the real storage methods.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3413
2022-05-09 21:44:57 +02:00
Alex Hall
a701b4bf13 (core) Remove expired attachments every hour and on shutdown
Summary:
Call ActiveDoc.removeUnusedAttachments every hour using setInterval, and in ActiveDoc.shutdown (which also clears said interval).

Unrelated: small fix to my webhooks code which was creating a redis client on shutdown just to quit it.

Test Plan:
Tweaked DocApi test to remove expired attachments by force-reloading the doc, so that it removes them during shutdown. Extracted a new testing endpoint /verifyFiles to support this test (previously running that code only happened with `/removeUnused?verifyfiles=1`).

Tested the setInterval part manually.

Reviewers: paulfitz, dsagal

Reviewed By: paulfitz

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3387
2022-04-22 20:43:59 +02:00
Alex Hall
ec8460b772 (core) Prune snapshots outside the window in product features
Summary:
- Add a method `getSnapshotWindow` to `IInventory` and `DocSnapshotInventory`. It returns a `SnapshotWindow`, which represents a duration of time for which we keep backups for a particular document.
- `DocSnapshotPruner` calls this method and passes the window to `shouldKeepSnapshots` to determine which document versions have fallen outside the window and should be pruned.
- The implementation passed to `DocSnapshotInventory` uses a new method `getDocProduct` in `HomeDBManager` which directly returns the `Product` associated with a document, given only the document ID. Other methods in `HomeDBManager` require passing more information, especially about a user, but `DocSnapshotPruner` only knows about document IDs.

Test Plan: Added a test for `getDocProduct` and a test for `DocSnapshotPruner` where `getSnapshotWindow` is specified.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3322
2022-03-18 18:48:14 +02:00
Paul Fitzpatrick
c4d3d7d3bb (core) be careful when reassigning a doc to a worker it was on before
Summary:
Importing a .grist document is implemented in a somewhat clunky way, in a multi-worker setup.

 * First a random worker receives the upload, and updates Grist's various stores appropriately (database, redis, s3).
 * Then a random worker is assigned to serve the document.

If the worker serving the document fails, there is a chance the it will end up assigned to the worker that handled its upload. Currently the worker will misbehave in this case. This diff:

 * Ports a multi-worker test from test/home to run in test/s3, and adds a test simulating a bad scenario seen in the wild.
 * Fixes persistence of any existing document checksum in redis when a worker is assigned.
 * Adds a check when assigned a document to serve, and finding that document already cached locally. It isn't safe to rely only on the document checksum in redis, since that may have expired.
 * Explicitly claims the document on the uploading worker, so this situation becomes even less likely to arise.

Test Plan: added test

Reviewers: dsagal

Reviewed By: dsagal

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3305
2022-03-08 17:20:01 -05:00
Paul Fitzpatrick
a94905dd0a (core) make sure forks with no changes are persisted
Summary:
This fixes a problem where a fork could be created, have no changes
made, and then (e.g. if worker rolled over) fail to open with a
`cannot create fork` error. Adds a test that fails priot to this diff.

Test Plan: added test

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3162
2021-12-01 22:27:56 -05:00
Dmitry S
f2f4fe0eca (core) Add LogMethods helper and use it for more JSON data in logs. Reduce unhelpful logging.
Summary:
- Sharing, Client, DocClients, HostingStorageManager all include available info.
- In HostingStorageManager, log numSteps and maxStepTimeMs, in case that helps
  debug SQLITE_BUSY problem.
- Replace some action-bundle logging with a JSON version aggregating some info.
- Skip logging detailed list of actions in production.

Test Plan: Tested manually by eyeballing log output in dev environment.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3086
2021-10-25 10:25:18 -04:00
Dmitry S
d1c1416d78 (core) Add rules to eslint to better match our coding conventions.
Summary:
We used tslint earlier, and on switching to eslint, some rules were not
transfered. This moves more rules over, for consistent conventions or helpful
warnings.

- Name private members with a leading underscore.
- Prefer interface over a type alias.
- Use consistent spacing around ':' in type annotations.
- Use consistent spacing around braces of code blocks.
- Use semicolons consistently at the ends of statements.
- Use braces around even one-liner blocks, like conditionals and loops.
- Warn about shadowed variables.

Test Plan: Fixed all new warnings. Should be no behavior changes in code.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D2831
2021-05-24 12:56:18 -04:00
Paul Fitzpatrick
0e22716761 (core) uncheck FullCopy special when copying/forking a document
Summary:
When a document has an exception to allow copies,
unset that option on any copies of the document.

Test Plan: added test

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2794
2021-04-29 08:56:54 -04:00
Dmitry S
526b0ad33e (core) Configure more comprehensive eslint rules for Typescript
Summary:
- Update rules to be more like we've had with tslint
- Switch tsserver plugin to eslint (tsserver makes for a much faster way to lint in editors)
- Apply suggested auto-fixes
- Fix all lint errors and warnings in core/, app/, test/

Test Plan: Some behavior may change subtly (e.g. added missing awaits), relying on existing tests to catch problems.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D2785
2021-04-26 18:54:55 -04:00
Paul Fitzpatrick
438f259687 (core) start reconciling forking with granular access
Summary:
This allows a fork to be made by a user if:
 * That user is an owner of the document being forked, or
 * That user has full read access to the document being forked.

The bulk of the diff is reorganization of how forking is done.  ActiveDoc.fork is now responsible for creating a fork, not just a docId/urlId for the fork. Since fork creation should not be limited to the doc worker hosting the trunk, a helper endpoint is added for placing the fork.

The change required sanitizing worker allocation a bit, and allowed session knowledge to be removed from HostedStorageManager.

Test Plan: Added test; existing tests pass.

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2700
2021-01-12 14:08:49 -05:00
Paul Fitzpatrick
d5b00f5169 (core) add explicit doc and inventory creation step
Summary:
Currently, if a document is created by importing a file, inventory
creation is a little haphazard - it works, but triggers a
"surprise" message.  This diff makes initialization of inventory
explicit, so that surprise messages shouldn't happen during
document creation.

Test Plan: manual

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2696
2020-12-21 11:13:03 -05:00
Paul Fitzpatrick
24e76b4abc (core) add endpoints for clearing snapshots and actions
Summary:
This adds a snapshots/remove and states/remove endpoint, primarily
for maintenance work rather than for the end user.  If some secret
gets into document history, it is useful to be able to purge it
in an orderly way.

Test Plan: added tests

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2694
2020-12-18 13:32:31 -05:00
Paul Fitzpatrick
e30d0fd5d0 (core) fix sync to s3 when doc is marked as dirty but proves to be clean
Summary:
This fixes a two problems:
 * A mistake in `KeyedMutex.runExclusive`.
 * Logic about saving a document to s3 when the document is found to match what is already there.

`HostedStorageManager.flushDoc` could get caught in a loop if a document was uploaded to s3 and then, without any change to it, marked as dirty.  Low level code would detect there was no change and skip the upload; but then the snapshotId could be unknown, causing an error and retries. This diff fixes that problem by discovering the snapshotId on downloads and tracking it. It also corrects a mutex problem that may have been creating the scenario. A small delay is added to `flushDoc` to mitigate the effect of similar problems in future. Exponential backoff would be good, but `flushDoc` is called in some situations where long delays would negatively impact worker shutdown or user work.

Test Plan: added tests

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2654
2020-11-10 08:12:31 -05:00
Paul Fitzpatrick
71519d9e5c (core) revamp snapshot inventory
Summary:
Deliberate changes:
 * save snapshots to s3 prior to migrations.
 * label migration snapshots in s3 metadata.
 * avoid pruning migration snapshots for a month.

Opportunistic changes:
 * Associate document timezone with snapshots, so pruning can respect timezones.
 * Associate actionHash/Num with snapshots.
 * Record time of last change in snapshots (rather than just s3 upload time, which could be a while later).

This ended up being a biggish change, because there was nowhere ideal to put tags (list of possibilities in diff).

Test Plan: added tests

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2646
2020-10-30 13:52:46 -04:00
Paul Fitzpatrick
5ef889addd (core) move home server into core
Summary: This moves enough server material into core to run a home server.  The data engine is not yet incorporated (though in manual testing it works when ported).

Test Plan: existing tests pass

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2552
2020-07-21 20:39:10 -04:00