Commit Graph

10 Commits

Author SHA1 Message Date
Alex Hall
ec8460b772 (core) Prune snapshots outside the window in product features
Summary:
- Add a method `getSnapshotWindow` to `IInventory` and `DocSnapshotInventory`. It returns a `SnapshotWindow`, which represents a duration of time for which we keep backups for a particular document.
- `DocSnapshotPruner` calls this method and passes the window to `shouldKeepSnapshots` to determine which document versions have fallen outside the window and should be pruned.
- The implementation passed to `DocSnapshotInventory` uses a new method `getDocProduct` in `HomeDBManager` which directly returns the `Product` associated with a document, given only the document ID. Other methods in `HomeDBManager` require passing more information, especially about a user, but `DocSnapshotPruner` only knows about document IDs.

Test Plan: Added a test for `getDocProduct` and a test for `DocSnapshotPruner` where `getSnapshotWindow` is specified.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3322
2022-03-18 18:48:14 +02:00
Paul Fitzpatrick
c4d3d7d3bb (core) be careful when reassigning a doc to a worker it was on before
Summary:
Importing a .grist document is implemented in a somewhat clunky way, in a multi-worker setup.

 * First a random worker receives the upload, and updates Grist's various stores appropriately (database, redis, s3).
 * Then a random worker is assigned to serve the document.

If the worker serving the document fails, there is a chance the it will end up assigned to the worker that handled its upload. Currently the worker will misbehave in this case. This diff:

 * Ports a multi-worker test from test/home to run in test/s3, and adds a test simulating a bad scenario seen in the wild.
 * Fixes persistence of any existing document checksum in redis when a worker is assigned.
 * Adds a check when assigned a document to serve, and finding that document already cached locally. It isn't safe to rely only on the document checksum in redis, since that may have expired.
 * Explicitly claims the document on the uploading worker, so this situation becomes even less likely to arise.

Test Plan: added test

Reviewers: dsagal

Reviewed By: dsagal

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3305
2022-03-08 17:20:01 -05:00
Edward Betts
d6e0e1fee3 Correct spelling mistakes 2022-02-19 09:46:49 +00:00
Paul Fitzpatrick
438f259687 (core) start reconciling forking with granular access
Summary:
This allows a fork to be made by a user if:
 * That user is an owner of the document being forked, or
 * That user has full read access to the document being forked.

The bulk of the diff is reorganization of how forking is done.  ActiveDoc.fork is now responsible for creating a fork, not just a docId/urlId for the fork. Since fork creation should not be limited to the doc worker hosting the trunk, a helper endpoint is added for placing the fork.

The change required sanitizing worker allocation a bit, and allowed session knowledge to be removed from HostedStorageManager.

Test Plan: Added test; existing tests pass.

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2700
2021-01-12 14:08:49 -05:00
Paul Fitzpatrick
d5b00f5169 (core) add explicit doc and inventory creation step
Summary:
Currently, if a document is created by importing a file, inventory
creation is a little haphazard - it works, but triggers a
"surprise" message.  This diff makes initialization of inventory
explicit, so that surprise messages shouldn't happen during
document creation.

Test Plan: manual

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2696
2020-12-21 11:13:03 -05:00
Paul Fitzpatrick
24e76b4abc (core) add endpoints for clearing snapshots and actions
Summary:
This adds a snapshots/remove and states/remove endpoint, primarily
for maintenance work rather than for the end user.  If some secret
gets into document history, it is useful to be able to purge it
in an orderly way.

Test Plan: added tests

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2694
2020-12-18 13:32:31 -05:00
Paul Fitzpatrick
b1c4af4ee9 (core) correctly delete pruned document versions
Summary:
After switch to using an inventory file, old document versions were
not in fact being pruned.  This corrects that and adds a test
that fails with the previous implementation.

The pruner was operating correctly, but was being applied to an
inventory list rather than s3 directly - and the inventory list
did not pass through version removals to s3.

This fix will leave a stock of undeleted versions that can
be eliminated by an external script (there are alternatives
but that seems simplest overall).

Test Plan: updated test

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2693
2020-12-16 15:11:15 -05:00
Paul Fitzpatrick
e30d0fd5d0 (core) fix sync to s3 when doc is marked as dirty but proves to be clean
Summary:
This fixes a two problems:
 * A mistake in `KeyedMutex.runExclusive`.
 * Logic about saving a document to s3 when the document is found to match what is already there.

`HostedStorageManager.flushDoc` could get caught in a loop if a document was uploaded to s3 and then, without any change to it, marked as dirty.  Low level code would detect there was no change and skip the upload; but then the snapshotId could be unknown, causing an error and retries. This diff fixes that problem by discovering the snapshotId on downloads and tracking it. It also corrects a mutex problem that may have been creating the scenario. A small delay is added to `flushDoc` to mitigate the effect of similar problems in future. Exponential backoff would be good, but `flushDoc` is called in some situations where long delays would negatively impact worker shutdown or user work.

Test Plan: added tests

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2654
2020-11-10 08:12:31 -05:00
Paul Fitzpatrick
71519d9e5c (core) revamp snapshot inventory
Summary:
Deliberate changes:
 * save snapshots to s3 prior to migrations.
 * label migration snapshots in s3 metadata.
 * avoid pruning migration snapshots for a month.

Opportunistic changes:
 * Associate document timezone with snapshots, so pruning can respect timezones.
 * Associate actionHash/Num with snapshots.
 * Record time of last change in snapshots (rather than just s3 upload time, which could be a while later).

This ended up being a biggish change, because there was nowhere ideal to put tags (list of possibilities in diff).

Test Plan: added tests

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2646
2020-10-30 13:52:46 -04:00
Paul Fitzpatrick
5ef889addd (core) move home server into core
Summary: This moves enough server material into core to run a home server.  The data engine is not yet incorporated (though in manual testing it works when ported).

Test Plan: existing tests pass

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2552
2020-07-21 20:39:10 -04:00