Commit Graph

32 Commits

Author SHA1 Message Date
Paul Fitzpatrick
bfd0fa8c7f
add an endpoint for doing SQL selects (#641)
* add an endpoint for doing SQL selects

This adds an endpoint for doing SQL selects directly on a Grist document. Other kinds of statements are not supported. There is a default timeout of a second on queries.

This follows loosely an API design by Alex Hall.

Co-authored-by: jarek <jaroslaw.sadzinski@gmail.com>
2023-09-04 09:21:18 -04:00
Paul Fitzpatrick
958ea096f3
fix a node-sqlite3-ism that breaks record removal in grist-static (#566)
Grist by default uses node-sqlite3 to manipulate data in an
SQLite database. If a single parameter is passed to `run`
and it is a list, the list is unpacked and its contents treated
as the actual parameters. In grist-static, we use other SQLite
interfaces that don't have that automatic unpacking. Most
calls like this have been removed from Grist, but at least one
was missed, and was causing symptoms such as
https://github.com/gristlabs/grist-static/issues/5

This change should make no difference to regular Grist, but
resolves the grist-static problems.
2023-07-11 05:52:06 -04:00
Paul Fitzpatrick
7be0ee289d
support other SQLite wrappers, and various hooks needed by grist-static (#516) 2023-05-23 15:17:28 -04:00
Paul Fitzpatrick
ea71312d0e (core) deal with write access for attachments
Summary:
Attachments are a special case for granular access control. A user is now allowed to read a given attachment if they have read access to a cell containing its id. So when a user writes to a cell in an attachment column, it is important that they can only write the ids of cells to which they have access. This diff allows a user to add an attachment id in a cell if:

  * The user already has access to that a attachment via some existing cell, or
  * The user recently updated the attachment, or
  * The attachment change is from an undo/redo of a previous action attributed to that user

Test Plan: Updated tests

Reviewers: georgegevoian, dsagal

Reviewed By: georgegevoian, dsagal

Differential Revision: https://phab.getgrist.com/D3681
2022-11-15 09:52:32 -05:00
Dmitry S
af77824618 (core) Add caching for measuring data size in DocStorage, when data isn't changing
Summary:
getDataSize() call can be expensive and involve lots of disk reading. We can
avoid doing it repeatedly when the document isn't actually changing.

Test Plan: Should have no change in behavior except for timings.

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3605
2022-08-25 09:50:23 -04:00
Paul Fitzpatrick
ec8ab598cb (core) add a yarn run cli tool, and add a sqlite gristify option
Summary:
This adds rudimentary support for opening certain SQLite files in Grist.

If you have a file such as `landing.db` in Grist, you can convert it to Grist format by doing (either in monorepo or grist-core):
```
yarn run cli -h
yarn run cli sqlite -h
yarn run cli sqlite gristify landing.db
```

The file is now openable by Grist. To actually do so with the regular Grist server, you'll need to either import it, or convert some doc you don't care about in the `samples/` directory to be a soft link to it (and then force a reload).

This implementation is a rudimentary experiment. Here are some awkwardnesses:
 * Only tables that happen to have a column called `id`, and where the column happens to be an integer, can be opened directly with Grist as it is today. That could be generalized, but it looked more than a Gristathon's worth of work, so I instead used SQLite views.
 * Grist will handle tables that start with an uncapitalized letter a bit erratically. You can successfully add columns, for example, but removing them will cause sadness - Grist will rename the table in a confused way.
 * I didn't attempt to deal with column names with spaces etc (though views could deal with those).
 * I haven't tried to do any fancy type mapping.
 * Columns with constraints can make adding new rows impossible in Grist, since Grist requires that a row can be added with just a single cell set.

Test Plan: added small test

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3502
2022-07-14 12:00:30 -04:00
Alex Hall
f1df6c0a46 (core) Prevent logging pointless errors about attachments and data size on shutdown
Summary: As suggested in https://grist.slack.com/archives/CR8HZ4P9V/p1652365399661569?thread_ts=1652364998.893169&cid=CR8HZ4P9V, check if DocStorage is initialized before trying to use it when shutting down, to avoid noisy logging of errors about removing attachments and updating data size.

Test Plan: Tested manually that errors early in loadDoc caused logging of errors about attachments/data size before the changed, but not after.

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D3522
2022-07-14 12:09:19 +02:00
Paul Fitzpatrick
f91f45b26d (core) support granular read access for attachments
Summary:
When a user requests to read the contents of an attachment, only allow the request if there exists a cell in an attachment column that contains the attachment and which they have read access to.

This does not cover:
 * Granular write access for attachments. In particular, a user who can write to any attachment column should be considered to have full read access to all attachment columns, currently.
 * Access control of attachment metadata such as name and format.

The implementation uses a sql query that requires a scan, and some notes on how this could be optimized in future. The web client was updated to specify the cell to check for access, and performance seemed fine in casual testing on a doc with 1000s of attachments. I'm not sure how performance would hold up as the set of access rules grows as well.

Test Plan: added tests

Reviewers: alexmojaki

Reviewed By: alexmojaki

Differential Revision: https://phab.getgrist.com/D3490
2022-07-07 07:22:02 -04:00
Dmitry S
51ff72c15e (core) Faster builds all around.
Summary:
Building:
- Builds no longer wait for tsc for either client, server, or test targets. All use esbuild which is very fast.
- Build still runs tsc, but only to report errors. This may be turned off with `SKIP_TSC=1` env var.
- Grist-core continues to build using tsc.
- Esbuild requires ES6 module semantics. Typescript's esModuleInterop is turned
  on, so that tsc accepts and enforces correct usage.
- Client-side code is watched and bundled by webpack as before (using esbuild-loader)

Code changes:
- Imports must now follow ES6 semantics: `import * as X from ...` produces a
  module object; to import functions or class instances, use `import X from ...`.
- Everything is now built with isolatedModules flag. Some exports were updated for it.

Packages:
- Upgraded browserify dependency, and related packages (used for the distribution-building step).
- Building the distribution now uses esbuild's minification. babel-minify is no longer used.

Test Plan: Should have no behavior changes, existing tests should pass, and docker image should build too.

Reviewers: georgegevoian

Reviewed By: georgegevoian

Subscribers: alexmojaki

Differential Revision: https://phab.getgrist.com/D3506
2022-07-04 10:42:40 -04:00
George Gevoian
1e42871cc9 (core) Add attachment and data size usage
Summary:
Adds attachment and data size to the usage section of
the raw data page. Also makes in-document usage banners
update as user actions are applied, causing them to be
hidden/shown or updated based on the current state of
the document.

Test Plan: Browser tests.

Reviewers: jarek

Reviewed By: jarek

Subscribers: alexmojaki

Differential Revision: https://phab.getgrist.com/D3395
2022-05-04 13:46:55 -07:00
Alex Hall
a701b4bf13 (core) Remove expired attachments every hour and on shutdown
Summary:
Call ActiveDoc.removeUnusedAttachments every hour using setInterval, and in ActiveDoc.shutdown (which also clears said interval).

Unrelated: small fix to my webhooks code which was creating a redis client on shutdown just to quit it.

Test Plan:
Tweaked DocApi test to remove expired attachments by force-reloading the doc, so that it removes them during shutdown. Extracted a new testing endpoint /verifyFiles to support this test (previously running that code only happened with `/removeUnused?verifyfiles=1`).

Tested the setInterval part manually.

Reviewers: paulfitz, dsagal

Reviewed By: paulfitz

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3387
2022-04-22 20:43:59 +02:00
Alex Hall
d7514e9cfc (core) Create _grist_Attachments_fileIdent index in new docs
Summary: Patching up the mistake in https://phab.getgrist.com/D3374#inline-38023.

Test Plan: this

Reviewers: dsagal, paulfitz

Reviewed By: dsagal, paulfitz

Differential Revision: https://phab.getgrist.com/D3382
2022-04-19 21:21:52 +02:00
Alex Hall
64a5c79dbc (core) Limit total attachment file size per document
Summary:
- Add a new parameter `Features.baseMaxAttachmentsBytesPerDocument` and set it to 1GB for the free team product.
- Add a method to DocStorage to calculate the total size of existing and used attachments.
- Add a migration to DocStorage adding an index to make the query in the above method fast.
- Check in ActiveDoc if uploading attachment(s) would exceed the product limit on that document.

Test Plan: Added test in `limits.ts` testing enforcement of the attachment limit.

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3374
2022-04-14 16:33:09 +02:00
Alex Hall
09da815c0c (core) Add /attachments/removeUnused DocApi endpoint to hard delete all unused attachments in document
Summary: Adds methods to delete metadata rows based on timeDeleted. The flag expiredOnly determines if it only deletes attachments that were soft-deleted 7 days ago, or just all soft-deleted rows. Then any actual file data that doesn't have matching metadata is deleted.

Test Plan: DocApi test

Reviewers: paulfitz

Reviewed By: paulfitz

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3364
2022-04-12 17:11:11 +02:00
Alex Hall
64369df4c3 (core) Add /attachments/updateUsed DocApi endpoint to soft delete all unused attachments in document
Summary:
Builds on https://phab.getgrist.com/D3352

Add DocStorage.scanAttachmentsForUsageChanges to do fancy JSON query to find all attachment metadata rows whose soft deletion status needs updating.

Add ActiveDoc.updateUsedAttachments which uses the above and then applies the appropriate user action if needed to soft delete/undelete metadata rows.

Add endpoint in DocApi calling ActiveDoc method.

Test Plan: Added DocApi test

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3357
2022-04-07 15:08:22 +02:00
Alex Hall
251d79704b (core) Migrate Attachments columns from marshalled blobs to JSON
Summary: Adds a migration in preparation for future work on tracking and deleting attachments. This includes a `_grist_Attachments.timeDeleted` column which isn't used yet, and changing the storage format of user columns of type `Attachments`. DocStorage now treats Attachments like RefList in general (since they use JSON), which also prompted a tiny bit of refactoring.

Test Plan: Added a migration test case showing the change in format.

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D3352
2022-04-06 13:28:47 +02:00
Alex Hall
21b0ac3eff (core) Enforcing data size limit
Summary:
Track 'data size' in ActiveDoc alongside row count. Measure it at most once every 5 minutes after each change as before, or after every change when it becomes high enough to matter.

A document is now considered to be approaching/exceeding 'the data limit' if either the data size or the row count is approaching/exceeding its own limit.

Unrelated: tweaked teamFreeFeatures.snapshotWindow based on Quip comments

Test Plan: Tested manually that data size is now logged after every change once it gets high enough, but only if the row limit isn't also too high. Still too early for automated tests.

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3341
2022-03-30 17:56:05 +02:00
Alex Hall
77a5d31afe (core) More accurate data size measurement
Summary: As suggested by @dsagal in https://phab.getgrist.com/D3277#inline-36801, change to query `SUM(pgsize - unused)` instead of `SUM(pgsize)` to measure actual data size more accurately. Technically this doesn't reflect the database file size as accurately, but it should reflect sandbox memory usage better, and more importantly it should allow users to see data size decreasing when they delete stuff.

Test Plan: Tested manually by adding rows to a doc and looking at the logs. The data size is smaller and changes more granularly.

Reviewers: dsagal, paulfitz

Reviewed By: paulfitz

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3313
2022-03-09 12:04:16 +02:00
Alex Hall
f1002c0e67 (core) Regularly log data size in DocStorage.applyStoredActions using sqlite dbstat
Summary:
- Small cleanup: Make DocStorage implement OnDemandStorage, and remove unused execWithBackup
- Upgrade to new versions (.3) of @gristlabs/sqlite3 and connect-sqlite3 to use dbstat
- Add _logDataSize method which queries dbstat, adding up pgsize for tables loaded into the data engine
- Only complete _logDataSize every 5 minutes using new field _lastLoggedDataSize

Test Plan: Tested manually

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3277
2022-02-22 00:59:04 +02:00
Edward Betts
d6e0e1fee3 Correct spelling mistakes 2022-02-19 09:46:49 +00:00
Alex Hall
a64fb105e3 (core) Use GristObjCode in CellValue
Summary: Makes type checking a bit stronger

Test Plan: it just has to compile

Reviewers: jarek

Reviewed By: jarek

Differential Revision: https://phab.getgrist.com/D3065
2021-10-11 14:11:32 +02:00
George Gevoian
e1780e4f58 (core) Migrate import code from data engine to Node
Summary:
Finishing imports now occurs in Node instead of the
data engine, which makes it possible to import into
on-demand tables. Merging code was also refactored
and now uses a SQL query to diff source and destination
tables in order to determine what to update or add.

Also fixes a bug where incremental imports involving
Excel files with multiple sheets would fail due to the UI
not serializing merge options correctly.

Test Plan: Browser tests.

Reviewers: jarek

Reviewed By: jarek

Differential Revision: https://phab.getgrist.com/D3046
2021-10-04 10:27:00 -07:00
Alex Hall
04e5d90f86 (core) Barely working reference lists in frontend
Summary:
This makes it possible to set the type of a column to ReferenceList, but the UI is terrible

ReferenceList.ts is a mishmash of ChoiceList and Reference that sort of works but something about the CSS is clearly broken

ReferenceListEditor is just a text editor, you have to type in a JSON array of row IDs. Ignore the value that's present when you start editing. I can maybe try mashing together ReferenceEditor and ChoiceListEditor but it doesn't seem wise.
I think @georgegevoian should take over here. Reviewing the diff as it is to check for obvious issues is probably good but I don't think it's worth trying to land/merge anything.

Test Plan: none

Reviewers: dsagal

Reviewed By: dsagal

Subscribers: georgegevoian

Differential Revision: https://phab.getgrist.com/D2914
2021-07-23 18:41:44 +02:00
Paul Fitzpatrick
6e15d44cf6 (core) start applying defenses for untrusted document uploads
Summary:
This applies some mitigations suggested by SQLite authors when
opening untrusted SQLite databases, as we do when Grist docs
are uploaded by the user.  See:
  https://www.sqlite.org/security.html#untrusted_sqlite_database_files

Steps implemented in this diff are:
  * Setting `trusted_schema` to off
  * Running a SQLite-level integrity check on uploads

Other steps will require updates to our node-sqlite3 fork, since they
are not available via the node-sqlite3 api (one more reason to migrate
to better-sqlite3).

I haven't yet managed to create a file that triggers an integrity
check failure without also being detected as corruption by sqlite
at a more basic level, so that is a TODO for testing.

Test Plan:
existing tests pass; need to come up with exploits to
actually test the defences and have not yet

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2909
2021-07-14 18:34:27 -04:00
Dmitry S
8d62a857e1 (core) Add ChoiceList type, cell widget, and editor widget.
Summary:
- Adds a new ChoiceList type, and widgets to view and edit it.
- Store in SQLite as a JSON string
- Support conversions between ChoiceList and other types

Test Plan: Added browser tests, and a test for how these values are stored

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D2803
2021-05-12 10:38:32 -04:00
Dmitry S
526b0ad33e (core) Configure more comprehensive eslint rules for Typescript
Summary:
- Update rules to be more like we've had with tslint
- Switch tsserver plugin to eslint (tsserver makes for a much faster way to lint in editors)
- Apply suggested auto-fixes
- Fix all lint errors and warnings in core/, app/, test/

Test Plan: Some behavior may change subtly (e.g. added missing awaits), relying on existing tests to catch problems.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D2785
2021-04-26 18:54:55 -04:00
Paul Fitzpatrick
9e8e895abd (core) fix filters with many values when querying directly from db
Summary:
This fixes DocStorage.fetchQuery when the number of parameters
exceeds the maximum that can be passed directly to sqlite.
In this case, parameters are now stored and used from a temporary
table.

Problem first noticed via a use of DocStorage.fetchQuery by
granular access controls.  Access control should be optimized
to make fewer such queries, but that is a separate issue.

Test Plan: added tests

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2772
2021-04-14 12:44:02 -04:00
Paul Fitzpatrick
d8df2404c2 (core) return to using meaningful SQL types for columns
Summary:
Previously in {{D1053}} we switched to using BLOB as the "type" for all columns, to prevent SQLite from casting data unexpectedly.  This diff now returns to more meaningful types.  We apply marshalling to values when being placed in a column where a cast might occur, to inhibit such casting.

The benefit is that Grist documents become easier to interact with via regular database clients/libraries, which often rely on the column type more than a purely SQLite tool would.

On column type conversion, we run all blobs in the column through a decode/encode cycle so if they no longer need to be marshalled they revert to native type.  This could be optimized further, it is somewhat brute force.

Test Plan: Updated tests and reference document

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2755
2021-03-25 10:26:39 -04:00
Dmitry S
5b2de988b5 (core) Perform migrations of Grist schema using only metadata tables when possible.
Summary:
Loading all user data to run a migration is risky (creates more than usual
memory pressure), and almost never needed (only one migration requires it).

This diff attempts to run migrations using only metadata (_grist_* tables),
but retries if the sandbox tells it that all data is needed.

The intent is for new migrations to avoid needing all data.

Test Plan: Added a somewhat contrived unittest.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D2659
2020-11-11 19:21:40 -05:00
Dmitry S
e2226c3ab7 (core) Store formula values in DB, and include them into .stored/.undo fields of actions.
Summary:
- Introduce a new SQLiteDB migration, which adds DB columns for formula columns
- Newly added columns have the special ['P'] (pending) value in them
  (in order to show the usual "Loading..." on the first load that triggers the migration)
- Calculated values are added to .stored/.undo fields of user actions.
- Various changes made in the sandbox to include .stored/.undo in the right order.
- OnDemand tables ignore stored formula columns, replacing them with special SQL as before
- In particular, converting to OnDemand table leaves stale values in those
  columns, we should maybe clean those out.

Some tweaks on the side:
- Allow overriding chai assertion truncateThreshold with CHAI_TRUNCATE_THRESHOLD
- Rebuild python automatically in watch mode

Test Plan: Fixed various tests, updated some fixtures. Many python tests that check actions needed adjustments because actions moved from .stored to .undo. Some checks added to catch situations previously only caught in browser tests.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D2645
2020-11-04 16:45:47 -05:00
Paul Fitzpatrick
71519d9e5c (core) revamp snapshot inventory
Summary:
Deliberate changes:
 * save snapshots to s3 prior to migrations.
 * label migration snapshots in s3 metadata.
 * avoid pruning migration snapshots for a month.

Opportunistic changes:
 * Associate document timezone with snapshots, so pruning can respect timezones.
 * Associate actionHash/Num with snapshots.
 * Record time of last change in snapshots (rather than just s3 upload time, which could be a while later).

This ended up being a biggish change, because there was nowhere ideal to put tags (list of possibilities in diff).

Test Plan: added tests

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2646
2020-10-30 13:52:46 -04:00
Paul Fitzpatrick
5ef889addd (core) move home server into core
Summary: This moves enough server material into core to run a home server.  The data engine is not yet incorporated (though in manual testing it works when ported).

Test Plan: existing tests pass

Reviewers: dsagal

Reviewed By: dsagal

Differential Revision: https://phab.getgrist.com/D2552
2020-07-21 20:39:10 -04:00