Commit Graph

699 Commits

Author SHA1 Message Date
Alex Hall
21b0ac3eff (core) Enforcing data size limit
Summary:
Track 'data size' in ActiveDoc alongside row count. Measure it at most once every 5 minutes after each change as before, or after every change when it becomes high enough to matter.

A document is now considered to be approaching/exceeding 'the data limit' if either the data size or the row count is approaching/exceeding its own limit.

Unrelated: tweaked teamFreeFeatures.snapshotWindow based on Quip comments

Test Plan: Tested manually that data size is now logged after every change once it gets high enough, but only if the row limit isn't also too high. Still too early for automated tests.

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3341
2022-03-30 17:56:05 +02:00
Dmitry S
8269c33d01 (core) When importing JSON, create columns of type Numeric rather than Int
Summary:
JSON import logic was creating columns of type Int when JSON contained integral
values. This causes errors with large errors (e.g. millisecond timestamps), and
Numeric is generally the more convenient and common default.

Test Plan: TBD

Reviewers: jarek, alexmojaki

Reviewed By: jarek, alexmojaki

Subscribers: jarek, alexmojaki

Differential Revision: https://phab.getgrist.com/D3339
2022-03-30 09:54:35 -04:00
Alex Hall
06956f84a5 (core) Make Attachments columns get treated like RefLists more
Summary:
Treat the column type 'Attachments' as equivalent to 'RefList:_grist_Attachments' in a few places, because that's essentially what it is. The main goal was to fix parsing strings representing attachments (reflists).

Also removed an unused function.

Test Plan: Tested manually that pasting a CSV/JSON string representation of an attachments reflists works now.

Reviewers: paulfitz

Reviewed By: paulfitz

Subscribers: paulfitz

Differential Revision: https://phab.getgrist.com/D3338
2022-03-28 23:14:29 +02:00
Paul Fitzpatrick
24522e61ff
remove stray redis dependency, and upgrade node in tests (#173)
* remove stray redis dependency in test
* tweak handling of database connection between tests
* upgrade node versions in tests, type guessing in node 10 has problems
2022-03-28 15:43:47 -04:00
Paul Fitzpatrick
c41c07e4d0 (core) add missing sandbox/run.sh script
Summary:
Adds a small missing script now used in core docker
container to create a python3 gvisor checkpoint on startup.

Test Plan: manual

Reviewers: georgegevoian

Reviewed By: georgegevoian

Subscribers: georgegevoian

Differential Revision: https://phab.getgrist.com/D3340
2022-03-27 12:57:22 -04:00
Alex Hall
4ce492b9e5 (core) Compute sample_record lazily using a property
Summary: This changes Table.sample_record from a regular attribute to a property that's only computed when it's needed, which is only for autocompletion. This means it's not cached any more, but it's also not recomputed every time the schema changes. Profiling showed that _make_sample_record took a signification portion of time, and this change makes the tests 2 or 3 seconds faster.

Test Plan: existing tests

Reviewers: paulfitz

Reviewed By: paulfitz

Subscribers: paulfitz

Differential Revision: https://phab.getgrist.com/D3334
2022-03-25 20:07:32 +02:00
Alex Hall
59436d2bca (core) Grace period and delete-only mode when exceeding row limit
Summary:
Builds upon https://phab.getgrist.com/D3328

- Add HomeDB column `Document.gracePeriodStart`
- When the row count moves above the limit, set it to the current date. When it moves below, set it to null.
- Add DataLimitStatus type indicating if the document is approaching the limit, is in a grace period, or is in delete only mode if the grace period started at least 14 days ago. Compute it in ActiveDoc and send it to client when opening.
- Only allow certain user actions when in delete-only mode.

Follow-up tasks related to this diff:

- When DataLimitStatus in the client is non-empty, show a banner to the appropriate users.
- Only send DataLimitStatus to users with the appropriate access. There's no risk landing this now since real users will only see null until free team sites are released.
- Update DataLimitStatus immediately in the client when it changes, e.g. when user actions are applied or the product is changed. Right now it's only sent when the document loads.
- Update row limit, grace period start, and data limit status in ActiveDoc when the product changes, i.e. the user upgrades/downgrades.
- Account for data size when computing data limit status, not just row counts.

See also the tasks mentioned in https://phab.getgrist.com/D3331

Test Plan: Extended FreeTeam nbrowser test, testing the 4 statuses.

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3331
2022-03-25 13:41:33 +02:00
Paul Fitzpatrick
134ae99e9a (core) add gvisor-based sandboxing to core
Summary:
This adds support for gvisor sandboxing in core. When Grist is run outside of a container, regular gvisor can be used (if on linux), and will run in rootless mode. When Grist is run inside a container, docker's default policy is insufficient for running gvisor, so a fork of gvisor is used that has less defence-in-depth but can run without privileges.

Sandboxing is automatically turned on in the Grist core container. It is not turned on automatically when built from source, since it is operating-system dependent.

This diff may break a complex method of testing Grist with gvisor on macs that I may have been the only person using. If anyone complains I'll find time on a mac to fix it :)

This diff includes a small "easter egg" to force document loads, primarily intended for developer use.

Test Plan: existing tests pass; checked that core and saas docker builds function

Reviewers: alexmojaki

Reviewed By: alexmojaki

Subscribers: alexmojaki

Differential Revision: https://phab.getgrist.com/D3333
2022-03-24 17:04:49 -04:00
Paul Fitzpatrick
de703343d0 (core) disentangle some server tests, release to core, add GRIST_PROXY_AUTH_HEADER test
Summary:
This shuffles some server tests to make them available in grist-core,
and adds a test for the `GRIST_PROXY_AUTH_HEADER` feature added in
https://github.com/gristlabs/grist-core/pull/165

It includes a fix for a header normalization issue for websocket connections.

Test Plan: added test

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3326
2022-03-24 15:11:32 -04:00
Jarosław Sadziński
64c9717ac1 (core) Undo bug with summary table and raw data view
Summary:
Clicking undo/redo after converting a table to a summary table navigated
to the raw data view.

Test Plan: new test

Reviewers: georgegevoian, alexmojaki

Reviewed By: georgegevoian, alexmojaki

Subscribers: alexmojaki

Differential Revision: https://phab.getgrist.com/D3337
2022-03-24 20:03:33 +01:00
Alex Hall
546096fcc9 (core) Clean up and refactor uses of HomeDBManager.getDoc
Summary:
Firstly I just wanted some more consistency and less repetition in places where Documents are retrieved from the DB, so it's more obvious when code differs from the norm. Main changes for that part:

- Let HomeDBManager accept a `Request` directly and convert it to a `Scope`, and use this in a few places.
- `getScope` tries `req.docAuth.docId` if `req.params` doesn't have a docId.

I also refactored how `_createActiveDoc` gets the document URL, separating out getting the document from getting a URL for it. This is because I want to use that document object in a future diff, but I also just find it cleaner. Notable changes for that:

- Extracted a new method `HomeDBManager.getRawDocById` as an alternative to `getDoc` that's explicitly for when you only have a document ID.
- Removed the interface method `GristServer.getDocUrl` and its two implementations because it wasn't used elsewhere and it didn't really add anything on top of getting a doc (now done by `getRawDocById`) and `getResourceUrl`.
- Between `cachedDoc` and `getRawDocById` (which represent previously existing code paths) also try `getDoc(getScope(docSession.req))`, which is new, because it seems better to only `getRawDocById` as a last resort.

Test Plan: Existing tests

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3328
2022-03-24 13:42:36 +02:00
Jarosław Sadziński
b1c3943bf4 (core) Conditional formatting rules
Summary:
Adding conditional formatting rules feature.

Each column can have multiple styling rules which are applied in order
when evaluated to a truthy value.

- The creator panel has a new section: Cell Style
- New user action AddEmptyRule for adding an empty rule
- New columns in _grist_Table_columns and fields

A new color picker will be introduced in a follow-up diff (as it is also
used in choice/choice list/filters).

Design document:
https://grist.quip.com/FVzfAgoO5xOF/Conditional-Formatting-Implementation-Design

Test Plan: new tests

Reviewers: georgegevoian

Reviewed By: georgegevoian

Subscribers: alexmojaki

Differential Revision: https://phab.getgrist.com/D3282
2022-03-23 13:15:02 +01:00
Jarosław Sadziński
96a34122a5 (core) Restoring cursor position on raw data views
Summary:
This diff introduces cursor features for raw data views:
- Restoring cursor position when the browser window is reloaded
- Restoring the last edit position when the browser window is reloaded

Test Plan: Added tests

Reviewers: alexmojaki

Reviewed By: alexmojaki

Subscribers: jarek

Differential Revision: https://phab.getgrist.com/D3314
2022-03-23 12:24:18 +01:00
Dmitry S
3b76b33423 (core) Fix bugs when both welcomeTour and docTour are available
Summary:
- Unify where in the code tours get initiated.
- Avoid start a new tour while one is being started or is in progress.
- Ignore welcome tour when on a doc that has a doc tour.
- Fix tours when starting with a special page like Access Rules.
- Remove mention of the no-longer-present "Give Feedback" button in the last
  message of the welcome tour.

Test Plan:
Add a browser test case that docTour preempts the welcome tour and shows no errors
(this test case fails in multiple ways without the changes).

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3330
2022-03-22 16:51:05 -04:00
Alex Hall
1452b6efc3 (core) Improve stacktraces from pyCall
Summary: Capture the stacktrace (via SandboxError) in `_pyCallWait` instead of `_onSandboxMsg` where it's always the same.

Test Plan:
Tested manually, found for example that the stacktrace in the logs changed from being rather useless:

```
at NSandbox._onSandboxMsg (/home/alex/work/grist/_build/core/app/server/lib/NSandbox.js:229:36)
at /home/alex/work/grist/_build/core/app/server/lib/NSandbox.js:179:18
at Unmarshaller.parse (/home/alex/work/grist/_build/core/app/common/marshal.js:289:21)
at NSandbox._onSandboxData (/home/alex/work/grist/_build/core/app/server/lib/NSandbox.js:174:28)
at Socket.<anonymous> (/home/alex/work/grist/_build/core/app/server/lib/NSandbox.js:63:59)
at Socket.emit (events.js:315:20)
at Socket.EventEmitter.emit (domain.js:467:12)
at addChunk (internal/streams/readable.js:309:12)
at readableAddChunk (internal/streams/readable.js:284:9)
at Socket.Readable.push (internal/streams/readable.js:223:10)
at Pipe.onStreamRead (internal/stream_base_commons.js:188:23)
```

to being somewhat more helpful:

```
at NSandbox._pyCallWait (/home/alex/work/grist/_build/core/app/server/lib/NSandbox.js:134:19)
at processTicksAndRejections (internal/process/task_queues.js:93:5)
at async ActiveDoc.applyActionsToDataEngine (/home/alex/work/grist/_build/core/app/server/lib/ActiveDoc.js:1080:39)
at async Sharing._applyActionsToDataEngine (/home/alex/work/grist/_build/core/app/server/lib/Sharing.js:325:37)
```

Reviewers: paulfitz

Reviewed By: paulfitz

Subscribers: paulfitz

Differential Revision: https://phab.getgrist.com/D3329
2022-03-22 17:00:02 +02:00
Alex Hall
2c9ae6dc94 (core) Enforce daily limit on API usage
Summary:
Keep track of the number of API requests made for this document today in redis. Uses local caches of the count and the document so that usually requests can proceed without waiting for redis or the database.

Moved the free standing function apiThrottle to become a method to avoid adding another layer of request handler callbacks.

Test Plan: Added a DocApi test

Reviewers: paulfitz

Reviewed By: paulfitz

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3327
2022-03-22 00:22:45 +02:00
Cyprien P
b6f146d755 (core) Add options to switch chart orientation
Test Plan: Adds nbrowser tests

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3323
2022-03-21 11:28:44 +01:00
Alex Hall
ec8460b772 (core) Prune snapshots outside the window in product features
Summary:
- Add a method `getSnapshotWindow` to `IInventory` and `DocSnapshotInventory`. It returns a `SnapshotWindow`, which represents a duration of time for which we keep backups for a particular document.
- `DocSnapshotPruner` calls this method and passes the window to `shouldKeepSnapshots` to determine which document versions have fallen outside the window and should be pruned.
- The implementation passed to `DocSnapshotInventory` uses a new method `getDocProduct` in `HomeDBManager` which directly returns the `Product` associated with a document, given only the document ID. Other methods in `HomeDBManager` require passing more information, especially about a user, but `DocSnapshotPruner` only knows about document IDs.

Test Plan: Added a test for `getDocProduct` and a test for `DocSnapshotPruner` where `getSnapshotWindow` is specified.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3322
2022-03-18 18:48:14 +02:00
Cyprien P
21f1dfa56c (core) Add 'stacked' option to charts
Summary:
Adds nbrowser test

 - Also makes sort spec taken into account by Group Data options
 - This is a continuation of https://phab.getgrist.com/D3271
 - We still need to decide whether to add stack chart to area chart type

Test Plan: TBD

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3274
2022-03-18 10:59:12 +01:00
Paul Fitzpatrick
a5f5ecce19 (core) add grist.getTable(tableId) and a getTableId() method in plugin api
Summary:
Makes the new TableOperations API available for all tables
in the document. Adds methods for discovering the tableId of the
selected table. I was very tempted to implement the select() TODO
in the TableOperations API, but it requires a significant refactor
of the backend.

Test Plan: added test

Reviewers: alexmojaki

Reviewed By: alexmojaki

Differential Revision: https://phab.getgrist.com/D3325
2022-03-17 16:31:19 -04:00
Dmitry S
fa75f60bfd (core) Fix selection of rows after rows are dragged
Summary: After dragging rows up, selection was set incorrectly.

Test Plan: Expanded a browser test for dragging rows to check selection, which fails without this fix

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3324
2022-03-17 11:49:52 -04:00
George Gevoian
0f4f0d3dad (core) Migrate to SRP and add change password dialog
Summary:
Moves some auth-related UI components, like MFAConfig, out
of core, and adds a new ChangePasswordDialog component for
allowing direct password changes, replacing the old reset password
link to hosted Cognito.

Updates all MFA endpoints to use SRP for authentication.

Also refactors MFAConfig into smaller files, and polishes up some parts
of the UI to be more consistent with the login pages.

Test Plan: New server and deployment tests. Updated existing tests.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3311
2022-03-16 21:35:06 -07:00
Paul Fitzpatrick
7ba4dff18f (core) updates from grist-core 2022-03-15 13:40:22 -04:00
Paul Fitzpatrick
a641517bb1
Merge pull request #165 from MHOOO/reverse-proxy-auth-support
Reverse proxy auth support
2022-03-15 13:39:29 -04:00
Paul Fitzpatrick
98f64a8461 (core) add grist.selectedTable.create/update/destroy/upsert to custom widget api
Summary: This makes an equivalent of the /records REST endpoint available within custom widgets. For simple operations, it is compatible with https://github.com/airtable/airtable.js/. About half of the diff is refactoring code from DocApi that implements /records using applyUserActions, to make that code available in the plugin api.

Test Plan: added tests

Reviewers: alexmojaki

Reviewed By: alexmojaki

Differential Revision: https://phab.getgrist.com/D3320
2022-03-15 11:11:58 -04:00
Alex Hall
02e69fb685 (core) Crudely show row count and limit in UI
Summary:
Add rowCount returned from sandbox when applying user actions to ActionGroup which is broadcast to clients.

Add rowCount to ActiveDoc and update it after applying user actions.

Add rowCount to OpenLocalDocResult using ActiveDoc value, to show when a client opens a doc before any user actions happen.

Add rowCount observable to DocPageModel which is set when the doc is opened and when action groups are received.

Add crude UI (commented out) in Tool.ts showing the row count and the limit in AppModel.currentFeatures. The actual UI doesn't have a place to go yet.

Followup tasks:

- Real, pretty UI
- Counts per table
- Keep count(s) secret from users with limited access?
- Data size indicator?
- Banner when close to or above limit
- Measure row counts outside of sandbox to avoid spoofing with formula
- Handle changes to the limit when the plan is changed or extra rows are purchased

Test Plan: Tested UI manually, including with free team site, opening a fresh doc, opening an initialised doc, adding rows, undoing, and changes from another tab. Automated tests seem like they should wait for a proper UI.

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3318
2022-03-14 21:49:32 +02:00
Alex Hall
d154b9afa7 (core) Make lookups depend on all rows
Summary:
This is a fix for a bug discussed in https://grist.slack.com/archives/C069RUP71/p1645138610722889

I still haven't completely wrapped my head around it or figured out how to make a simple reproducible example, but the problem seems to be that a lookup can happen before the column(s) being looked up (the summary helper column in this case) have been computed fully (I think it got interrupted halfway by an OrderError). `do_lookup` would check via `engine._use_node` that the row IDs it found had all been computed already, but there might still be other rows that hadn't been computed yet and would also have values matching the lookup key, so it missed those.

This diff instead calls `_use_node` with no `row_ids` argument, which should ensure that all rows have already been computed.

At first I was worried about how this would affect performance, which led me down an optimisation rabbit hole, hence a bit of unrelated cleanup here and also https://phab.getgrist.com/D3310 . But it doesn't seem to be a problem, and IIUC it should actually make things better, although this code is pretty confusing.

Test Plan: Tested manually that the doc no longer behaves weirdly

Reviewers: dsagal

Reviewed By: dsagal

Subscribers: dsagal, paulfitz

Differential Revision: https://phab.getgrist.com/D3308
2022-03-14 19:42:51 +02:00
Thomas Karolski
ccdd551b4d style fixes 2022-03-14 17:51:10 +01:00
Paul Fitzpatrick
b2715ae9ef (core) forbid use of sqlite ATTACH except during VACUUM
Summary:
This calls sqlite3_limit(SQLITE_LIMIT_ATTACHED, 0) so that
if ever an `ATTACH` were snuck into an SQL query, it would be denied.
The limit needs to be waived when calling VACUUM since the implementation
of VACUUM uses ATTACH.

Test Plan: added test; existing tests should pass

Reviewers: alexmojaki

Reviewed By: alexmojaki

Subscribers: alexmojaki

Differential Revision: https://phab.getgrist.com/D3316
2022-03-14 09:34:44 -04:00
Paul Fitzpatrick
3a8e7032bc (core) updates from grist-core 2022-03-14 09:18:44 -04:00
George Gevoian
ad1b4f3cff (core) Record new user sign-ups
Summary:
Adds Google Tag Manager snippet to all login pages, and a new user
preference, recordSignUpEvent, that's set to true on first sign-in. The
client now checks for this preference, and if true, dynamically loads
Google Tag Manager to record a sign-up event. Afterwards, it removes
the preference.

Test Plan: Tested manually.

Reviewers: dsagal

Reviewed By: dsagal

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3319
2022-03-12 14:34:46 -08:00
Thomas Karolski
87af7a36cd [README] Update with doc on header based auth 2022-03-12 21:00:58 +01:00
Thomas Karolski
e5dc2d198f [comm] Use getRequestProfile from Authorizer 2022-03-12 21:00:52 +01:00
Thomas Karolski
9f3ed989c4 [authorizer] Determine auth header to use via an environment variable 2022-03-12 21:00:44 +01:00
Thomas Karolski
c459037b04 [authorizer] Move code for extracting auth header into a function 2022-03-12 21:00:36 +01:00
Jarosław Sadziński
eff78ae2e1 (core) New UI for raw data views
Summary:
Creating new UI for raw data views based on design.
- Renaming left for follow up diff
- Link in the menu is hidden for now
To access raw UI, use /p/data URL.

Test Plan: new tests

Reviewers: georgegevoian

Reviewed By: georgegevoian

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3306
2022-03-12 13:51:48 +01:00
Alex Hall
2d0978559b (core) Replace compute stack and frames with _current_node and _current_row_id
Summary:
This is an attempt to optimise Engine._use_node. It doesn't seem to actually improve overall performance significantly, but it shouldn't make it worse, and I think it's an improvement to the code.

It turns out that there's no need to track a stack of compute frames any more. The only time we get close to nested evaluation, we set allow_evaluation=False to prevent it actually happening. So there's only one 'frame' during actual evaluation, which means we can get rid of the concept of frames entirely. This allows simplifying the code and letting the computer do less work in general.

Test Plan: this

Reviewers: dsagal

Reviewed By: dsagal

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3310
2022-03-11 12:34:00 +02:00
George Gevoian
f02174eb7e (core) Fix error when canceling import
Summary:
If cancel was clicked while a transform section was still being
generated in the Importer, an error was thrown. This refactors
the cancelImportFiles API action to take in the file upload id
in place of the entire DataSourceTransformed parameter, which
contains other values that are irrelevant to canceling. One of those
values, the transform section id, was causing the error to be thrown
since it was momentarily null.

Test Plan: Tested manually.

Reviewers: alexmojaki

Reviewed By: alexmojaki

Differential Revision: https://phab.getgrist.com/D3317
2022-03-10 16:24:49 -08:00
Paul Fitzpatrick
00eb44fa00 v0.7.6 2022-03-09 10:11:30 -05:00
Alex Hall
77a5d31afe (core) More accurate data size measurement
Summary: As suggested by @dsagal in https://phab.getgrist.com/D3277#inline-36801, change to query `SUM(pgsize - unused)` instead of `SUM(pgsize)` to measure actual data size more accurately. Technically this doesn't reflect the database file size as accurately, but it should reflect sandbox memory usage better, and more importantly it should allow users to see data size decreasing when they delete stuff.

Test Plan: Tested manually by adding rows to a doc and looking at the logs. The data size is smaller and changes more granularly.

Reviewers: dsagal, paulfitz

Reviewed By: paulfitz

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3313
2022-03-09 12:04:16 +02:00
Alex Hall
222dc505eb (core) Generate Engine._autocomplete_context lazily
Summary: I ran the python tests through a profiler and found that just generating the autocomplete context was a major cost, and this would happen with every schema change. By only generating it when needed for autocomplete, the time for tests reduced from ~38s to ~33s.

Test Plan: this

Reviewers: dsagal

Reviewed By: dsagal

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3312
2022-03-09 12:03:11 +02:00
Thomas Karolski
a584bc3a19 [Comm.js] Return a session profile based on the x-remote-user header if set 2022-03-09 10:00:03 +00:00
Jarosław Sadziński
351d70d4fb (core) Serving widget info page from home url
Summary:
Custom widget into page is served from a homeUrl instead
of untrusted URL, which might be not used in grist-core.

Test Plan: manual test

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3307
2022-03-09 10:34:50 +01:00
Jarosław Sadziński
d2b82b84c7 (core) Fixing bug with resuming search on a hidden column.
Summary: Fix for error that happens when a search is resumed after one of the columns was hidden.

Test Plan: Added test that shows the error.

Reviewers: alexmojaki

Reviewed By: alexmojaki

Subscribers: alexmojaki

Differential Revision: https://phab.getgrist.com/D3309
2022-03-09 10:34:17 +01:00
George Gevoian
7ead97b913 (core) Fill name on test login page when possible
Summary:
Updates simulateLogin to fill in the name field of
the test login page. Docker tests were failing because
users created via the test login page were falling back
to their email for their name.

Test Plan: N/A (fixing Docker tests)

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3315
2022-03-08 20:29:39 -08:00
Paul Fitzpatrick
c4d3d7d3bb (core) be careful when reassigning a doc to a worker it was on before
Summary:
Importing a .grist document is implemented in a somewhat clunky way, in a multi-worker setup.

 * First a random worker receives the upload, and updates Grist's various stores appropriately (database, redis, s3).
 * Then a random worker is assigned to serve the document.

If the worker serving the document fails, there is a chance the it will end up assigned to the worker that handled its upload. Currently the worker will misbehave in this case. This diff:

 * Ports a multi-worker test from test/home to run in test/s3, and adds a test simulating a bad scenario seen in the wild.
 * Fixes persistence of any existing document checksum in redis when a worker is assigned.
 * Adds a check when assigned a document to serve, and finding that document already cached locally. It isn't safe to rely only on the document checksum in redis, since that may have expired.
 * Explicitly claims the document on the uploading worker, so this situation becomes even less likely to arise.

Test Plan: added test

Reviewers: dsagal

Reviewed By: dsagal

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3305
2022-03-08 17:20:01 -05:00
Thomas Karolski
116295e42f Minor refactor & comments 2022-03-08 19:40:25 +00:00
Thomas Karolski
953ac7c689 Use /usr/bin/env instead of /bin/bash 2022-03-08 19:24:36 +00:00
Thomas Karolski
82a7f0a796 Implement support for webserver header based auth 2022-03-08 19:24:11 +00:00
Alex Hall
321019217d (core) Lossless imports
Summary:
- Removed string parsing and some type guessing code from parse_data.py. That logic is now implicitly done by ValueGuesser by leaving the initial column type as Any. parse_data.py mostly comes into play when importing files (e.g. Excel) containing values that already have types, i.e. numbers and dates.
- 0s and 1s are treated as numbers instead of booleans to keep imports lossless.
- Removed dateguess.py and test_dateguess.py.
- Changed what `guessDateFormat` does when multiple date formats work equally well for the given data, in order to be consistent with the old dateguess.py.
- Columns containing numbers are now always imported as Numeric, never Int.
- Removed `NullIfEmptyParser` because it was interfering with the new system. Its purpose was to avoid pointlessly changing a column from Any to Text when no actual data was inserted. A different solution to that problem was already added to `_ensure_column_accepts_data` in the data engine in a recent related diff.

Test Plan:
- Added 2 `nbrowser/Importer2` tests.
- Updated various existing tests.
- Extended testing of `guessDateFormat`. Added `guessDateFormats` to show how ambiguous dates are handled internally.

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3302
2022-03-08 12:14:39 +02:00