Summary:
The changes in this diff are sufficient to make this sequence work again:
```
./build electron-dev
bin/electron app/electron/runPrebuild.js
```
This brings up the local server within an electron window.
This is an unambitious diff, aimed at checking how rusty electron support had become. It does not revive Grist as a packaged electron app. The first substantial work needed would be to make the app aware of the local file system again, and think through how local files should be visualized and accessed now. In the past, there was a simple list of grist docs in a directory.
Test Plan: manual
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3534
Summary:
Fills in the title and description/thumbnail (for templates) in app.html if the
page being requested is for a document.
Test Plan: Tested manually.
Reviewers: paulfitz
Reviewed By: paulfitz
Subscribers: dsagal
Differential Revision: https://phab.getgrist.com/D3544
Summary:
This calls a new `initialize` method on the sandbox before we start
doing calculations with it, to make sure that `random.seed()` has
been called. Otherwise, if the sandbox is cloned from a checkpoint,
the seed will have been reset.
The `initialize` method includes the functionality previously done
by `set_doc_url` since it is also initialization/personalization and
this way we avoid introducing another round trip to the sandbox.
Test Plan: tested with grist-core configured to use gvisor
Reviewers: georgegevoian, dsagal
Reviewed By: georgegevoian, dsagal
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3549
Summary:
Adds the new personal plan as a product that will be available
in the future. Can be enabled along with other plan-related via
an environment variable.
Test Plan: Browser tests and existing tests.
Reviewers: jarek
Reviewed By: jarek
Subscribers: dsagal
Differential Revision: https://phab.getgrist.com/D3533
Summary:
With this, a custom widget can render an attachment by doing:
```
const tokenInfo = await grist.docApi.getAccessToken({readOnly: true});
const img = document.getElementById('the_image');
const id = record.C[0]; // get an id of an attachment
const src = `${tokenInfo.baseUrl}/attachments/${id}/download?auth=${tokenInfo.token}`;
img.setAttribute('src', src)
```
The access token expires after a few mins, so if a user right-clicks on an image
to save it, they may get access denied unless they refresh the page. A little awkward,
but s3 pre-authorized links behave similarly and it generally isn't a deal-breaker.
Test Plan: added tests
Reviewers: dsagal
Reviewed By: dsagal
Subscribers: dsagal
Differential Revision: https://phab.getgrist.com/D3488
Summary:
Use table titles (i.e. the raw data widget titles) in dropdowns and other parts of the Acess Rules page, instead of the table ID. This is particularly meant for summary tables which have/had an ID of the form `GristSummary_SourceTable_N`, but https://phab.getgrist.com/D3508 is changing that anyway.
The server method `getAclResources` now returns more metadata about each table so that the UI can display titles.
Test Plan: Extended and updated `nbrowser/AccessRules2.ts`. Added a small unit test for constructing table titles from the new description returned by `getAclResources`.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3494
Summary:
Most logging now includes altSessionId, but not the message logged at the end
of every request by the 'morgan' logger. This includes altSessionId in those
messages.
Test Plan: Verified that with GRIST_HOSTED_VERSION env var set, altSessionId is included in morgan-produced JSON messages.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Subscribers: georgegevoian
Differential Revision: https://phab.getgrist.com/D3523
Summary:
1. Log errors in `ActiveDoc.loadDoc` as errors, not just warnings, except for a common 'Cannot create fork' error caused by deployment tests.
2. Log the method name that had an error in `server/lib/Client.ts`.
Discussion: https://grist.slack.com/archives/CR8HZ4P9V/p1652364998893169
Following up on https://phab.getgrist.com/D3522
Test Plan: tested manually, particularly by running the nbrowser/Fork test that led to the initial noisy errors in Slack.
Reviewers: dsagal
Reviewed By: dsagal
Differential Revision: https://phab.getgrist.com/D3525
Summary: The previous access check in `getFormulaError` was not strict enough, allowing users to read the values of individual formula cells that they shouldn't be able to. Now `getCellValue` is used to check the access for the specific cell first.
Test Plan: Extended GranularAccess server test.
Reviewers: paulfitz
Reviewed By: paulfitz
Subscribers: paulfitz
Differential Revision: https://phab.getgrist.com/D3526
Summary:
This adds rudimentary support for opening certain SQLite files in Grist.
If you have a file such as `landing.db` in Grist, you can convert it to Grist format by doing (either in monorepo or grist-core):
```
yarn run cli -h
yarn run cli sqlite -h
yarn run cli sqlite gristify landing.db
```
The file is now openable by Grist. To actually do so with the regular Grist server, you'll need to either import it, or convert some doc you don't care about in the `samples/` directory to be a soft link to it (and then force a reload).
This implementation is a rudimentary experiment. Here are some awkwardnesses:
* Only tables that happen to have a column called `id`, and where the column happens to be an integer, can be opened directly with Grist as it is today. That could be generalized, but it looked more than a Gristathon's worth of work, so I instead used SQLite views.
* Grist will handle tables that start with an uncapitalized letter a bit erratically. You can successfully add columns, for example, but removing them will cause sadness - Grist will rename the table in a confused way.
* I didn't attempt to deal with column names with spaces etc (though views could deal with those).
* I haven't tried to do any fancy type mapping.
* Columns with constraints can make adding new rows impossible in Grist, since Grist requires that a row can be added with just a single cell set.
Test Plan: added small test
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3502
Summary:
Changes auto-generated summary table IDs from e.g. `GristSummary_6_Table1` to `Table1_summary_A_B` (meaning `Table1` grouped by `A` and `B`). This makes it easier to write formulas involving summary tables, make API requests, understand logs, etc.
Because these don't encode the source table ID as reliably as before, `decode_summary_table_name` now uses the summary table schema info, not just the summary table ID. Specifically, it looks at the type of the `group` column, which is `RefList:<source table id>`.
Renaming a source table renames the summary table as before, and now renaming a groupby column renames the summary table as well.
Conflicting table names are resolved in the usual way by adding a number at the end, e.g. `Table1_summary_A_B2`. These summary tables are not automatically renamed when the disambiguation is no longer needed.
A new migration renames all summary tables to the new scheme, and updates formulas using summary tables with a simple regex.
Test Plan:
Updated many tests to use the new style of name.
Added new Python tests to for resolving conflicts when renaming source tables and groupby columns.
Added a test for the migration, including renames in formulas.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3508
Summary:
When a user requests to read the contents of an attachment, only allow the request if there exists a cell in an attachment column that contains the attachment and which they have read access to.
This does not cover:
* Granular write access for attachments. In particular, a user who can write to any attachment column should be considered to have full read access to all attachment columns, currently.
* Access control of attachment metadata such as name and format.
The implementation uses a sql query that requires a scan, and some notes on how this could be optimized in future. The web client was updated to specify the cell to check for access, and performance seemed fine in casual testing on a doc with 1000s of attachments. I'm not sure how performance would hold up as the set of access rules grows as well.
Test Plan: added tests
Reviewers: alexmojaki
Reviewed By: alexmojaki
Differential Revision: https://phab.getgrist.com/D3490
Summary:
Summary tables now have their own raw viewsection, and are shown
under Raw Data Tables on the Raw Data page.
Test Plan: Browser and Python tests.
Reviewers: jarek
Reviewed By: jarek
Differential Revision: https://phab.getgrist.com/D3495
Summary:
Building:
- Builds no longer wait for tsc for either client, server, or test targets. All use esbuild which is very fast.
- Build still runs tsc, but only to report errors. This may be turned off with `SKIP_TSC=1` env var.
- Grist-core continues to build using tsc.
- Esbuild requires ES6 module semantics. Typescript's esModuleInterop is turned
on, so that tsc accepts and enforces correct usage.
- Client-side code is watched and bundled by webpack as before (using esbuild-loader)
Code changes:
- Imports must now follow ES6 semantics: `import * as X from ...` produces a
module object; to import functions or class instances, use `import X from ...`.
- Everything is now built with isolatedModules flag. Some exports were updated for it.
Packages:
- Upgraded browserify dependency, and related packages (used for the distribution-building step).
- Building the distribution now uses esbuild's minification. babel-minify is no longer used.
Test Plan: Should have no behavior changes, existing tests should pass, and docker image should build too.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3506
Summary:
- Upgrades to build-related packages:
- Upgrade typescript, related libraries and typings.
- Upgrade webpack, eslint; add tsc-watch, node-dev, eslint_d.
- Build organization changes:
- Build webpack from original typescript, transpiling only; with errors still
reported by a background tsc watching process.
- Typescript-related changes:
- Reduce imports of AWS dependencies (very noticeable speedup)
- Avoid auto-loading global @types
- Client code is now built with isolatedModules flag (for safe transpilation)
- Use allowJs to avoid copying JS files manually.
- Linting changes
- Enhance Arcanist ESLintLinter to run before/after commands, and set up to use eslint_d
- Update eslint config, and include .eslintignore to avoid linting generated files.
- Include a bunch of eslint-prompted and eslint-generated fixes
- Add no-unused-expression rule to eslint, and fix a few warnings about it
- Other items:
- Refactor cssInput to avoid circular dependency
- Remove a bit of unused code, libraries, dependencies
Test Plan: No behavior changes, all existing tests pass. There are 30 tests fewer reported because `test_gpath.py` was removed (it's been unused for years)
Reviewers: paulfitz
Reviewed By: paulfitz
Subscribers: paulfitz
Differential Revision: https://phab.getgrist.com/D3498
Summary:
Adds a Python function `REQUEST` which makes an HTTP GET request. Behind the scenes it:
- Raises a special exception to stop trying to evaluate the current cell and just keep the existing value.
- Notes the request arguments which will be returned by `apply_user_actions`.
- Makes the actual request in NodeJS, which sends back the raw response data in a new action `RespondToRequests` which reevaluates the cell(s) that made the request.
- Wraps the response data in a class which mimics the `Response` class of the `requests` library.
In certain cases, this asynchronous flow doesn't work and the sandbox will instead synchronously call an exported JS method:
- When reevaluating a single cell to get a formula error, the request is made synchronously.
- When a formula makes multiple requests, the earlier responses are retrieved synchronously from files which store responses as long as needed to complete evaluating formulas. See https://grist.slack.com/archives/CL1LQ8AT0/p1653399747810139
Test Plan: Added Python and nbrowser tests.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Subscribers: paulfitz, dsagal
Differential Revision: https://phab.getgrist.com/D3429
Summary:
- Substantial refactoring of the logic when the server fails to send some
messages to a client.
- Add seqId numbers to server messages to ensure reliable order.
- Add a needReload flag in clientConnect for a clear indication whent the
browser client needs to reload the app.
- Reproduce some potential failure scenarios in a test case (some of which
previously could have led to incorrectly ordered messages).
- Convert other Comm tests to typescript.
- Tweak logging of Comm and Client to be slightly more concise (in particular,
avoid logging sessionId)
Note that despite the big refactoring, this only addresses a fairly rare
situation, with websocket failures while server is trying to send to the
client. It includes no improvements for failures while the client is sending to
the server.
(I looked for an existing library that would take care of these issues. A relevant article I found is https://docs.microsoft.com/en-us/azure/azure-web-pubsub/howto-develop-reliable-clients, but it doesn't include a library for both ends, and is still in review. Other libraries with similar purposes did not inspire enough confidence.)
Test Plan: New test cases, which reproduce some previously problematic scenarios.
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3470
Summary:
For self-hosted Grist, forward auth has proven useful, where
some proxy wrapped around Grist manages authentication, and
passes on user information to Grist in a trusted header.
The current implementation is adequate when Grist is the
only place where the user logs in or out, but is confusing
otherwise (see https://github.com/gristlabs/grist-core/issues/207).
Here we take some steps to broaden the scenarios Grist's
forward auth support can be used with:
* When a trusted header is present and is blank, treat
that as the user not being logged in, and don't look
any further for identity information. Specifically,
don't look in Grist's session information.
* Add a `GRIST_IGNORE_SESSION` flag to entirely prevent
Grist from picking up identity information from a cookie,
in order to avoid confusion between multiple login methods.
* Add tests for common scenarios.
Test Plan: added tests
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3482
Summary:
When an account is upgraded to a new product in Billing, send a message to the redis channel `billingAccount-${accountId}-product-changed`.
ActiveDocs subscribe to this channel. When a message is received, they refresh their product from the database and use it to recalculate doc usage based on new limits. The new usage is broadcast to clients so they see the result of the upgrade live.
Test Plan: Extended nbrowser Billing test to test that a document open in a separate tab has its limit banner cleared immediately on upgrade.
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3480
Summary:
- Also converted sandboxUtil to typescript.
- The issue with %s manifested when a Python traceback contained "%s" in the
string; in that case the object with log metadata (e.g. docId) would
confusingly replace %s as if it were part of the message from Python.
Test Plan: Added a test case for the fix.
Reviewers: alexmojaki
Reviewed By: alexmojaki
Differential Revision: https://phab.getgrist.com/D3486
Summary: Use new Banner component for activation messages.
Test Plan: Existing tests.
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3483
Summary:
- Showing nudge to individual users to sign up for free team plan.
- Implementing billing page to upgrade from free team to pro.
- New modal with upgrade options and free team site signup.
- Integrating Stripe-hosted UI for checkout and plan management.
Test Plan: updated tests
Reviewers: georgegevoian
Reviewed By: georgegevoian
Subscribers: paulfitz
Differential Revision: https://phab.getgrist.com/D3456
Summary:
- Add app/common/CommTypes.ts to define types shared by client and server.
- Include @types/ws npm package
Test Plan: Intended to have no changes in behavior
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3467
Summary:
If an external store fails completely, Grist will continue to
retry uploading to it. This diff updated the HostedStorageManager
test to limit the extent of these retries to the test itself -
otherwise they continue for all other tests in the same process,
potentially disrupting those that read logs. There are other tests
that use s3, but they aren't run in the same process with delicate
log-reading tests, and it isn't quite as clear what improvement
to make there.
Test Plan:
artificially made external store fail, and checked that
test contamination seen previously no longer occurs.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3469
Summary:
Instead of always redirecting new users to the home page or the (teams) welcome page,
only redirect when the user signed in for the first time on a personal site, has access to
other sites, and isn't already being redirected to a specific page on their personal site.
Also tweaks how invalid Choice column values are displayed to match Choice List
columns, and fixes a small CSS issue with select by in the page widget picker when
there are options with long labels.
Test Plan: Browser tests.
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3461
Summary:
Introduces a new message type, docUsage, that's broadcast to all connected
clients whenever document usage is updated in ActiveDoc.
Test Plan: Browser tests.
Reviewers: jarek
Reviewed By: jarek
Differential Revision: https://phab.getgrist.com/D3451
Summary:
Previously, absence of `GRIST_DOCS_S3_BUCKET` was equated with absence
of external storage, but that is no longer true now that Azure is
available. Azure could be used by setting `GRIST_DOCS_S3_BUCKET`
but the alternative `GRIST_AZURE_CONTAINER` flag is friendlier.
Test Plan:
confirmed manually that Azure can be configured and
used now without `GRIST_DOCS_S3_BUCKET`
Reviewers: alexmojaki
Reviewed By: alexmojaki
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3448
Summary:
grist-ee build was failing since it didn't have a
DocUsageBanner implementation available. Made the implementation
added to monorepo available, since it will be useful to improve
the activation banner.
Test Plan: manaul
Reviewers: georgegevoian
Reviewed By: georgegevoian
Subscribers: georgegevoian
Differential Revision: https://phab.getgrist.com/D3452
Summary:
I missed committing a file that is important for editing files comfortably in the ext directory in an IDE. This diff:
* Adds tsconfig-base-ext.json - that was the only intended change
* Unrelated: Forces all creation of connections to the home db through a new `getOrCreateConnection` method which changes the `busy_timeout` if using Sqlite. This was an attempt to fix random "database is locked" test failures. I believe multiple connections to the home db as an sqlite file do not happen in self-hosted Grist (where there is a single node process) or in our SaaS (where the database is in postgres). It does affect Grist started using `devServerMain.ts` (where multiple processes accessing same database are started) or various test configurations when extra database connections are opened.
* Unrelated: I added a `busy_timeout` for session storage, when it uses Sqlite. Again, I don't believe this affects self-hosted Grist or our SaaS.
* Tweaked a `BillingDiscount` test that looked perhaps vulnerable to a stripe request stalling.
I can't be sure my tweaks actually help, since I didn't succeed in replicating the failures. Update: looks like the "locked" error can still happen :(
Test Plan: manual
Reviewers: jarek
Reviewed By: jarek
Subscribers: jarek
Differential Revision: https://phab.getgrist.com/D3450
Summary:
Also fixes a minor CSS regression in UserManager where the
link to add a team member wasn't shown on a separate row.
Test Plan: Browser tests.
Reviewers: jarek
Reviewed By: jarek
Differential Revision: https://phab.getgrist.com/D3444
Summary: Combines the code and behaviour of the existing endpoints `GET /records` (for the general shape of the result and the parameters for sort/filter/limit etc) and retrieving a specific attachment with `GET /attachments/:id` for handling fields specific to attachments.
Test Plan: Added a DocApi test. Also updated one test to use the new endpoint instead of raw `GET /tables/_grist_Attachments/records`.
Reviewers: cyprien
Reviewed By: cyprien
Subscribers: cyprien
Differential Revision: https://phab.getgrist.com/D3443
Summary:
Previously, columns of type Any were created and modified one by one by reusing
the "empty column" logic from the data engine. This copies that logic to Node,
and sets the type of all columns together, to create them with the correct type
in the AddTable call.
This makes imports about twice faster (when slowness is due to many columns),
but doesn't address all cases where individual handling of columns causes slowness.
Test Plan: Added a test case for the new helper function.
Reviewers: alexmojaki
Reviewed By: alexmojaki
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3427
Summary: Reduces the log level in a few places from error to warning.
Test Plan: N/A
Reviewers: paulfitz
Reviewed By: paulfitz
Subscribers: paulfitz
Differential Revision: https://phab.getgrist.com/D3437
Summary:
This allows limiting the memory available to documents in the sandbox when gvisor is used. If memory limit is exceeded, we offer to open doc in recovery mode. Recovery mode is tweaked to open docs with tables in "ondemand" mode, which will generally take less memory and allow for deleting rows.
The limit is on the size of the virtual address space available to the sandbox (`RLIMIT_AS`), which in practice appears to function as one would want, and is the only practical option. There is a documented `RLIMIT_RSS` limit to `specifies the limit (in bytes) of the process's resident set (the number of virtual pages resident in RAM)` but this is no longer enforced by the kernel (neither the host nor gvisor).
When the sandbox runs out of memory, there are many ways it can fail. This diff catches all the ones I saw, but there could be more.
Test Plan: added tests
Reviewers: alexmojaki
Reviewed By: alexmojaki
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3398
Summary:
This makes it possible to configure a SendGrid-based Notifier
instance via a JSON configuration file.
Test Plan: Tested manually.
Reviewers: alexmojaki
Reviewed By: alexmojaki
Subscribers: paulfitz
Differential Revision: https://phab.getgrist.com/D3432
Summary: For grist-ee, expect an activation key in environment variable `GRIST_ACTIVATION` or in a file pointed to by `GRIST_ACTIVATION_FILE`. In absence of key, start a 30-day trial, during which a banner is shown. Once trial expires, installation goes into document-read-only mode.
Test Plan: added a test
Reviewers: dsagal
Reviewed By: dsagal
Subscribers: jarek
Differential Revision: https://phab.getgrist.com/D3426
Summary:
The summary includes a count of documents that are approaching
limits, in grace period, or delete-only. The endpoint is only accessible
to site owners, and is currently unused. A follow-up diff will add usage
banners to the site home page, which will use the response from the
endpoint to communicate usage information to owners.
Test Plan: Browser and server tests.
Reviewers: alexmojaki
Reviewed By: alexmojaki
Differential Revision: https://phab.getgrist.com/D3420
Summary:
Helps with cases such as https://grist.slack.com/archives/C02EGJ1FUCV/p1652196111066649?thread_ts=1651656433.171889&cid=C02EGJ1FUCV
When a user unsubscribes from a webhook, the secret URL is deleted from the database, but as long as the doc was open it would continue retrying pending requests still in the queue for a long time, using the locally cached value without noticing the effect of unsubscribing. This change allows unsubscribing to have an effect more quickly so that problematic events can be removed from the queue.
Test Plan: existing tests
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3430
Summary:
Adds a new environment variable that allows for custom
CSS to be included in all core static pages.
Test Plan: Tested manually in grist-core.
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3419
Summary:
Currently, we have two ways that we deliver Grist. One is grist-core,
which has simple defaults and is relatively easy for third parties to
deploy. The second is our internal build for our SaaS, which is the
opposite. For self-managed Grist, a planned paid on-premise version
of Grist, I adopt the following approach:
* Use the `grist-core` build mechanism, extending it to accept an
overlay of extra code if present.
* Extra code is supplied in a self-contained `ext` directory, with
an `ext/app` directory that is of same structure as core `app`
and `stubs/app`.
* The `ext` directory also contains information about extra
node dependencies needed beyond that of `grist-core`.
* The `ext` directory is contained within our monorepo rather than
`grist-core` since it may contain material not under the Apache
license.
Docker builds are achieved in our monorepo by using the `--build-context`
functionality to add in `ext` during the regular `grist-core` build:
```
docker buildx build --load -t gristlabs/grist-ee --build-context=ext=../ext .
```
Incremental builds in our monorepo are achieved with the `build_core.sh` helper,
like:
```
buildtools/build_core.sh /tmp/self-managed
cd /tmp/self-managed
yarn start
```
The initial `ext` directory contains material for snapshotting to S3.
If you build the docker image as above, and have S3 access, you can
do something like:
```
docker run -p 8484:8484 --env GRIST_SESSION_SECRET=a-secret \
--env GRIST_DOCS_S3_BUCKET=grist-docs-test \
--env GRIST_DOCS_S3_PREFIX=self-managed \
-v $HOME/.aws:/root/.aws -it gristlabs/grist-ee
```
This will start a version of Grist that is like `grist-core` but with
S3 snapshots enabled. To release this code to `grist-core`, it would
just need to move from `ext/app` to `app` within core.
I tried a lot of ways of organizing self-managed Grist, and this was
what made me happiest. There are a lot of trade-offs, but here is what
I was looking for:
* Only OSS-code in grist-core. Adding mixed-license material there
feels unfair to people already working with the repo. That said,
a possible future is to move away from our private monorepo to
a public mixed-licence repo, which could have the same relationship
with grist-core as the monorepo has.
* Minimal differences between self-managed builds and one of our
existing builds, ideally hewing as close to grist-core as possible
for ease of documentation, debugging, and maintenance.
* Ideally, docker builds without copying files around (the new
`--build-context` functionality made that possible).
* Compatibility with monorepo build.
Expressing dependencies of the extra code in `ext` proved tricky to
do in a clean way. Yarn/npm fought me every step of the way - everything
related to optional dependencies was unsatisfactory in some respect.
Yarn2 is flexible but smells like it might be overreach. In the end,
organizing to install non-core dependencies one directory up from the
main build was a good simple trick that saved my bacon.
This diff gets us to the point of building `grist-ee` images conveniently,
but there isn't a public repo people can go look at to see its source. This
could be generated by taking `grist-core`, adding the `ext` directory
to it, and pushing to a distinct repository. I'm not in a hurry to do that,
since a PR to that repo would be hard to sync with our monorepo and
`grist-core`. Also, we don't have any licensing text ready for the `ext`
directory. So leaving that for future work.
Test Plan: manual
Reviewers: georgegevoian, alexmojaki
Reviewed By: georgegevoian, alexmojaki
Differential Revision: https://phab.getgrist.com/D3415
Summary:
Use openpyxl instead of messytables (which used xlrd internally) in import_xls.py.
Skip empty rows since excel files can easily contain huge numbers of them.
Drop support for xls files (which openpyxl doesn't support) in favour of the newer xlsx format.
Fix some details relating to python virtualenvs and dependencies, as Jenkins was failing to find new Python dependencies.
Test Plan: Mostly relying on existing tests. Updated various tests which referred to xls files instead of xlsx. Added a Python test for skipping empty rows.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3406
Summary:
Adds a new implementation of the interface ExternalStorage that works for Azure Blob Storage as an alternative to S3, for a specific self-hosting case.
Tweaks HostedStorageManager and ICreate to allow configuring different core implementations of ExternalStorage.
Followup tasks:
- Make this code available to self hosters, possibly by making it open source.
- Add an env var or other config option to specify the preferred type of storage. Currently using the var `AZURE_STORAGE_CONNECTION_STRING` to know how to connect to Azure when requested, but that choice still only lives in test code.
Test Plan: Generalized HostedStorageManager and ExternalStorage tests to test the new AzureExternalStorage alongside S3ExternalStorage. The HostedStorageManager tests also now test the 'cached' in-memory test storage in a way that's closer to the real storage methods.
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3413
Summary:
Destroy function in TableOperations was throwing error when invoked with a single
record id instead of an array. Now it returns a void type.
Also changing mapColumns function signature as it doesn't require options for a default
behavior.
Test Plan: Updated tests.
Reviewers: alexmojaki
Reviewed By: alexmojaki
Differential Revision: https://phab.getgrist.com/D3404
Summary:
Adds attachment and data size to the usage section of
the raw data page. Also makes in-document usage banners
update as user actions are applied, causing them to be
hidden/shown or updated based on the current state of
the document.
Test Plan: Browser tests.
Reviewers: jarek
Reviewed By: jarek
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3395
Summary:
- Better focus on the widget title
- Adding columns only to the current view section
- New popup with options when user wants to delete a page
- New dialog to enter table name
- New table as a widget doesn't create a separate page
- Removing a table doesn't remove the primary view
Test Plan: Updated and new tests
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3410
Summary:
Skipping columns during incremental imports wasn't working for certain
column types, such as numeric columns. The column's default value was
being used instead (e.g. 0), overwriting values in the destination
table.
Test Plan: Browser tests.
Reviewers: jarek
Reviewed By: jarek
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3402
Summary: Adds a special user action `UpdateCurrentTime` which invalidates an internal engine dependency node that doesn't belong to any table but is 'used' by the `NOW()` function. Applies the action automatically every hour.
Test Plan: Added a Python test for the user action. Tested the interval periodically applying the action manually: {F43312}
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3389
Summary: Allow exceeding the daily API usage limit for a doc based on additional allocations for the current hour and minute. See the doc comment on getDocApiUsageKeysToIncr for details. This means that up to 5 redis keys may be relevant at a time for a single document.
Test Plan: Updated and expanded 'Daily API Limit' tests.
Reviewers: dsagal
Reviewed By: dsagal
Differential Revision: https://phab.getgrist.com/D3368
Summary:
Currently, Grist behind a reverse proxy will generate many
needless redirects via `http`, and can't be used with only
port 443. This diff centralizes generation of these redirects
and uses the protocol in APP_HOME_URL if it is set.
Test Plan:
manually tested by rebuilding grist-core and
doing a reverse proxy deployment that had no support for
port 80. Prior to this change, there are lots of problems;
after, the site works as expected.
Reviewers: jarek
Reviewed By: jarek
Differential Revision: https://phab.getgrist.com/D3400
Summary:
A new way for renaming tables.
- There is a new popup to rename section (where you can also rename the table)
- Renaming/Deleting page doesn't modify/delete the table.
- Renaming table can rename a page if the names match (and the page contains a section with that table).
- User can rename table in Raw Data UI in two ways - either on the listing or by using the section name popup
- As before, there is no way to change tableId - it is derived from a table name.
- When the section name is empty the table name is shown instead.
- White space for section name is allowed (to discuss) - so the user can just paste ' '.
- Empty name for a page is not allowed (but white space is).
- Some bugs related to deleting tables with attached summary tables (and with undoing this operation) were fixed (but not all of them yet).
Test Plan: Updated tests.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Subscribers: georgegevoian
Differential Revision: https://phab.getgrist.com/D3360
Summary:
Summary columns now have their own conditional rules,
which are not shared with sister columns.
Test Plan: New test
Reviewers: alexmojaki
Reviewed By: alexmojaki
Subscribers: dsagal
Differential Revision: https://phab.getgrist.com/D3388
Summary:
- Previously showed "UnboundLocalError". Now will show:
Import failed: Failed to parse Excel file.
Error: No tables found (1 empty tables skipped)
- Also fix logging for import code
Test Plan: Added a test case
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3396
Summary: InitNewDoc is essentially only used to generate initialDocSql, so it doesn't make sense to set the timezone and locale. They are always set when actually creating a new doc anyway. Discussed in https://grist.slack.com/archives/C0234CPPXPA/p1650312714217089.
Test Plan: this
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3394
Summary:
This also enables the new Usage section for all sites. Currently,
it shows metrics for document row count, but only if the user
has full document read access. Otherwise, a message about
insufficient access is shown.
Test Plan: Browser tests.
Reviewers: jarek
Reviewed By: jarek
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3377
Summary:
Call ActiveDoc.removeUnusedAttachments every hour using setInterval, and in ActiveDoc.shutdown (which also clears said interval).
Unrelated: small fix to my webhooks code which was creating a redis client on shutdown just to quit it.
Test Plan:
Tweaked DocApi test to remove expired attachments by force-reloading the doc, so that it removes them during shutdown. Extracted a new testing endpoint /verifyFiles to support this test (previously running that code only happened with `/removeUnused?verifyfiles=1`).
Tested the setInterval part manually.
Reviewers: paulfitz, dsagal
Reviewed By: paulfitz
Subscribers: dsagal
Differential Revision: https://phab.getgrist.com/D3387
Summary: Mark actions adding attachment metadata as 'internal' (not part of undo stack) which previously was only for the Calculate action.
Test Plan: Extended nbrowser attachments test
Reviewers: dsagal
Reviewed By: dsagal
Subscribers: paulfitz
Differential Revision: https://phab.getgrist.com/D3380
Summary:
This avoids an extra database query to look up the user's current
name, by capturing it at the moment their user id is queried.
Test Plan: existing test for user.Name changes continues to pass
Reviewers: dsagal
Reviewed By: dsagal
Differential Revision: https://phab.getgrist.com/D3381
Summary:
- Add a new parameter `Features.baseMaxAttachmentsBytesPerDocument` and set it to 1GB for the free team product.
- Add a method to DocStorage to calculate the total size of existing and used attachments.
- Add a migration to DocStorage adding an index to make the query in the above method fast.
- Check in ActiveDoc if uploading attachment(s) would exceed the product limit on that document.
Test Plan: Added test in `limits.ts` testing enforcement of the attachment limit.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3374
Summary: The name of a user for actions made using a websocket until now could be inconsistent with that seen by other means. This draws the name from the database, rather than from session information that may have been cached from an identity provider.
Test Plan: added test
Reviewers: dsagal
Reviewed By: dsagal
Subscribers: dsagal
Differential Revision: https://phab.getgrist.com/D3379
Summary: Adds methods to delete metadata rows based on timeDeleted. The flag expiredOnly determines if it only deletes attachments that were soft-deleted 7 days ago, or just all soft-deleted rows. Then any actual file data that doesn't have matching metadata is deleted.
Test Plan: DocApi test
Reviewers: paulfitz
Reviewed By: paulfitz
Subscribers: dsagal
Differential Revision: https://phab.getgrist.com/D3364
Summary:
- Include docId when available for client-side error reporting
- Distinguish sandbox crashes from forced exits
Test Plan: Tested manually
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3373
Summary:
This also updates Authorizer to link the authSubject
to Grist users if not previously linked. Linked subjects
are now used as the username for password-based logins,
instead of emails, which remain as a fallback.
Test Plan: Existing tests, and tested login flows manually.
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3356
Summary:
Builds on https://phab.getgrist.com/D3352
Add DocStorage.scanAttachmentsForUsageChanges to do fancy JSON query to find all attachment metadata rows whose soft deletion status needs updating.
Add ActiveDoc.updateUsedAttachments which uses the above and then applies the appropriate user action if needed to soft delete/undelete metadata rows.
Add endpoint in DocApi calling ActiveDoc method.
Test Plan: Added DocApi test
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3357
Summary: Adds a migration in preparation for future work on tracking and deleting attachments. This includes a `_grist_Attachments.timeDeleted` column which isn't used yet, and changing the storage format of user columns of type `Attachments`. DocStorage now treats Attachments like RefList in general (since they use JSON), which also prompted a tiny bit of refactoring.
Test Plan: Added a migration test case showing the change in format.
Reviewers: dsagal
Reviewed By: dsagal
Differential Revision: https://phab.getgrist.com/D3352
Summary:
The logic for calculating redirects wasn't quite right for Grist
configured to use a single domain, with teams encoded in the path.
This fixes it.
Test Plan: tested manually with docker compose and /etc/hosts
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3359
Summary:
Based on a discussion in https://grist.quip.com/ZvttAyjLCI7H#eLVADAbyipu
Without this change, the only difference between Enterprise and Pro plans regarding snapshots is 5 extra snapshots, one per year.
Test Plan: none
Reviewers: dsagal
Reviewed By: dsagal
Subscribers: paulfitz
Differential Revision: https://phab.getgrist.com/D3349
Summary:
This fleshes out header-based authentication a little more to
work with traefik-forward-auth.
Test Plan: manually tested
Reviewers: georgegevoian
Reviewed By: georgegevoian
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3348
Summary:
Adds a new Grist login page to the login app, and replaces the
server-side Cognito Google Sign-In flow with Google's own OAuth flow.
Test Plan: Browser and server tests.
Reviewers: jarek
Reviewed By: jarek
Differential Revision: https://phab.getgrist.com/D3332
Summary:
Track 'data size' in ActiveDoc alongside row count. Measure it at most once every 5 minutes after each change as before, or after every change when it becomes high enough to matter.
A document is now considered to be approaching/exceeding 'the data limit' if either the data size or the row count is approaching/exceeding its own limit.
Unrelated: tweaked teamFreeFeatures.snapshotWindow based on Quip comments
Test Plan: Tested manually that data size is now logged after every change once it gets high enough, but only if the row limit isn't also too high. Still too early for automated tests.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3341
Summary:
Builds upon https://phab.getgrist.com/D3328
- Add HomeDB column `Document.gracePeriodStart`
- When the row count moves above the limit, set it to the current date. When it moves below, set it to null.
- Add DataLimitStatus type indicating if the document is approaching the limit, is in a grace period, or is in delete only mode if the grace period started at least 14 days ago. Compute it in ActiveDoc and send it to client when opening.
- Only allow certain user actions when in delete-only mode.
Follow-up tasks related to this diff:
- When DataLimitStatus in the client is non-empty, show a banner to the appropriate users.
- Only send DataLimitStatus to users with the appropriate access. There's no risk landing this now since real users will only see null until free team sites are released.
- Update DataLimitStatus immediately in the client when it changes, e.g. when user actions are applied or the product is changed. Right now it's only sent when the document loads.
- Update row limit, grace period start, and data limit status in ActiveDoc when the product changes, i.e. the user upgrades/downgrades.
- Account for data size when computing data limit status, not just row counts.
See also the tasks mentioned in https://phab.getgrist.com/D3331
Test Plan: Extended FreeTeam nbrowser test, testing the 4 statuses.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3331
Summary:
This adds support for gvisor sandboxing in core. When Grist is run outside of a container, regular gvisor can be used (if on linux), and will run in rootless mode. When Grist is run inside a container, docker's default policy is insufficient for running gvisor, so a fork of gvisor is used that has less defence-in-depth but can run without privileges.
Sandboxing is automatically turned on in the Grist core container. It is not turned on automatically when built from source, since it is operating-system dependent.
This diff may break a complex method of testing Grist with gvisor on macs that I may have been the only person using. If anyone complains I'll find time on a mac to fix it :)
This diff includes a small "easter egg" to force document loads, primarily intended for developer use.
Test Plan: existing tests pass; checked that core and saas docker builds function
Reviewers: alexmojaki
Reviewed By: alexmojaki
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3333
Summary:
This shuffles some server tests to make them available in grist-core,
and adds a test for the `GRIST_PROXY_AUTH_HEADER` feature added in
https://github.com/gristlabs/grist-core/pull/165
It includes a fix for a header normalization issue for websocket connections.
Test Plan: added test
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3326
Summary:
Firstly I just wanted some more consistency and less repetition in places where Documents are retrieved from the DB, so it's more obvious when code differs from the norm. Main changes for that part:
- Let HomeDBManager accept a `Request` directly and convert it to a `Scope`, and use this in a few places.
- `getScope` tries `req.docAuth.docId` if `req.params` doesn't have a docId.
I also refactored how `_createActiveDoc` gets the document URL, separating out getting the document from getting a URL for it. This is because I want to use that document object in a future diff, but I also just find it cleaner. Notable changes for that:
- Extracted a new method `HomeDBManager.getRawDocById` as an alternative to `getDoc` that's explicitly for when you only have a document ID.
- Removed the interface method `GristServer.getDocUrl` and its two implementations because it wasn't used elsewhere and it didn't really add anything on top of getting a doc (now done by `getRawDocById`) and `getResourceUrl`.
- Between `cachedDoc` and `getRawDocById` (which represent previously existing code paths) also try `getDoc(getScope(docSession.req))`, which is new, because it seems better to only `getRawDocById` as a last resort.
Test Plan: Existing tests
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3328
Summary:
Adding conditional formatting rules feature.
Each column can have multiple styling rules which are applied in order
when evaluated to a truthy value.
- The creator panel has a new section: Cell Style
- New user action AddEmptyRule for adding an empty rule
- New columns in _grist_Table_columns and fields
A new color picker will be introduced in a follow-up diff (as it is also
used in choice/choice list/filters).
Design document:
https://grist.quip.com/FVzfAgoO5xOF/Conditional-Formatting-Implementation-Design
Test Plan: new tests
Reviewers: georgegevoian
Reviewed By: georgegevoian
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3282
Summary: Capture the stacktrace (via SandboxError) in `_pyCallWait` instead of `_onSandboxMsg` where it's always the same.
Test Plan:
Tested manually, found for example that the stacktrace in the logs changed from being rather useless:
```
at NSandbox._onSandboxMsg (/home/alex/work/grist/_build/core/app/server/lib/NSandbox.js:229:36)
at /home/alex/work/grist/_build/core/app/server/lib/NSandbox.js:179:18
at Unmarshaller.parse (/home/alex/work/grist/_build/core/app/common/marshal.js:289:21)
at NSandbox._onSandboxData (/home/alex/work/grist/_build/core/app/server/lib/NSandbox.js:174:28)
at Socket.<anonymous> (/home/alex/work/grist/_build/core/app/server/lib/NSandbox.js:63:59)
at Socket.emit (events.js:315:20)
at Socket.EventEmitter.emit (domain.js:467:12)
at addChunk (internal/streams/readable.js:309:12)
at readableAddChunk (internal/streams/readable.js:284:9)
at Socket.Readable.push (internal/streams/readable.js:223:10)
at Pipe.onStreamRead (internal/stream_base_commons.js:188:23)
```
to being somewhat more helpful:
```
at NSandbox._pyCallWait (/home/alex/work/grist/_build/core/app/server/lib/NSandbox.js:134:19)
at processTicksAndRejections (internal/process/task_queues.js:93:5)
at async ActiveDoc.applyActionsToDataEngine (/home/alex/work/grist/_build/core/app/server/lib/ActiveDoc.js:1080:39)
at async Sharing._applyActionsToDataEngine (/home/alex/work/grist/_build/core/app/server/lib/Sharing.js:325:37)
```
Reviewers: paulfitz
Reviewed By: paulfitz
Subscribers: paulfitz
Differential Revision: https://phab.getgrist.com/D3329
Summary:
Keep track of the number of API requests made for this document today in redis. Uses local caches of the count and the document so that usually requests can proceed without waiting for redis or the database.
Moved the free standing function apiThrottle to become a method to avoid adding another layer of request handler callbacks.
Test Plan: Added a DocApi test
Reviewers: paulfitz
Reviewed By: paulfitz
Subscribers: dsagal
Differential Revision: https://phab.getgrist.com/D3327
Summary:
- Add a method `getSnapshotWindow` to `IInventory` and `DocSnapshotInventory`. It returns a `SnapshotWindow`, which represents a duration of time for which we keep backups for a particular document.
- `DocSnapshotPruner` calls this method and passes the window to `shouldKeepSnapshots` to determine which document versions have fallen outside the window and should be pruned.
- The implementation passed to `DocSnapshotInventory` uses a new method `getDocProduct` in `HomeDBManager` which directly returns the `Product` associated with a document, given only the document ID. Other methods in `HomeDBManager` require passing more information, especially about a user, but `DocSnapshotPruner` only knows about document IDs.
Test Plan: Added a test for `getDocProduct` and a test for `DocSnapshotPruner` where `getSnapshotWindow` is specified.
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3322
Summary: This makes an equivalent of the /records REST endpoint available within custom widgets. For simple operations, it is compatible with https://github.com/airtable/airtable.js/. About half of the diff is refactoring code from DocApi that implements /records using applyUserActions, to make that code available in the plugin api.
Test Plan: added tests
Reviewers: alexmojaki
Reviewed By: alexmojaki
Differential Revision: https://phab.getgrist.com/D3320
Summary:
Add rowCount returned from sandbox when applying user actions to ActionGroup which is broadcast to clients.
Add rowCount to ActiveDoc and update it after applying user actions.
Add rowCount to OpenLocalDocResult using ActiveDoc value, to show when a client opens a doc before any user actions happen.
Add rowCount observable to DocPageModel which is set when the doc is opened and when action groups are received.
Add crude UI (commented out) in Tool.ts showing the row count and the limit in AppModel.currentFeatures. The actual UI doesn't have a place to go yet.
Followup tasks:
- Real, pretty UI
- Counts per table
- Keep count(s) secret from users with limited access?
- Data size indicator?
- Banner when close to or above limit
- Measure row counts outside of sandbox to avoid spoofing with formula
- Handle changes to the limit when the plan is changed or extra rows are purchased
Test Plan: Tested UI manually, including with free team site, opening a fresh doc, opening an initialised doc, adding rows, undoing, and changes from another tab. Automated tests seem like they should wait for a proper UI.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3318
Summary:
This calls sqlite3_limit(SQLITE_LIMIT_ATTACHED, 0) so that
if ever an `ATTACH` were snuck into an SQL query, it would be denied.
The limit needs to be waived when calling VACUUM since the implementation
of VACUUM uses ATTACH.
Test Plan: added test; existing tests should pass
Reviewers: alexmojaki
Reviewed By: alexmojaki
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3316
Summary:
Adds Google Tag Manager snippet to all login pages, and a new user
preference, recordSignUpEvent, that's set to true on first sign-in. The
client now checks for this preference, and if true, dynamically loads
Google Tag Manager to record a sign-up event. Afterwards, it removes
the preference.
Test Plan: Tested manually.
Reviewers: dsagal
Reviewed By: dsagal
Subscribers: dsagal
Differential Revision: https://phab.getgrist.com/D3319
Summary:
If cancel was clicked while a transform section was still being
generated in the Importer, an error was thrown. This refactors
the cancelImportFiles API action to take in the file upload id
in place of the entire DataSourceTransformed parameter, which
contains other values that are irrelevant to canceling. One of those
values, the transform section id, was causing the error to be thrown
since it was momentarily null.
Test Plan: Tested manually.
Reviewers: alexmojaki
Reviewed By: alexmojaki
Differential Revision: https://phab.getgrist.com/D3317
Summary: As suggested by @dsagal in https://phab.getgrist.com/D3277#inline-36801, change to query `SUM(pgsize - unused)` instead of `SUM(pgsize)` to measure actual data size more accurately. Technically this doesn't reflect the database file size as accurately, but it should reflect sandbox memory usage better, and more importantly it should allow users to see data size decreasing when they delete stuff.
Test Plan: Tested manually by adding rows to a doc and looking at the logs. The data size is smaller and changes more granularly.
Reviewers: dsagal, paulfitz
Reviewed By: paulfitz
Subscribers: dsagal
Differential Revision: https://phab.getgrist.com/D3313
Summary:
Custom widget into page is served from a homeUrl instead
of untrusted URL, which might be not used in grist-core.
Test Plan: manual test
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3307
Summary:
Importing a .grist document is implemented in a somewhat clunky way, in a multi-worker setup.
* First a random worker receives the upload, and updates Grist's various stores appropriately (database, redis, s3).
* Then a random worker is assigned to serve the document.
If the worker serving the document fails, there is a chance the it will end up assigned to the worker that handled its upload. Currently the worker will misbehave in this case. This diff:
* Ports a multi-worker test from test/home to run in test/s3, and adds a test simulating a bad scenario seen in the wild.
* Fixes persistence of any existing document checksum in redis when a worker is assigned.
* Adds a check when assigned a document to serve, and finding that document already cached locally. It isn't safe to rely only on the document checksum in redis, since that may have expired.
* Explicitly claims the document on the uploading worker, so this situation becomes even less likely to arise.
Test Plan: added test
Reviewers: dsagal
Reviewed By: dsagal
Subscribers: dsagal
Differential Revision: https://phab.getgrist.com/D3305
Summary:
- Removed string parsing and some type guessing code from parse_data.py. That logic is now implicitly done by ValueGuesser by leaving the initial column type as Any. parse_data.py mostly comes into play when importing files (e.g. Excel) containing values that already have types, i.e. numbers and dates.
- 0s and 1s are treated as numbers instead of booleans to keep imports lossless.
- Removed dateguess.py and test_dateguess.py.
- Changed what `guessDateFormat` does when multiple date formats work equally well for the given data, in order to be consistent with the old dateguess.py.
- Columns containing numbers are now always imported as Numeric, never Int.
- Removed `NullIfEmptyParser` because it was interfering with the new system. Its purpose was to avoid pointlessly changing a column from Any to Text when no actual data was inserted. A different solution to that problem was already added to `_ensure_column_accepts_data` in the data engine in a recent related diff.
Test Plan:
- Added 2 `nbrowser/Importer2` tests.
- Updated various existing tests.
- Extended testing of `guessDateFormat`. Added `guessDateFormats` to show how ambiguous dates are handled internally.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3302