Follow-up of #994. This PR revises the session ID generation logic to improve security in the absence of a secure session secret. It also adds a section in the admin panel "security" section to nag system admins when GRIST_SESSION_SECRET is not set.
Following is an excerpt from internal conversation.
TL;DR: Grist's current implementation generates semi-secure session IDs and uses a publicly known default signing key to sign them when the environment variable GRIST_SESSION_SECRET is not set. This PR generates cryptographically secure session IDs to dismiss security concerns around an insecure signing key, and encourages system admins to configure their own signing key anyway.
> The session secret is required by expressjs/session to sign its session IDs. It's designed as an extra protection against session hijacking by randomly guessing session IDs and hitting a valid one. While it is easy to encourage users to set a distinct session secret, this is unnecessary if session IDs are generated in a cryptographically secure way. As of now Grist uses version 4 UUIDs as session IDs (see app/server/lib/gristSessions.ts - it uses shortUUID.generate which invokes uuid.v4 under the hood). These contain 122 bits of entropy, technically insufficient to be considered cryptographically secure. In practice, this is never considered a real vulnerability. To compare, RSA2048 is still very commonly used in web servers, yet it only has 112 bits of security (>=128 bits = "secure", rule of thumb in cryptography). But for peace of mind I propose using crypto.getRandomValues to generate real 128-bit random values. This should render session ID signing unnecessary and hence dismiss security concerns around an insecure signing key.
Summary:
- /status accepts new optional query parameters: db=1, redis=1, and timeout=<ms> (defaults to 10_000).
- These verify that the server can make trivial calls to DB/Redis, and that they return within the timeout.
- New HealthCheck tests simulates DB and Redis problems.
- Added resilience to Redis reconnects (helped by a test case that simulates disconnects)
- When closing Redis-based session store, disconnect from Redis (to avoid hanging tests)
Some associated test reorg:
- Move stripeTools out of test/nbrowser, and remove an unnecessary dependency,
to avoid starting up browser for gen-server tests.
- Move TcpForwarder to its own file, to use in the new test.
Test Plan: Added a new HealthCheck test that simulates DB and Redis problems.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D4054
Summary:
Building:
- Builds no longer wait for tsc for either client, server, or test targets. All use esbuild which is very fast.
- Build still runs tsc, but only to report errors. This may be turned off with `SKIP_TSC=1` env var.
- Grist-core continues to build using tsc.
- Esbuild requires ES6 module semantics. Typescript's esModuleInterop is turned
on, so that tsc accepts and enforces correct usage.
- Client-side code is watched and bundled by webpack as before (using esbuild-loader)
Code changes:
- Imports must now follow ES6 semantics: `import * as X from ...` produces a
module object; to import functions or class instances, use `import X from ...`.
- Everything is now built with isolatedModules flag. Some exports were updated for it.
Packages:
- Upgraded browserify dependency, and related packages (used for the distribution-building step).
- Building the distribution now uses esbuild's minification. babel-minify is no longer used.
Test Plan: Should have no behavior changes, existing tests should pass, and docker image should build too.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Subscribers: alexmojaki
Differential Revision: https://phab.getgrist.com/D3506
Summary:
I missed committing a file that is important for editing files comfortably in the ext directory in an IDE. This diff:
* Adds tsconfig-base-ext.json - that was the only intended change
* Unrelated: Forces all creation of connections to the home db through a new `getOrCreateConnection` method which changes the `busy_timeout` if using Sqlite. This was an attempt to fix random "database is locked" test failures. I believe multiple connections to the home db as an sqlite file do not happen in self-hosted Grist (where there is a single node process) or in our SaaS (where the database is in postgres). It does affect Grist started using `devServerMain.ts` (where multiple processes accessing same database are started) or various test configurations when extra database connections are opened.
* Unrelated: I added a `busy_timeout` for session storage, when it uses Sqlite. Again, I don't believe this affects self-hosted Grist or our SaaS.
* Tweaked a `BillingDiscount` test that looked perhaps vulnerable to a stripe request stalling.
I can't be sure my tweaks actually help, since I didn't succeed in replicating the failures. Update: looks like the "locked" error can still happen :(
Test Plan: manual
Reviewers: jarek
Reviewed By: jarek
Subscribers: jarek
Differential Revision: https://phab.getgrist.com/D3450
Summary:
- Update cookie module, to support modern sameSite settings
- Add a new cookie, grist_sid_status with less-sensitive value, to let less-trusted subdomains know if user is signed in
- The new cookie is kept in-sync with the session cookie.
- For a user signed in once, allow auto-signin is appropriate.
- For a user signed in with multiple accounts, show a page to select which account to use.
- Move css stylings for rendering users to a separate module.
Test Plan: Added a test case with a simulated Discourse page to test redirects and account-selection page.
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3047
Summary:
* Remove adjustSession hack, interfering with loading docs under saml.
* Allow the anonymous user to receive an empty list of workspaces for
the merged org.
* Behave better on first page load when org is in path - this used to
fail because of lack of cookie. This is very visible in grist-core,
as a failure to load localhost:8484 on first visit.
* Mark cookie explicitly as SameSite=Lax to remove a warning in firefox.
* Make errorPages available in grist-core.
This changes the default behavior of grist-core to now start off in
anonymous mode, with an explicit sign-in step available. If SAML is not configured,
the sign-in operation will unconditionally sign the user in as a default
user, without any password check or other security. The user email is
taken from GRIST_DEFAULT_EMAIL if set. This is a significant change, but
makes anonymous mode available in grist-core (which is convenient
for testing) and makes behavior with and without SAML much more consistent.
Test Plan: updated test; manual (time to start adding grist-core tests though!)
Reviewers: dsagal
Reviewed By: dsagal
Differential Revision: https://phab.getgrist.com/D2980
Summary:
We used tslint earlier, and on switching to eslint, some rules were not
transfered. This moves more rules over, for consistent conventions or helpful
warnings.
- Name private members with a leading underscore.
- Prefer interface over a type alias.
- Use consistent spacing around ':' in type annotations.
- Use consistent spacing around braces of code blocks.
- Use semicolons consistently at the ends of statements.
- Use braces around even one-liner blocks, like conditionals and loops.
- Warn about shadowed variables.
Test Plan: Fixed all new warnings. Should be no behavior changes in code.
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D2831
Summary:
In an emergency, we may want to serve certain documents with "old" workers as we fix problems. This diff adds some support for that.
* Creates duplicate task definitions and services for staging and production doc workers (called grist-docs-staging2 and grist-docs-prod2), pulling from distinct docker tags (staging2 and prod2). The services are set to have zero workers until we need them.
* These new workers are started with a new env variable `GRIST_WORKER_GROUP` set to `secondary`.
* The `GRIST_WORKER_GROUP` variable, if set, makes the worker available to documents in the named group, and only that group.
* An unauthenticated `/assign` endpoint is added to documents which, when POSTed to, checks that the doc is served by a worker in the desired group for that doc (as set manually in redis), and if not frees the doc up for reassignment. This makes it possible to move individual docs between workers without redeployments.
The bash scripts added are a record of how the task definitions + services were created. The services could just have been copied manually, but the task definitions will need to be updated whenever the definitions for the main doc workers are updated, so it is worth scripting that.
For example, if a certain document were to fail on a new deployment of Grist, but rolling back the full deployment wasn't practical:
* Set prod2 tag in docker to desired codebase for that document
* Set desired_count for grist-docs-prod2 service to non-zero
* Set doc-<docid>-group for that doc in redis to secondary
* Hit /api/docs/<docid>/assign to move the doc to grist-docs-prod2
(If the document needs to be reverted to a previous snapshot, that currently would need doing manually - could be made simpler, but not in scope of this diff).
Test Plan: added tests
Reviewers: dsagal
Reviewed By: dsagal
Differential Revision: https://phab.getgrist.com/D2649
Summary: This moves enough server material into core to run a home server. The data engine is not yet incorporated (though in manual testing it works when ported).
Test Plan: existing tests pass
Reviewers: dsagal
Reviewed By: dsagal
Differential Revision: https://phab.getgrist.com/D2552