(core) add GVISOR_LIMIT_MEMORY to cap memory available in sandbox

Summary:
This allows limiting the memory available to documents in the sandbox when gvisor is used. If the memory limit is exceeded, we offer to open the document in recovery mode. Recovery mode is tweaked to open documents with their tables in "ondemand" mode, which generally takes less memory and still allows rows to be deleted.
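
As a rough illustration of how a recovery flag can be remembered for a short time and combined with an explicit request, here is a minimal sketch. `TtlMap` and `resolveRecoveryMode` are hypothetical names introduced for this example. The real `MapWithTTL` in `app/common/AsyncCreate` may have a different interface, and the injectable clock exists only to make the sketch easy to exercise.

```typescript
// Hypothetical stand-in for MapWithTTL; not the actual Grist class.
class TtlMap<K, V> {
  private _entries = new Map<K, {value: V, expiresAt: number}>();
  private _ttlMs: number;
  private _now: () => number;

  // `now` is injectable so expiry can be simulated without real timers.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this._ttlMs = ttlMs;
    this._now = now;
  }

  public set(key: K, value: V): void {
    this._entries.set(key, {value, expiresAt: this._now() + this._ttlMs});
  }

  // Returns the stored value, or undefined once the entry has expired.
  public get(key: K): V | undefined {
    const entry = this._entries.get(key);
    if (!entry) { return undefined; }
    if (this._now() >= entry.expiresAt) {
      this._entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

// An explicit request (true or false) wins; only when it is undefined
// do we fall back to the remembered flag, mirroring the `??` in the diff.
function resolveRecoveryMode(
  requested: boolean | undefined,
  remembered: TtlMap<string, boolean>,
  docName: string,
): boolean | undefined {
  return requested ?? remembered.get(docName);
}
```

Using `??` rather than `||` matters here: an explicit `false` must override a remembered `true`, which `||` would not allow.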

The limit is on the size of the virtual address space available to the sandbox (`RLIMIT_AS`), which in practice appears to function as one would want, and is the only practical option. There is a documented `RLIMIT_RSS` limit that "specifies the limit (in bytes) of the process's resident set (the number of virtual pages resident in RAM)", but it is no longer enforced by the kernel (neither the host's nor gvisor's).
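
To make the idea of an address-space cap concrete, here is a small sketch of one way to apply `RLIMIT_AS` to a child process from TypeScript, by wrapping the command with the util-linux `prlimit` tool. This is not necessarily how this commit applies the limit. The helper names are hypothetical, and the sketch assumes `GVISOR_LIMIT_MEMORY` holds a plain byte count.

```typescript
// Hypothetical helper: parse a byte count from an environment-style string.
// Returns undefined when the value is unset, empty, or not a positive integer.
function parseMemoryLimit(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === '') { return undefined; }
  const bytes = Number(raw);
  if (!Number.isSafeInteger(bytes) || bytes <= 0) { return undefined; }
  return bytes;
}

// Hypothetical helper: prefix a command with `prlimit --as=<bytes>` so the
// child runs with its virtual address space (RLIMIT_AS) capped.
// With no limit, the command is returned unchanged.
function withAddressSpaceLimit(command: string[], limitBytes: number | undefined): string[] {
  if (limitBytes === undefined) { return command; }
  return ['prlimit', `--as=${limitBytes}`, ...command];
}
```

A caller might pass `parseMemoryLimit(process.env.GVISOR_LIMIT_MEMORY)` as the second argument and hand the resulting array to `child_process.spawn`. Because the cap applies to virtual address space rather than resident memory, a process can hit it before it is actually using that much RAM.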

When the sandbox runs out of memory, there are many ways it can fail. This diff catches all the ones I saw, but there could be more.

Test Plan: added tests

Reviewers: alexmojaki

Reviewed By: alexmojaki

Subscribers: alexmojaki

Differential Revision: https://phab.getgrist.com/D3398
Paul Fitzpatrick
2022-05-18 12:05:37 -04:00
parent 2fd8a34ff8
commit cf23a2d1ee
13 changed files with 145 additions and 36 deletions


@@ -6,7 +6,7 @@ import {EventEmitter} from 'events';
 import * as path from 'path';
 import {ApiError} from 'app/common/ApiError';
-import {mapSetOrClear} from 'app/common/AsyncCreate';
+import {mapSetOrClear, MapWithTTL} from 'app/common/AsyncCreate';
 import {BrowserSettings} from 'app/common/BrowserSettings';
 import {DocCreationInfo, DocEntry, DocListAPI, OpenDocMode, OpenLocalDocResult} from 'app/common/DocListAPI';
 import {FilteredDocUsageSummary} from 'app/common/DocUsage';
@@ -37,6 +37,10 @@ import noop = require('lodash/noop');
 // but is a bit of a burden under heavy traffic.
 export const DEFAULT_CACHE_TTL = 10000;
+// How long to remember that a document has been explicitly set in a
+// recovery mode.
+export const RECOVERY_CACHE_TTL = 30000;
 /**
  * DocManager keeps track of "active" Grist documents, i.e. those loaded
  * in-memory, with clients connected to them.
@@ -45,6 +49,8 @@ export class DocManager extends EventEmitter {
 // Maps docName to promise for ActiveDoc object. Most of the time the promise
 // will be long since resolved, with the resulting document cached.
 private _activeDocs: Map<string, Promise<ActiveDoc>> = new Map();
+// Remember recovery mode of documents.
+private _inRecovery = new MapWithTTL<string, boolean>(RECOVERY_CACHE_TTL);
 constructor(
 public readonly storageManager: IDocStorageManager,
@@ -55,6 +61,10 @@ export class DocManager extends EventEmitter {
 super();
 }
+public setRecovery(docId: string, recovery: boolean) {
+this._inRecovery.set(docId, recovery);
+}
 // attach a home database to the DocManager. During some tests, it
 // is awkward to have this set up at the point of construction.
 public testSetHomeDbManager(dbManager: HomeDBManager) {
@@ -459,7 +469,7 @@ export class DocManager extends EventEmitter {
 if (!this._activeDocs.has(docName)) {
 activeDoc = await mapSetOrClear(
 this._activeDocs, docName,
-this._createActiveDoc(docSession, docName, wantRecoveryMode)
+this._createActiveDoc(docSession, docName, wantRecoveryMode ?? this._inRecovery.get(docName))
 .then(newDoc => {
 // Propagate backupMade events from newly opened activeDocs (consolidate all to DocMan)
 newDoc.on('backupMade', (bakPath: string) => {