gristlabs_grist-core

mirror of https://github.com/gristlabs/grist-core.git synced 2024-10-27 20:44:07 +00:00

Author	SHA1	Message	Date
Paul Fitzpatrick	db34e1a6d9	(core) tweak throttling to work for gvisor/runsc Summary: Grist has, up to now, used a throttling mechanism that allows a sandbox free rein until it starts using above some threshold percentage of a cpu for some time - at that point, we start sending STOP and CONT signals on a duty cycle, with longer and longer STOPped periods until cpu usage is at a threshold. The general idea is to do short jobs quickly, while throttling long jobs (thus unfortunately making them even longer) in order to continue doing other short jobs quickly. The runsc sandbox is not a single process, there are in fact 5 per sandbox in our setup. Runsc can work with kvm or ptrace. Kvm is not available to us, so we use ptrace. With ptrace, there is one process that is the appropriate one to duty cycle, and another that needs to receive a signal in order to yield. This diff adds the necessary machinery. This is a conservative change, where I stick with our existing throttling mechanism and adapt it to the new sandbox. It would be reasonable to consider switching throttling. There's a lot the OS allows. We can set a quota for how much cpu a process can use within a given period, for example. However the overall behavior with that would be quite different to what we have, so feels like this would need more discussion. The implementation contains use of a linux utility `pgrep` since portability is not important (runsc is only available on linux) and there's no node api for enumerating children of a process. The diff contains some tweaks to `buildtools/contain.sh` to streamline experimenting with Grist and runsc on a mac. It is important for throttling that node and the sandbox processes are in the same process name space, if docker is in between them then some extra machinery is needed (a proxy throttler and a way to communicate with it) which I chose not to implement. Test Plan: added test; a lot of manual testing Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D3113	2021-11-04 17:23:43 -04:00
Paul Fitzpatrick	dd0f1be117	(core) get all tests working under python3/gvisor Summary: This verifies that all existing tests are capable of running under python3/gvisor, and fixes the small issues that came up. It does not yet activate python3 tests on all diffs, only diffs that specifically request them. * Adds a suffix in test names and output directories for tests run with PYTHON_VERSION=3, so that results of the same test run with and without the flag can be aggregated cleanly. * Adds support for checkpointing to the gvisor sandbox adapter. * Prepares a checkpoint made after grist python code has loaded in the gvisor sandbox. * Changes how `DOC_URL` is passed to the sandbox, since it can no longer be passed in as an environment variable when using checkpoints. * Uses the checkpoint to speed up tests using the gvisor sandbox, otherwise a lot of tests need more time (especially on mac under docker). * Directs jenkins to run all tests with python2 and python3 when a new file `buildtools/changelogs/python.txt` is touched (this diff counts as touching that file). * Tweaks miscellaneous tests - some needed fixes exposed by slightly different timing - a small number actually give different results in py3 (removal of `u` prefixes). - some needed a little more time The DOC_URL change is not the ultimate solution we want for DOC_URL. Eventually it should be a variable that gets updated, like the date perhaps. This is just a small pragmatic change to preserve existing behavior. Tests are run mindlessly as py3, and for some tests it won't change anything (e.g. if they do not use NSandbox). Tests are not run in parallel, doubling overall test time. Checkpoints could be useful in deployment, though this diff doesn't use them there. The application of checkpoints doesn't check for other configuration like 3-versus-5-pipe that we don't actually use. Python2 tests run using pynbox as always for now. The diff got sufficiently bulky that I didn't tackle running py3 on "regular" diffs in it. My preference, given that most tests don't appear to stress the python side of things, would be to make a selection of the tests that do and a few wild cards, and run those tests on both pythons rather then all of them. For diffs making a significant python change, I'd propose touching buildtools/changelogs/python.txt for full tests. But this is a conversation in progress. A total of 6886 tests ran on this diff. Test Plan: this is a step in preparing tests for py3 transition Reviewers: dsagal Reviewed By: dsagal Subscribers: dsagal Differential Revision: https://phab.getgrist.com/D3066	2021-10-18 17:44:15 -04:00
Paul Fitzpatrick	df318ad6b3	(core) add a mac-specific sandbox for development Summary: docker is slow on macs, so use native sandbox-exec by default for tests involving python3 on macs. Test Plan: updated test Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D3068	2021-10-07 14:14:25 -04:00
Paul Fitzpatrick	0fffe918c1	(core) don't garble document url in SELF_HYPERLINK on forks Summary: There was a bad regex processing the document url passed to the sandbox. Test Plan: added test Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D3048	2021-09-28 16:53:48 -04:00
Paul Fitzpatrick	e492dfdb22	(core) add experimental support for python3 in staging Summary: This adds `runsc` and `python3` to the grist-server images. For deployments with GRIST_EXPERIMENTAL_PLUGINS=1 (dev + staging but not prod) a hack is added to use `python3` under `runsc` for documents with a special title (`activate-python3-magic` or similar). This will simplify experiments on behavior of this configuration under realistic conditions. Hopefully, before landing this, I'll be able to switch to storing a python flag in a document options cell being added by @georgegevoian in a parallel diff, since using the doc title is super hacky :-). Test Plan: tested manually on worker built locally Reviewers: dsagal, alexmojaki Reviewed By: dsagal, alexmojaki Subscribers: georgegevoian Differential Revision: https://phab.getgrist.com/D2998	2021-08-26 09:39:26 -04:00
Alex Hall	4d526da58f	(core) Move file import plugins into core/sandbox/grist Summary: Move all the plugins python code into the main folder with the core code. Register file importing functions in the same main.py entrypoint as the data engine. Remove options relating to different entrypoints and code directories. The only remaining plugin-specific option in NSandbox is the import directory/mount, i.e. where files to be parsed are placed. Test Plan: this Reviewers: paulfitz Reviewed By: paulfitz Subscribers: dsagal Differential Revision: https://phab.getgrist.com/D2965	2021-08-09 18:37:14 +02:00
Paul Fitzpatrick	750c78763e	(core) add python2 to gvisor Dockerfile, for use in making comparisons Summary: This adds python2 to the gvisor sandbox image. It can be used instead of the default python3 by setting PYTHON_VERSION to 2 (or calling run.py with python2). This is useful for making side-by-side comparisons with code running python3. Test Plan: manual Reviewers: alexmojaki Reviewed By: alexmojaki Differential Revision: https://phab.getgrist.com/D2957	2021-08-03 17:51:31 -04:00
Paul Fitzpatrick	bb8cb2593d	(core) support python3 in grist-core, and running engine via docker and/or gvisor Summary: * Moves essential plugins to grist-core, so that basic imports (e.g. csv) work. * Adds support for a `GRIST_SANDBOX_FLAVOR` flag that can systematically override how the data engine is run. - `GRIST_SANDBOX_FLAVOR=pynbox` is "classic" nacl-based sandbox. - `GRIST_SANDBOX_FLAVOR=docker` runs engines in individual docker containers. It requires an image specified in `sandbox/docker` (alternative images can be named with `GRIST_SANDBOX` flag - need to contain python and engine requirements). It is a simple reference implementation for sandboxing. - `GRIST_SANDBOX_FLAVOR=unsandboxed` runs whatever local version of python is specified by a `GRIST_SANDBOX` flag directly, with no sandboxing. Engine requirements must be installed, so an absolute path to a python executable in a virtualenv is easiest to manage. - `GRIST_SANDBOX_FLAVOR=gvisor` runs the data engine via gvisor's runsc. Experimental, with implementation not included in grist-core. Since gvisor runs on Linux only, this flavor supports wrapping the sandboxes in a single shared docker container. * Tweaks some recent express query parameter code to work in grist-core, which has a slightly different version of express (smoke test doesn't catch this since in Jenkins core is built within a workspace that has node_modules, and wires get crossed - in a dev environment the problem on master can be seen by doing `buildtools/build_core.sh /tmp/any_path_outside_grist`). The new sandbox options do not have tests yet, nor does this they change the behavior of grist servers today. They are there to clean up and consolidate a collection of patches I've been using that were getting cumbersome, and make it easier to run experiments. I haven't looked closely at imports beyond core. Test Plan: tested manually against regular grist and grist-core, including imports Reviewers: alexmojaki, dsagal Reviewed By: alexmojaki Differential Revision: https://phab.getgrist.com/D2942	2021-07-28 09:02:32 -04:00
Paul Fitzpatrick	4222f1ed32	(core) communicate with sandbox via standard pipes Summary: This switches to using stdin/stdout for RPC calls to the sandbox, rather than specially allocated side channels. Plain text error information remains on stderr. The motivation for the change is to simplify use of sandboxes, some of which support extra file descriptors and some of which don't. The new style of communication is made the default, but I'm not committed to this, just that it be easy to switch to if needed. It is possible I'll need to switch the communication method again in the near future. One reason not to make this default would be windows support, which is likely broken since stdin/stdout are by default in text mode. Test Plan: existing tests pass Reviewers: dsagal, alexmojaki Reviewed By: dsagal, alexmojaki Differential Revision: https://phab.getgrist.com/D2897	2021-07-12 06:45:47 -04:00
Alex Hall	84ddbc448b	(core) Add test_replay for easily replaying data sent to the sandbox purely within python Summary: Run JS with a value for SANDBOX_BUFFERS_DIR, then run test_replay in python with the same value to replay just the python code. See test_replay.py for more info. Test Plan: Record some data, e.g. `SANDBOX_BUFFERS_DIR=manual npm start` or `SANDBOX_BUFFERS_DIR=server ./test/testrun.sh server`. Then run `SANDBOX_BUFFERS_DIR=server python -m unittest test_replay` from within `core/sandbox/grist` to replay the input from the JS. Sample of the output will look like this: ``` Checking /tmp/sandbox_buffers/server/2021-06-16T15:13:59.958Z True Checking /tmp/sandbox_buffers/server/2021-06-16T15:16:37.170Z True Checking /tmp/sandbox_buffers/server/2021-06-16T15:14:22.378Z True ``` Reviewers: paulfitz, dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2866	2021-06-30 16:56:09 +02:00
Alex Hall	305b133c59	(core) Remaining Python 3 compatibility changes Summary: Biggest change is turning everything to unicode Test Plan: The tests Reviewers: dsagal, paulfitz Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2875	2021-06-25 12:00:58 +02:00
Alex Hall	8a940676e9	(core) Generic tools for recording pycalls, deterministic mode. Summary: Replaces https://phab.getgrist.com/D2854 Refactoring of NSandbox: - Simplify arguments to NSandbox.spawn. Only half the arguments were used depending on the flavour, adding a layer of confusion. - Ensure the same environment variables are passed to both flavours of sandbox - Simplify passing down environment variables. Implement deterministic mode with libfaketime and a seeded random instance. - Include static prebuilt libfaketime.so.1, may need another solution in future for other platforms. Recording pycalls: - Add script recordDocumentPyCalls.js to open a single document outside of tests. - Refactor out recordPyCalls.ts to support various uses. - Add afterEach hook to save all pycalls from server tests under $PYCALLS_DIR - Make docTools usable without mocha. - Add useLocalDoc and loadLocalDoc for loading non-fixture documents Test Plan: Made a document with formulas NOW() and UUID() Compare two document openings in normal mode: diff <(test/recordDocumentPyCalls.js samples/d4W6NrzCMNVSVD6nWgNrGC.grist /dev/stdout) \ <(test/recordDocumentPyCalls.js samples/d4W6NrzCMNVSVD6nWgNrGC.grist /dev/stdout) Output: < 1623407499.58132, --- > 1623407499.60376, 1195c1195 < "B": "bd2487f6-63c9-4f02-bbbc-5c0d674a2dc6" --- > "B": "22e1a4fd-297f-4b86-91a2-bc42cc6da4b2" `export DETERMINISTIC_MODE=1` and repeat. diff is empty! Reviewers: paulfitz Reviewed By: paulfitz Differential Revision: https://phab.getgrist.com/D2857	2021-06-15 20:58:05 +02:00
Paul Fitzpatrick	6d2e8378cd	(core) fix some tests for node v14 Test Plan: existing tests pass Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2814	2021-05-12 22:49:53 -04:00
Paul Fitzpatrick	9f234b758d	(core) freshen grist-core build Summary: * adds a smoke test to grist-core * fixes a problem with highlight.js failing to load correctly * skips survey for default user * freshens docker build Utility files in test/nbrowser are moved to core/test/nbrowser, so that gristUtils are available there. This increased the apparent size of the diff as "./" import paths needed replacing with "test/nbrowser/" paths. The utility files are untouched, except for the code to start a server - it now has a small grist-core specific conditional in it. Test Plan: adds test Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2768	2021-04-03 09:41:06 -04:00
Paul Fitzpatrick	0c5f7cf0a7	(core) add SELF_HYPERLINK() function for generating links to the current document Summary: * Adds a `SELF_HYPERLINK()` python function, with optional keyword arguments to set a label, the page, and link parameters. * Adds a `UUID()` python function, since using python's uuid.uuidv4 hits a problem accessing /dev/urandom in the sandbox. UUID makes no particular quality claims since it doesn't use an audited implementation. A difficult to guess code is convenient for some use cases that `SELF_HYPERLINK()` enables. The canonical URL for a document is mutable, but older versions generally forward. So for implementation simplicity the document url is passed it on sandbox creation and remains fixed throughout the lifetime of the sandbox. This could and should be improved in future. The URL is passed into the sandbox as a `DOC_URL` environment variable. The code for creating the URL is factored out of `Notifier.ts`. Since the url is a function of the organization as well as the document, some rejiggering is needed to make that information available to DocManager. On document imports, the new document is registered in the database slightly earlier now, in order to keep the procedure for constructing the URL in different starting conditions more homogeneous. Test Plan: updated test Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2759	2021-03-18 19:37:07 -04:00
Paul Fitzpatrick	b82eec714a	(core) move data engine code to core Summary: this moves sandbox/grist to core, and adds a requirements.txt file for reconstructing the content of sandbox/thirdparty. Test Plan: existing tests pass. Tested core functionality manually. Tested docker build manually. Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2563	2020-07-29 08:57:25 -04:00
Paul Fitzpatrick	5ef889addd	(core) move home server into core Summary: This moves enough server material into core to run a home server. The data engine is not yet incorporated (though in manual testing it works when ported). Test Plan: existing tests pass Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2552	2020-07-21 20:39:10 -04:00

17 Commits