gristlabs_grist-core

mirror of https://github.com/gristlabs/grist-core.git synced 2024-09-29 11:23:03 +00:00

Author	SHA1	Message	Date
George Gevoian	94eec5e906	(core) Add AI Assistant retry with shorter prompt Summary: If the longer OpenAI model exceeds the OpenAI context length, we now perform another retry with a shorter variant of the formula prompt. The shorter prompt excludes non-referenced tables and lookup method definitions, which should help reduce token usage in documents with larger schemas. Test Plan: Server test. Reviewers: JakubSerafin Reviewed By: JakubSerafin Subscribers: JakubSerafin Differential Revision: https://phab.getgrist.com/D4184	2024-02-12 11:06:52 -05:00
Alex Hall	391c8ee087	(core) Allow assistant to evaluate current formula Summary: Replaces https://phab.getgrist.com/D3940, particularly to avoid doing potentially unwanted things automatically. Adds optional fields `evaluateCurrentFormula?: boolean; rowId?: number` to `FormulaAssistanceContext` (part of `AssistanceRequest`). When `evaluateCurrentFormula` is `true`, calls a new function `evaluate_formula` in the sandbox which computes the existing formula in the column (regardless of anything the AI may have suggested) and uses that to generate an additional system message which is added before the user's message. In theory this could be used in an interface where users ask why a formula doesn't work, including possibly a formula suggested by the AI. For now, it's only used in `runCompletion_impl.ts` for experimenting. Also cleaned up a bit, removing `_chatMode` which is always `true` now, and uses of `regenerate` which is always `false`. Test Plan: Updated `runCompletion_impl` to optionally use the new feature, in which case it now scores 51/68 instead of 49/68. Reviewers: paulfitz Reviewed By: paulfitz Differential Revision: https://phab.getgrist.com/D3970	2023-07-24 21:59:00 +02:00
Dmitry S	534615dd50	(core) Update logging in sandbox code, and log tracebacks as single log messages. Summary: - Replace logger module by the standard module 'logging'. - When a log message from the sandbox includes newlines (e.g. for tracebacks), keep those lines together in the Node log message. Previously each line was a different message, making it difficult to view tracebacks, particularly in prod where each line becomes a separate message object. - Fix assorted lint errors. Test Plan: Added a test for the log-line splitting and escaping logic. Reviewers: georgegevoian Reviewed By: georgegevoian Differential Revision: https://phab.getgrist.com/D3956	2023-07-18 11:21:25 -04:00
Jarosław Sadziński	6e3f0f2b35	(core) Porting back AI formula backend Summary: This is a backend part for the formula AI. Test Plan: New tests Reviewers: paulfitz Reviewed By: paulfitz Subscribers: cyprien Differential Revision: https://phab.getgrist.com/D3786	2023-02-08 17:15:59 +01:00
Alex Hall	792565976a	(core) Show example values in formula autocomplete Summary: This diff adds a preview of the value of certain autocomplete suggestions, especially of the form `$foo.bar` or `user.email`. The main initial motivation was to show the difference between `$Ref` and `$Ref.DisplayCol`, but the feature is more general. The client now sends the row ID of the row being edited (along with the table and column IDs which were already sent) to the server to fetch autocomplete suggestions. The returned suggestions are now tuples `(suggestion, example_value)` where `example_value` is a string or null. The example value is simply obtained by evaluating (in a controlled way) the suggestion in the context of the given record and the current user. The string representation is similar to the standard `repr` but dates and datetimes are formatted, and the whole thing is truncated for efficiency. The example values are shown in the autocomplete popup separated from the actual suggestion by a number of spaces calculated to: 1. Clearly separate the suggestion from the values 2. Left-align the example values in most cases 3. Avoid having so much space such that connecting suggestions and values becomes visually difficult. The tokenization of the row is then tweaked to show the example in light grey to deemphasise it. Main discussion where the above was decided: https://grist.slack.com/archives/CDHABLZJT/p1661795588100009 The diff also includes various other small improvements and fixes: - The autocomplete popup is much wider to make room for long suggestions, particularly lookups, as pointed out in https://phab.getgrist.com/D3580#inline-41007. The wide popup is the reason a fancy solution was needed to position the example values. I didn't see a way to dynamically resize the popup based on suggestions, and it didn't seem like a good idea to try. - The `grist` and `python` labels previously shown on the right are removed. They were not helpful (https://grist.slack.com/archives/CDHABLZJT/p1659697086155179) and would get in the way of the example values. - Fixed a bug in our custom tokenization that caused function arguments to be weirdly truncated in the middle: https://grist.slack.com/archives/CDHABLZJT/p1661956353699169?thread_ts=1661953258.342739&cid=CDHABLZJT and https://grist.slack.com/archives/C069RUP71/p1659696778991339 - Hide suggestions involving helper columns like `$gristHelper_Display` or `Table.lookupRecords(gristHelper_Display=` (https://grist.slack.com/archives/CDHABLZJT/p1661953258342739). The former has been around for a while and seems to be a mistake. The fix is simply to use `is_visible_column` instead of `is_user_column`. Since the latter is not used anywhere else, and using it in the first place seems like a mistake more than anything else, I've also removed the function to prevent similar mistakes in the future. - Don't suggest private columns as lookup arguments: https://grist.slack.com/archives/CDHABLZJT/p1662133416652499?thread_ts=1661795588.100009&cid=CDHABLZJT - Only fetch fresh suggestions specifically after typing `lookupRecords(` or `lookupOne(` rather than just `(`, as this would needlessly hide function suggestions which could still be useful to see the arguments. However this only makes a difference when there are still multiple matching suggestions, otherwise Ace hides them anyway. Test Plan: Extended and updated several Python and browser tests. Reviewers: paulfitz Reviewed By: paulfitz Differential Revision: https://phab.getgrist.com/D3611	2022-09-28 19:42:36 +02:00
Paul Fitzpatrick	7078922a65	(core) ensure randomness works when sandbox is cloned from a checkpoint Summary: This calls a new `initialize` method on the sandbox before we start doing calculations with it, to make sure that `random.seed()` has been called. Otherwise, if the sandbox is cloned from a checkpoint, the seed will have been reset. The `initialize` method includes the functionality previously done by `set_doc_url` since it is also initialization/personalization and this way we avoid introducing another round trip to the sandbox. Test Plan: tested with grist-core configured to use gvisor Reviewers: georgegevoian, dsagal Reviewed By: georgegevoian, dsagal Subscribers: alexmojaki Differential Revision: https://phab.getgrist.com/D3549	2022-07-27 14:59:27 -04:00
Alex Hall	9fffb491f9	(core) External requests Summary: Adds a Python function `REQUEST` which makes an HTTP GET request. Behind the scenes it: - Raises a special exception to stop trying to evaluate the current cell and just keep the existing value. - Notes the request arguments which will be returned by `apply_user_actions`. - Makes the actual request in NodeJS, which sends back the raw response data in a new action `RespondToRequests` which reevaluates the cell(s) that made the request. - Wraps the response data in a class which mimics the `Response` class of the `requests` library. In certain cases, this asynchronous flow doesn't work and the sandbox will instead synchronously call an exported JS method: - When reevaluating a single cell to get a formula error, the request is made synchronously. - When a formula makes multiple requests, the earlier responses are retrieved synchronously from files which store responses as long as needed to complete evaluating formulas. See https://grist.slack.com/archives/CL1LQ8AT0/p1653399747810139 Test Plan: Added Python and nbrowser tests. Reviewers: georgegevoian Reviewed By: georgegevoian Subscribers: paulfitz, dsagal Differential Revision: https://phab.getgrist.com/D3429	2022-06-17 21:53:20 +02:00
Dmitry S	e59dcc142d	(core) Show proper message on empty Excel import, rather than a code error Summary: - Previously showed "UnboundLocalError". Now will show: Import failed: Failed to parse Excel file. Error: No tables found (1 empty tables skipped) - Also fix logging for import code Test Plan: Added a test case Reviewers: georgegevoian Reviewed By: georgegevoian Differential Revision: https://phab.getgrist.com/D3396	2022-04-27 00:49:28 -04:00
Alex Hall	437d30bd9f	(core) Log number of rows in user tables in data engine Summary: Adds a method Table._num_rows using an empty lookup map column. Adds a method Engine.count_rows which adds them all up. Returns the count after applying user actions to be logged by ActiveDoc. Test Plan: Added a unit test in Python. Tested log message manually. Reviewers: paulfitz Reviewed By: paulfitz Differential Revision: https://phab.getgrist.com/D3275	2022-02-22 00:59:56 +02:00
Alex Hall	e900f39da3	(core) Log statistics about table sizes Summary: Record numbers of rows, columns, cells, and bytes of marshalled data for most calls to table_data_from_db. Export new function get_table_stats in the sandbox, which gives the raw numbers and totals. Get and log these stats in ActiveDoc right after loading tables, before Calculate, so they are logged even in case of errors. Tweak logging about number of tables, especially number of on-demand tables, to not only show in debug logging. Test Plan: Updated doc regression tests, which show nicely what the data looks like. Reviewers: dsagal, paulfitz Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D3081	2021-10-21 17:54:20 +02:00
Paul Fitzpatrick	dd0f1be117	(core) get all tests working under python3/gvisor Summary: This verifies that all existing tests are capable of running under python3/gvisor, and fixes the small issues that came up. It does not yet activate python3 tests on all diffs, only diffs that specifically request them. * Adds a suffix in test names and output directories for tests run with PYTHON_VERSION=3, so that results of the same test run with and without the flag can be aggregated cleanly. * Adds support for checkpointing to the gvisor sandbox adapter. * Prepares a checkpoint made after grist python code has loaded in the gvisor sandbox. * Changes how `DOC_URL` is passed to the sandbox, since it can no longer be passed in as an environment variable when using checkpoints. * Uses the checkpoint to speed up tests using the gvisor sandbox, otherwise a lot of tests need more time (especially on mac under docker). * Directs jenkins to run all tests with python2 and python3 when a new file `buildtools/changelogs/python.txt` is touched (this diff counts as touching that file). * Tweaks miscellaneous tests - some needed fixes exposed by slightly different timing - a small number actually give different results in py3 (removal of `u` prefixes). - some needed a little more time The DOC_URL change is not the ultimate solution we want for DOC_URL. Eventually it should be a variable that gets updated, like the date perhaps. This is just a small pragmatic change to preserve existing behavior. Tests are run mindlessly as py3, and for some tests it won't change anything (e.g. if they do not use NSandbox). Tests are not run in parallel, doubling overall test time. Checkpoints could be useful in deployment, though this diff doesn't use them there. The application of checkpoints doesn't check for other configuration like 3-versus-5-pipe that we don't actually use. Python2 tests run using pynbox as always for now. The diff got sufficiently bulky that I didn't tackle running py3 on "regular" diffs in it. My preference, given that most tests don't appear to stress the python side of things, would be to make a selection of the tests that do and a few wild cards, and run those tests on both pythons rather than all of them. For diffs making a significant python change, I'd propose touching buildtools/changelogs/python.txt for full tests. But this is a conversation in progress. A total of 6886 tests ran on this diff. Test Plan: this is a step in preparing tests for py3 transition Reviewers: dsagal Reviewed By: dsagal Subscribers: dsagal Differential Revision: https://phab.getgrist.com/D3066	2021-10-18 17:44:15 -04:00
Alex Hall	4d526da58f	(core) Move file import plugins into core/sandbox/grist Summary: Move all the plugins python code into the main folder with the core code. Register file importing functions in the same main.py entrypoint as the data engine. Remove options relating to different entrypoints and code directories. The only remaining plugin-specific option in NSandbox is the import directory/mount, i.e. where files to be parsed are placed. Test Plan: this Reviewers: paulfitz Reviewed By: paulfitz Subscribers: dsagal Differential Revision: https://phab.getgrist.com/D2965	2021-08-09 18:37:14 +02:00
Alex Hall	5aed22dc1e	(core) Remove dead code for fetching snapshots Summary: Deletes code which was previously only used by SharedSharing.ts, which was deleted in D2894 Test Plan: no Reviewers: paulfitz Reviewed By: paulfitz Differential Revision: https://phab.getgrist.com/D2960	2021-08-04 15:42:31 +02:00
George Gevoian	e5eeb3ec80	(core) Add 'user' variable to trigger formulas Summary: The 'user' variable has a similar API to the one from access rules: it contains properties about a user, such as their full name and email address, as well as optional, user-defined attributes that are populated via user attribute tables. Test Plan: Python unit tests. Reviewers: alexmojaki, paulfitz, dsagal Reviewed By: alexmojaki, dsagal Subscribers: paulfitz, dsagal, alexmojaki Differential Revision: https://phab.getgrist.com/D2898	2021-07-15 15:18:32 -07:00
George Gevoian	9592e3610b	(core) Add 'value' to trigger formula autocomplete Summary: API signature for autocomplete updated to add column ID, which is necessary for exposing correct types for 'value'. Test Plan: Unit tests. Reviewers: alexmojaki Reviewed By: alexmojaki Subscribers: jarek, alexmojaki Differential Revision: https://phab.getgrist.com/D2896	2021-07-12 15:07:16 -07:00
Paul Fitzpatrick	4222f1ed32	(core) communicate with sandbox via standard pipes Summary: This switches to using stdin/stdout for RPC calls to the sandbox, rather than specially allocated side channels. Plain text error information remains on stderr. The motivation for the change is to simplify use of sandboxes, some of which support extra file descriptors and some of which don't. The new style of communication is made the default, but I'm not committed to this, just that it be easy to switch to if needed. It is possible I'll need to switch the communication method again in the near future. One reason not to make this default would be windows support, which is likely broken since stdin/stdout are by default in text mode. Test Plan: existing tests pass Reviewers: dsagal, alexmojaki Reviewed By: dsagal, alexmojaki Differential Revision: https://phab.getgrist.com/D2897	2021-07-12 06:45:47 -04:00
Alex Hall	84ddbc448b	(core) Add test_replay for easily replaying data sent to the sandbox purely within python Summary: Run JS with a value for SANDBOX_BUFFERS_DIR, then run test_replay in python with the same value to replay just the python code. See test_replay.py for more info. Test Plan: Record some data, e.g. `SANDBOX_BUFFERS_DIR=manual npm start` or `SANDBOX_BUFFERS_DIR=server ./test/testrun.sh server`. Then run `SANDBOX_BUFFERS_DIR=server python -m unittest test_replay` from within `core/sandbox/grist` to replay the input from the JS. Sample of the output will look like this: ``` Checking /tmp/sandbox_buffers/server/2021-06-16T15:13:59.958Z True Checking /tmp/sandbox_buffers/server/2021-06-16T15:16:37.170Z True Checking /tmp/sandbox_buffers/server/2021-06-16T15:14:22.378Z True ``` Reviewers: paulfitz, dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2866	2021-06-30 16:56:09 +02:00
Alex Hall	305b133c59	(core) Remaining Python 3 compatibility changes Summary: Biggest change is turning everything to unicode Test Plan: The tests Reviewers: dsagal, paulfitz Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2875	2021-06-25 12:00:58 +02:00
Alex Hall	16f297a250	(core) Simple Python 3 compatibility changes Summary: Changes that move towards python 3 compatibility that are easy to review without much thought Test Plan: The tests Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2873	2021-06-22 17:13:17 +02:00
Dmitry S	de35be6b0a	(core) Checks that an ACL formula can be parsed, and prevent saving unparsable ACL rules. Summary: - Fix error-handling in bundleActions(), and wait for the full bundle to complete. (The omissions here were making it impossible to react to errors from inside bundleActions()) - Catch problematic rules early enough to undo them, by trying out ruleCollection.update() on updated rules before the updates are applied. - Added checkAclFormula() call to DocComm that checks parsing and compiling formula, and reports errors. - In UI, prevent saving if any aclFormulas are invalid, or while waiting for them to get checked. - Also fixed some lint errors Test Plan: Added a test case of error reporting in ACL formulas. Reviewers: paulfitz Reviewed By: paulfitz Differential Revision: https://phab.getgrist.com/D2689	2020-12-15 09:43:37 -05:00
Dmitry S	5b2de988b5	(core) Perform migrations of Grist schema using only metadata tables when possible. Summary: Loading all user data to run a migration is risky (creates more than usual memory pressure), and almost never needed (only one migration requires it). This diff attempts to run migrations using only metadata (_grist_* tables), but retries if the sandbox tells it that all data is needed. The intent is for new migrations to avoid needing all data. Test Plan: Added a somewhat contrived unittest. Reviewers: paulfitz Reviewed By: paulfitz Differential Revision: https://phab.getgrist.com/D2659	2020-11-11 19:21:40 -05:00
Paul Fitzpatrick	b82eec714a	(core) move data engine code to core Summary: this moves sandbox/grist to core, and adds a requirements.txt file for reconstructing the content of sandbox/thirdparty. Test Plan: existing tests pass. Tested core functionality manually. Tested docker build manually. Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2563	2020-07-29 08:57:25 -04:00

22 Commits
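
Note on 534615dd50: the summary says that multi-line sandbox log messages (e.g. tracebacks) are now kept together as one Node log message, with a test for the "log-line splitting and escaping logic". The log does not show how this is implemented. The following is only a minimal Python sketch of one way such escaping could work, assuming newlines are folded into a literal `\n` and backslashes are doubled so the transformation can be reversed. The function names `escape_log_message` and `unescape_log_message` are hypothetical and do not come from the Grist code.

```python
def escape_log_message(message: str) -> str:
    """Fold a multi-line message into a single line (hypothetical helper)."""
    # Double backslashes first, so a literal backslash followed by "n" in the
    # text cannot be confused with an escaped newline afterwards.
    return message.replace("\\", "\\\\").replace("\n", "\\n")


def unescape_log_message(line: str) -> str:
    """Reverse escape_log_message (hypothetical helper)."""
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            nxt = line[i + 1]
            # "\n" becomes a real newline; "\\" becomes a single backslash.
            out.append("\n" if nxt == "n" else nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


traceback_text = (
    'Traceback (most recent call last):\n'
    '  File "usercode", line 1, in <module>\n'
    'ValueError: bad value'
)
single_line = escape_log_message(traceback_text)
print(single_line)  # one physical line, safe to emit as a single log record
```

On the receiving side, splitting the stream on real newlines and then calling `unescape_log_message` on each line would recover the original traceback as one message.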