gristlabs_grist-core

mirror of https://github.com/gristlabs/grist-core.git synced 2024-10-27 20:44:07 +00:00

Author	SHA1	Message	Date
Dmitry S	9d4eeda480	(core) Python optimizations to speed up data engine Summary: - A bunch of optimizations guided by python profiling (esp. py-spy) - Big one is optimizing Record/RecordSet attribute access - Adds tracemalloc printout when running test_replay with PYTHONTRACEMALLOC=1 (on PY3) (but memory size is barely affected by these changes) - Testing with RECORD_SANDBOX_BUFFERS_DIR, loading and calculating a particular very large doc (CRM), time taken improved from 73.9s to 54.8s (26% faster) Test Plan: No behavior changes intended; relying on existing tests to verify that. Reviewers: georgegevoian Reviewed By: georgegevoian Differential Revision: https://phab.getgrist.com/D3781	2023-02-09 12:49:58 -05:00
Alex Hall	9fffb491f9	(core) External requests Summary: Adds a Python function `REQUEST` which makes an HTTP GET request. Behind the scenes it: - Raises a special exception to stop trying to evaluate the current cell and just keep the existing value. - Notes the request arguments which will be returned by `apply_user_actions`. - Makes the actual request in NodeJS, which sends back the raw response data in a new action `RespondToRequests` which reevaluates the cell(s) that made the request. - Wraps the response data in a class which mimics the `Response` class of the `requests` library. In certain cases, this asynchronous flow doesn't work and the sandbox will instead synchronously call an exported JS method: - When reevaluating a single cell to get a formula error, the request is made synchronously. - When a formula makes multiple requests, the earlier responses are retrieved synchronously from files which store responses as long as needed to complete evaluating formulas. See https://grist.slack.com/archives/CL1LQ8AT0/p1653399747810139 Test Plan: Added Python and nbrowser tests. Reviewers: georgegevoian Reviewed By: georgegevoian Subscribers: paulfitz, dsagal Differential Revision: https://phab.getgrist.com/D3429	2022-06-17 21:53:20 +02:00
Alex Hall	1c89d08ea3	(core) Add a row to summary tables grouped by list column(s) corresponding to empty lists Summary: Adds some special handling to summary table and lookup logic: - Source rows with empty choicelists/reflists get a corresponding summary row with an empty string/reference when grouping by that column, instead of excluding them from any group - Adds a new `QueryOperation` 'empty' in the client which is used in `LinkingState`, `QuerySet`, and `recursiveMoveToCursorPos` to match empty lists in source tables against falsy values in linked summary tables. - Adds a new parameter `match_empty` to the Python `CONTAINS` function so that regular formulas can implement the same behaviour as summary tables. See https://grist.slack.com/archives/C0234CPPXPA/p1654030490932119 - Uses the new `match_empty` argument in the formula generated for the `group` column when detaching a summary table. Test Plan: Updated and extended Python and nbrowser tests of summary tables grouped by choicelists to test for new behaviour with empty lists. Reviewers: georgegevoian Reviewed By: georgegevoian Differential Revision: https://phab.getgrist.com/D3471	2022-06-09 23:38:14 +02:00
Alex Hall	d154b9afa7	(core) Make lookups depend on all rows Summary: This is a fix for a bug discussed in https://grist.slack.com/archives/C069RUP71/p1645138610722889 I still haven't completely wrapped my head around it or figured out how to make a simple reproducible example, but the problem seems to be that a lookup can happen before the column(s) being looked up (the summary helper column in this case) have been computed fully (I think it got interrupted halfway by an OrderError). `do_lookup` would check via `engine._use_node` that the row IDs it found had all been computed already, but there might still be other rows that hadn't been computed yet and would also have values matching the lookup key, so it missed those. This diff instead calls `_use_node` with no `row_ids` argument, which should ensure that all rows have already been computed. At first I was worried about how this would affect performance, which led me down an optimisation rabbit hole, hence a bit of unrelated cleanup here and also https://phab.getgrist.com/D3310 . But it doesn't seem to be a problem, and IIUC it should actually make things better, although this code is pretty confusing. Test Plan: Tested manually that the doc no longer behaves weirdly Reviewers: dsagal Reviewed By: dsagal Subscribers: dsagal, paulfitz Differential Revision: https://phab.getgrist.com/D3308	2022-03-14 19:42:51 +02:00
Alex Hall	2d0978559b	(core) Replace compute stack and frames with _current_node and _current_row_id Summary: This is an attempt to optimise Engine._use_node. It doesn't seem to actually improve overall performance significantly, but it shouldn't make it worse, and I think it's an improvement to the code. It turns out that there's no need to track a stack of compute frames any more. The only time we get close to nested evaluation, we set allow_evaluation=False to prevent it actually happening. So there's only one 'frame' during actual evaluation, which means we can get rid of the concept of frames entirely. This allows simplifying the code and letting the computer do less work in general. Test Plan: this Reviewers: dsagal Reviewed By: dsagal Subscribers: dsagal Differential Revision: https://phab.getgrist.com/D3310	2022-03-11 12:34:00 +02:00
Alex Hall	437d30bd9f	(core) Log number of rows in user tables in data engine Summary: Adds a method Table._num_rows using an empty lookup map column. Adds a method Engine.count_rows which adds them all up. Returns the count after applying user actions to be logged by ActiveDoc. Test Plan: Added a unit test in Python. Tested log message manually. Reviewers: paulfitz Reviewed By: paulfitz Differential Revision: https://phab.getgrist.com/D3275	2022-02-22 00:59:56 +02:00
Alex Hall	f5981606e1	(core) Make CONTAINS a function for consistency with mkpydocs etc. Summary: Having CONTAINS be a class is a pain, undoing that mistake now Test Plan: none needed Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2929	2021-07-21 13:18:23 +02:00
Alex Hall	f7a9638992	(core) CONTAINS() and summarising by ChoiceList columns with flattening Summary: Added CONTAINS 'function' which can be used in lookups Changed LookupMapColumn._row_key_map to use right=set so one row can have many keys when CONTAINS is used. Use CONTAINS to implement group column in summary table, while helper column in source table can reference and create multiple rows in summary table, especially when summarising by ChoiceList columns. Use itertools.product to generate all combinations of lookup keys and groupby values. cleanup Test Plan: Added python unit tests. Reviewers: dsagal Reviewed By: dsagal Subscribers: paulfitz, dsagal Differential Revision: https://phab.getgrist.com/D2900	2021-07-19 16:35:35 +02:00
Dmitry S	a56714e1ab	(core) Implement trigger formulas (generalizing default formulas) Summary: Trigger formulas can be calculated for new records, or for new records and updates to certain fields, or all fields. They do not recalculate on open, and they MAY be set directly by the user, including for data-cleaning. - Column metadata now includes recalcWhen and recalcDeps fields. - Trigger formulas are NOT recalculated on open or on schema changes. - When recalcWhen is "never", formula isn't calculated even for new records. - When recalcWhen is "allupdates", formula is calculated for new records and any manual (non-formula) updates to the record. - When recalcWhen is "", formula is calculated for new records, and changes to recalcDeps fields (which may be formula fields or column itself). - A column whose recalcDeps includes itself is a "data-cleaning" column; a value set by the user will still trigger the formula. - All trigger-formulas receive a "value" argument (to support the case above). Small changes - Update RefLists (used for recalcDeps) when target rows are deleted. - Add RecordList.__contains__ (for `rec in refList` or `id in refList` checks) - Clarify that Calculate action has replaced load_done() in practice, and use it in tests too, to better match reality. Left for later: - UI for setting recalcWhen / recalcDeps. - Implementation of actions such as "Recalculate for all cells". - Allowing trigger-formulas access to the current user's info. Test Plan: Added a comprehensive python-side test for various trigger combinations Reviewers: paulfitz, alexmojaki Reviewed By: paulfitz Differential Revision: https://phab.getgrist.com/D2872	2021-06-25 22:53:07 -04:00
Alex Hall	16f297a250	(core) Simple Python 3 compatibility changes Summary: Changes that move towards python 3 compatibility that are easy to review without much thought Test Plan: The tests Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2873	2021-06-22 17:13:17 +02:00
Paul Fitzpatrick	b82eec714a	(core) move data engine code to core Summary: this moves sandbox/grist to core, and adds a requirements.txt file for reconstructing the content of sandbox/thirdparty. Test Plan: existing tests pass. Tested core functionality manually. Tested docker build manually. Reviewers: dsagal Reviewed By: dsagal Differential Revision: https://phab.getgrist.com/D2563	2020-07-29 08:57:25 -04:00

11 Commits