Summary:
- A bunch of optimizations guided by Python profiling (esp. py-spy)
- Big one is optimizing Record/RecordSet attribute access
- Adds a tracemalloc printout when running test_replay with PYTHONTRACEMALLOC=1 (Python 3 only)
(but memory size is barely affected by these changes)
- Testing with RECORD_SANDBOX_BUFFERS_DIR, loading and calculating a particular
very large doc (CRM) went from 73.9s to 54.8s (about 26% less time)
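A tracemalloc printout of the kind mentioned above might look roughly like the sketch below. It uses only the standard `tracemalloc` module; the helper name `format_top_allocations` and the grouping by line number are assumptions for illustration, not the code added to test_replay.

```python
import tracemalloc


def format_top_allocations(snapshot, limit=5):
    """Return report lines for the largest allocation sites in a snapshot."""
    stats = snapshot.statistics("lineno")  # sorted from biggest to smallest
    lines = []
    for index, stat in enumerate(stats[:limit], 1):
        frame = stat.traceback[0]
        lines.append("#%d: %s:%d: %.1f KiB"
                     % (index, frame.filename, frame.lineno, stat.size / 1024.0))
    total = sum(stat.size for stat in stats)
    lines.append("Total allocated size: %.1f KiB" % (total / 1024.0))
    return lines


# PYTHONTRACEMALLOC=1 starts tracing at interpreter startup. Start it here only
# if it is not already running, so the sketch works either way.
if not tracemalloc.is_tracing():
    tracemalloc.start()

data = [bytes(1000) for _ in range(1000)]  # allocate something measurable
report = format_top_allocations(tracemalloc.take_snapshot())
print("\n".join(report))
```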
Test Plan: No behavior changes intended; relying on existing tests to verify that.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3781
Summary:
Adds a Python function `REQUEST` which makes an HTTP GET request. Behind the scenes it:
- Raises a special exception to stop trying to evaluate the current cell and just keep the existing value.
- Notes the request arguments which will be returned by `apply_user_actions`.
- Makes the actual request in Node.js, which sends back the raw response data in a new action `RespondToRequests`; that action reevaluates the cell(s) that made the request.
- Wraps the response data in a class which mimics the `Response` class of the `requests` library.
In certain cases, this asynchronous flow doesn't work and the sandbox will instead synchronously call an exported JS method:
- When reevaluating a single cell to get a formula error, the request is made synchronously.
- When a formula makes multiple requests, the earlier responses are retrieved synchronously from files which store responses as long as needed to complete evaluating formulas. See https://grist.slack.com/archives/CL1LQ8AT0/p1653399747810139
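The asynchronous flow described above can be sketched in plain Python as follows. Everything here is a simplified assumption for illustration: the names `RequestingError`, `HTTPError`, `request_sketch`, `request_key`, `pending_requests` and `responses` are hypothetical, and the `Response` class covers only a small part of what the `requests` library's `Response` offers.

```python
import json


class RequestingError(Exception):
    """Hypothetical marker exception: stop evaluating, keep the cell's old value."""


class HTTPError(Exception):
    """Hypothetical stand-in for the error raised by raise_for_status()."""


class Response(object):
    """Mimics a small part of the `requests` library's Response interface."""

    def __init__(self, content, status_code, headers=None, encoding="utf-8"):
        self.content = content          # raw bytes of the response body
        self.status_code = status_code
        self.headers = headers or {}
        self.encoding = encoding

    @property
    def text(self):
        return self.content.decode(self.encoding)

    @property
    def ok(self):
        return self.status_code < 400

    def json(self, **kwargs):
        return json.loads(self.text, **kwargs)

    def raise_for_status(self):
        if not self.ok:
            raise HTTPError("Request failed with status %s" % self.status_code)


pending_requests = {}   # key -> request arguments, to be handed to the caller
responses = {}          # key -> Response, filled in once the raw data arrives


def request_key(url, params):
    """Build a stable key identifying one request."""
    return json.dumps([url, params], sort_keys=True)


def request_sketch(url, params=None):
    key = request_key(url, params)
    if key in responses:
        return responses[key]
    # No response yet: note the arguments and abort evaluation of this cell.
    pending_requests[key] = {"url": url, "params": params}
    raise RequestingError()


# First evaluation: the request is only recorded, and evaluation stops.
try:
    request_sketch("https://example.com/data", {"q": "grist"})
    first_call_raised = False
except RequestingError:
    first_call_raised = True

# Later the raw response data is delivered and the cell is reevaluated.
key = request_key("https://example.com/data", {"q": "grist"})
responses[key] = Response(b'{"answer": 42}', 200)
result = request_sketch("https://example.com/data", {"q": "grist"})
```

On reevaluation the same call returns the wrapped response instead of raising, so the formula can carry on with `result.json()`.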
Test Plan: Added Python and nbrowser tests.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Subscribers: paulfitz, dsagal
Differential Revision: https://phab.getgrist.com/D3429
Summary:
Adds some special handling to summary table and lookup logic:
- Source rows with empty choicelists/reflists get a corresponding summary row with an empty string/reference when grouping by that column, instead of being excluded from every group.
- Adds a new `QueryOperation` 'empty' in the client which is used in `LinkingState`, `QuerySet`, and `recursiveMoveToCursorPos` to match empty lists in source tables against falsy values in linked summary tables.
- Adds a new parameter `match_empty` to the Python `CONTAINS` function so that regular formulas can implement the same behaviour as summary tables. See https://grist.slack.com/archives/C0234CPPXPA/p1654030490932119
- Uses the new `match_empty` argument in the formula generated for the `group` column when detaching a summary table.
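The intended effect of `match_empty` can be illustrated with a toy lookup over plain dicts. This is a sketch under stated assumptions, not the sandbox implementation: `keys_for_cell`, `lookup_contains` and the `_NO_MATCH_EMPTY` sentinel are hypothetical names, and the real `CONTAINS` works through lookup map columns rather than a linear scan.

```python
_NO_MATCH_EMPTY = object()  # sentinel: by default, empty lists match nothing


def keys_for_cell(cell_value, match_empty=_NO_MATCH_EMPTY):
    """Return the set of lookup keys under which a list-valued cell is found."""
    keys = set(cell_value or ())
    if not keys and match_empty is not _NO_MATCH_EMPTY:
        # An empty list is treated as if it contained the match_empty value.
        keys.add(match_empty)
    return keys


def lookup_contains(rows, column, value, match_empty=_NO_MATCH_EMPTY):
    """Return ids of rows whose list in `column` contains `value`."""
    return [row_id for row_id, row in sorted(rows.items())
            if value in keys_for_cell(row[column], match_empty)]


rows = {
    1: {"tags": ["a", "b"]},
    2: {"tags": []},
    3: {"tags": ["b"]},
}

without_empty = lookup_contains(rows, "tags", "")
with_empty = lookup_contains(rows, "tags", "", match_empty="")
matching_b = lookup_contains(rows, "tags", "b", match_empty="")
```

Here row 2 is found only when `match_empty=""` is passed, which mirrors how a summary row with an empty value can now collect the source rows whose lists are empty.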
Test Plan: Updated and extended Python and nbrowser tests of summary tables grouped by choicelists to test for new behaviour with empty lists.
Reviewers: georgegevoian
Reviewed By: georgegevoian
Differential Revision: https://phab.getgrist.com/D3471
Summary:
This is a fix for a bug discussed in https://grist.slack.com/archives/C069RUP71/p1645138610722889
I still haven't completely wrapped my head around it or figured out how to make a simple reproducible example. The problem seems to be that a lookup can happen before the column(s) being looked up (the summary helper column in this case) have been fully computed (I think the computation got interrupted halfway by an OrderError). `do_lookup` would check via `engine._use_node` that the row IDs it found had all been computed already, but there might still be other rows that hadn't been computed yet and whose values would also match the lookup key, so it missed those.
This diff instead calls `_use_node` with no `row_ids` argument, which should ensure that all rows have already been computed.
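A toy model of the suspected failure mode, for illustration only. The dict-based index and the helpers `lookup_checking_found_rows` and `lookup_checking_all_rows` are hypothetical and far simpler than `do_lookup` and `engine._use_node`; the sketch only shows why verifying the found rows is not enough when some rows have not been indexed yet.

```python
values = {1: "a", 2: "b", 3: "a"}   # row id -> value of the looked-up column
computed = {1, 2}                   # row 3 was never reached (interrupted)
index = {"a": {1}, "b": {2}}        # lookup index built from computed rows only


def compute_row(row_id):
    """Compute one row and add it to the lookup index."""
    index.setdefault(values[row_id], set()).add(row_id)
    computed.add(row_id)


def lookup_checking_found_rows(key):
    found = index.get(key, set())
    # Old-style check: only rows already in the index are verified, so the
    # check passes even though an uncomputed row would also match.
    assert found <= computed
    return sorted(found)


def lookup_checking_all_rows(key):
    # New-style check: make sure every row is computed before reading the index.
    for row_id in sorted(set(values) - computed):
        compute_row(row_id)
    return sorted(index.get(key, set()))


stale_result = lookup_checking_found_rows("a")
complete_result = lookup_checking_all_rows("a")
```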
At first I was worried about how this would affect performance, which led me down an optimisation rabbit hole, hence a bit of unrelated cleanup here and also https://phab.getgrist.com/D3310 . But it doesn't seem to be a problem, and IIUC it should actually make things better, although this code is pretty confusing.
Test Plan: Tested manually that the doc no longer behaves weirdly
Reviewers: dsagal
Reviewed By: dsagal
Subscribers: dsagal, paulfitz
Differential Revision: https://phab.getgrist.com/D3308
Summary:
This is an attempt to optimise Engine._use_node. It doesn't seem to actually improve overall performance significantly, but it shouldn't make it worse, and I think it's an improvement to the code.
It turns out that there's no need to track a stack of compute frames any more. The only time we get close to nested evaluation, we set allow_evaluation=False to prevent it actually happening. So there's only one 'frame' during actual evaluation, which means we can get rid of the concept of frames entirely. This allows simplifying the code and letting the computer do less work in general.
Test Plan: this
Reviewers: dsagal
Reviewed By: dsagal
Subscribers: dsagal
Differential Revision: https://phab.getgrist.com/D3310
Summary:
Adds a method Table._num_rows using an empty lookup map column.
Adds a method Engine.count_rows which adds them all up.
Returns the count after applying user actions to be logged by ActiveDoc.
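The counting idea can be sketched as below, assuming that a lookup over zero columns has a single key (the empty tuple) that maps to every row. `TableSketch` and this `count_rows` are hypothetical stand-ins, not the actual `Table._num_rows` or `Engine.count_rows`.

```python
class TableSketch(object):
    def __init__(self, name, row_ids):
        self.name = name
        # A lookup map with no key columns: the empty key maps to all row ids.
        self._empty_lookup = {(): set(row_ids)}

    def num_rows(self):
        return len(self._empty_lookup.get((), ()))


def count_rows(tables):
    """Add up the row counts of all tables."""
    return sum(table.num_rows() for table in tables)


tables = [
    TableSketch("People", [1, 2, 3]),
    TableSketch("Orders", [1, 2]),
    TableSketch("Empty", []),
]
total_rows = count_rows(tables)
```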
Test Plan: Added a unit test in Python. Tested log message manually.
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3275
Summary: Having CONTAINS be a class is a pain; this undoes that mistake.
Test Plan: none needed
Reviewers: dsagal
Reviewed By: dsagal
Differential Revision: https://phab.getgrist.com/D2929
Summary:
Added a CONTAINS 'function' which can be used in lookups.
Changed LookupMapColumn._row_key_map to use right=set so one row can have many keys when CONTAINS is used.
Used CONTAINS to implement the group column in summary tables, while the helper column in the source table can reference and create multiple rows in the summary table, especially when summarising by ChoiceList columns.
Used itertools.product to generate all combinations of lookup keys and groupby values.
Minor cleanup.
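As an illustration of the itertools.product step, the sketch below expands one source row into all the group keys it belongs to. The helper `expand_keys` and its `list_valued` flags are assumptions made for this example, not the code in the diff.

```python
import itertools


def expand_keys(values, list_valued):
    """
    values: one cell value per group-by column.
    list_valued: matching flags, True where the column holds a list
    (e.g. a ChoiceList) and each element should produce its own key.
    """
    options = [tuple(value) if is_list else (value,)
               for value, is_list in zip(values, list_valued)]
    return list(itertools.product(*options))


keys = expand_keys(("Alice", ["red", "blue"], ["x", "y"]), (False, True, True))
no_keys = expand_keys(("Bob", []), (False, True))
```

With two list-valued columns of two elements each, the row maps to four keys. Note that an empty list yields no combinations at all, since the product of anything with an empty sequence is empty.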
Test Plan: Added python unit tests.
Reviewers: dsagal
Reviewed By: dsagal
Subscribers: paulfitz, dsagal
Differential Revision: https://phab.getgrist.com/D2900
Summary:
Trigger formulas can be calculated for new records, or for new records and
updates to certain fields, or all fields. They do not recalculate on open,
and they MAY be set directly by the user, including for data-cleaning.
- Column metadata now includes recalcWhen and recalcDeps fields.
- Trigger formulas are NOT recalculated on open or on schema changes.
- When recalcWhen is "never", the formula isn't calculated even for new records.
- When recalcWhen is "allupdates", the formula is calculated for new records and
any manual (non-formula) updates to the record.
- When recalcWhen is "", the formula is calculated for new records, and for changes to
recalcDeps fields (which may be formula fields or the column itself).
- A column whose recalcDeps includes itself is a "data-cleaning" column; a
value set by the user will still trigger the formula.
- All trigger-formulas receive a "value" argument (to support the case above).
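The rules above can be condensed into a small decision function. This is a sketch assuming the string values of recalcWhen listed in this summary; the function `should_recalc` and its parameters are hypothetical and do not correspond to an engine API.

```python
def should_recalc(recalc_when, recalc_deps, is_new_record, changed_cols, manual_cols):
    """
    recalc_when: "never", "allupdates", or "" (as described above).
    recalc_deps: set of column ids the trigger formula depends on.
    is_new_record: True when the record is being added.
    changed_cols: column ids whose values changed, by any means.
    manual_cols: column ids set directly by a manual (non-formula) update.
    """
    if recalc_when == "never":
        return False
    if is_new_record:
        return True
    if recalc_when == "allupdates":
        return bool(manual_cols)
    # recalc_when == "": only changes to the listed dependencies count.
    return bool(recalc_deps & changed_cols)


never_on_new = should_recalc("never", set(), True, set(), set())
default_on_new = should_recalc("", set(), True, set(), set())
all_updates = should_recalc("allupdates", set(), False, {"A"}, {"A"})
unrelated_change = should_recalc("", {"B"}, False, {"A"}, {"A"})
formula_dep_change = should_recalc("", {"B"}, False, {"B"}, set())
# Data-cleaning column "Name": it lists itself in recalc_deps, so a value
# typed in by the user still triggers the formula.
data_cleaning = should_recalc("", {"Name"}, False, {"Name"}, {"Name"})
```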
Small changes:
- Update RefLists (used for recalcDeps) when target rows are deleted.
- Add RecordList.__contains__ (for `rec in refList` or `id in refList` checks)
- Clarify that Calculate action has replaced load_done() in practice,
and use it in tests too, to better match reality.
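A minimal sketch of a `__contains__` that accepts either a record or a row id, as in the `rec in refList` and `id in refList` checks mentioned above. `RecordSketch` and `RecordListSketch` are hypothetical stand-ins; the real classes carry more state (such as the table they belong to).

```python
class RecordSketch(object):
    def __init__(self, row_id):
        self._row_id = row_id


class RecordListSketch(object):
    def __init__(self, row_ids):
        self._row_ids = list(row_ids)

    def __contains__(self, item):
        # Accept either a record object or a bare row id.
        row_id = item._row_id if isinstance(item, RecordSketch) else item
        return row_id in self._row_ids


ref_list = RecordListSketch([2, 5, 7])
record_found = RecordSketch(5) in ref_list
id_found = 7 in ref_list
record_missing = RecordSketch(3) in ref_list
```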
Left for later:
- UI for setting recalcWhen / recalcDeps.
- Implementation of actions such as "Recalculate for all cells".
- Allowing trigger-formulas access to the current user's info.
Test Plan: Added a comprehensive python-side test for various trigger combinations
Reviewers: paulfitz, alexmojaki
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D2872
Summary: Changes that move towards Python 3 compatibility and are easy to review without much thought
Test Plan: The tests
Reviewers: dsagal
Reviewed By: dsagal
Differential Revision: https://phab.getgrist.com/D2873
Summary:
This moves sandbox/grist to core, and adds a requirements.txt
file for reconstructing the content of sandbox/thirdparty.
Test Plan:
Existing tests pass.
Tested core functionality manually. Tested docker build manually.
Reviewers: dsagal
Reviewed By: dsagal
Differential Revision: https://phab.getgrist.com/D2563