Commit Graph

15 Commits

Author SHA1 Message Date
Dmitry S
d191859be7 (core) For exporting XLSX, do it memory-efficiently in a worker thread.
Summary:
- Excel exports were awfully memory-inefficient, causing occasional docWorker
  crashes. The fix is to use the "streaming writer" option of ExcelJS
  https://github.com/exceljs/exceljs#streaming-xlsx-writercontents. (Empirically
  on one example, max memory went down from 3G to 100M)
- It's also CPU intensive and synchronous, and can block node for tens of
  seconds. The fix is to use a worker-thread. This diff uses "piscina" library
  for a pool of threads.
- Additionally, adds ProcessMonitor that logs memory and cpu usage,
  particularly when those change significantly.
- Also introduces request cancellation, so that a long download cancelled by
  the user will cancel the work being done in the worker thread.

Test Plan:
Updated previous export tests; memory and CPU performance tested
manually by watching output of ProcessMonitor.

Difference visible in these log excerpts:

Before (total time to serve request 22 sec):
```
Telemetry processMonitor heapUsedMB=2187, heapTotalMB=2234, cpuAverage=1.13, intervalMs=17911
Telemetry processMonitor heapUsedMB=2188, heapTotalMB=2234, cpuAverage=0.66, intervalMs=5005
Telemetry processMonitor heapUsedMB=2188, heapTotalMB=2234, cpuAverage=0, intervalMs=5005
Telemetry processMonitor heapUsedMB=71, heapTotalMB=75, cpuAverage=0.13, intervalMs=5002
```
After (total time to server request 18 sec):
```
Telemetry processMonitor heapUsedMB=109, heapTotalMB=144, cpuAverage=0.5, intervalMs=5001
Telemetry processMonitor heapUsedMB=109, heapTotalMB=144, cpuAverage=1.39, intervalMs=5002
Telemetry processMonitor heapUsedMB=94, heapTotalMB=131, cpuAverage=1.13, intervalMs=5000
Telemetry processMonitor heapUsedMB=94, heapTotalMB=131, cpuAverage=1.35, intervalMs=5001
```
Note in "Before" that heapTotalMB goes up to 2GB in the first case, and "intervalMs" of 17 seconds indicates that node was unresponsive for that long. In the second case, heapTotalMB stays low, and the main thread remains responsive the whole time.

Reviewers: jarek

Reviewed By: jarek

Differential Revision: https://phab.getgrist.com/D3906
2023-06-01 12:06:48 -04:00
Dmitry S
65013331a3 (core) Fix imports into reference columns, and support two ways to import Numeric as a reference.
Summary:
- When importing into a Ref column, use lookupOne() formula for correct previews.
- When selecting columns to import into a Ref column, now a Numeric column like
  'Order' will produce two options: "Order" and "Order (as row ID)".
- Fixes exports to correct the formatting of visible columns. This addresses multiple bugs:
  1. Formatting wasn't used, e.g. a Ref showing a custom-formatted date was still presented as YYYY-MM-DD in CSVs.
  2. Ref showing a Numeric column was formatted as if a row ID (e.g. `Table1[1.5]`), which is very wrong.
- If importing into a table that doesn't have a primary view, don't switch page after import.

Refactorings:
- Generalize GenImporterView to be usable in more cases; removed near-duplicated logic from node side
- Some other refactoring in importing code.
- Fix field/column option selection in ValueParser
- Add NUM() helper to turn integer-valued floats into ints, useful for "as row ID" lookups.

Test Plan: Added test cases for imports into reference columns, updated Exports test fixtures.

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3875
2023-05-02 10:28:14 -04:00
Louis Delbosc
c54e910fd6
Export table schema (#459)
* add endpoint
* Add table-schema transformation data
2023-03-16 17:37:24 -04:00
Paul Fitzpatrick
472a9a186e (core) control the distribution of attachment metadata
Summary:
for users who don't automatically have deep rights
to the document, provide them with attachment metadata only
for rows they have access to. This is a little tricky to
do efficiently. We provide attachment metadata when an
individual table is fetched, rather than on initial document
load, so we don't block that load on a full document scan.
We provide attachment metadata to a client when we see that
we are shipping rows mentioning particular attachments,
without making any effort to keep track of the metadata they
already have.

Test Plan: updated tests

Reviewers: dsagal, jarek

Reviewed By: dsagal, jarek

Differential Revision: https://phab.getgrist.com/D3722
2022-12-22 09:10:30 -05:00
Jarosław Sadziński
198beaab2a (core) Ref columns weren't filtered on csv/excel export for sections.
Summary:
Ref columns weren't filtred on section export.
Filters were applied to a display helper columns instead
of the actual columns.

Test Plan: Updated tests

Reviewers: alexmojaki

Reviewed By: alexmojaki

Subscribers: alexmojaki

Differential Revision: https://phab.getgrist.com/D3644
2022-09-28 22:32:14 +02:00
Louis Delbosc
494a683332
Export xlsx #256 (#270)
XLSX export of active view / table

Co-authored-by: Louis Delbosc <louis.delbosc.prestataire@anct.gouv.fr>
Co-authored-by: Vincent Viers <vincent.viers@beta.gouv.fr>
2022-09-14 14:55:44 -04:00
Dmitry S
1c24bfc8a6 (core) Fix exports to CSV/XLSX/etc when data is restricted by access rules
Summary:
- The issue manifested as error "Cannot read property '0' of undefined" in some
  cases, and as "Blocked by table read access rules" in others (instead of
  limiting output to what's not blocked)
- Goes deeper: exports weren't respecting metadata censoring.
- The fix changes exports to use censored metadata, which addresses both errors above.
- Includes an improvement to column ordering in XLSX exports.

Test Plan: Add a server test for CSV and XLSX exports with access rules

Reviewers: paulfitz, georgegevoian

Reviewed By: paulfitz, georgegevoian

Subscribers: paulfitz

Differential Revision: https://phab.getgrist.com/D3615
2022-09-02 10:59:59 -04:00
Alex Hall
faec8177ab (core) Use MetaTableData more
Summary:
Add more method overrides to MetaTableData for extra type safety.

Use MetaTableData, MetaRowRecord, and getMetaTable in more places.

Test Plan: Mostly it just has to compile. Tested manually that types are being checked more strictly now, e.g. by adding a typo to property names. Some type casting has also been removed.

Reviewers: dsagal

Reviewed By: dsagal

Subscribers: dsagal

Differential Revision: https://phab.getgrist.com/D3168
2021-12-07 17:09:58 +02:00
Jarosław Sadziński
53bdd6c8e1 (core) Exposing more descriptive errors from exports
Summary:
Exports used to show generic message on error.
Adding error description to the message.

Test Plan: Updated tests

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3157
2021-11-30 17:26:32 +01:00
George Gevoian
7fe4423a6f (core) Allow filtering hidden columns
Summary:
Existing filters are now moved out of fields
and into a new metadata table for filters, and the
client is updated to retrieve/update/save filters from
the new table. This enables storing of filters for
columns that don't have fields (notably, hidden columns).

Test Plan: Browser and server tests.

Reviewers: jarek

Reviewed By: jarek

Differential Revision: https://phab.getgrist.com/D3138
2021-11-22 10:26:08 -08:00
Jarosław Sadziński
3c72639e25 (core) Adding sort options for columns.
Summary:
Adding sort options for columns.
- Sort menu has a new option "More sort options" that opens up Sort left menu
- Each sort entry has an additional menu with 3 options
-- Order by choice index (for the Choice column, orders by choice position)
-- Empty last (puts empty values last in ascending order, first in descending order)
-- Natural sort (for Text column, compares strings with numbers as numbers)
Updated also CSV/Excel export and api sorting.
Most of the changes in this diff is a sort expression refactoring. Pulling out all the methods
that works on sortExpression array into a single namespace.

Test Plan: Browser tests

Reviewers: alexmojaki

Reviewed By: alexmojaki

Subscribers: dsagal, alexmojaki

Differential Revision: https://phab.getgrist.com/D3077
2021-11-03 15:31:39 +01:00
George Gevoian
0717ee627e (core) Relocate export urls to /download/
Summary:
Moves CSV and XLSX export urls under /download/, and
removes the document title query parameter which is now
retrieved from the backend.

Test Plan: No new tests. Existing tests that verify endpoints still function.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3010
2021-09-02 09:36:33 -07:00
George Gevoian
a6e08883e0 (core) Simple localization support and currency selector.
Summary:
- Grist document has a associated "locale" setting that affects how currency is formatted.
- Currency selector for number format.

Test Plan: not done

Reviewers: dsagal

Reviewed By: dsagal

Subscribers: paulfitz

Differential Revision: https://phab.getgrist.com/D2977
2021-08-26 13:36:49 -07:00
Paul Fitzpatrick
bb8cb2593d (core) support python3 in grist-core, and running engine via docker and/or gvisor
Summary:
 * Moves essential plugins to grist-core, so that basic imports (e.g. csv) work.
 * Adds support for a `GRIST_SANDBOX_FLAVOR` flag that can systematically override how the data engine is run.
   - `GRIST_SANDBOX_FLAVOR=pynbox` is "classic" nacl-based sandbox.
   - `GRIST_SANDBOX_FLAVOR=docker` runs engines in individual docker containers. It requires an image specified in `sandbox/docker` (alternative images can be named with `GRIST_SANDBOX` flag - need to contain python and engine requirements). It is a simple reference implementation for sandboxing.
   - `GRIST_SANDBOX_FLAVOR=unsandboxed` runs whatever local version of python is specified by a `GRIST_SANDBOX` flag directly, with no sandboxing. Engine requirements must be installed, so an absolute path to a python executable in a virtualenv is easiest to manage.
   - `GRIST_SANDBOX_FLAVOR=gvisor` runs the data engine via gvisor's runsc. Experimental, with implementation not included in grist-core. Since gvisor runs on Linux only, this flavor supports wrapping the sandboxes in a single shared docker container.
 * Tweaks some recent express query parameter code to work in grist-core, which has a slightly different version of express (smoke test doesn't catch this since in Jenkins core is built within a workspace that has node_modules, and wires get crossed - in a dev environment the problem on master can be seen by doing `buildtools/build_core.sh /tmp/any_path_outside_grist`).

The new sandbox options do not have tests yet, nor does this they change the behavior of grist servers today. They are there to clean up and consolidate a collection of patches I've been using that were getting cumbersome, and make it easier to run experiments.

I haven't looked closely at imports beyond core.

Test Plan: tested manually against regular grist and grist-core, including imports

Reviewers: alexmojaki, dsagal

Reviewed By: alexmojaki

Differential Revision: https://phab.getgrist.com/D2942
2021-07-28 09:02:32 -04:00
Jarosław Sadziński
08295a696b (core) Export to Excel and Send to drive
Summary:
Implementing export to excel and send to Google Drive feature.

As part of this feature few things were implemented:
- Server side google authentication exposed on url: (docs, docs-s, or localhost:8080)/auth/google
- Exporting grist documents as an excel file (xlsx)
- Storing exported grist document (in excel format) in Google Drive as a spreadsheet document.

Server side google authentication requires one new environmental variables
- GOOGLE_CLIENT_SECRET (required) used by authentication handler

Test Plan: Browser tests for exporting to excel.

Reviewers: paulfitz, dsagal

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D2924
2021-07-21 16:36:00 +02:00