Commit Graph

3 Commits

Author SHA1 Message Date
Alex Hall
6c90de4d62 (core) Switch excel import parsing from messytables+xlrd to openpyxl, and ignore empty rows
Summary:
Use openpyxl instead of messytables (which used xlrd internally) in import_xls.py.

Skip empty rows since excel files can easily contain huge numbers of them.

Drop support for xls files (which openpyxl doesn't support) in favour of the newer xlsx format.

Fix some details relating to python virtualenvs and dependencies, as Jenkins was failing to find new Python dependencies.

Test Plan: Mostly relying on existing tests. Updated various tests which referred to xls files instead of xlsx. Added a Python test for skipping empty rows.

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3406
2022-05-12 14:43:21 +02:00
Alex Hall
5b16dd2e86 (core) Upgrade chardet
Summary: Upgrade chardet version from 2.3.0 to 4.0.0 to improve encoding detection when importing files with non-ascii characters.

Test Plan: the tests

Reviewers: jarek

Reviewed By: jarek

Differential Revision: https://phab.getgrist.com/D3080
2021-10-21 18:29:17 +02:00
Paul Fitzpatrick
bb8cb2593d (core) support python3 in grist-core, and running engine via docker and/or gvisor
Summary:
 * Moves essential plugins to grist-core, so that basic imports (e.g. csv) work.
 * Adds support for a `GRIST_SANDBOX_FLAVOR` flag that can systematically override how the data engine is run.
   - `GRIST_SANDBOX_FLAVOR=pynbox` is "classic" nacl-based sandbox.
   - `GRIST_SANDBOX_FLAVOR=docker` runs engines in individual docker containers. It requires an image specified in `sandbox/docker` (alternative images can be named with `GRIST_SANDBOX` flag - need to contain python and engine requirements). It is a simple reference implementation for sandboxing.
   - `GRIST_SANDBOX_FLAVOR=unsandboxed` runs whatever local version of python is specified by a `GRIST_SANDBOX` flag directly, with no sandboxing. Engine requirements must be installed, so an absolute path to a python executable in a virtualenv is easiest to manage.
   - `GRIST_SANDBOX_FLAVOR=gvisor` runs the data engine via gvisor's runsc. Experimental, with implementation not included in grist-core. Since gvisor runs on Linux only, this flavor supports wrapping the sandboxes in a single shared docker container.
 * Tweaks some recent express query parameter code to work in grist-core, which has a slightly different version of express (smoke test doesn't catch this since in Jenkins core is built within a workspace that has node_modules, and wires get crossed - in a dev environment the problem on master can be seen by doing `buildtools/build_core.sh /tmp/any_path_outside_grist`).

The new sandbox options do not have tests yet, nor does this they change the behavior of grist servers today. They are there to clean up and consolidate a collection of patches I've been using that were getting cumbersome, and make it easier to run experiments.

I haven't looked closely at imports beyond core.

Test Plan: tested manually against regular grist and grist-core, including imports

Reviewers: alexmojaki, dsagal

Reviewed By: alexmojaki

Differential Revision: https://phab.getgrist.com/D2942
2021-07-28 09:02:32 -04:00