(core) Lossless imports

Summary:
- Removed string parsing and some type-guessing code from parse_data.py. ValueGuesser now does that work implicitly, because the initial column type is left as Any. parse_data.py now mostly matters when importing files (e.g. Excel) whose values already have types, i.e. numbers and dates.
- 0s and 1s are treated as numbers instead of booleans to keep imports lossless.
- Removed dateguess.py and test_dateguess.py.
- Changed what `guessDateFormat` does when multiple date formats work equally well for the given data, in order to be consistent with the old dateguess.py.
- Columns containing numbers are now always imported as Numeric, never Int.
- Removed `NullIfEmptyParser` because it was interfering with the new system. Its purpose was to avoid pointlessly changing a column from Any to Text when no actual data was inserted. A different solution to that problem was already added to `_ensure_column_accepts_data` in the data engine in a recent related diff.
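
A minimal sketch of the lossless rule described in the summary, for illustration only. It is not the actual ValueGuesser code: `guessColumnType`, `isLosslessNumber`, and the `GuessedType` names are hypothetical. The assumption is that a column becomes Numeric only when every non-empty value converts to a number and back to the same text, so 0s and 1s stay numbers and nothing is ever narrowed to Int or Bool.

```typescript
// Hypothetical sketch; these names are illustrative, not Grist's API.
type GuessedType = "Numeric" | "Text" | "Any";

// A string is a "lossless" number if converting it to a number and back
// reproduces the original text exactly (so "007" and "1.50" are not).
function isLosslessNumber(text: string): boolean {
  const n = Number(text);
  return Number.isFinite(n) && String(n) === text.trim();
}

function guessColumnType(values: string[]): GuessedType {
  const nonEmpty = values.filter((v) => v.trim() !== "");
  // With no actual data, leave the column as Any rather than forcing Text.
  if (nonEmpty.length === 0) { return "Any"; }
  // Numbers are always Numeric, never Int, and 0/1 are never booleans.
  return nonEmpty.every(isLosslessNumber) ? "Numeric" : "Text";
}
```

Under these assumptions, a column of `"0"` and `"1"` is guessed as Numeric, while a column containing `"007"` stays Text so that the leading zeros survive the import.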

Test Plan:
- Added 2 `nbrowser/Importer2` tests.
- Updated various existing tests.
- Extended testing of `guessDateFormat`. Added `guessDateFormats` to show how ambiguous dates are handled internally.
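
A rough sketch of how the ambiguous case can be reasoned about, for illustration only. It is a stand-in rather than the real `guessDateFormat` or `guessDateFormats`: the function names, the three candidate formats, and the "first candidate wins" tie-break are all assumptions made here. The `"YYYY-MM-DD"` fallback mirrors the default visible in the diff.

```typescript
// Hypothetical stand-in supporting only three formats; not the real implementation.
const CANDIDATES = ["YYYY-MM-DD", "MM/DD/YYYY", "DD/MM/YYYY"];
const DEFAULT_FORMAT = "YYYY-MM-DD";

// Check whether a value fits a format made of YYYY, MM, DD and one separator.
function matches(value: string, format: string): boolean {
  const sep = format.indexOf("/") >= 0 ? "/" : "-";
  const tokens = format.split(sep);
  const parts = value.split(sep);
  if (parts.length !== tokens.length) { return false; }
  let month = 0;
  let day = 0;
  for (let i = 0; i < tokens.length; i++) {
    // Each part must be all digits and as long as its token.
    if (!/^\d+$/.test(parts[i]) || parts[i].length !== tokens[i].length) { return false; }
    if (tokens[i] === "MM") { month = Number(parts[i]); }
    if (tokens[i] === "DD") { day = Number(parts[i]); }
  }
  return month >= 1 && month <= 12 && day >= 1 && day <= 31;
}

// Every candidate format that fits all non-empty values, in candidate order.
function guessDateFormatsSketch(values: string[]): string[] {
  const nonEmpty = values.filter((v) => v !== "");
  if (nonEmpty.length === 0) { return []; }
  return CANDIDATES.filter((f) => nonEmpty.every((v) => matches(v, f)));
}

// Pick one format; the tie-break (first candidate) is an assumption of this sketch.
function guessDateFormatSketch(values: string[]): string {
  const formats = guessDateFormatsSketch(values);
  return formats.length > 0 ? formats[0] : DEFAULT_FORMAT;
}
```

In this sketch, `["01/02/2022", "03/04/2022"]` fits both slash formats and is therefore ambiguous, while adding `"13/02/2022"` rules out `MM/DD/YYYY` because 13 is not a valid month.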

Reviewers: georgegevoian

Reviewed By: georgegevoian

Differential Revision: https://phab.getgrist.com/D3302
Alex Hall
2022-03-04 19:37:56 +02:00
parent 9522438967
commit 321019217d
14 changed files with 150 additions and 785 deletions

@@ -353,6 +353,7 @@ export class Importer extends DisposableWithEvents {
label: field.label(),
colId: destTableId ? field.colId() : null, // if inserting into new table, colId isn't defined
type: field.column().type(),
+ widgetOptions: field.column().widgetOptions(),
formula: field.column().formula()
})),
sourceCols: sourceFields.map((field) => field.colId())

@@ -105,7 +105,7 @@ export async function prepTransformColInfo(docModel: DocModel, origCol: ColumnRe
let {dateFormat} = prevOptions;
if (!dateFormat) {
const colValues = tableData.getColValues(sourceCol.colId()) || [];
- dateFormat = guessDateFormat(colValues.map(String)) || "YYYY-MM-DD";
+ dateFormat = guessDateFormat(colValues.map(String));
}
widgetOptions = dateTimeWidgetOptions(dateFormat, true);
break;