(core) Add BulkAddOrUpdateRecord action for efficiency

Summary:
This diff adds a new `BulkAddOrUpdateRecord` user action, which is what it sounds like:

- A bulk version of the existing `AddOrUpdateRecord` action.
- Much more efficient for operating on many records than applying many individual actions.
- Column values are specified as maps from `colId` to arrays of values as usual.
- Produces bulk versions of `AddRecord` and `UpdateRecord` actions instead of many individual actions.
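
To illustrate the column-wise format described above, here is a minimal sketch that transposes row-wise `require` and `fields` objects into maps from `colId` to arrays and assembles a bulk action. The types, the `toBulkColValues` helper, and the `People` table are hypothetical, and the argument order is an assumption that mirrors `AddOrUpdateRecord` (table id, `require`, column values, options); it is not taken from this diff.

```typescript
// Hypothetical types for illustration; not taken from the Grist codebase.
type CellValue = string | number | boolean | null;
type ColValues = Record<string, CellValue>;
type BulkColValues = Record<string, CellValue[]>;

// Transpose row-wise records into column-wise arrays: {colId: [v1, v2, ...]}.
// Assumes every row has the same set of keys.
function toBulkColValues(rows: ColValues[]): BulkColValues {
  const result: BulkColValues = {};
  for (const row of rows) {
    for (const colId of Object.keys(row)) {
      if (!result[colId]) { result[colId] = []; }
      result[colId].push(row[colId]);
    }
  }
  return result;
}

const requireRows = [{email: "a@example.com"}, {email: "b@example.com"}];
const fieldRows = [{name: "Alice", age: 30}, {name: "Bob", age: 25}];

// Assumed argument order: table id, require, column values, options.
const bulkAction = [
  "BulkAddOrUpdateRecord",
  "People",
  toBulkColValues(requireRows),
  toBulkColValues(fieldRows),
  {},
];
```

Each array position corresponds to one record, so all arrays in `require` and in the column values must have the same length.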

Examples of users wanting to use something like `AddOrUpdateRecord` with large numbers of records:

- https://grist.slack.com/archives/C0234CPPXPA/p1651789710290879
- https://grist.slack.com/archives/C0234CPPXPA/p1660743493480119
- https://grist.slack.com/archives/C0234CPPXPA/p1660333148491559
- https://grist.slack.com/archives/C0234CPPXPA/p1663069291726159

I tested what made many `AddOrUpdateRecord` actions slow in the first place. It was almost entirely due to producing many individual `AddRecord` user actions. About half of that time was for processing the resulting `AddRecord` doc actions. Lookups and updates were not a problem. With these changes, the slowness is gone.

The Python user action implementation is more complex but there are no surprises. The JS API now groups `records` based on the keys of `require` and `fields` so that `BulkAddOrUpdateRecord` can be applied to each group.
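
The grouping step can be sketched roughly as follows. This is an illustration only, not the code from this diff: `UpsertRecord`, `keySignature`, and `groupByKeys` are hypothetical names, and the sketch assumes that records with the same sorted sets of `require` and `fields` keys can share one bulk action.

```typescript
// Hypothetical shapes for illustration; the real API types may differ.
type Value = string | number | boolean | null;
interface UpsertRecord {
  require: Record<string, Value>;
  fields?: Record<string, Value>;
}

// Build a signature from the sorted key names of `require` and `fields`.
function keySignature(rec: UpsertRecord): string {
  const requireKeys = Object.keys(rec.require).sort();
  const fieldKeys = Object.keys(rec.fields || {}).sort();
  return JSON.stringify([requireKeys, fieldKeys]);
}

// Group records so that every record in a group uses the same columns.
function groupByKeys(records: UpsertRecord[]): UpsertRecord[][] {
  const groups = new Map<string, UpsertRecord[]>();
  for (const rec of records) {
    const sig = keySignature(rec);
    const group = groups.get(sig);
    if (group) {
      group.push(rec);
    } else {
      groups.set(sig, [rec]);
    }
  }
  return Array.from(groups.values());
}

const groups = groupByKeys([
  {require: {email: "a@example.com"}, fields: {name: "Alice"}},
  {require: {email: "b@example.com"}, fields: {name: "Bob", age: 25}},
  {require: {email: "c@example.com"}, fields: {name: "Carol"}},
]);
```

Here the first and third records share a group because they use the same columns, while the second gets its own group because it also sets `age`. Each group could then be turned into a single bulk action.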

Test Plan: Update and extend Python and DocApi tests.

Reviewers: jarek, paulfitz

Reviewed By: jarek, paulfitz

Subscribers: jarek

Differential Revision: https://phab.getgrist.com/D3642
Alex Hall
2022-09-28 15:13:07 +02:00
parent df65219729
commit 1864b7ba5d
6 changed files with 261 additions and 40 deletions


@@ -77,6 +77,10 @@ function isAclTable(tableId: string): boolean {
return ['_grist_ACLRules', '_grist_ACLResources'].includes(tableId);
}
+function isAddOrUpdateRecordAction(a: UserAction): boolean {
+return ['AddOrUpdateRecord', 'BulkAddOrUpdateRecord'].includes(String(a[0]));
+}
// A list of key metadata tables that need special handling. Other metadata tables may
// refer to material in some of these tables but don't need special handling.
// TODO: there are other metadata tables that would need access control, or redesign -
@@ -128,8 +132,9 @@ const OTHER_RECOGNIZED_ACTIONS = new Set([
'BulkRemoveRecord',
'ReplaceTableData',
-// A data action handled specially because of read needs.
+// Data actions handled specially because of read needs.
'AddOrUpdateRecord',
+'BulkAddOrUpdateRecord',
// Groups of actions.
'ApplyDocActions',
@@ -979,21 +984,22 @@ export class GranularAccess implements GranularAccessForBundle {
// way to do that within the data engine as currently
// formulated. Could perhaps be done for on-demand tables though.
private async _checkAddOrUpdateAccess(docSession: OptDocSession, actions: UserAction[]) {
-if (!scanActionsRecursively(actions, (a) => a[0] === 'AddOrUpdateRecord')) {
+if (!scanActionsRecursively(actions, isAddOrUpdateRecordAction)) {
// Don't need to apply this particular check.
return;
}
// Fail if being combined with anything fancy.
if (scanActionsRecursively(actions, (a) => {
const name = a[0];
-return !['ApplyUndoActions', 'ApplyDocActions', 'AddOrUpdateRecord'].includes(String(name)) &&
+return !['ApplyUndoActions', 'ApplyDocActions'].includes(String(name)) &&
+!isAddOrUpdateRecordAction(a) &&
!(isDataAction(a) && !getTableId(a).startsWith('_grist_'));
})) {
throw new Error('Can only combine AddOrUpdate with simple data changes');
}
// Check for read access, and that we're not touching metadata.
await applyToActionsRecursively(actions, async (a) => {
-if (a[0] !== 'AddOrUpdateRecord') { return; }
+if (!isAddOrUpdateRecordAction(a)) { return; }
const tableId = validTableIdString(a[1]);
if (tableId.startsWith('_grist_')) {
throw new Error(`AddOrUpdate cannot yet be used on metadata tables`);