gristlabs_grist-core

Archives/gristlabs_grist-core

Fork 0

mirror of https://github.com/gristlabs/grist-core.git synced 2026-03-02 04:09:24 +00:00

Commit Graph

Author	SHA1	Message	Date
Alex Hall	d88f79bc5e	(core) Add cases involving lookups to formula dataset Summary: I looked through the template documents mentioned in `formula-dataset-index.csv` and selected formulas involving lookups to add to the CSV, particularly nontrivial formulas. Test Plan: Running the test script on the new dataset gives a score of 47/61 compared to the previous 45/47, i.e. it scores 2/14 on the new entries. Lookups are clearly challenging and we'll need to add more information to the prompt, maybe even consider a more complicated strategy than a single prompt. This diff is purely for expanding the dataset, improving performance will come later. Reviewers: paulfitz Reviewed By: paulfitz Differential Revision: https://phab.getgrist.com/D3931	2023-06-26 13:18:52 +02:00
Cyprien P	1ff93f89c2	(core) Porting the AI evaluation script Summary: Porting script that run an evaluation against our formula dataset. To test you need an openai key (see here: https://platform.openai.com/) or hugging face (it should work as well), then checkout the branch and run `OPENAI_API_KEY=<my_openai_api_key> node core/test/formula-dataset/runCompletion.js` Test Plan: Needs manually testing: so far there is no plan to make it part of CI. The current score is somewhere around 34 successful prompts over a total of 47. Reviewers: paulfitz Reviewed By: paulfitz Subscribers: jarek Differential Revision: https://phab.getgrist.com/D3816	2023-03-15 14:54:28 +01:00

Author

SHA1

Message

Date

Alex Hall

d88f79bc5e

(core) Add cases involving lookups to formula dataset

Summary: I looked through the template documents mentioned in `formula-dataset-index.csv` and selected formulas involving lookups to add to the CSV, particularly nontrivial formulas.

Test Plan: Running the test script on the new dataset gives a score of 47/61 compared to the previous 45/47, i.e. it scores 2/14 on the new entries. Lookups are clearly challenging and we'll need to add more information to the prompt, maybe even consider a more complicated strategy than a single prompt. This diff is purely for expanding the dataset, improving performance will come later.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3931

2023-06-26 13:18:52 +02:00

Cyprien P

1ff93f89c2

(core) Porting the AI evaluation script

Summary:
Porting script that run an evaluation against our formula dataset.

To test you need an openai key (see here: https://platform.openai.com/)
or hugging face (it should work as well), then checkout the branch and run

`OPENAI_API_KEY=<my_openai_api_key> node core/test/formula-dataset/runCompletion.js`

Test Plan:
Needs manually testing: so far there is no plan to make it part of CI.

The current score is somewhere around 34 successful prompts over a total of 47.

Reviewers: paulfitz

Reviewed By: paulfitz

Subscribers: jarek

Differential Revision: https://phab.getgrist.com/D3816

2023-03-15 14:54:28 +01:00

2 Commits