Commit Graph

2 Commits

Author SHA1 Message Date
Alex Hall
d88f79bc5e (core) Add cases involving lookups to formula dataset
Summary: I looked through the template documents mentioned in `formula-dataset-index.csv` and selected formulas involving lookups to add to the CSV, particularly nontrivial formulas.

Test Plan: Running the test script on the new dataset gives a score of 47/61 compared to the previous 45/47, i.e. it scores 2/14 on the new entries. Lookups are clearly challenging and we'll need to add more information to the prompt, maybe even consider a more complicated strategy than a single prompt. This diff is purely for expanding the dataset, improving performance will come later.

Reviewers: paulfitz

Reviewed By: paulfitz

Differential Revision: https://phab.getgrist.com/D3931
2023-06-26 13:18:52 +02:00
Cyprien P
1ff93f89c2 (core) Porting the AI evaluation script
Summary:
Porting script that run an evaluation against our formula dataset.

To test you need an openai key (see here: https://platform.openai.com/)
or hugging face (it should work as well), then checkout the branch and run

`OPENAI_API_KEY=<my_openai_api_key> node core/test/formula-dataset/runCompletion.js`

Test Plan:
Needs manually testing: so far there is no plan to make it part of CI.

The current score is somewhere around 34 successful prompts over a total of 47.

Reviewers: paulfitz

Reviewed By: paulfitz

Subscribers: jarek

Differential Revision: https://phab.getgrist.com/D3816
2023-03-15 14:54:28 +01:00