Summary: I looked through the template documents mentioned in `formula-dataset-index.csv` and selected formulas involving lookups, particularly nontrivial ones, to add to the CSV.
Test Plan: Running the test script on the new dataset gives a score of 47/61, compared to the previous 45/47, i.e. it scores 2/14 on the new entries. Lookups are clearly challenging, and we'll need to add more information to the prompt, or perhaps consider a more complicated strategy than a single prompt. This diff is purely for expanding the dataset; improving performance will come later.
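The 2/14 figure follows by subtraction, assuming the 47 older entries still score 45/47 as before. A minimal sketch of that arithmetic (the variable names are illustrative and not part of the test script):

```python
# Scores as (passed, total), taken from the test plan above.
# Assumption: results on the 47 older entries are unchanged by this diff.
previous = (45, 47)  # dataset before the lookup formulas were added
current = (47, 61)   # dataset after the lookup formulas were added

new_passed = current[0] - previous[0]
new_total = current[1] - previous[1]
print(f"new entries: {new_passed}/{new_total}")
```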
Reviewers: paulfitz
Reviewed By: paulfitz
Differential Revision: https://phab.getgrist.com/D3931
Summary:
Port the script that runs an evaluation against our formula dataset.
To test, you need an OpenAI API key (see here: https://platform.openai.com/)
or a Hugging Face key (it should work as well). Then check out the branch and run
`OPENAI_API_KEY=<my_openai_api_key> node core/test/formula-dataset/runCompletion.js`
Test Plan:
Needs manual testing; so far there is no plan to make it part of CI.
The current score is around 34 successful prompts out of a total of 47.
Reviewers: paulfitz
Reviewed By: paulfitz
Subscribers: jarek
Differential Revision: https://phab.getgrist.com/D3816