

kjappelbaum/matchem-eval

 
 


matchem-eval

A public repository collecting links to state-of-the-art QA and evaluation sets for various ML and LLM applications

  • MaScQA: A dataset of 650 challenging materials science questions that require the knowledge and skills of a student who has completed an undergraduate degree in the field. Questions are classified by their structure and by materials science subdomain.
  • ChemQA: A multimodal question-answering dataset on chemistry reasoning, inspired by IsoBench and ChemLLMBench. It contains five QA tasks: counting the numbers of carbons and hydrogens in organic molecules; calculating molecular weights of organic molecules; name conversion from SMILES to IUPAC; molecule captioning and editing; and retrosynthesis planning (inspired by [2], adapted from the dataset provided in [4], following the same training, validation, and evaluation splits).
  • ChemLLMBench: A comprehensive benchmark on eight chemistry tasks
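
To give a feel for the kind of reference answers that the atom-counting and molecular-weight tasks listed for ChemQA call for, here is a minimal sketch in plain Python. It is not taken from any of the linked datasets: the helper names (`parse_formula`, `molecular_weight`, `is_correct`) and the 1% scoring tolerance are assumptions made for illustration, and the parser takes a flat molecular formula such as `C2H6O` rather than a SMILES string or an image.

```python
import re

# Standard atomic masses in g/mol for a few common elements; extend as needed.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def parse_formula(formula: str) -> dict[str, int]:
    """Count atoms in a flat molecular formula such as 'C2H6O' (no parentheses)."""
    counts: dict[str, int] = {}
    # Each match is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def molecular_weight(formula: str) -> float:
    """Sum atomic masses over all atoms; raises KeyError for unknown elements."""
    return sum(ATOMIC_MASS[el] * n for el, n in parse_formula(formula).items())


def is_correct(predicted: float, reference: float, rel_tol: float = 0.01) -> bool:
    """Hypothetical scoring rule: accept answers within a relative tolerance."""
    return abs(predicted - reference) <= rel_tol * abs(reference)


if __name__ == "__main__":
    counts = parse_formula("C2H6O")  # ethanol
    print(counts["C"], counts["H"])
    print(round(molecular_weight("C2H6O"), 3))
```

A model's numeric answer could then be compared with `is_correct(model_answer, molecular_weight("C2H6O"))`. Each benchmark defines its own scoring, so check the linked dataset before relying on a tolerance like this one.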
