bigbigwatermalon/FinSQL

FinSQL: Model-Agnostic LLMs-based Text-to-SQL Framework for Financial Analysis

This repository contains the code and dataset for the paper FinSQL: Model-Agnostic LLMs-based Text-to-SQL Framework for Financial Analysis.

Before starting, download the dataset and the database from Google Drive and place them in the dataset/ directory.

The file structure should then look like this:

. tree           
├── BULL-cn
├── BULL-en
├── README.md
├── database_cn
├── database_en
└── get_dev.py
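
A quick way to confirm the layout before running anything is a small check like the one below. This is only a sketch: `check_dataset_layout` is a hypothetical helper that is not part of the repository, and it assumes the entries listed above sit directly under `dataset/` (README.md is not checked).

```shell
# Hypothetical helper (not part of FinSQL): report which expected
# entries are missing from the dataset directory.
check_dataset_layout() {
  local root="${1:-dataset}"   # directory to inspect, defaults to dataset/
  local missing=0
  local entry
  for entry in BULL-cn BULL-en database_cn database_en get_dev.py; do
    if [ ! -e "$root/$entry" ]; then
      echo "missing: $root/$entry"
      missing=1
    fi
  done
  # Exit status 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

Calling `check_dataset_layout dataset` from the repository root prints one line per missing entry and returns a non-zero status if anything is absent.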

Next, preprocess the dataset:

bash scripts/preprocessing_finsql.sh

Then train the Cross-Encoder model:

bash scripts/train_text2sql_schema_item_classifier_finsql.sh

Finally, use the Cross-Encoder model to predict on the dev set:

bash scripts/generate_text2sql_dataset_finsql.sh

After preprocessing the dataset, perform hybrid data augmentation:

bash scripts/hybrid_augmentation.sh 

Then train the LLM:

bash ds_sft.sh  
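
The steps above can be chained in one wrapper so that a failure stops the run early. The function below is a minimal sketch, not a script shipped with the repository: `run_finsql_pipeline` is a hypothetical name, and it assumes the commands run from the repository root in the order they are listed in this README.

```shell
# Hypothetical wrapper (not part of FinSQL): run each step in the order
# listed in this README and stop at the first failure.
run_finsql_pipeline() {
  local runner="${1:-bash}"   # pass "echo" for a dry run that only prints paths
  local step
  for step in \
    scripts/preprocessing_finsql.sh \
    scripts/train_text2sql_schema_item_classifier_finsql.sh \
    scripts/generate_text2sql_dataset_finsql.sh \
    scripts/hybrid_augmentation.sh \
    ds_sft.sh; do
    echo "==> $step"
    "$runner" "$step" || { echo "step failed: $step" >&2; return 1; }
  done
}
```

Running `run_finsql_pipeline echo` first prints the script paths without executing them, which is a cheap way to check the order before committing to a long training run.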
