Finetuning GoldenRetriever

GoldenRetriever can be finetuned with the finetuning script, which relies on the triplet generators subpackage and a triplet loss function.

Sample usage:

python -m src.finetune.main
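To illustrate the triplet loss mentioned above, here is a minimal sketch in plain Python. It is not the loss implemented in the finetuning script: the cosine distance, the margin value of 0.3 and the function names are all assumptions chosen for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(query, response, neg_response, margin=0.3):
    """Hinge-style triplet loss for one (query, response, negative) triplet.

    The loss is zero once the correct response is closer to the query than
    the negative response by at least `margin` (an assumed value).
    """
    pos = cosine_distance(query, response)
    neg = cosine_distance(query, neg_response)
    return max(0.0, pos - neg + margin)


# Hypothetical 2-d embeddings: the response matches the query exactly and
# the negative is orthogonal, so the margin is already satisfied.
loss = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
```

Minimising this quantity over many triplets pulls each query towards its correct response and pushes it away from the negative response.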

Evaluation methods

Functions to evaluate the model. These include metrics and other utility functions.

src.finetune.eval.eval_model(model, df, test_dict)

Evaluate a GoldenRetriever object.

Parameters:
  • model (GoldenRetriever's Model class object) – The model to evaluate
  • df (pd.DataFrame) – Contains the query response pairs
  • test_dict (dict) – Contains the indices of train test pairs
Returns:
  • overall_eval – pd.DataFrame that contains the metrics
  • eval_dict – dict of the same metrics

src.finetune.eval.get_eval_dict(ranks)

Score the predicted ranks according to various metrics.

Parameters: ranks (list) – predicted ranks of the correct responses
Returns: dict that contains the metrics and their respective keys

src.finetune.eval.mrr(ranks)

Calculate the mean reciprocal rank (MRR). Function taken from: https://github.com/google/retrieval-qa-eval/blob/master/squad_eval.py

Parameters: ranks (list) – predicted ranks of the correct responses
Returns: float value containing the MRR
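For readers unfamiliar with the metric, the following is a minimal sketch of the mean reciprocal rank. It assumes the ranks are 1-indexed (the best possible rank is 1); the name `mrr_sketch` is hypothetical and the body is not copied from the referenced source.

```python
def mrr_sketch(ranks):
    """Average of 1/rank over all queries.

    Assumes `ranks` is a non-empty list of 1-indexed positions at which
    the correct response was retrieved.
    """
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Three queries whose correct responses were ranked 1st, 2nd and 4th.
score = mrr_sketch([1, 2, 4])
```

A score of 1.0 means every correct response was ranked first, and the score falls towards 0 as correct responses sink lower in the ranking.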
src.finetune.eval.recall_at_n(ranks, n=3)

Calculate recall@N. Function taken from: https://github.com/google/retrieval-qa-eval/blob/master/squad_eval.py

Parameters: ranks (list) – predicted ranks of the correct responses
Returns: float value containing the Recall@N
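As a rough illustration, recall@N can be read as the fraction of queries whose correct response appears within the top N results. The sketch below assumes 1-indexed ranks; the name `recall_at_n_sketch` is hypothetical and the body is not copied from the referenced source.

```python
def recall_at_n_sketch(ranks, n=3):
    """Fraction of queries whose correct response is ranked within the top n.

    Assumes `ranks` is a non-empty list of 1-indexed ranks.
    """
    hits = sum(1 for rank in ranks if rank <= n)
    return hits / len(ranks)


# Two of the four correct responses fall within the top 3.
score = recall_at_n_sketch([1, 2, 4, 10], n=3)
```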

Finetuning Triplet Generators

TRIPLET GENERATORS: Functions that generate training triplets for the triplet loss used to finetune GoldenRetriever (GR)

src.finetune.generators.gen(query, response, neg_response, CONFIG, shuffle_data=False)

Create a generator of queries, responses and negative responses.

src.finetune.generators.hard_triplet_generator(df, train_dict, model, CONFIG)

Returns a generator that gives batches of training triplets

src.finetune.generators.random_triplet_generator(df, train_dict, CONFIG)

Returns a generator that gives batches of training triplets
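The idea of drawing random negatives can be sketched as follows. This is not the implementation of random_triplet_generator: it works on a plain list of (query, response) pairs instead of `df`, `train_dict` and `CONFIG`, and the `batch_size` and `seed` parameters are assumptions made for illustration.

```python
import random


def random_triplet_batches(pairs, batch_size=2, seed=0):
    """Yield batches of (query, response, neg_response) triplets.

    `pairs` is a list of at least two (query, response) tuples. For each
    pair, the negative is the response of a different, randomly chosen
    pair. If two pairs share the same response text, the negative can
    coincide with the positive; this sketch does not filter that case.
    """
    rng = random.Random(seed)
    batch = []
    for i, (query, response) in enumerate(pairs):
        # Choose the index of any pair other than the current one.
        j = rng.choice([k for k in range(len(pairs)) if k != i])
        batch.append((query, response, pairs[j][1]))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


# Hypothetical toy data.
pairs = [("q1", "r1"), ("q2", "r2"), ("q3", "r3")]
batches = list(random_triplet_batches(pairs, batch_size=2))
```

A hard triplet generator would presumably use the model to pick negatives that are difficult to tell apart from the correct response, which is why hard_triplet_generator takes an additional `model` argument; the details of that selection are not described here.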