Medical Question Answering Transformer Model — Project

Transforming Medical Inquiry

A.I Hub
5 min read · Aug 10, 2024

In the realm of healthcare, where precision and clarity are paramount, a new frontier is emerging with the advent of transformer models tailored for medical question answering. Imagine an AI system capable of deciphering complex medical queries, synthesizing vast amounts of clinical data and delivering accurate, actionable insights with unprecedented speed. This project explores the cutting-edge intersection of artificial intelligence and medical science, where transformer models are redefining how we approach medical information. As we delve into this transformative technology, prepare to uncover how these advanced models are poised to revolutionize patient care, enhance medical research and set a new standard in the quest for accurate medical knowledge.

Table of Contents

  • Data preparation
  • Model declaration
  • Creating prompt and tokenization
  • Training with PEFT and LoRA

Data Preparation


We use the MedQuAD medical question answering dataset, which includes 47,457 medical question-answer pairs. The dataset can be downloaded from the link below.

The dataset consists of thousands of XML files. We will clean the data and produce a single JSON file with three keys ['instruction', 'input', 'output'] for each document. (https://github.com/abachaa/MedQuAD)

The JSON file looks like this:

[
  {
    "instruction": "How can you treat my diabetes?",
    "input": "I have uncontrolled diabetes. My A1C is above 7.5.",
    "output": "You can treat it in the following ways:\n1. Get physically active\n2. Take medication as prescribed by your doctor\n3. Check your blood sugar regularly"
  }
]
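
As a rough sketch of that cleaning step, the snippet below walks the MedQuAD XML files and flattens each question-answer pair into the three-key format. The element names (QAPair, Question, Answer) reflect MedQuAD's layout as commonly described, and the file paths are placeholders, so adapt both to your local copy of the data.

import json
import os
import xml.etree.ElementTree as ET

records = []
for root_dir, _, files in os.walk("MedQuAD"):       # path to the cloned repository (placeholder)
    for name in files:
        if not name.endswith(".xml"):
            continue
        doc = ET.parse(os.path.join(root_dir, name)).getroot()
        # Each document contains one or more <QAPair> elements with <Question> and <Answer> children
        for pair in doc.iter("QAPair"):
            question = pair.findtext("Question", default="").strip()
            answer = pair.findtext("Answer", default="").strip()
            if question and answer:
                records.append({
                    "instruction": question,
                    "input": "",                     # MedQuAD has no separate context field
                    "output": answer,
                })

with open("medquad.json", "w") as f:                 # single JSON file used downstream (placeholder name)
    json.dump(records, f, indent=2)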

Model Declaration


We will utilize the llama-7b-hf model created by Meta. LLaMA 7B is trained on 1 trillion tokens with next-word prediction as its pre-training objective. LLaMA outperformed GPT-3 on several natural language processing tasks, such as sentiment analysis. This could be attributed to LLaMA's extensive training dataset, which gives it an advantage over GPT-3. LLaMA is released under a non-commercial license, so you need to be mindful of how you use this model. To obtain the model weights from Meta, you must submit a request form. However, the LLaMA weights were inadvertently leaked and incorporated into Hugging Face's decapoda-research/llama-7b-hf repository. As a result, we will employ the LLaMA model from decapoda-research rather than requesting the weights from Meta, which takes longer.

https://ai.facebook.com/blog/large-language-model-llama-meta-ai/
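
As a minimal sketch, the base model and tokenizer can be loaded with the Hugging Face transformers library. Loading in 8-bit through bitsandbytes is my assumption to make the 7B model fit on a single commodity GPU; it is not stated in the original write-up.

import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

BASE_MODEL = "decapoda-research/llama-7b-hf"

# LLaMA ships without a pad token, so one is set manually for batched training
tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token_id = 0

# 8-bit loading (via bitsandbytes) keeps the 7B weights within a single consumer GPU
model = LlamaForCausalLM.from_pretrained(
    BASE_MODEL,
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)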

Creating Prompt and Tokenization

Prior to tokenization, we must construct the prompt. Here is the structure of the prompt.

def generate_prompt(data_point):
    # Alpaca-style prompt: include an Input section only when context is provided
    if data_point["input"]:
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{data_point["instruction"]}

### Input:
{data_point["input"]}

### Response:
{data_point["output"]}"""
    else:
        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{data_point["instruction"]}

### Response:
{data_point["output"]}"""

We tokenize the prompts and create the training and validation datasets in this format.

Dataset({
    features: ['instruction', 'input', 'output', 'input_ids', 'attention_mask'],
    num_rows: 14762
})
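
Here is one way such a dataset could be produced, assuming the medquad.json file from the data preparation step and the tokenizer loaded earlier; the 512-token cutoff and the size of the validation split are illustrative choices, not values taken from the project.

from datasets import load_dataset

CUTOFF_LEN = 512   # maximum prompt length in tokens (illustrative)

def tokenize(data_point):
    # Build the full prompt and tokenize it into input_ids and attention_mask
    prompt = generate_prompt(data_point)
    return tokenizer(prompt, truncation=True, max_length=CUTOFF_LEN, padding="max_length")

data = load_dataset("json", data_files="medquad.json")
data = data["train"].train_test_split(test_size=2000, seed=42)   # hold out a validation set

train_data = data["train"].map(tokenize)
val_data = data["test"].map(tokenize)

Note that labels are not created here; in the training sketch later on, the data collator derives them from input_ids.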

Training With PEFT and LoRA


The general approach to using large language models involves two steps.

  1. Pre-training the LLM on a huge amount of data, as was done for GPT, BERT, and T5.
  2. Fine-tuning it for downstream tasks to improve performance.

However, as LLMs become larger, fine-tuning becomes computationally expensive, making it impossible to fine-tune many LLMs on commodity hardware. Additionally, storing task-specific fine-tuned models becomes a challenge, since a single fine-tuned model can run to hundreds of gigabytes. To address these issues, Parameter-Efficient Fine-Tuning (PEFT) was introduced.

PEFT freezes most of the parameters of the original pre-trained model and only trains a small number of extra parameters, which avoids catastrophic forgetting because most of the original model's parameters are kept fixed. Another important benefit of PEFT is that it reduces the size of the task-specific fine-tuned models, making them more portable. PEFT offers various methods such as Low-Rank Adaptation of Large Language Models (LoRA), Prefix Tuning, P-Tuning, Prompt Tuning, and AdaLoRA.

For the AI doctor project, LoRA was chosen. LoRA freezes all parameters of the pre-trained model and injects trainable rank-decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters needed for downstream tasks. For the AI doctor project, only 0.06% of LLaMA's original parameters needed to be trained, thanks to LoRA.
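
Below is a minimal sketch of wiring LoRA into the model with the peft library and launching training with the transformers Trainer. The rank, alpha, target modules, and training hyperparameters are typical values from similar LLaMA fine-tuning recipes, not figures confirmed by this project.

from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling

# Prepare the 8-bit base model for training (casts norms to fp32, enables input gradients)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=8,                                   # rank of the decomposition matrices
    lora_alpha=16,                         # scaling factor for the LoRA updates
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt (assumption)
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # roughly 4M of ~6.7B parameters, i.e. about 0.06%

trainer = Trainer(
    model=model,
    train_dataset=train_data,
    eval_dataset=val_data,
    args=TrainingArguments(
        output_dir="ai-doctor-lora",       # hypothetical output directory
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=3e-4,
        fp16=True,
        logging_steps=20,
    ),
    # The collator copies input_ids into labels for causal language modeling
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

With r=8 applied to the query and value projections of all 32 layers, the adapter adds on the order of four million trainable parameters, which is where the roughly 0.06% figure quoted above comes from.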

Let us look at an example of inference from the AI doctor:

print(ask_ai_doctor('What are symptoms of Cirrhosis?', model))

Output:

The following list of signs and symptoms may be associated with
cirrhosis. Some people with cirrhosis do not have any of these
symptoms. If you are concerned about how your general health is
affected by cirrhosis, talk to your doctor or nurse practitioner.

Signs and Symptoms of Cirrhosis: Abdominal swelling (ascites),
Bleeding problems, Blurred vision, Breath odor, Confusion,
Constipation, Difficulty concentrating, Dizziness, Fatigue,
Fluid retention, Gallstones, Gout, Hair loss, Headache, Itching,
Jaundice, Liver cancer, Memory loss, Muscle weakness, Nausea,
Neuropathy, Night sweats, Pain in the upper right abdomen,
Poor appetite, Skin itching, Sleepiness, Stomach pain,
Swollen legs and feet, Tiredness, Weight gain, Yellow skin and eyes
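
The ask_ai_doctor helper used above is not shown in the article. Here is a minimal sketch of what it might look like, assuming the fine-tuned model plus the generate_prompt function and tokenizer from the earlier sections; the generation settings are illustrative defaults.

import torch

def ask_ai_doctor(question, model, max_new_tokens=256):
    # Build an inference prompt with an empty response section
    prompt = generate_prompt({"instruction": question, "input": "", "output": ""})
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.2,
        )
    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    # Keep only the generated answer after the "### Response:" marker
    return text.split("### Response:")[-1].strip()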

Conclusion

As we conclude our deep dive into the art and science of building advanced models, it is clear that each step, from meticulous data preparation to model declaration, and from crafting precise prompts to tokenization, plays a pivotal role in shaping success. Leveraging techniques like PEFT and LoRA for training pushes the boundaries of what is possible, enabling large models to be adapted efficiently on modest hardware. This comprehensive approach not only optimizes performance but also turns raw medical data into actionable insights. Embracing these methodologies equips you with the tools to tackle complex challenges and to keep raising the bar for AI in medicine.
