User Guide¶
Overview of the Parse-EMR System¶
Parse-EMR is a clinical research tool that helps physicians build structured datasets from electronic medical record (EMR) notes using Large Language Models.
Pre-Requisites: What is a Prompt Set?¶
A prompt set in this system is a structured question tree that defines exactly what questions the LLM (Large Language Model) will be asked and exactly what information will be extracted from the patient charts. Think of it as a "data extraction schema" that tells the automated chart review system what questions it should ask the LLM about the given note.
Key Concept: Prompt sets (also called templates) are the foundation of the entire system - they determine what data gets extracted, how it is structured, and which questions are asked or skipped under which conditions.
Why is Branching Logic (show_if logic) Required?¶
If you ask a large language model a question whose answer is not present in the text, it will often fabricate an answer in the format you requested. For example, if you ask it for the kidney tumor grade/stage and the note is completely unrelated to kidney cancer, it will often make up a random kidney tumor grade/stage.
To prevent this from happening, we use a system of branching logic, where the questions that are asked next are determined by the answers to the previous questions.
For example, the prompting system would first ask, "Does this report describe specimens that would be directly relevant to kidney cancer?"
Only if the answer is yes, would the prompting system ask "Does this report describe specimens collected from one or more kidney procedures?"
Only if it answers yes to both of those questions, would the prompting system ask for more detail, like the grade, stage, and laterality.
How We Define Branching Logic¶
This is defined in a JSON file that looks something like this:
[
  {
    "name": "relevant_to_kidney_cancer",
    "type": "dropdown",
    "title": "Relevant to Kidney Cancer",
    "prompt": "Does this report describe specimens that would be directly relevant to kidney cancer? In other words, is this a specimen collected during either a kidney procedure (excision/biopsy) or a biopsy of a metastatic kidney cancer site?",
    "options": [
      "Yes",
      "No",
      "Cannot be Determined"
    ]
  },
  {
    "name": "from_kidney_procedure",
    "type": "dropdown",
    "title": "From Kidney Procedure",
    "prompt": "Does this report describe specimens collected from one or more kidney procedures, such as renal mass biopsy, partial/radical nephrectomy, or nephroureterectomy?",
    "options": [
      "Yes",
      "No",
      "Cannot be Determined"
    ],
    "show_if": "'[relevant_to_kidney_cancer]' == 'Yes'"
  },
  {
    "name": "procedure_laterality",
    "type": "dropdown",
    "title": "Procedure Laterality",
    "prompt": "What is the laterality of the procedure?",
    "options": [
      "Left",
      "Right",
      "Bilateral"
    ],
    "show_if": "'[from_kidney_procedure]' == 'Yes'"
  }
]
Where the Parse-EMR Prompt Design System Fits In¶
This JSON could be written by hand in a text editor, but that is tedious and error-prone, especially since it requires a working knowledge of programming.
The parse-emr prompt design system was created as a solution to this problem, allowing you to design these prompt sets without manually writing any code.
Step 1: Creating a Project¶
Projects are the starting point of the system. They define what data will be pulled and what prompt sets you need to create in order to conduct the chart review.
- Navigate to Projects: Go to "Projects" in the main navigation
- Create New Project
- Define Required Fields: The name is what the project will be called, and the start and end dates determine the date range of data you wish to pull. Keep this range narrow to reduce runtimes.
- Define Data You Want to Pull: Using the checkboxes, pick what data you want to be pulled and processed using your prompt sets.
- Save and More: Once you create a project, it will tell you what prompt sets you need to create/link to process your data.
- Export: Once you build your prompt sets, you can come back here and export the project configuration, which can then be run on a high performance computing cluster.
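The exact contents of the exported configuration are generated by the system and depend on your project. Purely as an illustration (every field name below is an assumption for this guide, not the actual export schema), an export might resemble:

{
  "project_name": "Kidney Cancer Chart Review",
  "start_date": "2020-01-01",
  "end_date": "2022-12-31",
  "data_to_pull": ["surgical_pathology_notes", "operative_notes"],
  "linked_prompt_sets": ["kidney_surgical_pathology"]
}

Whatever its exact shape, the export is what you hand to the cluster so the data pull and prompt sets can be run together.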
Step 2: Building a Prompt Set¶
Step 2a: Starting the Prompt Set Builder¶
- Navigate to Prompt Sets: Go to "Prompt Sets" in the main navigation
- Create New: Click "Create New Prompt Set"
- Name Your Template: Enter a descriptive name (e.g. "Prostate Cancer Surgical Pathology Note")
- Access the Builder: The system opens the visual question tree builder
Step 2b: Understanding the Interface¶
The builder has several key components:
Main Controls:
- Add New Question: Creates individual questions
- Add Repeat Clause: Creates repeating question sets
- Export JSON: Downloads the template as a JSON file
- Visualize Whole Tree: Visualizes the branching logic (show_if logic) of the prompt set in a tree format
- Save Prompt Set: Saves your work!
Step 2c: Creating Individual Questions¶
Question Types Available:
- Dropdown - Multiple choice options
- Integer - Whole numbers only
- Number - Decimal number
- Time - 24-hour format (HH:MM)
- Date - Calendar date selection
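As a concrete reference, here is a short sketch of how the non-dropdown types appear in the exported JSON, following the same question structure as the kidney cancer example above. The specific questions are hypothetical, and the lowercase type values for integer and time are inferred from the dropdown and date examples elsewhere in this guide:

[
  {
    "name": "num_lymph_nodes_examined",
    "type": "integer",
    "title": "Lymph Nodes Examined",
    "prompt": "How many lymph nodes were examined?"
  },
  {
    "name": "specimen_collection_time",
    "type": "time",
    "title": "Specimen Collection Time",
    "prompt": "At what time was the specimen collected?"
  },
  {
    "name": "procedure_date",
    "type": "date",
    "title": "Procedure Date",
    "prompt": "When was the procedure performed?"
  }
]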
Good Prompt Design Guide¶
Be Specific and Unambiguous
Poor Prompt: "Is this related to kidney cancer?"
Good Prompt: "Does this report describe specimens that would be directly relevant to kidney cancer? In other words, is this a specimen collected during either a kidney procedure (excision/biopsy) or a biopsy of a metastatic kidney cancer site?"
Why it matters: Medical charts often contain multiple measurements or conflicting information. Being specific helps the AI focus on the right data.
Include Clear Response Options
For dropdown questions, provide comprehensive but non-overlapping options:
Good Options:
Yes
No
Unknown
Not Documented
Not Applicable
Poor options:
Yes
No
Maybe
Why it matters: "Maybe" is ambiguous and doesn't help with data analysis. "Unknown" and "Not Documented" are distinct concepts in medical research.
Question Configuration Fields¶
Basic Fields:
- Name: Internal identifier, no spaces (e.g., tumor_size)
- Title: Display name (e.g., "Tumor Size")
- Prompt: The actual question text for the AI
- Type: Data type (dropdown, integer, etc.)
Type Specific Fields¶
For Dropdown Questions:
- Options: One option per line in text area
- Example:
Yes
No
Unknown
Not Applicable
For Number Questions:
- Step: Increment value (e.g. 0.1 for decimals, 1 for whole numbers)
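Whether and how the step value is carried into the exported JSON depends on the builder; as an illustration only (the step key below is an assumption, and the question itself is hypothetical), a decimal measurement question might look like:

{
  "name": "tumor_size_cm",
  "type": "number",
  "title": "Tumor Size (cm)",
  "prompt": "What is the greatest dimension of the tumor in centimeters?",
  "step": 0.1
}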
Conditional Branching Logic (show_if logic): The key to it all!¶
Questions will be shown/hidden based on previous answers.
Two Methods:
- Visual Condition Builder (recommended):
- This generates the branching logic for you, without having to write any code
- Add OR groups (the question is asked if at least one group is satisfied)
- Within each group, add AND conditions (all conditions in that group must be true)
- Select question, operator, and value
- Live preview shows the generated Python code
- Custom condition (Advanced):
- Direct Python syntax input
- Any Python expression using the referenced question variables will work
- Format: '[question_name]' == 'value'
- Supports complex logic: '[tumor_stage_rc]' in ['pT3', 'pT4']
Condition Examples:
# Shows question only if the LLM responded with "Yes" to the question with the name "relevant_to_kidney_cancer"
'[relevant_to_kidney_cancer]' == 'Yes'
# Multiple conditions (AND)
'[adrenal_gland_submitted]' == 'Yes' and '[any_tumor_present]' == 'Yes'
# Multiple groups (OR)
'[tumor_stage_rc]' == 'pT3' or '[tumor_stage_rc]' == 'pT4'
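OR groups and AND conditions from the visual builder combine in the same way. For example, a condition built from one two-condition AND group plus a second single-condition group (reusing only the question names already shown above) might generate something like:

# Combined groups: (AND within a group) OR (another group)
('[adrenal_gland_submitted]' == 'Yes' and '[any_tumor_present]' == 'Yes') or '[tumor_stage_rc]' == 'pT4'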
It is highly recommended to use the visual condition builder.
Advanced Features¶
Consolidation Rules¶
Multiple questions can be consolidated to a single variable with priority-based merging.
Configuration:
- Check "Enable Consolidation" to activate
- Consolidation Variable: The name of the target variable to merge into
- Consolidation Priority: Higher numbers override lower ones
Use Case: Multiple variables answer the same question, but some are more specific than others, and we want to take the most specific response.
Example:
First, ask what general type of kidney tumor is present (options: renal tumor, ureteral/renal pelvis tumor, adrenal tumor, other); this question consolidates to tumor_histologic_subtype with priority 1.
Then, only if it is a renal tumor, ask what subtype of renal tumor it is; this question consolidates to tumor_histologic_subtype with priority 2.
If the answer to the first question is "adrenal tumor", the second question is never asked, and tumor_histologic_subtype will be saved as "adrenal tumor".
If the answer is "renal tumor", the second question is asked, and because it has the higher priority, tumor_histologic_subtype will be saved as its answer.
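The builder generates the corresponding JSON for you. Purely as a sketch (the consolidation key names below are assumptions used for illustration, and the renal subtypes are hypothetical options), the two questions from this example might carry settings along these lines:

[
  {
    "name": "kidney_tumor_type",
    "type": "dropdown",
    "title": "Kidney Tumor Type",
    "prompt": "What general type of kidney tumor is present?",
    "options": ["Renal tumor", "Ureteral/renal pelvis tumor", "Adrenal tumor", "Other"],
    "consolidation_variable": "tumor_histologic_subtype",
    "consolidation_priority": 1
  },
  {
    "name": "renal_tumor_subtype",
    "type": "dropdown",
    "title": "Renal Tumor Subtype",
    "prompt": "What histologic subtype of renal tumor is present?",
    "options": ["Clear cell", "Papillary", "Chromophobe", "Other"],
    "show_if": "'[kidney_tumor_type]' == 'Renal tumor'",
    "consolidation_variable": "tumor_histologic_subtype",
    "consolidation_priority": 2
  }
]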
Repeat Clauses¶
Repeat clauses are the most complex but powerful feature - they allow asking the same set of questions multiple times. Think of them as "for loops" for questions. If a patient has 3 kidney procedures, you can ask the same set of questions 3 times, one for each kidney procedure.
Repeat Clause Configuration¶
Step 1: Basic Settings:
- Array Name: Name for the collection (e.g. kidney_procedures)
- Count Question Name: Question that asks "how many?" (e.g. num_kidney_procedures)
Step 2: When to Show:
- Show When Condition: Python condition to activate the repeat
- Maximum Repetitions: Safety limit (usually 10-20)
Step 3: Question Prefix (optional):
- Text that appears before each repeated question
- Use {repeat_ind} for the current number (1, 2, 3, etc.)
- Example: "Regarding procedure number {repeat_ind}:"
Step 4: ID Item (Required):
- The first question for each repetition - helps identify each item
Step 5: Additional Questions:
- Questions that get added for each repeated item
Repeat clause example:
{
  "repeat": {
    "name": "kidney_procedures",
    "num": "num_kidney_procedures",
    "bool": "'[had_kidney_procedure]' == 'Yes'",
    "prompt_prefix": "Regarding kidney procedure number {repeat_ind}:",
    "max_num": 10,
    "id_item": {
      "name": "procedure_type",
      "title": "Procedure Type",
      "prompt": "What type of kidney procedure was performed?",
      "type": "dropdown",
      "options": ["Nephrectomy", "Biopsy", "Ablation"]
    },
    "items": [
      {
        "name": "procedure_date",
        "title": "Procedure Date",
        "prompt": "When was this procedure performed?",
        "type": "date"
      },
      {
        "name": "complications",
        "title": "Complications",
        "prompt": "Were there any complications?",
        "type": "dropdown",
        "options": ["Yes", "No", "Unknown"]
      }
    ]
  }
}
Additional Features¶
Auto-Ordering System
The system automatically reorders questions based on their dependencies, so that any question referenced in a show_if condition is always asked before the questions that depend on it.
Step 3: Testing Your Prompt Set¶
After building your prompt set, you can test it on sample patient notes before sending it for larger-scale analysis. This helps you verify how your prompt set will perform on clinical data.
Overview of the Testing Process¶
Testing your prompt set typically involves three sequential steps:
- Upload a Test Chart - Add a de-identified/fake patient note that matches your prompt set's note type
- Run LLM Analysis - Let the AI process the chart using your prompt set
- Perform Manual Analysis - Conduct manual chart review and compare it with the AI's results
This process helps you identify issues with your prompt design and iteratively fix your prompts before they are used on a larger scale.
Step 3a: Upload a Test Chart¶
- Navigate to Notes: Go to "Notes" in the main navigation
- Add New Note: Click "Add New Note"
- Name the note
- Paste Text and save
Warning
You cannot upload any notes containing PHI. All notes must either be synthetically generated or thoroughly de-identified before being entered into the system.
Step 3b: Run LLM Analysis¶
- Navigate to LLM Analysis: Go to "LLM Analysis" in the main navigation
- Select Your Prompt Set: Choose your newly created prompt set from the dropdown
- Select Your Test Chart: Choose the chart you just uploaded
- Run Analysis: Click "Run Analysis" to start the process
Analysis typically takes 2 to 5 minutes depending on:
- Chart length and complexity
- Number of questions in your prompt
- AI model response time
- System load
Note
Running LLM analysis subtracts from your monthly allotted token usage. Token usage for each prompt set depends on multiple factors, and is calculated during each run. Each user is allotted 5 million tokens per month.
Once the analysis is completed, you can view the results.
Step 3c: Perform Manual Analysis¶
To take testing and debugging one step further, running manual analysis lets you create ground truth data for comparison with the AI results. It can also be used to validate your prompt set by testing it manually.
With this tool, you essentially become the AI: the system asks you the exact questions the LLM would be asked, so you can test your prompts yourself.
Once you have completed a manual analysis, you can link an LLM analysis to get statistics on agreement rates and compare their answers side by side.