ML: LR + GaussianNB Ensemble · Dataset: Synthetic QA Benchmark v2.1 (N=120, 15 classes) · PubMed-inspired
• Source: Synthetic Clinical QA Benchmark v2.1
• Inspiration: PubMed-style clinical QA pairs
• Size: N=120 cases, 15 disease classes
• Categories: Cardiac, Pulmonary, Neuro, Infectious, GI, Metabolic, Other
• Features: 12-dimensional symptom and vital-sign vector
• Validation: Stratified 70/30 train-test split
FOR RESEARCH / DECISION SUPPORT ONLY
9-section structured output · ML prediction · Differential diagnosis · Triage · Evaluation metrics
Live simulation · Priority triage queue · Haemodynamic monitoring
Multi-agent clinical reasoning · Real-time grading · Risk analysis
Human · ClinicalReasoningEngine (ML) · Rule-Based System
ML pipeline · Dataset · Feature engineering · Evaluation methodology
Name: Synthetic Clinical QA Benchmark v2.1
Inspiration: PubMed clinical QA pairs
Total cases: 120 (8 per disease class)
Disease classes: 15 across 7 categories
Generation: Prototype vectors + Gaussian noise (σ=0.15)
Split: 70/30 train-test, stratified by class
Not real patient data. Research use only.
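The generation and split steps listed above can be sketched with the standard library alone. This is a minimal illustration, not the benchmark's actual generator: the prototype values are not given in the text, so `make_dataset` draws hypothetical prototypes at random, and the names `make_dataset` and `stratified_split` are assumptions introduced here.

```python
import random

N_CLASSES = 15        # disease classes (from the text)
PER_CLASS = 8         # cases per class, 15 * 8 = 120
N_FEATURES = 12       # symptom + vital-sign vector length
SIGMA = 0.15          # Gaussian noise standard deviation
TEST_FRACTION = 0.30  # nominal 70/30 split


def make_dataset(seed=0):
    """Return (X, y): noisy copies of one prototype vector per class."""
    rng = random.Random(seed)
    # Hypothetical prototypes; the real prototype values are not published here.
    prototypes = [[rng.random() for _ in range(N_FEATURES)]
                  for _ in range(N_CLASSES)]
    X, y = [], []
    for label, proto in enumerate(prototypes):
        for _ in range(PER_CLASS):
            X.append([v + rng.gauss(0.0, SIGMA) for v in proto])
            y.append(label)
    return X, y


def stratified_split(y, test_fraction=TEST_FRACTION, seed=0):
    """Return (train_idx, test_idx), drawing the test share inside each class."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idx = by_class[label][:]
        rng.shuffle(idx)
        # Per-class rounding: 30% of 8 cases rounds to 2 test cases.
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


X, y = make_dataset()
train_idx, test_idx = stratified_split(y)
```

With 8 cases per class, rounding inside each class yields 2 test cases per class, so this sketch realises a 90/30 split rather than an exact 84/36. A library splitter may distribute the remainder across classes differently.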
Model 1: Logistic Regression (C=1.5, multinomial, lbfgs)
Model 2: Gaussian Naive Bayes (var_smoothing=1e-8)
Ensemble: Weighted average (LR 60% + NB 40%)
Scaling: StandardScaler (zero-mean, unit-variance)
Calibration: History-based Bayesian prior adjustment
Override: Vital sign triage escalation logic
Fallback: Cosine similarity (used when scikit-learn is unavailable)
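As a rough illustration of the ensemble and fallback lines above, the sketch below blends two per-class probability vectors with the stated 60/40 weights and shows one way a cosine-similarity fallback could work. It assumes the two models' probabilities are already available as plain lists. Matching against class prototypes in `fallback_predict` is an assumption, since the text does not say what the cosine similarity is computed against. All function names are hypothetical.

```python
import math

LR_WEIGHT = 0.6  # from the text: LR 60%
NB_WEIGHT = 0.4  # from the text: NB 40%


def ensemble_proba(lr_proba, nb_proba, w_lr=LR_WEIGHT, w_nb=NB_WEIGHT):
    """Weighted average of two per-class probability vectors, renormalised."""
    if len(lr_proba) != len(nb_proba):
        raise ValueError("probability vectors must have the same length")
    mixed = [w_lr * p + w_nb * q for p, q in zip(lr_proba, nb_proba)]
    total = sum(mixed)
    if total <= 0.0:
        raise ValueError("probabilities must sum to a positive value")
    return [m / total for m in mixed]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def fallback_predict(x, prototypes):
    """Assumed fallback: index of the prototype most cosine-similar to x."""
    scores = [cosine_similarity(x, p) for p in prototypes]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy three-class example (illustrative numbers, not benchmark output).
lr = [0.7, 0.2, 0.1]
nb = [0.2, 0.6, 0.2]
mixed = ensemble_proba(lr, nb)          # approximately [0.50, 0.36, 0.14]
best = fallback_predict([1.0, 0.0], [[0.0, 1.0], [1.0, 0.1]])  # index 1
```

Renormalising after the weighted sum keeps the output a valid distribution even if the weights are later changed so that they no longer sum to one.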
Split: 70/30 stratified train-test
Accuracy: Macro-averaged across 15 classes
Precision: Macro-averaged (zero_division=0)
Recall: Macro-averaged (zero_division=0)
F1: Harmonic mean of precision and recall
Emergency: Sensitivity and specificity for the binary emergency vs. non-emergency label
Undertriage: False-negative rate among emergency cases (1 − sensitivity)
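The metrics above can be computed by hand as in the following sketch. It assumes macro F1 is the mean of per-class F1 scores, which is one common convention; the text does not say which variant is used. It also mirrors `zero_division=0` by returning 0.0 whenever a denominator is empty. The function names and the toy labels are illustrative only.

```python
def macro_metrics(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall and F1."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0  # zero_division=0 behaviour
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def emergency_metrics(true_emergency, pred_emergency):
    """Sensitivity, specificity and undertriage rate for 0/1 emergency flags."""
    pairs = list(zip(true_emergency, pred_emergency))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        # Undertriage: emergencies that were not flagged as emergencies.
        "undertriage": fn / (tp + fn) if tp + fn else 0.0,
    }


# Toy labels (illustrative, not benchmark results).
m = macro_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], labels=[0, 1, 2])
e = emergency_metrics([1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1, 0])
```

In the toy emergency example, one of three true emergencies is missed, so sensitivity is 2/3 and the undertriage rate is 1/3. The two always sum to one when at least one emergency is present.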