Quality Assessment in EvidenceTableBuilder: What Each Profile Does (and What’s Live Today)

April 24, 2026 · 7 min read · By George Burchell

TL;DR

EvidenceTableBuilder now has a dedicated Quality Assessment flow alongside the main Builder: you pick a profile (aligned with familiar frameworks like RoB 2 or NOS), upload PDFs, and get structured, auditable drafts with rationales and evidence snippets tied back to the paper, rather than a black-box “the AI says this study is biased.”

Live today: CORE Extraction, RoB 2 (parallel, cluster, and crossover RCT variants), ROBINS-I (V2 follow-up study), Newcastle–Ottawa Scale (NOS), and ROBIS.

On the roadmap (visible in the app as “coming soon”): QUADAS-2, AMSTAR, a full set of JBI checklists, CASP checklists across study designs, and more. Each profile is only switched on when prompts, exports, and licensing are ready.

If you want the why behind tool choice in general, start with How to Choose the Right Quality Assessment Tool. If you want the why behind hesitation at the desk, see Rigour Isn’t the Problem.


Why a Separate “Quality Assessment” Page?

Data extraction and risk-of-bias work are related, but they are not the same job.

In the Builder, you define your own columns and extract answers. In Quality Assessment, the rows and domains are largely fixed by the profile you select (think signalling questions, domains, and study-type-specific prompts), while the product promise stays the same as everywhere else on the platform: traceability.
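To make the contrast concrete, here is a minimal sketch of how a fixed profile might sit next to user-defined Builder columns. The class and field names are illustrative assumptions, not EvidenceTableBuilder's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class SignallingQuestion:
    """One fixed item within a QA profile, e.g. a RoB 2 signalling question."""
    qid: str
    text: str

@dataclass
class QAProfile:
    """A profile fixes the rows: domains and their signalling questions."""
    name: str
    domains: dict[str, list[SignallingQuestion]] = field(default_factory=dict)

# In the Builder, columns are whatever the user defines:
builder_columns = ["Population", "Intervention", "Primary outcome"]

# In Quality Assessment, the selected profile supplies the structure:
rob2_parallel = QAProfile(
    name="RoB 2 - Parallel RCT",
    domains={
        "Randomisation process": [
            SignallingQuestion("1.1", "Was the allocation sequence random?"),
            SignallingQuestion("1.2", "Was the allocation sequence concealed?"),
        ],
    },
)
```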

That lines up with what we shipped for extraction: Audit Trails are the pattern here too. Judgements (or star-style items) should be reviewable alongside where in the PDF the model found support, so you can confirm, edit, and export something you would actually put in front of a supervisor or reviewer.
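As a sketch of what that pattern implies for the data, a drafted judgement could carry its supporting location and stay editable until a human signs it off. Again, these names are hypothetical, not the product's data model.

```python
from dataclasses import dataclass

@dataclass
class EvidenceSnippet:
    """Where in the PDF the model found support for a judgement."""
    quote: str    # verbatim text from the paper
    page: int     # 1-based page number
    section: str  # e.g. "Methods"

@dataclass
class DraftJudgement:
    """An AI-assisted draft that a reviewer confirms, edits, or rejects."""
    item_id: str              # e.g. "1.1" for a RoB 2 signalling question
    answer: str               # e.g. "Yes" / "Probably yes" / "No information"
    rationale: str
    evidence: EvidenceSnippet
    status: str = "draft"     # becomes "confirmed" or "edited" after review
```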


The integrated PDF viewer: same words, same page

Quality Assessment is not only a checklist on the left. The workspace opens the full-text PDF beside your run, so every domain or item can be checked against the exact wording the model used.

When you select a signalling question or item, you see the AI interpretation together with a verbatim quote pulled from the paper, plus metadata such as page and section. The PDF viewer jumps to that page, shows the quote in a prominent bar at the top of the viewer, and highlights the matching sentence in the document, so you are not hunting through dozens of pages to see whether the draft matches the source.

For appendices, supervision, or an audit pack, you can download a QA proof PDF: an export of the source document with the model’s evidence locations marked, so stakeholders can review where each piece of support came from without logging into the app.
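Under the hood, this kind of quote anchoring and proof export can be approximated with off-the-shelf PDF tooling. Here is a minimal sketch using PyMuPDF, assuming you already have the verbatim quote and a page hint from the model; it illustrates the pattern, not the app's actual implementation.

```python
import fitz  # PyMuPDF: pip install pymupdf

def mark_evidence(pdf_path: str, out_path: str, quote: str, page_hint: int) -> bool:
    """Highlight every occurrence of `quote` on the hinted page and save a
    marked copy of the PDF, roughly what a QA proof export produces."""
    doc = fitz.open(pdf_path)
    page = doc[page_hint - 1]           # page_hint is 1-based; fitz is 0-based
    rects = page.search_for(quote)      # exact-text search on that page
    for rect in rects:
        page.add_highlight_annot(rect)  # visible highlight for reviewers
    doc.save(out_path)
    doc.close()
    return bool(rects)                  # False if the quote was not found

# e.g. mark the supporting sentence for one signalling question:
# mark_evidence("trial.pdf", "trial_proof.pdf",
#               "randomised using a computer-generated sequence", page_hint=4)
```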


How to Think About the Outputs

Across the live profiles, the honest framing is:

  • AI-assisted draft, not an official substitute for human appraisal.
  • Result-level tools (RoB 2, ROBINS-I) need the right outcome / comparison / timepoint context: one paper can look different depending on which result you are assessing.
  • Study-level tools (e.g. NOS) still deserve a human check that the pathway (cohort vs case–control) matches the paper.
  • Review-level tools (ROBIS) apply to systematic reviews as the unit of analysis, not primary trials in isolation.

CORE sits slightly apart: it is structured extraction of core study facts (citation, aim, design, outcomes, key results, etc.) and does not produce a risk-of-bias judgement. It is there for when you want a fast, consistent “front sheet” on each PDF before or alongside formal QA.
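One way to see the result-level versus study-level distinction is in what an assessment has to be keyed on. A hypothetical sketch (the field names are assumptions, not the product's schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ResultLevelKey:
    """RoB 2 / ROBINS-I: the judgement attaches to a specific result."""
    study_id: str
    outcome: str     # e.g. "HbA1c"
    comparison: str  # e.g. "drug A vs placebo"
    timepoint: str   # e.g. "12 weeks"

@dataclass(frozen=True)
class StudyLevelKey:
    """NOS: one assessment per study, on the matching pathway."""
    study_id: str
    pathway: str     # "cohort" or "case-control"

# The same paper can yield several result-level assessments...
rob2_keys = [
    ResultLevelKey("smith2024", "HbA1c", "drug A vs placebo", "12 weeks"),
    ResultLevelKey("smith2024", "HbA1c", "drug A vs placebo", "52 weeks"),
]
# ...but only one study-level NOS assessment.
nos_key = StudyLevelKey("smith2024", "cohort")
```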


Profiles Available to Run Today

These seven profiles are enabled end-to-end in the Quality Assessment workspace:

Profile | Best for | Notes
CORE Extraction | Any study PDF | Paper-level facts for your table; not a RoB-style judgement.
RoB 2 – Parallel RCT | Individually randomised parallel-group trials | Result-level; domain drafts with evidence.
RoB 2 – Cluster RCT | Cluster-randomised trials | Adds cluster-specific concerns (e.g. recruitment timing around cluster allocation).
RoB 2 – Crossover Trial | Crossover / matched designs | Period, sequence, washout, and carryover concerns.
ROBINS-I V2 (follow-up study) | Non-randomised studies of interventions | Result-level; heavy emphasis on confounding and target-trial thinking; human review is essential.
Newcastle–Ottawa Scale (NOS) | Cohort and case–control studies | Study-level; star-style logic with auditable item support.
ROBIS | Systematic reviews | Review-level risk-of-bias phases and domains.

Credit estimates and setup steps differ by profile (e.g. per PDF vs per PDF/result); the app surfaces an estimate before you commit a run.
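As a rough illustration of why those estimates differ: per-PDF profiles scale with document count, while result-level profiles also scale with how many results you assess per paper. The rates below are made-up placeholders, not real pricing.

```python
def estimate_credits(n_pdfs: int, rate: int,
                     n_results_per_pdf: int = 1, result_level: bool = False) -> int:
    """Illustrative credit estimate. Result-level profiles (RoB 2, ROBINS-I)
    multiply by results per PDF; study-level profiles do not.
    The rates are placeholders, not EvidenceTableBuilder's actual pricing."""
    units = n_pdfs * (n_results_per_pdf if result_level else 1)
    return units * rate

# 20 PDFs with NOS at a hypothetical 3 credits per PDF:
print(estimate_credits(20, 3))  # 60
# 20 PDFs with RoB 2, two results each, at a hypothetical 5 credits per result:
print(estimate_credits(20, 5, n_results_per_pdf=2, result_level=True))  # 200
```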


Full Profile List in the Product (Including Coming Soon)

Below is the complete grid of assessment templates surfaced in the Quality Assessment tool today. Anything not in the “live” table above appears in the UI as coming soon until configuration, testing, and export mappings are complete; a tile is not a promise until it is actually enabled.

Diagnostic accuracy and prediction

  • QUADAS-2 – risk of bias and applicability for diagnostic accuracy studies.
  • (JBI) Diagnostic Test Accuracy Studies – JBI checklist for primary diagnostic studies.
  • CASP Diagnostic Study Checklist – CASP diagnostic appraisal.
  • CASP Clinical Prediction Rule Checklist – clinical prediction rules / models.

Systematic reviews and overview-type evidence

  • AMSTAR – methodological quality of systematic reviews (a broader checklist style than ROBIS alone).
  • (JBI) Systematic Reviews – JBI systematic review checklist.
  • CASP Systematic Review Checklist – CASP systematic review appraisal.
  • CASP SLR with Meta-Analysis of RCTs – reviews with meta-analysis of RCTs.
  • CASP SLR with Meta-Analysis of Observational Studies – reviews with meta-analysis of observational evidence.

Randomised trials (additional checklists)

  • (JBI) Randomized Controlled Trials – JBI RCT checklist.
  • CASP Randomised Controlled Trial (RCT) Checklist – CASP RCT appraisal.

Non-randomised and observational primary studies

(Beyond NOS and ROBINS-I, which are already live.)

  • (JBI) Cohort Studies – JBI cohort checklist.
  • (JBI) Case Control Studies – JBI case–control checklist.
  • (JBI) Analytical Cross Sectional Studies – analytical cross-sectional designs.
  • (JBI) Quasi-Experimental Studies – quasi-experimental designs.
  • (JBI) Prevalence Studies – prevalence methodology.
  • CASP Cohort Study Checklist – CASP cohort appraisal.
  • CASP Case Control Study Checklist – CASP case–control appraisal.
  • CASP Cross-Sectional Studies Checklist – CASP cross-sectional appraisal.

Qualitative, economic, and “grey” textual evidence

  • (JBI) Qualitative Research – JBI qualitative checklist.
  • CASP Qualitative Studies Checklist – CASP qualitative appraisal.
  • (JBI) Economic Evaluations – economic evaluation methodology.
  • CASP Economic Evaluation Checklist – CASP economic appraisal.
  • (JBI) Textual Evidence: Expert Opinion – expert opinion sources.
  • (JBI) Textual Evidence: Narrative – narrative summaries.
  • (JBI) Textual Evidence: Policy – policy documents as evidence.

Lower on the evidence hierarchy (still sometimes included)

  • (JBI) Case Reports – case report appraisal.
  • (JBI) Case Series – case series appraisal.

How This Fits Your Workflow

If you are already using EvidenceTableBuilder for extraction, think of Quality Assessment as the same infrastructure (upload, credits, progress, history, exports) applied to standardised appraisal templates, with the same insistence that software should support human judgement, not pretend the reviewer has been removed from the loop.

For extraction discipline that pairs well with QA, see Best Practices for Data Extraction in Systematic Reviews and How Best to Use EvidenceTableBuilder for Systematic Literature Reviews.


Tags: quality assessment, systematic reviews, risk of bias, RoB 2, ROBINS-I, critical appraisal

About the Author

George Burchell

George Burchell is a specialist in systematic literature reviews and scientific evidence synthesis with significant expertise in integrating advanced AI technologies and automation tools into the research process. With over four years of consulting and practical experience, he has developed and led multiple projects focused on accelerating and refining the workflow for systematic reviews within medical and scientific research.

Systematic Reviews · Evidence Synthesis · AI Research Tools · Research Automation