Visualization of clinical data

Laure Cougnaud, Michela Pasetto

October 09, 2022

This vignette focuses on the visualizations available in the clinDataReview package.

We will use example data sets from the clinUtils package.

If you have doubts about the data format, please first check the vignette on data preprocessing.

If everything is clear on that side, let’s get started!

Please note that the patient profiles and interactive visualizations are only displayed in the vignette if Pandoc is available.

library(clinDataReview)
library(plotly)
library(clinUtils)

data(dataADaMCDISCP01)
labelVars <- attr(dataADaMCDISCP01, "labelVars")

varsLB <- c(
    "PARAM", "PARAMCD", "USUBJID", "TRTP", 
    "ADY", "VISITNUM", "VISIT", "LBSTRESN"
)
dataLB <- dataADaMCDISCP01$ADLBC[, varsLB]

varsAE <- c("USUBJID", "AESOC", "AEDECOD", "ASTDY", "AENDY", "AESEV")
dataAE <- dataADaMCDISCP01$ADAE[, varsAE]

varsDM <- c("RFSTDTC", "USUBJID")
dataDM <- dataADaMCDISCP01$ADSL[, varsDM]

1 Patient profiles

The interactive visualizations of the clinDataReview package include functionality to link a plot to a patient-specific report, e.g. patient profiles created with the patientProfilesVis package.

Such patient profiles can be created via a config file, with a dedicated template report available in the clinDataReview package.

A simple patient profile report for each subject in the example dataset is created below.

Please note that the patient profiles are created and included in the interactive visualizations only during an interactive session (as detected via interactive()).

# create a directory to store the patient profiles:
patientProfilesDir <- "patientProfiles"
dir.create(patientProfilesDir)

# get examples of parameters for the report
configDir <- system.file("skeleton", "config", package = "clinDataReview")
params <- getParamsFromConfig(
    configDir = configDir, 
    configFile = "config-patientProfiles.yml"
)
# create patient profile with only one panel for the demo
params$patientProfilesParams <- params$patientProfilesParams[1]
# use dataset from the clinUtils package
params$pathDataFolder <- system.file("extdata", "cdiscpilot01", "SDTM", package = "clinUtils")
# store patient profile in this folder:
params$patientProfilePath <- patientProfilesDir

# create patient profiles
pathTemplate <- clinDataReview::getPathTemplate(params$template)
file.copy(from = pathTemplate, to = ".")
report <- rmarkdown::render(
    input = basename(pathTemplate), 
    envir = new.env()
)
unlink(basename(pathTemplate))
unlink(basename(report))

Please refer to the vignette about reporting for more details on how to set up a config file and use template reports available in the package.

You can skip directly to the reporting vignette by running the command below in your console.

vignette("clinDataReview-reporting", "clinDataReview")

2 Data visualization

All the visualizations available in the package are interactive.

2.1 Visualization of individual profiles

Visualization of individual profiles is available via the function scatterplotClinData.

2.1.1 Explore the visualization data

To facilitate the exploration of the data, the data underlying each visualization can also be displayed as a table below the plot, by setting the parameter table to TRUE.

Please note that this functionality is not demonstrated in this document to ensure a lightweight vignette in the package.
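For reference, a minimal sketch of what such a call could look like is given below. It assumes the example dataset from clinUtils and the ALT parameter code; the choice of aesthetics is illustrative and not necessarily the one used for the plots in this vignette.

```r
library(clinDataReview)
library(clinUtils)

# example data and variable labels from the clinUtils package
data(dataADaMCDISCP01)
labelVars <- attr(dataADaMCDISCP01, "labelVars")

# assumption: "ALT" is the parameter code of Alanine Aminotransferase
dataALT <- subset(dataADaMCDISCP01$ADLBC, PARAMCD == "ALT")

# table = TRUE attaches the plot data as a table below the plot
plotWithTable <- scatterplotClinData(
    data = dataALT,
    xVar = "ADY", yVar = "LBSTRESN",
    aesPointVar = list(color = "TRTP"),
    aesLineVar = list(group = "USUBJID", color = "TRTP"),
    labelVars = labelVars,
    table = TRUE
)
```

Printing plotWithTable in an interactive session or an HTML report should display the plot followed by its table.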

2.1.3 Spaghetti plot of time profile

[Interactive figure: "Actual value of Alanine Aminotransferase (U/L)". Profile plot by subject. x-axis: Analysis Relative Day (visit labels on the axis: SCREENING 1, WEEK 2, WEEK 4, WEEK 6, WEEK 8, WEEK 12, WEEK 16, WEEK 20, WEEK 24, WEEK 26); y-axis: Numeric Result/Finding in Standard Units; color: Planned Treatment (Placebo, Xanomeline High Dose, Xanomeline Low Dose). Caption: Points are positioned at relative day. Visits are positioned based on median relative day across subjects.]
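The caption of this plot states that visits are positioned at the median relative day across subjects. A minimal sketch of that computation in base R is shown below; the data frame holds toy values and is hypothetical, not taken from the example dataset.

```r
# toy lab records: relative day (ADY) of two visits for three subjects
labToy <- data.frame(
    USUBJID = rep(c("S1", "S2", "S3"), each = 2),
    VISIT = rep(c("SCREENING 1", "WEEK 2"), times = 3),
    ADY = c(-7, 14, -5, 16, -6, 15)
)

# median relative day per visit: one axis position per visit label
visitDay <- tapply(labToy$ADY, labToy$VISIT, median)
visitDay
```

The names of visitDay can then serve as axis labels and its values as axis breaks.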

2.1.4 Scatterplot

[Interactive figure: "Albumin (g/L) vs Alanine Aminotransferase (U/L)". x-axis: Alanine Aminotransferase (U/L); y-axis: Albumin (g/L); color: Unique Subject Identifier (01-701-1148, 01-701-1192, 01-701-1211, 01-704-1445, 01-710-1083, 01-718-1371, 01-718-1427). Visits shown: SCREENING 1, WEEK 2, WEEK 4, WEEK 6, WEEK 8, WEEK 12, WEEK 16, WEEK 20, WEEK 24, WEEK 26.]
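Plotting one laboratory parameter against another requires one column per parameter, whereas laboratory data are usually stored with one record per parameter. The sketch below shows one possible way to reshape such data with base R; the values and the parameter code "ALB" are hypothetical.

```r
# toy long-format lab data: one record per subject, visit and parameter
labLong <- data.frame(
    USUBJID = rep(c("S1", "S2"), each = 2),
    VISIT = "WEEK 2",
    PARAMCD = rep(c("ALT", "ALB"), times = 2),
    LBSTRESN = c(25, 40, 31, 37.5)
)

# one subset per parameter, keeping only the keys and the result
keepVars <- c("USUBJID", "VISIT", "LBSTRESN")
labALT <- labLong[labLong$PARAMCD == "ALT", keepVars]
labALB <- labLong[labLong$PARAMCD == "ALB", keepVars]

# wide format: one row per subject and visit, one column per parameter
labWide <- merge(
    labALT, labALB,
    by = c("USUBJID", "VISIT"),
    suffixes = c(".ALT", ".ALB")
)
labWide
```

The columns LBSTRESN.ALT and LBSTRESN.ALB can then be used as the x and y variables of a scatterplot.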

2.1.5 eDish plot

[Interactive figure: "Bilirubin (umol/L) vs Alanine Aminotransferase (U/L)". x-axis: Alanine Aminotransferase (U/L); y-axis: Bilirubin (umol/L); color: Visit Name (SCREENING 1, WEEK 2, WEEK 4, WEEK 6, WEEK 8, WEEK 12, WEEK 16, WEEK 20, WEEK 24, WEEK 26).]

2.1.6 Visualization of time-intervals

Time-intervals are displayed with the timeProfileIntervalPlot function:

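[Interactive figure: "Analysis Start Relative Day and Analysis Start Relative Day". One row per subject (01-701-1148, 01-701-1192, 01-701-1211, 01-704-1445, 01-710-1083, 01-718-1371, 01-718-1427); x-axis: relative day (from about −800 to 200); color: Severity/Intensity (MILD, MODERATE, SEVERE).]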
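A minimal sketch of such a call is given below, assuming the adverse event data of the example dataset, with one row per subject and intervals running from the start day (ASTDY) to the end day (AENDY). The parameter choices are illustrative and may differ from the ones used to create the figure above.

```r
library(clinDataReview)
library(clinUtils)

data(dataADaMCDISCP01)
labelVars <- attr(dataADaMCDISCP01, "labelVars")
dataAE <- dataADaMCDISCP01$ADAE

# one line per subject, one segment per adverse event
plotAE <- timeProfileIntervalPlot(
    data = dataAE,
    paramVar = "USUBJID",
    # start and end of each interval
    timeStartVar = "ASTDY",
    timeEndVar = "AENDY",
    # segments colored by severity
    colorVar = "AESEV",
    labelVars = labelVars
)
```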

By default, empty intervals are represented if the start/end time variables are missing. Missing start/end time can be imputed, or different symbols can be used to represent such cases:

[Interactive figure: "Start day and End day". One row per subject (01-701-1148, 01-701-1192, 01-701-1211, 01-704-1445, 01-710-1083, 01-718-1371, 01-718-1427); x-axis: relative day (from about −800 to 200).]
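As an illustration of the imputation mentioned above, the sketch below flags missing start/end days and imputes them with the earliest start and latest end observed. This is only one possible rule, applied to toy data; the column names startStatus, endStatus, ASTDYimp and AENDYimp are hypothetical.

```r
# toy adverse events: S2 has no start day, S3 has no end day
aeToy <- data.frame(
    USUBJID = c("S1", "S2", "S3"),
    ASTDY = c(3, NA, 10),
    AENDY = c(8, 12, NA)
)

# keep track of which times were missing before imputation
aeToy$startStatus <- ifelse(is.na(aeToy$ASTDY), "Missing start", "Complete")
aeToy$endStatus <- ifelse(is.na(aeToy$AENDY), "Missing end", "Complete")

# impute missing start with the earliest start, missing end with the latest end
aeToy$ASTDYimp <- ifelse(
    is.na(aeToy$ASTDY), min(aeToy$ASTDY, na.rm = TRUE), aeToy$ASTDY
)
aeToy$AENDYimp <- ifelse(
    is.na(aeToy$AENDY), max(aeToy$AENDY, na.rm = TRUE), aeToy$AENDY
)
aeToy
```

The status columns could then be mapped to distinct symbols in the plot, so that imputed times remain distinguishable from recorded ones.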

2.2 Visualization of summary statistics

Summary statistics can also be visualized with the package, via different types of visualizations: sunburst, treemap and barplot.

These functions take as input a table of summary statistics, typically counts. Such a table can be computed, for example, with the inTextSummaryTable R package (see the corresponding package vignette for more information).

2.2.2 Categorical variables

2.2.2.1 Compute count statistics

In this example, counts of adverse events are extracted for each Primary System Organ Class and Dictionary-Derived Term.

Besides the counts of the number of subjects, the paths to the patient profile report for each subgroup are extracted and combined.

# total counts: Safety Analysis Set (patients with start date for the first treatment)
dataTotal <- subset(dataDM, RFSTDTC != "")

## patient profiles report

if(interactive()){

    # add path in data
    
    dataAE$patientProfilePath <- paste0(
        "patientProfiles/subjectProfile-", 
        sub("/", "-", dataAE$USUBJID), ".pdf"
    )

    # add link in data (for attached table)
    dataAE$patientProfileLink <- with(dataAE,
        paste0(
            '<a href="', patientProfilePath, 
            '" target="_blank">', USUBJID, '</a>'
        )
    )

    # Specify extra summarizations besides the standard stats
    # When the data is summarized,
    # the patient profile path are summarized
    # as well across patients
    # (the paths should be collapsed with: ', ')
    statsExtraPP <- list(
        statPatientProfilePath = function(data) 
          toString(sort(unique(data$patientProfilePath))),
        statPatientProfileLink = function(data)
          toString(sort(unique(data$patientProfileLink)))
    )
    
}

# get counts (records, subjects, % subjects) + stats with subjects profiles path
statsPP <- c(
    inTextSummaryTable::getStats(type = "count"),
    if(interactive())
        list(
            patientProfilePath = quote(statPatientProfilePath),
            patientProfileLink = quote(statPatientProfileLink)
        )
)

dataAE$AESEV <- factor(
    dataAE$AESEV,
    levels = c("MILD", "MODERATE", "SEVERE")
)
dataAE$AESEVN <- as.numeric(dataAE$AESEV)

# compute adverse event table
tableAE <- inTextSummaryTable::computeSummaryStatisticsTable(
    
    data = dataAE,
    rowVar = c("AESOC", "AEDECOD"),
    dataTotal = dataTotal,
    labelVars = labelVars,
    
    # The total across the variable used for the nodes
    # should be specified
    rowVarTotalInclude = c("AESOC", "AEDECOD"),
    
    rowOrder = "total",
    
    # statistics of interest
    # include columns with patients
    stats = statsPP, 
    # add extra 'statistic': concatenate subject IDs
    statsExtra = if(interactive())  statsExtraPP

)
knitr::kable(head(tableAE),
    caption = paste("Extract of the Adverse Event summary table",
        "used for the sunburst and barplot visualization"
    )
)
Table: Extract of the Adverse Event summary table used for the sunburst and barplot visualization

| AESOC | AEDECOD | isTotal | statN | statm | statPercTotalN | statPercN | n | % | m |
|---|---|---|---|---|---|---|---|---|---|
| CARDIAC DISORDERS | MYOCARDIAL INFARCTION | FALSE | 1 | 1 | 7 | 14.28571 | 1 | 14.3 | 1 |
| GASTROINTESTINAL DISORDERS | DYSPEPSIA | FALSE | 1 | 1 | 7 | 14.28571 | 1 | 14.3 | 1 |
| GASTROINTESTINAL DISORDERS | NAUSEA | FALSE | 2 | 7 | 7 | 28.57143 | 2 | 28.6 | 7 |
| GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | APPLICATION SITE DERMATITIS | FALSE | 1 | 2 | 7 | 14.28571 | 1 | 14.3 | 2 |
| GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | APPLICATION SITE ERYTHEMA | FALSE | 3 | 3 | 7 | 42.85714 | 3 | 42.9 | 3 |
| GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | APPLICATION SITE IRRITATION | FALSE | 2 | 4 | 7 | 28.57143 | 2 | 28.6 | 4 |
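To make the count columns more concrete, the sketch below reproduces by hand, in base R and on toy data, the kind of statistics shown in the table: number of records (statm), number of subjects (statN) and percentage of subjects relative to the total (statPercN). The toy records are hypothetical; only the total of 7 subjects is taken from the table above.

```r
# toy adverse event records: subject 01 reports NAUSEA twice
aeRecords <- data.frame(
    USUBJID = c("01", "01", "02", "03"),
    AEDECOD = c("NAUSEA", "NAUSEA", "NAUSEA", "DYSPEPSIA")
)
# total number of subjects in the analysis set
nTotal <- 7

# number of records per term
statm <- table(aeRecords$AEDECOD)
# number of distinct subjects per term
statN <- tapply(
    aeRecords$USUBJID, aeRecords$AEDECOD,
    function(x) length(unique(x))
)
# percentage of subjects per term
statPercN <- statN / nTotal * 100

data.frame(
    AEDECOD = names(statN),
    statN = as.numeric(statN),
    statm = as.numeric(statm[names(statN)]),
    statPercN = as.numeric(statPercN)
)
```

A subject with several records of the same term contributes once to statN but several times to statm, which is why the two columns can differ.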

2.2.2.2 Sunburst

The sunburstClinData function visualizes the counts of hierarchical data in nested circles.

The different groups are visualized from the biggest class (root node) in the center of the visualization to the smallest sub-groups (leaves) on the outside of the circles.

The size of each segment is proportional to the respective count.

[Interactive figure: sunburst of the "Number of adverse events by Primary System Organ Class and Dictionary-Derived Term". Root node: Overall (58); inner ring: Primary System Organ Class; outer ring: Dictionary-Derived Term.]

2.2.2.3 Treemap

A treemap visualizes the counts of the hierarchical data in nested rectangles. The area of each rectangle is proportional to the counts of the respective group.

Note that a treemap can also be colored according to a meaningful variable. For instance, when displaying adverse events, the plot can be colored by severity. This can be achieved with the colorVar parameter.

[Interactive figure: treemap of the "Number of adverse events by Primary System Organ Class and Dictionary-Derived Term". Outer rectangle: Overall (58); nested rectangles: Primary System Organ Class, then Dictionary-Derived Term.]

2.2.2.4 Barplot

A barplot visualizes the counts for one single variable in a specific order.

[Interactive figure: "Number of patients with adverse events vs Dictionary-Derived Term". x-axis: Dictionary-Derived Term; y-axis: Number of patients with adverse events (0 to 4).]
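The sketch below illustrates one way to fix the order of the bars before plotting: the term is converted to a factor whose levels are sorted by decreasing count. The count table is a toy example, and the sketch assumes that barplotClinData displays the bars in the order of the factor levels.

```r
library(clinDataReview)

# toy count table: number of patients per term
countsToy <- data.frame(
    AEDECOD = c("NAUSEA", "APPLICATION SITE PRURITUS", "COUGH"),
    n = c(2, 4, 1)
)

# order the terms by decreasing count
countsToy$AEDECOD <- factor(
    countsToy$AEDECOD,
    levels = countsToy$AEDECOD[order(countsToy$n, decreasing = TRUE)]
)

plotCounts <- barplotClinData(
    data = countsToy,
    xVar = "AEDECOD",
    yVar = "n"
)
```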

2.2.3 Continuous variable

2.2.3.2 Plot error bars/confidence intervals

[Interactive figure: "Diastolic Blood Pressure summary profile by actual visit and analysis timepoint". x-axis: Analysis Visit (Baseline, Week 2, Week 4, Week 6, Week 8); y-axis: Mean and Standard Error; color: Analysis Timepoint (AFTER LYING DOWN FOR 5 MINUTES, AFTER STANDING FOR 1 MINUTE, AFTER STANDING FOR 3 MINUTES).]
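The plot above displays a mean and standard error per visit. As a reminder of how these statistics are obtained, the sketch below computes them in base R on toy values, with the standard error taken as the standard deviation divided by the square root of the number of observations. The data frame and the output column names are hypothetical.

```r
# toy measurements: three values per visit
vitalsToy <- data.frame(
    AVISIT = rep(c("Baseline", "Week 2"), each = 3),
    AVAL = c(70, 74, 78, 66, 70, 74)
)

# mean and standard error per visit
statMean <- tapply(vitalsToy$AVAL, vitalsToy$AVISIT, mean)
statSE <- tapply(
    vitalsToy$AVAL, vitalsToy$AVISIT,
    function(x) sd(x) / sqrt(length(x))
)

summaryToy <- data.frame(
    AVISIT = names(statMean),
    statMean = as.numeric(statMean),
    statSE = as.numeric(statSE)
)
summaryToy
```

The error bars then extend from statMean - statSE to statMean + statSE for each visit.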

2.2.3.3 Boxplot

A boxplot visualizes the distribution of a continuous variable of interest versus specific categorical variables.

This visualization doesn’t rely on pre-computed statistics, so the data containing the continuous variable of interest is passed directly to the function.

[Interactive figure: "Diastolic Blood Pressure distribution by actual visit and analysis timepoint". x-axis: Analysis Visit (Baseline, Week 2, Week 4, Week 6, Week 8); y-axis: Actual value of the Diastolic Blood Pressure parameter (mmHg); color: Actual Treatment (Placebo, Xanomeline High Dose, Xanomeline Low Dose); panels: AFTER LYING DOWN FOR 5 MINUTES, AFTER STANDING FOR 1 MINUTE, AFTER STANDING FOR 3 MINUTES.]

2.3 Multiple visualizations in a loop

To include multiple clinical data visualizations (with or without attached table) in a loop (in the same Rmarkdown chunk), the list of visualizations should be passed to the knitPrintListObjects function of the clinUtils package.
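A minimal sketch of how such a list of visualizations could be built is shown below, with one profile plot per laboratory parameter. The parameter names are the ones used in the sub-sections that follow; the plot settings are illustrative.

```r
library(clinDataReview)
library(clinUtils)

data(dataADaMCDISCP01)
labelVars <- attr(dataADaMCDISCP01, "labelVars")
dataLB <- dataADaMCDISCP01$ADLBC

# parameters of interest
labParams <- c("Potassium (mmol/L)", "Sodium (mmol/L)")

# named list with one interactive plot per parameter
listPlots <- sapply(labParams, function(labParam){
    dataParam <- dataLB[dataLB$PARAM == labParam, ]
    scatterplotClinData(
        data = dataParam,
        xVar = "ADY", yVar = "LBSTRESN",
        aesPointVar = list(color = "TRTP"),
        aesLineVar = list(group = "USUBJID", color = "TRTP"),
        labelVars = labelVars
    )
}, simplify = FALSE)
```

The resulting list would then be passed to knitPrintListObjects inside an R Markdown chunk. That call is left out here because it is meant to run while the document is being knitted.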

2.3.0.1 Potassium (mmol/L)

[Interactive figure: "Actual value of Potassium (mmol/L)". x-axis: Analysis Relative Day; y-axis: Numeric Result/Finding in Standard Units; color: Planned Treatment (Placebo, Xanomeline High Dose, Xanomeline Low Dose).]

2.3.0.2 Sodium (mmol/L)

[Interactive figure: "Actual value of Sodium (mmol/L)". x-axis: Analysis Relative Day; y-axis: Numeric Result/Finding in Standard Units; color: Planned Treatment (Placebo, Xanomeline High Dose, Xanomeline Low Dose).]

3 Palettes

3.1 Set palette for the entire session

Palettes for the colors and shapes associated with specific variables can be set for all clinical data visualizations at once, via the clinDataReview.colors and clinDataReview.shapes options, at the start of the R session.

Please see the clinUtils package for the default colors and shapes.

## function (n, alpha = 1, begin = 0, end = 1, direction = 1, option = "D")
##  int [1:24] 21 22 23 24 25 0 1 2 3 4 ...
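[Interactive figure: "Analysis Start Relative Day and Analysis End Relative Day", with the default palettes. One row per subject (01-701-1148, 01-701-1192, 01-701-1211, 01-704-1445, 01-710-1083, 01-718-1371, 01-718-1427); x-axis: relative day (from about −800 to 200); color: Severity/Intensity (MILD, MODERATE, SEVERE).]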

The palettes can be set for all visualizations, e.g. at the start of the R session, with:
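A minimal sketch of such a session-wide setting is given below. The option names are the ones mentioned above; the specific colors and shape codes are illustrative choices only.

```r
# illustrative values: any vector of colors / shape codes could be used
options(clinDataReview.colors = c("gold", "pink", "cyan"))
options(clinDataReview.shapes = c(15L, 17L, 19L))

# check the values now stored in the session options
getOption("clinDataReview.colors")
getOption("clinDataReview.shapes")
```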

If the palette contains fewer elements than there are groups in the data, its elements are recycled.

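[Interactive figure: "Analysis Start Relative Day and Analysis End Relative Day", with the custom palettes. One row per subject (01-701-1148, 01-701-1192, 01-701-1211, 01-704-1445, 01-710-1083, 01-718-1371, 01-718-1427); x-axis: relative day (from about −800 to 200); color: Severity/Intensity (MILD, MODERATE, SEVERE).]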

Palettes are reset to the default patient profiles palettes at the start of a new R session, or by setting:
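One generic base R pattern for this, not necessarily the one used in the original vignette, is to keep the previous option values when setting the custom palettes and to restore them afterwards. The custom values below are illustrative.

```r
# options() returns the previous values (NULL if an option was not set)
oldPalettes <- options(
    clinDataReview.colors = c("gold", "pink", "cyan"),
    clinDataReview.shapes = c(15L, 17L, 19L)
)

# ... create visualizations with the custom palettes ...

# restore the values that were in place before the change
options(oldPalettes)
```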

4 Appendix

4.1 Session info

R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=C LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] plyr_1.8.7 plotly_4.10.0 ggplot2_3.3.6 clinUtils_0.1.1 clinDataReview_1.3.1 knitr_1.40

loaded via a namespace (and not attached):
[1] ggrepel_0.9.1 Rcpp_1.0.9 inTextSummaryTable_3.2.0 tidyr_1.2.1 digest_0.6.29 utf8_1.2.2
[7] R6_2.5.1 evaluate_0.16 httr_1.4.4 highr_0.9 pillar_1.8.1 gdtools_0.2.4
[13] rlang_1.0.6 uuid_1.1-0 lazyeval_0.2.2 data.table_1.14.2 jquerylib_0.1.4 DT_0.25
[19] flextable_0.8.2 rmarkdown_2.16 labeling_0.4.2 stringr_1.4.1 htmlwidgets_1.5.4 munsell_0.5.0
[25] compiler_4.2.1 xfun_0.33 systemfonts_1.0.4 pkgconfig_2.0.3 base64enc_0.1-3 htmltools_0.5.3
[31] tidyselect_1.1.2 tibble_3.1.8 bookdown_0.29 jsonvalidate_1.3.2 fansi_1.0.3 viridisLite_0.4.1
[37] dplyr_1.0.10 withr_2.5.0 grid_4.2.1 jsonlite_1.8.2 gtable_0.3.1 lifecycle_1.0.2
[43] magrittr_2.0.3 scales_1.2.1 zip_2.2.1 cli_3.4.1 stringi_1.7.8 cachem_1.0.6
[49] reshape2_1.4.4 farver_2.1.1 xml2_1.3.3 bslib_0.4.0 ellipsis_0.3.2 generics_0.1.3
[55] vctrs_0.4.2 cowplot_1.1.1 tools_4.2.1 forcats_0.5.2 glue_1.6.2 officer_0.4.4
[61] purrr_0.3.4 hms_1.1.2 crosstalk_1.2.0 parallel_4.2.1 fastmap_1.1.0 yaml_2.3.5
[67] colorspace_2.0-3 haven_2.5.1 sass_0.4.2