To support the safety analysis, it is quite common to define specific grouping of events. One of the most common ways is to group events or medications by a specific medical concept such as a Standard MedDRA Queries (SMQs) or WHO-Drug Standardized Drug Groupings (SDGs).
To help with the derivation of these variables, the {admiral}
function derive_vars_query()
can be used. This function
takes as input the dataset (dataset
) where the grouping
must occur (e.g ADAE
) and a dataset containing the required
information to perform the derivation of the grouping variables
(dataset_queries
).
The dataset passed to the dataset_queries
argument of
the derive_vars_query()
function can be created by the
create_query_data()
function. For SMQs and SDGs
company-specific functions for accessing the SMQ and SDG database need
to be passed to the create_query_data()
function (see the
description of the get_smq_fun
and get_sdq_fun
parameter for details).
This vignette describes the expected structure and content of the
dataset passed to the dataset_queries
argument in the
derive_vars_query()
function.
Variable | Scope | Type | Example Value |
---|---|---|---|
VAR_PREFIX | The prefix used to define the grouping variables | Character | "SMQ01" |
QUERY_NAME | The value provided to the grouping variables name | Character | "Immune-Mediated Guillain-Barre Syndrome" |
TERM_LEVEL | The variable used to define the grouping. Used in conjunction with TERM_NAME | Character | "AEDECOD" |
TERM_NAME | A term used to define the grouping. Used in conjunction with TERM_LEVEL | Character | "GUILLAIN-BARRE SYNDROME" |
TERM_ID | A code used to define the grouping. Used in conjunction with TERM_LEVEL | Integer | 10018767 |
QUERY_ID | Id number of the query. This could be a SMQ identifier | Integer | 20000131 |
QUERY_SCOPE | Scope (Broad/Narrow) of the query | Character | BROAD , NARROW , NA |
QUERY_SCOPE_NUM | Scope (Broad/Narrow) of the query | Integer | 1 , 2 , NA |
VERSION | The version of the dictionary | Character | "20.1" |
Bold variables are required in
dataset_queries
: an error is issued if any of these
variables is missing. Other variables are optional.
The VERSION
variable is not used by
derive_vars_query()
but can be used to check if the
dictionary version of the queries dataset and the analysis dataset are
in line.
Each row must be unique within the dataset.
As described above, the variables VAR_PREFIX
,
QUERY_NAME
, TERM_LEVEL
, TERM_NAME
and TERM_ID
are required. The combination of these
variables will allow the creation of the grouping variable.
VAR_PREFIX
must be a character string starting with
2 or 3 letters, followed by a 2-digits number (e.g. “CQ01”).
QUERY_NAME
must be a non missing character string
and it must be unique within VAR_PREFIX
.
TERM_LEVEL
must be a non missing character
string.
Each value in TERM_LEVEL
represents a variable from
dataset
used to define the grouping variables
(e.g. AEDECOD
,AEBODSYS
,
AELLTCD
).
The function derive_vars_query()
will check that
each value given in TERM_LEVEL
has a corresponding variable
in the input dataset
and issue an error otherwise.
Different TERM_LEVEL
variables may be specified
within a VAR_PREFIX
.
TERM_NAME
must be a character string. This
must be populated if TERM_ID
is
missing.
TERM_ID
must be an integer. This
must be populated if TERM_NAME
is
missing.
VAR_PREFIX
will be used to create the grouping
variable appending the suffix “NAM”. This variable will now be referred
to as ABCzzNAM
: the name of the grouping variable.
E.g. VAR_PREFIX == "SMQ01"
will create the
SMQ01NAM
variable.
For each VAR_PREFIX
, a new ABCzzNAM
variable is created in dataset
.
QUERY_NAME
will be used
to populate the corresponding ABCzzNAM
variable.
TERM_LEVEL
will be used to identify the variables
from dataset
used to perform the grouping
(e.g. AEDECOD
,AEBODSYS
,
AELLTCD
).
TERM_NAME
(for character variables),
TERM_ID
(for numeric variables) will be used to identify
the records meeting the criteria in dataset
based on the
variable defined in TERM_LEVEL
.
Result:
For each record in dataset
, where the variable
defined by TERM_LEVEL
match a term from the
TERM_NAME
(for character variables) or TERM_ID
(for numeric variables) in the datasets_queries
,
ABCzzNAM
is populated with
QUERY_NAME
.
Note: The type (numeric or character) of the variable defined in
TERM_LEVEL
is checked in dataset
. If the
variable is a character variable (e.g. AEDECOD
), it is
expected that TERM_NAME
is populated, if it is a numeric
variable (e.g. AEBDSYCD
), it is expected that
TERM_ID
is populated, otherwise an error is
issued.
In this example, one standard MedDRA query
(VAR_PREFIX = "SMQ01"
) and one customized query
(VAR_PREFIX = "CQ02"
) are defined to analyze the adverse
events.
The standard MedDRA query variable SMQ01NAM
[VAR_PREFIX
] will be populated with “Standard Query 1”
[QUERY_NAME
] if any preferred term (AEDECOD
)
[TERM_LEVEL
] in dataset
is equal to “AE1” or
“AE2” [TERM_NAME
]
The customized query (CQ02NAM
)
[VAR_PREFIX
] will be populated with “Query 2”
[QUERY_NAME
] if any Low Level Term Code
(AELLTCD
) [TERM_LEVEL
] in dataset
is equal to 10 [TERM_ID
] or any preferred term
(AEDECOD
) [TERM_LEVEL
] in dataset
is equal to “AE4” [TERM_NAME
].
ds_query
)VAR_PREFIX | QUERY_NAME | TERM_LEVEL | TERM_NAME | TERM_ID | |
---|---|---|---|---|---|
SMQ01 | Standard Query 1 | AEDECOD | AE1 | ||
SMQ01 | Standard Query 1 | AEDECOD | AE2 | ||
CQ02 | Query 2 | AELLTCD | 10 | ||
CQ02 | Query 2 | AEDECOD | AE4 |
ae
)USUBJID | AEDECOD | AELLTCD |
---|---|---|
0001 | AE1 | 101 |
0001 | AE3 | 10 |
0001 | AE4 | 120 |
0001 | AE5 | 130 |
Generated by calling
derive_vars_query(dataset = ae, dataset_queries = ds_query)
.
USUBJID | AEDECOD | AELLTCD | SMQ01NAM | CQ02NAM |
---|---|---|---|---|
0001 | AE1 | 101 | Standard Query 1 | |
0001 | AE3 | 10 | Query 2 | |
0001 | AE4 | 120 | Query 2 | |
0001 | AE5 | 130 |
Subject 0001 has one event meeting the Standard Query 1 criteria
(AEDECOD = "AE1"
) and two events meeting the customized
query (AELLTCD = 10
and AEDECOD = "AE4"
).
When standardized MedDRA Queries are added to the dataset, it is
expected that the name of the query (ABCzzNAM
) is populated
along with its number code (ABCzzCD
), and its Broad or
Narrow scope (ABCzzSC
).
The following variables can be added to queries_datset
to derive this information.
QUERY_ID
must be an integer.
QUERY_SCOPE
must be a character string. Possible
values are: “BROAD”, “NARROW” or NA
.
QUERY_SCOPE_NUM
must be an integer. Possible values
are: 1
, 2
or NA
.
QUERY_ID
, QUERY_SCOPE
and
QUERY_SCOPE_NUM
will be used in the same way as
QUERY_NAME
(see here) and will
help in the creation of the ABCzzCD
, ABCzzSC
and ABCzzSCN
variables.These variables are optional and if not populated in
dataset_queries
, the corresponding output variable will not
be created:
VAR_PREFIX | QUERY_NAME | QUERY_ID | QUERY_SCOPE | QUERY_SCOPE_NUM | Variables created |
---|---|---|---|---|---|
SMQ01 | Query 1 | XXXXXXXX | NARROW | 2 | SMQ01NAM , SMQ01CD , SMQ01SC ,
SMQ01SCN |
SMQ02 | Query 2 | XXXXXXXX | BROAD | SMQ02NAM , SMQ02CD ,
SMQ02SC |
|
SMQ03 | Query 3 | XXXXXXXX | 1 | SMQ03NAM , SMQ03CD ,
SMQ03SCN |
|
SMQ04 | Query 4 | XXXXXXXX | SMQ04NAM , SMQ04CD |
||
SMQ05 | Query 5 | SMQ05NAM |