Numbers in engineering format

Introduction

Packages used in this vignette.

library("dplyr")
library("purrr")
library("knitr")
library("docxtools")

This vignette demonstrates the use of two functions from the docxtools package:

format_engr() for formatting numbers in engineering notation
align_pander() for aligning table columns using a simple pander table style

The primary goal of format_engr() is to present numeric variables in a data frame in engineering format, that is, scientific notation with exponents that are multiples of 3. Compare:

syntax	expression
computer	$1.011E+5$
mathematical	$1.011\times10^{5}$
engineering	$101.1\times10^{3}$

format_engr()

This example uses a small data set, density, included with docxtools, with temperature in K, pressure in Pa, the gas constant in J kg^-1K^-1, and density in kg m^-3.

density
#> function (x, ...) 
#> UseMethod("density")
#> <bytecode: 0x000001854413c648>
#> <environment: namespace:stats>

Four of the variables are numeric. The date variable is of type “double” but class “Date”, so it is not reformatted. Class and type are found using purrr::map_chr().

map_chr(density, class)
#> Error in `purrr:::stop_bad_type()`:
#> ! `.x` must be a vector, not a function
map_chr(density, typeof)
#> Error in `purrr:::stop_bad_type()`:
#> ! `.x` must be a vector, not a function

Usage is format_engr(x, sigdig = NULL, ambig_0_adj = FALSE). The function returns a data frame with all numeric values reformatted as character strings in engineering format with math delimiters $...$ .

density_engr <- format_engr(density)
density_engr
#> function (x, ...) 
#> UseMethod("density")
#> <bytecode: 0x000001854413c648>
#> <environment: namespace:stats>

The formerly numeric variables are now characters. Non-numeric variables are returned unaltered.

purrr::map_chr(density_engr, class)
#> Error in `purrr:::stop_bad_type()`:
#> ! `.x` must be a vector, not a function

The math formatting is applied when the data frame is printed in the output document. For example, we can use knitr::kable() to print the formatted data.

kable(density_engr)
#> Error in as.data.frame.default(x): cannot coerce class '"function"' to a data.frame

The function is compatible with the pipe operator.

density_engr <- density %>%
  format_engr()
kable(density_engr)
#> Error in as.data.frame.default(x): cannot coerce class '"function"' to a data.frame

Comments:

as illustrated by the temperature column, scientific notation is omitted when the exponent is 0, 1, or 2
trailing zeros after a decimal point are significant

Significant digits

format_engr() has three arguments:

x, a data frame with at least one numerical variable.
sigdig, an optional vector of significant digits. Default is 4.
ambig_0_adj, an optional logical to adjust the notation in the event of ambiguous trailing zeros. Default is FALSE.

The sigdig argument can be a single value, applied to all numeric columns.

density_engr <- format_engr(density, sigdig = 3)
kable(density_engr)
#> Error in as.data.frame.default(x): cannot coerce class '"function"' to a data.frame

Alternatively, significant digits can be assigned to every numeric column. A zero returns the variable in its original form.

density_engr <- format_engr(density, sigdig = c(5, 4, 0, 5))
kable(density_engr)
#> Error in as.data.frame.default(x): cannot coerce class '"function"' to a data.frame

Ambiguous trailing zeros

Subset the data to look at just the numerical variables.

x <- density %>%
  select(T_K, p_Pa, R, density)
#> Error in UseMethod("select"): no applicable method for 'select' applied to an object of class "function"

Print the data with incrementally decreasing significant digits.

kable(format_engr(x, sigdig = 4), caption = "4 digits")
#> Error in is.data.frame(x): object 'x' not found

Three digits creates no ambiguity.

kable(format_engr(x, sigdig = 3), caption = "3 digits")
#> Error in is.data.frame(x): object 'x' not found

With 2 digits, we have three columns with ambiguous trailing zeros.

kable(format_engr(x, sigdig = 2), caption = "2 digits")
#> Error in is.data.frame(x): object 'x' not found

By setting the ambig_0_adj argument to TRUE, scientific notation is used to remove the ambiguity.

kable(format_engr(x, sigdig = 2, ambig_0_adj = TRUE), caption = "Removing ambiguity")
#> Error in is.data.frame(x): object 'x' not found

The ambiguous trailing zero adjustment is applied only to those variables for which the condition exists. For example, consider these data without the adjustment,

kable(format_engr(x, sigdig = c(4, 2, 0, 3), ambig_0_adj = FALSE))
#> Error in is.data.frame(x): object 'x' not found

The same numbers with ambig_0_adj = TRUE,

kable(format_engr(x, sigdig = c(4, 2, 0, 3), ambig_0_adj = TRUE))
#> Error in is.data.frame(x): object 'x' not found

and only the pressure variable has a reformatted power of ten because it is the only variable that had ambiguous trailing zeros.