Format powers of ten

The notation we use to represent large and small numbers depends on the context of our communications. In a computer script, for example, we might encode Avogadro’s number as 6.0221*10^23. A computer printout of this number would typically use E-notation, as in 6.0221E+23.

In professional technical communications, however, computer syntax should be avoided: the asterisk (*) and caret (^) in 6.0221*10^23 communicate instructions to a computer, not mathematics to a reader. And while E-notation (6.0221E+23) has currency in some discourse communities, the general convention in technical communications is to format large and small numbers using powers-of-ten notation of the form

\[ a \times 10^n. \]

The \(\times\) multiplication symbol, often avoided in other contexts, is conventional syntax in powers-of-ten notation. Also, the notation has two forms in general use: scientific and engineering (Chase 2021, 63–67).
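As a brief worked example, Avogadro’s number can be written in each of the two forms. The coefficient ranges shown are the usual definitions of scientific and engineering notation, stated here as background rather than taken from this vignette.

```latex
\[
\begin{aligned}
\text{scientific:}  \quad & 6.0221 \times 10^{23}, && 1 \le |a| < 10, \\
\text{engineering:} \quad & 602.21 \times 10^{21}, && 1 \le |a| < 1000,\ n \text{ a multiple of } 3.
\end{aligned}
\]
```

Both lines denote the same value; only the split between coefficient and exponent differs.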

Exceptions.   When exponents are in the neighborhood of zero, for example, \(n \in \{-1, 0, 1, 2\}\), decimal notation may be preferred to powers-of-ten notation. Decimal values such as 0.1234, 1.234, 12.34, and 123.4 might be printed as-is. The range of exponents to include in this set is discretionary.

Markup

In R Markdown and Quarto Markdown, we use an inline equation markup delimited by $...$ to create a math expression in the output document. For example, the markup for Avogadro’s number is given by,

    $N_A = 6.0221\times{10}^{23}$

which yields \(N_A = 6.0221 \times 10^{23}\) in the document. To program the markup, however, we enclose the markup as a character string, that is,

    "$N_A = 6.0221\\times{10}^{23}$"

which requires the backslash \ to be “escaped”, hence the product symbol is written \\times. This is the form of the output produced by format_power().
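The escaping can be checked in base R without loading any packages. The short sketch below is illustrative only; the variable names s and same are placeholders, and the raw-string syntax in the last step assumes R 4.0.0 or later.

```r
# The R source contains two backslashes; the stored string holds only one
s <- "$N_A = 6.0221\\times{10}^{23}$"

# print() shows the escaped form, as in the console output in this vignette
print(s)

# cat() writes the characters actually stored in the string
cat(s, "\n")

# "\\times" is six characters: one backslash plus "times"
nchar("\\times")

# A raw string (R >= 4.0.0) needs no doubling and yields the same value
same <- identical(r"($N_A = 6.0221\times{10}^{23}$)", s)
same
```

The doubled backslash is therefore a feature of how R source code and print() display strings, not part of the markup that reaches the output document.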

format_power()

Given a number, a numerical vector, or a numerical column from a data frame, format_power() converts the numbers to character strings of the form,

    "$a\\times{10}^{n}$" 

where a is the coefficient and n is the exponent. The user can specify the number of significant digits and either scientific or engineering format. Unless otherwise specified, numbers with exponents in the neighborhood of zero are excluded from powers-of-ten notation and are converted to character strings of the form,

    "$a$" 

where a is the number in decimal notation to the specified number of significant digits.
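To make the conversion concrete, here is a minimal base-R sketch of how such strings could be assembled. It is not the formatdown implementation: sketch_format_power() is a hypothetical helper, its defaults are inferred from the outputs shown in this vignette, and it assumes nonzero, finite numeric input and that the omitted range is tested against the scientific exponent.

```r
# Hypothetical helper, NOT the formatdown source: a base-R sketch only.
# Assumes nonzero, finite numeric input.
sketch_format_power <- function(x, digits = 3, format = "engr",
                                omit_power = c(-1, 2)) {
  format_one <- function(v) {
    # Round first, so a value such as 999.96 is treated as 1000
    v <- signif(as.numeric(v), digits)
    # Exponent of the scientific form, where 1 <= |a| < 10
    n <- floor(log10(abs(v)))
    # Assumption: the omitted range is compared with the scientific exponent
    if (!is.null(omit_power) && n >= omit_power[1] && n <= omit_power[2]) {
      decimals <- max(digits - 1 - n, 0)
      return(paste0("$", formatC(v, digits = decimals, format = "f"), "$"))
    }
    # Engineering form uses the nearest multiple of 3 at or below n
    e <- if (format == "engr") 3 * floor(n / 3) else n
    # Decimal places that leave `digits` significant digits in the coefficient
    decimals <- max(digits - 1 - (n - e), 0)
    a <- formatC(v / 10^e, digits = decimals, format = "f")
    paste0("$", a, "\\times{10}^{", e, "}$")
  }
  vapply(x, format_one, character(1), USE.NAMES = FALSE)
}

sketch_format_power(6.0221e23)
sketch_format_power(3.4444e-4, format = "sci")
sketch_format_power(c(0.63333, 101100), digits = 4)
```

Rounding before the exponent is computed keeps the coefficient inside its intended range when rounding carries into a new power of ten.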

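Arguments.   The examples in this vignette use four arguments: x, the number or numeric vector to format; digits, the number of significant digits (3 when omitted); format, either “engr” (used when omitted) or “sci”; and omit_power, the range of exponents excluded from powers-of-ten notation, or NULL to exclude none.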

To follow along in your own script, load the packages used in this vignette:

library("formatdown")
library("data.table")

Equivalent usage.   The first two arguments do not have to be named if the argument order is maintained.

# Numerical value
avogadro <- 6.0221E+23

# Arguments named
(x <- format_power(x = avogadro, digits = 3))
#> [1] "$602\\times{10}^{21}$"

# Arguments unnamed
y <- format_power(avogadro, 3)

# Implicit use of default argument
z <- format_power(avogadro)

# Demonstrate equivalence
all.equal(x, y)
#> [1] TRUE
all.equal(x, z)
#> [1] TRUE

Scalar

Use with inline R code, written in the document text as `r format_power(avogadro)`.

format_power(avogadro)
#> [1] "$602\\times{10}^{21}$"

which, in an Rmd or qmd document, is rendered as \(602\times{10}^{21}\).

Vector

x <- c(
  1.2222e-6, 2.3333e-5, 3.4444e-4, 4.1111e-3, 5.2222e-2, 6.3333e-1,
  7.4444e+0, 8.1111e+1, 9.2222e+2, 1.3333e+3, 2.4444e+4, 3.1111e+5, 4.2222e+6
)
format_power(x)
#>  [1] "$1.22\\times{10}^{-6}$" "$23.3\\times{10}^{-6}$" "$344\\times{10}^{-6}$" 
#>  [4] "$4.11\\times{10}^{-3}$" "$52.2\\times{10}^{-3}$" "$0.633$"               
#>  [7] "$7.44$"                 "$81.1$"                 "$922$"                 
#> [10] "$1.33\\times{10}^{3}$"  "$24.4\\times{10}^{3}$"  "$311\\times{10}^{3}$"  
#> [13] "$4.22\\times{10}^{6}$"

is rendered as \(1.22\times{10}^{-6}\), \(23.3\times{10}^{-6}\), \(344\times{10}^{-6}\), etc.

Significant digits

The digits argument does not have to be named.

format_power(x[1], 3)
#> [1] "$1.22\\times{10}^{-6}$"
format_power(x[1], 4)
#> [1] "$1.222\\times{10}^{-6}$"

are rendered as \(1.22\times{10}^{-6}\) and \(1.222\times{10}^{-6}\).

Format

The format argument, if included, must be named.

format_power(x[3], format = "sci")
#> [1] "$3.44\\times{10}^{-4}$"
format_power(x[3], format = "engr")
#> [1] "$344\\times{10}^{-6}$"

are rendered as \(3.44\times{10}^{-4}\) and \(344\times{10}^{-6}\).

To compare the effects across many orders of magnitude, we format the example vector twice, placing the results side by side in a data frame for comparison, rendered using knitr::kable(),

# Compare two formats
DT <- data.table(
  scientific  = format_power(x, format = "sci"),
  engineering = format_power(x)
)
knitr::kable(DT, align = "rr")

| scientific | engineering |
|---:|---:|
| \(1.22\times{10}^{-6}\) | \(1.22\times{10}^{-6}\) |
| \(2.33\times{10}^{-5}\) | \(23.3\times{10}^{-6}\) |
| \(3.44\times{10}^{-4}\) | \(344\times{10}^{-6}\) |
| \(4.11\times{10}^{-3}\) | \(4.11\times{10}^{-3}\) |
| \(5.22\times{10}^{-2}\) | \(52.2\times{10}^{-3}\) |
| \(0.633\) | \(0.633\) |
| \(7.44\) | \(7.44\) |
| \(81.1\) | \(81.1\) |
| \(922\) | \(922\) |
| \(1.33\times{10}^{3}\) | \(1.33\times{10}^{3}\) |
| \(2.44\times{10}^{4}\) | \(24.4\times{10}^{3}\) |
| \(3.11\times{10}^{5}\) | \(311\times{10}^{3}\) |
| \(4.22\times{10}^{6}\) | \(4.22\times{10}^{6}\) |
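The relation between the two exponent columns can be sketched in a few lines of base R. This is an illustration of the arithmetic only, not code from formatdown; the names n, e, and shift are placeholders.

```r
# Scientific exponents spanning the range of the example vector
n <- -6:6

# Engineering exponent: the largest multiple of 3 that does not exceed n
e <- 3 * floor(n / 3)

# Factor by which the coefficient grows when moving from sci to engr
shift <- 10^(n - e)

data.frame(scientific = n, engineering = e, coefficient_factor = shift)
```

Each engineering exponent is shared by three consecutive scientific exponents, and the coefficient is multiplied by 1, 10, or 100 to compensate.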

Exceptions

The omit_power argument, if included, must be named.

format_power(x[6], omit_power = c(-1, 3))
#> [1] "$0.633$"
format_power(x[6], omit_power = c(0, 3))
#> [1] "$633\\times{10}^{-3}$"

are rendered as \(0.633\) and \(633\times{10}^{-3}\).

Setting omit_power = NULL removes the exceptions, and all numbers are rendered in the selected powers-of-ten notation.

# Omit no values from power-of-ten notation
DT <- data.table(
  scientific = format_power(x, format = "sci", omit_power = NULL),
  engineering = format_power(x, omit_power = NULL)
)
knitr::kable(DT, align = "rr")

| scientific | engineering |
|---:|---:|
| \(1.22\times{10}^{-6}\) | \(1.22\times{10}^{-6}\) |
| \(2.33\times{10}^{-5}\) | \(23.3\times{10}^{-6}\) |
| \(3.44\times{10}^{-4}\) | \(344\times{10}^{-6}\) |
| \(4.11\times{10}^{-3}\) | \(4.11\times{10}^{-3}\) |
| \(5.22\times{10}^{-2}\) | \(52.2\times{10}^{-3}\) |
| \(6.33\times{10}^{-1}\) | \(633\times{10}^{-3}\) |
| \(7.44\times{10}^{0}\) | \(7.44\times{10}^{0}\) |
| \(8.11\times{10}^{1}\) | \(81.1\times{10}^{0}\) |
| \(9.22\times{10}^{2}\) | \(922\times{10}^{0}\) |
| \(1.33\times{10}^{3}\) | \(1.33\times{10}^{3}\) |
| \(2.44\times{10}^{4}\) | \(24.4\times{10}^{3}\) |
| \(3.11\times{10}^{5}\) | \(311\times{10}^{3}\) |
| \(4.22\times{10}^{6}\) | \(4.22\times{10}^{6}\) |

Data frame

The examples in this section use density, a data frame included with formatdown that contains columns of class Date, character, factor, numeric, and integer.

density
#>          date  trial humidity    T_K   p_Pa     R  density
#>        <Date> <char>   <fctr>  <num>  <num> <int>    <num>
#> 1: 2018-06-12      a      low 294.05 101100   287 1.197976
#> 2: 2018-06-13      b     high 294.15 101000   287 1.196384
#> 3: 2018-06-14      c   medium 294.65 101100   287 1.195536
#> 4: 2018-06-15      d      low 293.35 101000   287 1.199647
#> 5: 2018-06-16      e     high 293.85 101100   287 1.198791

which is rendered unformatted as

| date | trial | humidity | T_K | p_Pa | R | density |
|---|---|---|---|---|---|---|
| 2018-06-12 | a | low | 294.05 | 101100 | 287 | 1.197976 |
| 2018-06-13 | b | high | 294.15 | 101000 | 287 | 1.196384 |
| 2018-06-14 | c | medium | 294.65 | 101100 | 287 | 1.195536 |
| 2018-06-15 | d | low | 293.35 | 101000 | 287 | 1.199647 |
| 2018-06-16 | e | high | 293.85 | 101100 | 287 | 1.198791 |

Treating a column as a vector,

# Copy to avoid "by reference" changes to density
DT <- copy(density)

# Format as a vector
format_power(DT$p_Pa, digits = 4)
#> [1] "$101.1\\times{10}^{3}$" "$101.0\\times{10}^{3}$" "$101.1\\times{10}^{3}$"
#> [4] "$101.0\\times{10}^{3}$" "$101.1\\times{10}^{3}$"

is rendered as \(101.1\times{10}^{3}\), \(101.0\times{10}^{3}\), \(101.1\times{10}^{3}\), \(101.0\times{10}^{3}\), \(101.1\times{10}^{3}\).
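Note that 101.0 keeps its trailing zero so that four significant digits are displayed. The base-R sketch below illustrates why rounding alone is not enough to produce such a string; it is an illustration under stated assumptions, not the method formatdown uses, and the names p, a, lost, and kept are placeholders.

```r
p <- 101000                # Pa, a value from the p_Pa column
digits <- 4

# Coefficient for the engineering exponent 3
a <- signif(p, digits) / 10^3

# Converting the rounded number directly drops the trailing zero
lost <- as.character(a)

# Fixed notation with (digits - integer digits) decimal places keeps it
decimals <- digits - 1 - floor(log10(abs(a)))
kept <- formatC(a, digits = decimals, format = "f")

c(lost = lost, kept = kept)
```

Here the coefficient has three integer digits, so one decimal place is needed to show four significant digits.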

Treating a column within a data frame,

# Copy to avoid "by reference" changes to density
DT <- copy(density)

# Format one column, retain all columns
DT$p_Pa <- format_power(DT$p_Pa, digits = 4)
DT[]
#>          date  trial humidity    T_K                   p_Pa     R  density
#>        <Date> <char>   <fctr>  <num>                 <char> <int>    <num>
#> 1: 2018-06-12      a      low 294.05 $101.1\\times{10}^{3}$   287 1.197976
#> 2: 2018-06-13      b     high 294.15 $101.0\\times{10}^{3}$   287 1.196384
#> 3: 2018-06-14      c   medium 294.65 $101.1\\times{10}^{3}$   287 1.195536
#> 4: 2018-06-15      d      low 293.35 $101.0\\times{10}^{3}$   287 1.199647
#> 5: 2018-06-16      e     high 293.85 $101.1\\times{10}^{3}$   287 1.198791

is rendered as

| date | trial | humidity | T_K | p_Pa | R | density |
|---|---|---|---|---|---|---|
| 2018-06-12 | a | low | 294.05 | \(101.1\times{10}^{3}\) | 287 | 1.197976 |
| 2018-06-13 | b | high | 294.15 | \(101.0\times{10}^{3}\) | 287 | 1.196384 |
| 2018-06-14 | c | medium | 294.65 | \(101.1\times{10}^{3}\) | 287 | 1.195536 |
| 2018-06-15 | d | low | 293.35 | \(101.0\times{10}^{3}\) | 287 | 1.199647 |
| 2018-06-16 | e | high | 293.85 | \(101.1\times{10}^{3}\) | 287 | 1.198791 |

Using lapply() to select and treat multiple columns from a data frame,

# Copy to avoid "by reference" changes to density
DT <- copy(density)

# Identify columns to format
cols_we_want <- c("T_K", "p_Pa", "density")

# Select and format.
DT <- DT[, lapply(.SD, function(x) format_power(x, 4)), .SDcols = cols_we_want]
DT[]
#>        T_K                   p_Pa density
#>     <char>                 <char>  <char>
#> 1: $294.1$ $101.1\\times{10}^{3}$ $1.198$
#> 2: $294.1$ $101.0\\times{10}^{3}$ $1.196$
#> 3: $294.6$ $101.1\\times{10}^{3}$ $1.196$
#> 4: $293.4$ $101.0\\times{10}^{3}$ $1.200$
#> 5: $293.9$ $101.1\\times{10}^{3}$ $1.199$

is rendered as

| T_K | p_Pa | density |
|---|---|---|
| \(294.1\) | \(101.1\times{10}^{3}\) | \(1.198\) |
| \(294.1\) | \(101.0\times{10}^{3}\) | \(1.196\) |
| \(294.6\) | \(101.1\times{10}^{3}\) | \(1.196\) |
| \(293.4\) | \(101.0\times{10}^{3}\) | \(1.200\) |
| \(293.9\) | \(101.1\times{10}^{3}\) | \(1.199\) |

Repeat, but retain all columns,

# Copy to avoid "by reference" changes to density
DT <- copy(density)

# Identify columns to format
cols_we_want <- c("T_K", "p_Pa", "density")

# Format selected columns, retain all columns
DT <- DT[, (cols_we_want) := lapply(.SD, function(x) format_power(x, 4)), .SDcols = cols_we_want]
DT[]
#>          date  trial humidity     T_K                   p_Pa     R density
#>        <Date> <char>   <fctr>  <char>                 <char> <int>  <char>
#> 1: 2018-06-12      a      low $294.1$ $101.1\\times{10}^{3}$   287 $1.198$
#> 2: 2018-06-13      b     high $294.1$ $101.0\\times{10}^{3}$   287 $1.196$
#> 3: 2018-06-14      c   medium $294.6$ $101.1\\times{10}^{3}$   287 $1.196$
#> 4: 2018-06-15      d      low $293.4$ $101.0\\times{10}^{3}$   287 $1.200$
#> 5: 2018-06-16      e     high $293.9$ $101.1\\times{10}^{3}$   287 $1.199$

is rendered as

| date | trial | humidity | T_K | p_Pa | R | density |
|---|---|---|---|---|---|---|
| 2018-06-12 | a | low | \(294.1\) | \(101.1\times{10}^{3}\) | 287 | \(1.198\) |
| 2018-06-13 | b | high | \(294.1\) | \(101.0\times{10}^{3}\) | 287 | \(1.196\) |
| 2018-06-14 | c | medium | \(294.6\) | \(101.1\times{10}^{3}\) | 287 | \(1.196\) |
| 2018-06-15 | d | low | \(293.4\) | \(101.0\times{10}^{3}\) | 287 | \(1.200\) |
| 2018-06-16 | e | high | \(293.9\) | \(101.1\times{10}^{3}\) | 287 | \(1.199\) |

If different columns should be reported with different significant digits, treat the columns separately.

# Copy to avoid "by reference" changes to density
DT <- copy(density)

# Format selected columns, retain all columns with signif digits = 5
cols_we_want <- c("T_K", "density")
DT <- DT[, (cols_we_want) := lapply(.SD, function(x) format_power(x, 5)), .SDcols = cols_we_want]

# Individually format one column with signif digits = 4
DT$p_Pa <- format_power(DT$p_Pa, 4)
DT[]
#>          date  trial humidity      T_K                   p_Pa     R  density
#>        <Date> <char>   <fctr>   <char>                 <char> <int>   <char>
#> 1: 2018-06-12      a      low $294.05$ $101.1\\times{10}^{3}$   287 $1.1980$
#> 2: 2018-06-13      b     high $294.15$ $101.0\\times{10}^{3}$   287 $1.1964$
#> 3: 2018-06-14      c   medium $294.65$ $101.1\\times{10}^{3}$   287 $1.1955$
#> 4: 2018-06-15      d      low $293.35$ $101.0\\times{10}^{3}$   287 $1.1996$
#> 5: 2018-06-16      e     high $293.85$ $101.1\\times{10}^{3}$   287 $1.1988$

is rendered as

| date | trial | humidity | T_K | p_Pa | R | density |
|---|---|---|---|---|---|---|
| 2018-06-12 | a | low | \(294.05\) | \(101.1\times{10}^{3}\) | 287 | \(1.1980\) |
| 2018-06-13 | b | high | \(294.15\) | \(101.0\times{10}^{3}\) | 287 | \(1.1964\) |
| 2018-06-14 | c | medium | \(294.65\) | \(101.1\times{10}^{3}\) | 287 | \(1.1955\) |
| 2018-06-15 | d | low | \(293.35\) | \(101.0\times{10}^{3}\) | 287 | \(1.1996\) |
| 2018-06-16 | e | high | \(293.85\) | \(101.1\times{10}^{3}\) | 287 | \(1.1988\) |
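When several columns each need their own number of significant digits, one alternative to treating them one at a time is to pair each column with its digits in a named vector. The sketch below uses base R only: df is a two-row stand-in for density, digits_by_col is a hypothetical lookup, and fmt is a placeholder formatter that could be replaced by format_power once formatdown is loaded.

```r
# Stand-in data: the first two rows of two density columns
df <- data.frame(T_K = c(294.05, 294.15), p_Pa = c(101100, 101000))

# Hypothetical lookup: significant digits wanted for each column
digits_by_col <- c(T_K = 5, p_Pa = 4)

# Placeholder formatter (assumption); swap in format_power if available
fmt <- function(x, digits) paste0("$", signif(x, digits), "$")

# Apply the formatter column by column, each with its own digits
cols <- names(digits_by_col)
df[cols] <- Map(fmt, df[cols], digits_by_col)

str(df)
```

Map() walks the selected columns and the digits vector in parallel, so adding a column only requires adding an entry to the lookup.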

Non-numeric input

Applying the function to non-numeric objects produces errors, for example, on an object of class “Date”.

x <- density$date
format_power(x)
#> Error in format_power(x): Assertion on 'class(x)' failed: Must be disjunct from {'Date','POSIXct','POSIXt'}, but has elements {'Date'}.

Error on “character” class,

x <- density$trial
format_power(x)
#> Error in format_power(x): Assertion on 'x' failed. Must be of class 'numeric', not 'character'.

Error on “factor” class,

x <- density$humidity
format_power(x)
#> Error in format_power(x): Assertion on 'x' failed. Must be of class 'numeric', not 'factor'.
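One way to avoid these errors when formatting many columns is to select the eligible columns first. The base-R sketch below is a conservative illustration: df is a two-row stand-in for density, and is_plain_double() is a hypothetical helper. Because this vignette does not show whether format_power() accepts integer columns, the sketch keeps only columns stored as ordinary doubles.

```r
# Stand-in for density: first two rows, one column of each class used above
df <- data.frame(
  date     = as.Date(c("2018-06-12", "2018-06-13")),
  trial    = c("a", "b"),
  humidity = factor(c("low", "high")),
  T_K      = c(294.05, 294.15),
  R        = c(287L, 287L),
  density  = c(1.197976, 1.196384),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE only for ordinary double columns.
# is.numeric() is FALSE for Date and factor; is.double() excludes integer.
is_plain_double <- function(col) is.numeric(col) && is.double(col)

cols_to_format <- names(df)[vapply(df, is_plain_double, logical(1))]
cols_to_format
```

The resulting vector can be used in place of a hand-written cols_we_want.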

References

Chase, Morgan. 2021. Technical Mathematics. https://openoregon.pressbooks.pub/techmath/chapter/module-11-scientific-notation/.