The notation we use to represent large and small numbers depends on the context of our communications. In a computer script, for example, we might encode Avogadro’s number as 6.0221*10^23. A computer printout of this number would typically use E-notation, as in 6.0221E+23.
In professional technical communications, however, computer syntax should be avoided: the asterisk (*) and caret (^) in 6.0221*10^23 communicate instructions to a computer, not mathematics to a reader. And while E-notation (6.0221E+23) has currency in some discourse communities, the general convention in technical communications is to format large and small numbers using powers-of-ten notation of the form
\[ a \times 10^n. \]
The \(\times\) multiplication symbol, often avoided in other contexts, is conventional syntax in powers-of-ten notation. Also, the notation has two forms in general use: scientific and engineering (Chase 2021, 63–67).
Scientific. \(n\) is an integer and \(a \in \mathbb{R}\) with \(1 \leq |a| < 10\). For example, \(6.022\times{10}^{23}\).
Engineering. \(n\) is a multiple of 3 and \(a \in \mathbb{R}\) with \(1 \leq |a| < 1000\). For example, \(602.2\times{10}^{21}\).
Exceptions. When exponents are in the neighborhood of zero, for example, \(n \in \{-1, 0, 1, 2\}\), decimal notation may be preferred to powers-of-ten notation. Decimal values such as 0.1234, 1.234, 12.34, and 123.4 might be printed as-is. The range of exponents to include in this set is discretionary.
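The two definitions above can be sketched in base R. The helper name power_parts() is hypothetical and is not part of formatdown; it only illustrates how the exponent and coefficient follow from the definitions, assuming a nonzero input.

```r
# Hypothetical helper (not part of formatdown): split a nonzero number
# into a coefficient a and an exponent n such that x = a * 10^n.
power_parts <- function(x, format = c("engr", "sci")) {
  format <- match.arg(format)
  # Scientific exponent: the integer n for which 1 <= |a| < 10
  n <- floor(log10(abs(x)))
  # Engineering exponent: round n down to the nearest multiple of 3
  if (format == "engr") n <- 3 * floor(n / 3)
  data.frame(a = x / 10^n, n = n)
}

power_parts(6.0221e23, "sci")   # a is about 6.0221, n is 23
power_parts(6.0221e23, "engr")  # a is about 602.21, n is 21
```

Rounding the exponent down, rather than toward zero, keeps the engineering coefficient in the range 1 to 1000 for negative exponents as well.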
In R Markdown and Quarto Markdown, we use an inline equation markup delimited by $...$ to create a math expression in the output document. For example, the markup for Avogadro’s number is given by
$N_A = 6.0221\times{10}^{23}$
which yields \(N_A = 6.0221 \times 10^{23}\) in the document. To program the markup, however, we enclose it in a character string, that is,
"$N_A = 6.0221\\times{10}^{23}$"
which requires the backslash \ to be “escaped”; hence the product symbol is written \\times. This is the form of the output produced by format_power().
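A short base R check may help show what escaping means in practice: the doubled backslash exists only in the source code and in print() output, while the stored string contains a single backslash. The variable name markup is chosen here for illustration.

```r
# The string as typed in an R script (two backslashes in the source)
markup <- "$N_A = 6.0221\\times{10}^{23}$"

# print() shows the escaped form, as typed
print(markup)
#> [1] "$N_A = 6.0221\\times{10}^{23}$"

# writeLines() shows the characters actually stored
writeLines(markup)
#> $N_A = 6.0221\times{10}^{23}$

# Only one backslash is stored: "\\times" has 6 characters, not 7
nchar("\\times")
#> [1] 6
```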
format_power()
Given a number, a numerical vector, or a numerical column from a data frame, format_power() converts the numbers to character strings of the form
"$a\\times{10}^{n}$"
where a is the coefficient and n is the exponent. The user can specify the number of significant digits and scientific or engineering format. Unless otherwise specified, numbers with exponents in the neighborhood of zero are excluded from powers-of-ten notation and are converted to character strings of the form
"$a$"
where a is the number in decimal notation to the specified number of significant digits.
Arguments.
x Numerical vector to be formatted. Can be a scalar, a vector, or a column from a data frame.
digits Positive nonzero integer to specify the number of significant digits. Default is 3.
format Possible values are “engr” (engineering notation) and “sci” (scientific notation). Default is “engr”.
omit_power Numeric vector of length two (or NULL). Determines the range of exponents excluded from powers-of-ten notation. Default is c(-1, 2).
If you are writing your own script to follow along, we use these packages in this vignette:
library("formatdown")
library("data.table")
Equivalent usage. The first two arguments do not have to be named if the argument order is maintained.
# Numerical value
avogadro <- 6.0221E+23

# Arguments named
(x <- format_power(x = avogadro, digits = 3))
#> [1] "$602\\times{10}^{21}$"

# Arguments unnamed
y <- format_power(avogadro, 3)

# Implicit use of default argument
z <- format_power(avogadro)

# Demonstrate equivalence
all.equal(x, y)
#> [1] TRUE
all.equal(x, z)
#> [1] TRUE
Use with inline R code.
format_power(avogadro)
#> [1] "$602\\times{10}^{21}$"
which, in an Rmd or qmd document, is rendered as \(602\times{10}^{21}\).
x <- c(
  1.2222e-6, 2.3333e-5, 3.4444e-4, 4.1111e-3, 5.2222e-2, 6.3333e-1,
  7.4444e+0, 8.1111e+1, 9.2222e+2, 1.3333e+3, 2.4444e+4, 3.1111e+5, 4.2222e+6
)
format_power(x)
#> [1] "$1.22\\times{10}^{-6}$" "$23.3\\times{10}^{-6}$" "$344\\times{10}^{-6}$"
#> [4] "$4.11\\times{10}^{-3}$" "$52.2\\times{10}^{-3}$" "$0.633$"
#> [7] "$7.44$" "$81.1$" "$922$"
#> [10] "$1.33\\times{10}^{3}$" "$24.4\\times{10}^{3}$" "$311\\times{10}^{3}$"
#> [13] "$4.22\\times{10}^{6}$"
is rendered as \(1.22\times{10}^{-6}\), \(23.3\times{10}^{-6}\), \(344\times{10}^{-6}\), etc.
The digits argument does not have to be named.
format_power(x[1], 3)
#> [1] "$1.22\\times{10}^{-6}$"
format_power(x[1], 4)
#> [1] "$1.222\\times{10}^{-6}$"
are rendered as \(1.22\times{10}^{-6}\) and \(1.222\times{10}^{-6}\).
The format argument, if included, must be named.
format_power(x[3], format = "sci")
#> [1] "$3.44\\times{10}^{-4}$"
format_power(x[3], format = "engr")
#> [1] "$344\\times{10}^{-6}$"
are rendered as \(3.44\times{10}^{-4}\) and \(344\times{10}^{-6}\).
To compare the effects across many orders of magnitude, we format the example vector twice, placing the results side by side in a data frame for comparison, rendered using knitr::kable(),
# Compare two formats
DT <- data.table(
  scientific = format_power(x, format = "sci"),
  engineering = format_power(x)
)
knitr::kable(DT, align = "rr")
scientific | engineering |
---|---|
\(1.22\times{10}^{-6}\) | \(1.22\times{10}^{-6}\) |
\(2.33\times{10}^{-5}\) | \(23.3\times{10}^{-6}\) |
\(3.44\times{10}^{-4}\) | \(344\times{10}^{-6}\) |
\(4.11\times{10}^{-3}\) | \(4.11\times{10}^{-3}\) |
\(5.22\times{10}^{-2}\) | \(52.2\times{10}^{-3}\) |
\(0.633\) | \(0.633\) |
\(7.44\) | \(7.44\) |
\(81.1\) | \(81.1\) |
\(922\) | \(922\) |
\(1.33\times{10}^{3}\) | \(1.33\times{10}^{3}\) |
\(2.44\times{10}^{4}\) | \(24.4\times{10}^{3}\) |
\(3.11\times{10}^{5}\) | \(311\times{10}^{3}\) |
\(4.22\times{10}^{6}\) | \(4.22\times{10}^{6}\) |
The omit_power argument, if included, must be named.
format_power(x[6], omit_power = c(-1, 3))
#> [1] "$0.633$"
format_power(x[6], omit_power = c(0, 3))
#> [1] "$633\\times{10}^{-3}$"
are rendered as \(0.633\) and \(633\times{10}^{-3}\).
omit_power = NULL removes the exceptions and all numbers are rendered in the selected powers-of-ten notation.
# Omit no values from power-of-ten notation
DT <- data.table(
  scientific = format_power(x, format = "sci", omit_power = NULL),
  engineering = format_power(x, omit_power = NULL)
)
knitr::kable(DT, align = "rr")
scientific | engineering |
---|---|
\(1.22\times{10}^{-6}\) | \(1.22\times{10}^{-6}\) |
\(2.33\times{10}^{-5}\) | \(23.3\times{10}^{-6}\) |
\(3.44\times{10}^{-4}\) | \(344\times{10}^{-6}\) |
\(4.11\times{10}^{-3}\) | \(4.11\times{10}^{-3}\) |
\(5.22\times{10}^{-2}\) | \(52.2\times{10}^{-3}\) |
\(6.33\times{10}^{-1}\) | \(633\times{10}^{-3}\) |
\(7.44\times{10}^{0}\) | \(7.44\times{10}^{0}\) |
\(8.11\times{10}^{1}\) | \(81.1\times{10}^{0}\) |
\(9.22\times{10}^{2}\) | \(922\times{10}^{0}\) |
\(1.33\times{10}^{3}\) | \(1.33\times{10}^{3}\) |
\(2.44\times{10}^{4}\) | \(24.4\times{10}^{3}\) |
\(3.11\times{10}^{5}\) | \(311\times{10}^{3}\) |
\(4.22\times{10}^{6}\) | \(4.22\times{10}^{6}\) |
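The omit_power behavior seen in the examples above can be sketched in base R. This is not formatdown's source code: the function name uses_power_of_ten() is hypothetical, and the sketch assumes that the scientific exponent is the one compared against the omit_power range, which is consistent with the outputs shown in this vignette but is not confirmed by it.

```r
# Hypothetical sketch (not formatdown's source): decide whether a nonzero
# value is written in powers-of-ten notation or in decimal notation.
# Assumption: the scientific exponent is compared with the omit_power range.
uses_power_of_ten <- function(x, omit_power = c(-1, 2)) {
  n <- floor(log10(abs(x)))
  # NULL removes the exceptions: every value uses powers-of-ten notation
  if (is.null(omit_power)) return(rep(TRUE, length(x)))
  n < min(omit_power) | n > max(omit_power)
}

# Exponents -2, -1, 2, and 3 give TRUE, FALSE, FALSE, TRUE
uses_power_of_ten(c(5.2222e-2, 6.3333e-1, 9.2222e+2, 1.3333e+3))

# Narrowing the range to c(0, 3) moves 0.63333 into powers-of-ten notation
uses_power_of_ten(6.3333e-1, omit_power = c(0, 3))
```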
The following examples use density, a data frame included with formatdown that contains columns of class Date, character, factor, numeric, and integer.
density
#> date trial humidity T_K p_Pa R density
#> <Date> <char> <fctr> <num> <num> <int> <num>
#> 1: 2018-06-12 a low 294.05 101100 287 1.197976
#> 2: 2018-06-13 b high 294.15 101000 287 1.196384
#> 3: 2018-06-14 c medium 294.65 101100 287 1.195536
#> 4: 2018-06-15 d low 293.35 101000 287 1.199647
#> 5: 2018-06-16 e high 293.85 101100 287 1.198791
which is rendered un-formatted as
date | trial | humidity | T_K | p_Pa | R | density |
---|---|---|---|---|---|---|
2018-06-12 | a | low | 294.05 | 101100 | 287 | 1.197976 |
2018-06-13 | b | high | 294.15 | 101000 | 287 | 1.196384 |
2018-06-14 | c | medium | 294.65 | 101100 | 287 | 1.195536 |
2018-06-15 | d | low | 293.35 | 101000 | 287 | 1.199647 |
2018-06-16 | e | high | 293.85 | 101100 | 287 | 1.198791 |
Treating a column as a vector,
# Copy to avoid "by reference" changes to density
DT <- copy(density)

# Format as a vector
format_power(DT$p_Pa, digits = 4)
#> [1] "$101.1\\times{10}^{3}$" "$101.0\\times{10}^{3}$" "$101.1\\times{10}^{3}$"
#> [4] "$101.0\\times{10}^{3}$" "$101.1\\times{10}^{3}$"
is rendered as \(101.1\times{10}^{3}\), \(101.0\times{10}^{3}\), \(101.1\times{10}^{3}\), \(101.0\times{10}^{3}\), \(101.1\times{10}^{3}\).
Treating a column within a data frame,
# Copy to avoid "by reference" changes to density
DT <- copy(density)

# Format one column, retain all columns
DT$p_Pa <- format_power(DT$p_Pa, digits = 4)
DT[]
#> date trial humidity T_K p_Pa R density
#> <Date> <char> <fctr> <num> <char> <int> <num>
#> 1: 2018-06-12 a low 294.05 $101.1\\times{10}^{3}$ 287 1.197976
#> 2: 2018-06-13 b high 294.15 $101.0\\times{10}^{3}$ 287 1.196384
#> 3: 2018-06-14 c medium 294.65 $101.1\\times{10}^{3}$ 287 1.195536
#> 4: 2018-06-15 d low 293.35 $101.0\\times{10}^{3}$ 287 1.199647
#> 5: 2018-06-16 e high 293.85 $101.1\\times{10}^{3}$ 287 1.198791
is rendered as
date | trial | humidity | T_K | p_Pa | R | density |
---|---|---|---|---|---|---|
2018-06-12 | a | low | 294.05 | \(101.1\times{10}^{3}\) | 287 | 1.197976 |
2018-06-13 | b | high | 294.15 | \(101.0\times{10}^{3}\) | 287 | 1.196384 |
2018-06-14 | c | medium | 294.65 | \(101.1\times{10}^{3}\) | 287 | 1.195536 |
2018-06-15 | d | low | 293.35 | \(101.0\times{10}^{3}\) | 287 | 1.199647 |
2018-06-16 | e | high | 293.85 | \(101.1\times{10}^{3}\) | 287 | 1.198791 |
Using lapply() to select and treat multiple columns from a data frame,
# Copy to avoid "by reference" changes to density
DT <- copy(density)

# Identify columns to format
cols_we_want <- c("T_K", "p_Pa", "density")

# Select and format
DT <- DT[, lapply(.SD, function(x) format_power(x, 4)), .SDcols = cols_we_want]
DT[]
#> T_K p_Pa density
#> <char> <char> <char>
#> 1: $294.1$ $101.1\\times{10}^{3}$ $1.198$
#> 2: $294.1$ $101.0\\times{10}^{3}$ $1.196$
#> 3: $294.6$ $101.1\\times{10}^{3}$ $1.196$
#> 4: $293.4$ $101.0\\times{10}^{3}$ $1.200$
#> 5: $293.9$ $101.1\\times{10}^{3}$ $1.199$
is rendered as
T_K | p_Pa | density |
---|---|---|
\(294.1\) | \(101.1\times{10}^{3}\) | \(1.198\) |
\(294.1\) | \(101.0\times{10}^{3}\) | \(1.196\) |
\(294.6\) | \(101.1\times{10}^{3}\) | \(1.196\) |
\(293.4\) | \(101.0\times{10}^{3}\) | \(1.200\) |
\(293.9\) | \(101.1\times{10}^{3}\) | \(1.199\) |
Repeat, but retain all columns,
# Copy to avoid "by reference" changes to density
DT <- copy(density)

# Identify columns to format
cols_we_want <- c("T_K", "p_Pa", "density")

# Format selected columns, retain all columns
DT <- DT[, (cols_we_want) := lapply(.SD, function(x) format_power(x, 4)), .SDcols = cols_we_want]
DT[]
#> date trial humidity T_K p_Pa R density
#> <Date> <char> <fctr> <char> <char> <int> <char>
#> 1: 2018-06-12 a low $294.1$ $101.1\\times{10}^{3}$ 287 $1.198$
#> 2: 2018-06-13 b high $294.1$ $101.0\\times{10}^{3}$ 287 $1.196$
#> 3: 2018-06-14 c medium $294.6$ $101.1\\times{10}^{3}$ 287 $1.196$
#> 4: 2018-06-15 d low $293.4$ $101.0\\times{10}^{3}$ 287 $1.200$
#> 5: 2018-06-16 e high $293.9$ $101.1\\times{10}^{3}$ 287 $1.199$
is rendered as
date | trial | humidity | T_K | p_Pa | R | density |
---|---|---|---|---|---|---|
2018-06-12 | a | low | \(294.1\) | \(101.1\times{10}^{3}\) | 287 | \(1.198\) |
2018-06-13 | b | high | \(294.1\) | \(101.0\times{10}^{3}\) | 287 | \(1.196\) |
2018-06-14 | c | medium | \(294.6\) | \(101.1\times{10}^{3}\) | 287 | \(1.196\) |
2018-06-15 | d | low | \(293.4\) | \(101.0\times{10}^{3}\) | 287 | \(1.200\) |
2018-06-16 | e | high | \(293.9\) | \(101.1\times{10}^{3}\) | 287 | \(1.199\) |
If different columns should be reported with different significant digits, treat the columns separately.
# Copy to avoid "by reference" changes to density
DT <- copy(density)

# Format selected columns, retain all columns with signif digits = 5
cols_we_want <- c("T_K", "density")
DT <- DT[, (cols_we_want) := lapply(.SD, function(x) format_power(x, 5)), .SDcols = cols_we_want]

# Individually format one column with signif digits = 4
DT$p_Pa <- format_power(DT$p_Pa, 4)
DT[]
#> date trial humidity T_K p_Pa R density
#> <Date> <char> <fctr> <char> <char> <int> <char>
#> 1: 2018-06-12 a low $294.05$ $101.1\\times{10}^{3}$ 287 $1.1980$
#> 2: 2018-06-13 b high $294.15$ $101.0\\times{10}^{3}$ 287 $1.1964$
#> 3: 2018-06-14 c medium $294.65$ $101.1\\times{10}^{3}$ 287 $1.1955$
#> 4: 2018-06-15 d low $293.35$ $101.0\\times{10}^{3}$ 287 $1.1996$
#> 5: 2018-06-16 e high $293.85$ $101.1\\times{10}^{3}$ 287 $1.1988$
is rendered as
date | trial | humidity | T_K | p_Pa | R | density |
---|---|---|---|---|---|---|
2018-06-12 | a | low | \(294.05\) | \(101.1\times{10}^{3}\) | 287 | \(1.1980\) |
2018-06-13 | b | high | \(294.15\) | \(101.0\times{10}^{3}\) | 287 | \(1.1964\) |
2018-06-14 | c | medium | \(294.65\) | \(101.1\times{10}^{3}\) | 287 | \(1.1955\) |
2018-06-15 | d | low | \(293.35\) | \(101.0\times{10}^{3}\) | 287 | \(1.1996\) |
2018-06-16 | e | high | \(293.85\) | \(101.1\times{10}^{3}\) | 287 | \(1.1988\) |
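For readers who work with base R data frames rather than data.table, the same column-by-column pattern can be sketched as follows. This is an illustrative alternative, not taken from the vignette: the data frame DF is a small stand-in built from the first two rows of density, and wrap_math() is a hypothetical placeholder formatter used so that the sketch runs without formatdown installed.

```r
# Small stand-in for the density data (values copied from its first two rows)
DF <- data.frame(
  trial = c("a", "b"),
  T_K = c(294.05, 294.15),
  p_Pa = c(101100, 101000),
  density = c(1.197976, 1.196384)
)

# Hypothetical placeholder formatter; with formatdown loaded,
# format_power(x, digits) could be used here instead
wrap_math <- function(x, digits) paste0("$", signif(x, digits), "$")

# Columns that share the same number of significant digits are treated together
cols_5 <- c("T_K", "density")
DF[cols_5] <- lapply(DF[cols_5], wrap_math, digits = 5)

# A column needing a different number of digits is treated on its own
DF$p_Pa <- wrap_math(DF$p_Pa, digits = 4)

# The formatted columns are now character; the other column is untouched
sapply(DF, class)
```

Assigning into DF[cols_5] replaces only the selected columns, so the remaining columns are retained without copying or reference semantics coming into play.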
Applying the function to non-numeric objects produces errors. For example, an object of class “Date” produces the following error.
x <- density$date
format_power(x)
#> Error in format_power(x): Assertion on 'class(x)' failed: Must be disjunct from {'Date','POSIXct','POSIXt'}, but has elements {'Date'}.
Error on “character” class,
x <- density$trial
format_power(x)
#> Error in format_power(x): Assertion on 'x' failed. Must be of class 'numeric', not 'character'.
Error on “factor” class,
x <- density$humidity
format_power(x)
#> Error in format_power(x): Assertion on 'x' failed. Must be of class 'numeric', not 'factor'.
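One way to avoid these errors when formatting several columns is to identify the numeric columns first. The base R sketch below uses a small stand-in data frame built from the first two rows of density; the names df and numeric_cols are chosen here for illustration. Whether format_power() accepts integer columns such as R is not shown in this vignette, so that remains an assumption to check before formatting them.

```r
# Small stand-in for the density data (values copied from its first two rows)
df <- data.frame(
  date = as.Date(c("2018-06-12", "2018-06-13")),
  trial = c("a", "b"),
  humidity = factor(c("low", "high")),
  T_K = c(294.05, 294.15),
  R = c(287L, 287L)
)

# is.numeric() is FALSE for Date, character, and factor columns,
# and TRUE for double and integer columns
numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
numeric_cols  # the names "T_K" and "R"
```

The resulting names could then be supplied as the columns to format, for example as the cols_we_want vector in the data.table examples above.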
Chase, Morgan. 2021. Technical Mathematics. https://openoregon.pressbooks.pub/techmath/chapter/module-11-scientific-notation/.