Goals and use cases for a system that stores, queries, and manipulates equations.

Nathaniel Beaver

January 9, 2017

Contents

1 License
2 Motivation
3 Use case
3.1 Print resources.
3.2 Electronic resources.
4 The problem.
5 What the solution should look like.
6 Features
6.1 Semantic features.
6.2 Technical design features.
7 Some objections.
7.1 There’s more to physics and math than equations, you know.
7.2 Equations without context are dangerous, and people will use them when they are not applicable or assume what they are trying to prove, i.e. use circular reasoning.
7.3 Why “equations” and not “expressions” or “formulas”?
7.4 Making it easier to change conventions will encourage fragmentation.
7.5 Is it really so much work to use a web search or a book index?
7.6 Surely there are people already working on this, or something similar?
7.7 Would people actually use this?
7.8 This is way too hard.
7.9 Talk is cheap. Show me the code/data.
7.10 There are too many equations. Searching them all would be hopeless.
8 More ambitious possibilities.
8.1 Semantics of representation and elimination of ambiguity.
8.2 Automatically converting existing documents.
8.3 Insight into dependency structure.
8.4 “Fingerprints” for equivalent mathematical expressions.
8.5 New ways of seeing and reasoning about equations.
8.5.1 An example coloring scheme for simple equations.
8.6 Reasoning about physical systems and connections between them based on mathematical structure.

1 License

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To

view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/.

2 Motivation

My memory for even the commonest formulae is poor, and I rely on either reconstructing them or referring to an accepted text.

— D. J. Finney, “Dimensions of Statistics”

Equations are ubiquitous in mathematics and science. However, the way in which we

store and reference equations suﬀers from severe ﬂaws, such as the inability to transparently

manipulate equations, the diﬃculty of connecting disparate systems by shared mathematical

structure, and lack of insight into invariance. This hinders scientiﬁc inquiry in a variety of

ways.

3 Use case

Suppose I want to look up boundary conditions for an electric ﬁeld to include in a paper I

am writing.

3.1 Print resources.

Most physicists will reach for their trusty E&M books in this case.

In the 2nd edition of David J. Griﬃths’ “Introduction to Electrodynamics”, for example,

a search of the index for “Boundary conditions” gives these equations (in SI units) on page

313:

$$D_1^{\perp} - D_2^{\perp} = \sigma_f$$

$$\mathbf{E}_1^{\parallel} - \mathbf{E}_2^{\parallel} = 0$$

The 3rd edition of John David Jackson’s “Classical Electrodynamics” also uses SI units,

and a quick check of the index gives:

$$(\mathbf{D}_2 - \mathbf{D}_1) \cdot \mathbf{n} = \sigma$$

$$\mathbf{n} \times (\mathbf{E}_2 - \mathbf{E}_1) = 0$$

The drawback is that these equations are “read-only.” They must be manually typed up for use in a LaTeX document or word processing software, a laborious process even for relatively short equations. They are even less suited for use in computational software.

Ebooks and PDF files do not solve this problem, as they do not use LaTeX or computable formats internally.

If the original document is written in LaTeX, however, it is possible to attach the entire LaTeX source code to a PDF. This is somewhat less than convenient for reusing the equation markup, since it does not automatically unwrap macros into standard LaTeX commands. Either the authors must avoid the use of macros, or the user must include all necessary packages and copy all the necessary macros to use the equations.

3.2 Electronic resources.

Let’s see if Wolfram Alpha’s database has the boundary conditions. After all, it does a good

job with other equations.

https://www.wolframalpha.com/input/?i=electric+field+boundary+conditions

Hm, that was a little disappointing.

For quick and dirty information ﬁnding, Google works well. Let’s see what the top results

are.

https://www.google.com/search?q=electric+field+boundary+conditions

The ﬁrst result, as of January 9, 2017, explains their origin from Gauss’s Law and gives

the equations as embedded png images:

The second result distinguishes between a charged/uncharged surface and uses diﬀerent

notation. They also use images, gif in this case (although I converted them to png for

inclusion in this document):

Note that already there is considerable variation in notation. Furthermore, neither site

includes any equations which can be parsed or modiﬁed without ﬁrst performing optical

character recognition on the images; they are profoundly read-only.

What about Wikipedia?

This is the relevant equation.

Fortunately, there is embedded LaTeX markup available under “Edit Source”.

:<math>\mathbf{n}_{12} \times (\mathbf{E}_2 - \mathbf{E}_1) = \mathbf{0} </math>

:<math>(\mathbf{D}_2 - \mathbf{D}_1) \cdot \mathbf{n}_{12} = \rho_{s} </math>

This will be sufficient for entering into a paper if:

1. We want LaTeX markup.

2. The Wikipedia editor uses the LaTeX markup correctly.

3. The Wikipedia markup uses the same symbols and units system that we want to use.

4. We only want to typeset the equation, not use it for actual computation.
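Pulling the markup out of the page source is at least easy to automate. The following is a minimal sketch in Python, assuming the wikitext is already available as a string; the function name `extract_math` is made up for illustration.

```python
import re

# Matches <math>...</math>, including tags that carry attributes.
MATH_TAG = re.compile(r"<math[^>]*>(.*?)</math>", re.DOTALL)

def extract_math(wikitext):
    """Return the LaTeX body of every <math> element in a wikitext string."""
    return [body.strip() for body in MATH_TAG.findall(wikitext)]

wikitext = r"""
:<math>\mathbf{n}_{12} \times (\mathbf{E}_2 - \mathbf{E}_1) = \mathbf{0} </math>
:<math>(\mathbf{D}_2 - \mathbf{D}_1) \cdot \mathbf{n}_{12} = \rho_{s} </math>
"""

for latex in extract_math(wikitext):
    print(latex)
```

This recovers typesetting markup only. It says nothing about whether the markup is correct, which units it assumes, or how to evaluate it, so the four conditions above still apply.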

4 The problem.

The mathematical community is split into small groups, each one with its own

customs, notation and terminology. It may soon be indispensable to present the

same result in several versions, each one accessible to a speciﬁc group; the price

one might have to pay otherwise is to have our work rediscovered by someone

who uses a diﬀerent language and notation, and who will rightly claim it as his

own.

— Gian-Carlo Rota, “Ten Lessons I wish I had been Taught”

Both the print and electronic resources have a number of deﬁciencies. In particular,

• Even electronic versions of equations are not generally suitable for using with compu-

tational tools.

• There is no reliable standard for representation of variables, even for the simplest and most common equations. Sometimes a surface charge is $\sigma$, sometimes it is $\sigma_f$, and sometimes it is $\rho_s$.

• There are a variety of diﬀerent conventions in use that diﬀer between physics and

mathematics. For example, while the right-hand rule is fairly universal, the quantities

φ and θ are swapped in spherical coordinates. Similar problems exist for Euler angles.

• There are many units systems in use in the sciences, several of which are popular

enough to include in, say, the software of a given subﬁeld, and even if everyone were

to standardize tomorrow, there are centuries of scholarly literature that is expressed in

diﬀerent units. One of the most common examples is the diﬀering forms of equations

for SI (MKSA) and Gaussian (cgs) systems. There are valid arguments for using either

system, or using the so-called “natural units”. Converting between these systems,

however, is tedious, and people should not have to spend much time on it.

• In much the same way as there are many unit systems, there are many equally valid

choices of gauge in electrodynamics and a proliferation of tensor notations. Dirac

notation is another example of an elegant notation that is occasionally translated into

more familiar notation for some manipulations.

• Most equations are embedded in a particular coordinate space and vector basis, such as the old standbys of Cartesian, cylindrical, or spherical coordinates. Often, a problem is much easier to attack in a favorable coordinate system. If transformation of coordinate systems were effortless, exploring alternative solutions would become far less time-consuming and laborious. (Wouldn’t it be fun to try solving something in a non-holonomic basis?)

• Many derivations in physics rely on substitutions that make the equations unitless, yet

another related form of the equation that is diﬃcult to ﬁnd without looking up the

original derivation. These kinds of change of variables are very common, but seldom

tabulated systematically.

• Curves like conic sections have representations in Cartesian coordinates, polar coor-

dinates, parametric coordinates, and a variety of other exotic options. These can be

invaluable for simplifying analysis, but looking them up is time-consuming and re-

deriving them is time-consuming and error-prone.

• No print or electronic resource is free from errors, and not everyone checks the errata

page. Nor is the errata page always free from errors.
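To make the spherical-coordinate item in the list above concrete: in the convention common in physics, $\theta$ is the polar angle and $\varphi$ the azimuthal angle, while many mathematics texts swap the two. The same point therefore has two different-looking parameterizations (here $r$ is the radial distance in both):

```latex
% Physics convention: \theta polar, \varphi azimuthal
x = r\sin\theta\cos\varphi, \quad y = r\sin\theta\sin\varphi, \quad z = r\cos\theta

% Common mathematics convention: \theta azimuthal, \varphi polar
x = r\sin\varphi\cos\theta, \quad y = r\sin\varphi\sin\theta, \quad z = r\cos\varphi
```

An equation copied from one source into a document that uses the other convention is silently wrong unless every $\theta$ and $\varphi$ is exchanged.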

Together, these problems can make what should be simple tasks into a chore. There is

frequently unnecessary friction, for example, in using results from two diﬀerent papers, or

collaborating with a colleague who uses a diﬀerent unit system. And how many scholars

have lost hours of work by writing a paper with one choice of symbols and units and then

having to change them all for a journal submission?

5 What the solution should look like.

Instead of trying to convince everyone to use the same units, gauges, and symbols, it is saner

to leverage the ability of machines to do busywork.

An electronic database of equations with the relevant information could easily translate between any system of units, and could produce output suitable for any software, regardless of whether it is intended for use in a Fortran program, a LaTeX document, or a web browser.

If a transformation between these forms is non-trivial, the best thing is simply to have

a human provide the input and store them both, and make them easy to ﬁnd later. Rather

than attempting to ﬁnd the “one true form” of the equation, it should simply associate

various useful formats.

Furthermore, it should take free-form input queries that would outpace any book or paper index. It could also outpace a web search by taking advantage of the limited input domain. Queries could include items like common names (“Maxwell equations”), units of variables (e.g. “all equations with the left side in units of force”), and even algebraic forms (e.g. “all equations of the form a x^2 + b x + c”).

Given suﬃcient heuristics and metadata (much of which will, admittedly, have to be

entered manually), this is a solvable problem.
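As a rough illustration of how little machinery the simpler queries need, here is a sketch in Python. The record layout, the field names, and the three sample entries are assumptions made up for this example; they are not taken from any existing database.

```python
from dataclasses import dataclass, field

@dataclass
class Equation:
    name: str            # common name
    latex: str           # typesetting form
    lhs_dimension: str   # dimension of the left-hand side, entered by hand
    tags: set = field(default_factory=set)

DATABASE = [
    Equation("Newton's second law", r"\vec{F} = m\vec{a}", "force", {"mechanics"}),
    Equation("Coulomb's law",
             r"F = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r^2}",
             "force", {"electrostatics"}),
    Equation("Kinetic energy", r"T = \frac{1}{2} m v^2", "energy", {"mechanics"}),
]

def search(db, name=None, lhs_dimension=None, tag=None):
    """Return the equations that match every criterion that was supplied."""
    results = []
    for eq in db:
        if name is not None and name.lower() not in eq.name.lower():
            continue
        if lhs_dimension is not None and eq.lhs_dimension != lhs_dimension:
            continue
        if tag is not None and tag not in eq.tags:
            continue
        results.append(eq)
    return results

# "All equations with the left side in units of force"
for eq in search(DATABASE, lhs_dimension="force"):
    print(eq.name, "->", eq.latex)
```

Queries by algebraic form are harder, since they need a parsed, computable representation rather than a metadata field.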

Finally, it should be personal and collaborative. It should be easy to import the standard

equations, but most people have speciﬁc, personalized needs, and collaborate with a relatively

small group of people, so the software should reﬂect that.

6 Features

6.1 Semantic features.

These are broad requirements that do not specify implementation.

• Provide a means to write explanatory text for symbols or sub-expressions in the relevant

equations, which most texts do anyway as a matter of course. For example,

$\vec{L} = I\vec{\omega}$,

where $\vec{L}$ is angular momentum, $I$ is the moment of inertia, and $\vec{\omega}$ is the angular velocity.

• Search for equations by popular name and subﬁeld.

• Search for equations by regular expression.

• Search for equations by the symmetries they obey and mathematical groups they belong

to.

• Enter an equation and get back structurally similar and related equations.

• Enter some symbols and get equations back that use those symbols.

• Provide the option to perform simple substitutions (so a query for a + b could return an equation stored as x + y, and one could specify substitution rules so that it outputs m + n).

• Provide the option to search for algebraically equivalent forms (e.g. a(b + c) could

return ab + ac). Note that this would only work for equations that had unambigu-

ous computable forms provided, so that the software could call, e.g. Sage, Maxima,

or Mathematica to determine whether or not the forms are algebraically equivalent.

(MathML renderers and LaTeX compilers will happily typeset algebraically ambiguous or uncomputable gibberish.)

• Provide the option to search for algebraically equivalent forms with simple substitutions

(e.g. a(b + c) could return xy + xz)

• Provide the ability to search based on units, such as all equations with an exponent that has units of radians or degrees.

• Provide the option to choose units system, so that a single equation would have both

cgs and SI forms available, for example.

• Allow extensions to specify new unit systems or gauges.
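The “simple substitutions” item in the list above can be sketched with nothing but a parser. The example below borrows Python's own expression grammar through the standard `ast` module (an assumption of convenience; a real system would parse mathematical notation) and treats two expressions as matching when they differ only by a consistent renaming of symbols. The helper names are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

class _Rename(ast.NodeTransformer):
    """Rewrite variable names, either by explicit rules or canonically."""

    def __init__(self, rules=None):
        self.rules = rules   # explicit {old: new} mapping, or None
        self.seen = {}       # canonical names, in order of first visit

    def visit_Name(self, node):
        if self.rules is not None:
            new_id = self.rules.get(node.id, node.id)
        else:
            new_id = self.seen.setdefault(node.id, f"v{len(self.seen)}")
        return ast.copy_location(ast.Name(id=new_id, ctx=node.ctx), node)

def canonical_form(expr):
    """Structure of expr with symbols renamed v0, v1, ... left to right."""
    tree = _Rename().visit(ast.parse(expr, mode="eval"))
    return ast.dump(tree)

def same_up_to_renaming(a, b):
    return canonical_form(a) == canonical_form(b)

def substitute(expr, rules):
    """Apply user-specified substitution rules and return the new text."""
    tree = _Rename(rules).visit(ast.parse(expr, mode="eval"))
    return ast.unparse(tree.body)

print(same_up_to_renaming("a + b", "x + y"))        # a query for a + b finds x + y
print(substitute("x + y", {"x": "m", "y": "n"}))    # output rewritten as m + n
```

This comparison is purely structural: it does not know that addition commutes or that a(b + c) equals ab + ac. Recognizing algebraically equivalent forms needs a computer algebra system, as noted in the list.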

6.2 Technical design features.

These are some speciﬁc requirements of the implementation.

• Store equation database locally. Network connectivity must not be required for typical

use.

• Provide the capability to update from network sources and merge in other people’s

changes, as existing distributed version control systems do.

• Store representations in at least one format of a variety of options (Unicode UTF-8 plaintext, LaTeX, MathML, OpenMath, etc.) as well as a way to add more specific forms. For example, one might be a LaTeX expression with the vector quantities having arrows like this: $\vec{r}$; and another with vector quantities bolded like this: $\mathbf{r}$.

• Provide the ability to optionally store and access other formats (e.g. Mathematica,

MS Word, Matlab code, Fortran code, C code etc.)

• Call existing software to convert between representations, so that not everything has

to be converted manually. This falls into two categories:

1. Conversions that are essentially just formatting conversions. For example, while LaTeX and MathML have some very apparent differences, they both require only enough information to typeset an equation, not actually evaluate it. The automatically generated markup could subsequently be tweaked by hand.

2. Conversions from a typesetting or markup language like LaTeX or MathML to

an expression that can actually be evaluated; one that you can plug in numbers

and get a numeric result. Some equations are simple enough that a conversion

to, say, C code is trivial – something just using trig, exponents, and arithmetic

functions, for example. Others will be suﬃciently abstract as to require manual

conversion or avoiding a computable format at all. (Alternately, a link to some

remotely hosted code may be in order; see below.) Incidentally, Stephen Wolfram

(of Mathematica fame) had this to say about this kind of conversion.

Unlike with ordinary human natural language, it is actually possible to

take a very close approximation to familiar mathematical notation, and

have a computer systematically understand it. That’s one of the big

things that we did about ﬁve years ago in the third version of Mathe-

matica. And at least a little of what we learned from doing that actually

made its way into the speciﬁcation of MathML.

• Link to internal and external references (refer to another equation in the same database,

jump to a speciﬁed page of a local or remote PDF or ebook, standard html-style links

to urls of relevant websites or source code implementations, digital object identiﬁers

(DOIs), bibtex references, etc.)
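One way to picture the storage requirements in the list above is a plain-text record that holds several representations side by side. The sketch below uses JSON through Python's standard library. The keys and format labels are invented for illustration and should not be read as the schema of any existing tool.

```python
import json

# Hypothetical record: one equation, several stored representations.
record = {
    "name": "Angular momentum in terms of angular velocity",
    "representations": {
        "latex-arrow": r"\vec{L} = I\vec{\omega}",
        "latex-bold": r"\mathbf{L} = I\boldsymbol{\omega}",
        "unicode": "L = Iω",
        "python": "L = I * omega",
    },
}

def get_representation(record, preferred):
    """Return (format, text) for the first preferred format that is stored."""
    stored = record["representations"]
    for fmt in preferred:
        if fmt in stored:
            return fmt, stored[fmt]
    raise KeyError(f"none of {preferred} is stored for {record['name']!r}")

# Plain-text storage: the record survives a round trip through JSON.
text = json.dumps(record, ensure_ascii=False, indent=2)
assert json.loads(text) == record

# Ask for MathML first and fall back to bold-vector LaTeX.
print(get_representation(record, ["mathml", "latex-bold"]))
```

Because the file is ordinary text, existing distributed version control systems can store, diff, and merge it without special support.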

7 Some objections.

7.1 There’s more to physics and math than equations, you know.

Yes, and people did complex math for centuries without nice modern algebraic notation,

but equations do provide a very lovely and compact way to represent relationships between

variables.

Once we’ve got an easy and reliable way to ﬁnd the equations we want, we can spend

more time reasoning about whether the equation is applicable, what approximations to make,

what the physical interpretation is, and how that interpretation squares with experiment and

physical intuition.

7.2 Equations without context are dangerous, and people will use

them when they are not applicable or assume what they are

trying to prove, i.e. use circular reasoning.

A valid concern, which is why it’s important to write explanatory text about each symbol

and link to more complete discussions.

In any case, people already misuse equations, and this is generally because they don’t

want to put in the eﬀort to look up the context of the equation. Properly used, this software

could help mitigate this problem.

7.3 Why “equations” and not “expressions” or “formulas”?

This is just nomenclature.

I assume that this software could work for mathematical expressions in general, identi-

ties, approximations, chemical formulas, etc. Most of the time, though, we need to know

about relationships between variables, so “equations” are what most people think of and use

regularly.

7.4 Making it easier to change conventions will encourage frag-

mentation.

Possibly, but it’s pretty badly fragmented already. If there is a reliable system to auto-

matically convert systems of units and other conventions, fragmentation will not matter as

much.

Some fragmentation is due to the diﬃculty of updating existing bodies of work to match

modern improvements, so a system to make this easier could actually reduce fragmentation.

Furthermore, such a system would make it possible to use a standard form when writ-

ing papers and textbooks, since everyone could convert to their favorite set of conventions

without diﬃculty.

7.5 Is it really so much work to use a web search or a book index?

Yes.

Try doing a Google search for L = r × p. Symbolab works somewhat better for this, as does searching for likely LaTeX markup.

This is just a simple example of how hard it is to find even a basic equation with relatively few ways to express it.

As for books, if you can get everything you need from one book, great. The ones I need are generally scattered across several books and journal articles, none of which use the same notation. Not all books have a good glossary or index.

7.6 Surely there are people already working on this, or something

similar?

There are some interesting websites out there, but they’re more about searching existing

websites and scholarly literature and don’t accomplish more than a few of the features

mentioned above.

Wolfram Alpha is probably the closest right now, but its goals are broader than storing

equations. The lack of a local, user-controlled database is probably the biggest problem.

One result of this is a tendency towards only storing the “one true form” of an equation.

MathML and OpenMath are projects that employ similar ideas, although MathML is

focused mainly on web browsers and OpenMath is still unﬁnished. (The OpenMath website

lists only 58 members, many of whom are professors that work on it in their spare time.)

More importantly, OpenMath is working towards a standard for representing the math-

ematical objects for computer algebra systems, not a working piece of software performing

the functions mentioned above.

The database of equations could certainly use the OpenMath standard as another repre-

sentation — a reliable, standardized representation — but it would not require it to work.

MathJax is doing great things for putting math on the web, but not so much for storage

and retrieval.

There are also eﬀorts to make derivations automatic, i.e. proof assistants and automated

theorem proving. These eﬀorts are intriguing, but physicists at any rate are more interested

in the equations themselves and where they are applicable than rigorously deﬁning the

mechanism to derive the results.

7.7 Would people actually use this?

I would, and I have reason to believe other people would, too.

7.8 This is way too hard.

It’s really not; see below for the actually hard/interesting problems. The individual components have existed for decades; they just haven’t been tied together yet. I regularly use a desktop search tool to index and query my local documents, but it isn’t geared to equations and hence doesn’t have the specific features I would like.

7.9 Talk is cheap. Show me the code/data.

There is an example of a minimal prototype/proof of concept on GitHub here:

https://github.com/nbeaver/equajson

7.10 There are too many equations. Searching them all would be

hopeless.

The number of well-formed formulas is indeed infinite (though countably so, by Gödel numbering), but only a finite and relatively tiny subset of them is useful or interesting.

For comparison, there are a lot of Unicode characters, and more on the way, but writing

a program to search for the one you want is not unusual. The code for querying the database

is pretty simple, too.

8 More ambitious possibilities.

8.1 Semantics of representation and elimination of ambiguity.

Mathematical software is traditionally poor at retaining semantics. This is sometimes ben-

eﬁcial; it’s easy, for example, to simulate a universe with physical laws that do not match

this universe. However, some level of semantic information can be retained by, for example,

respecting consistent dimensionality and distinguishing indices from regular variables.

This could alleviate the namespace problem which is rampant in physics. For example,

introductory kinematics generally uses m for mass and µ for coeﬃcients of friction. However,

many upper-level mechanics books use µ for reduced mass. This can get awkward if you

want to use reduced mass in a problem with coeﬃcients of friction.

The problem gets much worse when diﬀerent subﬁelds of a discipline try to use each

other’s equations. (There’s only so much subscripts can do to increase the number of unique

symbols, especially if subscripts and superscripts are already used for tensor notation. There

are other ways to distinguish quantities, however.)

If the dimensions of each quantity were unambiguously speciﬁed, the symbols peculiar to

the problem wouldn’t matter as much, and symbol collisions could be detected and averted

automatically if the software had a suitable list of candidates for representing each quantity.
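A sketch of the collision handling described above, in Python. Each equation is reduced to a map from symbol to the quantity it denotes, and the candidate lists are invented for the example. A real system would draw both from stored metadata.

```python
def resolve_collisions(equations, candidates):
    """Assign one symbol per quantity, avoiding clashes.

    equations:  list of {symbol: quantity} dicts, one per equation
    candidates: {quantity: [acceptable symbols, best first]}
    Returns {quantity: symbol}.
    """
    assignment = {}
    used = set()
    for eq in equations:
        for symbol, quantity in eq.items():
            if quantity in assignment:
                continue
            # Try the symbol the source used, then the listed alternatives.
            fallbacks = [s for s in candidates.get(quantity, []) if s != symbol]
            for option in [symbol] + fallbacks:
                if option not in used:
                    assignment[quantity] = option
                    used.add(option)
                    break
            else:
                raise ValueError(f"no free symbol for {quantity!r}")
    return assignment

friction = {"m": "mass", "mu": "coefficient of friction"}
two_body = {"mu": "reduced mass", "r": "separation"}
candidates = {
    "coefficient of friction": ["mu", "mu_k"],
    "reduced mass": ["mu", "m_r"],
}

print(resolve_collisions([friction, two_body], candidates))
```

Here the friction coefficient keeps `mu` because it was seen first, and the reduced mass falls back to the hypothetical alternative `m_r`.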

8.2 Automatically converting existing documents.

By taking advantage of certain reasonable assumptions, making simple substitution transformations on, say, a LaTeX document could be straightforward. Of more interest to physicists, however, are more difficult transformations, such as automatically converting the units from, say, cgs to SI, as John David Jackson did for the first 10 chapters of the 3rd edition of “Classical Electrodynamics”.

This is possible to do, and an equation with suﬃcient metadata to declare whether it was

in Gaussian or MKSA units could theoretically be translated to the other using a lookup table

for each quantity. In practice, because of things like LaTeX macros and multiple conventions

for cgs units — does that equation need a factor of c or c

? — automatic conversion tends

to be fragile. Some transformations are best done manually, or at least semi-manually.
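As a worked instance of the lookup-table idea, take Gauss's law and assume the usual Gaussian-to-SI symbol substitutions $\mathbf{E} \to \sqrt{4\pi\varepsilon_0}\,\mathbf{E}$ and $\rho \to \rho/\sqrt{4\pi\varepsilon_0}$. Only these two table entries are needed for this equation:

```latex
% Gauss's law in Gaussian units
\nabla \cdot \mathbf{E} = 4\pi\rho

% Apply the two substitutions
\sqrt{4\pi\varepsilon_0}\;\nabla \cdot \mathbf{E}
    = \frac{4\pi\rho}{\sqrt{4\pi\varepsilon_0}}

% Divide both sides by \sqrt{4\pi\varepsilon_0} to obtain the SI form
\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}
```

The substitution itself is mechanical. The fragile part is knowing which quantity each symbol in the source actually denotes, which is exactly what the metadata is for.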

A more promising possibility is to develop and use standards like OpenMath with the

goal of making it trivial to shift between notations and conventions. This will be largely

invisible in terms of form, but such functionality will make collaboration and reuse easier

and more robust.

For example, a textbook with self-describing unit systems built-in could be used by an

engineering class with customary units, an introductory physics class with SI units, or as

a supplement to an existing work in cgs units, all without confusion or tedious manual

conversion.

8.3 Insight into dependency structure.

Hiding behind every derived equation is a dependency graph leading all the way back to ﬁrst

principles.

Equations that “know” how they are derived can be more easily altered to understand

which assumptions they require.

Once an equation’s ancestry is unambiguously specified, it becomes trivial to answer

questions like, “Is this equation linear in θ?” or “Does this equation require isotropic per-

mittivity?” or “Does this equation hold for non-Euclidean geometries?” or “How would this

equation be diﬀerent if the sign convention for charge were reversed?”
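A question such as “Does this equation require isotropic permittivity?” then reduces to a graph traversal. The sketch below is in Python. The three-node graph and its assumption labels are a simplified illustration written for this example, not a vetted derivation.

```python
# Hypothetical derivation graph: each entry lists the equations it is derived
# from and the assumptions introduced at that step.
GRAPH = {
    "Gauss's law for D": {
        "parents": [],
        "assumptions": set(),
    },
    "Boundary condition on normal D": {
        "parents": ["Gauss's law for D"],
        "assumptions": {"sharp interface"},
    },
    "Boundary condition on normal E": {
        "parents": ["Boundary condition on normal D"],
        "assumptions": {"linear media", "isotropic permittivity"},
    },
}

def all_assumptions(name, graph=GRAPH):
    """Collect the assumptions of an equation and of all its ancestors."""
    node = graph[name]
    collected = set(node["assumptions"])
    for parent in node["parents"]:
        collected |= all_assumptions(parent, graph)
    return collected

def requires(name, assumption, graph=GRAPH):
    return assumption in all_assumptions(name, graph)

print(requires("Boundary condition on normal E", "isotropic permittivity"))
print(requires("Boundary condition on normal D", "isotropic permittivity"))
```

The recursion assumes the graph has no cycles, which is also what rules out circular reasoning in the derivations themselves.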

8.4 “Fingerprints” for equivalent mathematical expressions.

Regardless of the notation used in, for example, the Pythagorean theorem, there are always three independent variables. There are also always two operations (self-multiplication and addition), or three if we count equality as an operation. Also, the commutativity properties mean that symmetry groups can be used to describe the equation.

These examples of notation-invariant properties could potentially be tabulated and sys-

tematized as a kind of ﬁngerprint for a large number of commonly-used expressions, which

could make recognizing mathematical patterns easier, since familiarity with the particular

notation and choice of symbols would not be necessary to recognize the pattern.

Ideally the ﬁngerprint would be robust enough to help identify the components of a

complex equation’s sub-expressions with simpler equations.

For example, an expression of the form

$$a = \frac{bc}{b + c}$$

will get many physicists thinking about electronics, since the equivalent expression

$$\frac{1}{a} = \frac{1}{b} + \frac{1}{c}$$

corresponds to adding resistors in parallel or capacitors in series. However, the symmetry is obscured somewhat in the first form.

A desirable mathematical ﬁngerprint would bring this pattern out into the open, and

would extend to helping identify sums of any number of reciprocals.
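The crudest possible fingerprint simply counts distinct symbols and operations. The sketch below does this in Python, again borrowing Python's expression grammar via the standard `ast` module as a stand-in for a real mathematical parser; the function name is hypothetical.

```python
import ast
from collections import Counter

def fingerprint(expr):
    """Return (number of distinct symbols, sorted operation counts)."""
    tree = ast.parse(expr, mode="eval")
    symbols = set()
    ops = Counter()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            symbols.add(node.id)
        elif isinstance(node, ast.BinOp):
            ops[type(node.op).__name__] += 1
        elif isinstance(node, ast.Compare):
            for op in node.ops:
                ops[type(op).__name__] += 1
    return len(symbols), tuple(sorted(ops.items()))

# The choice of letters does not affect the fingerprint.
print(fingerprint("a**2 + b**2 == c**2"))
print(fingerprint("x**2 + y**2 == r**2"))

# Two algebraically equivalent forms still get different fingerprints.
print(fingerprint("a == b*c/(b + c)"))
print(fingerprint("1/a == 1/b + 1/c"))
```

The first pair matches: three symbols and three kinds of operation (power, addition, equality). The second pair does not, even though a = bc/(b + c) and 1/a = 1/b + 1/c are equivalent, which shows why a useful fingerprint has to be computed from some canonical form rather than from the surface syntax.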

Equation metadata provides insight into what is invariant and fundamental in an equation

and what is an artifact of conventions such as positive and negative charge, origin, gauge,

basis, and orientation (left or right handedness) of a vector space.

8.5 New ways of seeing and reasoning about equations.

Once an equation can be parsed into a machine-manipulable format, there are many possi-

bilities, including color-coding, dependency graphs to keep track of which parts depend on

a given quantity, and whether it is separable into functions of the variables it depends on.

For example, Richard Feynman had grapheme-color synesthesia.

“When I see equations, I see the letters in colors – I don’t know why. As I’m

talking, I see vague pictures of Bessel functions from Jahnke and Emde’s book,

with light-tan j’s, slightly violet-bluish n’s, and dark brown x’s ﬂying around.

And I wonder what the hell it must look like to the students.”

Feynman might have seen something like this.

(ix)J

(x) − iJ

(x)J

(ix) = 0

One might imagine that it would somehow be useful to distinguish variables that

are diﬀerent colors. In my experience it is ﬁne to do this in annotating a formula.

But it becomes totally confusing if, for example, a red and green x are supposed

to be distinct variables.

With sufficient metadata about an equation, software could automatically color-code the expression by assigning different colors to functions and operators, independent free variables, dependent free variables, and bound (a.k.a. “dummy”) variables or indices.

This makes it faster and easier to answer questions like, “How many degrees of freedom

does this equation have?” or “How much of this equation could be replaced by a numeric

constant?”

8.5.1 An example coloring scheme for simple equations.

Operators and special functions are purple. This includes elementary binary arithmetic operations like +, −, ×, and ÷; abstractions of them like $\sum$ and $\prod$; integer-only operations like factorial ! and modular arithmetic 5 ≡ 17 (mod 12); real-valued functions like log, sin, absolute value |x|, and the gamma function; vector operations like the cross product, dot product, and various ∇ operations; set operations like ∩ and ∪; the operators of differential and integral calculus like $\lim_{x\to\infty}$ and $\int \sin x \, dx$, and their complex and vector versions; generalized binary operators like the Kronecker delta, Poisson brackets, and commutator brackets; and also generalized functions like the Dirac delta function. If it acts on zero or more inputs and cannot be freely redefined, it falls into this category.

Free variables and free functions are green. This includes quantities such as x and n in the binomial approximation $(1+x)^n \approx 1 + nx$ and the function f in the linear approximation $f(x) \approx f(a) + f'(a)(x - a)$. It also includes free constants such as constants of integration. If it could have a value or definition but has not been assigned one, it falls into this category, even if it is later assigned a value or definition.

The independent variable is red. The choice of independent variable is not intrinsic to

the equation, but making it a diﬀerent color gives context to the intended use of the

relation. It is a semantic annotation, not a rigorously deﬁned construct.

Fixed constants are blue. These include numeric constants like 0, π or e, vector constants like the unit vector $\hat{x}$, physical constants like the speed of light c or Avogadro’s number $N_A$, unchanging sets such as the set of real numbers $\mathbb{R}$, unchanging groups like the symmetry group of a square $D_4$, and even poorly defined but fixed abstractions such as ∞ and −∞. It does not include constants of integration, unless the constant’s value is fixed by some constraint. If it is a fixed quantity with a single, widely accepted definition and a few common representations (ideally only one representation), it fits into this category. (Note that some physical constants may not actually be constant, so this is context-dependent and depends on the author’s intent. Also note that for these purposes, “constant” does not include quantities that must be kept constant to make an equation valid. For example, $n_1$ is constant over space in a homogeneous material described by Snell’s Law $n_1 \sin\theta_1 = n_2 \sin\theta_2$. Nevertheless, although the relation assumes $n_1$ is constant over time and space, $n_1$ of the entire medium is free to vary, so $n_1$ is still considered a free variable. Similarly, while drag coefficients are constant for a given object at a given Reynolds number, they are freely variable components of the drag equation.)

Indices and other bound variables are gray. This includes k in $\sum_{k=0}^{100}$ and x in $\int_a^b f(x)\,dx$, as well as tensor indices. If its value is assigned and constrained by an operator in a well-defined way, but also varies, and the choice of symbol is completely arbitrary, it falls into this category. Note that the constraint must be explicit and local to the expression; most free variables have some kind of constraints on them, but not enough to evaluate the expression.

Everything else remains black. This includes clarifying parentheses in expressions like (a/b)/c, as well as mathematical shorthand like ∀, ∈, ⊂, ∴, and relations like >, ≥, =, ≠, and ≈.
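Once each token carries its category, applying the scheme is mechanical. The sketch below, in Python, takes hand-annotated tokens and emits LaTeX that uses `\textcolor` from the xcolor package. The category labels and the function name are made up for this example, and the annotation of the sample equation is one possible reading of the scheme.

```python
# Category -> xcolor color name, following the scheme described above.
COLORS = {
    "operator": "purple",
    "free": "green",
    "independent": "red",
    "constant": "blue",
    "bound": "gray",
    "other": "black",
}

def colorize(tokens):
    """tokens: list of (latex_fragment, category) pairs, annotated by hand."""
    parts = []
    for fragment, category in tokens:
        color = COLORS[category]
        if color == "black":
            parts.append(fragment)          # default text color, no wrapper
        else:
            parts.append(r"\textcolor{%s}{%s}" % (color, fragment))
    return " ".join(parts)

# Circumference of a circle, C = 2 pi r, with r as the independent variable.
circumference = [
    ("C", "free"),
    ("=", "other"),
    ("2", "constant"),
    (r"\pi", "constant"),
    ("r", "independent"),
]

print(colorize(circumference))
```

The implicit multiplications between 2, π, and r receive no color because they never appear as tokens.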

This scheme has many limitations. For example, it

• is skewed in favor of equations used by physicists, not equations used by e.g. statisti-

cians or group theorists;

• does not distinguish between kinds of operators or functions;

• does not distinguish between integer-valued, real, and complex free variables;

• does not establish whether free variables depend on other free variables or not;

• does not explicitly show the implicit multiplication operations;

• does not separate the operation of raising a value to the nth power (e.g. $2^n$) from the free variable n;

• contains some ambiguities. For example, squaring a constant is performing an operation on that constant, but the result is still a constant, so perhaps the entire thing should be colored as a constant?

However, it does have some good points as well. For example,

• Color is unused by existing notation, so any equations can be coded without loss of

information.

• Highlighting the independent variable in red conveys the author’s intent.

• It adds an extra degree of freedom for disambiguating expressions. A primed variable $f'$ looks different from the derivative $f'$ of a function.

• Mental variable collisions are less likely. It is easy to distinguish a constant of integra-

tion c from the speed of light c.

• The line between functions and variables is generally not clear-cut, since most vari-

ables depend on other variables. This scheme sidesteps the problem by lumping them

together.

• It increases the information density of an equation without adding excess complexity;

only ﬁve easily distinguishable hues are added.

• It constitutes a rudimentary visual type system for the expressions, which helps catch errors and inconsistencies. Differentiating with respect to a fixed constant is obviously wrong, for example.

Here is a small selection of equations coded with this scheme, mainly selected from calculus (i.e. real analysis) and the physical sciences.

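Compton scattering

$$\lambda' - \lambda = \frac{h}{m_e c}\,(1 - \cos\theta)$$

Fundamental Theorem of Calculus

$$\int_a^b f(x)\,dx = F(b) - F(a)$$

Binomial coefficients

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} = \prod_{i=1}^{k} \frac{n-(k-i)}{i}$$

Geometric series

$$a + ar + ar^2 + ar^3 + \cdots + ar^{n-1} = \sum_{k=0}^{n-1} ar^k = a\,\frac{1-r^n}{1-r}$$

Gauss’ Law

$$\oint \vec{E} \cdot d\vec{A} = \frac{Q}{\varepsilon_0}$$

Divergence theorem

$$\iiint_V \left(\nabla \cdot \vec{F}\right) dV = \oiint_S \vec{F} \cdot \vec{n}\,dS$$

Stokes’ theorem

$$\iint_S \left(\nabla \times \vec{F}\right) \cdot d\vec{S} = \oint_{\partial S} \vec{F} \cdot d\vec{r}$$

Drag force acting on a projectile

$$\vec{F}_D = -\hat{v}\,\frac{1}{2}\,C_D(v)\,\rho A v^2$$

Van der Waals gas

$$\left(p + \frac{n^2 a}{V^2}\right)(V - nb) = nRT$$

Trapezoidal rule

$$\int_a^b f(x)\,dx \approx (b-a)\left[\frac{f(a)+f(b)}{2}\right]$$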

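Bohr model radius of electron orbit

$$r_n = \frac{\varepsilon_0 n^2 h^2}{\pi m_e e^2}$$

Rydberg formula for hydrogen

$$\lambda^{-1} = \frac{m_e e^4}{8\varepsilon_0^2 h^3 c}\left(n_1^{-2} - n_2^{-2}\right)$$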

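Virial theorem

$$2\langle T \rangle = -\sum_{k=1}^{N} \left\langle \vec{F}_k \cdot \vec{r}_k \right\rangle$$

Einstein field equations

$$R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R + g_{\mu\nu} \Lambda = \frac{8\pi G}{c^4}\,T_{\mu\nu}$$

Fourier transform

$$\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)\,e^{-2\pi i x \xi}\,dx$$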

8.6 Reasoning about physical systems and connections between

them based on mathematical structure.

The practical purpose of associating equivalent forms is to save time and eﬀort. The greater

purpose is to make explicit connections between seemingly disparate systems.

There are many famous examples of the same equation describing seemingly unrelated

phenomena. For example, the scalar wave equation describes vibrations in both solids and

ﬂuids, and similar equations describe electromagnetic waves and quantum mechanical wave-

functions. The hydraulic analogy provides a useful and intuitive (though potentially mis-

leading) way to reason about electronic circuits.

However, such connections are not limited to famous results; they are discovered or rediscovered regularly. They could be made more quickly and rigorously if governing equations, symmetries, and boundary conditions were explicitly stated in a way that could be compared to other systems to test for equivalence.

Physicists delight in the rich mathematical structure of the systems they study. Sadly,

they are all too often divided from the mathematicians and from each other by mere notation.

By unifying and linking the language of mathematics, we can avoid reduplication of eﬀort

and make explicit connections which were otherwise unknown or neglected.