Pythonica

a system for doing mathematics with Python

Download:
pythonica.py
pythonicacore.py
tostr.py

Note: I am no longer maintaining Pythonica, and the version of the code linked directly from this page contains some bugs. For the latest version, including bug fixes, please see Kevin Ballard's page at http://www.tildesoft.com/Pythonica.html. -- JJS


Introduction

Computers are exceedingly good at arithmatic, and software for doing numerical calculation is plentiful. However, though computers can also be quite good at abstract mathematics, software applications for symbolic math are more difficult to come by. Perhaps the most widely known and powerful program for symbolic math is Mathematica, which grants its users access to a fast, accurate, and very knowledgable mathematicion.

But the principles of a symbolic math program such as this are not all that difficult to follow, and in a language such as Python, may be implemented in a matter of hours. The real value of a commercial program lies in the many thousands of mathematical functions, identities, and transformation rules they contain -- the embodiment of centuries of mathematical research. But the barest functional skeleton of such a program may require only a few pages of code.

Pythonica is a Python implementation of a symbolic math program, based upon the fantastic precedent set by Mathematica. In this paper, the functionality and simple usage rules are first presented. Then the principles behind the program are discussed, along with the details of implementing these principles in Python. Future directions for extending Pythonica are also considered.

In[1] :=   x*5 - 6 x / 2

in FullForm: Minus[Times[x,5],Times[6,Divide[x,2]]]
evaluates to: Plus[Times[5,x],Times[-1,Times[6,Divide[x,2]]]]
simplifies to: Plus[Times[5,x],Times[-6,Times[0.5,x]]]
simplifies to: Times[2,x]

Out[1] =   2 x

In[2] :=   Out[1] /. x->21

in FullForm: ReplaceAll[Out[1],Rule[x,21]]
evaluates to: 42

Out[2] =   42

In[3] :=   % / 7 + x

in FullForm: Plus[Divide[Out[-1],7],x]
evaluates to: Plus[x,6]

Out[3] =   x + 6

In[4] :=   (%^2)-7 /. x->1

in FullForm: ReplaceAll[Minus[Power[Out[-1],2],7],Rule[x,1]]
evaluates to: 42

Out[4] =   42
Figure 1: Sample Pythonica session.

Pythonica's Functionality

Pythonica follows Mathematica syntax for functions which they have in common; anything that works in Pythonica, will also work in Mathematica, though the reverse is of course untrue.

The user enters a function, equation, or other expression. Pythonica parses this "natural" input into a cononical expression form known as "FullForm". The expression is then evaluated, generating a result. The result is simplified as much as the program is able, then printed to the user -- both in FullForm, and in "natural" form. (The intermediate steps are displayed to make it more apparent what is happening; minor changes to the code could surpress this intermediate output.)

Out of the box, Pythonica can parse expressions containing the basic arithmatic operators (+,-,*,/,^), plus the assignment and unassignment operators (= and =.), the equality operator (==), the rule-definition operator (->), and the rule-application operator (/.). The latter two may require some explanation. A rule is similar to an assignment, but assigns value temporarily, only within the scope of an application. Thus "x=y" defines x as equal to y until further notice, but "x->y" means something more like "suppose x equals y just for this example." To apply a rule, you use the "/." operator, which can be read "given" or "supposing". So the input "x^2 /. x->4" is read "x squared, supposing that x is 4" and evaluates to 16. After this input is evaluated, x is undefined (or retains its previous definition), because no permanent assignment was involved. In addition, the parser converts '%' to Out[-1] (the previous result), '%%' to Out[-2], and '%n' to Out[n], following the Mathematica convention.

When faced with a complex expression, Pythonica will attempt to simplify it. It is programmed with only a few techniques, including the combination of common factors or terms in a sum, and "flattening" a nested operation (e.g., 2 x * (3 y * 5) can be flattened to 2x * 3y * 5).

Basic Principles

Everything is an expression. An expression is a term of the form

Head[expr1,expr2,...]

Everything in Pythonica is stored in this form: function calls, commands, operations, etc. So when you enter "40+2", this is immediately converted to "Plus[40,2]". Note that the argument of an expression are themselves expressions; they can be nested arbitrarily. Since function calls (e.g., "Sin[x]" are expressions, they may be passed as arguments to another expression. This is the key principle upon which symbolic computation is based.
  1. Parse input.

  2. Evaluate input.

  3. Simplify result.

  4. Unparse result.
Figure 2: Pythonica processing steps.

Actually, expressions may always be entered in their conical (full) form. You could actually type "Times[6,7]" instead of "6*7", for example. But the latter is easier and more natural for us to use. So a front-end parser is provided, which parses your natural input and converts it to the internal representation. The input can then be evaluated; all built-in expression types (such as "Plus" and "Times") are programmed to act on their arguments and return a new result. Other expressions may simply evaluate to themselves. Next, the expression resulting from this evaluation is simplified. Again, built-in expression types are programmed with ways to simplify themselves or their arguments. Finally, the simplified expression is "unparsed" -- that is, converted from the full form to a form more readable by humans. Instead of "Power[x,2]", the unparser gives us "x^2".

Simplification is the most difficult and important step of a symbolic math system. There are two basic approaches. First, an expression type (such as "Plus") may have some simplification rules hard-coded into it. In Pythonica, this type of code checks for Times expressions and combines them where possible, so that Plus[Times[3,x],Times[5,x]] is converted to Times[8,x] -- even though during the evaluation step, Plus only knows how to add numbers. This sort of hard-coded simplification is very efficient, which is one reason why Pythonica is so fast. But it is rather inflexible, difficult to extend, and may be prone to programming errors.

The other approach to simplification is to provide tables of transformation rules. For the above example, the rule would be something like "n*a + m*a can be replaced with (n+m)*a". The software would have to check each rule to see whether it can be made to match some part of the current expression by choosing appropriate substitutions for the variables (n, m, and a). When a matching rule is found, it is applied, and the process repeats until no more mathes are found. This approach is more general and flexible, but it requires sophisticated pattern-matching algorithms, and it is much slower than the hard-coded approach. Pythonica does not currently contain transformation tables, and relies solely on hardcoded simplification.

Implementation

The code is currently divided into two modules. In pythonica.py, the user's input is parsed into internal form. The parsing procedure is complicated by use of operators, which have various grouping and precedence rules. Once the input string has been converted to an expression (probably containing other expressions as arguments), the expression is evaluated, simplified, then "unparsed" for output. Unparsing is considerably easier than parsing, because it begins with a well-defined internal data structure rather than a human-entered string.
__init__(self,data=None,head=None):
	set the expression head and arguments (data)

__str__(self):
	return self in FullForm representation

__cmp__(self,other):
	return 0 if head and data match other,
	else return 1

Head(self):
	return the head of this expression

Eval(self):
	evaluate self, and return new expression

Simplify(self):
	simplify self, and return new expression
Figure 3: Methods of class Expr.

All the code defining various expression types, along with their evaluation and simplification methods, is contained in pythonicacore.py. All expressions derive from class Expr, whose methods are listed in Figure 3. An expression's data is always a list of expressions, with two exceptions: ExprNumber's data is a number, and ExprSymbol's data is a string.

Evaluation of most expression types is fairly straightforward. First, its data are evaluated; then these data are combined according the the type of expression (they are added in ExprPlus, multiplied in ExprTimes, etc.). The result is returned as an ExprNumber if possible; but if the data include symbols whose values cannot be found, the result will generally be of the same type as the original expression (e.g., ExprPlus), with arguments containing both symbols and numbers.

Values are assigned to user-defined symbols through Set, and removed through Unset. These values are stored in a global dictionary, gSymTable. The functionality included in Pythonica has not needed the use of local symbol tables, though these would be needed if the system were developed much further (e.g., with transformation patterns or user-defined functions). The ReplaceAll function (operator "/.") uses a trick to imitate local variables: the previous value of the changed symbol is saved, then restored after evaluation.

Future Directions

Pythonica is a bare shadow of the powerful programs which inspired it. It could be readily extended with more built-in functions (e.g., Sin[], Rand[], etc.). Some functions should include modifications to the parser and unparser as well, e.g., the input "{a,b,c}" should be converted automatically to "List[a,b,c]".

In fact, the parser could probably use an overhaul. Grouping rules are handled poorly in the current implementation; many more operators would require a more uniform treatment.

The most glaring limitation of Pythonica at this point is the inability to handle user-defined functions. That is, "f[x_]:=x*2" should define a function "f", which can then be applied (f[4]==8). This would require introducing local variables, but should not be too difficult to implement.

The next step would be the addition of patterns and transformations. A pattern is an expression with placeholders. A pattern matches an expression if the placeholders can be assigned values that make the pattern and the expression equal. A transformation is a pair of patterns; an expression is transformed by matching the first pattern, then substituting the placeholder values into the second. Once this functionality is in place, it would be sensible to start adding transformations to the evaluation/simplification process. While this would be a major job, it would make Pythonica into a powerful symbolic math program. It could then be readily extended by adding more functions and transformation rules.

Conclusion

Pythonica demonstrates how symbolic mathematics can be implemented on a computer. As a research tool, it is inadequate, but it may serve as a useful tutorial for those wishing to understand better how such programs operate. With additional work, it could even be made into a minor math tool useful to those who can't justify the purchase of expensive commercial software.

Author's Note

I encourage comments on Pythonica. Please send bug reports, feature wishes, and offers of help to joe@strout.net. I hope you enjoy Pythonica and find it amusing, interesting, or even useful.


http://www.strout.net/python/pythonica.html
Last Modified: 6/10/04 . . . . . Joe Strout