a system for doing mathematics with Python
Note: I am no longer maintaining Pythonica, and the version of the code linked directly from this page contains some bugs. For the latest version, including bug fixes, please see Kevin Ballard's page at http://www.tildesoft.com/Pythonica.html. -- JJS
But the principles of a symbolic math program such as this are not all that difficult to follow, and in a language such as Python, may be implemented in a matter of hours. The real value of a commercial program lies in the many thousands of mathematical functions, identities, and transformation rules they contain -- the embodiment of centuries of mathematical research. But the barest functional skeleton of such a program may require only a few pages of code.
Pythonica is a Python implementation of a symbolic math program, based upon the fantastic precedent set by Mathematica. In this paper, the functionality and simple usage rules are first presented. Then the principles behind the program are discussed, along with the details of implementing these principles in Python. Future directions for extending Pythonica are also considered.
In := x*5 - 6 x / 2 in FullForm: Minus[Times[x,5],Times[6,Divide[x,2]]] evaluates to: Plus[Times[5,x],Times[-1,Times[6,Divide[x,2]]]] simplifies to: Plus[Times[5,x],Times[-6,Times[0.5,x]]] simplifies to: Times[2,x] Out = 2 x In := Out /. x->21 in FullForm: ReplaceAll[Out,Rule[x,21]] evaluates to: 42 Out = 42 In := % / 7 + x in FullForm: Plus[Divide[Out[-1],7],x] evaluates to: Plus[x,6] Out = x + 6 In := (%^2)-7 /. x->1 in FullForm: ReplaceAll[Minus[Power[Out[-1],2],7],Rule[x,1]] evaluates to: 42 Out = 42
|Figure 1: Sample Pythonica session.|
The user enters a function, equation, or other expression. Pythonica parses this "natural" input into a cononical expression form known as "FullForm". The expression is then evaluated, generating a result. The result is simplified as much as the program is able, then printed to the user -- both in FullForm, and in "natural" form. (The intermediate steps are displayed to make it more apparent what is happening; minor changes to the code could surpress this intermediate output.)
Out of the box, Pythonica can parse expressions containing the basic arithmatic operators (+,-,*,/,^), plus the assignment and unassignment operators (= and =.), the equality operator (==), the rule-definition operator (->), and the rule-application operator (/.). The latter two may require some explanation. A rule is similar to an assignment, but assigns value temporarily, only within the scope of an application. Thus "x=y" defines x as equal to y until further notice, but "x->y" means something more like "suppose x equals y just for this example." To apply a rule, you use the "/." operator, which can be read "given" or "supposing". So the input "x^2 /. x->4" is read "x squared, supposing that x is 4" and evaluates to 16. After this input is evaluated, x is undefined (or retains its previous definition), because no permanent assignment was involved. In addition, the parser converts '%' to Out[-1] (the previous result), '%%' to Out[-2], and '%n' to Out[n], following the Mathematica convention.
When faced with a complex expression, Pythonica will attempt to simplify it. It is programmed with only a few techniques, including the combination of common factors or terms in a sum, and "flattening" a nested operation (e.g., 2 x * (3 y * 5) can be flattened to 2x * 3y * 5).
Everything in Pythonica is stored in this form: function calls, commands, operations, etc. So when you enter "40+2", this is immediately converted to "Plus[40,2]". Note that the argument of an expression are themselves expressions; they can be nested arbitrarily. Since function calls (e.g., "Sin[x]" are expressions, they may be passed as arguments to another expression. This is the key principle upon which symbolic computation is based.
|Figure 2: Pythonica processing steps.|
Actually, expressions may always be entered in their conical (full) form. You could actually type "Times[6,7]" instead of "6*7", for example. But the latter is easier and more natural for us to use. So a front-end parser is provided, which parses your natural input and converts it to the internal representation. The input can then be evaluated; all built-in expression types (such as "Plus" and "Times") are programmed to act on their arguments and return a new result. Other expressions may simply evaluate to themselves. Next, the expression resulting from this evaluation is simplified. Again, built-in expression types are programmed with ways to simplify themselves or their arguments. Finally, the simplified expression is "unparsed" -- that is, converted from the full form to a form more readable by humans. Instead of "Power[x,2]", the unparser gives us "x^2".
Simplification is the most difficult and important step of a symbolic math system. There are two basic approaches. First, an expression type (such as "Plus") may have some simplification rules hard-coded into it. In Pythonica, this type of code checks for Times expressions and combines them where possible, so that Plus[Times[3,x],Times[5,x]] is converted to Times[8,x] -- even though during the evaluation step, Plus only knows how to add numbers. This sort of hard-coded simplification is very efficient, which is one reason why Pythonica is so fast. But it is rather inflexible, difficult to extend, and may be prone to programming errors.
The other approach to simplification is to provide tables of transformation rules. For the above example, the rule would be something like "n*a + m*a can be replaced with (n+m)*a". The software would have to check each rule to see whether it can be made to match some part of the current expression by choosing appropriate substitutions for the variables (n, m, and a). When a matching rule is found, it is applied, and the process repeats until no more mathes are found. This approach is more general and flexible, but it requires sophisticated pattern-matching algorithms, and it is much slower than the hard-coded approach. Pythonica does not currently contain transformation tables, and relies solely on hardcoded simplification.
__init__(self,data=None,head=None): set the expression head and arguments (data) __str__(self): return self in FullForm representation __cmp__(self,other): return 0 if head and data match other, else return 1 Head(self): return the head of this expression Eval(self): evaluate self, and return new expression Simplify(self): simplify self, and return new expression
|Figure 3: Methods of class Expr.|
All the code defining various expression types, along with their evaluation and simplification methods, is contained in pythonicacore.py. All expressions derive from class Expr, whose methods are listed in Figure 3. An expression's data is always a list of expressions, with two exceptions: ExprNumber's data is a number, and ExprSymbol's data is a string.
Evaluation of most expression types is fairly straightforward. First, its data are evaluated; then these data are combined according the the type of expression (they are added in ExprPlus, multiplied in ExprTimes, etc.). The result is returned as an ExprNumber if possible; but if the data include symbols whose values cannot be found, the result will generally be of the same type as the original expression (e.g., ExprPlus), with arguments containing both symbols and numbers.
Values are assigned to user-defined symbols through Set, and removed through Unset. These values are stored in a global dictionary, gSymTable. The functionality included in Pythonica has not needed the use of local symbol tables, though these would be needed if the system were developed much further (e.g., with transformation patterns or user-defined functions). The ReplaceAll function (operator "/.") uses a trick to imitate local variables: the previous value of the changed symbol is saved, then restored after evaluation.
In fact, the parser could probably use an overhaul. Grouping rules are handled poorly in the current implementation; many more operators would require a more uniform treatment.
The most glaring limitation of Pythonica at this point is the inability to handle user-defined functions. That is, "f[x_]:=x*2" should define a function "f", which can then be applied (f==8). This would require introducing local variables, but should not be too difficult to implement.
The next step would be the addition of patterns and transformations. A pattern is an expression with placeholders. A pattern matches an expression if the placeholders can be assigned values that make the pattern and the expression equal. A transformation is a pair of patterns; an expression is transformed by matching the first pattern, then substituting the placeholder values into the second. Once this functionality is in place, it would be sensible to start adding transformations to the evaluation/simplification process. While this would be a major job, it would make Pythonica into a powerful symbolic math program. It could then be readily extended by adding more functions and transformation rules.