[ math ] ninja · sha 2c8e64a · measured-in-repo
ninja is a from-scratch computer algebra system in pure Python. Its core is a symbolic differentiation engine that parses an expression as text and returns the exact derivative, simplified, never numeric. No SymPy, no NumPy, no parser library in the symbolic core.
the hardest rule, made physical
x^x has no power rule and no exponential rule. The engine differentiates it by logarithmic differentiation, two terms summed, every coefficient kept exact.
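For reference, the logarithmic-differentiation step can be written out as a short derivation. This is standard calculus, not text taken from the repo; it shows where the two summed terms come from.

```latex
\begin{aligned}
y &= x^{x} \\
\ln y &= x \ln x \\
\frac{y'}{y} &= \ln x + 1 \\
y' &= x^{x}\ln x + x^{x} = x^{x}\ln x + x \cdot x^{x-1}
\end{aligned}
```

The last form is the general rule (f^g)' = f^g · ln(f) · g' + g · f^(g−1) · f' with f = g = x.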
x^x has no plain power rule and no plain exponential rule, so the power node differentiates it by logarithmic differentiation: two terms, x^x · ln(x) and x · x^(x−1), summed and printed exact as x^x*ln(x)+x*x^(x−1).
Most "I built a math toolkit" projects wrap SymPy or NumPy, or they stop at plugging numbers into a formula. I wanted the hard part: take a math expression as a plain string, build it into a tree, and differentiate it symbolically and exactly, then print the simplified result back as math.
Exactly means the coefficients stay integer or rational the whole way through. D("(x^2+1)^3") has to come out as 6x^5+12x^3+6x, not a float approximation that drifts in the last digit. That constraint rules out the easy paths and forces real algebra.
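A minimal sketch of what exact means in practice, assuming a polynomial is stored as a list of integer coefficients with the lowest degree first. The helper names `poly_mul` and `poly_derive` are hypothetical and are not the engine's API; the point is only that integer arithmetic on coefficient lists never touches a float.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_derive(a):
    """Power rule term by term: d/dx (c * x^k) = k * c * x^(k-1)."""
    return [k * c for k, c in enumerate(a)][1:] or [0]


base = [1, 0, 1]                              # x^2 + 1
cube = poly_mul(poly_mul(base, base), base)   # (x^2 + 1)^3, expanded
derivative = poly_derive(cube)
print(derivative)  # [0, 6, 0, 12, 0, 6], i.e. 6x + 12x^3 + 6x^5
```

Read highest degree first, the result is 6x^5+12x^3+6x, with every coefficient an integer.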
I model an expression as a small algebraic type system: Polynomial (a rational function carried as numerator and denominator coefficient lists), plus trig, log, ln, power, and a mixed_func node for the + − * / operators. A handwritten recursive-descent parser builds the tree, splitting on operator precedence with parenthesis-balance tracking, and it catches implicit multiplication like 2x, (..)(..), and x sin x.
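To illustrate the implicit-multiplication cases, here is a small sketch that makes the hidden `*` explicit in a token stream. It is a separate token pass written for illustration, not the repo's parser, which splits on operator precedence; the function name, the regex, and the `FUNCTIONS` set are all assumptions.

```python
import re

FUNCTIONS = {"sin", "cos", "tan", "ln", "log"}
# Integers, lowercase names, and single-character operators or parentheses.
TOKEN = re.compile(r"\d+|[a-z]+|[-+*/^()]")


def insert_implicit_mul(text):
    """Return the tokens of `text` with an explicit '*' wherever two operands touch."""
    out = []
    for tok in TOKEN.findall(text):
        if out:
            prev = out[-1]
            # A number, a variable, or ')' can end an operand; a function name cannot.
            prev_ends_operand = (
                prev == ")"
                or prev.isdigit()
                or (prev.isalpha() and prev not in FUNCTIONS)
            )
            # A number, a name (variable or function), or '(' can start one.
            tok_starts_operand = tok == "(" or tok.isdigit() or tok.isalpha()
            if prev_ends_operand and tok_starts_operand:
                out.append("*")
        out.append(tok)
    return out


print(insert_implicit_mul("2x"))       # ['2', '*', 'x']
print(insert_implicit_mul("x sin x"))  # ['x', '*', 'sin', 'x']
```

Note that this sketch relies on whitespace to separate `x` from `sin`; without the space, the name pattern would read `xsin` as one identifier.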
Each node implements two methods. derive() encodes one calculus rule; ptf() (print-to-format) renders the node back to a string. The trig, log, and ln nodes apply the chain rule directly: sin maps to cos(arg) · arg', tan to arg' / cos²(arg), and a power tower f^g uses logarithmic differentiation. Underneath, a hand-rolled fraction layer and a polynomial GCD by Euclid keep every coefficient exact.
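The two-method shape can be sketched as follows. Only the method names derive() and ptf() come from the description above; the classes `Monomial`, `Sum`, `Product`, `Sin`, and `Cos` are hypothetical stand-ins, and this sketch does no simplification.

```python
from dataclasses import dataclass


@dataclass
class Monomial:
    """coef * x^exp: a hypothetical one-term stand-in for the Polynomial node."""
    coef: int
    exp: int

    def derive(self):
        # Power rule; a constant differentiates to 0.
        if self.exp == 0:
            return Monomial(0, 0)
        return Monomial(self.coef * self.exp, self.exp - 1)

    def ptf(self):
        if self.exp == 0:
            return str(self.coef)
        coef = "" if self.coef == 1 else str(self.coef)
        return coef + ("x" if self.exp == 1 else f"x^{self.exp}")


@dataclass
class Sum:
    left: object
    right: object

    def derive(self):
        return Sum(self.left.derive(), self.right.derive())

    def ptf(self):
        return f"{self.left.ptf()}+{self.right.ptf()}"


def _wrap(node):
    """Parenthesize a Sum so it prints correctly inside a Product."""
    text = node.ptf()
    return f"({text})" if isinstance(node, Sum) else text


@dataclass
class Product:
    left: object
    right: object

    def derive(self):
        # Product rule: (uv)' = u'v + uv'
        return Sum(Product(self.left.derive(), self.right),
                   Product(self.left, self.right.derive()))

    def ptf(self):
        return f"{_wrap(self.left)}*{_wrap(self.right)}"


@dataclass
class Sin:
    arg: object

    def derive(self):
        # Chain rule: (sin u)' = u' * cos(u)
        return Product(self.arg.derive(), Cos(self.arg))

    def ptf(self):
        return f"sin({self.arg.ptf()})"


@dataclass
class Cos:
    arg: object

    def derive(self):
        # Chain rule: (cos u)' = -1 * u' * sin(u)
        return Product(Monomial(-1, 0), Product(self.arg.derive(), Sin(self.arg)))

    def ptf(self):
        return f"cos({self.arg.ptf()})"


print(Sin(Monomial(1, 2)).derive().ptf())  # 2x*cos(x^2)
print(Cos(Monomial(1, 1)).derive().ptf())  # -1*1*sin(x)
```

The second output is correct but unsimplified, which is the kind of result a cleanup pass has to tidy into -sin(x).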
The engine is one pipeline in differentiate.py: a string goes in, an exact simplified derivative comes out. The whole project is five flat modules. The engine carries most of the weight; the rest is the rendering, plotting, and number-theory surface around it.
sin(x^2) example below. Recolored from the repo's own architecture diagram at sha 2c8e64a.
The whole point was to learn the math by building it, so I made choices that cost robustness on purpose. I kept them in. Each names what I chose and what it costs.
Instead of fractions.Fraction or numpy.polynomial, the engine carries its own fraction add, subtract, multiply, divide and a polynomial GCD by Euclid. Writing it by hand is the learning, and it keeps the symbolic core dependent on nothing. It is also more code to get right than the library call would have been.
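As an illustration of the algorithm named above, here is a compact polynomial GCD by Euclid. For brevity it leans on `fractions.Fraction`, which is exactly the library the engine avoids, so treat it as a sketch of the technique rather than of the repo's code. Coefficient lists are lowest degree first, and all function names are hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Drop zero coefficients from the highest-degree end, keeping at least one."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def poly_mod(a, b):
    """Remainder of a divided by b, computed with exact rational coefficients."""
    a = [Fraction(c) for c in trim(a)]
    b = trim(b)
    while len(a) >= len(b) and a != [0]:
        factor = a[-1] / b[-1]          # cancels the leading term of a
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a.pop()                          # leading coefficient is now exactly 0
        a = trim(a) if a else [Fraction(0)]
    return a


def poly_gcd(a, b):
    """Monic GCD by Euclid's algorithm; assumes at least one input is nonzero."""
    a, b = trim(a), trim(b)
    while b != [0]:
        a, b = b, poly_mod(a, b)
    lead = Fraction(a[-1])
    return [Fraction(c) / lead for c in a]


print(poly_gcd([-1, 0, 1], [1, 2, 1]) == [1, 1])  # True: gcd(x^2-1, x^2+2x+1) = x+1
print(poly_gcd([0, 2, 1], [1, 2, 1]) == [1])      # True: x^2+2x and x^2+2x+1 share no factor
```

The second call uses the numerator and denominator from the x^2/(x+1) result: their GCD is 1, so there is nothing left to cancel.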
simplify() cleans up the output at the string level rather than normalizing the AST. So D("x^2/(x+1)") comes out as (x^2+2x)/(x^2+2x+1): exact and correct, but not factored back to (x+1)² in the denominator. The answer is right; the presentation is not always minimal.
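One plausible reading of why the denominator arrives expanded: if a rational function is carried as two coefficient lists, the quotient rule squares the denominator by multiplying the lists out, and nothing afterwards factors them again. The sketch below assumes that representation (lowest degree first); the helper names are hypothetical and this is not the engine's code.

```python
def mul(a, b):
    """Multiply two coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def sub(a, b):
    """Subtract b from a, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    out = [x - y for x, y in zip(a, b)]
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out


def derive(a):
    """Power rule applied to each coefficient."""
    return [k * c for k, c in enumerate(a)][1:] or [0]


def quotient_rule(num, den):
    """(n/d)' = (n'd - nd') / d^2, returned as two coefficient lists."""
    return sub(mul(derive(num), den), mul(num, derive(den))), mul(den, den)


new_num, new_den = quotient_rule([0, 0, 1], [1, 1])  # x^2 / (x + 1)
print(new_num)  # [0, 2, 1]  i.e. x^2 + 2x
print(new_den)  # [1, 2, 1]  i.e. x^2 + 2x + 1, the expanded (x+1)^2
```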
Five flat modules, no package layout, no test suite, one squashed commit. The constant e is not supported. I built ninja to understand symbolic differentiation, not to ship it for reuse, and the structure shows that honestly.
The check below carries the same honesty: there is no in-repo speed benchmark against SymPy, so I claim none. The evidence is correctness.
The field-standard pure-Python CAS is SymPy, and ninja's differentiator is a from-scratch alternative to sympy.diff.
I ran D() on 11 expressions that exercise every rule the engine implements, then set each output beside the answer I derived by hand. All 11 match, and the coefficients stay integer or rational throughout. Latency is from time.perf_counter on CPython, single thread, shown only as supporting detail.
11/11 expressions match the hand-derived answer
coefficients integer / rational · no float drift
| input | D(input) = engine output | rule | vs answer | µs/call |
|---|---|---|---|---|
| x^3+2*x | 3x^2+2 | power + sum | match ✓ | 295.1 |
| 3*x^4-5*x^2+7 | 12x^3-10x | power + difference | match ✓ | 463.4 |
| (x^2+1)^3 | 6x^5+12x^3+6x | chain (expanded exact) | match ✓ | 295.3 |
| sin(x^2) | 2x*cos(x^2) | chain + trig | match ✓ | 123.5 |
| cos(x) | -sin(x) | trig | match ✓ | 61.4 |
| x*ln(x) | 1+ln(x) | product + log | match ✓ | 146.7 |
| ln(x) | 1/x | log | match ✓ | 64.4 |
| x^2/(x+1) | (x^2+2x)/(x^2+2x+1) | quotient (GCD-reduced) | match ✓ | 323.1 |
| tan(x)/x | ((x)/((cos(x))^2)-tan(x))/(x^2) | quotient + trig | match ✓ | 176.3 |
| x^x | (x)^(x)*ln(x)+x*(x)^(x-1) | power-tower (log-diff) | match ✓ | 269.7 |
| x^5 | 5x^4 | power | match ✓ | 166.0 |
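The check described above can be sketched as a small harness. The repeat count, the averaging, and the function name `check` are assumptions, since the write-up only says the latency comes from time.perf_counter; a dictionary lookup stands in for the engine so the sketch runs without the repo.

```python
import time


def check(differentiate, cases, repeats=1000):
    """Compare each output to its expected string and time the call in microseconds."""
    rows = []
    for text, expected in cases:
        got = differentiate(text)
        start = time.perf_counter()
        for _ in range(repeats):
            differentiate(text)
        elapsed = time.perf_counter() - start
        rows.append((text, got, got == expected, elapsed / repeats * 1e6))
    return rows


# Stand-in for the engine: swap in the real D() to rerun the table's cases.
KNOWN = {"x^5": "5x^4", "cos(x)": "-sin(x)", "ln(x)": "1/x"}
rows = check(KNOWN.get, list(KNOWN.items()), repeats=100)

for text, got, ok, micros in rows:
    print(f"{text:8} {got:10} {'match' if ok else 'MISMATCH'} {micros:8.2f} us/call")
```

Timings from the stand-in say nothing about the engine; only the comparison logic carries over.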
Five places where the engine does the work a wrapper skips. Each links into the source at the pinned SHA.
The playground runs the real D() on whatever you type, rendered with ninja's own Unicode math formatter. It is pre-seeded with the verified examples, so it always shows genuine engine output, not mocked results.