Python · GPL-3.0 · 711 LOC engine

[ math ] ninja · sha 2c8e64a · measured-in-repo

I differentiate calculus exactly, depending on nothing.

ninja is a from-scratch computer algebra system in pure Python. Its core is a symbolic differentiation engine that parses an expression as text and returns the exact derivative, simplified, never numeric. No SymPy, no NumPy, no parser library in the symbolic core.

[ ninja ], noun

1. a pure-Python computer algebra system, no deps in the core

2. parse, derive, print, simplify; symbolic, not numeric

3. exact rational coefficients, checked rule by rule


engine size: 711 lines of pure Python, the differentiation engine hand-written
dependencies: 0 in the symbolic core, stdlib only
correctness: 11/11 derivatives match the hand-derived answer, coefficients exact

lang: Python
license: GPL-3.0
size: 1,430 LOC / 5 modules
baseline: hand-derived answer
field std: SymPy (sympy.diff)
verdict: measured-in-repo

the hardest rule, made physical

x^x is the hardest case the engine handles. It has no plain power rule and no plain exponential rule, so the power node differentiates it by logarithmic differentiation: two terms, x^x · ln(x) and x · x^(x−1), summed and printed exact as x^x*ln(x)+x*x^(x−1).
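For reference, the two terms fall out of the standard logarithmic-differentiation identity for a general power f^g. This is the textbook derivation, written out as a worked equation; it is not copied from the engine's source.

```latex
\frac{d}{dx}\, f^{g}
  = \frac{d}{dx}\, e^{g \ln f}
  = f^{g}\left( g' \ln f + g\,\frac{f'}{f} \right)
  = f^{g}\, g' \ln f \;+\; g\, f^{\,g-1} f'

% With f = g = x, so f' = g' = 1:
\frac{d}{dx}\, x^{x} = x^{x} \ln x + x \cdot x^{\,x-1}
```

Since x · x^(x−1) = x^x, this agrees with the more familiar closed form x^x (ln x + 1).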

problem

Most math toolkits wrap SymPy or stop at numbers

Most "I built a math toolkit" projects wrap SymPy or NumPy, or they stop at plugging numbers into a formula. I wanted the hard part: take a math expression as a plain string, build it into a tree, and differentiate it symbolically and exactly, then print the simplified result back as math.

Exactly means the coefficients stay integer or rational the whole way through. D("(x^2+1)^3") has to come out as 6x^5+12x^3+6x, not a float approximation that drifts in the last digit. That constraint rules out the easy paths and forces real algebra.
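To make the exactness constraint concrete, here is a minimal sketch that is not ninja's code. It expands (x^2+1)^3 over integer coefficient lists (lowest degree first) and differentiates term by term, so no float ever appears. The helper names `poly_mul` and `poly_derive` are hypothetical.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_derive(p):
    """Power rule per term: c*x^k -> (k*c)*x^(k-1)."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


inner = [1, 0, 1]                               # x^2 + 1
cube = poly_mul(poly_mul(inner, inner), inner)  # (x^2 + 1)^3, fully expanded
derivative = poly_derive(cube)

print(cube)        # [1, 0, 3, 0, 3, 0, 1]  ->  x^6 + 3x^4 + 3x^2 + 1
print(derivative)  # [0, 6, 0, 12, 0, 6]    ->  6x^5 + 12x^3 + 6x
```

Every intermediate value is a Python int, which is arbitrary precision, so the coefficients 6, 12, 6 are exact rather than rounded.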

approach

An algebraic type system over exact rationals

I model an expression as a small algebraic type system: Polynomial (a rational function carried as numerator and denominator coefficient lists), plus trig, log, ln, power, and a mixed_func node for the + − * / operators. A handwritten recursive-descent parser builds the tree, splitting on operator precedence with parenthesis-balance tracking, and it catches implicit multiplication like 2x, (..)(..), and x sin x.
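The idea of splitting on operator precedence with parenthesis-balance tracking can be sketched as below. This is an illustrative guess at the technique, not the parser from differentiate.py: the function name `split_top_level` is hypothetical, and it ignores implicit multiplication and most unary-minus cases.

```python
def split_top_level(expr, ops):
    """Split expr at the rightmost operator in `ops` that sits outside all parentheses.

    Scanning right to left keeps '-' and '/' left-associative.
    Returns (left, op, right), or None if no such operator exists.
    """
    depth = 0
    for i in range(len(expr) - 1, -1, -1):
        ch = expr[i]
        if ch == ")":
            depth += 1
        elif ch == "(":
            depth -= 1
        elif depth == 0 and ch in ops and i > 0:  # i > 0 skips a leading sign
            return expr[:i], ch, expr[i + 1:]
    return None


# Try the lowest-precedence operators first, then the next level.
print(split_top_level("(x+1)*x-3", "+-"))    # ('(x+1)*x', '-', '3')
print(split_top_level("(x+1)*(x-3)", "+-"))  # None: every +/- is inside parentheses
print(split_top_level("(x+1)*(x-3)", "*/"))  # ('(x+1)', '*', '(x-3)')
```

A recursive-descent parser built this way would call the splitter with "+-" first, fall back to "*/", then "^", and recurse on each half until only atoms remain.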

Each node implements two methods. derive() encodes one calculus rule; ptf() (print-to-format) renders the node back to a string. The trig, log, and ln nodes apply the chain rule directly: sin maps to cos(arg) · arg', tan to arg' / cos², and a power tower f^g uses logarithmic differentiation. Underneath, a hand-rolled fraction layer and a polynomial GCD by Euclid keep every coefficient exact.
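The two-method node contract can be illustrated with a toy tree. The classes below are a hypothetical sketch that borrows only the method names derive() and ptf() from the description above; ninja's real nodes (Polynomial, trig, mixed_func and so on) are structured differently and are not reproduced here.

```python
class Const:
    def __init__(self, value):
        self.value = value
    def derive(self):
        return Const(0)
    def ptf(self):
        return str(self.value)

class Var:
    def derive(self):
        return Const(1)
    def ptf(self):
        return "x"

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def derive(self):                       # sum rule
        return Add(self.left.derive(), self.right.derive())
    def ptf(self):
        return f"{self.left.ptf()}+{self.right.ptf()}"

class Mul:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def derive(self):                       # product rule: u'v + uv'
        return Add(Mul(self.left.derive(), self.right),
                   Mul(self.left, self.right.derive()))
    def ptf(self):
        return f"({self.left.ptf()})*({self.right.ptf()})"

class Cos:
    def __init__(self, arg):
        self.arg = arg
    def derive(self):                       # chain rule: -sin(u) * u'
        return Mul(Const(-1), Mul(Sin(self.arg), self.arg.derive()))
    def ptf(self):
        return f"cos({self.arg.ptf()})"

class Sin:
    def __init__(self, arg):
        self.arg = arg
    def derive(self):                       # chain rule: cos(u) * u'
        return Mul(Cos(self.arg), self.arg.derive())
    def ptf(self):
        return f"sin({self.arg.ptf()})"

raw = Sin(Mul(Var(), Var())).derive().ptf()
print(raw)  # (cos((x)*(x)))*((1)*(x)+(x)*(1))
```

The raw string is a correct derivative of sin(x*x) but far from readable, which is why a canonicalizing and cleanup stage has to follow derive() before the result reads as 2x*cos(x^2).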

architecture

One pipeline, five stages, five flat modules

The engine is one pipeline in differentiate.py: a string goes in, an exact simplified derivative comes out. The whole project is five flat modules. The engine carries most of the weight; the rest is the rendering, plotting, and number-theory surface around it.

parse()	string to node tree, recursive-descent
count()	canonicalize the tree
.derive()	per-node calculus rule
.ptf()	print-to-format
simplify()	string cleanup

Figure: the ninja differentiation architecture. The five-stage pipeline (parse, count, derive, print-to-format, simplify) runs over an exact-rational substrate of hand-rolled fraction operations, polynomial GCD, and long division. Below the pipeline, a worked example shows the parse tree for sin(x^2) resolving through the chain rule into the exact result 2x*cos(x^2). Recolored from the repo's own architecture diagram at sha 2c8e64a.

differentiate.py	the engine: parser, algebraic type system, exact-rational substrate	711 LOC
utilities.py	Unicode/ANSI math-notation renderer, symbolic-constant detection	330 LOC
number.py	stdlib-only number theory: primality, factorization, base conversion	213 LOC
plot.py	plotly wrapper for 2D/3D/quiver derivative fields	108 LOC
root.py	Secant-method root finding plus Vieta's formulas	68 LOC

tradeoffs · road not taken

Choices that cost robustness, kept in on purpose

The whole point was to learn the math by building it, so I made choices that cost robustness on purpose. I kept them in. Each names what I chose and what it costs.

[01] road not taken
I rewrote rational arithmetic by hand
Instead of fractions.Fraction or numpy.polynomial, the engine carries its own fraction add, subtract, multiply, divide and a polynomial GCD by Euclid. Writing it by hand is the learning, and it keeps the symbolic core dependent on nothing. It is also more code to get right than the library call would have been.
[02] road not taken
Simplification rewrites strings, not a normalized tree
simplify() cleans up the output at the string level rather than normalizing the AST. So D("x^2/(x+1)") comes out as (x^2+2x)/(x^2+2x+1): exact and correct, but not factored back to (x+1)² in the denominator. The answer is right; the presentation is not always minimal.
[03] road not taken
It is an exploration, not a packaged library
Five flat modules, no package layout, no test suite, one squashed commit. The constant e is not supported. I built ninja to understand symbolic differentiation, not to ship it for reuse, and the structure shows that honestly.
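To give a sense of what choice [01] involves, here is a rough sketch of hand-rolled rational arithmetic and a Euclidean polynomial GCD. It is not the code from differentiate.py: the representation (fractions as `(num, den)` int tuples, polynomials as lists of such tuples, highest degree first) and every function name are assumptions made for illustration.

```python
def int_gcd(a, b):
    while b:
        a, b = b, a % b
    return abs(a)

def frac_norm(n, d):
    """Reduce n/d to lowest terms with a positive denominator."""
    if d == 0:
        raise ZeroDivisionError("zero denominator")
    g = int_gcd(n, d)
    if d < 0:
        g = -g
    return (n // g, d // g)

def frac_add(a, b): return frac_norm(a[0] * b[1] + b[0] * a[1], a[1] * b[1])
def frac_sub(a, b): return frac_norm(a[0] * b[1] - b[0] * a[1], a[1] * b[1])
def frac_mul(a, b): return frac_norm(a[0] * b[0], a[1] * b[1])
def frac_div(a, b): return frac_norm(a[0] * b[1], a[1] * b[0])

def poly_trim(p):
    """Drop leading zero coefficients, keeping at least one term."""
    i = 0
    while i < len(p) - 1 and p[i][0] == 0:
        i += 1
    return p[i:]

def poly_rem(a, b):
    """Remainder of a / b by long division; b must have a non-zero leading term."""
    a = list(a)
    while len(a) >= len(b):
        factor = frac_div(a[0], b[0])
        for i in range(len(b)):
            a[i] = frac_sub(a[i], frac_mul(factor, b[i]))
        a.pop(0)  # the leading coefficient is now exactly zero
    return poly_trim(a) if a else [(0, 1)]

def poly_gcd(a, b):
    """Euclid's algorithm on polynomials; returns the monic GCD (inputs not both zero)."""
    a, b = poly_trim(a), poly_trim(b)
    while any(n != 0 for n, _ in b):
        a, b = b, poly_rem(a, b)
    return [frac_div(c, a[0]) for c in a]

x2_minus_1 = [(1, 1), (0, 1), (-1, 1)]     # x^2 - 1
x_plus_1_sq = [(1, 1), (2, 1), (1, 1)]     # x^2 + 2x + 1
print(frac_add((1, 2), (1, 3)))            # (5, 6)
print(poly_gcd(x2_minus_1, x_plus_1_sq))   # [(1, 1), (1, 1)]  ->  x + 1
```

Even this small version has to get sign normalization, zero remainders, and leading-zero trimming right, which is the "more code to get right" cost named in [01].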

The check below carries the same honesty: there is no in-repo speed benchmark against SymPy, so I claim none. The evidence is correctness.

benchmark · vs the hand-derived answer

Checked against the answer I derived by hand

The field-standard pure-Python CAS is SymPy, and ninja's differentiator is a from-scratch alternative to sympy.diff.

I ran D() on 11 expressions that exercise every rule the engine implements, then set each output beside the answer I derived by hand. All 11 match, and the coefficients stay integer or rational throughout. Latency is from time.perf_counter on CPython, single thread, shown only as supporting detail.
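A check of this shape could be scripted roughly as follows. This is a sketch under stated assumptions, not the script behind the numbers reported here: the helper names, the repeat count, and the lookup-table stand-in are all hypothetical. In the repo the callable would be D() from differentiate.py.

```python
import time


def mean_latency_us(func, arg, repeats=200):
    """Average wall-clock time per call in microseconds, via time.perf_counter."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(arg)
    return (time.perf_counter() - start) / repeats * 1e6


def run_cases(differentiate, cases, repeats=200):
    """Compare each output with the hand-derived answer and time the call."""
    rows = []
    for expr, expected in cases:
        got = differentiate(expr)
        rows.append((expr, got, got == expected,
                     mean_latency_us(differentiate, expr, repeats)))
    return rows


# Stand-in so the sketch runs without the repo; it only looks answers up.
hand_derived = {"x^5": "5x^4", "cos(x)": "-sin(x)", "ln(x)": "1/x"}
stand_in = hand_derived.get

rows = run_cases(stand_in, list(hand_derived.items()))
for expr, got, ok, us in rows:
    print(f"{expr}\t{got}\t{'match' if ok else 'MISMATCH'}\t{us:.1f}")
```

Comparing strings is strict: two algebraically equal answers printed in different forms would count as a mismatch, so a check like this only works when the expected answers are written in the engine's own output style.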

11/11 expressions match the hand-derived answer

coefficients integer / rational · no float drift

Eleven expressions differentiated by ninja's engine at sha 2c8e64a. Each row shows the input, the engine's output, the calculus rule exercised, whether it matches the hand-derived answer, and the per-call latency in microseconds.
input	D(input) = engine output	rule	vs answer	µs/call
x^3+2*x	3x^2+2	power + sum	match ✓	295.1
3*x^4-5*x^2+7	12x^3-10x	power + difference	match ✓	463.4
(x^2+1)^3	6x^5+12x^3+6x	chain (expanded exact)	match ✓	295.3
sin(x^2)	2x*cos(x^2)	chain + trig	match ✓	123.5
cos(x)	-sin(x)	trig	match ✓	61.4
x*ln(x)	1+ln(x)	product + log	match ✓	146.7
ln(x)	1/x	log	match ✓	64.4
x^2/(x+1)	(x^2+2x)/(x^2+2x+1)	quotient (GCD-reduced)	match ✓	323.1
tan(x)/x	((x)/((cos(x))^2)-tan(x))/(x^2)	quotient + trig	match ✓	176.3
x^x	(x)^(x)*ln(x)+x*(x)^(x-1)	power-tower (log-diff)	match ✓	269.7
x^5	5x^4	power	match ✓	166.0

§ correctness · n=11 · baseline the hand-derived calculus answer · sha 2c8e64a
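As an example of what "hand-derived" means for one row, the quotient case x^2/(x+1) works out as below. This is the standard quotient rule written out step by step, with the denominator left expanded the way the engine prints it.

```latex
\frac{d}{dx}\,\frac{x^{2}}{x+1}
  = \frac{2x\,(x+1) - x^{2}\cdot 1}{(x+1)^{2}}
  = \frac{2x^{2} + 2x - x^{2}}{x^{2} + 2x + 1}
  = \frac{x^{2} + 2x}{x^{2} + 2x + 1}
```

The numerator x^2+2x = x(x+2) shares no factor with (x+1)^2, so the fraction is already in lowest terms; only the denominator's presentation differs from the factored form.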

proof · the hard parts

The code that proves it is real algebra

Five places where the engine does the work a wrapper skips. Each links into the source at the pinned SHA.

demo · playground

Try it on your own expression

The playground runs the real D() from differentiate.py on whatever you type, rendered with ninja's own Unicode math formatter. It is pre-seeded with the verified examples, so it always shows genuine engine output, not mocked results.

