How to calculate a Normal Distribution percent point function in python

How do I do the equivalent of scipy.stats.norm.ppf without using Scipy. I have python's Math module has erf built in but I cannot seem to recreate the function.

PS: I cannot just use scipy because Heroku does not allow you to install it and using alternate buildpacks breaches the 300Mb maximum slug size limit.

2 Answers

There's not a simple way to use erf to implement norm.ppf because norm.ppf is related to the inverse of erf. Instead, here's a pure Python implementation of the code from scipy. You should find that the function ndtri returns exactly the same value as norm.ppf:

import math
s2pi = 2.50662827463100050242E0
P0 = [ -5.99633501014107895267E1, 9.80010754185999661536E1, -5.66762857469070293439E1, 1.39312609387279679503E1, -1.23916583867381258016E0,
]
Q0 = [ 1, 1.95448858338141759834E0, 4.67627912898881538453E0, 8.63602421390890590575E1, -2.25462687854119370527E2, 2.00260212380060660359E2, -8.20372256168333339912E1, 1.59056225126211695515E1, -1.18331621121330003142E0,
]
P1 = [ 4.05544892305962419923E0, 3.15251094599893866154E1, 5.71628192246421288162E1, 4.40805073893200834700E1, 1.46849561928858024014E1, 2.18663306850790267539E0, -1.40256079171354495875E-1, -3.50424626827848203418E-2, -8.57456785154685413611E-4,
]
Q1 = [ 1, 1.57799883256466749731E1, 4.53907635128879210584E1, 4.13172038254672030440E1, 1.50425385692907503408E1, 2.50464946208309415979E0, -1.42182922854787788574E-1, -3.80806407691578277194E-2, -9.33259480895457427372E-4,
]
P2 = [ 3.23774891776946035970E0, 6.91522889068984211695E0, 3.93881025292474443415E0, 1.33303460815807542389E0, 2.01485389549179081538E-1, 1.23716634817820021358E-2, 3.01581553508235416007E-4, 2.65806974686737550832E-6, 6.23974539184983293730E-9,
]
Q2 = [ 1, 6.02427039364742014255E0, 3.67983563856160859403E0, 1.37702099489081330271E0, 2.16236993594496635890E-1, 1.34204006088543189037E-2, 3.28014464682127739104E-4, 2.89247864745380683936E-6, 6.79019408009981274425E-9,
]
def ndtri(y0): if y0 <= 0 or y0 >= 1: raise ValueError("ndtri(x) needs 0 < x < 1") negate = True y = y0 if y > 1.0 - 0.13533528323661269189: y = 1.0 - y negate = False if y > 0.13533528323661269189: y = y - 0.5 y2 = y * y x = y + y * (y2 * polevl(y2, P0) / polevl(y2, Q0)) x = x * s2pi return x x = math.sqrt(-2.0 * math.log(y)) x0 = x - math.log(x) / x z = 1.0 / x if x < 8.0: x1 = z * polevl(z, P1) / polevl(z, Q1) else: x1 = z * polevl(z, P2) / polevl(z, Q2) x = x0 - x1 if negate: x = -x return x
def polevl(x, coef): accum = 0 for c in coef: accum = x * accum + c return accum
3

The function ppf is the inverse of y = (1+erf(x/sqrt(2))/2. So we need to solve this equation for x, given y between 0 and 1. Here is a code doing this by the bisection method. I imported SciPy function to illustrate that the result is the same.

from math import erf, sqrt
from scipy.stats import norm # only for comparison
y = 0.123
z = 2*y-1
a = 0
while erf(a) > z or erf(a+1) < z: # looking for initial bracket of size 1 if erf(a) > z: a -= 1 else: a += 1
b = a+1 # found a bracket, proceed to refine it
while b-a > 1e-15: # 1e-15 ought to be enough precision c = (a+b)/2.0 # bisection method if erf(c) > z: b = c else: a = c
print sqrt(2)*(a+b)/2.0 # this is the answer
print norm.ppf(y) # SciPy for comparison

Left for you to do:

  • preliminary bound checks (y must be between 0 and 1)
  • scaling and shifting if other mean / variance are desired; the code is for standard normal distribution (mean 0, variance 1).
3

Your Answer

Sign up or log in

Sign up using Google Sign up using Facebook Sign up using Email and Password

Post as a guest

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

You Might Also Like