Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Argument names for random.uniform are different in Numba and numpy #8161

Closed
2 tasks done
brunojacobs opened this issue Jun 14, 2022 · 5 comments · Fixed by #8250
Closed
2 tasks done

Argument names for random.uniform are different in Numba and numpy #8161

brunojacobs opened this issue Jun 14, 2022 · 5 comments · Fixed by #8250
Labels
bug - failure to compile Bugs: failed to compile valid code good first issue A good issue for a first time contributor

Comments

@brunojacobs
Copy link

brunojacobs commented Jun 14, 2022

  • I have tried using the latest released version of Numba (most recent is
    visible in the change log (https://github.com/numba/numba/blob/main/CHANGE_LOG).
  • I have included a self contained code sample to reproduce the problem.
    i.e. it's possible to run as 'python bug.py'.

I think I have discovered a very minor bug - or rather inconsistency with numpy - in Numba's implementation of np.random.uniform. In numpy the arguments for the lower- and upperbound are named low and high, in Numba they are named a and b. My expectation was that the argument names would be the same across Numba and numpy. Of course this problem can be completely avoided by specifying positional arguments instead of keyword arguments.

As a sidenote, Numba also doesn't accept a call to np.random.uniform without arguments, while in numpy this results in a draw from Uniform[0, 1]. I suspect this is because np.random.uniform mirrors Python's implementation of random.uniform as listed here: https://docs.python.org/3/library/random.html#random.uniform which also explains the use of the argument names a and b.

Please see below for a minimal working example.

Edit: Updated code for clarity.

import numpy as np
import numba as nb

def uniform_pos_arg():
    return np.random.uniform(0.0, 1.0)

uniform_pos_arg_njit = nb.njit(uniform_pos_arg)

def uniform_numpy_arg():
    return np.random.uniform(low=0.0, high=1.0)

uniform_numpy_arg_njit = nb.njit(uniform_numpy_arg)

def uniform_numba_arg():
    return np.random.uniform(a=0.0, b=1.0)

uniform_numba_arg_njit = nb.njit(uniform_numba_arg)

print('numba version:', nb.__version__)
print('numpy version:', np.__version__)

print(uniform_pos_arg())  # works

print(uniform_pos_arg_njit())  # works

print(uniform_numpy_arg())  # works
try:
    print(uniform_numpy_arg_njit())
except:
    print('Calling uniform_numpy_arg_njit() does not work: Numba expects arguments a and b')

try:
    print(uniform_numba_arg())
except:
    print('Calling uniform_numba_arg() does not work: Numpy expects arguments low and high')
print(uniform_numba_arg_njit())

Output:

numba version: 0.55.2
numpy version: 1.22.3
0.8939617128078611
0.9575756779190199
0.6867936141928337
Calling uniform_numpy_arg_njit() does not work: Numba expects arguments a and b
Calling uniform_numba_arg() does not work: Numpy expects arguments low and high
0.6859305961634644
@guilhermeleobas guilhermeleobas added bug - failure to compile Bugs: failed to compile valid code good first issue A good issue for a first time contributor labels Jun 15, 2022
@guilhermeleobas
Copy link
Collaborator

Hi @brunojacobs, thanks for the report. I can reproduce the error.

In fact, NumPy uses low and high to name uniform args, whereas Numba uses the argument of the typer function to determine the name. If one changes the typer arguments to low/high, it matches the NumPy interface and works as expected.

So, to solve this, one can clone the entire template and change the argument names in the typer function to match the NumPy API.

@glue_typing(np.random.uniform, typing_key="np.random.uniform")
class Random_binary_distribution_uniform(ConcreteRandomTemplate):
    cases = [signature(tp, tp, tp) for tp in _float_types]

    def generic(self):
        def typer(low, high, size=None):
            return self.array_typer(size)(low, high)
        return typer

It would be good to check if there are other functions in this module that have the same issue.

@stuartarchibald
Copy link
Contributor

xref: #8008 Item: "Naming of arguments to functions is sometimes inconsistent."

@Anika-Roy
Copy link

i would like to work on this.I am a beginner

@bszollosinagy
Copy link
Contributor

Okay, I have implemented the solutions suggested by @guilhermeleobas
Keyword parameter names are now consistent between Numba and Numpy for these random distributions:

  • np.random.uniform
  • np.random.gumbel
  • np.random.wald

Unit tests were also made to test that the named keyword arguments are properly handled.

The issue presently could not be solved for these:

  • np.random.f
  • np.random.vonmises

Reason: It turns out that their unit tests tested Numba against the Python implementation and not Numpy. In itself this is not a big problem, because the python implementations are also delivering the characteristics of the chosen random distribution, but in practice, if started from the same seed, the Numpy and Python implementations give differing samples (which makes the unit test fail, if you really want to compare Numba to Numpy and not python).

For example TestRandom.test_numpy_vonmises() compares the Numba implementation against the Python not the Numpy "uniform" distribution. It should use "_follow_numpy" instead of "_follow_cpython"

Why does this matter? Because the keyword arguments have different names in Python Random, and Numpy Random, so one cannot simply keep using the Python distributions as baselines if the keyword names are important. One also cannot use the Numpy versions, because they give different samples than Numba, if started from the same seed. This is most likely caused by Numba and Numpy using the random number generator slightly differently for the same algorithm. For example if you need three random numbers A,B and C, maybe one of them asks three samples and puts them into A,B and C, maybe the other simply initializes C before A and B, thus putting the first random sample into C instead of A. This results in a distribution with the appropriate characteristics, yet numerically different output. This issue only affects the Numba implementations of the distributions F and VonMises as far as I can tell.

Maybe this is not a big issue, and it is perfectly okay for Numba to give the same pseudo-random stream as Python, and not the same as Numpy. And someone could just make a wrapper function for testing, that wraps the Python functions to have the keywords used by Numpy.

@bszollosinagy
Copy link
Contributor

A long term solution to this issue is provided by using the new random API in Numpy, which is also available in Numba since PR #8040, and additional distributions are being added.

Using that API keyword parameters are handled just fine, as far as I could test, for all distributions.

This new Numpy API is often called BitGenerator and the Numpy project is slowly deprecating the old API.

So if you need the VonMises or the F distributions, then use the new BitGenerator based API, and keywords will work fine. For the rest, this issue should be fixed for the old API as well (ie. the simple calls to np.random.uniform for example)

Reference:
New random API in Numpy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug - failure to compile Bugs: failed to compile valid code good first issue A good issue for a first time contributor
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants