# Fastest way to calculate an inner product of binary vectors in numpy

April 10, 2017, at 2:12 PM

So I have `X` and `Y`, two binary vectors (every element is 0 or 1), each about 30000 elements long. What I want is a fast way to calculate `np.inner(X, Y)/np.sum(Y)`. I don't think much can be done about the `np.sum(Y)` part, which just counts how many 1s there are in `Y`. The `np.inner(X, Y)` part counts the number of 1s in `X` whose corresponding element (i.e. same index) in `Y` is also 1, i.e. it works out the same as

```python
def count_common_ones(X, Y):
    # Count the indices where both X and Y are 1.
    re = 0
    for i in range(X.shape[0]):
        if X[i] == 1 and Y[i] == 1:
            re += 1
    return re
```
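
As a quick sanity check of that equivalence, here is a minimal sketch. The random test vectors and the variable names are made up for illustration; they are not my real data.

```python
import numpy as np

# Hypothetical test data: two random 0/1 vectors of the size described above.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=30000)
Y = rng.integers(0, 2, size=30000)

# Explicit loop: count positions where both vectors hold a 1.
loop_count = sum(1 for x, y in zip(X, Y) if x == 1 and y == 1)

# The same count via the inner product of the 0/1 vectors.
inner_count = int(np.inner(X, Y))

# The same count via a boolean AND followed by counting the True entries.
mask_count = int(np.count_nonzero(X.astype(bool) & Y.astype(bool)))
```

All three counts should agree, since for 0/1 values the product `X[i] * Y[i]` is 1 exactly when both elements are 1.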

I think `np.inner` is the most straightforward way, and also the fastest I can think of within `numpy`. However, the performance is still not satisfactory: this calculation makes up most of the computational cost of a function that has to be run hundreds of times, and the total runtime comes to as long as 90s. So I've been thinking quite hard about optimising it. I tried `X @ Y.T`, which unsurprisingly didn't make much difference. I also checked that my `numpy` package is linked against the common optimised libraries such as `blas` and `mkl` (the only one unavailable is `openblas`). Is there any room left for speeding up this calculation?
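
One direction that might be worth timing is to pack each vector into bits and count the set bits of the bitwise AND. The following is only a sketch under assumptions, not a measured result: the helper names `pack` and `packed_overlap_ratio` and the `POPCOUNT` lookup table are hypothetical, and whether this beats `np.inner` will depend on the dtype of the arrays and on how often the packing has to be redone.

```python
import numpy as np

# Lookup table: number of set bits in each possible byte value 0..255.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)

def pack(v):
    """Pack a 0/1 vector into bytes, 8 elements per byte (hypothetical helper)."""
    return np.packbits(np.asarray(v, dtype=np.uint8))

def packed_overlap_ratio(x_packed, y_packed):
    """Return (#positions where both are 1) / (#ones in Y) from packed vectors."""
    both = POPCOUNT[x_packed & y_packed].sum(dtype=np.int64)
    y_ones = POPCOUNT[y_packed].sum(dtype=np.int64)
    return both / y_ones

# Illustrative random data, not the real vectors.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=30000)
Y = rng.integers(0, 2, size=30000)

ratio = packed_overlap_ratio(pack(X), pack(Y))
reference = np.inner(X, Y) / np.sum(Y)
```

The packing step only pays off if the packed form can be reused across iterations. Separately, as far as I know NumPy only hands `inner`/`dot` to BLAS for floating-point (and complex) dtypes, so if `X` and `Y` are stored as integers, comparing against `X.astype(np.float32)` and `Y.astype(np.float32)` may also be worth a quick timing.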

