Common NumPy Interview Questions for Data Scientists

NumPy is a fundamental library for numerical computing in Python, providing support for multi-dimensional arrays and mathematical functions. In data science interviews, a strong grasp of NumPy is essential. Below, we present frequently asked NumPy interview questions with detailed explanations and code examples.


1. What is NumPy and why is it used?

NumPy (Numerical Python) is a library that provides efficient array operations, broadcasting, and mathematical functions. It is widely used in data science and machine learning for numerical computations.

Example:

import numpy as np

# Creating a NumPy array
arr = np.array([1, 2, 3, 4, 5])
print(arr)

Output:

[1 2 3 4 5]
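
As a small illustration of what "array" means here, the following sketch inspects the attributes that every NumPy array carries. The variable name matrix is illustrative only.

```python
import numpy as np

# A 2x3 array; every element shares a single dtype
matrix = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

print(matrix.ndim)   # number of dimensions: 2
print(matrix.shape)  # size along each dimension: (2, 3)
print(matrix.size)   # total number of elements: 6
print(matrix.dtype)  # element type: float64
```

The fixed dtype and known shape are what allow NumPy to run operations over the whole array in compiled code.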


2. What are the advantages of NumPy arrays over Python lists?

  • Faster computation, because array operations run in optimized, compiled C code.
  • Lower memory usage, because elements are stored in one contiguous, fixed-type buffer.
  • Vectorized operations that replace explicit Python loops.

Example:

import numpy as np
import time

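# Using a list
lst = list(range(1000000))
start = time.perf_counter()  # perf_counter is intended for measuring short durations
sum(lst)
print("List time:", time.perf_counter() - start)

# Using a NumPy array (the conversion itself is not timed)
arr = np.array(lst)
start = time.perf_counter()
np.sum(arr)
print("NumPy time:", time.perf_counter() - start)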

Output (exact timings vary by machine and run):

List time: 0.009779930114746094
NumPy time: 0.0010318756103515625
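
The memory claim can be checked in a similar way. The sketch below is a rough estimate only, assuming CPython: it adds the size of the list object to the sizes of its integer elements, and sets the array dtype explicitly so the result is predictable.

```python
import sys
import numpy as np

n = 1000
lst = list(range(n))
arr = np.arange(n, dtype=np.int64)  # dtype fixed explicitly so the size is predictable

# A list stores pointers to separate Python int objects (CPython-specific estimate)
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)

# An array stores the raw values in one contiguous buffer
array_bytes = arr.nbytes  # 8 bytes per int64 element

print("List bytes (approx.):", list_bytes)
print("Array bytes:", array_bytes)
```

The exact list figure depends on the Python version, but the array always needs 8 bytes per int64 element plus a small fixed header.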


3. How do you create different types of arrays in NumPy?

NumPy provides multiple ways to create arrays:

Example:

np.zeros((2,3))   # 2x3 matrix of zeros
np.ones((3,3))    # 3x3 matrix of ones
np.eye(4)         # 4x4 identity matrix
np.arange(0, 10, 2)  # Values from 0 up to (but not including) 10 with step 2
np.linspace(0, 1, 5)  # 5 evenly spaced values from 0 to 1, both endpoints included
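
The endpoint behavior of arange and linspace is a common follow-up question. Here is a short sketch with the printed results; the variable names are illustrative.

```python
import numpy as np

# arange excludes the stop value, like Python's range
evens = np.arange(0, 10, 2)
print(evens)       # [0 2 4 6 8]

# linspace includes both endpoints by default
points = np.linspace(0, 1, 5)
print(points)      # [0.   0.25 0.5  0.75 1.  ]

# The dtype argument controls the element type at creation time
int_zeros = np.zeros((2, 3), dtype=np.int32)
print(int_zeros.dtype)  # int32
```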


4. What is the difference between reshape() and ravel()?

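  • .reshape() returns the same data arranged in a new shape; the total number of elements must stay the same, and the result is a view of the original array whenever possible.
  • .ravel() flattens the array into a 1D array, also returning a view when possible (unlike .flatten(), which always returns a copy).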

Example:

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.reshape(3, 2))  # Reshape to 3x2
print(arr.ravel())  # Flatten the array

Output:

[[1 2]
 [3 4]
 [5 6]]
[1 2 3 4 5 6]
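
Interviewers often ask whether these methods copy the data. The sketch below uses np.shares_memory to check, and contrasts ravel() with flatten(); the variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

reshaped = arr.reshape(3, 2)   # view on the same buffer
raveled = arr.ravel()          # view here, because arr is contiguous
flattened = arr.flatten()      # always an independent copy

print(np.shares_memory(arr, reshaped))   # True
print(np.shares_memory(arr, raveled))    # True
print(np.shares_memory(arr, flattened))  # False

# Writing through a view changes the original array
raveled[0] = 99
print(arr[0, 0])        # 99
print(flattened[0])     # 1, the copy is unaffected

# -1 asks NumPy to infer that dimension from the total size
print(arr.reshape(-1, 2).shape)  # (3, 2)
```

When the array is not contiguous (for example after a transpose), ravel() has to fall back to a copy.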


5. How do you perform element-wise operations in NumPy?

NumPy allows vectorized operations without explicit loops.

Example:

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(a + b)  # Element-wise addition
print(a * b)  # Element-wise multiplication
print(a ** 2) # Squaring each element

Output:

[5 7 9]
[ 4 10 18]
[1 4 9]
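
To show what "without explicit loops" replaces, this sketch compares a Python-level loop with the vectorized expression, and also applies a scalar and a universal function element by element.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Explicit Python loop over paired elements
loop_result = [x * y for x, y in zip(a, b)]

# Vectorized equivalent: one expression, no Python-level loop
vector_result = a * b

print(loop_result == vector_result.tolist())  # True

# Scalars and universal functions also apply element by element
print(a * 10)           # [10 20 30]
print(np.sqrt(a ** 2))  # [1. 2. 3.]
```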


6. How do you filter values in a NumPy array?

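Use Boolean indexing to select the elements that satisfy a condition, or the np.where() function to choose between two values element by element.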

Example:

arr = np.array([10, 20, 30, 40, 50])
print(arr[arr > 25])  # Elements greater than 25
print(np.where(arr > 25, "High", "Low"))  # Conditional replacement

Output:

[30 40 50] 
['Low' 'Low' 'High' 'High' 'High']
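
Two common follow-ups are combining several conditions and getting positions instead of values. A minimal sketch, with illustrative variable names:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Combine conditions with & (and), | (or), ~ (not); the parentheses are required
in_range = arr[(arr > 15) & (arr < 45)]
print(in_range)          # [20 30 40]

# With a single argument, np.where returns the indices where the condition holds
indices = np.where(arr > 25)[0]
print(indices)           # [2 3 4]

# Boolean masks can also be used for assignment (on a copy here)
capped = arr.copy()
capped[capped > 25] = 25
print(capped)            # [10 20 25 25 25]
```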


7. How does broadcasting work in NumPy?

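Broadcasting allows arrays of different shapes to be used together in element-wise operations. NumPy compares the shapes starting from the last dimension; two dimensions are compatible when they are equal or when one of them is 1, and the smaller array is treated as if it were repeated along the dimensions it lacks.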

Example:

arr = np.array([[1, 2, 3], [4, 5, 6]])
vector = np.array([10, 20, 30])
print(arr + vector)  # Adds vector to each row

Output:

[[11 22 33]
 [14 25 36]]
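
The same mechanism works along the other axis, and it fails when the shapes are incompatible. The sketch below adds a column vector to each column and then triggers the error case; column and bad are illustrative names.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
column = np.array([[100], [200]])        # shape (2, 1)

# (2, 3) with (2, 1): the size-1 dimension is stretched across the columns
print(arr + column)
# [[101 102 103]
#  [204 205 206]]

# (2, 3) with (2,): trailing dimensions 3 and 2 differ and neither is 1
bad = np.array([1, 2])
try:
    arr + bad
    broadcast_ok = True
except ValueError:
    broadcast_ok = False
print(broadcast_ok)  # False
```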


8. How do you compute statistics using NumPy?

NumPy provides built-in functions for statistics.

Example:

arr = np.array([1, 2, 3, 4, 5])
print(arr.mean())  # Mean
print(arr.std())   # Population standard deviation (ddof=0 by default)
print(arr.min(), arr.max())  # Min and max values

Output:

3.0 
1.4142135623730951 
1 5
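
Two parameters come up often with these functions: ddof, which switches between population and sample statistics, and axis, which selects the direction of aggregation. A short sketch, with table as an illustrative 2D array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
table = np.array([[1, 2, 3], [4, 5, 6]])

# ddof=1 divides by n - 1 (sample standard deviation) instead of n
sample_std = arr.std(ddof=1)
print(sample_std)            # about 1.5811

# axis=0 aggregates down the rows (one result per column)
print(table.mean(axis=0))    # [2.5 3.5 4.5]

# axis=1 aggregates across the columns (one result per row)
print(table.mean(axis=1))    # [2. 5.]

print(np.median(arr))        # 3.0
```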


9. How do you generate random numbers in NumPy?

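Use the np.random module. The functions below belong to its legacy interface, which relies on a global seed; newer code often creates a generator with np.random.default_rng() instead.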

Example:

np.random.seed(42)  # Set random seed for reproducibility
print(np.random.rand(3, 3))  # Random floats in [0, 1)
print(np.random.randint(1, 10, size=(3,3)))  # Random integers from 1 to 9 (10 is excluded)

Output:

[[0.37454012 0.95071431 0.73199394]
 [0.59865848 0.15601864 0.15599452]
 [0.05808361 0.86617615 0.60111501]]
[[8 3 6]
 [5 2 8]
 [6 2 5]]
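
For comparison, here is a minimal sketch of the same task using the Generator interface from np.random.default_rng. The values it produces differ from the seeded legacy output above, so none are listed here.

```python
import numpy as np

# Generator with its own seeded state (no global seed involved)
rng = np.random.default_rng(42)

floats = rng.random((3, 3))               # uniform floats in [0, 1)
ints = rng.integers(1, 10, size=(3, 3))   # integers from 1 to 9 (10 is excluded)

# A second generator with the same seed reproduces the same stream
rng_again = np.random.default_rng(42)
same_floats = rng_again.random((3, 3))

print(np.array_equal(floats, same_floats))  # True
```

Keeping the state inside rng rather than in a global seed makes it easier to reason about reproducibility when several parts of a program draw random numbers.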


10. How do you perform matrix operations in NumPy?

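Use np.dot() or the @ operator for matrix multiplication; for 2D arrays the two give the same result. Note that * performs element-wise multiplication, not a matrix product.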

Example:

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(np.dot(A, B))  # Matrix multiplication
print(A @ B)  # Alternative syntax

Output:

[[19 22]
 [43 50]]
[[19 22]
 [43 50]]
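
A few related operations tend to appear in the same question. The sketch below contrasts * with the matrix product and uses the transpose, determinant, and inverse from np.linalg on the same matrices.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# * is element-wise, not a matrix product
print(A * B)
# [[ 5 12]
#  [21 32]]

print(A.T)                 # transpose
# [[1 3]
#  [2 4]]

det = np.linalg.det(A)     # about -2.0 (floating-point result)
A_inv = np.linalg.inv(A)

# A times its inverse is the identity, up to rounding error
print(np.allclose(A @ A_inv, np.eye(2)))  # True
```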


11. How do you find unique values and counts in an array?

Use np.unique() to get unique elements and their counts.

Example:

arr = np.array([1, 2, 2, 3, 3, 3, 4])
values, counts = np.unique(arr, return_counts=True)
print(values, counts)

Output:

[1 2 3 4] [1 2 3 1]
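
The values and counts can be combined to answer a typical follow-up, "which value is most frequent?". A minimal sketch; mode and frequency are illustrative names.

```python
import numpy as np

arr = np.array([1, 2, 2, 3, 3, 3, 4])
values, counts = np.unique(arr, return_counts=True)

# Most frequent value: the entry whose count is largest
mode = values[np.argmax(counts)]
print(mode)  # 3

# Pair each value with its count
frequency = dict(zip(values.tolist(), counts.tolist()))
print(frequency)  # {1: 1, 2: 2, 3: 3, 4: 1}
```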


12. How do you read and write files using NumPy?

Use np.savetxt() and np.loadtxt() for text files, or np.save() and np.load() for NumPy's binary .npy format.

Example:

arr = np.array([1, 2, 2, 3, 3, 3, 4])
np.savetxt("data.csv", arr, delimiter=",")  # Save to CSV (values are written as floats by default)
loaded_arr = np.loadtxt("data.csv", delimiter=",")  # Load from CSV
print(loaded_arr)

Output:

[1. 2. 2. 3. 3. 3. 4.]
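
The binary route mentioned above can be sketched as follows. The file name data.npy is hypothetical, and a temporary directory is used so the example leaves no files behind.

```python
import os
import tempfile
import numpy as np

arr = np.array([1, 2, 2, 3, 3, 3, 4])

# A temporary directory keeps the example from leaving files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.npy")  # hypothetical file name
    np.save(path, arr)          # binary .npy format
    loaded = np.load(path)

print(loaded)                     # [1 2 2 3 3 3 4]
print(loaded.dtype == arr.dtype)  # True, unlike the CSV round trip
```

Unlike the CSV round trip, which returns floats, the .npy format stores the dtype and shape alongside the data.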


Conclusion

NumPy is a vital tool for data science, enabling efficient numerical computations. Mastering NumPy's array manipulations, statistical functions, and broadcasting will significantly enhance your data processing skills. Keep practicing with real-world datasets and check the official NumPy documentation for more details.
