Parameter n_samples for speeding up computations

This example demonstrates the speedup gained by building the topological coordinates from a small sample of the dataset instead of the full dataset. The size of the sample is set with the parameter n_samples. For simplicity, we time a single run per sample size rather than averaging several runs, since the trend is clear even without a careful efficiency analysis.

import time
import numpy as np
import matplotlib.pyplot as plt
from dreimac import CircularCoords, GeometryExamples
X = GeometryExamples.noisy_circle(n_samples=1500, noise_size=0.8)

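# Subsample sizes 150, 300, ..., 1500
subsample_sizes = (np.arange(10) + 1) * 150

# Time the construction of CircularCoords for each subsample size.
# perf_counter is a monotonic, high-resolution clock suited to timing.
times = []
for size in subsample_sizes:
    start = time.perf_counter()
    _ = CircularCoords(X, size, prime=3)
    end = time.perf_counter()
    times.append(end - start)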

plt.scatter(subsample_sizes, times)
plt.xlabel("n_samples")
_ = plt.ylabel("computation time (seconds)")
[Figure: scatter plot of computation time (seconds) against n_samples]
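
The example above times a single run per sample size. If a more careful measurement is wanted, one option is to repeat each run and keep the median. The following is a minimal sketch of that idea using only the standard library. The names `median_runtime` and `quadratic_workload` are hypothetical helpers introduced for illustration, and the workload is a stand-in with roughly quadratic cost, not the actual DREiMac computation.

```python
import statistics
import time


def median_runtime(fn, repeats=5):
    """Return the median wall-clock time (seconds) of calling fn() `repeats` times."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


def quadratic_workload(size):
    """Hypothetical stand-in for an expensive step: visits every pair of `size` items."""
    total = 0
    for i in range(size):
        for j in range(size):
            total += (i - j) * (i - j)
    return total


# Time the stand-in workload at a few sizes (assumed values, chosen to run quickly).
sizes = [50, 100, 200]
timings = [median_runtime(lambda s=s: quadratic_workload(s)) for s in sizes]

for s, t in zip(sizes, timings):
    print(f"size={s:4d}  median time={t:.6f} s")
```

To apply this to the notebook, the stand-in could be replaced by a call such as `lambda: CircularCoords(X, size, prime=3)`, with `X` and `size` defined as in the example above. The median is used here rather than the mean because it is less sensitive to an occasional slow run.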