Welcome to graphgrove’s documentation!

A framework for building (and incrementally growing) graph-based data structures used in hierarchical or DAG-structured clustering and nearest neighbor search.

Package Classes

class graphgrove.covertree.Node(this)

CoverTree node from c++.

class graphgrove.covertree.NNS_L2(this)

CoverTree Class for NN search in Euclidean distance.

class graphgrove.covertree.MIPS(this, phi2)

CoverTree Class for maximum inner product search.

kNearestNeighbours(points, k=10, use_multi_core=- 1, return_points=False)

main nearest neighbor function.

class graphgrove.covertree.MCSS(this)

CoverTree Class for NN search in cosine distance.

kNearestNeighbours(points, k=10, use_multi_core=- 1, return_points=False)

main nearest neighbor function.

class graphgrove.graph_builder.Index(k)
class graphgrove.graph_builder.Cosine_CoverTree(k, cores=4, add_noise=True, noise_amount=1e-06, assume_unit_normed=True)
class graphgrove.graph_builder.Cosine_SGTree(k, cores=4, add_noise=True, noise_amount=1e-06, assume_unit_normed=True)
class graphgrove.graph_builder.Cosine_SGTreeBeam(k, beam_size=100, cores=4, add_noise=True, noise_amount=1e-06, assume_unit_normed=True)
class graphgrove.graph_builder.Cosine_FaissFlat(k, cores=4, add_noise=True, noise_amount=1e-06, assume_unit_normed=True)
class graphgrove.graph_builder.Cosine_FaissHNSW(k, max_degree=128, efSearch=128, efConstruction=200, add_noise=True, noise_amount=1e-06, assume_unit_norm=True)
class graphgrove.llama.LLAMA(this)
assignments()

Return clusters of the DAG-structure discovered.

Returns: coo_matrix of size N by K where N is the number of data points and K is the number of nodes in the DAG. coo_matrix[i,j] = 1 if the point i is a descendant of the node j (i.e., the cluster represented by node j contains point i).

cluster()

Run the DAG-clustering process.

classmethod from_graph(coo_graph, num_rounds, cores=4, linkage=2, max_num_parents=5, max_num_neighbors=100, thresholds=None, lowest_value=- 10000)

Instantiate a LLAMA object with the given graph & hyperparameters.

Parameters
  • coo_graph – the graph to cluster.

  • num_rounds – the number of rounds to use.

  • cores – number of parallel threads to use (default 4).

  • linkage – linkage function to use either integer (0 for single, 1 for average, 2 for approx. average). (default 2). or string valued (‘single’, ‘average, ‘approx_average’)

  • max_num_parents – maximum number of parents any node can have (default 5).

  • max_num_neighbors – maximum number of neigbhors any node can have in the graph (default 100).

  • thresholds – None (for no threshold use). Or a numpy array (float32) of the minimum similarity to allow in an agglomeration (default None).

  • lowest_value – value used for missing / minimum similarity (default -10000)

round(r)

Return the cover of the r^{th} round.

Returns: coo_matrix of size N by K_r where N is the number of data points and K_r is the number of nodes in r^{th} round. coo_matrix[i,j] = 1 if the point i is a descendant of the node j (i.e., the cluster represented by node j contains point i).

structure()

Return edges of the DAG-structure discovered.

Returns: a numpy matrix of size M by 2 where M is the number of edges. each row of the has a value of [child_node_id, parent_node_id]

class graphgrove.scc.Node(this)
class graphgrove.scc.Level(this)
class graphgrove.scc.SCC(this)
class graphgrove.sgtree.Node(this)

SGTree node from c++.

class graphgrove.sgtree.NNS_L2(this)

SGTree Class for NN search in Euclidean distance.

class graphgrove.sgtree.MIPS(this, phi2)

SGTree Class for maximum inner product search.

kNearestNeighbours(points, k=10, use_multi_core=- 1, return_points=False)

main nearest neighbor function.

class graphgrove.sgtree.MCSS(this)

SGTree Class for NN search in cosine distance.

kNearestNeighbours(points, k=10, use_multi_core=- 1, return_points=False)

main nearest neighbor function.

class graphgrove.vec_scc.Cosine_SCC(k=25, num_rounds=50, thresholds=None, index_name='cosine_sgtree', cores=12, cc_alg=0, par_minimum=100000, verbosity=0, beam_size=100, hnsw_max_degree=200, hnsw_ef_search=200, hnsw_ef_construction=200)

Indices and tables