GFQL Hop Matcher

Hop is the core primitive behind a single matcher step in chain.

Calling hop directly skips the additional steps that chain wraps around it, so it can be faster, which may help on larger graphs.

graphistry.compute.hop.generate_safe_column_name(base_name, df, prefix='__temp_', suffix='__')

Generate a temporary column name that doesn’t conflict with existing columns. Uses a simple incrementing counter to avoid dependencies.

Parameters:

base_name (str): The original column name to base the temporary name on
df (DataFrame): The DataFrame to check for column name conflicts
prefix (str): Prefix to prepend to the temporary column name
suffix (str): Suffix to append to the temporary column name

Returns:

str: A unique column name that doesn’t exist in the DataFrame
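The incrementing-counter idea described above can be illustrated with a minimal sketch. This is not the library's implementation: the function name `safe_column_name_sketch` is hypothetical, it takes a plain collection of column names instead of a DataFrame, and the exact way the counter is embedded in the name is an assumption.

```python
def safe_column_name_sketch(base_name, existing_columns,
                            prefix="__temp_", suffix="__"):
    """Return a name not present in existing_columns.

    Hypothetical stand-in for generate_safe_column_name: the
    '<prefix><base>_<n><suffix>' format is assumed, not taken from
    the library.
    """
    existing = set(existing_columns)
    candidate = f"{prefix}{base_name}{suffix}"
    counter = 0
    # Bump a simple counter until the candidate no longer collides.
    while candidate in existing:
        counter += 1
        candidate = f"{prefix}{base_name}_{counter}{suffix}"
    return candidate
```

With a real DataFrame, the same check would presumably run against `df.columns`.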

graphistry.compute.hop.hop(self, nodes=None, hops=1, to_fixed_point=False, direction='forward', edge_match=None, source_node_match=None, destination_node_match=None, source_node_query=None, destination_node_query=None, edge_query=None, return_as_wave_front=False, target_wave_front=None, engine=EngineAbstract.AUTO)

Given a graph and some source nodes, return the subgraph of all paths within k hops of the sources.

This can be faster than the equivalent chain([…]) call, which wraps it with additional steps.

See the chain() examples for usage of many of these parameters.

g: Plotter
nodes: dataframe with id column matching g._node. None signifies all nodes (default).
hops: consider paths of length 1 to ‘hops’ steps, if any (default 1).
to_fixed_point: keep hopping until no new nodes are found (ignores hops)
direction: ‘forward’, ‘reverse’, ‘undirected’
edge_match: dict of kv-pairs to exact match (see also: filter_edges_by_dict)
source_node_match: dict of kv-pairs to match nodes before hopping (including intermediate)
destination_node_match: dict of kv-pairs to match nodes after hopping (including intermediate)
source_node_query: dataframe query to match nodes before hopping (including intermediate)
destination_node_query: dataframe query to match nodes after hopping (including intermediate)
edge_query: dataframe query to match edges before hopping (including intermediate)
return_as_wave_front: Exclude starting node(s) in return, returning only encountered nodes
target_wave_front: Only consider these nodes + self._nodes for reachability
engine: ‘auto’, ‘pandas’, ‘cudf’ (GPU)

Parameters:

self (Plottable)
nodes (Any | None)
hops (int | None)
to_fixed_point (bool)
direction (str)
edge_match (dict | None)
source_node_match (dict | None)
destination_node_match (dict | None)
source_node_query (str | None)
destination_node_query (str | None)
edge_query (str | None)
target_wave_front (Any | None)
engine (EngineAbstract | str)

Return type:

Plottable
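To make the traversal semantics of hops, to_fixed_point, direction, and return_as_wave_front concrete, here is a conceptual sketch in plain Python. It is not how graphistry implements hop (which operates on dataframes): the name `hop_sketch` is hypothetical, edges are assumed to be `(source, destination)` tuples, and the match/query filters, target_wave_front, and engine are left out.

```python
def hop_sketch(edges, seeds, hops=1, to_fixed_point=False,
               direction="forward", return_as_wave_front=False):
    """Conceptual k-hop expansion over (src, dst) tuples.

    Returns (sorted node ids, sorted edges) reached from seeds.
    Illustrative only; not the library's algorithm.
    """
    if direction not in ("forward", "reverse", "undirected"):
        raise ValueError(f"unknown direction: {direction!r}")
    edges = list(edges)
    seeds = set(seeds)
    wave_front = set(seeds)   # nodes to expand from in this step
    seen = set(seeds)         # nodes already scheduled for expansion
    reached = set()           # nodes encountered by following an edge
    kept_edges = set()
    steps = 0
    # to_fixed_point ignores the hop budget and runs until no new nodes.
    while wave_front and (to_fixed_point or steps < hops):
        next_front = set()
        for src, dst in edges:
            if direction in ("forward", "undirected") and src in wave_front:
                kept_edges.add((src, dst))
                next_front.add(dst)
            if direction in ("reverse", "undirected") and dst in wave_front:
                kept_edges.add((src, dst))
                next_front.add(src)
        reached |= next_front
        wave_front = next_front - seen  # only expand newly found nodes
        seen |= next_front
        steps += 1
    nodes = reached if return_as_wave_front else seeds | reached
    return sorted(nodes), sorted(kept_edges)


edges = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "a")]
two_hop_nodes, two_hop_edges = hop_sketch(edges, ["a"], hops=2)
```

In the sketch, a seed only appears in the wave-front result if some traversed edge leads back to it, which is one reading of "returning only encountered nodes".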

graphistry.compute.hop.prepare_merge_dataframe(edges_indexed, column_conflict, source_col, dest_col, edge_id_col, node_col, temp_col, is_reverse=False)

Prepare a merge DataFrame handling column name conflicts for hop operations. Centralizes the conflict resolution logic for both forward and reverse directions.

Parameters:

edges_indexed (DataFrame): The indexed edges DataFrame
column_conflict (bool): Whether there’s a column name conflict
source_col (str): The source column name
dest_col (str): The destination column name
edge_id_col (str): The edge ID column name
node_col (str): The node column name
temp_col (str): The temporary column name to use in case of conflict
is_reverse (bool, default=False): Whether to prepare for reverse direction hop

Returns:

DataFrame: A merge DataFrame prepared for hop operation

Parameters:

edges_indexed (Any)
column_conflict (bool)
source_col (str)
dest_col (str)
edge_id_col (str)
node_col (str)
temp_col (str)
is_reverse (bool)

Return type:

Any

graphistry.compute.hop.process_hop_direction(direction_name, wave_front_iter, edges_indexed, column_conflict, source_col, dest_col, edge_id_col, node_col, temp_col, intermediate_target_wave_front, base_target_nodes, target_col, node_match_query, node_match_dict, is_reverse, debugging)

Process a single hop direction (forward or reverse).

Parameters:

direction_name (str): Name of the direction for debug logging (‘forward’ or ‘reverse’)
wave_front_iter (DataFrame): Current wave front of nodes to expand from
edges_indexed (DataFrame): The indexed edges DataFrame
column_conflict (bool): Whether there’s a name conflict between node and edge columns
source_col (str): The source column name
dest_col (str): The destination column name
edge_id_col (str): The edge ID column name
node_col (str): The node column name
temp_col (str): The temporary column name for conflict resolution
intermediate_target_wave_front (DataFrame or None): Pre-calculated target wave front for filtering
base_target_nodes (DataFrame): The base target nodes for destination filtering
target_col (str): The target column for merging (destination or source depending on direction)
node_match_query (str or None): Optional query for node filtering
node_match_dict (dict or None): Optional dictionary for node filtering
is_reverse (bool): Whether this is the reverse direction
debugging (bool): Whether debug logging is enabled

Returns:

Tuple[DataFrame, DataFrame]: The processed hop edges and node IDs

Parameters:

direction_name (str)
wave_front_iter (Any)
edges_indexed (Any)
column_conflict (bool)
source_col (str)
dest_col (str)
edge_id_col (str)
node_col (str)
temp_col (str)
intermediate_target_wave_front (Any | None)
base_target_nodes (Any)
target_col (str)
node_match_query (str | None)
node_match_dict (dict | None)
is_reverse (bool)
debugging (bool)

Return type:

Tuple[Any, Any]

graphistry.compute.hop.query_if_not_none(query, df)

Parameters:

query (str | None)
df (Any)

Return type:

Any
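The entry above carries no description. Going only by its name and signature, one plausible reading is that it applies a dataframe query when one is given and otherwise returns the dataframe unchanged. The sketch below encodes that assumption; `query_if_not_none_sketch` is a hypothetical name, and the behavior shown is inferred, not confirmed by the source.

```python
def query_if_not_none_sketch(query, df):
    """Assumed behavior: filter df with query, or pass df through.

    df only needs a .query(str) method, as pandas DataFrames have.
    """
    if query is None:
        # No filter requested: hand back the input untouched.
        return df
    return df.query(query)
```

A helper like this would let optional filters such as source_node_query or edge_query be applied uniformly without a None check at each call site.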
