GFQL Hop Matcher
hop() is the core primitive behind a single matcher step in chain().
Calling hop() directly skips the additional steps that chain() wraps around it, so it can be faster and may help on larger graphs.
- graphistry.compute.hop.generate_safe_column_name(base_name, df, prefix='__temp_', suffix='__')
Generate a temporary column name that doesn’t conflict with existing columns. Uses a simple incrementing counter to avoid dependencies.
Parameters:
- base_name (str)
The original column name to base the temporary name on
- df (DataFrame)
The DataFrame to check for column name conflicts
- prefix (str)
Prefix to prepend to the temporary column name
- suffix (str)
Suffix to append to the temporary column name
Returns:
- str
A unique column name that doesn’t exist in the DataFrame
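The incrementing-counter idea described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the name safe_column_name_sketch, the exact layout of the generated name (prefix, base name, counter, suffix), and the choice to accept a plain iterable of column names instead of a DataFrame are all assumptions made for this example.

```python
from typing import Iterable


def safe_column_name_sketch(
    base_name: str,
    existing_columns: Iterable[str],
    prefix: str = "__temp_",
    suffix: str = "__",
) -> str:
    """Return a name built from base_name that is absent from existing_columns.

    Hypothetical helper: the name layout below is an assumption.
    """
    existing = set(existing_columns)
    counter = 0
    candidate = f"{prefix}{base_name}_{counter}{suffix}"
    # Bump the counter until the candidate no longer collides.
    while candidate in existing:
        counter += 1
        candidate = f"{prefix}{base_name}_{counter}{suffix}"
    return candidate


# No collision: the first candidate is returned.
first = safe_column_name_sketch("id", ["id", "src", "dst"])
# The first candidate is taken, so the counter advances once.
second = safe_column_name_sketch("id", ["id", first])
```

With a pandas DataFrame, passing df.columns as existing_columns would give the same check against the frame's column names.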
- graphistry.compute.hop.hop(self, nodes=None, hops=1, to_fixed_point=False, direction='forward', edge_match=None, source_node_match=None, destination_node_match=None, source_node_query=None, destination_node_query=None, edge_query=None, return_as_wave_front=False, target_wave_front=None, engine=EngineAbstract.AUTO)
Given a graph and some source nodes, return the subgraph of all paths within k hops of the sources.
This can be faster than the equivalent chain([…]) call, which wraps it with additional steps.
See the chain() examples for usage of many of the parameters.
- g: Plotter
- nodes: dataframe with an id column matching g._node. None signifies all nodes (default).
- hops: consider paths of length 1 to ‘hops’ steps, if any (default 1).
- to_fixed_point: keep hopping until no new nodes are found (ignores hops).
- direction: ‘forward’, ‘reverse’, or ‘undirected’.
- edge_match: dict of key-value pairs to match exactly (see also: filter_edges_by_dict).
- source_node_match: dict of key-value pairs to match nodes before hopping (including intermediate nodes).
- destination_node_match: dict of key-value pairs to match nodes after hopping (including intermediate nodes).
- source_node_query: dataframe query to match nodes before hopping (including intermediate nodes).
- destination_node_query: dataframe query to match nodes after hopping (including intermediate nodes).
- edge_query: dataframe query to match edges before hopping (including intermediate edges).
- return_as_wave_front: exclude the starting node(s) from the result, returning only encountered nodes.
- target_wave_front: only consider these nodes + self._nodes for reachability.
- engine: ‘auto’, ‘pandas’, or ‘cudf’ (GPU).
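To make the hops, to_fixed_point, direction, and return_as_wave_front semantics concrete, here is a toy wavefront expansion over a plain edge list. It is a sketch under stated assumptions, not how the library computes hops: hop_sketch and its tuple-based inputs are hypothetical, it uses Python sets rather than dataframes, and it ignores the match/query filters, target_wave_front, and engine.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (source, destination)


def hop_sketch(
    edges: List[Edge],
    start: Iterable[str],
    hops: int = 1,
    to_fixed_point: bool = False,
    direction: str = "forward",
    return_as_wave_front: bool = False,
) -> Tuple[Set[str], List[Edge]]:
    """Toy k-hop expansion returning (nodes, edges) of the traversed subgraph."""
    if direction not in ("forward", "reverse", "undirected"):
        raise ValueError(f"unknown direction: {direction!r}")

    start_nodes = set(start)
    frontier = set(start_nodes)  # nodes to expand from in the current step
    seen = set(start_nodes)      # nodes that have already been in a frontier
    reached: Set[str] = set()    # nodes arrived at by following an edge
    kept: Set[int] = set()       # indices of traversed edges

    steps = 0
    while frontier and (to_fixed_point or steps < hops):
        neighbors: Set[str] = set()
        for i, (src, dst) in enumerate(edges):
            if direction in ("forward", "undirected") and src in frontier:
                kept.add(i)
                neighbors.add(dst)
            if direction in ("reverse", "undirected") and dst in frontier:
                kept.add(i)
                neighbors.add(src)
        reached |= neighbors
        frontier = neighbors - seen  # expand each node at most once
        seen |= frontier
        steps += 1

    nodes = reached if return_as_wave_front else start_nodes | reached
    return nodes, [edges[i] for i in sorted(kept)]


demo_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "a")]
two_hop_nodes, two_hop_edges = hop_sketch(demo_edges, ["a"], hops=2)
print(sorted(two_hop_nodes))  # ['a', 'b', 'c']
```

In this toy model, to_fixed_point=True keeps expanding until the frontier is empty, and return_as_wave_front=True drops start nodes unless an edge leads back to them. With the library itself, the corresponding call would pass a nodes dataframe and the same keyword arguments to g.hop(...).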
- Parameters:
self (Plottable)
nodes (Any | None)
hops (int | None)
to_fixed_point (bool)
direction (str)
edge_match (dict | None)
source_node_match (dict | None)
destination_node_match (dict | None)
source_node_query (str | None)
destination_node_query (str | None)
edge_query (str | None)
return_as_wave_front (bool)
target_wave_front (Any | None)
engine (EngineAbstract | str)
- Return type:
Plottable
- graphistry.compute.hop.prepare_merge_dataframe(edges_indexed, column_conflict, source_col, dest_col, edge_id_col, node_col, temp_col, is_reverse=False)
Prepare a merge DataFrame handling column name conflicts for hop operations. Centralizes the conflict resolution logic for both forward and reverse directions.
Parameters:
- edges_indexed (DataFrame)
The indexed edges DataFrame
- column_conflict (bool)
Whether there’s a column name conflict
- source_col (str)
The source column name
- dest_col (str)
The destination column name
- edge_id_col (str)
The edge ID column name
- node_col (str)
The node column name
- temp_col (str)
The temporary column name to use in case of conflict
- is_reverse (bool, default=False)
Whether to prepare for reverse direction hop
Returns:
- DataFrame
A merge DataFrame prepared for hop operation
- Parameters:
edges_indexed (Any)
column_conflict (bool)
source_col (str)
dest_col (str)
edge_id_col (str)
node_col (str)
temp_col (str)
is_reverse (bool)
- Return type:
Any
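The kind of column name conflict this helper guards against can be illustrated with plain pandas. When the node id column shares its name with an edge column other than the join key (for example, nodes keyed by a column called dst), a direct merge leaves pandas to disambiguate the clashing names with suffixes such as _x and _y. The sketch below routes the node ids through a temporary column instead. It is an illustration only: merge_wave_with_edges_sketch, its parameters, and the rename-then-drop strategy are assumptions, not the library's logic.

```python
import pandas as pd


def merge_wave_with_edges_sketch(wave, edges, node_col, join_col, temp_col):
    """Inner-join wavefront node ids onto edges via join_col, avoiding name clashes.

    Hypothetical helper; assumes temp_col is not already an edge column.
    """
    ids = wave[[node_col]].drop_duplicates()
    # A clash arises when the node id column shares its name with an edge
    # column other than the one being joined on.
    conflict = node_col in edges.columns and node_col != join_col
    if conflict:
        ids = ids.rename(columns={node_col: temp_col})
        left_key = temp_col
    else:
        left_key = node_col
    merged = ids.merge(edges, left_on=left_key, right_on=join_col, how="inner")
    # Keep only the edge columns so both branches return the same shape.
    return merged[list(edges.columns)]


edges = pd.DataFrame({"s": ["a", "b", "c"], "dst": ["b", "c", "d"], "eid": [0, 1, 2]})
wave = pd.DataFrame({"dst": ["b"]})  # node id column name clashes with the edge column 'dst'
out = merge_wave_with_edges_sketch(wave, edges, "dst", "s", "__temp_dst_0__")
```

For a reverse hop, the same sketch would be called with the destination column as join_col.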
- graphistry.compute.hop.process_hop_direction(direction_name, wave_front_iter, edges_indexed, column_conflict, source_col, dest_col, edge_id_col, node_col, temp_col, intermediate_target_wave_front, base_target_nodes, target_col, node_match_query, node_match_dict, is_reverse, debugging)
Process a single hop direction (forward or reverse)
Parameters:
- direction_name (str)
Name of the direction for debug logging (‘forward’ or ‘reverse’)
- wave_front_iter (DataFrame)
Current wave front of nodes to expand from
- edges_indexed (DataFrame)
The indexed edges DataFrame
- column_conflict (bool)
Whether there’s a name conflict between node and edge columns
- source_col (str)
The source column name
- dest_col (str)
The destination column name
- edge_id_col (str)
The edge ID column name
- node_col (str)
The node column name
- temp_col (str)
The temporary column name for conflict resolution
- intermediate_target_wave_front (DataFrame or None)
Pre-calculated target wave front for filtering
- base_target_nodes (DataFrame)
The base target nodes for destination filtering
- target_col (str)
The target column for merging (destination or source depending on direction)
- node_match_query (str or None)
Optional query for node filtering
- node_match_dict (dict or None)
Optional dictionary for node filtering
- is_reverse (bool)
Whether this is the reverse direction
- debugging (bool)
Whether debug logging is enabled
Returns:
- Tuple[DataFrame, DataFrame]
The processed hop edges and node IDs
- Parameters:
direction_name (str)
wave_front_iter (Any)
edges_indexed (Any)
column_conflict (bool)
source_col (str)
dest_col (str)
edge_id_col (str)
node_col (str)
temp_col (str)
intermediate_target_wave_front (Any | None)
base_target_nodes (Any)
target_col (str)
node_match_query (str | None)
node_match_dict (dict | None)
is_reverse (bool)
debugging (bool)
- Return type:
Tuple[Any, Any]
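As a rough picture of what one direction of one hop produces, the sketch below takes a wave front and an edge table and returns the pair described above: the edges traversed and the node ids they lead to. It is a simplified stand-in, not the library's code: hop_direction_sketch and allowed_targets are hypothetical names, it filters with isin instead of a merge, and it leaves out column conflict handling, the query and dict node filters, and debug logging.

```python
import pandas as pd


def hop_direction_sketch(
    wave,
    edges,
    source_col,
    dest_col,
    node_col,
    is_reverse=False,
    allowed_targets=None,
):
    """Expand one step in one direction; returns (hop_edges, new_node_ids)."""
    # Forward hops leave via the source column; reverse hops leave via the destination.
    from_col, to_col = (dest_col, source_col) if is_reverse else (source_col, dest_col)

    wave_ids = wave[node_col].drop_duplicates()
    hop_edges = edges[edges[from_col].isin(wave_ids)]

    # Optionally keep only edges that land on an allowed target node.
    if allowed_targets is not None:
        hop_edges = hop_edges[hop_edges[to_col].isin(allowed_targets[node_col])]

    new_node_ids = (
        hop_edges[[to_col]]
        .rename(columns={to_col: node_col})
        .drop_duplicates()
        .reset_index(drop=True)
    )
    return hop_edges, new_node_ids


step_edges = pd.DataFrame({"s": ["a", "b", "c"], "d": ["b", "c", "d"], "eid": [0, 1, 2]})
step_wave = pd.DataFrame({"id": ["b"]})
fwd_edges, fwd_nodes = hop_direction_sketch(step_wave, step_edges, "s", "d", "id")
rev_edges, rev_nodes = hop_direction_sketch(step_wave, step_edges, "s", "d", "id", is_reverse=True)
```

An undirected hop can then be modeled as running both directions and concatenating the two results.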
- graphistry.compute.hop.query_if_not_none(query, df)
- Parameters:
query (str | None)
df (Any)
- Return type:
Any
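This entry has no description, so its behavior can only be inferred from the name and signature. One plausible reading, shown here purely as an assumption, is a guard that applies a dataframe query when one is given and otherwise returns the frame unchanged. query_if_not_none_sketch is a hypothetical stand-in built on pandas DataFrame.query.

```python
from typing import Optional

import pandas as pd


def query_if_not_none_sketch(query: Optional[str], df: pd.DataFrame) -> pd.DataFrame:
    """Apply df.query(query) when a query is given; otherwise return df as-is.

    Assumed behavior, inferred from the function name only.
    """
    if query is None:
        return df
    return df.query(query)


frame = pd.DataFrame({"x": [1, 2, 3]})
unfiltered = query_if_not_none_sketch(None, frame)
filtered = query_if_not_none_sketch("x > 1", frame)
```

A guard of this shape lets optional parameters such as source_node_query or edge_query be passed through without a None check at every call site.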