GFQL Hop Matcher

hop is the core primitive behind each single matcher step in chain.

Calling hop directly skips the additional steps that chain wraps around it, so it can be faster, which may help on larger graphs.

graphistry.compute.hop.generate_safe_column_name(base_name, df, prefix='__temp_', suffix='__')

Generate a temporary column name that doesn’t conflict with existing columns. Uses a simple incrementing counter to avoid dependencies.

Parameters:

base_name : str

The original column name to base the temporary name on

df : DataFrame

The DataFrame to check for column name conflicts

prefix : str

Prefix to prepend to the temporary column name

suffix : str

Suffix to append to the temporary column name

Returns:

str

A unique column name that doesn’t exist in the DataFrame
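The incrementing-counter approach described above can be illustrated with a minimal sketch in plain Python. This is not the library's implementation: safe_column_name_sketch is a hypothetical helper, it takes a plain collection of column names instead of a DataFrame, and the exact name format (prefix, base name, counter, suffix) is an assumption.

```python
def safe_column_name_sketch(base_name, columns, prefix="__temp_", suffix="__"):
    """Return a name derived from base_name that is absent from `columns`.

    Hypothetical illustration only; the name format is assumed.
    """
    existing = set(columns)
    counter = 0
    candidate = f"{prefix}{base_name}_{counter}{suffix}"
    # Bump the counter until the candidate no longer clashes.
    while candidate in existing:
        counter += 1
        candidate = f"{prefix}{base_name}_{counter}{suffix}"
    return candidate


# The first candidate is already taken, so the counter advances once.
name = safe_column_name_sketch("id", ["id", "__temp_id_0__"])
print(name)
```

With a real DataFrame, the same idea would check df.columns instead of a list.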

graphistry.compute.hop.hop(self, nodes=None, hops=1, to_fixed_point=False, direction='forward', edge_match=None, source_node_match=None, destination_node_match=None, source_node_query=None, destination_node_query=None, edge_query=None, return_as_wave_front=False, target_wave_front=None, engine=EngineAbstract.AUTO)

Given a graph and some source nodes, return the subgraph of all paths within k hops of the sources.

This can be faster than the equivalent chain([…]) call, which wraps it with additional steps.

See the chain() examples for usage of many of these parameters.

  • g: Plotter

  • nodes: dataframe with id column matching g._node. None signifies all nodes (default).

  • hops: consider paths of length 1 to ‘hops’ steps, if any (default 1).

  • to_fixed_point: keep hopping until no new nodes are found (ignores hops)

  • direction: ‘forward’, ‘reverse’, ‘undirected’

  • edge_match: dict of kv-pairs to exact match (see also: filter_edges_by_dict)

  • source_node_match: dict of kv-pairs to match nodes before hopping (including intermediate)

  • destination_node_match: dict of kv-pairs to match nodes after hopping (including intermediate)

  • source_node_query: dataframe query to match nodes before hopping (including intermediate)

  • destination_node_query: dataframe query to match nodes after hopping (including intermediate)

  • edge_query: dataframe query to match edges before hopping (including intermediate)

  • return_as_wave_front: exclude starting node(s) in return, returning only encountered nodes

  • target_wave_front: only consider these nodes + self._nodes for reachability

  • engine: ‘auto’, ‘pandas’, ‘cudf’ (GPU)

Parameters:
  • self (Plottable)

  • nodes (Any | None)

  • hops (int | None)

  • to_fixed_point (bool)

  • direction (str)

  • edge_match (dict | None)

  • source_node_match (dict | None)

  • destination_node_match (dict | None)

  • source_node_query (str | None)

  • destination_node_query (str | None)

  • edge_query (str | None)

  • target_wave_front (Any | None)

  • engine (EngineAbstract | str)

Return type:

Plottable
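To make the hops, to_fixed_point, direction, and return_as_wave_front semantics concrete, here is a minimal wave-front expansion sketch in plain Python. It is an illustration under simplifying assumptions, not the library's implementation: hop_sketch is a hypothetical function, edges are plain (source, destination) tuples instead of dataframes, and the match/query filters, target_wave_front, and engine selection are left out.

```python
def hop_sketch(edges, seeds, hops=1, to_fixed_point=False,
               direction="forward", return_as_wave_front=False):
    """Expand from `seeds` over `edges`, a list of (src, dst) pairs.

    Returns (nodes, edges) as sets. Hypothetical illustration only.
    """
    if direction not in ("forward", "reverse", "undirected"):
        raise ValueError(f"unknown direction: {direction}")
    seeds = set(seeds)
    visited = set(seeds)      # nodes that were already in some wave front
    wave_front = set(seeds)   # nodes to expand from in this step
    reached = set()           # every node encountered by hopping
    kept_edges = set()        # every edge traversed
    step = 0
    while wave_front and (to_fixed_point or step < hops):
        next_front = set()
        for src, dst in edges:
            if direction in ("forward", "undirected") and src in wave_front:
                kept_edges.add((src, dst))
                next_front.add(dst)
            if direction in ("reverse", "undirected") and dst in wave_front:
                kept_edges.add((src, dst))
                next_front.add(src)
        reached |= next_front
        # Only expand from newly found nodes, so cycles terminate.
        wave_front = next_front - visited
        visited |= next_front
        step += 1
    nodes = reached if return_as_wave_front else seeds | reached
    return nodes, kept_edges


edges = [("a", "b"), ("b", "c"), ("c", "d")]
two_hop_nodes, two_hop_edges = hop_sketch(edges, ["a"], hops=2)
all_nodes, all_edges = hop_sketch(edges, ["a"], to_fixed_point=True)
back_nodes, back_edges = hop_sketch(edges, ["d"], direction="reverse",
                                    return_as_wave_front=True)
print(sorted(two_hop_nodes), sorted(all_nodes), sorted(back_nodes))
```

The loop expands only from nodes found in the previous step, which is why to_fixed_point stops once a step finds no new nodes.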

graphistry.compute.hop.prepare_merge_dataframe(edges_indexed, column_conflict, source_col, dest_col, edge_id_col, node_col, temp_col, is_reverse=False)

Prepare a merge DataFrame handling column name conflicts for hop operations. Centralizes the conflict resolution logic for both forward and reverse directions.

Parameters:

edges_indexed : DataFrame

The indexed edges DataFrame

column_conflict : bool

Whether there’s a column name conflict

source_col : str

The source column name

dest_col : str

The destination column name

edge_id_col : str

The edge ID column name

node_col : str

The node column name

temp_col : str

The temporary column name to use in case of conflict

is_reverse : bool, default=False

Whether to prepare for reverse direction hop

Returns:

DataFrame

A merge DataFrame prepared for hop operation
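This reference does not show how the helper resolves a conflict, so the following pandas sketch shows only one plausible approach, under stated assumptions. merge_frame_sketch is a hypothetical function that omits edge_id_col. The conflict is assumed to arise when the node column name is also an edge column name, in which case the join key is written under temp_col so that no edge column is overwritten.

```python
import pandas as pd


def merge_frame_sketch(edges, column_conflict, source_col, dest_col,
                       node_col, temp_col, is_reverse=False):
    """Build an edge frame with an extra join-key column (hypothetical)."""
    # Values the wave front is matched against: sources when hopping
    # forward, destinations when hopping in reverse.
    match_col = dest_col if is_reverse else source_col
    # Under a conflict, store the join key under the temporary name so an
    # existing edge column keeps its values (assumed behavior).
    join_col = temp_col if column_conflict else node_col
    return edges[[source_col, dest_col]].assign(**{join_col: edges[match_col]})


# The destination column shares its name with the node column, "id".
edges = pd.DataFrame({"s": ["a", "b"], "id": ["b", "c"]})
wave_front = pd.DataFrame({"id": ["a"]})

frame = merge_frame_sketch(edges, True, "s", "id", "id", "__temp_id__")
# Join the wave front on the temporary key to find the outgoing edges.
hits = wave_front.rename(columns={"id": "__temp_id__"}).merge(
    frame, on="__temp_id__", how="inner")
print(hits[["s", "id"]])
```

Writing the join key to "id" directly in this example would overwrite the destination values, which is the kind of clash the temporary column avoids.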

Parameters:
  • edges_indexed (Any)

  • column_conflict (bool)

  • source_col (str)

  • dest_col (str)

  • edge_id_col (str)

  • node_col (str)

  • temp_col (str)

  • is_reverse (bool)

Return type:

Any

graphistry.compute.hop.process_hop_direction(direction_name, wave_front_iter, edges_indexed, column_conflict, source_col, dest_col, edge_id_col, node_col, temp_col, intermediate_target_wave_front, base_target_nodes, target_col, node_match_query, node_match_dict, is_reverse, debugging)

Process a single hop direction (forward or reverse)

Parameters:

direction_name : str

Name of the direction for debug logging (‘forward’ or ‘reverse’)

wave_front_iter : DataFrame

Current wave front of nodes to expand from

edges_indexed : DataFrame

The indexed edges DataFrame

column_conflict : bool

Whether there’s a name conflict between node and edge columns

source_col : str

The source column name

dest_col : str

The destination column name

edge_id_col : str

The edge ID column name

node_col : str

The node column name

temp_col : str

The temporary column name for conflict resolution

intermediate_target_wave_front : DataFrame or None

Pre-calculated target wave front for filtering

base_target_nodes : DataFrame

The base target nodes for destination filtering

target_col : str

The target column for merging (destination or source depending on direction)

node_match_query : str or None

Optional query for node filtering

node_match_dict : dict or None

Optional dictionary for node filtering

is_reverse : bool

Whether this is the reverse direction

debugging : bool

Whether debug logging is enabled

Returns:

Tuple[DataFrame, DataFrame]

The processed hop edges and node IDs

Parameters:
  • direction_name (str)

  • wave_front_iter (Any)

  • edges_indexed (Any)

  • column_conflict (bool)

  • source_col (str)

  • dest_col (str)

  • edge_id_col (str)

  • node_col (str)

  • temp_col (str)

  • intermediate_target_wave_front (Any | None)

  • base_target_nodes (Any)

  • target_col (str)

  • node_match_query (str | None)

  • node_match_dict (dict | None)

  • is_reverse (bool)

  • debugging (bool)

Return type:

Tuple[Any, Any]

graphistry.compute.hop.query_if_not_none(query, df)

Parameters:
  • query (str | None)

  • df (Any)

Return type:

Any