Your first graph neural network: Detecting suspicious logins with link prediction


Graphistry - Leo Meyerovich, Alex Morrise, Tanmoy Sarkar

Infosec Jupyterthon 2022, December 2022

Alert on & visualize anomalous identity events

  • Demo dataset: 1.6B Windows events over 58 days => logins by 12K users over 14K systems

  • Adapt to any identity system with logins

  • => Can we identify accounts & computers acting anomalously? Resources being oddly accessed?

  • => Can we spot the red team?

  • => Operations: Identity incident alerting + identity data investigations

  • Community/contact for help handling bigger-than-memory data & additional features

  • Techniques explored: Graph AI

    • RGCN (primary): powerful with tweaking and in a pipeline

    • UMAP (secondary): surprisingly effective with little tweaking

  • Runs on both CPU + multi-GPU

  • Tools: PyGraphistry[AI], DGL + PyTorch, and NVIDIA RAPIDS / umap-learn


1. Graphs are awesome

  • Defenders think in lists. Attackers think in graphs. As long as this is true, attackers win.

  • Network graphs & event graphs & kill chains & ..: Honeypot

  • Today: Two techniques for the graph AI era, focusing on identity graphs

  • => Caught 96% of red team’s logins (400+ out of millions) with only 10% FPs

  • Graph neural networks (GNNs) + UMAP

2. Graphs for identity data

Sample attacks:

  • Fake account

  • Account takeover: Malware, credential stuffing, …

  • Insider threat: Helpdesk, rogue admin, …

  • Abnormal resource access patterns

Data & user activities (UEBA):

  • Entity resolution: You, your assets, your contexts, ..

  • Authentication

  • Authorization

  • 💰💰💰 Did I mention zero-trust identity protection? 💰💰💰💰

Goals: Empower

  • Identity detection

  • Identity investigation
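A minimal sketch of how authentication events might be shaped into an identity graph, assuming each event is a (user, computer, login type) record. The sample events, the field layout, and the `build_identity_graph` helper are hypothetical illustrations; they are not taken from the demo dataset or from the PyGraphistry API.

```python
from collections import defaultdict

# Hypothetical login events: (user, computer, login_type). Illustrative only.
events = [
    ("alice", "ws-01", "regular"),
    ("alice", "ws-01", "regular"),
    ("alice", "srv-db", "remote_desktop"),
    ("bob", "ws-02", "regular"),
]


def build_identity_graph(events):
    """Collapse raw login events into typed, counted user -> computer edges."""
    edge_counts = defaultdict(int)
    for user, computer, login_type in events:
        # One edge per (user, computer, login type); the count is the edge weight.
        edge_counts[(user, computer, login_type)] += 1
    users = sorted({user for user, _, _ in events})
    computers = sorted({computer for _, computer, _ in events})
    return users, computers, dict(edge_counts)


users, computers, edges = build_identity_graph(events)
for (user, computer, login_type), count in sorted(edges.items()):
    print(f"{user} -[{login_type} x{count}]-> {computer}")
```

Keeping the login type on each edge preserves the distinction between relationship kinds, so a later model can treat a remote desktop session differently from a regular login.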

3. AI era of graph: GNNs + UMAP

  • GNNs: Science’s Breakthrough of 2021 - example

  • Combines network thinking (interesting connectivity) with tabular features (time, $, etc.)

  • Primitives:

    • Classify nodes (“bot”)

    • Predict links (“recommendation”, “violation”) <-- TODAY

    • Classify graphs (“motif mining”)

  • Compose into tools:

    • Anomaly detection <-- TODAY

    • Abuse scoring

    • Feeding into combined methods: today we’re looking for graph shapes, but temporal models (RNNs) are useful too

    • If a model does well at one task, there is a good chance it can be reused for others
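One way link prediction can be composed into anomaly detection is to score every observed edge and flag those the model considers unlikely. The sketch below assumes node embeddings have already been learned and uses a DistMult-style scorer, a common decoder for relational link prediction; the embeddings, relation vector, and function names are hypothetical, and this is not necessarily the scorer used in the notebooks.

```python
import math

# Hypothetical learned embeddings (illustrative values, not real model output).
embeddings = {
    "alice": [1.0, 0.5],
    "ws-01": [1.0, 0.5],
    "srv-db": [-1.0, 0.5],
}
# Hypothetical per-relation weight vector for the "login" relation.
login_relation = [1.0, 1.0]


def distmult_score(h_src, rel_vec, h_dst):
    """DistMult logit: sum over dimensions of src * relation * dst."""
    return sum(s * r * d for s, r, d in zip(h_src, rel_vec, h_dst))


def anomaly_score(h_src, rel_vec, h_dst):
    """1 - sigmoid(logit): high when the model thinks the edge is unlikely."""
    logit = distmult_score(h_src, rel_vec, h_dst)
    return 1.0 - 1.0 / (1.0 + math.exp(-logit))


observed_logins = [("alice", "ws-01"), ("alice", "srv-db")]
# Rank observed logins from most to least anomalous.
ranked = sorted(
    observed_logins,
    key=lambda edge: anomaly_score(
        embeddings[edge[0]], login_relation, embeddings[edge[1]]
    ),
    reverse=True,
)
print(ranked)
```

In practice, the top of such a ranking would feed an alert queue, with the cutoff tuned against the false-positive budget.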

4. RGCNs - Relational graph convolutional networks

Twitter botnet example

  • GNN - Graph neural network: Label propagation

    • “if all their friends are bots, …”

    • multiple dimensions: bytes, region, …

  • GCNs - Graph convolutional network: Multiple layers

    • “even if we know little about them, their friends ..”

    • shallow!

  • RGCNs - Relational GCNs:

    • multiple relationship types - follow vs block vs …

    • ex: remote desktop vs regular login

Watch the GCN and RGCN YouTube videos listed at the end for theoretical intuition
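To make the relational idea concrete, here is a minimal pure-Python sketch of a single RGCN-style layer: each node keeps a self-transformed copy of its own features and adds the mean of its neighbors' features, transformed by a weight matrix specific to each relationship type, followed by a ReLU. The toy graph, feature values, and weight matrices are hypothetical; real models learn the weights and would use DGL + PyTorch rather than Python lists.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rgcn_layer(h, edges_by_rel, W_rel, W_self):
    """One RGCN-style layer.

    h: dict node -> feature vector
    edges_by_rel: dict relation -> list of (src, dst); messages flow src -> dst
    W_rel: dict relation -> weight matrix; W_self: self-loop weight matrix
    """
    # Self-loop term: every node transforms its own features.
    out = {node: matvec(W_self, feat) for node, feat in h.items()}
    for rel, edges in edges_by_rel.items():
        incoming = {}
        for src, dst in edges:
            incoming.setdefault(dst, []).append(src)
        for dst, srcs in incoming.items():
            norm = 1.0 / len(srcs)  # mean over neighbors for this relation
            for src in srcs:
                msg = matvec(W_rel[rel], h[src])
                out[dst] = [o + norm * m for o, m in zip(out[dst], msg)]
    # ReLU non-linearity.
    return {node: [max(0.0, v) for v in vec] for node, vec in out.items()}


# Hypothetical toy graph: two users reach one server via different login types.
h = {"alice": [1.0, 0.0], "bob": [0.0, 1.0], "srv": [0.0, 0.0]}
edges_by_rel = {
    "regular": [("alice", "srv")],
    "remote_desktop": [("bob", "srv")],
}
identity = [[1.0, 0.0], [0.0, 1.0]]
W_rel = {"regular": identity, "remote_desktop": [[2.0, 0.0], [0.0, 2.0]]}

h_next = rgcn_layer(h, edges_by_rel, W_rel, W_self=identity)
print(h_next["srv"])
```

Because the remote desktop relation has its own weights, the server's updated features distinguish who reached it by which login type; stacking a second layer would let information travel two hops.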

5. Try it yourself

See:

  • SSH logs RGCN anomaly detector in a few cells: simple-ssh-logs-rgcn-anomaly-detector.ipynb

  • In-depth RGCN: advanced-identity-protection-40m.ipynb

6. Taking it to production

Watch the repo / contact to join us on:

  • Daily batch / real-time alerting => Splunk

  • Scaling & autonomous operation

  • Tuning: Time data, common FPs (new IPs, ..), …

  • Use for correlation ID generation for investigation context (see tomorrow’s UMAP talk!)

Next steps

  • SSH logs RGCN anomaly detector in a few cells: simple-ssh-logs-rgcn-anomaly-detector.ipynb

  • In-depth RGCN: advanced-identity-protection-40m.ipynb

  • UMAP demo for 97% alert volume reduction & alert correlation

  • PyGraphistry (py, oss) + Graphistry Hub (free)

    • Dashboarding with graph-app-kit (containerized, gpu, graph Streamlit)

  • Happy to help:

    • Join our Slack

    • email and let’s chat! info@graphistry.com

Resource

  • PyGraphistry[AI]

  • What is graph intelligence

  • GNN Videos:

    • GCN - https://www.youtube.com/watch?v=2KRAOZIULzw

    • RGCN - https://www.youtube.com/watch?v=wJQQFUcHO5U

    • Euler (combining RNN + GNN) - https://www.youtube.com/watch?v=1t124vguwJ8

By Graphistry, Inc.

© Copyright 2024, Graphistry, Inc.