nichecompass.data.edge_level_split
- nichecompass.data.edge_level_split(data, edge_label_adj, val_ratio=0.1, test_ratio=0.0, is_undirected=True, neg_sampling_ratio=0.0)
Split a PyG Data object into training, validation and test PyG Data objects using an edge-level split. The training split does not include edges in the validation and test splits and the validation split does not include edges in the test split. However, nodes will not be split and all node features will be accessible from all splits.
See https://github.com/pyg-team/pytorch_geometric/issues/3668 for more context on how RandomLinkSplit works.
- Parameters:
data (Data) – PyG Data object to be split.
edge_label_adj (Optional[csr_matrix]) – Adjacency matrix which contains edges for edge reconstruction. If `None`, uses the 'normal' adjacency matrix used for message passing.
val_ratio (float (default: 0.1)) – Ratio of edges to be included in the validation split.
test_ratio (float (default: 0.0)) – Ratio of edges to be included in the test split.
is_undirected (bool (default: True)) – If `True`, the graph is assumed to be undirected, and positive and negative samples will not leak (reverse) edge connectivity across different splits. This is set to `False`, as there is an issue with replication of self loops.
neg_sampling_ratio (float (default: 0.0)) – Ratio of negative sampling. This should be set to 0 if negative sampling is done by the dataloader.
- Return type:
Tuple[Data, Data, Data]
- Returns:
- train_data:
Training PyG Data object.
- val_data:
Validation PyG Data object.
- test_data:
Test PyG Data object.
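
The idea of an edge-level split described above can be illustrated with a small, library-free sketch. This is not the NicheCompass or PyG implementation; the helper name `toy_edge_level_split`, the `seed` argument, and the toy graph are assumptions made only for illustration. It shows how undirected edges can be partitioned into disjoint training, validation and test sets while every node stays available to all splits.

```python
import random
from typing import List, Tuple

Edge = Tuple[int, int]


def toy_edge_level_split(
    edges: List[Edge],
    val_ratio: float = 0.1,
    test_ratio: float = 0.0,
    seed: int = 0,  # hypothetical argument, only here for reproducibility
) -> Tuple[List[Edge], List[Edge], List[Edge]]:
    """Partition undirected edges into disjoint train/val/test lists."""
    # Store each undirected edge once as (min, max), so an edge and its
    # reverse can never end up in two different splits.
    unique = sorted({(min(u, v), max(u, v)) for u, v in edges})
    rng = random.Random(seed)
    rng.shuffle(unique)

    n_val = int(len(unique) * val_ratio)
    n_test = int(len(unique) * test_ratio)

    val_edges = unique[:n_val]
    test_edges = unique[n_val:n_val + n_test]
    train_edges = unique[n_val + n_test:]
    return train_edges, val_edges, test_edges


# Toy graph with 6 nodes and 10 undirected edges, listed in both
# directions as is common for an undirected edge list.
pairs = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5),
         (5, 0), (0, 3), (1, 4), (2, 5), (0, 2)]
edges = pairs + [(v, u) for u, v in pairs]

train_edges, val_edges, test_edges = toy_edge_level_split(
    edges, val_ratio=0.2, test_ratio=0.1
)
print(len(train_edges), len(val_edges), len(test_edges))
```

Only the edges are partitioned here; the node set is untouched, which mirrors the statement that node features remain accessible from all splits. Which edges each split actually uses for message passing is handled by RandomLinkSplit and is not modelled in this sketch.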