nichecompass.data.initialize_dataloaders
- nichecompass.data.initialize_dataloaders(node_masked_data, edge_train_data=None, edge_val_data=None, edge_batch_size=64, node_batch_size=64, n_direct_neighbors=-1, n_hops=1, shuffle=True, edges_directed=False, neg_edge_sampling_ratio=1.0)
Initialize edge-level and node-level training and validation dataloaders.
- Parameters:
node_masked_data (Data) – PyG Data object with node-level split masks.
edge_train_data (Optional[Data] (default: None)) – PyG Data object containing the edge-level training set.
edge_val_data (Optional[Data] (default: None)) – PyG Data object containing the edge-level validation set.
edge_batch_size (Optional[int] (default: 64)) – Batch size for the edge-level dataloaders.
node_batch_size (int (default: 64)) – Batch size for the node-level dataloaders.
n_direct_neighbors (int (default: -1)) – Number of sampled direct neighbors of the current batch nodes to include in the batch. Defaults to -1, which includes all direct neighbors.
n_hops (int (default: 1)) – Number of neighbor hops (levels) used when sampling neighbors of the nodes in the current batch. For example, 2 includes not only the sampled direct neighbors of the current batch nodes but also the sampled neighbors of those direct neighbors.
shuffle (bool (default: True)) – If True, shuffle the dataloaders.
edges_directed (bool (default: False)) – If False, both symmetric edge index pairs are included in the same edge-level batch (one edge has two symmetric edge index pairs).
neg_edge_sampling_ratio (float (default: 1.0)) – Ratio of sampled negative edges to positive edges. Negative sampling is currently implemented in an approximate way, i.e. the sampled negative edges may contain false negatives.
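The interplay of n_direct_neighbors and n_hops can be illustrated with a small pure-Python sketch. It is not the NicheCompass or PyG implementation: the helper `sample_neighborhood` and the toy `adjacency` dictionary are hypothetical names introduced only for illustration. In PyG, per-hop neighbor counts are usually passed to NeighborLoader as a `num_neighbors` list with one entry per hop; how this function builds that list is not stated here.

```python
import random


def sample_neighborhood(adjacency, seed_nodes, n_direct_neighbors=-1, n_hops=1, rng=None):
    """Hypothetical helper: collect nodes reached from seed_nodes by hop-wise sampling.

    n_direct_neighbors=-1 keeps all neighbors of a node; a positive value caps
    the number of neighbors sampled per node. n_hops sets how many levels of
    neighbors are expanded.
    """
    rng = rng or random.Random(0)
    visited = set(seed_nodes)
    frontier = list(seed_nodes)
    for _ in range(n_hops):
        next_frontier = []
        for node in frontier:
            neighbors = sorted(adjacency.get(node, ()))
            # Cap the neighbor count only when a positive limit is given.
            if n_direct_neighbors != -1 and len(neighbors) > n_direct_neighbors:
                neighbors = rng.sample(neighbors, n_direct_neighbors)
            for neighbor in neighbors:
                if neighbor not in visited:
                    visited.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return visited


# Toy undirected graph as an adjacency dictionary (illustrative data only).
adjacency = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1, 4], 4: [3]}

one_hop = sample_neighborhood(adjacency, [0], n_direct_neighbors=-1, n_hops=1)
two_hop = sample_neighborhood(adjacency, [0], n_direct_neighbors=-1, n_hops=2)
capped = sample_neighborhood(adjacency, [0], n_direct_neighbors=1, n_hops=1)
```

With all direct neighbors included, one hop from node 0 reaches nodes 1 and 2, and a second hop additionally reaches node 3. Capping n_direct_neighbors at 1 keeps node 0 plus a single sampled neighbor.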
- Return type:
dict
- Returns:
loader_dict: Dictionary containing the training and validation PyG LinkNeighborLoader objects (for edge reconstruction) and NeighborLoader objects (for gene expression reconstruction).
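The note that negative edge sampling is approximate can be made concrete with a short sketch. This is an assumption-laden illustration of one way approximate sampling can behave, not the code used by PyG or NicheCompass: the function `sample_negative_edges` and the toy edge list are hypothetical. Node pairs are drawn at random without checking them against the positive edges, so some sampled "negatives" may in fact be real edges.

```python
import random


def sample_negative_edges(num_nodes, positive_edges, neg_edge_sampling_ratio=1.0, seed=0):
    """Hypothetical helper: draw random node pairs as negative edges.

    The number of negatives is the number of positive edges times
    neg_edge_sampling_ratio. Pairs are NOT filtered against positive_edges,
    which is what makes the sampling approximate.
    """
    rng = random.Random(seed)
    n_negatives = int(len(positive_edges) * neg_edge_sampling_ratio)
    negatives = []
    for _ in range(n_negatives):
        source = rng.randrange(num_nodes)
        target = rng.randrange(num_nodes)
        negatives.append((source, target))
    return negatives


# Toy positive edges on a 4-node graph (illustrative data only).
positive_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
negatives = sample_negative_edges(
    num_nodes=4, positive_edges=positive_edges, neg_edge_sampling_ratio=2.0
)

# Any sampled pair that is actually a positive edge is a false negative.
positive_set = set(positive_edges)
false_negatives = [edge for edge in negatives if edge in positive_set]
```

Skipping the membership check keeps sampling cheap on large graphs. On sparse graphs the fraction of false negatives tends to be small, because most random node pairs are not connected.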