Graph node classification: Stacked GCN layers seem not to learn anything (highly imbalanced data). #9614

nasrinsaalehi · 2024-08-21T12:11:25Z

I have a graph with nearly 100 nodes. I duplicated this graph 200 times, and each time I duplicate it, I concatenate a new feature to every node's feature vector. This gives 100*200 nodes with different labels. If the batch size of my model is set to 3, I process 2 of these graphs in my data loader. The data is also highly imbalanced: only about 0.1 of the samples are positive. To address this, I use PolyLoss, a more general version of FocalLoss:

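import torch
import torch.nn as nn
import torch.nn.functional as F

class PolyLoss(nn.Module):
    def __init__(self, weight_loss, DEVICE, epsilon=1.0):
        super(PolyLoss, self).__init__()
        # Class-weighted cross-entropy, kept per sample; the mean is taken in forward()
        self.CELoss = nn.CrossEntropyLoss(weight=weight_loss, reduction='none')
        self.epsilon = epsilon
        self.DEVICE = DEVICE

    def forward(self, predicted, labels):
        # Drop unlabeled nodes (label == -1)
        predicted = predicted[labels.reshape(-1) != -1, :]
        labels = labels[labels != -1].long()
        # pt: softmax probability assigned to the true class (2 classes are hard-coded here)
        one_hot = torch.zeros((len(labels), 2), device=self.DEVICE).scatter_(1, torch.unsqueeze(labels, dim=-1), 1)
        pt = torch.sum(one_hot * F.softmax(predicted, dim=1), dim=-1)
        ce = self.CELoss(predicted, labels)
        # Poly-1 loss: cross-entropy plus epsilon * (1 - pt)
        poly1 = ce + self.epsilon * (1-pt)
        return torch.mean(poly1)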
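
To make explicit what this loss computes for a single node, here is a minimal pure-Python sketch of the Poly-1 value. It assumes that `nn.CrossEntropyLoss(weight=..., reduction='none')` returns `-weight[y] * log(p_y)` per sample; the helper name `poly1_single` is hypothetical and not part of my model.

```python
import math

def poly1_single(logits, label, class_weights, epsilon=1.0):
    """Poly-1 loss for one node: weighted CE + epsilon * (1 - p_t).

    Hypothetical helper for illustration only; it mirrors the per-sample
    value that PolyLoss.forward averages with torch.mean.
    """
    # Numerically stable softmax over the class logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    pt = exps[label] / sum(exps)               # probability of the true class
    ce = -class_weights[label] * math.log(pt)  # class-weighted cross-entropy
    return ce + epsilon * (1.0 - pt)

# Uninformative logits: p_t = 0.5, so the loss is log(2) + 0.5 * epsilon.
uncertain = poly1_single([0.0, 0.0], label=1, class_weights=[1.0, 1.0])
# A confident, correct prediction drives both terms towards zero.
confident = poly1_single([-5.0, 5.0], label=1, class_weights=[1.0, 1.0])
print(uncertain, confident)
```

In this formulation the class weight scales only the cross-entropy term; the `epsilon * (1 - pt)` term is the same for both classes.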

I need to do binary node classification on these graphs. My accuracy, AUC, precision, recall, and F-score stop improving after epoch 200 and are as follows:

accuracy = 0.7795,
AUC= 0.7766, 
f-score = 0.6648, 
precision = [0.89556864 , 0.58515057], 
recall = [0.78341014 , 0.76980874]
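
As a sanity check on these numbers, the sketch below recomputes the per-class F1 from precision and recall, and the positive fraction of the evaluated nodes that they imply. It assumes that the two-element lists are ordered [class 0, class 1] and that all metrics were computed on the same set of nodes; neither is stated above.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Values copied from the results above; the [class 0, class 1] order is an assumption.
precision = [0.89556864, 0.58515057]
recall = [0.78341014, 0.76980874]

f1_neg = f1(precision[0], recall[0])
f1_pos = f1(precision[1], recall[1])

# Positive-class fraction p implied by the metrics, from
# precision_1 = r1 * p / (r1 * p + (1 - r0) * (1 - p)), solved for p.
r0, r1, p1 = recall[0], recall[1], precision[1]
implied_pos_fraction = p1 * (1 - r0) / (r1 * (1 - p1) + p1 * (1 - r0))

# Accuracy reconstructed from the implied class balance.
implied_accuracy = r0 * (1 - implied_pos_fraction) + r1 * implied_pos_fraction

print(round(f1_neg, 4), round(f1_pos, 4), round(implied_pos_fraction, 3))
```

Under those assumptions the class-1 F1 comes out close to the reported f-score, and the implied positive fraction is nearer to 0.28 than to 0.1, which may mean that the evaluated nodes are less imbalanced than the full data.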

To check whether my model is learning, I use t-SNE with 2 dimensions. I take the last two layers of my model and visualize the output of the second-to-last layer. The visualization shows that the model cannot separate positive and negative samples, and most of the negative samples are wrongly predicted as positive because of the effect of PolyLoss.
I have tried changing other hyperparameters: different batch sizes, other losses (including focal loss and weighted cross-entropy loss), different learning rates and learning-rate schedules, various normalization techniques, dropout, different numbers of GCN layers, etc.
Note that my original graph has float values as the relations between each pair of nodes, and I use a threshold of 0.6 to convert it to a binary-relation graph (a networkx graph):

        with open(graph_filename + '.txt', 'r') as file:
            # Read all lines from the file
            lines = file.readlines()

        # Create an empty NetworkX graph
        G = nx.Graph()
        node_id_feature_dict = get_node_id_feature_dict_dataset(dataset)
        node_ids = list(node_id_feature_dict.keys())
        # Iterate over the lines
        for i, line in enumerate(lines):
            # Split the line by whitespace to get the values
            values = line.strip().split()
            for j, value in enumerate(values):
                # Convert the value to float
                weight = float(value)
                # graph does not contain self-loops (they are added in build_data function)
                if node_ids[i] == node_ids[j]:
                    continue
                if G.has_edge(node_ids[i], node_ids[j]):
                    continue
                if weight > 0.6:
                    G.add_node(node_ids[i])
                    G.add_node(node_ids[j])
                    G.add_edge(node_ids[i], node_ids[j], weight=weight)
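
One possible side effect of the loop above is that a node whose weights are all at or below the threshold never reaches `G.add_node`, so it would be missing from the graph unless it is added elsewhere. The following pure-Python sketch thresholds the upper triangle once and keeps every node id, including isolated ones. It assumes a square, symmetric weight matrix given as a list of lists; the name `threshold_edges` and the toy matrix are hypothetical.

```python
def threshold_edges(weights, node_ids, threshold=0.6):
    """Return (nodes, edges) for an undirected graph from a dense weight matrix.

    Hypothetical helper: `weights` is a square list of lists, assumed symmetric,
    so only the upper triangle (j > i) is scanned and self-loops are skipped.
    """
    nodes = list(node_ids)  # keep every node, even if it ends up isolated
    edges = {}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            w = float(weights[i][j])
            if w > threshold:
                edges[(nodes[i], nodes[j])] = w
    return nodes, edges

# Toy 3-node example: only the a-b weight exceeds 0.6, so c is isolated.
toy = [[1.0, 0.9, 0.2],
       [0.9, 1.0, 0.5],
       [0.2, 0.5, 1.0]]
nodes, edges = threshold_edges(toy, ["a", "b", "c"])
print(nodes, edges)
```

The result could be loaded into networkx with `G.add_nodes_from(nodes)` followed by `G.add_edge(u, v, weight=w)` for each entry.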

After building the networkx graph, I concatenate the node features with the new feature into a feature tensor. Then I convert each graph to a Data object containing node features, labels, and edge index, and pass batches of these graphs to my model:

        edge_list, edge_feature_list = [], []

        # Iterate over the edges in the graph (self-loops are not added here; see the commented-out block below)
        for u, v in self.graph.edges():
            # Map the node ids to integer indices
            node1, node2 = self.node_to_index_dict[u], self.node_to_index_dict[v]
            # Get the edge weight (0.0 if no weight is provided)
            edge_weight = self.graph[u][v].get('weight', 0.0)

            # Append the edge to the edge list (one direction only; the reverse edge is commented out)
            edge_list.append([node1, node2])
            # edge_list.append([node2, node1])

            # Append the edge weight to the edge feature list
            edge_feature_list.append(edge_weight)
            # edge_feature_list.append(edge_weight)

        # for u in self.graph.nodes():
        #     node = self.node_to_index_dict[u]
        #     edge_list.append([node, node])
        #     edge_feature_list.append(1.0)

        # Convert the edge list and edge feature list to PyTorch tensors
        self.edge_index = torch.tensor(edge_list, dtype=torch.int64).t().contiguous()
        self.edge_weight = torch.tensor(edge_feature_list, dtype=torch.float)

My question is whether there is any issue with my code, and how I can get my model to learn from this data.


Replies: 0 comments
