Code and data for the KDD2020 paper "Learning Opinion Dynamics From Social Traces"

Cleansing Wikipedia Categories using Centrality

WoMG: Word of Mouth Generator

A simple Bayesian analysis of the role of social classes in Italian national elections of 2018-2019

Liquid FM: music recommendations with liquid democracy for Facebook

Vector Space Modelling for Humans

Learning Latent Category Matrix to Find Unexpected Relations

Reddit Rift: a replacement for the departed reddit toolbar, as a Firefox add-on.

A note sequencer to make music with Bezier curves!

issue closed

Question on semi-supervised part

Hello!

I am looking for graph neural network datasets with partially missing labels. As far as I understand from your paper, the WikiCS dataset can be used for node label prediction. If I interpret "semi-supervised" correctly in this context, a network trained on this dataset learns from samples both with and without labels. Am I right? Assuming I am, I tried to find such samples in the data but could not find them easily, so I was a bit confused.

Thanks for your answer!

issue comment

Oh OK, I see! The training mask does not only mask samples as a whole; it also masks just some of the labels, thus describing a certain pattern of missingness. Thanks!

issue comment

Hi! Yes, you are correct. In semi-supervised node classification, the entire graph is normally known at training time, but only a subset of the labels is available. For WikiCS, all node labels are stored in a single vector, and separate binary vectors specify the train/val/test masks. At training time, only the labels allowed by the train mask are used.
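To make the layout above concrete, here is a minimal sketch in plain Python. The values are made-up toy data, not taken from the WikiCS files, and the helper name `visible_labels` is hypothetical; it only illustrates how one label vector plus separate boolean masks could be combined.

```python
# Toy illustration only: these values are invented, not read from WikiCS.
labels = [2, 0, 1, 1, 0, 2]  # one class label per node, all stored in one vector

# Separate boolean masks, one entry per node.
train_mask = [True, True, False, False, False, False]
val_mask = [False, False, True, False, False, False]
test_mask = [False, False, False, True, True, True]


def visible_labels(labels, mask):
    """Return {node_index: label} for the nodes the mask allows."""
    return {i: y for i, (y, keep) in enumerate(zip(labels, mask)) if keep}


# Labels that may contribute to the training loss.
train_labels = visible_labels(labels, train_mask)

# These nodes remain part of the graph at training time;
# only their labels are withheld from the training loss.
unlabeled_nodes = [i for i, keep in enumerate(train_mask) if not keep]

print(train_labels)
print(unlabeled_nodes)
```

In this toy setup the "samples without labels" are not stored separately: they are simply the nodes whose entry in the train mask is false.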

By the way, note that this dataset (along with many others) has been included in the torch_geometric GNN library, so you might find it convenient to just load it from there: https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html#torch_geometric.datasets.WikiCS

Cheers, Peter

issue opened

