## General Framework of Graph Neural Networks

The essential idea of graph neural networks is to iteratively update each node's representation by combining its own representation with the representations of its neighbors. In this section, we introduce the general framework of graph neural networks given in (Xu et al., 2019d). Starting from the initial node representation $H^{0}=X$, each layer has two important functions:

• AGGREGATE, which tries to aggregate the information from the neighbors of each node;
• COMBINE, which tries to update the node representations by combining the aggregated information from neighbors with the current node representations.
Mathematically, we can define the general framework of graph neural networks as follows:
Initialization: $H^{0}=X$

For $k=1,2, \cdots, K$,

$$\begin{array}{r} a_{v}^{k}=\operatorname{AGGREGATE}^{k}\left\{H_{u}^{k-1}: u \in N(v)\right\} \\ H_{v}^{k}=\operatorname{COMBINE}^{k}\left\{H_{v}^{k-1}, a_{v}^{k}\right\}, \end{array}$$

where $N(v)$ is the set of neighbors of the $v$-th node. The node representations $H^{K}$ in the last layer can be treated as the final node representations.
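The iteration above can be sketched in a few lines of plain Python. This is only an illustration: the framework leaves AGGREGATE and COMBINE unspecified, so the element-wise mean and the simple average used here, along with the names `mean_aggregate`, `average_combine`, and `run_gnn`, are assumptions chosen for readability rather than anything prescribed by the text.

```python
def mean_aggregate(neighbor_reps, dim):
    """AGGREGATE (illustrative choice): element-wise mean over the neighbors."""
    if not neighbor_reps:
        return [0.0] * dim  # isolated node: nothing to aggregate
    return [sum(col) / len(neighbor_reps) for col in zip(*neighbor_reps)]


def average_combine(own_rep, aggregated):
    """COMBINE (illustrative choice): average of own and aggregated vectors."""
    return [(h + a) / 2.0 for h, a in zip(own_rep, aggregated)]


def run_gnn(features, neighbors, num_layers):
    """Apply K rounds of AGGREGATE / COMBINE, starting from H^0 = X."""
    h = {v: list(x) for v, x in features.items()}  # H^0 = X
    dim = len(next(iter(features.values())))
    for _ in range(num_layers):
        # The comprehension reads only the previous layer's h, so every
        # node in layer k is computed from H^{k-1}.
        h = {
            v: average_combine(
                h[v], mean_aggregate([h[u] for u in neighbors[v]], dim)
            )
            for v in h
        }
    return h  # H^K


# Toy path graph 0 - 1 - 2 with one-dimensional features.
features = {0: [1.0], 1: [0.0], 2: [3.0]}
neighbors = {0: [1], 1: [0, 2], 2: [1]}
final = run_gnn(features, neighbors, num_layers=1)
print(final)
```

With one layer, node 1 averages its own value with the mean of its two neighbors, which shows how information from $N(v)$ reaches $H_{v}^{k}$. Practical models replace both functions with learnable, differentiable ones.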

Once we have the node representations, they can be used for downstream tasks. Taking node classification as an example, the label of node $v$ (denoted as $\hat{y}_{v}$) can be predicted through a Softmax function, i.e.,

$$\hat{y}_{v}=\operatorname{Softmax}\left(W H_{v}^{\top}\right),$$

where $W \in \mathbb{R}^{|\mathscr{L}| \times F}$ and $|\mathscr{L}|$ is the number of labels in the output space.
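As a small worked example of this prediction step, the sketch below computes $\operatorname{Softmax}(W H_{v}^{\top})$ for a single node in plain Python. The weight matrix, the node representation, and the function names are made-up illustrations (here $|\mathscr{L}|=3$ and $F=2$), not values from the text.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label_distribution(W, h_v):
    """Return Softmax(W h_v) for one node; W has one row per label."""
    scores = [sum(w * h for w, h in zip(row, h_v)) for row in W]
    return softmax(scores)


# Hypothetical values: 3 labels, 2-dimensional final node representation.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h_v = [2.0, -1.0]

y_hat = predict_label_distribution(W, h_v)
predicted = max(range(len(y_hat)), key=y_hat.__getitem__)
print(y_hat, predicted)
```

The entries of `y_hat` are non-negative and sum to one, so they can be read as a distribution over the $|\mathscr{L}|$ labels; the predicted label is the index with the largest probability.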

## Graph Convolutional Networks

We start with the graph convolutional network (GCN) (Kipf and Welling, 2017b), which is now the most popular graph neural network architecture owing to its simplicity and its effectiveness in a variety of tasks and applications. Specifically, the node representations in each layer are updated according to the following propagation rule:
$$H^{k+1}=\sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{k} W^{k}\right) .$$
Here $\tilde{A}=A+\mathbf{I}$ is the adjacency matrix of the given undirected graph $\mathscr{G}$ with self-connections added, which allows each node's own features to be incorporated when its representation is updated. $\mathbf{I} \in \mathbb{R}^{N \times N}$ is the identity matrix, and $\tilde{D}$ is a diagonal matrix with $\tilde{D}_{i i}=\sum_{j} \tilde{A}_{i j}$. $\sigma(\cdot)$ is an activation function such as ReLU or Tanh; the widely used ReLU activation function is defined as $\operatorname{ReLU}(x)=\max (0, x)$. $W^{k} \in \mathbb{R}^{F \times F^{\prime}}$ (where $F$ and $F^{\prime}$ are the dimensions of the node representations in the $k$-th and $(k+1)$-th layers, respectively) is a layer-wise linear transformation matrix, which is trained during optimization.
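The propagation rule can be written almost verbatim with NumPy. The following is a minimal sketch of a single layer using a dense adjacency matrix and ReLU; the function name `gcn_layer` and the toy graph, features, and weights are assumptions for illustration, and a practical implementation would typically use sparse matrices and learned weights.

```python
import numpy as np


def gcn_layer(A, H, W):
    """One GCN layer: ReLU(D~^{-1/2} A~ D~^{-1/2} H W) for a dense A."""
    A_tilde = A + np.eye(A.shape[0])      # add self-connections
    d_tilde = A_tilde.sum(axis=1)         # diagonal of D~ (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(d_tilde)
    # Scale row i by D~_ii^{-1/2} and column j by D~_jj^{-1/2}.
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return np.maximum(A_hat @ H @ W, 0.0)  # ReLU activation


# Toy undirected path graph 0 - 1 - 2.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
H0 = np.eye(3)          # one-hot input features, F = 3
W0 = np.ones((3, 2))    # hypothetical fixed weights, F' = 2

H1 = gcn_layer(A, H0, W0)
print(H1.shape)
```

Because $\tilde{A}$ contains the identity, every $\tilde{D}_{i i}$ is at least one, so the inverse square root is always defined, even for isolated nodes.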

We can further dissect Equation 4.5 to understand the AGGREGATE and COMBINE functions defined in GCN. For a node $i$, the node updating equation can be reformulated as follows:
$$\begin{gathered} H_{i}^{k}=\sigma\left(\sum_{j \in N(i) \cup\{i\}} \frac{\tilde{A}_{i j}}{\sqrt{\tilde{D}_{i i} \tilde{D}_{j j}}} H_{j}^{k-1} W^{k}\right) \\ H_{i}^{k}=\sigma\left(\sum_{j \in N(i)} \frac{A_{i j}}{\sqrt{\tilde{D}_{i i} \tilde{D}_{j j}}} H_{j}^{k-1} W^{k}+\frac{1}{\tilde{D}_{i i}} H_{i}^{k-1} W^{k}\right) \end{gathered}$$
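To check that the node-wise form agrees with the matrix propagation rule, the sketch below computes both on a small random example. Reading the neighbor sum as AGGREGATE and the addition of the self term as COMBINE is one interpretation of the second equation; the function names and the random test data are assumptions, and the input adjacency matrix is assumed to have no self-loops.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def gcn_matrix_form(A, H, W):
    """Matrix form: ReLU(D~^{-1/2} (A + I) D~^{-1/2} H W)."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return relu(d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :] @ H @ W)


def gcn_node_form(A, H, W):
    """Node-wise form; assumes A has a zero diagonal (no self-loops)."""
    n = A.shape[0]
    d_tilde = A.sum(axis=1) + 1.0  # D~_ii = degree plus the self-connection
    out = np.zeros((n, W.shape[1]))
    for i in range(n):
        # AGGREGATE: degree-normalized sum over the neighbors of node i.
        agg = np.zeros(H.shape[1])
        for j in np.flatnonzero(A[i]):
            agg += A[i, j] / np.sqrt(d_tilde[i] * d_tilde[j]) * H[j]
        # COMBINE: add the node's own representation scaled by 1 / D~_ii,
        # then apply the shared linear map and the activation.
        out[i] = relu((agg + H[i] / d_tilde[i]) @ W)
    return out


rng = np.random.default_rng(0)
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
H = rng.normal(size=(3, 4))   # random stand-in for H^{k-1}
W = rng.normal(size=(4, 2))   # random stand-in for W^k

same = np.allclose(gcn_matrix_form(A, H, W), gcn_node_form(A, H, W))
print(same)
```

The self term carries the weight $1 / \tilde{D}_{i i}$ because $\tilde{A}_{i i}=1$ and the normalization $\sqrt{\tilde{D}_{i i} \tilde{D}_{i i}}$ reduces to $\tilde{D}_{i i}$.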


