## Graph Attention Networks

In GCNs, for a target node $i$, the importance of a neighbor $j$ is determined by the weight of their edge $A_{i j}$ (normalized by their node degrees). In practice, however, the input graph may be noisy, and the edge weights may not reflect the true strength of the relationship between two nodes. A more principled approach is therefore to learn the importance of each neighbor automatically. Graph Attention Networks (GAT) (Veličković et al, 2018) build on this idea and learn the importance of each neighbor with the attention mechanism (Bahdanau et al, 2015; Vaswani et al, 2017). Attention mechanisms have been widely used in a variety of tasks in natural language understanding (e.g. machine translation and question answering) and computer vision (e.g. visual question answering and image captioning). Next, we introduce how attention is used in graph neural networks.

Graph Attention Layer. The graph attention layer defines how to transform the hidden node representations at layer $k-1$ (denoted as $H^{k-1} \in \mathbb{R}^{N \times F}$) into the new node representations $H^{k} \in \mathbb{R}^{N \times F^{\prime}}$. To guarantee sufficient expressive power when transforming the lower-level node representations into higher-level ones, a shared linear transformation, denoted as $W \in \mathbb{R}^{F \times F^{\prime}}$, is applied to every node. Afterwards, self-attention is defined on the nodes: a shared attentional mechanism $a: \mathbb{R}^{F^{\prime}} \times \mathbb{R}^{F^{\prime}} \rightarrow \mathbb{R}$ computes an attention coefficient for any pair of nodes:
$$e_{i j}=a\left(W H_{i}^{k-1}, W H_{j}^{k-1}\right) .$$
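The coefficient computation above can be sketched in NumPy. The text leaves the mechanism $a$ unspecified; this sketch assumes a single-layer feedforward form, a weight vector applied to the concatenation of the two transformed node vectors followed by a LeakyReLU, in the style of Veličković et al (2018). The names `attention_logits`, `a_vec`, and `negative_slope` are hypothetical, and nodes are stored as rows, so the shared transformation is written as `H @ W`.

```python
import numpy as np

def attention_logits(H, W, a_vec, negative_slope=0.2):
    """Unnormalized attention coefficients e_ij for all node pairs.

    H:     (N, F)   node representations at layer k-1, one row per node
    W:     (F, F')  shared linear transformation
    a_vec: (2F',)   weights of the ASSUMED single-layer attention mechanism
    """
    Z = H @ W                      # (N, F') transformed node representations
    f_out = Z.shape[1]
    # a_vec . [z_i || z_j] splits into a term for z_i and a term for z_j
    src = Z @ a_vec[:f_out]        # (N,) contribution of the target node i
    dst = Z @ a_vec[f_out:]        # (N,) contribution of the neighbor j
    e = src[:, None] + dst[None, :]            # (N, N) via broadcasting
    return np.where(e > 0, e, negative_slope * e)  # LeakyReLU

# Toy inputs (random values, for illustration only)
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 3))        # N = 4 nodes, F = 3
W = rng.normal(size=(3, 2))        # F' = 2
a_vec = rng.normal(size=(4,))      # 2 * F'
E = attention_logits(H, W, a_vec)  # E[i, j] is e_ij
```

The sketch scores every pair of nodes; restricting the scores to actual neighbors would require masking `E` with the adjacency structure of the graph.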

## Neural Message Passing Networks

Another very popular graph neural network architecture is the Neural Message Passing Network (MPNN) (Gilmer et al, 2017), which was originally proposed for learning molecular graph representations. However, MPNN is actually very general: it provides a general framework for graph neural networks and can also be used for node classification. The essential idea of MPNN is to formulate existing graph neural networks as neural message passing among nodes. In MPNNs, there are two important functions, the Message function and the Updating function:
$$\begin{gathered} m_{i}^{k}=\sum_{j \in N(i)} M_{k}\left(H_{i}^{k-1}, H_{j}^{k-1}, e_{i j}\right), \\ H_{i}^{k}=U_{k}\left(H_{i}^{k-1}, m_{i}^{k}\right) . \end{gathered}$$
$M_{k}(\cdot, \cdot, \cdot)$ defines the message between nodes $i$ and $j$ in the $k$-th layer, which depends on the two node representations and the information on their edge, $e_{i j}$. $U_{k}$ is the node updating function in the $k$-th layer, which combines the aggregated messages from the neighbors with the node's own representation. The MPNN framework is therefore very similar to the general framework introduced in Section 4.2.1: the AGGREGATE function is simply a summation of all the messages from the neighbors, and the COMBINE function is the same as the node Updating function.
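As a minimal sketch of one message passing layer, the NumPy code below sums the messages from each node's neighbors and then applies the update function. The function name `mpnn_layer`, the edge-list layout, and the particular message and update functions in the toy example are assumptions made for illustration; the framework itself leaves $M_k$ and $U_k$ open, and in practice they are typically learned.

```python
import numpy as np

def mpnn_layer(H, edges, edge_feats, message_fn, update_fn, msg_dim):
    """One message passing step with sum aggregation.

    H:          (N, F) node representations at layer k-1
    edges:      list of (i, j) pairs, meaning j is a neighbor of i
    edge_feats: one edge feature e_ij per entry of `edges`
    message_fn: plays the role of M_k(H_i, H_j, e_ij)
    update_fn:  plays the role of U_k(H_i, m_i)
    msg_dim:    length of the vectors returned by message_fn
    """
    n_nodes = H.shape[0]
    m = np.zeros((n_nodes, msg_dim))
    for (i, j), e_ij in zip(edges, edge_feats):
        m[i] += message_fn(H[i], H[j], e_ij)   # AGGREGATE: sum over j in N(i)
    # COMBINE: update every node from its own state and its aggregated message
    return np.stack([update_fn(H[i], m[i]) for i in range(n_nodes)])

# Hypothetical message and update functions, chosen only to keep the toy readable
def toy_message(h_i, h_j, e_ij):
    return e_ij * h_j            # neighbor representation scaled by the edge weight

def toy_update(h_i, m_i):
    return h_i + m_i             # add the aggregated message to the node state

# Path graph 0 - 1 - 2, with each undirected edge listed in both directions
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
edge_feats = [1.0, 1.0, 0.5, 0.5]
H_next = mpnn_layer(H, edges, edge_feats, toy_message, toy_update, msg_dim=2)
# H_next is [[1.0, 1.0], [1.5, 1.5], [1.0, 1.5]]
```

Node 1, for example, receives $1.0 \cdot H_0 + 0.5 \cdot H_2 = (1.5, 0.5)$ as its aggregated message, which the toy update adds to its own state $(0, 1)$.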

