Fig. 2 | BMC Bioinformatics

Fig. 2

From: CrossFeat: a transformer-based cross-feature learning model for predicting drug side effect frequency

Schematic of the CNN and the cross-feature learning (feature-wise cross-attention) mechanism in the CrossFeat architecture. (A) An \(l\times l\) embedding matrix is passed through four convolutional layers, each consisting of a Conv2D operation, batch normalization, and a ReLU activation, followed by mean pooling to extract feature matrices. These feature matrices are then fed into the transformer encoder. (B) The queries (Q) are derived from the previous sublayer of the drug encoder, while the keys (K) and values (V) are obtained from the first sublayer of the side effect encoder. Attention scores are calculated as the dot product of the queries and keys and are then passed through a softmax function to generate the attention weights. These weights are multiplied by the values to produce the output. This cross-attention fuses the drug and side effect features, which helps the model capture the complex relationships between drugs and their side effects.
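
The cross-attention step in panel B can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `cross_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and all shapes are assumptions chosen for illustration. It follows the caption literally (dot product, softmax, weighted sum of values) and therefore omits the \(1/\sqrt{d_k}\) scaling and the multi-head split commonly used in transformers, which the caption does not mention.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(drug_feats, side_feats, w_q, w_k, w_v):
    """Single-head cross-attention as described in the caption.

    drug_feats: (n_drug, d_model) output of the previous drug-encoder sublayer.
    side_feats: (n_side, d_model) output of the side effect encoder sublayer.
    w_q, w_k, w_v: (d_model, d_head) projection matrices (hypothetical names).
    """
    q = drug_feats @ w_q   # queries come from the drug side
    k = side_feats @ w_k   # keys come from the side effect side
    v = side_feats @ w_v   # values come from the side effect side
    scores = q @ k.T                     # dot-product attention scores
    weights = softmax(scores, axis=-1)   # each query's weights sum to 1
    return weights @ v, weights          # weighted sum of the values


# Illustrative shapes only; none of these sizes come from the paper.
rng = np.random.default_rng(0)
n_drug, n_side, d_model, d_head = 4, 6, 8, 5
drug_feats = rng.normal(size=(n_drug, d_model))
side_feats = rng.normal(size=(n_side, d_model))
w_q = rng.normal(size=(d_model, d_head))
w_k = rng.normal(size=(d_model, d_head))
w_v = rng.normal(size=(d_model, d_head))

output, weights = cross_attention(drug_feats, side_feats, w_q, w_k, w_v)
```

Here `output` has one row per drug feature vector, and each row is a mixture of side effect values, which is the sense in which the two feature sets are fused.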
