Fig. 4

Self-attention calculation process. a Function a computes the pairwise similarity between drugs. b The softmax function normalizes the similarity scores into proportions. c For a given compound, matrix multiplication incorporates compound similarity features into the weight layer. d The \(W'\) layer is updated during training and ultimately reflects the contribution of genetic mutations to each drug’s sensitivity. Multiplying the \(W\) layer by the softmax matrix takes drug–drug similarity into account.
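
The steps in panels a to d can be illustrated with a minimal NumPy sketch. The caption does not specify the form of function a, the tensor shapes, or exactly how \(W'\) relates to \(W\), so everything below is an assumption made for illustration: a dot product stands in for the similarity function, `drug_features` is a hypothetical drug descriptor matrix, `W` is taken to hold one row of gene weights per drug, and `W_prime` is taken to be the product of the softmax matrix and `W`.

```python
import numpy as np

def softmax_rows(scores):
    """Row-wise softmax; subtracting the row maximum keeps exp() stable."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def similarity_weighted_layer(drug_features, W):
    """Mix per-drug gene weights according to drug-drug similarity.

    drug_features: (n_drugs, n_features), hypothetical drug descriptors.
    W:             (n_drugs, n_genes), assumed per-drug gene weight layer.
    """
    # Panel a: pairwise similarity (dot product is an assumed stand-in).
    scores = drug_features @ drug_features.T   # (n_drugs, n_drugs)
    # Panel b: convert each row of scores into proportions that sum to 1.
    attention = softmax_rows(scores)
    # Panels c and d: each drug's weights become a similarity-weighted
    # combination of the weights of all drugs.
    W_prime = attention @ W                    # (n_drugs, n_genes)
    return attention, W_prime

rng = np.random.default_rng(0)
drug_features = rng.normal(size=(4, 8))   # 4 drugs, 8 descriptor features
W = rng.normal(size=(4, 6))               # 4 drugs, 6 genes
attention, W_prime = similarity_weighted_layer(drug_features, W)
print(W_prime.shape)  # (4, 6)
```

In a trained model the weight layer would be a learnable parameter updated by backpropagation; the random values here only demonstrate the shapes and the order of operations.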