Cross-attention is a way to merge two token or embedding sequences in the transformer architecture.
- POST: https://vaclavkosar.com/ml/cross-attention-in-transformer-architecture
- Self-attention: https://vaclavkosar.com/ml/transformers-self-attention-mechanism-simplified
- Feed-forward layer: https://vaclavkosar.com/ml/Feed-Forward-Self-Attendion-Key-Value-Memory
Cross-attention is very similar to self-attention, except that it combines two sequences asymmetrically. One of the sequences serves as the query input, while the other provides the keys and values (see the sketch after the list below).
- an attention mechanism in the Transformer architecture that mixes two different embedding sequences
- the two sequences can come from different modalities (e.g. text, image, sound)
- one of the sequences defines the output length and dimensions, since it plays the role of the query
- when the other sequence is static, cross-attention becomes similar to a feed-forward layer acting as a key-value memory
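To make the asymmetry concrete, here is a minimal single-head sketch in NumPy. The sequence names, dimensions, and weight matrices are illustrative assumptions, not taken from the post; the point is that the query sequence determines the output length, while the other sequence only supplies keys and values.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(seq_q, seq_kv, w_q, w_k, w_v):
    """Single-head cross-attention: queries come from seq_q,
    keys and values from seq_kv. Output length follows seq_q."""
    q = seq_q @ w_q                            # (len_q, d_k)
    k = seq_kv @ w_k                           # (len_kv, d_k)
    v = seq_kv @ w_v                           # (len_kv, d_v)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (len_q, len_kv)
    weights = softmax(scores, axis=-1)         # each query attends over seq_kv
    return weights @ v                         # (len_q, d_v)

# Hypothetical example: a 5-token text sequence attends over 9 image patches.
d_model, d_k, d_v = 16, 8, 8
rng = np.random.default_rng(0)
text = rng.normal(size=(5, d_model))           # plays the role of the query
image = rng.normal(size=(9, d_model))          # supplies keys and values
w_q, w_k, w_v = (rng.normal(size=(d_model, d)) for d in (d_k, d_k, d_v))

out = cross_attention(text, image, w_q, w_k, w_v)
print(out.shape)  # (5, 8): output length matches the query (text) sequence
```

If the key-value input were a fixed, learned matrix instead of a second sequence, the same computation would reduce to something close to a feed-forward key-value memory, which is the analogy drawn in the feed-forward layer post linked above.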