The neural network architecture behind today's large language models and a major improvement over the recurrent neural network (RNN). First presented in 2017 by eight Google researchers in the paper "Attention Is All You Need," the transformer handles relationships between words that are far apart in a sentence much more effectively than an RNN can. Its attention mechanism mathematically scores which words are most relevant to one another, and those scores determine how the model finds patterns in the data.
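To make the idea concrete, the following is a minimal NumPy sketch of scaled dot-product attention, the core formula from the paper. The toy token count, vector size and self-attention setup are illustrative assumptions, not details from this entry.

```python
# Minimal sketch of scaled dot-product attention (per "Attention Is All You Need").
# The 4 tokens and 8-dimensional vectors below are arbitrary toy values.
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V"""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # how strongly each token attends to every other token
    weights = softmax(scores, axis=-1)  # each row sums to 1: a token's "importance" distribution
    return weights @ V                  # weighted mix of the value vectors

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))   # 4 tokens, each an 8-dimensional vector
out = attention(x, x, x)      # self-attention: Q, K and V all derive from the same tokens
print(out.shape)              # (4, 8)
```

Because the attention weights compare every token with every other token directly, distance within the sentence does not weaken the connection, which is why transformers outperform RNNs on long-range dependencies.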
From Words to Tokens to Vectors
The input text is first cleaned by removing punctuation and symbols, then split into "tokens," and each token is converted into a vector (a mathematical representation). The model works entirely on vectors until it is asked to generate results, at which point the output tokens are decoded and formatted back into readable text with the proper punctuation.
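The sketch below illustrates that round trip under simplifying assumptions: a toy whole-word vocabulary and random embedding vectors stand in for the learned subword tokenizers and trained embeddings that real systems use.

```python
# Toy token/vector round trip. The whole-word vocabulary and random
# embeddings are assumptions for illustration; production tokenizers
# (e.g. subword schemes) and embeddings are learned from data.
import numpy as np

text = "attention is all you need"
vocab = {word: i for i, word in enumerate(sorted(set(text.split())))}
inverse = {i: word for word, i in vocab.items()}

# Encode: words -> integer tokens -> embedding vectors.
tokens = [vocab[w] for w in text.split()]
embeddings = np.random.default_rng(0).normal(size=(len(vocab), 8))
vectors = embeddings[tokens]   # one 8-dimensional vector per token

# The model computes on `vectors` throughout; only at output time are
# token ids decoded back into readable text.
decoded = " ".join(inverse[t] for t in tokens)
print(decoded)   # "attention is all you need"
```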
See AI token, GPT, machine learning and recurrent neural network.