ML Glossary
Parameters: Change as part of training (e.g. Weights)
Hyperparameters: manual configuration settings fixed before training, such as:
the learning rate, batch size, and number of layers, and
assumptions about the relationship between inputs and outputs (e.g. the choice of model architecture).
Unlike hyperparameters, parameters are refined during training.
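The parameter vs. hyperparameter split can be sketched with a tiny gradient-descent loop (an illustrative example, not from the source; the data and values are made up):

```python
# Fitting y = 2x with gradient descent. The learning rate and epoch count
# are hyperparameters (fixed manual configuration); the weight w is a
# parameter (changed by training itself).

LEARNING_RATE = 0.1   # hyperparameter: set by hand, never updated
EPOCHS = 50           # hyperparameter

w = 0.0               # parameter: updated during training

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x

for _ in range(EPOCHS):
    for x, y in data:
        grad = 2 * (w * x - y) * x    # d/dw of the squared error (w*x - y)^2
        w -= LEARNING_RATE * grad     # only the parameter changes

print(round(w, 3))  # w has converged to ~2.0
```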
Dying ReLU: for x < 0 the ReLU outputs 0 and its gradient is 0, so affected neurons can become permanently inactive ("dead") and stop learning.
BART: combines the strengths of bidirectional encoding (like BERT) and auto-regressive decoding (like GPT).
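A minimal sketch of why the x < 0 region is a problem, plus the common Leaky ReLU mitigation (illustrative values, not from the source):

```python
def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # Zero gradient for x <= 0: a neuron stuck here receives no weight
    # updates and cannot recover -- the "dying ReLU" problem.
    return 1.0 if x > 0 else 0.0

def leaky_relu(x, alpha=0.01):
    # Mitigation: a small slope for x < 0 keeps the gradient nonzero.
    return x if x > 0 else alpha * x

print(relu(-3.0), relu_grad(-3.0))  # 0.0 0.0 -> the "dead" region
print(leaky_relu(-3.0))             # small negative output, nonzero gradient
```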
An encoder-decoder Transformer architecture with open weights.
BART deliberately applies noise to its input (token masking, deletion, etc.) during pre-training and learns to reconstruct the original text. This denoising objective makes it more robust at inference; otherwise it is a fairly standard sequence-to-sequence Transformer.
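Two of these noising transforms can be sketched as simple token-level corruptions (an illustrative toy, not BART's actual implementation; the sentence and probabilities are made up):

```python
import random

def token_mask(tokens, p, rng):
    # Replace each token with <mask> with probability p.
    return ["<mask>" if rng.random() < p else t for t in tokens]

def token_delete(tokens, p, rng):
    # Drop each token with probability p; unlike masking, the model must
    # also infer WHERE tokens are missing, not just what they were.
    return [t for t in tokens if rng.random() >= p]

rng = random.Random(0)
tokens = "the quick brown fox jumps over the lazy dog".split()
print(token_mask(tokens, 0.3, rng))
print(token_delete(tokens, 0.3, rng))
```

During pre-training, the corrupted sequence is fed to the encoder and the decoder is trained to emit the original, uncorrupted text.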
BART Large (~406M parameters).
Later models such as T5, mBART, and GPT-4 generally outperform BART on many tasks.
An autoencoder is a type of neural network that learns to compress (encode) data and then reconstruct (decode) it.
"Auto" implies self-supervision: the network is trained to reproduce its own input, so no external labels are needed (unsupervised learning for both encoder and decoder).
Several variations exist, e.g. denoising, sparse, and variational autoencoders.
Not all encoder-decoder models are autoencoders: a machine-translation model, for example, does not aim to reconstruct its input but to translate it.
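The encode-compress-decode idea can be shown with a toy linear autoencoder (an illustrative sketch with made-up data): 2-D points lying on the line y = x are squeezed into a 1-D code and reconstructed, with the input itself serving as the training target.

```python
# Toy 2 -> 1 -> 2 linear autoencoder trained by gradient descent on the
# squared reconstruction error. No labels: the target IS the input.

data = [(t, t) for t in (-2.0, -1.0, 0.5, 1.0, 2.0)]  # points on y = x

w_enc = [0.3, 0.4]   # encoder weights (2-D input -> 1-D code)
w_dec = [0.5, 0.6]   # decoder weights (1-D code -> 2-D reconstruction)
lr = 0.05

for _ in range(500):
    for x1, x2 in data:
        z = w_enc[0] * x1 + w_enc[1] * x2       # encode: compress to 1-D
        r1, r2 = w_dec[0] * z, w_dec[1] * z     # decode: reconstruct 2-D
        e1, e2 = r1 - x1, r2 - x2               # reconstruction error
        g_z = 2 * (e1 * w_dec[0] + e2 * w_dec[1])  # gradient w.r.t. the code
        w_dec[0] -= lr * 2 * e1 * z
        w_dec[1] -= lr * 2 * e2 * z
        w_enc[0] -= lr * g_z * x1
        w_enc[1] -= lr * g_z * x2

# A point on the line round-trips through the 1-D bottleneck almost intact.
x1, x2 = 1.5, 1.5
z = w_enc[0] * x1 + w_enc[1] * x2
print(w_dec[0] * z, w_dec[1] * z)  # reconstruction close to (1.5, 1.5)
```

The 1-D bottleneck is what forces the network to learn the structure of the data (here, "both coordinates are equal") rather than copy it verbatim.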