Should the line

    Z_encoder_decoder = layer_norm(Z_encoder_decoder + Z)

in Decoder step 7 instead be the following?

    Z_encoder_decoder = layer_norm(Z_encoder_decoder + Z_self_attention)

Also, is layer_norm missing in Decoder step 8...
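
To make the question concrete, here is a minimal sketch of the wiring I would expect. It assumes the post-layer-norm arrangement from the original Transformer paper, where every sub-layer output is layer_norm(x + sublayer(x)). The three sub-layer callables, the `add` helper, and the `decoder_block` name are hypothetical stand-ins for illustration, not code from the article, and `layer_norm` here has no learned gain or bias.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one vector to zero mean and (near) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def add(a, b):
    """Element-wise sum of two equal-length vectors (the residual connection)."""
    return [p + q for p, q in zip(a, b)]

def decoder_block(Z, self_attention, encoder_decoder_attention, feed_forward):
    """One decoder block for a single position, post-layer-norm style.

    The three callables are hypothetical placeholders for the real sub-layers.
    """
    # Self-attention sub-layer: the residual is the block input Z.
    Z_self_attention = layer_norm(add(self_attention(Z), Z))
    # Encoder-decoder attention sub-layer: the residual is the output of the
    # previous sub-layer (Z_self_attention), not the original Z.
    Z_encoder_decoder = layer_norm(
        add(encoder_decoder_attention(Z_self_attention), Z_self_attention)
    )
    # Feed-forward sub-layer: also wrapped in residual + layer_norm.
    return layer_norm(add(feed_forward(Z_encoder_decoder), Z_encoder_decoder))
```

With this wiring, each sub-layer adds back its own input, so the step 7 residual is Z_self_attention and the step 8 output is normalized as well.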