So I just gave your blarghl line to Claude, and it replied "It seems like you included a mix of text including "blarghl unorthodox babble" followed by the phrase "The rain in Spain."
Did you mean to ask about the well-known phrase "The rain in Spain falls mainly on the plain"? This is a famous elocution exercise from the musical "My Fair Lady," where it's used to teach proper pronunciation.
Or was there something specific you wanted to discuss about Spain's rainfall patterns or perhaps something else entirely? I'd be happy to help with whatever you intended to ask. "
I think you have a point here, but maybe re-express it? Because right now your argument seems trivially falsifiable even under your own terms.
If you feed it to Claude, you're getting Claude's "system prompt" prepended before the text you give it.
If you want to test convolution, you have to use a raw model with no system prompt. You can do that with a Llama or similar. Otherwise your context window is full of words like "helpful" and "answer" and "question" that guide the response and make it harder (not impossible) to see the effect I'm talking about.
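Roughly, here's what I'd try (just a sketch using Hugging Face transformers; the checkpoint name is only an example - any base, non-instruct model will do):

    # Compare a coherent prompt vs. a word-salad prompt sharing the same words,
    # on a base model with no system prompt and no chat template.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "meta-llama/Llama-3.1-8B"  # example base checkpoint, not the -Instruct variant
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompts = [
        "Why does the rain in Spain fall mainly on the plain?",
        "blarghl unorthodox babble rain Spain plain mainly falls why",
    ]

    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt")
        out = model.generate(**inputs, max_new_tokens=60, do_sample=True, temperature=0.8)
        print(repr(tokenizer.decode(out[0], skip_special_tokens=True)))

The point is just that nothing in the context except your own words steers the completion.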
I'm a bit confused here. Are you saying that if I zero out the system prompt on any LLM, including those fine-tuned to give answers in an instructional form, they will show the effect you describe -- that nonsense prompts will get similar results to coherent prompts if they contain many of the same words?
Because I've tried it on a few local models I have handy, and I don't see that happening at all. As someone else says, some of that difference is almost certainly due to supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) -- but it's weird to me, given the confidence with which you made your prediction, that you didn't exclude those from your original statement.
I guess the real question here is: could you give me a more explicit example of how to show what you're trying to show? And explain why I'm not seeing it when running local models without system prompts?
At this point, you might as well be claiming that a completions model behaves differently from a fine-tuned model. That's true, but prompting through the API without any system message also doesn't seem to match your prediction.
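For reference, this is the kind of call I mean by "without any system message" - just a bare user turn through the Anthropic SDK (the model id is only an example):

    # No system parameter at all; the only words in the context are the user's.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    resp = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # example model id
        max_tokens=200,
        messages=[{"role": "user", "content": "blarghl unorthodox babble The rain in Spain"}],
    )
    print(resp.content[0].text)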
True but also irrelevant. The "AI" is the entirety of the system, which includes the model itself as well as any prompts and other machinery around it.
I mean, if you dig down far enough, the LLM doesn't even generate tokens - it merely gives you a probability distribution over the next token, and you still need to explicitly pick a token based on those probabilities, append it to the input, and start the next iteration of the loop.
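Concretely, the outer loop looks something like this (a toy sketch; "model" here is just a stand-in function that returns logits, not any particular library's API):

    import numpy as np

    def generate(model, tokens, steps, temperature=1.0):
        for _ in range(steps):
            logits = model(tokens)                 # model only scores candidate next tokens
            probs = np.exp(logits / temperature)
            probs /= probs.sum()                   # softmax -> probability distribution
            next_token = np.random.choice(len(probs), p=probs)  # the explicit "pick" step
            tokens = tokens + [int(next_token)]    # append and go around again
        return tokens

    # toy stand-in: random logits over a 10-token vocabulary
    print(generate(lambda toks: np.random.randn(10), [0], steps=5))

Swap the sampling line for an argmax and you get greedy decoding; either way, the model itself never chose anything.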