Could you summarize your algorithm? Like a high-level description?
Like algorithm 1 (from the article) checks N possible positions at a time by matching the first and last character; algorithm 2 instead tests the 4 leading characters.
What I do is to build a substring out of the initial strings that is a multiple of 2. If the string I try to search is 9 characters long, then I extract an 8 characters substrings that I transform into an integer over an integer:
Here is the example for 4 characters:
//We look for the presence of four characters in a row
```
int32_t cc = search[3];
for (shift = 2; shift >= 0; shift--) {
cc <<= 8;
cc |= search[shift];
}
__m256i firstchar = _mm256_set1_epi16(cc);
```
In this case, I will look for a 4 bytes integers over my sequence:
```
current_bytes = _mm256_cmpeq_epi16(firstchar, current_bytes);
q <<= 1;
q |= _mm256_movemask_epi8(current_bytes);
````
I'm looking for blocks of 4 characters at a time in my string.