Background: Many scholars maintain that the author's name is hidden, by means of simple steganography, in the very line of a 1st-century Latin poem that hints at his place of origin. Other scholars maintain that the occurrence of the sequence of letters involved is coincidental. I'm thinking that if the odds against that sequence of letters occurring coincidentally in that position within that poem could be calculated, there would be more than just subjective opinion on the matter.
It's the kind of problem that I could have solved myself by means of the binomial coefficient formula, were it not for a certain feature of the problem that puts the solution, if it has one, beyond my capability.
The details of the problem are:
- The poem consists of 2522 letters (X).
- Letters number 1971 to 2103 (133 letters = Y) constitute the (syntactically self-contained) hint as to the place of origin of the poet.
- The 11 letters of the name occur, in their order in the name but non-contiguously (i.e. interspersed with other letters), within a string of 130 letters (Z) that lies inside Y, starting 1 letter after Y's beginning and ending 2 letters before Y's end.
- However, the letters of the name (likewise in their order in the name but non-contiguously) turn up 8 other times in X, in strings of more or less the same number of letters as Z (as one would expect by chance, given the limited alphabet).
- What are the odds against the occurrence described in the third item (Z falling inside Y) being coincidental, given that it could have occurred anywhere in the text? Great? Very great? Extremely great? Astronomically great?
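To make the "8 other times" count reproducible, here is a minimal Python sketch of how such occurrences might be found. It assumes the poem is available as a plain string of letters; the function names `shortest_spans` and `count_within` are my own, and the sample text and name in the usage lines are made up. For every position where the first letter of the name occurs, it matches the remaining letters greedily, which gives the shortest string beginning at that position that contains the name in order.

```python
def shortest_spans(text, name):
    """For each position where name[0] occurs, match the rest of name
    greedily as a subsequence; return (start, length) for each full match."""
    spans = []
    for start, ch in enumerate(text):
        if ch != name[0]:
            continue
        j = 0  # number of letters of name matched so far
        for pos in range(start, len(text)):
            if text[pos] == name[j]:
                j += 1
                if j == len(name):
                    spans.append((start, pos - start + 1))
                    break
    return spans


def count_within(text, name, max_len):
    """Count the starting positions whose shortest span is at most max_len."""
    return sum(1 for _, length in shortest_spans(text, name) if length <= max_len)


# Hypothetical usage with a toy text and a toy 3-letter "name".
toy_text = "xaxbxcxabc"
print(shortest_spans(toy_text, "abc"))
print(count_within(toy_text, "abc", 3))
```

Note that two different starting positions can share the same final letters, so whether such spans count as separate occurrences is a judgment call this sketch does not settle.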
It's the non-contiguous occurrence of the letters of the name that makes it difficult (for me) to calculate the probability by the binomial coefficient formula.
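One way the non-contiguous matching might be handled is sketched below in Python, under a strong simplifying assumption: every letter of the text is drawn independently with fixed frequencies, which real Latin verse does not satisfy. The function name `prob_name_in_window`, the placeholder name and the uniform 23-letter frequency table are all illustrative assumptions, not data from the poem. The sketch tracks how many letters of the name have been matched so far and advances that count letter by letter, giving the probability that one fixed window of a given length contains the name in order.

```python
def prob_name_in_window(name, freqs, window):
    """Probability that `name` appears as an in-order, non-contiguous
    subsequence of `window` independent random letters.

    freqs maps each letter to its assumed probability of occurrence.
    """
    n = len(name)
    # dp[j] = probability that exactly j letters of name are matched so far
    dp = [0.0] * (n + 1)
    dp[0] = 1.0
    for _ in range(window):
        new = [0.0] * (n + 1)
        new[n] = dp[n]  # once fully matched, stay matched
        for j in range(n):
            p = freqs.get(name[j], 0.0)  # chance the next letter is the one needed
            new[j] += dp[j] * (1.0 - p)
            new[j + 1] += dp[j] * p
        dp = new
    return dp[n]


# Hypothetical usage: a placeholder 11-letter name and equal letter
# frequencies over an assumed 23-letter alphabet.
alphabet = "ABCDEFGHIKLMNOPQRSTVXYZ"
uniform = {letter: 1.0 / len(alphabet) for letter in alphabet}
print(prob_name_in_window("ABCDEFGHIKL", uniform, 130))
```

When all letters are equally likely, this reduces to the binomial tail (at least 11 "successes" in 130 trials), which is where the binomial coefficient formula comes back in. The result applies to a single fixed window only. The poem contains 2522 - 130 + 1 = 2393 overlapping windows of that length, and their outcomes are not independent, so this figure alone does not answer the question about position.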