
Why Your Password Might Be in the Charts


Michael Brown

A Musical Approach to Cracking Passwords - Part 2

Penetration Testing

1 May 2026

In Brief

Exploring results when removing unlikely characters.

Removing characters from the wordlist that are intuitively less likely to appear in real passwords led to far more passwords being cracked than with the original wordlist.

Understanding the effectiveness of a smaller lyrics-based list.

The approach of sorting and streamlining the initial wordlist based on popularity did not lead to additional passwords being cracked. The results also did not improve when enriching misspelt lyrics or those with contractions.

Evidencing the improvements when replacing numbers.

Replacing numerical characters with their cardinal (for example, one, two or three) or ordinal (for example, first, second, third) equivalents increased the number of successfully cracked hashes.

Introduction

In the last post, we performed a proof-of-concept test by running a large list of song lyrics against several sets of hashes. The initial tests showed that this is a potentially untapped source of hash candidates, yielding several complex passphrases that had not previously been cracked by the community.

 

Following this initial test, we will be making improvements to our methodology and tools, with the goal of finding more effective ways to modify large data sets to crack predictable passphrases in an efficient manner.

Cleanup

Before carrying out any major adjustments, we ran Chris Moberly’s existing cleanup.py script on our wordlist (lyricsv1.txt) to produce a “cleaner” list. This eliminated any lines containing non-Latin characters, and the resulting wordlist worked better with Moberly’s passphrase rules.

.\cleanup.py .\wordlists\lyricsv1.txt .\wordlists\lyricsv1.5.txt

The resulting wordlist, lyricsv1.5.txt, was run against the same hashes as lyricsv1.txt, and the results were then compared. The table below displays the number of hashes cracked by wordlists lyricsv1 and lyricsv1.5.

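| Hash Type (Hashcat Mode) | lyricsv1 only | lyricsv1.5 |
|--------------------------|---------------|------------|
| MD5 (0)                  | 7             | 53         |
| SHA1 (100)               | 3             | 27         |
| NTLM (1000)              | 3             | 2          |
| SHA2-256 (1400)          | 1             | 9          |
| md5(md5($pass)) (2600)   | 2             | 1          |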


Figure 1 - A comparison of lyricsv1 and lyricsv1.5 (here named lyricsv1clean.txt).

Comparison of the wordlists showed that the discrepancy was caused by the cleanup.py script converting the uppercase letter "I" to lowercase "i". The passphrase rules produced the following candidates:


“the closer i got the more i lost” (:)


“The Closer I Got The More I Lost” (E)

“The closer I got the more I lost”.


“The closer i got the more i lost” (c)
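The effect of the lowercasing can be approximated in Python. This is a minimal sketch rather than hashcat itself: the function names are hypothetical, and the behaviour assumed for the `:`, `c` and `E` rules (no-op, capitalise the first letter and lowercase the rest, title-case on spaces) follows their usual hashcat descriptions.

```python
def rule_noop(line: str) -> str:
    """Assumed behaviour of ':' (leave the candidate unchanged)."""
    return line


def rule_capitalise(line: str) -> str:
    """Assumed behaviour of 'c' (upper-case first letter, lower-case the rest)."""
    return line[:1].upper() + line[1:].lower()


def rule_title(line: str) -> str:
    """Assumed behaviour of 'E' (lower-case, then upper-case after each space)."""
    return " ".join(w[:1].upper() + w[1:] for w in line.lower().split(" "))


RULES = {":": rule_noop, "c": rule_capitalise, "E": rule_title}


def candidates(line: str) -> dict:
    """Apply every emulated rule to one wordlist line."""
    return {name: rule(line) for name, rule in RULES.items()}


cleaned = "the closer i got the more i lost"
target = "The closer I got the more I lost"

for name, candidate in candidates(cleaned).items():
    print(name, candidate)

# The sentence-case passphrase with a capital "I" is never generated.
print(target in candidates(cleaned).values())
```

Under these assumptions, none of the three rules can restore a capital "I" in the middle of an otherwise lowercase line, so the sentence-case form is not among the candidates.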

This problem can be solved by modifying the clean-up script so that the whole list is not left in lower case. The letter "i" on its own can be capitalised by adding the following lines to cleanup.py:

# Re-capitalise a standalone "i" at the start, end or middle of a line.
line = re.sub(r'^i ', 'I ', line)
line = re.sub(r' i$', ' I', line)
line = re.sub(r' i ', ' I ', line)
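A self-contained variant of the same idea is sketched below. It is an illustrative alternative, not the code that was added to cleanup.py: it assumes a single lookaround pattern is acceptable, which also covers a line consisting only of "i" and two standalone "i" tokens in a row. The function name is hypothetical.

```python
import re

# An "i" with no non-whitespace character on either side.
STANDALONE_I = re.compile(r"(?<!\S)i(?!\S)")


def capitalise_standalone_i(line: str) -> str:
    """Upper-case every standalone 'i' in an otherwise lower-cased line."""
    return STANDALONE_I.sub("I", line)


print(capitalise_standalone_i("the closer i got the more i lost"))
```

Words that merely contain the letter, and forms such as "i'm", are left untouched by this pattern.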

At this point we also added two rules that replace the letter I, in upper and lower case respectively, with the numeral 1.

sI1
si1

Optimisation

For version 2 of the wordlist (lyricsv2.txt) we aimed to optimise the list of lyrics, sorting by song popularity, reducing the number of phrases and allowing for more rules and variations to be used.

We started with a dataset of the top songs on Spotify, found on Kaggle. This dataset is updated daily and includes a popularity value that we can use to sort the lyrics and improve early hit rates. The top songs dataset also includes the “Spotify ID”, as does the lyrics dataset, so we used it as a cross-reference. The csvtools suite made this task very simple.

csvsort -d "," -q '"' -b -u1 -r -c9 universal_top_spotify_songs.csv |
  csvcut -d "," -q '"' -b -u1 -c1 > popular_ids.txt


csvgrep -d "," -q '"' -b -u1 -c id -f popular_ids.txt songs_with_attributes_and_lyrics.csv |
  csvcut -d "," -q '"' -b -u1 -c lyrics > lyricsv2.txt

The resulting lyricsv2.txt file contained the lyrics for all songs listed in the Top Spotify Songs dataset, sorted by popularity in descending order.
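The same cross-reference can be sketched with Python's standard csv module. This is an illustration under stated assumptions, not the pipeline used above: the `id` and `lyrics` column names come from the commands, while `spotify_id` and `popularity` are hypothetical names for columns 1 and 9 of the top songs file, and the function name is invented for the example.

```python
import csv
import io


def lyrics_by_popularity(top_songs_csv, lyrics_csv,
                         id_col="spotify_id", pop_col="popularity"):
    """Return lyrics for charting songs, most popular first.

    id_col and pop_col are assumed header names for the top songs file.
    """
    # Keep the highest popularity seen for each Spotify ID.
    popularity = {}
    for row in csv.DictReader(top_songs_csv):
        score = int(row[pop_col])
        popularity[row[id_col]] = max(score, popularity.get(row[id_col], score))

    # Keep only lyrics whose ID appears in the charts, then sort explicitly.
    matched = [row for row in csv.DictReader(lyrics_csv) if row["id"] in popularity]
    matched.sort(key=lambda row: popularity[row["id"]], reverse=True)
    return [row["lyrics"] for row in matched]


# Tiny in-memory stand-ins for the two Kaggle files.
top = io.StringIO("spotify_id,name,popularity\na1,Song A,70\nb2,Song B,95\na1,Song A,80\n")
lyrics = io.StringIO('id,lyrics\na1,"first line, of song a"\nc3,not in the charts\nb2,lyrics of song b\n')
print(lyrics_by_popularity(top, lyrics))
```

Sorting after the join makes the popularity ordering explicit, instead of relying on the row order of either input file.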


Some additional processing was performed on the resulting wordlist to tidy up artefacts from the CSV, including removing excess whitespace, removing backing vocals or ad-libs in parentheses, and removing duplicated and blank lines.

sed -i "s/\['\|'\]//g;s/', '/\n/g;s/^\"//;s/\"$//;s/^\s*//;s/\s*$//;s/\(.*\)/\L\1/; s/[\"]//g;s/(.*)//g;/^$/d" lyricsv2.txt


awk '!a[$0]++' lyricsv2.txt > lyricsv2.tmp && mv lyricsv2.tmp lyricsv2.txt
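For readers who prefer Python, the tidy-up steps can be approximated as follows. This is a rough sketch, not a translation of the sed and awk commands: it skips the splitting of the CSV list syntax into lines, removes each parenthesised group individually, and also collapses internal runs of whitespace. The function name is hypothetical.

```python
import re


def clean_lines(lines):
    """Lower-case, strip ad-libs and quotes, and drop blanks and duplicates."""
    seen = set()
    cleaned = []
    for line in lines:
        line = re.sub(r"\([^)]*\)", "", line)            # backing vocals / ad-libs
        line = line.replace('"', "")                     # stray CSV quotes
        line = re.sub(r"\s+", " ", line).strip().lower()  # excess whitespace, case
        if line and line not in seen:                    # keep first occurrence only
            seen.add(line)
            cleaned.append(line)
    return cleaned


print(clean_lines(["  Hello (yeah) World  ", "hello world", "", '"Quoted" line']))
```

As with the awk one-liner, the first occurrence of each line is kept and the original order is preserved.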

The resulting file was extremely small for a cracking wordlist at 12,275 lines, down from the 18,975,569 lines of lyricsv1.txt. We opted against running the cleanup.py script against this wordlist in order to preserve the non-Latin characters.

The results show that v2 cracked no hashes. This could be due to the popularity of the songs chosen, or because these hashes had already been cracked by prior researchers. At this point we considered generating two wordlists: one comprising popular songs and one comprehensive list of all available lyrics.

Modification – Let's rethink

The way that lyrics are transcribed online can be very different to how a user would set a password. Lyrical transcription usually adheres to guidelines or a style guide. The lyrics from our dataset were ultimately sourced from the MusixMatch community, which has extensive guidelines for its submissions. There are also several factors that could lead to heard lyrics being interpreted differently by different people. We have prepared three modifications that can be programmatically applied to our lists of song lyrics. For each method we have listed the length of the resulting wordlist, demonstrating the effect of the modification on version 2 of the wordlist (lyricsv2.txt), which was 12,275 lines long.

Numerals

There are instances where numbers could be written either as Arabic numerals or as words. To account for these potential discrepancies, we used the “inflect” Python library and created a script to write several representations of the numbers throughout the wordlist. This produced the following results from one sample line:

     love of 2 is 1

     love of 2 is one

     love of two is 1

     love of two is one

The resulting lyricsv2a wordlist is 13,758 lines long.
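The expansion shown in the sample can be sketched as a Cartesian product over the tokens of a line. This is a minimal illustration, not the script used for lyricsv2a: the small `NUMBER_WORDS` table is a hypothetical stand-in for the inflect library (whose `number_to_words` method could supply the words instead), and only cardinal forms are covered.

```python
from itertools import product

# Hypothetical stand-in for inflect; extend or replace as needed.
NUMBER_WORDS = {"1": "one", "2": "two", "3": "three", "5": "five", "9": "nine"}


def numeral_variants(line: str) -> list:
    """Return every combination of digit and word forms for one line."""
    options = [
        [token, NUMBER_WORDS[token]] if token in NUMBER_WORDS else [token]
        for token in line.split(" ")
    ]
    return [" ".join(combo) for combo in product(*options)]


for variant in numeral_variants("love of 2 is 1"):
    print(variant)
```

A line with n numeric tokens yields 2^n variants, which is why the wordlist grows only modestly when most lines contain no numbers.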

Misspellings

Users can make spelling errors when setting passwords, either by mistake or on purpose. Spelling errors differ from typographical errors in that they reflect how the individual believes a word is spelled, rather than a slip made while typing the password. This means that an individual would make the same mistake reliably, rather than mistyping and submitting an invalid password. We sourced a list of commonly misspelled words from the Birkbeck Spelling Error Corpus, then used this list and a Python script to add misspelled alternatives for every commonly misspelled word:

     all our times have come

     a our times have come

     at our times have come

     all our times have cam

     all our times have com

     all our times a come

     all our times after come

     all our times fo come

     all our times had come

As demonstrated by the sample above, the plausibility of some of these spelling mistakes is questionable. Some seem to be typographical errors rather than misconceptions of how a word is spelled.


The resulting lyricsv2b wordlist was 331,382 lines long.
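The one-word-at-a-time substitution seen in the sample can be sketched as below. This is an assumption about how such a script might work, not the script itself: the `MISSPELLINGS` table is a hypothetical excerpt in the shape of a correct-word to misspellings mapping, and the function name is invented for the example.

```python
# Hypothetical excerpt; a real table would be built from the spelling corpus.
MISSPELLINGS = {
    "all": ["a", "at"],
    "come": ["cam", "com"],
}


def misspelling_variants(line: str, misspellings=MISSPELLINGS) -> list:
    """Return the line plus one variant per single-word misspelling."""
    words = line.split(" ")
    variants = [line]
    for index, word in enumerate(words):
        for wrong in misspellings.get(word, []):
            variants.append(" ".join(words[:index] + [wrong] + words[index + 1:]))
    return variants


for variant in misspelling_variants("all our times have come"):
    print(variant)
```

Replacing only one word per variant keeps growth linear in the number of known misspellings per line; even so, a large table quickly multiplies the size of the list.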

Contractions

A common discrepancy between transcriptions of songs is the use of contractions, for example “Sittin' on the dock of the bay” versus “Sitting on the dock of the bay”. We were able to replace contractions using the following sed expression:

sed -r "s/^((.*)in'([ \?]+.*))$/\1\n\2ing\3/g;s/^((.*)gon'[$ \?](.*))$/\1\n\2gonna\3/g"

This method was limited in that it did not replace all of the contractions. A method using Python would be more flexible and would ideally produce the following results:

 workin' 9 to 5, what a way to make a livin'

 workin' 9 to 5, what a way to make a living

 working 9 to 5, what a way to make a livin'

 working 9 to 5, what a way to make a living

The resulting lyricsv2c.txt wordlist was 12,639 lines long.
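One way such a Python method might look is sketched here. It is a hedged illustration, not the approach used to build lyricsv2c.txt: it only handles the dropped "g" pattern (in' to ing), assumes straight apostrophes, and uses a hypothetical function name.

```python
import re
from itertools import product

# A word ending in "in" followed by an apostrophe, e.g. workin'
DROPPED_G = re.compile(r"(\w+in)'")


def contraction_variants(line: str) -> list:
    """Return every combination of contracted and expanded -ing words."""
    # re.split with one capture group alternates plain text and captured stems.
    parts = DROPPED_G.split(line)
    options = [
        [part + "'", part + "g"] if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in product(*options)]


for variant in contraction_variants("workin' 9 to 5, what a way to make a livin'"):
    print(variant)
```

Because every contraction is toggled independently, a line with two contractions produces the four variants listed above, whereas a single greedy sed substitution rewrites at most one contraction per line.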

Results

The optimised list of lyrics continued to return no results, even with the additional processing. We theorise that, because our dataset consists of previously uncracked password hashes, the most popular song lyrics have already been cracked by wordlists such as Chris Moberly’s. As such, only the more obscure song lyrics are returning matches.

For the next set of tests, we evaluated the three modifications, to determine which process to perform on the most effective wordlist so far, lyricsv1.5.txt.

Numerals was a good candidate to be run against v1.5, as it increased the length of lyricsv2 by 12%, giving an estimated length of 20,730,426 lines for v1.5a. The difference in time was negligible: it would take 12 minutes to crack NTLM hashes, up from 11 minutes for v1.5.


Misspellings needs to be re-evaluated, as it increased the length of lyricsv2 by 2,599%. This would increase v1.5b to 499,329,883 lines, which would take an estimated 5 hours to run against the existing rule lists.


The incomplete version of Contractions only increased the length of lyricsv2 by 2.9%. This could very easily be run against v1.5, resulting in 19,032,484 lines and again taking 11 minutes.
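The growth figures can be reproduced from the line counts reported for each lyricsv2 variant. The sketch below assumes that the projection simply scales a larger wordlist by the same ratio; the function names are hypothetical and the baseline passed to `projected_lines` is a placeholder, not the actual length of lyricsv1.5.

```python
V2_LINES = 12_275  # length of lyricsv2.txt
MODIFIED = {
    "numerals": 13_758,       # lyricsv2a
    "misspellings": 331_382,  # lyricsv2b
    "contractions": 12_639,   # lyricsv2c
}


def growth_pct(base: int, modified: int) -> float:
    """Percentage increase in line count caused by a modification."""
    return (modified - base) / base * 100


def projected_lines(baseline: int, base: int, modified: int) -> int:
    """Scale another wordlist by the growth ratio observed on lyricsv2."""
    return round(baseline * modified / base)


for name, count in MODIFIED.items():
    print(f"{name}: +{growth_pct(V2_LINES, count):.2f}%")

# Placeholder baseline of 1,000,000 lines, purely for illustration.
print(projected_lines(1_000_000, V2_LINES, MODIFIED["numerals"]))
```

The percentages come out close to the figures quoted above; small differences in the projected totals depend on how the ratio is rounded before scaling.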


For the next set of tests, we ran v1.5 through the numeral and contraction modifications.

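| Hash Type (Hashcat Mode) | lyricsv1 (Plain) | lyricsv1.5 (Plain) | lyricsv1.5a (Numerals) | lyricsv1.5c (Contractions) |
|--------------------------|------------------|--------------------|------------------------|----------------------------|
| MD5 (0)                  | 7                | 53                 | 66                     | 54                         |
| SHA1 (100)               | 3                | 27                 | 33                     | 27                         |
| NTLM (1000)              | 3                | 2                  | 2                      | 2                          |
| SHA2-256 (1400)          | 2                | 9                  | 10                     | 10                         |
| md5(md5($pass)) (2600)   | 2                | 1                  | 1                      | 1                          |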

As it currently stands, the contractions modification does not seem to be returning any additional hashes. The two additional hashes returned were due to additional rules added between v1 and v2.

Conclusion

While filtering and sorting songs by popularity did not pay off against our dataset, it may still be worth doing when attempting to crack common passphrases. To test this properly, we would need a more comprehensive set of hashes.


The key takeaway from Part 2 of our research is that modification of numerals netted an increase of 21% over the base version of v1.5. 
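As a quick check of that figure, the per-hash-type counts from the results table can be summed and compared. The sketch below only restates the published counts; the variable and function names are hypothetical.

```python
# Cracked hashes per type: MD5, SHA1, NTLM, SHA2-256, md5(md5($pass))
CRACKED = {
    "lyricsv1.5": [53, 27, 2, 9, 1],
    "lyricsv1.5a": [66, 33, 2, 10, 1],
    "lyricsv1.5c": [54, 27, 2, 10, 1],
}


def uplift_pct(baseline: str, variant: str) -> float:
    """Percentage increase in total cracked hashes over the baseline list."""
    base_total = sum(CRACKED[baseline])
    return (sum(CRACKED[variant]) - base_total) / base_total * 100


for name in ("lyricsv1.5a", "lyricsv1.5c"):
    print(f"{name}: {sum(CRACKED[name])} cracked, +{uplift_pct('lyricsv1.5', name):.1f}%")
```

The numerals list goes from 92 to 112 cracked hashes, an uplift of a little over 21%, while the contractions list adds two.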

Looking Ahead

If you would like to learn more or need assistance with performing penetration testing or red team engagements, please do not hesitate to get in touch with us.
