Note: This post lists the flag designs that are most popular with the general public. For my own judgement on the best proposals, see this post.
When I was thinking of designs for the New Zealand flag competition, I was curious about the preferences of the wider public. No doubt others are too. Unfortunately, polls had a limited selection of designs to begin with, and while the government gallery had social media sharing and suggestions for every submitted flag, there was no way to sort the gallery to show the most popular.
So I wrote a quick Java script to scrape every entry on the website and identify the most popular flags. Popularity is measured by the number of times each design was independently suggested. A design needed at least ten suggestions to make this list.
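As a rough illustration of the counting step (not the original script), here is a minimal Python sketch. It assumes the scrape has already produced a flat list with one design title per suggestion; the names `popular_designs` and `MIN_SUGGESTIONS` are hypothetical and chosen only for this example.

```python
from collections import Counter

# Threshold stated in the post: ten suggestions to make the list.
MIN_SUGGESTIONS = 10


def popular_designs(suggestions, minimum=MIN_SUGGESTIONS):
    """Count how often each design was suggested.

    `suggestions` is an assumed input format: an iterable of design
    titles, one entry per independent suggestion.
    Returns (design, count) pairs that meet the threshold,
    sorted in ascending order of popularity.
    """
    counts = Counter(suggestions)
    qualifying = [(design, n) for design, n in counts.items() if n >= minimum]
    return sorted(qualifying, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Made-up data purely to show the call; not real gallery figures.
    sample = ["Design A"] * 12 + ["Design B"] * 3 + ["Design C"] * 10
    for design, n in popular_designs(sample):
        print(design, n)
```

Sorting in ascending order matches how the flags are presented in this post, with the most popular design last.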
Keep in mind that popularity does not equal quality, nor is it a final indicator of public preferences. It is affected by many factors, such as a design’s age, status and prior exposure. Obviously, this list is biased towards well-known older designs over newer ones, even if the newer ones are great. This list is offered simply for the interest of the data itself.
Flags are listed in ascending order of popularity. Each entry lists the three main points raised by respondents.
Silver Fern Flag – Kyle Lockwood’s ‘New Zealand Colours’. Designed by: Kyle Lockwood.
- Similar to current flag
- Black and white are national colours. The silver fern is a national symbol. These are already recognised worldwide and have historical significance.
- Māori represented by black (I think this is a reference to the Tino Rangatiratanga flag which includes black?)
I was in a statistical geography mood so I made this map based on Wikimedia statistics. It shows the most popular Wikipedia language edition for each country and territory that had hits in 2014 Q1. If the majority of hits from a place were for a single language, I marked it in that language’s colour. If there was no majority language, I marked the top two in a gradient.
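The colouring rule can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the process actually used for the map: it takes "majority" to mean strictly more than half of all hits, and the function name `classify` and the input format (a mapping from language code to hit count for one place) are hypothetical.

```python
def classify(hits_by_language):
    """Decide which language(s) to colour a place with.

    `hits_by_language` maps a language code to its hit count for one
    country or territory (assumed input format).
    Returns a 1-tuple if one language has more than half of all hits,
    otherwise the top two languages (to be drawn as a gradient).
    """
    total = sum(hits_by_language.values())
    if total == 0:
        return ()  # no hits, so the place is left unmarked
    ranked = sorted(hits_by_language.items(), key=lambda kv: kv[1], reverse=True)
    top_language, top_hits = ranked[0]
    if top_hits * 2 > total:  # strict majority (assumption)
        return (top_language,)
    return tuple(language for language, _ in ranked[:2])


if __name__ == "__main__":
    # Made-up hit counts, only to show both branches of the rule.
    print(classify({"en": 60, "fr": 40}))
    print(classify({"fr": 45, "nl": 40, "en": 15}))
```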
I hope you find this as interesting as I did.
Map of countries and territories by most popular Wikipedia edition (2014 Q1). Click on image for full size.
- Out of ~6000 languages in the world, only 32 (0.5%) account for the most popular Wikipedia edition in every country and territory that accessed it. All of these languages are from Eurasia, which really says something about the power structures over history and the digital divide.
- Language geography corresponds well with European imperial holdings with some exceptions. Who would have guessed that Puerto Rico, Suriname and East Timor would have English as their preferred Wikipedia language? Regionalisation is also a factor.
- English is more popular than all the other languages combined.
- Regions with no single majority language include North Africa, the Caucasus, the Balkans and the Baltics. Other such places include Belgium (French and Dutch), Norway (English and Norwegian), Greenland (English and Danish), Israel (English and Hebrew) and South Korea (English and Korean).
Leave your thoughts in the comments section below!
This paper proposes a new pattern in the text of the Voynich Manuscript named the “Curve-Line System” (CLS). This pattern is fundamentally based on shapes of individual glyphs but also informs the structure of words. The hypotheses of the system are statistically tested by two independent people to judge their significance. It is also compared to existing word structure paradigms. The results suggest that the shapes of glyphs affect their placement in a word, the Curve-Line System is an intentional feature of the text design, and the text of the Voynich Manuscript is a highly artificial system.
I’ve been toying with the idea of using Pascal’s Triangle to make a cipher that results in similar statistics to the text “system” of the Voynich Manuscript. My concepts are premature but I’m pleased to note that so far I’ve devised something (relatively) simple with short words, binomial word lengths, strong word structure, lines as semantic units, lack of repeated sequences, and word-adjacent repetition. I haven’t had the time to really dig in and quantify any of these and compare with the VMS text but on first glance it appears fairly close.
For example, here is a phrase enciphered using an early version of the cipher and EVA transcription (deliberately seeded so that every word ends in -n):
chiain chiin dain choiin shoin shoiin chiiin chiin shn dain chiiin diin in.
Here is the same phrase again:
potir chiin dain shoedy shoin shoiin chols sheey toy chddy chiiin ooli aiim.
Here is the same phrase yet again:
fodar choiin shn sheey diiin diin choli shoedy chedy ty choin shels daiim.
Here is the same phrase without vowels:
shedy shoey sheyi shoiin choiin chtchar cheli shoyiiim.
Interesting things about my system (so far):
- Word context is highly important and affects all content.
- The same plaintext sequence is almost guaranteed to end up completely different every time it is included. This applies to individual words too. For a word of length n enciphered twice, the probability that its ciphered versions will match is approximately 1/(2^18n), with a few caveats here and there. I wish WordPress could embed formulae easily (can someone please tell me how in the comments below?).
- Multiple appearances of the same ciphertext sequence are almost guaranteed to be completely unrelated. This applies to individual words and similar sequences like Timm Pairs. The probability is similar to the one mentioned above. However, if they are at the very beginning or end of lines they might be a bit related. If they are labels (i.e. enciphered outside a line) they become much more similar.
- Blank spaces in words are meaningful. What do I mean by this? All words actually store 10 letters of information, but one letter of the alphabet is an invisible glyph (we’ll call it “_”), giving the appearance of different word lengths. For example (not a real example), fodar might actually be f _ _ o d _ a _ _ r. The system allows us to unambiguously reconstruct the original ten letter sequence with ease. This allows words to store more information than they would suggest.
- Similar words that appear next to each other (Bad Romance sequences) are an unintended side effect. They store just as much information as any other sequence because of their context.
- It allows for a total of 9^4=6561 unique words, though this can be adjusted with some tricks and workarounds. Stolfi counted a total of 6525 unique words in the Voynich Manuscript.
- If this were confirmed to be the system behind the Voynich Manuscript’s text, I would still have very little idea of how to decipher it.
- Update: At certain points you could pack filler at the beginning or end of a line to make them equal length and make the system a bit more secure. In the Voynich Manuscript itself, some see evidence of meaningless filler material at the beginning or end of some lines.
- Update 2: It also accounts for the findings that the first two letters of each word are more predictable than the rest, and that there is some mild correlation between the end of one word and the start of the next.
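To make the invisible-glyph idea from the list above concrete, here is a small Python sketch of the one direction the post spells out: collapsing a ten-slot word into its visible form. The names `BLANK`, `SLOTS` and `visible_form` are hypothetical, and the sketch does not attempt the reverse step (recovering the blank positions), because that depends on rules of the system that are not described here.

```python
BLANK = "_"   # stand-in for the invisible glyph, as in the post
SLOTS = 10    # every word stores ten letters of information


def visible_form(slots):
    """Return the word as it would appear on the page.

    `slots` is a sequence of exactly ten glyphs, some of which may be
    the invisible glyph. Invisible glyphs are simply not written, so
    words appear to have different lengths.
    """
    if len(slots) != SLOTS:
        raise ValueError(f"expected {SLOTS} slots, got {len(slots)}")
    return "".join(glyph for glyph in slots if glyph != BLANK)


if __name__ == "__main__":
    # The (not real) example from the post: f _ _ o d _ a _ _ r
    print(visible_form(list("f__od_a__r")))
```

With the post's example, the ten slots collapse to the five visible letters of fodar.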