Why you should never use the type “char” in Java

The post title may be blunt. But I think after reading this article, you will never use the type “char” in Java ever again.

The origin of type “char”

At the beginning, everything was ASCII, and every character on a computer could be encoded with 7 bits. While this is fine for most English texts and can also suit most European languages if you strip the accents, it definitely has its limitations. So the extended character table came, bringing a full new range of characters to ASCII, including the infamous character 255, which looks like a space, but is not a space. And code pages were defining how to show any character between 128 and 255, to allow for different scripts and languages to be printed.

Then, Unicode brought this to a brand new level by encoding characters on… 16 bits. This is about the time when Java came out in the mid-1990s. Thus, Java designers made the decision to encode Strings with characters encoded on 16 bits. All Java char has always been and is still encoded with 16 bits.

However, when integrating large numbers of characters, especially ideograms, the Unicode team understood 16 bits were not enough. So they added more bits and notified everyone: “starting now, we can encode a character with more than 16 bits”.

In order not to break compatibility with older programs, Java chars remained encoded with 16 bits. Instead of seeing a “char” as a single Unicode character, Java designers thought it best to keep the 16 bits encoding. They thus had to introduce the new concepts from Unicode, such as “surrogate” chars to indicate that one specific char is actually not a character, but an “extra thing”, such as an accent, which can be added to a character.

Character variations

In fact, some characters can be thought of in different ways. For instance, the letter “ç” can be considered:

  • either as a full character on its own, this was the initial stance of Unicode,
  • either as the character “c” on which a cedilla “¸” is applied.

Both approaches have advantages and drawbacks. The first one is generally the one used in linguistics. Even double characters are considered “a character” in some languages, such as the double l “ll” in Spanish which is considered as a letter on its own, separate from the single letter “l”.

However, this approach is obviously very greedy with individual character unique numbers: you have to assign a number to every single possible variation of a character. For someone who is only familiar with English, this might seem like a moot point. However, Vietnamese, for instance, uses many variations of those appended “thingies”. The single letter “a”, can follow all those individual variations: aàáâãặẳẵằắăậẩẫầấạả. And this goes for all other vowels as well as some consonants. Of course, the same goes for capital letters. And this is only Vietnamese.

The second approach has good virtues when it comes to transliterating text into ASCII, for instance, since transliterating becomes a simple matter of eliminating diacritics. And of course, when typing on a keyboard, you cannot possibly have one key assigned to every single variation of every character, so the second approach is a must.

Special cases: ideograms

When considering ideograms, there are also a small number of “radicals” (roughly 200 for Chinese). Those get combined together to form the large number of ideograms we know (tens of thousands).

Breakdown of a chinese word into character, radical and stroke
A Chinese Word’s decomposition (credit: Nature: https://www.nature.com/articles/s41598-017-15536-w)


It would be feasible to represent any Chinese character using a representation using radicals and their position. However, it is more compact to list all possible Chinese characters and assign a number to each of them, which is what was done by Unicode.

Korean Hangul

Another interesting case is Hangul, which is used to write Korean. Every character is actually a combination of letters and represents a syllable:

Hangul characters are syllables that can be broken down into individual phonemes.
Credit: https://www.youtube.com/watch?v=fHbkwKAIQLA


So, in some cases, it is easier to assign a number to every individual components and then combine them (which happens when typing in Korean on a keyboard). There are only 24 letters (14 vowels and 10 consonants). However, the number of combinations to form a syllable is very large: it amounts to more than 11 000, although only about 3 000 of them produce correct Korean syllables.

Funny characters

People, especially in social media, use an increasing number of special characters, emojis, and other funny stuff, from 𝄞 to 🐻. Those have made it into Unicode, thus making it possible to write ʇxǝʇ uʍop ǝpısdn, 𝔤𝔬𝔱𝔥𝔦𝔠 𝔱𝔢𝔵𝔱, or even u̳n̳d̳e̳r̳l̳i̳n̳e̳d̳ ̳t̳e̳x̳t̳ without the need for formatting or special fonts (all the above are written without special fonts or images, those are standard Unicode characters). Every flag of the world’s countries have even made it as a single character into the Unicode norm.

This plethora of new characters which made it late into the standard are often using more than 16 bits for their encoding.

Using type “char” in Java

When using the type “char” in Java, you accept that things like diacritics or non existent characters will be thrown at you, because remember, a char is encoded with 16 bits. So, when doing “𝄞”.toCharArray() or iterating through this String’s chars, Java will throw at you two characters that don’t exist on their own:

  • \uD834
  • \uDD1E

Both those characters are illegal, and they only exist as a pair of characters.

Bottom line, when it comes to text, chars shouldn’t be used. Ever. In the end, as a Java developer, you have probably learned that, unless doing bit operations, you should never use String.getBytes(), and use chars instead. Well, with the new Unicode norms and the increasing use of characters above 0xFFFF, when it comes to Strings, using char is as bad as using byte.

Java type “char” will break your data

Consider this one:


What do you think this prints? 1? Nope. It prints 2.

Here is one of the consequences of this. Try out the following code:


This prints the following, which might have surprised you before reading this blog post:


But after reading this post, this makes sense. Sort of.

Because substring() is actually checking chars and not code points, we are actually cutting the String which is encoded this way:

\uD834 \uDD1E \u0031
\___________/ \____/
𝄞 1

It is amazing that a technology such as Java hasn’t addressed the issue in a better way than this.

Unicode “code points”

Actually, it is a direct consequence of what was done at the Unicode level. If you tried to break down the character 𝄞 into 16 bits chunks, you wouldn’t get valid characters. But this character is correctly encoded with U+1D11E. This is called a “code point”, and every entry in the Unicode character set has its own code point.

The down side is that an individual character may have several code points.

Indeed, the character “á” can be either of these:

  • the Unicode letter “á” on its own, encoded with U+00E1,
  • the Unicode combination of the letter “a” and its diacritic “◌́”, which results in the combination of U+0061 and U+0301.

Java code points instead of char

A code point in Java is a simple “int”, which corresponds to the Unicode value assigned to the character.

So when dealing with text, you should never use “char”, but “code points” instead. Rather than

“a String”.toCharArray()


“a String”.codePoints()

Instead of iterating on chars, iterate on code points. Whenever you want to check for upper case characters, digits or anything else, never use the char-based methods of class Character or String. Always use the code point counterparts.

Note that this code will actually fail with some Unicode characters:

for (int i = 0 ; i < string.length() ; i++)
   if (Character.isUpperCase(string.charAt(i)))
        ... do something

This will iterate through characters that are NOT characters, but Unicode “code units” which are possibly… garbage.

Inserting data into a database

Consider a simple relational table:

id 🔑 (primary key) int(11)
c (unique constraint) varchar(4)

Now imagine your java program is inserting unique characters in this table. If based on “char” the Java program will consider two different surrogate chars as different since their code are different, but the database will store strange things at some point since those are not valid Unicode codes. And the unique constraint will kick in, crashing your program, and possibly allowing wrong Unicode codes to be pushed into the table.

Alternative replacements

String.toCharArray() String.codePoints() (to which you can append toArray() to get an int[])
String.charAt(pos) String.codePointAt(pos)
String.indexOf(int/char) String.indexOf(String)
iterate with String.length() convert String into an int[] of code points and iterate on those
String.substring() Make sure you don’t cut between a surrogate pair. Or use int[] of code points altogether.
replace(char, char) replaceAll(String, String) and other replace methods using Strings

new String(char[])
new String(char[], offset, count)

new String(int[] codePoints, int offset, int count) with code points
Character methods using type char Character methods using int code points

Surveiller la mortalité « toutes causes » en open source


La transparence est essentielle pour construire une société de confiance. Étant moi-même toujours en doute avec les informations que je rencontre, quelle que soit leur source, je me suis posé beaucoup de questions sur la mortalité depuis le début de l’épidémie de Covid. J’ai d’ailleurs posté des compte-rendus détaillés (et moins détaillés) de ce que je trouvais sur la Covid, ainsi que d’autres informations ici, , et et ailleurs.

L’un des principaux problèmes qu’un citoyen lambda rencontre très vite, c’est la difficulté à évaluer des données nationales qui sont par définition des agrégats, parfois des agrégats d’agrégats… sachant que tout le monde y va de ses propres interprétations. Il me fallait donc repartir de données brutes difficilement falsifiables ou manipulables.

Quelles données ?

Par ailleurs, la comptabilisation des « morts Covid » est toujours biaisée, car comme je l’indique dans cet autre billet, on meurt le plus souvent d’une multitude de facteurs, pas seulement de la Covid (ou d’une autre maladie).

Il ne nous reste donc plus, à nous autres citoyens lambda, la mortalité « toutes causes », qui représente une réalité non biaisée : telle personne est morte à telle endroit à telle date. Or, l’INSEE fournit exactement ces données détaillées. Les données ont toujours 1 mois et demi à 2 mois de retard. Cela permet tout de même de regarder rétrospectivement ce qui s’est passé pour pouvoir anticiper ce qui va arriver. Mais surtout, cela permet de juger du degré de fiabilité de diverses sources d’informations lors des événements passés, et ainsi de se construire un indice de confiance sur telle ou telle source.

J’ai donc développé un programme pour analyser ces données et calculer des courbes de mortalité à partir de ces données brutes. J’ai également mis en place un site pour afficher les résultats, que vous pouvez consulter (le site est chez moi, il est donc possible qu’il ne soit pas toujours disponible et qu’il donne une vue erronée au moment où je fais des mises à jour des données, une fois par mois).

Petit manuel de l’utilisateur

Quelques petites remarques sur le site :

  • les données proviennent directement des données brutes, filtrées (car il y a des doublons dans les fichiers de l’INSEE),
  • le graphe correspond à une année entière, donc les données les plus à gauche correspondent au début de l’année, et celles les plus à droite à la fin de l’année,
  • les différents types d’affichage sont :
    • absolu : c’est le nombre de morts chaque jour,
    • pondéré : le nombre de morts pour 1 million d’habitants, ce qui permet de comparer d’une année sur l’autre puisque la population varie d’une année sur l’autre,
    • étalé : c’est la moyenne glissante du nombre de morts pour 1 million d’habitants sur 10 jours, cela permet d’avoir une version « lissée » des courbes,
  • on peut filtrer par département (ou par pays étranger) pour observer la mortalité dans une région donnée et ainsi cibler des événements locaux,
  • on peut également filtrer par tranches d’âges pour voir l’impact des événements en fonction des âges,
  • on peut filtrer avec un intervalle d’années et/ou de mois, afin de pouvoir comparer des années proches sans le « bruit » des autres années,
  • lorsqu’une seule année est sélectionnée (filtrage de l’année x vers la même année x), l’affichage bascule en mortalité par tranche d’âges, pour analyser un événement en particulier, par exemple.


Dans un souci de transparence et pour faire marcher l’intelligence collective, je publie les sources de mes programmes en open source. Ainsi, chacun peut vérifier mon code, et même l’installer chez soi, le modifier, l’améliorer, et se faire sa propre opinion en toute indépendance.

Le programme java de remplissage de la base de données : https://gitlab.com/jytou/country-death-observe-java

Le site web pour en faire des courbes : https://gitlab.com/jytou/country-death-observe-php

A Single Light to Monitor Them All in Real Time

The context

This week-end’s project was focused around building a simple indicator on my desk that alerts me of any problem at home in real time. That way, I don’t need to regularly check things around. Well, this may sound overkill for many people. But over the years, I’ve written a program to monitor many things around me without the need to perform a regular check myself. The only thing I was missing was some real-time monitor that would alert me whenever something urgently needed my attention.

Computers. I am a computer engineer, and as such I do have a few electronic equipment around. I also monitor the temperature, humidity in an out of the apartment and also water leaks with sensors, as I have some sensitive musical instruments in my home. In the past 2 months, the temperatures here have gone up to 37 degrees Celsius several times. These measurements have helped us, without air conditioning (which makes things worse on the long term), mitigate the heat. We could close everything when outside was hotter than inside, and open in the evening when the temperatures outside were coming close to the temperature inside.

Here are the temperatures my system measured for the last 5 days:

Where I came from

I already had a raspberry pi set up with a tiny monitor showing all the indicators and a big green light (or orange or red) showing the overall status. Something like this:

But this screen consumes 2.5 Watt when it is on, and only 0.15 Watt when it is off. This is a substantial difference, not even mentioning how much wear it causes the screen to be constantly on, just to show a green light!

The solution

A simple solution: use simple LEDs (which consume virtually nothing) controlled by the raspberry. So I built a first circuit to test the whole thing out:

Of course, that also works with the screen off, that’s the goal after all!

A printed circuit

I had to arrange things together on a smaller plate:

… and solder the whole thing together (yes I butchered the soldering, sorry):

That was a mess…

Buddha (you can see his legs on top of the picture) had a hard time coping with the mess, but he played his role perfectly and went into a deep meditation:

And of course I then had to use this very useful piece of equipment for the cables:

Trying it out

The first testing worked out as expected.

I just needed to print a little box for it with the 3D printer:

Final words…

Using a single green LED was not an option. If it was constantly on and green, it could simply mean that the program crashed… which is not good. I could make the LED blink… but any blinking inevitable catches the eye and is bothering. But two green lights alternating doesn’t catch the eye, and ensures that the program is alive.

On the other hand, the error state and the warning state with an orange light are blinking to make sure that my eye will see it:

And now I have an indicator on my desk telling me at all times that everything is ok. Or not.

The managing software is now on my gitlab: https://gitlab.com/jytou/monitoring-and-sensors/

Deuxième vidéo sur la blockchain

Deuxième vidéo d’une série sur la blockchain. Si vous avez entendu parler de cette technologie mais que vous vous posez des tas de questions, cette série est faite pour vous !

Après avoir donné un rapide historique et les divers usages de cette technologie dans la première vidéo, on s’attaque cette fois à ce que c’est un peu plus en détail sans pour autant virer dans la technique.

Visible aussi sur youtube : https://youtu.be/Zhvq0wE8F3Q


Mounting Synology drives on Linux

I’ve just unmounted my drives from my Synology box, replaced by a home-brewed server. I’ll write some other article about the reasons that made me switch. Just in case you wonder, the Synology box is running fine. That’s not the point.

I took the disks from the Synology box and plugged them into a host with simple Linux distrib (in my case, Ubuntu, but that shouldn’t matter).

Just type:

mount /dev/vg1000/lv /mnt

That’s it. You have the file system from your Synology box on your Linux machine. It may come handy in case your box crashed and you are waiting for a new one. In the meantime, you have access to your data.

In case you want to reuse the disks and dispose of them (WARNING: the following will destroy your data on those disks), here is how to do it.

vgremove vg1000

Now check the md volumes that are available and that you didn’t create yourself (just use ls /dev/md12*). Then stop those volumes (replace md12? with the volumes you want to stop if you have additional ones on your system that you obviously don’t want to stop – they won’t stop if they are mounted anyway):

mdadm -S /dev/md12?

Empty the beginning of the related disks, for each disk replace the … by your disk letter:

dd if=/dev/zero of=/dev/sd… bs=1M count=1024

And now you can play around with partitioning etc without being bothered again by vg1000 or mdadm.

Première vidéo d’une série sur la blockchain

Première vidéo d’une série sur la blockchain. Si vous avez entendu parler de cette technologie mais que vous vous posez des tas de questions, cette série est faite pour vous !

Visible aussi sur youtube : https://www.youtube.com/watch?v=6xOyBMkRcHQ

Mind Maps: a Miracle Tool for Writers

This may be a long post, but I hope to keep it entertaining for both writers and non writers. Let me know what you think!

Some history – where I come from

I’ve always loved to write. As a child, I already started writing similar stories than the ones I was reading: children stories. At the time, I was left with that time’s simple tools, a pencil and pieces of paper:

Then, my father seeing that I really loved it, decided to buy me a typewriter (I unfortunately don’t have it anymore, but it was this model):

Besides the fact that it was difficult at first and hurt my fingers since the keys were so hard, it did improve the presentation of my writings by quite a bit, I even allowed myself to make ASCII art even though I didn’t know it would be called that way a decade later:

And when the occasion arose, I had to draw things myself on the paper:

I even went as far as making fake old maps to add to the mystery of the fiction I was writing:

At the time, if you wanted to know about one place where your story was taking place, you basically had to go there and get some documentation:

You could also read lots of books to get a grasp of the place, its atmosphere, its inhabitants and culture, etc. You had to be a real detective.

Then my parents decided that it was time for me to have a computer,

This thing was top notch at the time, it had no hard disk and everything had to be on floppy disks which weren’t so reliable. It was a great improvement on the typewriter, though, and soon MSDOS had no secrets for me. I added a 20 Mb (yes, megabytes) hard disk later which cost me an arm and a leg at the time… I could use Word 2.0 to write, it was great. You could FIX things without typing back a full page! And then, PRINT it and write gibberish on it as much as you wanted to!

Great times. Believe it or not, I still have the files for these books.

Since then, and that’s like 30 years now, nothing much has changed when it comes to the comfort of writing. Of course, you can now travel the world from your desk by watching videos and reading traveler blogs, there is more material around than you can handle anyway.

But the writing, technically? Good ol’ Word. Ah, Libre Office has come around so that you don’t depend on a private company anymore, but that’s pretty much all there is to it.

Of course, in the more recent years, self-publishing has enabled anyone to publish books, while publishing anything was practically impossible before unless an editor accepted to support you.

Tools and Constraints

There are some tools for professional writers. I won’t quote them here because I don’t want to rant, but the added value doesn’t compensate for their price, at least that’s my own opinion.

As a writer, I have quite a number of needs in order to write efficiently:

  • describe the characters in my story, have their personality and picture at hand whenever I need it,
  • describe the places where my story happens, possibly along with pictures, maps or drawings,
  • dynamically design the plot in the most flexible way possible, by quickly arranging events and/or themes seamlessly,
  • have an overview of the entire book the whole time to see where I stand,
  • count words inside chapters to make sure they are roughly balanced (Word documents count the total words of the document, they don’t break the counts into chapters),
  • handling of Table of Contents, presentation of the book, references, footnotes, etc., should be easy and not troublesome, in fact you should never even think about those as it would distract you from writing,
  • navigate efficiently through the book with shortcuts rather than having to scroll pages and pages to reach one chapter,
  • have the finished product in the form of an epub and various pdf formats (one for reading on a screen, one for a small paperback edition with small characters and at least one for a big paperback edition for people with poor vision),
  • manage to have a history of changes.

Frankly, none of the current software can deal with all these constraints easily. Word/LibreOffice documents are a nightmare. LaTeX constantly distracts you from the contents and doesn’t provide an easy way to navigate through the whole document (I wrote my PhD using LaTeX).

Mind Maps are the Perfect Tool

In the meantime, as a computer engineer, I started using Mind Maps at work to organize ideas. Scientific, computer-related ideas.

If you don’t know what a Mind Map is, it’s just a simple tree of ideas such as this:

You organize your ideas in nodes which are broken into smaller nodes as you refine your ideas. These are great for technical planning and thinking.

Some day, I started planning a new book inside a mind map. Just to draw the basic canvas of the story. Then I added my characters into it.

The main plot was in front of me, the characters next to it. Why not write the book inside the mind map? I know that, as soon as you start breaking your ideas into different documents, some of these documents will become out of date very quickly. By writing directly inside the mind map, I had only one document to maintain.

Most Mind Mapping software allows to type some HTML notes inside every node, that’s where I typed the main text of the book. And because it’s HTML, I can add images, put some formatting, bold, etc.

Converting a Mind Map to PDF and EPUB

To my knowledge, there is no converter to create a PDF or an EPUB from a Mind Map. If you think about it, a Mind Map is a simple text document that can be easily parsed, in the meantime libraries to generate PDFs exist, while an EPUB is a simple Zip file with some HTML files inside.

So I wrote a converter in Java, which also counts the number of words in every chapter and sub-chapter.

Thanks to this, I can easily:

  • have everything in one place with all I need visible in one document: the characters and locations along with the basic ideas, the whole book where chapters are nodes and sub-chapters are sub-nodes, and the nodes’ contents is the text of the book itself, so it’s extremely fast and easy to navigate from one part of the book to another,
  • navigate from a character to a given chapter with a simple mouse click,
  • move ideas, events, plots around during the planning phase, while developing characters and locations in the same document,
  • count words in every chapter and sub-chapter with my converter to make sure that things are not totally out of balance,
  • have one single source file for many output formats for the readers, which are even described in the Mind Map,
  • Mind Maps are text files, it is easy to compare a file with a previous backup to see what has changed.

Here is an example of a test mind map that is later transformed into a book (nodes with a yellow icon have notes typed into them):

The generated PDF looks like this:

And the generated Epub is readable on any reader, uploadable to any self-publishing platform.

I hope this converter can help other people as well. Note that its current version as I write this article is rather limited but is perfectly suitable for a simple novel. Its limitations are listed in the description of the tool itself on gitlab.

Did you like this? Let me know in the comments below!

Backups / Part 1

You have precious data

In this digital era, we all have data that we care about. You certainly wouldn’t want to lose most of your earliest baby pictures. 😀

That data is very diverse, both in its nature and in its dynamism. You have static backups such as your baby pictures, but also very dynamic data such as your latest novel. You also have a lot of data that you probably don’t want to back up at all, such as downloaded files and programs. Well, if those files are actually your bank statements, you may want to have a backup in case something goes awfully wrong.

Things go wrong sometimes

Many people store their “stuff” on their computer, and that’s about it. Then one day, the computer crashes. Bad news, the hard disk is dead. Woops.

The thing is, hard disks fail. In 30 years of dealing with computers on a daily basis, I’ve experienced on average one hard disk failure every 2 years, and I don’t even mention the countless floppy disks that died in my hands. 😀 Maybe I’m unlucky.  Maybe I use computers too much.

Regardless, I know people around me who also experienced hard disk failures and were not prepared for them. Some of them took it well, invoking destiny, others didn’t take it so well. But in any case, when it comes to data loss, destiny can be bent. And although I’ve had mostly hard drive failures, SSDs fail too, and in an even worse way since they generally give very little warning signs (if any) and the whole disk is gone at once, whereas on traditional hard drives it may still be possible to retrieve some of the data. USB keys and SD cards are no exceptions, I’ve found they fail quite often, even the ones from established brands.

Most of the time, trained and highly skilled professionals can recover most of the data using specialized equipment. For some examples of what is possible, you can check out this video channel of a very talented data recovery expert for amazing videos. But that comes at a cost. And recovering everything is not always possible.

You can cheat destiny with redundancy!

The good thing about computers is that, unlike paper data, digital stuff can be copied over and over, and that process is very easy and lossless. You just need to use this capability!

The first step towards minimizing the risk of losing data because of a hard disk failure is to set up a RAID (Redundant Array of Independent Disks). The basic idea is to spread your data and duplicate it on several disks, so that if one fails, your data is still on the other disks and you have cheated destiny. We will cover that in the second part of this series.

Redundancy Is Not A Backup

But keep this motto in mind: “Redundancy Is Not A Backup”. You have your array of disks and you can be sure now that even if one hard disk fails, you are still safe. Now, what if a second hard disk fails just after that? What if a power surge fries all your disks? Hardware failures happen, sometimes of something else (motherboard, SATA controller, etc) that even corrupts your data like it happened to this guy. Viruses encrypt all your data and ask for a 1 million $ ransom to get it back. Human error is always possible and you may mistakenly delete some important files. What if your apartment gets robbed? What if it burns or gets flooded?

This is why, along with redundancy, you always NEED backups. You should obviously not store them anywhere near your computer, ideally not even in your home in case something bad happens there. We’ll get into more detail about this in the third part of this series.

Encrypt your backups

Last but not least, as soon as you store your backups outside of your home, then comes the problem of privacy: what if someone comes across your backup and gets access to your data? You may not care about some of it being accessed by strangers, but you will probably want to shield some of your precious files from prying eyes. That will be the fourth and last part of this series.

GANN-EWN / Part 5 / Applying the Neural Network trained by the Genetic Algorithm to the game EWN

If you haven’t read the previous parts, you should start at part 1.

Now that we have a Neural Network architecture and a Genetic Algorithm, we can apply them to the game “Einstein Würfelt Nicht”.

The Parameters of the Problem

There are several parameters that need to be addressed first:

  • how to feed information to the NN and how to get a result,
  • how big should the NN be, how many layers, how many neurons per layer, how many connections between each layer,
  • how to tune the parameters for the GA (population size, mutation rate, etc.).

There are some precedents for every of these 3 points (NNs as well as GAs have been studied), but there is no existing answer for this particular set of problems.

Feeding Information to the NN

The answer to the first question seems trivial, but it is actually not. The first answer that comes to mind is “just feed it the board and the die and get the stone that has to be played and the move to be played”.

This is of course a valid answer, but there are many more:

  • don’t feed the whole board, but rather the positions of the stones along with the die,
  • feed the possible moves instead of the die,
  • or you could use the network as a “value network”, in other words let the network evaluate how favorable a certain position is for the current player. In that case, the algorithm has to simulate every possible move and apply the network on every resulting board.

There are many other ways of feeding information, including feeding redundant information. For instance, you could feed the number of stones left for each player in addition to the board: that’s an information which is obviously useful and that the network could directly use rather than having to calculate it again.

Getting the NN’s Result

The Neural Network can be used to gather very different results, also depending on which information it was given as inputs. Here are a few ways that the network can be used:

  • the number of the tile to be played along with the move to be played,
  • among all the possible moves, get the index of the chosen move,
  • return one output number for each tile and each move, and play the tile that has the highest number along with the move that has the highest number,
  • if used as a value network, just return one value: the fitness of the current position.

Again, there are many ways of using a neural network to play. We could even use two networks: one to choose the tile to play, and then a second one to choose the move. Whatever we choose, we have to keep in mind two main points here:

  • the result can never be an invalid move, which is not always trivial,
  • make sure that results cannot be incoherent. For instance, a valid possibility would be to have two integer outputs: one for the tile to play on the board, one for the move to play, each of them being applied the mathematical “mod” to make sure that the results are within range. But then there might be a discrepancy between the chosen tile and the chosen move. Maybe the move made perfect sense with another tile, but not with the one that was eventually selected.

The Size of the Neural Network

I tend to reason in terms of complexity of the game to address this problem. Just think about “if I had to code a perfect player for this game, how many rules and cases would I have to take into account”.

The answer also depends on what you feed the network. If you feed it a lot of redundant information (for instance, feed it the board and the number of remaining tiles for each player), then the network will have to extract less metadata from the board.

In the case of the game “Einstein Würfelt Nicht”, I chose not to mostly not give any redundant information to the network. Given the size of the board, I believed that a simple network of just a few layers and a couple of hundred neurons would probably do the trick.

Then comes the number of connections between layers. In order to extract as much information as possible from the board, I believed that a fully connected first layer was needed – although I chose not to enforce it, but I gave a sufficient amount of connections for this to happen. So I started off with a first layer of 20 neurons, along with 500 connections between the board (which is a 25 bytes array, with an additional byte for the die). I have also tried other different variants.

The Parameters of the Genetic Algorithm

Population size, mutation and cross-over rates

I started off with a population of 100 individuals and made some tests with 200. In that population, I chose to keep a fair amount of the best individuals, 10 to 30%, without checking their scores. All the others are discarded and replaced with either new individuals, either top individuals that have been mutated and crossed-over.

As for the mutation rate, I made it randomly chosen between 1/1000 and ten to twenty per 1000. That is to say that to create a mutated individual, 1 to 10/20 random mutations are applied for every 1000 bytes of its DNA. Note that with a network of 10000 elements, that’s just a few mutations in the network. A mutation can be a change in an operation, a connection move or a change in parameters such as weight and offset.

As for the crossover rate, I made it from 0.01% to 1%. As we will see later, it wasn’t that successful in the first versions.

Evaluating players

Another important parameter is the accuracy of the evaluation for every individual. In the case of a game, it can be measured by having this individual play many games against other players. Other players may be other individuals of the population and/or a fixed hard-coded player. The more games are played, the more accurate the rating of a player. And this is getting more and more critical as the player improves: at the beginning, the player just needs to improve very basic skills and moves, it is failing often anyway, so it is easy to tell the difference between a good player and a bad one. As it improves, it becomes more and more difficult to find situations in which players can gain an advantage or make a slight mistake.

In the case of EWN, as it is a highly probabilistic game, the number of matches that are required can grow exponentially. Note that there are even a large number of starting positions: 6! x 6! which is roughly 500 thousand permutations. With symmetries we can remove some of them, but there is still a large number of starting positions despite the very simple placing rules. So even if you play 100.000 games for a player, it is still not covering the wide variety of openings. What if your player handles well those 100.000 openings but is totally lame at playing the rest? Not even mentioning the number of possible positions after a few turns.

A good indicator to check whether we have played enough games to correctly rate players is the “stability” of the ranking of players as we continue playing games. As the rankings stabilize (eg for instance the top player remains at the top for quite a long time) we are getting better and better accuracy.

Individuals selection and breeding

As I developed this and started testing it, I realized that the evolution was going very slowly: new individuals were bad in general, with only a few of them reaching the “elite” of the population. That’s because of the randomness of the alterations. We will see later how I tackled that problem.

As I was going forward in this and observing how slow the process was on a CPU, I also started planning to switch the whole evaluation process to the GPU.

GANN-EWN / Part 4 / Developing a Genetic Algorithm from scratch

Welcome to this fourth part of building Neural Networks evolved by Genetic Algorithms. If you haven’t done so yet, I would advise to read the first, second and third parts before reading this article.

Basics about Genetic Algorithms

So, what exactly is a Genetic Algorithm? The name applied to Computer Science sounds both scary and mysterious.

However, it is actually quite simple and refers to Darwin’s theory of evolution which can be summed up in one simple sentence: “The fittest individuals in a given environment have a better chance than other individuals of surviving and having an offspring that will survive”. Additionally, the offspring carries a mutated crossover version of the genes of the parents.

Given this general definition, the transposition to Computer Science is the following:

  • design “individuals” to solve a particular problem, whose parameters are in the form of a “chromosome”, most of the time a simple array of bits,
  • create a “population” of these individuals,
  • evaluate all individuals in the population and sort them by their fitness (e.g. how close they get to solving the problem well),
  • create new individuals by making crossovers and mutations on the best individuals of the current generation,
  • replace the worst individuals in the population by those new individuals,
  • rinse and repeat: go back to evaluating all individuals in the population.

With this in mind, the choice in part 3 to store our neural networks in simple arrays comes into a new light: those arrays are the chromosomes of our individuals.

Our Genetic Algorithm

The genetic algorithm I built went through several phases already.

Here is the first phase:

  1. generate n individuals, each representing a player, note that players may also be non Neural-Network-driven players or players using different types of NNs, we can actually mix players of different types in the population,
  2. make them play against each other a certain number of games,
  3. select the ones with more wins and discard the others,
  4. replace the discarded players either by new random players, either by mutations and crossovers of the best players.
  5. go back to step 2.

Still, with this simple algorithm, there are many possible parameters:

  • the population size,
  • how many individuals to keep? Keep the top x %? Or all the ones managing to have a certain score?
  • how much mutation and crossover percentage should be applied?
  • how many games to play to get a good confidence score on every player?

The supporting UI

To evaluate the impacts of these parameters, I then built an UI on top of this to observe the evolution of the population and see how the best individuals evolve. The UI is made of a simple table, sorted by “performance” (that’s to say the percentage of wins). Every row shows one individual, its ranking, its age (since a single individual can survive for multiple generations), its original parent, and the number of total mutated generations it comes from.

Later, I also added at the bottom of the screen the progression of the score of the best individuals.

Here is a simple screenshot:

The green part represents the individuals that will be selected for the next generation. All individuals in red will be discarded. The gray ones are non NN implementations that can be used as “benchmarks” against the NN implementations. When the first population is created randomly, they are generally beaten very easily by those standard implementations, but we can see that after some iterations, the NNs selected one generation after the other end up beating those standard implementations. We’ll dig into that in the next post, along with the choice of the different parameters.

Next comes part 5.