Learning words in foreign languages

Recently I was asked how many times you should hear a word in a foreign language before it really sticks into your mind.

Sometimes, hearing/reading one word one single time in the right context will imprint it into your mind forever. And sometimes, you will repeat one word 100 times and it will not stick. Spaced repetition is a powerful way to get the words to stick while reducing the number of times you are exposed to each word, but it is not magical either. With the wrong context, you may also fail with spaced repetition.

If I learned one thing from decades of studying, it is that context is everything. That’s why trying to immerse yourself in a certain context while learning a language is important. The best of all is to simply be in the country of the language you’re learning. But as that is not always possible, here are a few tricks.

When I use Anki to do spaced repetition, I listen to some music in the language I’m learning while repeating the words. This switches the brain into the mode “oh, that’s this language, okay!”, as well as cheering you up and setting up a mood. You might even want to tap your feet with the rhythm while learning words. And on all my cards, I have an image of something that is characteristic of the country/ies where that language is spoken, as well as some sentences in which the word is used – because learning a word by itself is boring, and learning it within a sentence makes it more interesting. I will make a post later to explain how I did it technically. Associating a picture with the word also helps quite a bit, especially for physical objects.

Teachers know that bored students don’t learn anything. That’s why teachers who make their classes very emotionally alive are more successful than others. There are some very serious scientific studies on this but I’m sure you have in your own experience that teacher who stood above all others because his classes were so lively, funny or exhilarating.

And of course it all depends on the language you’re learning and the language(s) you already know. The learning curve of Japanese or Arabic is obviously much steeper for a native monolingual English speaker than that of German for someone who knows Dutch and Danish.

So there is no “number of times for a word to stick”, it’s all about context!

The Moon

The Moon is back with its normal non-eclipsoid figure. 😀

Here is a picture taken with a Nikon D5100 mounted with a simple Barlow on a Celestron 115/910. I’ve had both the telescope and the camera for a long time now, but never took the next step of taking pictures. Thanks to the recently acquired barlow, this is a dream come true.

This is taken from a balcony in the middle of a big city, not exactly your ideal conditions for taking pictures of the sky especially during a hot summer with lots of temperature differences, but the results are still quite good and I have not applied any software fix on them. Exposure: 1/200 s, sensitivity 1000 ISO. Enjoy!

Note how the chromatic aberration varies with the location of the details, which is mostly due to the Barlow lens. The following animation shows the same details taken in the center of the picture compared to the edge (1/100 s, 400 ISO):

Moon Eclipse + Mars / 2018-07-27

The sky was kind enough to let us see the moon eclipse yesterday for a short time, as it was quite cloudy. A very nice experience.

Of course, without forgetting its friend Mars (take your time, it’s an animation) :

The clouds also allowed for some creepy shots, you’d wonder if Freddy Krueger was around.

 

Note that these are raw photos. No filters.

Backups / Part 1

You have precious data

In this digital era, we all have data that we care about. You certainly wouldn’t want to lose most of your earliest baby pictures. 😀

That data is very diverse, both in its nature and in its dynamism. You have static data such as your baby pictures, but also very dynamic data such as your latest novel. You also have a lot of data that you probably don’t want to back up at all, such as downloaded files and programs. Then again, if those downloaded files are actually your bank statements, you may want to have a backup in case something goes awfully wrong.

Things go wrong sometimes

Many people store their “stuff” on their computer, and that’s about it. Then one day, the computer crashes. Bad news, the hard disk is dead. Woops.

The thing is, hard disks fail. In 30 years of dealing with computers on a daily basis, I’ve experienced on average one hard disk failure every 2 years, not to mention the countless floppy disks that died in my hands. 😀 Maybe I’m unlucky. Maybe I use computers too much.

Regardless, I know people around me who also experienced hard disk failures and were not prepared for them. Some of them took it well, invoking destiny; others didn’t take it so well. But in any case, when it comes to data loss, destiny can be bent. And although I’ve had mostly hard drive failures, SSDs fail too, and in an even worse way, since they generally give very few warning signs (if any) and the whole disk is gone at once, whereas on traditional hard drives it may still be possible to retrieve some of the data. USB keys and SD cards are no exception: I’ve found they fail quite often, even the ones from established brands.

Most of the time, trained and highly skilled professionals can recover most of the data using specialized equipment. For some examples of what is possible, you can check out this video channel of a very talented data recovery expert. But that comes at a cost. And recovering everything is not always possible.

You can cheat destiny with redundancy!

The good thing about computers is that, unlike paper data, digital stuff can be copied over and over, and that process is very easy and lossless. You just need to use this capability!

The first step towards minimizing the risk of losing data because of a hard disk failure is to set up a RAID (Redundant Array of Independent Disks). The basic idea is to spread your data and duplicate it on several disks, so that if one fails, your data is still on the other disks and you have cheated destiny. We will cover that in the second part of this series.

Redundancy Is Not A Backup

But keep this motto in mind: “Redundancy Is Not A Backup”. You have your array of disks and you can now be sure that even if one hard disk fails, you are still safe. Now, what if a second hard disk fails just after that? What if a power surge fries all your disks? Hardware failures happen, and sometimes it is another component (motherboard, SATA controller, etc.) that fails and even corrupts your data, like it happened to this guy. Viruses can encrypt all your data and ask for a $1 million ransom to get it back. Human error is always possible and you may mistakenly delete some important files. What if your apartment gets robbed? What if it burns or gets flooded?

This is why, along with redundancy, you always NEED backups. You should obviously not store them anywhere near your computer, ideally not even in your home in case something bad happens there. We’ll get into more detail about this in the third part of this series.

Encrypt your backups

Last but not least, as soon as you store your backups outside of your home, then comes the problem of privacy: what if someone comes across your backup and gets access to your data? You may not care about some of it being accessed by strangers, but you will probably want to shield some of your precious files from prying eyes. That will be the fourth and last part of this series.

3D printing – part 1

I have been curious about 3D printing for a few years now but never found the time and courage to finally take the first step and buy a printer. The fact is that the prices were quite discouraging for a simple hobby, at least that’s how I saw it.

I did have some real interest in 3D printing, as I had already printed a few items through a 3D printing website, such as a camera cover for my tablet and a magnet holder for my drinking glass.

While the first one is a classic, you might wonder what the second one is for. I invite you to check Professor Luc Montagnier’s research on water; it might give you a few clues. This is just a little experiment of mine. Since I have been drinking that polarized water (for a few years now), I have not been sick, although there are still some things that could be improved in my lifestyle regarding health.

Anyway, back to 3D printing, the cost of delegating the printing to a 3rd party is quite prohibitive. The second piece cost a mere 30 euro + shipping. And I ordered 2 of them. You’d better not make any mistake in the design.

Then, in December 2017, I found a 3D printer kit on sale on Gearbest at $100. I thought “What the heck, this is the equivalent of printing 3 of my magnet holders!”. And I had many other projects in mind, but I was reluctant to make them because of the cost involved. So I just bought the printer, although I had some idea that, as it was a kit, it would require quite a bit of attention and time. Well, that’s an understatement.

The thing is that 3D printing is not yet for everyone. Not only as a bare kit, but also in general. Even high-end 3D printers are still having a lot of issues, from what I read on the Internet. I’m actually glad that I bought a kit:

  • it was very cheap,
  • I got to get familiar with every single part and detail of the printer,
  • if anything fails I always have a fix at hand.

The last point may be the most important of all. If you have a stock printer, when it fails and you don’t want to burn your warranty, you have to go back to the seller or the manufacturer. That’s a lot of cost, time and energy (communicating, explaining your problem, sending it to the post then waiting for it to return, etc.), when you could actually use that time to fix it yourself.

Of course, the thought of printing your own stuff is very thrilling. But do bear in mind that 3D printing is very demanding, and before jumping into it, you should know what you are putting yourself into.

Although I’ve had my printer for just 6 months, I have already printed quite a lot of things. In fact, the printer has been active for almost 2 full months:

And I have also spent some time transforming it, fine-tuning, fixing problems, etc, and it is significantly different from the original kit now:

Anyway, what I can say is that it is also very demanding, in time and energy. Many things can go wrong with that technology, especially if you start from scratch with a kit:

  • there are obviously heated parts (at least 200°C, that’s 392°F), which can be dangerous, including the hazard of burning yourself,
  • mechanical problems including bending parts,
  • precision is key, a fraction of a millimeter can make the difference between failure and success,
  • sensitive electronics (a motherboard, an LCD screen),
  • strong currents that can represent a fire hazard, especially as the basic equipment that comes with the kit is not exactly 100% safe,
  • melted plastic, with all its potential caveats (toxic gas, fluidity, adherence depending on the temperature, stuck nozzle, etc.),
  • bugs in software and firmware,
  • moving parts and wear (soldered wires coming off…), strongly vibrating parts which can cause loose screws, detached pins, etc.,
  • noise issues…

I think you get my point by now. Every single one of these aspects is sensitive enough to cause a print to fail. Just keep this motto in mind: “If it is possible for anything to fail, it WILL fail after some hours of printing.” Thus you want to have everything safely secured so that there is no possibility left for it to fail.

In order to achieve this, many different skills are required:

  • feeling at home with computers (obviously); some minimal knowledge of electronics is definitely a plus,
  • fixing mechanical parts, including very small and detailed pieces,
  • patience (that’s a big one as printing big parts can sometimes take a full day or more – my record is 37 hours),
  • being capable of soldering and making your own wire connectors,
  • 3D design skills, unless you only want to print things that have been designed by others, so that you can take full advantage of your printer (creativity is definitely a plus here),
  • coping with the constraints inherent to 3D printing (minimal wall thickness, connection thickness and detail, as few hanging parts as possible, etc.),
  • finding an appropriate place to put the printer in your home (if you plan to print ABS you definitely want a ventilated area),
  • evaluating the physical resistance of printed pieces. This point is not a joke, especially as heat can come into play with certain materials:

In this particular case, it was a combination of two things: I had misjudged the strength of the piece given its width and the weight it was supposed to carry, and the color I printed it in was quite sensitive to the heat from the sun (and that was PLA). After printing it in white and a bit thicker, I don’t have any problem anymore, although the temperatures in France are currently reaching 35°C.

So yes, there is a lot of fine-tuning, trial and error in 3D printing. And changing any single thing in your habits, including the filament brand or even color can break a print.

However, all this said, 3D printing is a lot of fun!

Have a look at my Thingiverse page, where I share pieces that can be of use to other people (which is not always the case: 3D printing is mostly about making pieces that fit exactly your own needs – not necessarily your neighbours’).

GANN-EWN / Part 5 / Applying the Neural Network trained by the Genetic Algorithm to the game EWN

If you haven’t read the previous parts, you should start at part 1.

Now that we have a Neural Network architecture and a Genetic Algorithm, we can apply them to the game “Einstein Würfelt Nicht”.

The Parameters of the Problem

There are several parameters that need to be addressed first:

  • how to feed information to the NN and how to get a result,
  • how big should the NN be, how many layers, how many neurons per layer, how many connections between each layer,
  • how to tune the parameters for the GA (population size, mutation rate, etc.).

There are some precedents for each of these 3 points (NNs as well as GAs have been studied), but there is no ready-made answer for this particular set of problems.

Feeding Information to the NN

The answer to the first question seems trivial, but it is actually not. The first answer that comes to mind is “just feed it the board and the die and get the stone that has to be played and the move to be played”.

This is of course a valid answer, but there are many more:

  • don’t feed the whole board, but rather the positions of the stones along with the die,
  • feed the possible moves instead of the die,
  • or you could use the network as a “value network”, in other words let the network evaluate how favorable a certain position is for the current player. In that case, the algorithm has to simulate every possible move and apply the network on every resulting board.
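The “value network” option in the last bullet can be sketched as follows. This is only a minimal illustration of the idea, not the code used in this project: `choose_move`, `apply_move` and `evaluate` are hypothetical names, and `evaluate` stands in for the network, assumed here to score a board from the point of view of the player who has just moved.

```python
from typing import Callable, Iterable, Tuple, TypeVar

Board = TypeVar("Board")
Move = TypeVar("Move")


def choose_move(
    board: Board,
    legal_moves: Iterable[Move],
    apply_move: Callable[[Board, Move], Board],
    evaluate: Callable[[Board], float],
) -> Tuple[Move, float]:
    """Return the legal move whose resulting board gets the highest score.

    `evaluate` is a stand-in for the value network (hypothetical signature).
    """
    best_move, best_score = None, float("-inf")
    for move in legal_moves:
        # Simulate the move, then let the "network" rate the resulting board.
        score = evaluate(apply_move(board, move))
        if score > best_score:
            best_move, best_score = move, score
    if best_move is None:
        raise ValueError("no legal move available")
    return best_move, best_score


# Toy usage: the "board" is a number, a move adds to it, and the stand-in
# "network" prefers boards close to 10.
move, score = choose_move(7, [1, 2, 3, 4], lambda b, m: b + m, lambda b: -abs(10 - b))
print(move, score)  # 3 0
```

The cost is one network evaluation per legal move, but the result can never be an invalid move, since only legal moves are ever simulated.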

There are many other ways of feeding information, including feeding redundant information. For instance, you could feed the number of stones left for each player in addition to the board: that is information which is obviously useful and that the network could use directly rather than having to compute it again.

Getting the NN’s Result

The Neural Network can be used to gather very different results, also depending on which information it was given as inputs. Here are a few ways that the network can be used:

  • the number of the tile to be played along with the move to be played,
  • among all the possible moves, get the index of the chosen move,
  • return one output number for each tile and each move, and play the tile that has the highest number along with the move that has the highest number,
  • if used as a value network, just return one value: the fitness of the current position.

Again, there are many ways of using a neural network to play. We could even use two networks: one to choose the tile to play, and then a second one to choose the move. Whatever we choose, we have to keep in mind two main points here:

  • the result can never be an invalid move, which is not always trivial,
  • make sure that results cannot be incoherent. For instance, a valid possibility would be to have two integer outputs: one for the tile to play on the board, one for the move to play, each of them reduced with the mathematical “mod” operation to make sure that the results are within range. But then there might be a discrepancy between the chosen tile and the chosen move. Maybe the move made perfect sense with another tile, but not with the one that was eventually selected.
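One possible way to satisfy both points at once is sketched below, as an assumption rather than the method used here: list the legal (tile, move) pairs for the current die first, then reduce a single integer output with “mod” over that list. The tile numbers and direction names are hypothetical.

```python
from typing import List, Sequence, Tuple

TileMove = Tuple[int, str]  # (tile number, direction): a hypothetical encoding


def pick_legal(raw_output: int, legal: Sequence[TileMove]) -> TileMove:
    """Map any integer network output onto one of the legal (tile, move) pairs."""
    if not legal:
        raise ValueError("no legal move in this position")
    # With a positive modulus, Python's % always returns an index in
    # [0, len(legal)), even when raw_output is negative.
    return legal[raw_output % len(legal)]


legal_now: List[TileMove] = [(2, "right"), (2, "down"), (4, "diagonal")]
print(pick_legal(7, legal_now))   # (2, 'down')
print(pick_legal(-1, legal_now))  # (4, 'diagonal')
```

Because the tile and the move are chosen together, the tile/move discrepancy described above cannot occur. The price is that the meaning of a given output value changes with the list of legal moves.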

The Size of the Neural Network

I tend to reason in terms of complexity of the game to address this problem. Just think about “if I had to code a perfect player for this game, how many rules and cases would I have to take into account”.

The answer also depends on what you feed the network. If you feed it a lot of redundant information (for instance, feed it the board and the number of remaining tiles for each player), then the network will have to extract less metadata from the board.

In the case of the game “Einstein Würfelt Nicht”, I chose to mostly not give any redundant information to the network. Given the size of the board, I believed that a simple network of just a few layers and a couple of hundred neurons would probably do the trick.

Then comes the number of connections between layers. In order to extract as much information as possible from the board, I believed that a fully connected first layer was needed – although I chose not to enforce it, I gave it a sufficient number of connections for this to happen. So I started off with a first layer of 20 neurons, along with 500 connections between the board (which is a 25-byte array, with an additional byte for the die) and that layer. I have also tried other variants.

The Parameters of the Genetic Algorithm

Population size, mutation and cross-over rates

I started off with a population of 100 individuals and made some tests with 200. In that population, I chose to keep a fair amount of the best individuals, 10 to 30%, without checking their scores. All the others are discarded and replaced with either new individuals or top individuals that have been mutated and crossed over.

As for the mutation rate, I made it randomly chosen between 1 per 1000 and an upper bound of 10 to 20 per 1000. That is to say that, to create a mutated individual, between 1 and 10 (or 20) random mutations are applied for every 1000 bytes of its DNA. Note that with a network of 10000 elements, that’s just a few mutations in the network. A mutation can be a change in an operation, a connection move or a change in parameters such as weight and offset.
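As a rough illustration of such a rate, here is a minimal sketch that mutates a DNA stored as a list of bytes. It is a simplification and not the project’s code: the function name and parameters are hypothetical, and each mutation simply overwrites one random byte, whereas the mutations described above are structured (operation, connection, weight or offset).

```python
import random
from typing import List, Optional


def mutate(dna: List[int], rng: Optional[random.Random] = None,
           min_per_1000: int = 1, max_per_1000: int = 20) -> List[int]:
    """Return a mutated copy of `dna` (a list of byte values, 0..255)."""
    if not dna:
        return []
    rng = rng or random.Random()
    rate = rng.randint(min_per_1000, max_per_1000)  # mutations per 1000 bytes
    count = max(1, len(dna) * rate // 1000)         # apply at least one mutation
    child = list(dna)                               # the parent is left untouched
    for _ in range(count):
        position = rng.randrange(len(child))
        child[position] = rng.randrange(256)        # overwrite one byte at random
    return child


parent = [0] * 10_000
child = mutate(parent, random.Random(42))
changed = sum(1 for a, b in zip(parent, child) if a != b)
print(changed, "bytes differ out of", len(parent))
```

With 10000 bytes and a rate between 1 and 20 per 1000, this applies between 10 and 200 overwrites, so at most 200 bytes can differ from the parent.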

As for the crossover rate, I made it from 0.01% to 1%. As we will see later, it wasn’t that successful in the first versions.

Evaluating players

Another important parameter is the accuracy of the evaluation for every individual. In the case of a game, it can be measured by having this individual play many games against other players. Other players may be other individuals of the population and/or a fixed hard-coded player. The more games are played, the more accurate the rating of a player. And this is getting more and more critical as the player improves: at the beginning, the player just needs to improve very basic skills and moves, it is failing often anyway, so it is easy to tell the difference between a good player and a bad one. As it improves, it becomes more and more difficult to find situations in which players can gain an advantage or make a slight mistake.
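To put a rough number on “the more games, the more accurate”, one can treat every game as an independent win/loss trial. This is a simplifying assumption, since opponents change over time, but under it the standard error of an observed win rate p over n games is sqrt(p(1-p)/n). The function name below is only illustrative.

```python
from math import sqrt


def win_rate_std_error(win_rate: float, games: int) -> float:
    """Standard error of an observed win rate over `games` independent games."""
    return sqrt(win_rate * (1.0 - win_rate) / games)


for games in (100, 10_000, 1_000_000):
    print(games, round(win_rate_std_error(0.5, games), 4))
# 100 0.05
# 10000 0.005
# 1000000 0.0005
```

In other words, telling apart two players whose true win rates differ by about one percentage point takes on the order of ten thousand games each, and every extra digit of accuracy costs a hundred times more games.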

In the case of EWN, as it is a highly probabilistic game, the number of matches that are required can grow exponentially. Note that there is even a large number of starting positions: 6! x 6!, which is roughly 500 thousand permutations. With symmetries we can remove some of them, but there is still a large number of starting positions despite the very simple placing rules. So even if you play 100.000 games for a player, that still does not cover the wide variety of openings. What if your player handles those 100.000 openings well but is totally lame at playing the rest? And that is not even mentioning the number of possible positions after a few turns.
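The arithmetic behind these figures can be checked in a few lines. Symmetries are ignored here, and the coverage figure is an upper bound that assumes every game starts from a different opening.

```python
from math import factorial

placements_per_player = factorial(6)           # 6 stones on 6 starting squares
starting_positions = placements_per_player ** 2
games_played = 100_000

print(placements_per_player)                        # 720
print(starting_positions)                           # 518400
print(f"{games_played / starting_positions:.1%}")   # 19.3%
```

So 100.000 games can touch at most about a fifth of the openings, and each of them only once, with a single sequence of die rolls.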

A good indicator to check whether we have played enough games to correctly rate players is the “stability” of the ranking of players as we continue playing games. As the rankings stabilize (for instance, the top player remains at the top for quite a long time), we are getting better and better accuracy.

Individuals selection and breeding

As I developed this and started testing it, I realized that the evolution was going very slowly: new individuals were bad in general, with only a few of them reaching the “elite” of the population. That’s because of the randomness of the alterations. We will see later how I tackled that problem.

As I was going forward in this and observing how slow the process was on a CPU, I also started planning to switch the whole evaluation process to the GPU.

GANN-EWN / Part 4 / Developing a Genetic Algorithm from scratch

Welcome to this fourth part of building Neural Networks evolved by Genetic Algorithms. If you haven’t done so yet, I would advise you to read the first, second and third parts before reading this article.

Basics about Genetic Algorithms

So, what exactly is a Genetic Algorithm? The name applied to Computer Science sounds both scary and mysterious.

However, it is actually quite simple and refers to Darwin’s theory of evolution which can be summed up in one simple sentence: “The fittest individuals in a given environment have a better chance than other individuals of surviving and having an offspring that will survive”. Additionally, the offspring carries a mutated crossover version of the genes of the parents.

Given this general definition, the transposition to Computer Science is the following:

  • design “individuals” to solve a particular problem, whose parameters are in the form of a “chromosome”, most of the time a simple array of bits,
  • create a “population” of these individuals,
  • evaluate all individuals in the population and sort them by their fitness (i.e. how close they get to solving the problem well),
  • create new individuals by making crossovers and mutations on the best individuals of the current generation,
  • replace the worst individuals in the population by those new individuals,
  • rinse and repeat: go back to evaluating all individuals in the population.
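The steps above can be sketched on a toy problem. This is a generic illustration under stated assumptions, not the algorithm built in this series: the “problem” is simply to set as many bits as possible to 1, and all names and default parameter values are hypothetical.

```python
import random
from typing import List

Chromosome = List[int]  # a simple array of bits


def fitness(chromosome: Chromosome) -> int:
    """Toy problem: the more 1 bits, the fitter the individual."""
    return sum(chromosome)


def crossover(a: Chromosome, b: Chromosome, rng: random.Random) -> Chromosome:
    cut = rng.randrange(1, len(a))  # single-point crossover
    return a[:cut] + b[cut:]


def mutate(chromosome: Chromosome, rng: random.Random, rate: float) -> Chromosome:
    # Flip each bit independently with probability `rate`.
    return [bit ^ 1 if rng.random() < rate else bit for bit in chromosome]


def evolve(bits: int = 32, population_size: int = 50, keep: float = 0.2,
           generations: int = 100, mutation_rate: float = 0.02,
           seed: int = 0) -> Chromosome:
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(bits)]
                  for _ in range(population_size)]
    elite_size = max(2, int(population_size * keep))
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)  # evaluate and sort
        elite = population[:elite_size]             # the best individuals survive
        children = []
        while len(elite) + len(children) < population_size:
            mother, father = rng.sample(elite, 2)
            children.append(mutate(crossover(mother, father, rng), rng, mutation_rate))
        population = elite + children               # the worst ones are replaced
    return max(population, key=fitness)


best = evolve()
print(fitness(best), "of 32 bits set")
```

Because the best individuals are carried over unchanged, the top fitness can never decrease from one generation to the next.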

With this in mind, the choice in part 3 to store our neural networks in simple arrays comes into a new light: those arrays are the chromosomes of our individuals.

Our Genetic Algorithm

The genetic algorithm I built went through several phases already.

Here is the first phase:

  1. generate n individuals, each representing a player. Note that players may also be non-Neural-Network-driven players or players using different types of NNs: we can actually mix players of different types in the population,
  2. make them play against each other a certain number of games,
  3. select the ones with more wins and discard the others,
  4. replace the discarded players either by new random players or by mutations and crossovers of the best players,
  5. go back to step 2.

Still, with this simple algorithm, there are many possible parameters:

  • the population size,
  • how many individuals to keep? Keep the top x %? Or all the ones managing to have a certain score?
  • how much mutation and crossover percentage should be applied?
  • how many games to play to get a good confidence score on every player?

The supporting UI

To evaluate the impact of these parameters, I then built a UI on top of this to observe the evolution of the population and see how the best individuals evolve. The UI is made of a simple table, sorted by “performance” (that is to say, the percentage of wins). Every row shows one individual, its ranking, its age (since a single individual can survive for multiple generations), its original parent, and the number of total mutated generations it comes from.

Later, I also added at the bottom of the screen the progression of the score of the best individuals.

Here is a simple screenshot:

The green part represents the individuals that will be selected for the next generation. All individuals in red will be discarded. The gray ones are non NN implementations that can be used as “benchmarks” against the NN implementations. When the first population is created randomly, they are generally beaten very easily by those standard implementations, but we can see that after some iterations, the NNs selected one generation after the other end up beating those standard implementations. We’ll dig into that in the next post, along with the choice of the different parameters.

Next comes part 5.

Installing a working Python environment and Silkaj on a Raspberry Pi 3 with Raspbian Jessie

Raspberries are awesome. But setting up things can sometimes be a little messy. I wanted to install a working version of Silkaj (see the Duniter project, if you don’t know them yet, check them out, they rock!) and here is a full tutorial to get you going.

Requirements:

  • a Raspberry Pi (mine is a version 3 but it should probably work on a 2 as well),
  • Raspbian Jessie, but it would probably work on any other raspbian or even any Debian-based distribution,
  • networking (obviously).

Required packages

You will need to have libsodium and libffi already installed, as well as libssl and its development package, otherwise install them:

sudo apt-get install libsodium13 libsodium-dev libffi6 libffi-dev libssl-dev

Note that you do need the development packages because python’s installer will need to compile some dependencies with them later.

Check where libffi.pc has been installed:

find /usr -name "*libffi.pc*"

You need to add the directory containing libffi.pc to the PKG_CONFIG_PATH environment variable, which pkg-config reads as a colon-separated list of directories. Check whether that variable is empty or not: if it’s not empty, you need to APPEND the directory (separated by a colon) instead of overwriting the existing value, of course. Change the directory to the one returned by your previous command:

export PKG_CONFIG_PATH=/usr/lib/arm-linux-gnueabihf/pkgconfig

Installing Python 3.6 and pipenv

Because silkaj and its dependencies don’t run well with older versions of Python, you need to install Python 3.6. Here is the recipe:

wget https://www.python.org/ftp/python/3.6.0/Python-3.6.0.tgz
tar xzvf Python-3.6.0.tgz
cd Python-3.6.0/
./configure
make
sudo make install

The following needs to be done as root or with sudo (unless you want to install for your user only):

sudo python3.6 -m pip install --upgrade pip
sudo python3.6 -m pip install pynacl
sudo python3.6 -m pip install pipenv

Get Silkaj and prepare it

Type the following in your shell with any user:

git clone https://git.duniter.org/clients/python/silkaj.git
cd silkaj
pipenv install
pipenv shell

This last command actually starts a new shell in which silkaj can be run.

That’s it! You’re ready to run silkaj now:

./silkaj

GANN-EWN / Part 3 / building a Neural Network from scratch

Welcome to the third part of the GANN-EWN (Genetic Algorithms for Neural Networks applied to the game “Einstein Würfelt Nicht”) series! If you haven’t read it yet, it’s probably time to read the first part and/or the second part.

In this part, we’ll start building a Neural Network from scratch.

I assume that you’re familiar with Neural Networks, which are basically a simplified stereotype of what people thought neurons in the brain were like in the 1950s. 🙂 Since then, biological research has evolved a lot and we now know that real neurons and their interactions are much more complex than the general models of Computer Neural Networks that are around.

Anyway, there are many types of neural networks around, all used for different tasks. My attempt is to build a very generic neural network structure that can then be used in various environments, typically to play different games.

Important structural decisions

Since the early days of neural network research, the transfer functions used in artificial neurons have been carefully chosen to enable backpropagation. Here, though, we want to use our neural network through Genetic Algorithms by selecting the best neural networks within a population, and we won’t necessarily be using backpropagation to make the network “learn”. So we can use any kind of transfer function we want.

At first, I also wondered if I would be using totally random graphs for my network structure, but that poses problems such as cycles, which require more computational power to deal with, so I stayed with a “simple” layer architecture. But because I wanted to code a network that would be as generic as possible, the only constraint I kept was to have a structure with layers: links between neurons of different layers can be as random as possible, with the only requirement that every neuron should have at least one input and one output (a neuron without any input or output would simply be useless and would waste resources unnecessarily). On the other hand, I didn’t put any restrictions on duplicate links from one neuron to another. After all, it could be useful to have the sine of something and combine it with its raw value.

Another important decision was to limit the manipulation of floating point values to the strict minimum. There are two reasons for this, which are both linked to the projected use of GPUs for my research, and based on my own experience of using GPUs. The first one is that GPUs in general are not so great, performance-wise, with floating point values. The second is that the results of floating point operations may differ between CPU operations and GPU operations, due to rounding and approximating some functions on the GPU. Besides, most GPUs have limited support for double-precision floating point operations, which makes portability somewhat of a pain. We will actually discuss that in more detail in another article.

So I decided to take only integers as neuron and link input and output values, as well as the weights and shifts that could be applied on all operations. However, some operations will need some floating point numbers, such as sigmoid and tangent functions, but the results will be transformed back into integers as soon as possible so that floats are used as little as possible.

Description of neurons and links

Besides, as I didn’t stick with “simple” linear or sigmoid functions for my neurons, the links themselves also have interesting properties. Here is a simple figure showing the basics of my neurons and links:

So the output of every link is of the form: out = unary_op ( (in + shift) x weight )
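As a minimal sketch of this formula with plain integers (the function names are hypothetical, and keeping the result within a fixed range is deliberately left out here):

```python
from typing import Callable

UnaryOp = Callable[[int], int]


def identity(x: int) -> int:
    return x


def negative(x: int) -> int:
    return -x


def link_output(value: int, shift: int, weight: int, unary_op: UnaryOp) -> int:
    """out = unary_op((in + shift) x weight), using plain integers."""
    return unary_op((value + shift) * weight)


print(link_output(3, 2, 4, identity))  # 20
print(link_output(3, 2, 4, negative))  # -20
```

Note that the shift is applied before the weight, unlike the usual “weight times input plus bias” form.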

As for computing the result of a neuron, I used what is generally referred to as “genetic programming”, where the genetic algorithm doesn’t mutate “parameters” for a program, but rather a tree of instructions (a program). So in my case, a neuron contains a tree of mathematical operations to be applied on the inputs.

So the output of a neuron having n inputs is:

  • the input itself if n==1
  • the result of a tree of n-1 binary operations

For instance, the tree for the previous picture is the following:

The unary operations I have chosen are: id (identity), sine, log, square, square root, tan, and negative.

The roughly described binary operations are:

  • add (actually average, we’ll see why later),
  • divide (leave numerator unchanged if denominator is 0),
  • subtract,
  • multiply (with an adjusting weight),
  • max and min,
  • a hyperbolic paraboloid and its reverse.
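To make the “tree of n-1 binary operations” concrete, here is a minimal sketch of how such a tree could be evaluated. The nested-tuple representation and all names are assumptions made for the illustration (the networks in this series are stored as flat integer arrays), only some of the operations are included, and the rounding used for the division is a guess.

```python
from typing import Callable, Dict, Sequence, Tuple, Union

# A leaf is the index of an input; a node is (operation, left, right).
Tree = Union[int, Tuple[str, "Tree", "Tree"]]


def safe_divide(a: int, b: int) -> int:
    # Leave the numerator unchanged if the denominator is 0.
    return a if b == 0 else int(a / b)


BINARY_OPS: Dict[str, Callable[[int, int], int]] = {
    "add": lambda a, b: (a + b) // 2,  # "add" is really an average
    "subtract": lambda a, b: a - b,
    "divide": safe_divide,
    "max": max,
    "min": min,
}


def evaluate(tree: Tree, inputs: Sequence[int]) -> int:
    """Evaluate a neuron's operation tree on its inputs."""
    if isinstance(tree, int):
        return inputs[tree]
    op, left, right = tree
    return BINARY_OPS[op](evaluate(left, inputs), evaluate(right, inputs))


# Three inputs, hence two binary operations: max(add(in0, in1), in2)
tree: Tree = ("max", ("add", 0, 1), 2)
print(evaluate(tree, [10, 20, 12]))  # 15
print(evaluate(0, [42]))             # 42
```

The last call shows the n==1 case: with a single input, the tree is just a leaf and the input is passed through.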

With this in mind, it is actually not difficult to switch this network to any type of network, including simple linear functions. Just limit the allowed operations to “+” and we’re pretty good to go. The same goes for links: just restrict them to use “identity” as their unary operation, set the weights as desired, and keep the offset (often called “bias” in neural network computing).

Besides, as the link structure is not constrained either, we can create any type of network we want by forcing the links to be in a certain way when we build the network.

Integers and range

There is one remaining problem here: if you store everything as integers and never stop adding and multiplying numbers, you’ll overflow at some point.

So I kept all operations within a certain range, and added weights and min/max to make sure that all numbers stay within that range. How to define the range? It has to be as big as possible to allow for enough precision (obviously, if we keep only integer values from -10 to 10, we won’t be able to store Pi with good precision, but if we go up to a million, then we can store 3.14159, which is already not that bad). On the other hand, the range must be small enough that operations such as multiplying two integers do not overflow. Integers in Java or OpenCL have a rough range of -2 billion to 2 billion, so 10,000 should do it (the square is 100 million, well within range).

So within the network, all values circulating should be within the -10,000 to 10,000 range, which means that some adjustments had to be made to the different binary and unary operations, in order to always keep results within that range. For instance, adding is converted to an average, to make sure the result is still within the desired range.
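Here is a rough sketch of what such range-preserving operations could look like in Java. The average and the divide-by-zero rule come from the description above. The rescaling of the product by the range and the class and method names are assumptions made for this example, not the actual code.

```java
/** Hypothetical sketch of integer operations that stay within [-RANGE, RANGE]. */
final class RangeOpsSketch {

    /** Every value circulating in the network is assumed to lie in [-RANGE, RANGE]. */
    static final int RANGE = 10_000;

    /** min/max guard for operations that could leave the range. */
    static int clamp(int v) {
        return Math.max(-RANGE, Math.min(RANGE, v));
    }

    /** "add" implemented as an average: two in-range values give an in-range result. */
    static int average(int a, int b) {
        return (a + b) / 2;
    }

    /**
     * Multiply, then rescale by RANGE (assumed form of the "adjusting weight").
     * |a * b| is at most 100 million, which fits comfortably in a 32-bit int.
     */
    static int multiply(int a, int b) {
        return (a * b) / RANGE;
    }

    /** Divide, leaving the numerator unchanged if the denominator is 0. */
    static int divide(int a, int b) {
        return b == 0 ? a : a / b;
    }
}
```

With these definitions, `multiply(10_000, 10_000)` gives back 10,000 instead of overflowing the range, and `average` can never exceed the larger of its two arguments.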

Encoding and storing the networks

Because the ultimate goal is to run this on GPUs, the networks cannot be programmed as Objects. Instead, plain arrays of integers are used. One other advantage is that it is very simple to store an array of integers and it could be manipulated in any programming language (provided that we code the program that would interpret this array).

Here is the description of the array describing a network so far:

  • [0]: the number of layers L, including the output which is considered as a layer, but excluding the input,
  • [1 : L]: the number of neurons in each layer (including the output layer, but excluding the input),
  • [L+1 : 2xL]: the number of links between each layer (at position L+1, we have the number of links between the input and the first hidden layer),
  • [2xL+1:…]: all link parameters (we will go over them shortly),
  • after the links: all binary operations for the links, for every layer connectivity there are as many as links minus one.

Every link contains 5 integers and is described as follows:

  • the number of the source neuron in the source layer (this number is unique within the network, so the number of the first neuron of the first hidden layer is equal to the number of inputs),
  • the number of the destination neuron in the destination layer,
  • the applied offset on incoming values,
  • the weight applied to the result of the offset,
  • the code for the unary operation applied to that result.
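As an illustration of how such an array can be read back, here is a minimal Java sketch that extracts the five integers of one link. It assumes that links are stored one after the other, in layer order, right after the header. The names `LinkDecoderSketch`, `firstLinkIndex` and `link` are invented for this example.

```java
import java.util.Arrays;

/** Hypothetical sketch: reading one link out of the flat network array. */
final class LinkDecoderSketch {

    /** source, destination, offset, weight, unary operation code. */
    static final int INTS_PER_LINK = 5;

    /** The header holds 1 layer count, L neuron counts and L link counts. */
    static int firstLinkIndex(int[] net) {
        int layers = net[0];
        return 2 * layers + 1;
    }

    /**
     * Returns {source, destination, offset, weight, unaryOpCode} for link k,
     * where k is 0-based and counts links across all layers (assumed ordering).
     */
    static int[] link(int[] net, int k) {
        int start = firstLinkIndex(net) + k * INTS_PER_LINK;
        return Arrays.copyOfRange(net, start, start + INTS_PER_LINK);
    }
}
```

For a tiny network with 2 inputs (numbered 0 and 1), a single output neuron (numbered 2) and 2 links, the array starts with `1, 1, 2`, and the first link begins at index 3.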

Let’s calculate the size of an array needed to encode the following network: 20 connections between inputs and the first hidden layer, 10 neurons in the first hidden layer, 30 connections from the first hidden layer to the second hidden layer, 4 neurons in the second hidden layer, 10 connections from the second hidden layer to the 2 outputs.

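1 + 2 x 3 + 5 x 60 + (19 + 29 + 9) = 364 elements (L = 3 because the output counts as a layer, and the 60 links include the 10 links going to the outputs)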
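This bookkeeping can also be sketched in a few lines of Java, applying the layout literally: the output counts as a layer, and each connection between two layers contributes five integers per link plus links minus one binary operations. The class and method names are made up for the example.

```java
/** Hypothetical sketch: size of the flat array for a given network shape. */
final class NetworkSizeSketch {

    /**
     * linksPerConnection has one entry per layer (output included, input excluded):
     * entry i is the number of links feeding layer i.
     */
    static int arraySize(int[] linksPerConnection) {
        int layers = linksPerConnection.length;
        // Header: layer count, then L neuron counts, then L link counts.
        int size = 1 + 2 * layers;
        for (int links : linksPerConnection) {
            size += 5 * links;               // five integers per link
            size += Math.max(0, links - 1);  // binary operations: links minus one
        }
        return size;
    }
}
```

For the network described above, the call would be `NetworkSizeSketch.arraySize(new int[] {20, 30, 10})`. The neuron counts do not change the size, since each layer always takes exactly one slot in the header.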

For now, the networks are stored within a simple local H2 database.

Watching it in action

Finally, I built a small UI to watch how the network behaves when its inputs change. Later, it might come in handy to see how the network behaves during a real game:

The inputs are on the left, and can be entered either numerically or from sliders, and the outputs are read on the right (this one has only two outputs). The two hidden layers (containing 4 and 3 neurons) are clearly visible. Besides, every link has the offset represented as a bar (red means negative, green means positive, full gray is 0) and the weights are represented as filled circles.

In the next article, we’ll build the genetic algorithm part of the project.

The next post of the series is here.

GANN-EWN / Part 2 / Building a Hand-Crafted Player For EWN

Welcome to this second part of the GANN-EWN (Genetic Algorithms for Neural Networks applied to the game “Einstein Würfelt Nicht”) series! If you haven’t read it yet, it’s probably time to read the first part.

The game is quite simple: at the beginning of the game every player has 6 pieces labeled from 1 to 6, and every player throws a die at every turn. Depending on the result of the throw, the player may choose between one or two of his own pieces to move. The goal is to reach the opposite side of the board (diagonally) or to capture all of the opponent’s pieces.

The rules with the dice are simple as well. If the die corresponds exactly to one of his pieces, he has to play that piece in any forward direction (the three squares: up, up left, or left), as here, where the die shows “5”:

If no piece with that number is on the board, then one of the two pieces whose numbers are closest to the die (the nearest lower and the nearest higher) can be played. For instance, if the die was “2”, then the “1” and “3” can be played here:

Again, if the die is “3” then on this board the “2” or the “5” can be played:

Note that if there was no “2” on this board, then only the “5” could be played since there is no lower number than “3” on the board.
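The piece selection rule is short enough to be sketched directly in Java. This is only an illustration of the rule as described above, not my actual board code, and the names `PlayablePiecesSketch`, `board` and `playable` are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: which pieces may be played for a given die value. */
final class PlayablePiecesSketch {

    /** Builds a presence table: result[n] is true if piece n (1..6) is on the board. */
    static boolean[] board(int... pieces) {
        boolean[] onBoard = new boolean[7]; // index 0 is unused
        for (int p : pieces) {
            onBoard[p] = true;
        }
        return onBoard;
    }

    /** Returns the numbers of the pieces that may be moved when the die shows 'die'. */
    static List<Integer> playable(boolean[] onBoard, int die) {
        List<Integer> result = new ArrayList<>();
        if (onBoard[die]) {
            result.add(die); // exact match: that piece has to be played
            return result;
        }
        for (int n = die - 1; n >= 1; n--) { // nearest lower number, if any
            if (onBoard[n]) {
                result.add(n);
                break;
            }
        }
        for (int n = die + 1; n <= 6; n++) { // nearest higher number, if any
            if (onBoard[n]) {
                result.add(n);
                break;
            }
        }
        return result;
    }
}
```

With only the “2” and the “5” left and a die showing “3”, `playable` returns both pieces. Without the “2”, it returns only the “5”.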

The first thing to do was to code a simple board implementing the rules of the game, which is quite straightforward. I chose Java, which is probably the programming language I’m most comfortable with, but also for another reason, which I will develop later.

To have something to compare my neural networks to, and also to test my implementation of the EWN Board and game logic, my first step was to write a simple (and I can call it quite “naive”) player, based on my very limited experience of the game. Remember that I just discovered EWN, so after having a look at some games from the best players, I realized that one common strategy is to “eat up” some of your own pieces early in the game, to give momentum to your remaining pieces. Although this is a “classic” strategy, it does have one drawback: your opponent may be able to take all your remaining pieces and you lose. Nevertheless, I decided to implement this rule, along with a few others. Here is the list of rules that are hard-coded in this first player:

  • if one move is a winning move (capture the last opponent’s piece or reach the opposite side of the board), then play it,
  • if an opponent piece is close to winning (basically at a certain distance from the goal), always take it,
  • always take your own pieces if they are within a certain number (typically from 2 to 5) if you have more than a certain amount of pieces (you don’t want to take your pieces if you have too few of them, that could be suicide in some situations),
  • move forward always the most advanced piece (rushing towards the goal),
  • preferably move forward-up rather than simply forward or up.

And that’s it!

So yes, this is a very simple player, but one funny fact is that this player beats me consistently! That’s how weak I am at EWN as a player.

But as those rules can have parameters (how far must an adversary piece be for us to capture it? etc.), I ran a simple computer simulation using a range of possible parameters (including totally disabling a rule). And those simulations showed me that, by far, the best parameters are:

  • capture pieces that are at most at a distance of 2 from the goal, but actually this rule is not very important,
  • take your own pieces from 2 to 5 until you have only 2 pieces left,
  • move always forward-up,
  • do advance the most advanced piece.

Of course, every single rule taken on its own may have some exceptions, but my “simple player” doesn’t go that far.

I also created a “random” player which picks moves at random, again to have something to compare to that doesn’t vary much in its crappiness. 🙂 And indeed my simple player was definitely better than the random one, at least.

I published this first player on the littlegolem site, with a different user than my regular user.

This is when I discovered that the site was actually crawling with bots playing EWN! Nice, the challenge was getting interesting! That’s when I met with a German researcher who also wrote a bot to play EWN on littlegolem. He even made part of his PhD on this project, so needless to say that my own simple player was quite ridiculous compared to his, and got a very bad beating!

This is when I realized that EWN is actually a statistics-driven game, since you make a choice depending on the probabilities of the next dice throws. You have a “2” ahead and want to give it momentum to reach the opposite side of the board faster? Capture your own “1”, “3” and possibly “4” and “5” pieces. The enemy has one piece slightly ahead but that piece has a 1/6 chance to be picked by the dice? Just ignore it. But you will capture that piece if it has a 1/2 chance of being played. As the Wikipedia page points out, one of the first approaches is to calculate probability tables for each piece to reach the goal. All the dynamics of that game are based on probabilities. Well, almost. So that makes it a very good candidate for a Monte-Carlo type approach.
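To put numbers on this, here is a small Java sketch that counts, for a given set of remaining pieces, how many of the six die values allow a given piece to be played. It only encodes the selection rule described earlier, and the names (`PieceOddsSketch`, `canPlay`, `playableRolls`) are made up for the illustration.

```java
/** Hypothetical sketch: how many die values (out of 6) let a given piece move. */
final class PieceOddsSketch {

    /** Builds a presence table: result[n] is true if piece n (1..6) is on the board. */
    static boolean[] board(int... pieces) {
        boolean[] onBoard = new boolean[7]; // index 0 is unused
        for (int p : pieces) {
            onBoard[p] = true;
        }
        return onBoard;
    }

    /** True if 'piece' may be played when the die shows 'die'. */
    static boolean canPlay(boolean[] onBoard, int piece, int die) {
        if (!onBoard[piece]) {
            return false;
        }
        if (onBoard[die]) {
            return piece == die; // exact match: only that piece can be played
        }
        // No exact match: the piece is the nearest on its side of the die
        // if no other piece sits strictly between the two numbers.
        int lo = Math.min(piece, die);
        int hi = Math.max(piece, die);
        for (int n = lo + 1; n < hi; n++) {
            if (onBoard[n]) {
                return false;
            }
        }
        return true;
    }

    /** Number of die values from 1 to 6 for which 'piece' may be played. */
    static int playableRolls(boolean[] onBoard, int piece) {
        int count = 0;
        for (int die = 1; die <= 6; die++) {
            if (canPlay(onBoard, piece, die)) {
                count++;
            }
        }
        return count;
    }
}
```

With all six pieces on the board, the “2” can be played on 1 roll out of 6. Once the “1” and the “3” are gone, it can be played on 3 rolls out of 6, which is where the momentum comes from.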

As a reminder, the general principle behind the Monte-Carlo approach is: “at a given turn, for each possible move play as many random games as possible and play the one that leads to victory more often than the others.”
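Stripped of everything game-specific, this principle fits in a few lines of Java. In the sketch below, the playout (one random game played to the end after the candidate move, reporting whether we won) is passed in as a `Predicate`. That parameter and the names `FlatMonteCarloSketch` and `pickMove` are assumptions made for the example, not the actual implementation.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of a flat Monte-Carlo move picker. */
final class FlatMonteCarloSketch {

    /**
     * For each candidate move, runs 'playoutsPerMove' random games and returns the
     * move with the most wins (the first one in case of a tie, null if no moves).
     * 'playoutWins' plays one random game after the given move and reports a win.
     */
    static <M> M pickMove(List<M> moves, Predicate<M> playoutWins, int playoutsPerMove) {
        M best = null;
        int bestWins = -1;
        for (M move : moves) {
            int wins = 0;
            for (int i = 0; i < playoutsPerMove; i++) {
                if (playoutWins.test(move)) {
                    wins++;
                }
            }
            if (wins > bestWins) {
                bestWins = wins;
                best = move;
            }
        }
        return best;
    }
}
```

The quality of the player then depends entirely on what happens inside the playout: purely random moves, or moves chosen by some other, smarter player.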

Brutal. But very effective and the easiest “AI” to program. I guess. At least on a CPU.

So I implemented a simple Monte-Carlo algorithm (which I will abbreviate as “MC” from now on) and launched it against my simple player. And sure enough, the MC player won, even if it was just playing 1000 games with random moves to decide which move to pick. And I’m not speaking about a “little better” here. The MC player won almost 70 games out of 100.

Then I thought: picking random moves is not really a likely outcome for a normal game, so the results for the MC player are somehow skewed. What if I use my simple player instead of a random player to play the games?

No surprise, using the simple player within the MC player improved the results against the simple player by quite a bit. But against the random MC player, the difference was not that big: only 2.5 to 3%, which is not bad but less than I would have expected. Note that running this simulation on a CPU costs a lot of cycles: 1000 games of roughly 15 to 20 moves each, and at each turn, 1000 games played for every possible move (there are 1 to 6 possible moves). That’s a lot of moves to run! I’ve accelerated that to run on a GPU, but that will come later.

And still, the MC player lost to the German PhD’s carefully hand-crafted player. Well, if a simple MC approach had been enough, that’s what he would have picked in the first place!

This is it for this part. Next, we’ll start building a Neural Network with its associated Genetic Algorithm.

Part 3 is here.