Two small robots on stage

Trying to make a YouTube video with my two robots was a difficult experience. Last time I had to synchronize the movements of both robots and four different cameras (one for each robot, one watching from above and one recording my desktop).

Needless to say, this cost me several hours of failed recording attempts. And just when it was getting better, the batteries of one of my robots died.

Wouldn’t it be great if the bots could record a video by themselves?

This thought kept returning to me every time I planned a new video. Almost a month and a few modifications to my control software later, I finally got it working.

You can watch my first attempt in this video:

 

The robots move according to a predefined scenario, and my only job was to stay out of sight and point the camera (and drink some beers).

The setup

I connected the robots and my laptop into a cluster over a Wi-Fi network. Every machine has information about what functionality is deployed on the others and can access all of the available controls.

I added some modifications to the control station that allow it to see all of the connected machines as one composite computer. In this particular setup, the robots have only the clustering module and the motor controllers active on their Raspberry Pis.

The cluster node on the laptop has some additional improvements:

They can speak!

MaryTTS is a popular open-source text-to-speech engine written in Java. There are multiple voices available and most of them are free (if you use it, make sure to check the licenses of the voices you select).

In this setup I have installed the TTS software on the laptop. The original idea was to install it on the robots themselves, so they could speak without the laptop. However, that requires more resources than my machines currently have. Instead, I decided it would be a good opportunity to play with a distributed system.
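For reference, generating speech with MaryTTS embedded in a Java program looks roughly like this (a minimal sketch based on the standard MaryTTS 5.x API; the voice name is just an example and must be one you have actually installed):

```java
import javax.sound.sampled.AudioInputStream;

import marytts.LocalMaryInterface;
import marytts.MaryInterface;
import marytts.util.data.audio.AudioPlayer;

public class SpeakDemo {
    public static void main(String[] args) throws Exception {
        // Start an embedded MaryTTS engine inside this JVM.
        MaryInterface tts = new LocalMaryInterface();

        // Pick one of the installed voices (remember to check its license).
        tts.setVoice("cmu-slt-hsmm");

        // Synthesize a sentence into an audio stream ...
        AudioInputStream audio = tts.generateAudio("Hello, I am a small robot.");

        // ... and play it on the default sound device.
        AudioPlayer player = new AudioPlayer(audio);
        player.start();
        player.join();
    }
}
```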

The Boss

The final improvement is a function, written in JavaScript, that serves as an orchestrator. It lives in the control station and can access all of the modules through the clustering protocol. Its main purpose is to run predefined scenarios and make my life easier when recording videos:


Example scenario:

  • second 1 – Access the first robot and start moving forward
  • second 4 – Stop the movement of the first robot
  • second 6 – Turn the camera of the second robot to the left
  • second 10 – Say something cool
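The orchestrator itself is a JavaScript function inside the control station, but the timing idea is easy to sketch. Here is a rough Java equivalent that schedules each step at its offset; the ClusterClient interface, the machine names and the command strings are made up for illustration and are not the real clustering API:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ScenarioRunner {

    // Hypothetical facade over the clustering protocol; not the real API.
    interface ClusterClient {
        void send(String machine, String module, String command);
    }

    public static void run(ClusterClient cluster) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

        // second 1 – access the first robot and start moving forward
        timer.schedule(() -> cluster.send("robot1", "ChassisControl", "forward"), 1, TimeUnit.SECONDS);
        // second 4 – stop the movement of the first robot
        timer.schedule(() -> cluster.send("robot1", "ChassisControl", "stop"), 4, TimeUnit.SECONDS);
        // second 6 – turn the camera of the second robot to the left
        timer.schedule(() -> cluster.send("robot2", "Servo3", "rotate -30"), 6, TimeUnit.SECONDS);
        // second 10 – say something cool
        timer.schedule(() -> cluster.send("laptop", "TTS", "say Hello, humans"), 10, TimeUnit.SECONDS);

        // Delayed tasks that are already scheduled still run after shutdown().
        timer.shutdown();
    }
}
```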


Points of improvement

It may all sound great, but I still had to make a few recording attempts until I got a satisfying video. The reason is that my robots don’t have any feedback sensors (yet).

How is this a problem? Well, the control signals are simple commands like “move forwards with speed X”. However, without feedback, the robot has no idea what its actual speed is. The wheels might have hit an obstacle, the batteries might have gone flat, and so on.

Even under perfect conditions, there are small variations in the movements. A simple example is executing a rotate command: the robot might turn 91 degrees to the left instead of 90. This might seem insignificant at first, but keep in mind that small errors add up. After a few turns, the robot might end up in a completely different position than expected.

The solution to this problem is adding some sensor feedback, so the robot knows its actual movements, can calculate its exact location and can regulate its motion accordingly.
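To make the idea concrete, here is a tiny sketch of what such a feedback loop could look like once the robots get wheel encoders. Everything here is hypothetical (the robots currently have no sensors); it is just a plain proportional controller:

```java
public class SpeedRegulator {

    private static final double KP = 0.5;   // proportional gain, to be tuned

    private double motorOutput = 0.0;        // current command sent to the motor (0..1)

    /**
     * Called periodically: compare the speed reported by a (future) wheel
     * encoder with the requested speed and nudge the motor command.
     */
    public double update(double targetSpeed, double measuredSpeed) {
        double error = targetSpeed - measuredSpeed;
        motorOutput += KP * error;

        // Keep the command inside the range the motor controller accepts.
        motorOutput = Math.max(0.0, Math.min(1.0, motorOutput));
        return motorOutput;
    }
}
```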

When one programming language is not enough

Ever needed to use multiple programming languages in the same project?

When working on a project with a limited scope, like a web server’s API layer, a single-page web application or even the firmware of an embedded controller, you can stick to your language of choice. In the worst case you will have to expose some kind of interface or API so your code can be integrated with other components.

However, when the scope of the project increases, this might no longer hold. Ever heard the dreaded words “Great job implementing the driver for this camera. Now I want to control it over the web” or “Man, your web dashboard software is great. I want to monitor my automated greenhouse with it”?

Remember the horrible feeling of realizing that these new requirements expand the scope of your project in a way your current technology stack cannot handle? And the client doesn’t seem to care. After all, he doesn’t care about frameworks or programming languages. He mostly cares about results.

Depending on the team size and your position, the decision about what to do may land on your shoulders. So … what are the options?

In this article I will give you an overview of the most common ways to combine the strengths of multiple programming languages.

There are a few main ways to make multiple languages work together:

Microservices

This is a really popular architecture, used and preferred by many developers. The project is divided into several smaller projects (services). Each of these services is deployed independently and runs in an isolated operating system process. It communicates with the other services through a contract: a strict definition of the supported message structures and data types.

A microservice can expose multiple functions, each with specific arguments, a return data type and a messaging format (JSON, SOAP, a custom messaging protocol). As long as the service implements the contract, it doesn’t matter which language it is written in or which other technologies are used.
 
When the services communicate over the network, they don’t care whether they run on a single computer or across multiple machines. The approach really shines in cloud environments.
Any functionality can be exposed as a service: a “payment processing module”, a “speech recognition module”, a “key/value storage” and so on.
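To make the contract idea concrete, here is a minimal sketch of calling such a service over HTTP with a JSON body, using Java’s built-in HTTP client. The endpoint URL and the “text” field are invented for the example:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SpeechServiceClient {
    public static void main(String[] args) throws Exception {
        // The contract: POST a JSON document with a "text" field and
        // get back a JSON answer from the service.
        String json = "{\"text\": \"Hello from the cluster\"}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/speech"))   // example endpoint
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.statusCode() + " " + response.body());
    }
}
```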
 
Pros
  • The services are decoupled. You can add any implementation and use any language as long as you stay true to the contract.
  • They are easily scalable. You can run all your services on one computer, or split them across multiple machines.
  • Without adding too much complexity you can run multiple instances of one service as a cluster. Using this, you can handle huge workloads.
Cons
  • Large latencies

Imagine a scenario where we have multiple nested service calls: Service 1 calls Service 2, and Service 2 in turn calls Service 3. We will have to wait a while to get the end result of the call to Service 1. This might be fine for applications that can afford to wait, but it is a deal breaker when we need ultra-fast responses.

Since the network limits the speed, you can use shared memory instead of the network to increase the throughput. This greatly increases the amount of data that can be transferred between the services. The limitation is that it only works if the processes are on the same machine.
 
And just to add some numbers to this: as part of one hobby project, I had to create a fast data transfer between processes running on the same machine. With just a slightly optimized memory-based communication library in Java, I was able to achieve up to 6 million messages per second, each carrying 20 bytes of data that were serialized, transmitted and deserialized at the other end. That makes around 120 million transferred bytes per second. For comparison, my LAN card transfers data at up to 100 Mbit/s, which is around 12 million bytes per second.
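My library is not public, but this kind of inter-process messaging in Java typically boils down to a memory-mapped file that both processes map. Here is a stripped-down sketch of a writer; a real implementation would add a ring buffer, sequence numbers and proper synchronization (the file path and message layout are just examples):

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class SharedMemoryWriter {
    public static void main(String[] args) throws Exception {
        // Both processes map the same file; the OS backs it with shared memory pages.
        try (RandomAccessFile file = new RandomAccessFile("/tmp/ipc-channel", "rw");
             FileChannel channel = file.getChannel()) {

            MappedByteBuffer shared = channel.map(FileChannel.MapMode.READ_WRITE, 0, 4096);

            byte[] message = "20-byte payload.....".getBytes();   // an example 20-byte message

            // Write the message length followed by the payload.
            shared.putInt(0, message.length);
            shared.position(4);
            shared.put(message);

            // A real library adds a ring buffer and a way to signal the
            // reader instead of overwriting a single slot like this.
        }
    }
}
```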
 
To get the best of both worlds, we can sometimes use communication libraries. Some of them can be configured to use shared memory when talking to processes on the same machine and the network when talking to a remote computer. This, however, adds more dependencies to your project.
 
If ultra-low latency is not one of the requirements of the project, I would go for the microservices option. However, latency is a sensitive topic when it comes to robotics and other systems that have to work in “real time”.

Using libraries

If we want to use external functions written in other languages, we can call them from our code using libraries. There are a few alternatives for how this can happen:

Static libraries

C/C++ – Compiling these system languages produces native code. This makes their integration easy, as long as you have the correct header files and link your code against the external libraries. Most compilers have a way to declare your code’s dependencies on the needed external libraries.

Some interpreted languages nowadays are compiled to an intermediate language and then executed on a virtual machine for performance reasons. If two languages can be compiled to the same intermediate language, integrating them becomes much easier. An example of such an intermediate format is Java bytecode.
 
Besides Java, there are a few more programming languages that can be compiled to JVM bytecode, for example Scala and Python (the Jython project). You can easily use them together, as they are compiled to the same format (a .class file).
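For the Python case, Jython can compile modules to .class files, but the quickest way to mix the two languages is to embed the Jython interpreter in the Java program. A minimal sketch, assuming the Jython standalone jar is on the classpath:

```java
import org.python.core.PyObject;
import org.python.util.PythonInterpreter;

public class JythonDemo {
    public static void main(String[] args) {
        // Runs Python code inside the same JVM process as the Java code.
        PythonInterpreter python = new PythonInterpreter();

        python.exec("def area(w, h):\n    return w * h\n");
        python.exec("result = area(320, 200)");

        // Pull the Python result back into Java.
        PyObject result = python.get("result");
        System.out.println("Computed in Python: " + result.asInt());
    }
}
```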

Shared libraries

Shared libraries are libraries that can be linked to a program at run time. They provide a means to use code that can be loaded anywhere in memory. Once loaded, the shared library code can be used by any number of programs.

Connecting system languages like C/C++ to a native shared library is easy when you have the library’s function and structure definitions in header files. The compiler and linker just need to know which functions are being called and which data types and structures are used.

Connecting an interpreted language to a native one in the form of a shared library is a bit more difficult, but it can be achieved by using

Bindings

A binding from a programming language to a library or operating system service can be described as glue code that acts as a bridge between the two. It allows us to call functions from the desired library directly from our code.

There are a lot of examples in practice: Java bindings for OpenCV (a computer vision library), Java bindings for serial port communication (in libraries such as JSSC). Some languages and platforms even rely heavily on system code themselves: Node.js and Python both lean on C/C++ libraries.
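In its simplest form, such glue code on the Java side is just a class that declares native methods and loads the shared library that implements them through JNI. The library name and function below are hypothetical; the matching C/C++ side would be generated from this class:

```java
public class MotorBindings {

    static {
        // Looks for libmotordriver.so (Linux) or motordriver.dll (Windows)
        // on the java.library.path.
        System.loadLibrary("motordriver");
    }

    // Implemented in the native shared library via JNI.
    public static native void setSpeed(int motorId, int speed);

    public static void main(String[] args) {
        setSpeed(0, 50);   // calls straight into the C/C++ code
    }
}
```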
 
Pros
Since all of the code runs in a single process, function execution doesn’t depend on switching to another process or on network communication. The response time is reduced to just the time needed to run the actual code.
 
Cons
  • It can only be used on a single machine.
  • It adds extra dependencies (the correct versions of the bindings for the corresponding version of the library).
  • It is less portable (the libraries are platform-dependent).
  • Not all languages can be used to write libraries (usually low-level languages such as C/C++ are used to create the libraries, while high-level languages such as Java/Python consume them).
  • Too many bindings can plunge the project into dependency hell. A project depends on all of the libraries, all of the bindings to those libraries and all of their dependencies. You can guess what happens when these dependencies become too many to handle.
  • The glue code and the libraries are executed in the same process, so in order to run our project on multiple machines, we have to rely on a different method for the inter-machine communication.
I would only use a solution like this when ultra-low latency is a must.

Hybrid solutions

We can combine the previous approaches in different ways in order to meet our requirements. Just keep in mind that while we get the best of both worlds, we also have to beware of the worst of both worlds.

A tip for the data types

One tricky problem is the mapping of data types between different languages and platforms. When a textual message format such as JSON or XML is used, mapping the data is hardly a problem. That is not the case for binary messages or shared memory structures: a single byte that is not aligned properly will compromise the entire message. When working with binary data, it is essential to take into account the binary data formats (field sizes, alignment and byte order) used by the platforms.
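In Java, for example, this usually means being explicit about the byte order and the field layout instead of relying on the defaults. A small sketch; the 20-byte message layout is just an example:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BinaryMessage {
    public static void main(String[] args) {
        // Java buffers default to big-endian; most C programs on x86/ARM
        // expect little-endian, so state the byte order explicitly.
        ByteBuffer buffer = ByteBuffer.allocate(20).order(ByteOrder.LITTLE_ENDIAN);

        buffer.putInt(42);                            // 4 bytes: message id
        buffer.putShort((short) 7);                   // 2 bytes: command code
        buffer.putLong(System.currentTimeMillis());   // 8 bytes: timestamp
        buffer.putFloat(1.5f);                        // 4 bytes: speed
        buffer.putShort((short) 0);                   // 2 bytes: padding to match the C struct

        byte[] wire = buffer.array();                 // exactly 20 bytes, in the agreed layout
        System.out.println(wire.length + " bytes ready to send");
    }
}
```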

In conclusion

This was a brief overview of the most common ways to use multiple languages together. To put it into practice, you will have to read the specifications of your languages of choice and select the best approach.

With all that being said, I am pretty eager to test a specific hybrid architecture in the next version of my robot. What is your next multi-language project?

Two small robots

A few days ago I installed the last components of my machines and downloaded the last pieces of code to the on-board computers. Seeing my creations finally moving without the need for external cables is quite a rewarding sensation. Well, thoughts like “will it be in one piece after the first test run?” crossed my mind a couple of times. In the end it was all good (except for a few screws that keep coming loose over time, no matter what lengths I go to in order to keep them in place).

The only thing left to do was a test drive, and so I did.

The test run

I had already done some tests using the web controls on all the separate elements and didn’t expect any surprises, so I jumped directly to the First Person Control mode. In this mode, all the standard commands are mapped to keyboard shortcuts and mouse movements. Similar to a computer game, the arrow keys move the chassis and the mouse controls the camera direction.

Doing basic actions with the machine was easy. Just move the mouse pointer to the left, and the camera will turn left; move it in circles and the pan/tilt mechanism will move in circles. However, performing more complex and meaningful actions has proven to be a bit more challenging.

One reason for that is my poor piloting skill. Another is the 150-200 millisecond lag between entering a command and receiving the visual feedback from the camera. When moving slowly, the lag is not a problem. But as the speed increases, the pilot has to take into account that anything he/she sees in the camera panel is about 200 milliseconds in the past. In other words, when you give the stop command, the machine might already be falling off the table.

The same problem applies to the camera as well. The servos on the pan/tilt mechanism can rotate quite fast; at full speed, moving from one end of the limited rotation range to the other can happen a few times per second. Even though I have limited the rotation speed to about 100 degrees per second, the lag can still result in about 20 degrees of extra rotation before the stop command arrives.

In retrospect

I’d like to think of these little machines as my “Hello World!” in the world of robotics. In other words, it’s just a starting point. During my work on both machines, I learned a lot. But in order to move to the next level, there are still many things to be done and a lot of new skills to be gained:

  • Fix the model – With the current programming model and the technologies used, there is a large lag between commands and actions. Another problem I am facing is the slow performance (I can process only up to 15 camera frames per second at 320 x 200 resolution). And let’s not forget the horrible control station. To fix this, I will have to rethink a lot of the software.
  • Research the existing software – I have written most of the code in this project from scratch. However, there are a lot of existing open source libraries that can greatly help with the development of robots. One logical next step is to research them.
  • Last but not least – Make an autopilot mode that follows different predefined scenarios. As seen in the video, I am a horrible pilot. Having an option to record and then replay all of the movements from a session will be very helpful in various situations. The one at the top of my mind is shooting a new video. With this, they won’t have to rely on me to control them (yep … even they think it’s a bad idea).

Awakening of a robot

About two months ago I held in my hands something that could barely be called a robot. At that time it was just a shell showing no signs of “life”, just something constructed from components lying on my shelf.

I started working on the hardware and the software separately. The connection between the two happened gradually. At the beginning it was just to test isolated parts of the machine: a servo, the camera, the Arduino-to-Raspberry communication. At some point I realized that I was holding something that could actually be called a robot. Well, just an awakening one.

First steps

So what do you do when your friendly neighbourhood robot awakens? Connect to it and start playing, of course. In this case the connection works over the Raspberry Pi 3’s on-board Wi-Fi module. By simply opening a web browser and typing the IP address of the robot, we connect to the web server running inside.

Hopefully a few seconds will be enough to recover from the initial shock caused by the horrible sight of the user interface (I will make it better at some point, I promise). After this we get the first data from the machine: a list of all the available modules on the robot, displayed in the right section of the web page.

The interesting ones are:

  • Camera0 module – This is the module that handles the Raspberry Pi camera. With the current implementation the system supports about 15 frames per second at 320 x 200 resolution or 5 frames per second at 640 x 400.
  • ChassisControl – This module controls the movement of the chassis. Basically it takes commands such as forward, backwards, turn left or turn right, and generates the control signals for the continuous-rotation servos that move the wheels.
  • Servo3/Servo4 – Those are the modules that control the Pan/Tilt mechanism servos.

One of the best features of the control station is that we can customize its layout. All we need to do is drag the modules from the menu section and drop them into one of the cells of the table. Doing so generates a control interface for the selected module in the target location. The user can then give commands to all controllable values of this module; two examples are “move forward” for the chassis controller and “rotate to degrees” for the camera servos. There is also support for read-only data (values read from the robot’s sensors), for example the stream from the camera module.

Different control options

With the eyes of an ex-gamer, I thought it would be a cool feature to add a first-person control mode: using the keyboard to control the moving direction of the chassis and the mouse to control the turret, just like in a game. Right now it’s a bit of a bumpy ride, since the system has about 100-150 milliseconds of delay before a command reaches its destination. I really hope to improve this in the next versions.

The infamous first post

… The module was entering the last phase of the landing. Its speed had been reduced enough by the reverse thrusters and now it was only meters from the surface. The on-board computer changed its mode and started calculating careful corrections for the final steps.

A few seconds later, the module was sitting proudly on the surface. The mission control computer activated a self-diagnostic sequence for the module and its precious cargo. The diagnostic software had taken about a year to develop, but was active for only 7 seconds and resulted in 14 green lights. Those green lights were important: they indicated that the next stage of the plan could commence.
The signal reached the module, as well as the hundreds of other modules scattered across the foreign surface. The door of every module opened, and the machines, until now transported only as cargo, awakened to life. They disembarked from their transports and looked at the lifeless wasteland with their cold robotic eyes.

Even though they shared one mind, each of them had a specialized construction, sensors and tools to help achieve their combined purpose. In less than two years, in place of this barren terrain, there would be a completely habitable small city ready to accept the first colonists …

Back to reality …

I have always been fascinated by these sci-fi visions of the future, in particular artificial intelligence and robotics. There are a lot of things I want to try in these fields, so I started by building my own robots (or at least attempting to do so).

It started as one robot with a Raspberry Pi as its main computer and an Arduino Uno board for control of the low-level electronics, sensors and motors. Soon after, I used some spare parts and built a second one. Since I am a software guy (and really suck at electronics), I started with the software. It is almost identical for both machines, with the major difference being the chassis drivers:

  • Low-level drivers for sensor and motor control.
  • OpenCV installed to handle all image processing tasks.
  • An embedded web server that allows controlling the robots directly from the browser.

As I mentioned previously, I am not a hardware guy, so the hardware is built using ready-made components and kits. The machines have different chassis and components; however, both of them share the same core functionality:

  • Raspberry Pi 2/3 – This is the main brain of the robot. The Raspberry has enough processing power for fairly complex tasks such as image processing and neural networks.
  • Arduino Uno – I am using the Arduino to control the low-level electronics, sensors and motors (a minimal sketch of the serial link follows this list).
  • Pi camera – Both machines have a pan/tilt turret with a Pi camera attached to it.
  • Wi-Fi – Not much to be said here. We want to control the robots remotely.
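The Raspberry-to-Arduino link in a setup like this is typically a plain serial connection over USB. Assuming that is the case here, driving it from the Java side with a library such as JSSC (mentioned earlier) could look roughly like this; the device path, baud rate and the “FWD 50” command format are just examples, not the actual firmware protocol:

```java
import jssc.SerialPort;
import jssc.SerialPortException;

public class ArduinoLink {
    public static void main(String[] args) throws SerialPortException {
        // An Arduino Uno usually shows up as /dev/ttyACM0 on the Raspberry Pi.
        SerialPort port = new SerialPort("/dev/ttyACM0");

        port.openPort();
        port.setParams(SerialPort.BAUDRATE_9600,
                       SerialPort.DATABITS_8,
                       SerialPort.STOPBITS_1,
                       SerialPort.PARITY_NONE);

        // Example command for the Arduino firmware: move forward at speed 50.
        port.writeString("FWD 50\n");

        port.closePort();
    }
}
```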
What’s next?

Make this pile of components actually work together. Right now all of the components work and can be controlled separately. However, there are still some issues with them working together, and the next step is to tune the software and hardware of both machines (hopefully without blowing anything up).