SALESmanago Marketing Automation

Technology and infrastructure in SALESmanago

 

SALESmanago is a cloud-based marketing automation platform powered by AI and machine learning. It is used by over 10 000 companies in 40 countries, including Lacoste, Yves Rocher, Starbucks and a large number of small and medium-sized businesses. In the Financial Times Fast 1000, SALESmanago ranks as the #26 fastest-growing company in Europe and the fastest-growing European martech platform.

SALESmanago customers build complete behavioural and transactional profiles of over 500 million customers and use this data to personalize marketing across all channels, including email marketing, dynamic website content, mobile, social media, ad networks and direct sales.

In SALESmanago, we process and store huge amounts of data. Every day we send 50 million emails, which is about 580 messages every second on average. Every day we register 75 million visits on our customers’ websites. The behavioral data gathered in our databases amounts to several hundred terabytes. Our databases process well over 25,000 transactions per second. In 2018 alone we sent 2 284 057 743 907 420 bytes to the internet, i.e. about 2.0 PiB.
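
As a rough sanity check, the figures above can be converted into per-second rates and binary units with a few lines of Python. This is only a sketch: it assumes the load is spread evenly over a 24-hour day, which real traffic is not, and the constant names are made up for illustration.

```python
# Back-of-the-envelope conversions for the volumes quoted above.
# Assumption: load is spread evenly over the day (real traffic has peaks).

SECONDS_PER_DAY = 24 * 60 * 60  # 86 400

EMAILS_PER_DAY = 50_000_000
VISITS_PER_DAY = 75_000_000
BYTES_SENT_2018 = 2_284_057_743_907_420

emails_per_second = EMAILS_PER_DAY / SECONDS_PER_DAY
visits_per_second = VISITS_PER_DAY / SECONDS_PER_DAY

# 1 PiB = 2**50 bytes, 1 PB = 10**15 bytes
pebibytes_sent = BYTES_SENT_2018 / 2**50
petabytes_sent = BYTES_SENT_2018 / 10**15

print(f"{emails_per_second:.0f} emails/s on average")
print(f"{visits_per_second:.0f} visits/s on average")
print(f"{pebibytes_sent:.2f} PiB = {petabytes_sent:.2f} PB sent in 2018")
```

With these inputs the averages come out at roughly 579 emails and 868 visits per second, and the 2018 traffic at about 2.03 PiB (2.28 PB). Peak rates will be considerably higher than these averages.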

Handling such large volumes of data, processing them in real time and storing them requires very advanced, stable and mature technologies. In SALESmanago we work with open-source enterprise solutions: well-proven tools that bring great scaling possibilities. Moreover, qualified experts are responsible for maintaining and developing the system.

Among the many technologies used in such a large project, it is worth naming the crucial ones.

The Application

The whole SALESmanago app is built on technologies based on and derived from the Java language. Java is a concurrent, class-based, object-oriented, general-purpose programming language. It was developed by a team led by James Gosling at Sun Microsystems. Java applications are typically compiled to bytecode that can run on any Java virtual machine. One of the language's characteristics is strong typing. Java takes many of its basic concepts from Smalltalk (virtual machine, memory management) and C++ (syntax and keywords).

The SALESmanago app is based on the Spring Framework, a platform that aims to simplify enterprise software development in Java/J2EE. Spring is composed of multiple projects for building Java applications. Its core is a dependency injection (inversion of control) container that manages components and the relationships between them. The container can detect these relationships automatically, mostly without the developer’s involvement. Spring's history dates back to 2002, when the main platform for building Java applications was J2EE 1.3 with the widely disliked EJB 2.0 technology. Everything started with the book Expert One-on-One J2EE Design and Development by Rod Johnson, the ‘godfather’ of Spring, who then began working on the framework with Juergen Hoeller and Yann Caroff. The first version appeared in 2003.
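
To make the idea of a container that wires components together more concrete, here is a minimal sketch in Python. It is not Spring and does not use any Spring API: the `Container` class and the `ContactRepository`, `MailSender` and `CampaignService` components are hypothetical names invented for this illustration. The sketch assumes that every dependency is declared as a type hint on the constructor.

```python
import inspect


class Container:
    """Toy dependency injection container (illustration only, not Spring)."""

    def __init__(self):
        self._singletons = {}

    def get(self, cls):
        # Reuse an already built component so each class has one instance.
        if cls in self._singletons:
            return self._singletons[cls]
        # Read the constructor's type hints to discover what the class needs.
        params = inspect.signature(cls.__init__).parameters
        deps = {
            name: self.get(param.annotation)
            for name, param in params.items()
            if name != "self" and param.annotation is not inspect.Parameter.empty
        }
        instance = cls(**deps)
        self._singletons[cls] = instance
        return instance


# Hypothetical components, invented for this example.
class ContactRepository:
    def emails(self):
        return ["anna@example.com", "jan@example.com"]


class MailSender:
    def __init__(self):
        self.sent = []

    def send(self, address, subject):
        self.sent.append((address, subject))


class CampaignService:
    # Dependencies are declared only as constructor type hints.
    def __init__(self, contacts: ContactRepository, sender: MailSender):
        self.contacts = contacts
        self.sender = sender

    def run(self, subject):
        for address in self.contacts.emails():
            self.sender.send(address, subject)
        return len(self.sender.sent)


container = Container()
service = container.get(CampaignService)  # dependencies are resolved for us
print(service.run("Welcome"))
```

The calling code never constructs `ContactRepository` or `MailSender` itself; it only asks the container for the top-level component. A real container adds much more on top of this, such as scopes, configuration and lifecycle management.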

SALESmanago uses the Hibernate ORM framework for object-relational mapping. Additionally, it makes database operations more efficient thanks to caching, which reduces the number of queries sent. The founder and leader of the project is Gavin King.
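
The effect of caching on the number of queries can be illustrated with a small sketch. It does not use Hibernate or its API; the `Session` class, the `contact` table and the `queries_sent` counter are assumptions made up for this example, and an in-memory SQLite database from the Python standard library stands in for the real database.

```python
import sqlite3


class Session:
    """Toy session with a per-session cache, loosely modelled on an ORM session."""

    def __init__(self, connection):
        self.connection = connection
        self.cache = {}        # primary key -> row that was already loaded
        self.queries_sent = 0  # how many SELECTs actually reached the database

    def get_contact(self, contact_id):
        # Serve repeated lookups from memory instead of asking the database.
        if contact_id in self.cache:
            return self.cache[contact_id]
        self.queries_sent += 1
        row = self.connection.execute(
            "SELECT id, email FROM contact WHERE id = ?", (contact_id,)
        ).fetchone()
        self.cache[contact_id] = row
        return row


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE contact (id INTEGER PRIMARY KEY, email TEXT)")
connection.execute("INSERT INTO contact VALUES (1, 'anna@example.com')")

session = Session(connection)
for _ in range(1000):
    contact = session.get_contact(1)

print(contact, session.queries_sent)
```

A thousand lookups of the same record result in a single query. The price of any such cache is that the cached row can go stale if another process changes the database, which is why real ORMs tie this kind of cache to a short-lived session or add explicit invalidation.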

MySQL is an open-source relational database management system developed by Oracle. For most of its history it was developed by the Swedish company MySQL AB, whose acquisition by Sun Microsystems, Inc. was announced on January 16, 2008; Oracle in turn bought Sun Microsystems, Inc. on January 27, 2010. MySQL prioritised speed over compatibility with the SQL standard: for quite a long time it did not support transactions, which its strongest critics saw as the system's main disadvantage.

PostgreSQL, also called Postgres, is one of the three most popular open-source relational database management systems (RDBMS), alongside MySQL and Firebird. It originated at the University of California, Berkeley, as POSTGRES, the successor to the Ingres database. As it developed and its functionality grew, the name was first changed to Postgres95 and finally to PostgreSQL. The name recalls the original project and signals support for the SQL standard. Today the database implements most of the SQL:2011 standard.

We use stable and well-established technologies, but at the same time we are not afraid of modern tools that offer more possibilities for processing large volumes of data. In our projects we use technologies that are quickly gaining popularity.

Apache Kafka is a message broker available as open-source software. The project is written in Scala and Java and developed by the Apache Software Foundation. Its aim is to handle real-time data coming from numerous nodes by providing a unified, high-throughput, low-latency platform for data feeds such as clickstreams. Its design was significantly influenced by transaction logs. Apache Kafka was originally built at LinkedIn and released as open source at the beginning of 2011. The project left the Apache Incubator on October 23, 2012. In November 2014 several engineers who had worked on Kafka at LinkedIn founded Confluent, a company with strong links to Kafka.
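
The log-based design mentioned above can be sketched in a few lines. This is a deliberately simplified in-memory model, not the Kafka client API: the `TopicLog` class, its `append` and `poll` methods and the group names are hypothetical. Real Kafka also splits topics into partitions, persists them to disk and replicates them across brokers.

```python
class TopicLog:
    """Toy in-memory append-only log (illustration only, not the Kafka API)."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # consumer group -> next offset to read

    def append(self, record):
        # Records are only ever added at the end; existing ones never change.
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def poll(self, group, max_records=10):
        # Each group keeps its own position, so groups read independently.
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = TopicLog()
for page in ["/home", "/pricing", "/signup"]:
    log.append({"event": "page_view", "page": page})

first = log.poll("analytics", max_records=2)  # reads offsets 0 and 1
rest = log.poll("analytics")                  # continues from offset 2
replay = log.poll("email-triggers")           # a new group starts from offset 0

print(len(first), len(rest), len(replay))
```

Because reading does not remove anything from the log, several independent consumers can process the same stream of events at their own pace.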

Apache Flink is an open-source stream-processing platform that can process real-time data in a fault-tolerant way, protecting it against loss, at a scale of millions of events per second.

Other interesting technologies we use include Python with numerical and machine learning libraries such as NumPy, and image processing in OpenCV, deployed to production on an Apache Spark cluster.

Production Environment

All our services run in a Linux environment. Linux is an example of free and open-source software (FLOSS): its source code can be used, modified and shared freely. The first version of the Linux kernel was made available to the public on September 17, 1991. Linux is widely used, for instance, in server environments, where it is commercially supported by large companies such as IBM, Oracle, Dell, Microsoft, Hewlett-Packard, Red Hat and Novell. Linux runs on a wide range of devices, including desktops, supercomputers and embedded systems such as mobile phones, routers and TV sets.

Apache is an open-source HTTP server available for many operating systems. It is the most popular HTTP server on the Internet: in August 2015 it was estimated to serve more than 37% of websites. Apache offers multithreading, scalability and security. The server is based on HTTP server code written by Rob McCool at NCSA.

Apache Tomcat is a web container developed by the Apache Software Foundation. As an application container, it is a server that runs web applications built with Java Servlets and JavaServer Pages (JSP). Apache Tomcat is one of the most popular web containers.

SALESmanago sends tens of millions of emails a day. It would not be possible without a mail server. Postfix is an open-source mail transfer agent (MTA) for Unix-like systems, responsible for delivering mail. It was originally written by Wietse Venema at the IBM Research Center and is currently released under the IBM Public License, which is a free software license.

Tools

In IT, tools are chosen by the people who use them. Comfort and quality of work are essential. This is why IntelliJ and Git seem to be an obvious choice.

IntelliJ IDEA is a commercial Java integrated development environment (IDE) from JetBrains. According to programmers surveyed by ZeroTurnaround in one of the biggest surveys of this kind, IntelliJ IDEA is the most popular development environment, followed by Eclipse. Its developers consider it the most intelligent Java IDE, and it must be said that there is something to it: the code completion suggestions it offers can be a pleasant surprise. IntelliJ prioritises productivity and working without a mouse.

Git is a distributed version control system. It was designed by Linus Torvalds for the development of the Linux kernel. Git is free and open-source software distributed under the terms of the GNU General Public License version 2. Git development began after free access to BitKeeper, the source control system used at the time to develop Linux, was withdrawn for open-source projects. As a consequence, Torvalds was looking for a version control system that could be used instead of BitKeeper.

Grafana is a platform for data analysis and visualisation. Thanks to its support for numerous data sources (for example InfluxDB, Elasticsearch, which is also described in this article, AWS CloudWatch or Zabbix), Grafana can function as a centralised tool for visualising various metrics.

Elasticsearch is a full-text search engine. Elasticsearch itself is a database built on Apache Lucene. This combination gives us a powerful tool for searching, grouping and filtering huge data sets in near real time.
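
Full-text engines built on Lucene rely on an inverted index, a structure that maps each term to the documents containing it. The sketch below shows the idea in plain Python. It is not the Elasticsearch or Lucene API: the `build_index` and `search` functions and the sample documents are made up for illustration, and real engines add text analysis, relevance scoring and on-disk storage.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start from the first word's documents and narrow down with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data.
documents = {
    1: "welcome email opened",
    2: "cart abandoned email sent",
    3: "cart abandoned",
}
index = build_index(documents)
print(sorted(search(index, "cart email")))
```

A query only touches the entries for its own words, so the cost of a lookup depends on how many documents match those words rather than on the total size of the collection.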

Obviously, the technologies described above are not the only ones we use. The SALESmanago Marketing Automation system relies on some of the most modern technological solutions in the world to provide the best performance and reliability.