I’ll Be A Kind Of Data Scientist

I have recently started working as a “Senior Customer Adoption Engineer”. This is a kind of data scientist who helps to gain information about how modern technologies are used, so that better decisions can be made about future software development. This in turn benefits our customers. I am very happy about this new task. I think it is very important. Let me explain why I think so.

Data can save lives

When we hear about large amounts of data, this is often accompanied by concerns about our privacy. But as is so often the case, there is another side to it. Large amounts of data can help us make vital decisions. This is particularly visible during the current corona pandemic. The more information we have about the status of infections, and the more accurate and reliable this data is, the better the decisions we can make. Decisions that can save many lives.

That’s why scientists are working feverishly to increase testing capacities for the coronavirus. That’s why politicians and decision-makers in Germany and many other countries listen to scientists. The scientists’ recommendations can then be made on the basis of solid data.

Data in product developments and new technologies

My employer VMware delivers new technologies to better create, run and use applications. Product managers and business units must make decisions about what features are needed, or what improvements are important or needed quickly. Some decisions are made on instinct, but they are often based on existing data.

There are business figures such as revenue, licenses sold per solution, pipeline, and other information that is relatively easy to obtain. But there is also another side: What are customers doing with the technologies? How are they used? Which benefits are particularly important, and what are the difficulties? We have a lot of customers, so we analyze support requests and also information that our products send anonymously, if customers allow it. You can get interesting data through these automated channels.

But the really valuable data are those that reflect the subjective perception of the users. How well do the new technologies and solutions perform the tasks for which they are intended? I am now working on improving this kind of research in my new team. There’s a wonderful book on this topic that I would recommend if you are interested in gaining customer information: “Lean Customer Development” by Cindy Alvarez. The subtitle is “Build Products Your Customers Will Buy”, and it’s full of insights and practical tips on how to approach customers, be it via email, phone interviews or on-site interviews. The book is valuable for start-ups and also for established companies.

I am working on ways to provide useful information that generates valuable insights. Our customers should benefit from this because they get even better solutions, but it should also make the work of our product management easier. It should help them to make the right decisions.

The Eel and The Discipline of Small Steps

Have you ever tried to hold on to a live eel? You’ll hardly ever succeed. I grew up in Northern Germany at the Steinhuder Meer, and there are eels there. With my school class, I went to an eel smokehouse once, and we were allowed to try to hold an eel. It slips through your fingers. Nobody could hold it for more than a few seconds.

Sometimes it seems to me that the business value of a new technical solution is like an eel. You’ve invested millions in new software or services, and in the end you’re not sure whether this investment has delivered measurable added value for your own company. This seems to be a trend across all industries, but especially in modern IT such as cloud computing, IT departments have a hard time. Vendors are reacting with new roles such as Customer Success Manager. A search on LinkedIn for this job title yields 65,671 hits today. These people help customers to realize the added value of a solution.

In an ideal world, a product delivers its business value after installation and everyone is happy. But the world is not ideal. Solutions that change operational processes, and that are supposed to deliver particularly high added value, require a change in user behavior. That starts at the Apple Retail Store, where you can get a demonstration of how to make the transition to Apple products work. But it is even more true in large companies, where it is often referred to as operational transformation, the change in IT operations.

That’s why I’m a big fan of small steps: Think big and start small. If the value of a great idea is visible in a first implementation after a short time, then the IT manager can provide management with more reliable predictions about future business value.

Look for a concrete use case that is close to the business. Define how you want to measure success. Pay special attention to how you want to measure the success of a business transformation. And don’t wait too long until the first milestone is reached.

When there is a special relationship of trust between customer and supplier, sometimes very large projects are initiated and new investments are made before the previous project has delivered measurable business value. This can work, but in the long run it is a risk for both sides. Think about the discipline of small steps.

An AI Playground

This week I was invited to the official opening ceremony of the ARIC (Artificial Intelligence Center) Hamburg. The ARIC brings together companies, start-ups, research institutes, banks and politics to initiate AI-based projects and establish AI solutions on the market. Besides good conversations, I experienced interesting presentations introducing AI projects.

A very large established finance company uses AI in two ways. There are short-term projects (lasting 1 to 3 years) in which modern applications and new user interfaces are developed. In the long term, in cooperation with ARIC, completely new business areas are tackled and the old processes are fundamentally improved, e.g. in the analysis of legal documents.

A communications company presented how they use AI to evaluate and optimize the efficiency and reach of marketing methods. A consulting firm showed how AI in image analysis can be used to categorize defects in aircraft engines much faster.

There are many ideas on how AI can drive new business, and yet it seemed a bit like a playground to me. This is not meant negatively. It’s about playful experimentation. There will be many more experiments to try. And it’s about starting on a small scale and proving the value of AI solutions, as I wrote earlier.

The more AI-based business models work, the more new ideas will come up. I can imagine that AI will become much more interesting for many companies. And faster than you might think.

Truly Intelligent Machines

The definition of artificial intelligence can be vague. Sometimes it seems to be just brute-force number crunching, where more and more computing power is used to create behavior that seems to show intelligence. But if we look behind the scenes of Deep Blue and other supercomputers that master games like chess or Go, these are special cases where knowledge is optimized in a clearly defined area.

Human intelligence is much more creative and adaptable. It is prepared for every eventuality in our lives, much more than any computer.

And this is exactly where the 15-year-old classic by Jeff Hawkins and Sandra Blakeslee comes in: “On Intelligence” is a book in which we learn in great detail how the human brain works, how the neocortex is structured, how we use it to remember things, and how we make decisions. And it is precisely this biological template that the authors use to give us clues as to how to build truly intelligent computers.

A colleague and friend recommended this book to me, and I can only pass on this recommendation. Even if the predictions of 15 years ago did not really come true, it is still an enlightening read.

“The most powerful things are simple,” Jeff writes in the prologue. He’s right; just think of the iPhone. So this book presents a simple and straightforward theory of intelligence. It goes very deep when it explains how the individual cells and cell regions in the brain interact and how information is stored and retrieved. Yes, you need to concentrate while reading, but it is also understandable for non-neuroscientists.

Now, if a machine uses this behavior of the human brain, then it is really intelligent. Jeff assumes in this book that in 10 years (that would be 2015) such intelligent machines will exist. But in the next sentence he gets more cautious because it might take longer.

Jeff calls for the construction of such machines, modeled on the human neocortex. The book gives some examples, e.g. how such machines communicate and capture the world’s weather in a level of detail that seems impossible today. Do we really want that? I’m not sure that’s a good idea. And I haven’t heard anything more about such machines.

Anyway, I recommend the book “On Intelligence” to anyone interested in intelligent computers. You’ll have more respect for your brain after reading it.

Investing in AI and The Role of VMware

At the NORTEC 2020 trade fair for manufacturers I was invited to a round table discussion about the introduction of AI in the manufacturing industry. Large companies, universities and local business leaders explored how to use AI to drive innovation and create a business plan for it. I was asked to present the role of virtualization for AI/ML projects, and a data scientist was interested in (and surprised by) the performance benefits of virtualization as described in the VROOM blog article “How Does Project Pacific Deliver 8% Better Performance Than Bare Metal?”

Several representatives and local executives from private and family businesses discussed their business. Small and private companies are driving the economy in Northern Germany, where there is not a single DAX company, but many small and medium-sized companies. I was surprised to learn that these smaller companies increase their turnover much faster and more strongly than the large public companies. The consensus was that long-term investments pay off better than short-term ones. Public companies must take shareholder value into account and provide quarterly figures, so many decisions are made to increase short-term revenues. Smaller companies have a time horizon of 10 to 20 years for their investments, resulting in a more stable and reliable business. They work over many generations.

This has an interesting influence on their AI strategy. These entrepreneurs have to control their risks and cannot afford large investments, so we discussed joint projects with students from local universities. They are very interested in AI, and the first companies are starting to get value out of it, but they are only at the beginning. Another stumbling block is concern about using the public cloud for AI projects, especially in terms of compliance and intellectual property protection. As a result, they will want to run AI/ML software in their local data centers or at their own sites. The amount of investment is often only around €10,000 to €15,000 for hardware, so at first I thought this was too uninteresting for cloud infrastructure providers like VMware, who tend to support larger projects. But I was asked about the virtualization of AI/ML workloads because almost everyone has had good experiences with VMware vSphere (or VMware Workstation). In addition, universities and research institutions like DESY have to cover completely different dimensions, which can make infrastructure projects with virtualization interesting.

Unexpected Side Effects

In the podcast “Die Maschine: Kontrolle ist gut, KI ist besser” (in German, by the radio station Deutschlandfunk), a scary fictional story is told from the 21st minute on:

An artificial intelligence has been developed that controls and executes all drug shipments worldwide. Because this was so critical, a special algorithm was chosen to guarantee that no population group is disadvantaged, an algorithm which is always 100% politically correct.

When the artificial intelligence was activated, things went well at first, but then the number of deaths of diabetics increased in the rich countries. Insulin was lacking everywhere in the hospitals of the industrialized countries. How could this happen?

Well, the system worked exactly as it was designed. The artificial intelligence took into account the need for drugs worldwide, but there were not enough drugs like insulin for everyone on earth. Underserved areas, especially in Asia and Africa, received more drugs from the artificial intelligence, while the rich countries received less. Thus the shortage was distributed evenly across the globe.
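
The podcast does not say how the fictional system decides, but the effect can be reproduced with a very simple rule. Here is a minimal Python sketch, assuming the system splits the available supply in proportion to each region's need. The function name, the region names and all numbers are made up for illustration.

```python
def allocate_proportionally(needs, supply):
    """Split a limited supply across regions in proportion to their need.

    Assumption for illustration: every region gets the same share of its
    need covered. The story does not describe the real rule.
    """
    total_need = sum(needs.values())
    if total_need == 0:
        return {region: 0.0 for region in needs}
    # Fraction of need that can be covered everywhere, capped at 100%.
    coverage = min(1.0, supply / total_need)
    return {region: need * coverage for region, need in needs.items()}


# Hypothetical figures in arbitrary units of insulin.
needs = {"rich_countries": 400, "underserved_regions": 600}
allocation = allocate_proportionally(needs, supply=500)

for region, amount in allocation.items():
    print(f"{region}: receives {amount:.0f} of {needs[region]} needed")
```

With only half of the worldwide need available, every region ends up with half of what it needs. Nobody is disadvantaged relative to anyone else, and yet a region that was fully supplied before now has a shortage.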

It is a similar dilemma to two burning houses with people trapped inside, when you only have enough helpers to fight one of the fires and save the people in that house. What do you do?

These are ethical questions that an artificial intelligence cannot answer automatically. So when artificial intelligence is used to sustain life, we should look very closely. And well-intentioned is certainly not always well done, as the story of the drugs shows.

ML Job Interview

There are many variants of this joke floating around. It is so cool. Here is my favorite, which I found on Twitter:

Interviewer: What’s your biggest strength?

Me: Expert in Machine learning

Interviewer: What’s 9 + 10?

Me: 5.

Interviewer: Nope. 19.

Me: It’s 14.

Interviewer: Wrong. 19.

Me: It’s 19.

Interviewer: What’s 2 + 2?

Me: 19.

Interviewer: You’ll overfit right in!

Like a Broken Marriage

There are cases where I think IT and business are like a broken marriage, and my work is that of a family therapist. What makes me think so?

Well, at a meeting of IT specialists I once asked which of the IT experts felt pressure to provide infrastructure faster. No one came forward. They all said their work was okay.

A week later at the Machine Learning Conference I asked a few people where they run their applications. They said in the public cloud. When I asked if they were considering doing the same in their own data center, they looked at me wide-eyed: “I would never ask my own IT department to run machine learning applications. They are way too slow to deploy.”

No wonder IT staff feel no pressure: the business doesn’t even ask for faster deployment because it has given up.

It’s like in a marriage where the spouses have given up communicating. If you want to solve this, it’s hard work.

Now there are certainly IT departments that are working well with their customers in their respective business areas. But, dear IT people, are you sure that you know all the requirements of the business units? Are they still talking to you, or have they given up? It might be a good idea to validate assumptions explicitly. Maybe there is still unused potential for improvement. And dear business departments, have you asked your IT department lately if they could react faster? Perhaps you have overlooked potential in your own company?

If IT reacts too slowly to business requirements, then it has very little to do with technology; it’s all about processes and team structures. And above all it is about communication. Maybe you should get help to put communication back on track.

Between Hype And Extraordinary Benefits

Last week I was at the ML Conference in Berlin. The conference was very instructive for me. The lectures were about the technology behind machine learning, data science, security and algorithms. But they also covered how to set up an ML project, which mistakes can be made, and why container technologies like Docker or Kubernetes make ML projects much easier. Most exciting were the lectures about concrete applications of machine learning and artificial intelligence. And of course there was plenty of time for discussions with other participants.

My Conclusion

Most companies are currently experimenting with machine learning. There are only a few ideas on how these new technologies can bring real business benefits today. The hype in the media, both positive and negative, felt far away. At the same time, everyone I spoke to agreed that in a few years, every company will have to use machine learning to stay in the market. It will be the lowest common denominator.

It will be used both in the public cloud and in the company’s own data center. The biggest problem in your own data center seems to be the slow processes. “It takes far too long for me to get a system set up. I’m much faster in the public cloud,” several people told me.

From VMware’s (my employer’s) point of view, I noticed that some participants knew us, but none associated VMware with machine learning and artificial intelligence. VMware offers interesting options when it comes to the flexibility and security of ML applications. Well, I’ll deal with that another time.

Here are two more highlights from my point of view:

How we avoid maritime accidents

The keynote “Data to the Rescue! Preventing Accidents at Sea” by Dr. Yonit Hoffman of the Israeli startup Windward inspired me. Did you know that 10% of all ships have an accident every year? With big data analysis and machine learning, Windward has developed models to predict and prevent accidents. 150 million data points on ship positions, ship types, weather information, water depths, proximity to ports, and many more have been brought together and translated into deep knowledge and understanding of what happens at sea. The lecture was really captivating.

The results have led to a very fruitful and interesting cooperation with insurance companies. Imagine the two worlds that meet here. Maritime insurance companies have been working in a similar way for 350 years, often with paper. Now something new is emerging. And it is clearly driven by business. This is how I imagine the future of machine learning.

Rainwater drainage pipes for China

The most exciting conversation was with an employee of a company that manufactures sanitary systems for large buildings and rainwater drainage pipes. Large flat roofs, e.g. on industrial plants or shops, are a special challenge. The drainage pipes must have exactly the right diameter. If they are too small, too little water flows off. If they are too large, there is no suction effect. With the optimum diameter, suction is created in the pipe, which drains off the rainwater optimally.

I have learned that many years of professional experience are necessary to design these optimal pipes. This is a problem when you want to enter new markets. Now the company has managed to determine optimum rainwater pipe diameters much more quickly using machine learning. Now, for the first time, it is possible to enter new markets such as China without having employees there who have decades of experience.

If machine learning and artificial intelligence already deliver value in this extraordinary way today, then we will certainly see even more examples in the future. The limits are our creativity and imagination. Knowledge of technology is not enough. This is only a necessary, but not a sufficient condition for the successful use of machine learning. Identifying business ideas is the real challenge.

Good Code is like Good Literature

The basis of many innovations is software. “Software is eating the world”, Marc Andreessen said back in August 2011. Software is also the basis of all machine learning. And what does software consist of? It is code. So I think it’s important to understand what good code is.

A great colleague recommended the book “A Philosophy of Software Design” by John Ousterhout. The author’s name looked familiar to me, but I didn’t immediately know why. Only in a later chapter, in which John wrote about the Tcl scripting language, did it become clear to me. I had used the same author’s book “Tcl and the Tk Toolkit” a lot in the 1990s. At that time I developed Linux device drivers (loadable modules for the Slackware Linux 0.99 kernel) for a measurement system during my physics studies, and I used Tcl/Tk for the application interface.

“A Philosophy of Software Design” is cool. It reminded me very much of another book I had read this summer: “On Writing Well” by William Zinsser, first published in 1976. The book explains how to write texts of all kinds. Be clear and use short sentences. Write for yourself, as you would like to read it. Good code is like good literature: it’s timeless. The software design tips are like tips for writing any kind of literature: keep it simple and avoid complexity. Define good interfaces with the right level of abstraction. Think of other developers who will work with the code after you.

These rules haven’t changed for decades. And even the seemingly agile and fast innovations end up abiding by these rules because they are based on code. And ideally on good code.