Today will be a more technical episode, but if you want to remain relevant as an event professional, you do need to know what big data is. That is why I've chosen this topic.
Hi Ben, welcome to our studio.
Today's topic is big data. It's a big buzzword. But what is it about?
Well, big data is actually not a technology or a solution. It's more the definition of a problem domain. The problem is that you have a data problem that you can no longer solve using traditional systems like regular relational databases. And that problem can have multiple causes.
It can be that you have more data than you can cope with, within your traditional database system.
That it grows...
The volume of the data.
Yes, the volume of your data. It grows into the gigabytes or terabytes, and you want to process that faster.
It can also be that your data is coming in faster, or that you want to process it faster and closer to real time than you're used to.
It can be that you want to model or query your data differently: that you want to look more at the relations between your data, rather than joining all the data together.
And all of these problems, they require different solutions.
It can be that, if you want to handle more volume, you have to look into a big data system where you can process your data in a distributed way, across multiple machines at a time.
When you want to look at your relations more, you have to look at graph databases, for example.
For streaming solutions, you have technologies like Kafka.
But they all require a different type of solution. So, it's not really only about the volume of your data.
It's not about the variety of your data, but it's a problem with your data, where you have to resort to new systems as opposed to what you're used to.
And that, for me, is the problem definition of big data.
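To illustrate what looking at the relations between your data can mean, here is a minimal Python sketch that stores attendee-to-event links as a small graph and follows them in two hops. It is not a graph database, only an in-memory stand-in for the idea; the attendance records, the names build_graph and related_events, and the assumption that attendee and event names never collide are all hypothetical.

```python
from collections import defaultdict

# Hypothetical attendance records: (attendee, event).
ATTENDANCE = [
    ("alice", "Expo A"),
    ("alice", "Summit B"),
    ("bob", "Expo A"),
    ("bob", "Congress C"),
    ("carol", "Summit B"),
]


def build_graph(edges):
    """Store every attendee-event link in both directions."""
    graph = defaultdict(set)
    for attendee, event in edges:
        graph[attendee].add(event)
        graph[event].add(attendee)
    return graph


def related_events(graph, event):
    """Follow two hops: event -> its attendees -> their other events."""
    related = set()
    for attendee in graph.get(event, ()):
        related |= graph[attendee]
    related.discard(event)  # the starting event is not "related" to itself
    return related


graph = build_graph(ATTENDANCE)
print(sorted(related_events(graph, "Expo A")))
```

The question "which other events share visitors with this one" is answered by walking links, which is the kind of query a graph database is designed around.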
Do you have some examples of these kinds of problems?
Yes, if you look at the volume of data, you typically see high volumes in the form of log data, for example, or tracking data from your website. These log systems create very large files, typically running to gigabytes a day.
And if you want to extract meaning out of that data, you can wait for your system to process that file as a whole, but that might take a long time. Or you can distribute that load across multiple machines at a time. And if you have 100 machines working at the same time on the same file, in theory, your program goes 100 times faster.
But there's also behavioural tracking data, sensor data, things like that.
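The idea of splitting one log file across several workers can be sketched in a few lines of Python. This is a minimal illustration, assuming a made-up "LEVEL message" log format and using threads on one machine as a stand-in for separate machines; the names split_into_chunks, count_levels and count_levels_distributed are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical log lines in a made-up "LEVEL message" format.
LOG_LINES = [
    "INFO page=/home",
    "ERROR page=/tickets",
    "INFO page=/schedule",
    "INFO page=/home",
    "WARN page=/tickets",
    "ERROR page=/home",
]


def split_into_chunks(lines, n_chunks):
    """Split the lines into at most n_chunks roughly equal parts."""
    size = max(1, -(-len(lines) // n_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def count_levels(chunk):
    """Work done by one worker: count log levels in its own chunk."""
    return Counter(line.split(" ", 1)[0] for line in chunk)


def count_levels_distributed(lines, n_workers):
    """Hand each chunk to a worker, then merge the partial counts."""
    chunks = split_into_chunks(lines, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_levels, chunks))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


totals = count_levels_distributed(LOG_LINES, n_workers=3)
print(dict(totals))
```

The "in theory" matters: splitting the file and merging the partial counts is extra work that a single machine would not have to do, so the real speed-up is usually somewhat less than the number of workers.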
And why should we care, in the event industry, about big data?
Well, you can use it in the event industry, to get to know your customer better. You can use it to track your customers throughout your event. You can use it to give them more meaningful recommendations, for example. You can use it to track the relationships of your customers to your event or to other events. So, it's usually used to know your customer better.
So, for example, at a big congress, you track in real time what customers are doing. And based on what they're doing, you give suggestions for the next sessions they should attend, for example.
Yes, exactly. And you can combine more and more sources.
You can use the data of where they are walking around.
But you can collect even more data, for example from the things they are looking at. You can even use eye tracking. You can combine that data with the data of where the people are walking, and see where they are walking, where they are stopping, and what exactly they are looking at. And use that to your advantage.
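Combining where people walk with what they look at comes down to matching two data sources on attendee and time. Here is a minimal Python sketch of that matching, assuming each source delivers simple (attendee, seconds, value) records; the sample data and the names index_positions, zone_at and combine are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical position pings: (attendee, seconds since opening, zone).
POSITIONS = [
    ("a1", 0, "entrance"),
    ("a1", 60, "hall 1"),
    ("a1", 180, "hall 2"),
    ("a2", 30, "entrance"),
]

# Hypothetical eye-tracking events: (attendee, seconds since opening, target).
GAZES = [
    ("a1", 75, "booth 12 banner"),
    ("a1", 200, "stage screen"),
    ("a2", 10, "floor plan"),
    ("a2", 45, "floor plan"),
]


def index_positions(positions):
    """Group pings per attendee, sorted by time."""
    by_attendee = defaultdict(list)
    for attendee, t, zone in sorted(positions):
        by_attendee[attendee].append((t, zone))
    return by_attendee


def zone_at(track, t):
    """Return the zone of the latest ping at or before time t, if any."""
    times = [ts for ts, _ in track]
    i = bisect_right(times, t)
    return track[i - 1][1] if i else None


def combine(positions, gazes):
    """Attach to every gaze event the zone the attendee was in at that moment."""
    tracks = index_positions(positions)
    return [
        (attendee, t, zone_at(tracks.get(attendee, []), t), target)
        for attendee, t, target in gazes
    ]


for row in combine(POSITIONS, GAZES):
    print(row)
```

A gaze event recorded before the attendee's first position ping gets no zone (None) rather than a guessed one.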
Today we've talked about collecting the data. The big data problem.
Next episode we will be talking about machine learning. And that's then more about what to do with the data.
Okay Ben, thank you very much for coming over.
And you at home, thank you for watching our show. I hope to see you next time.