In a previous article about predictive analytics, which only a couple of close friends read, I stated that such a procedure can actually convert data into money. This time, I bring a practical case study to demonstrate that I was not overstating. Well, I shouldn’t need to demonstrate anything, as anybody working in airlines and large hotel chains saw that very fact scientifically proven long ago. But apparently “lesser” sectors in the travel industry still believe “data analysis” is nonsensical, un-businesslike stuff reserved for the IT bunch. Thus, this is an exercise in self-affirmation more than anything else.
I hope somebody finds it at least slightly thought-provoking or useful for their own company.
A foreword: it’s not “A.I.”, actually
Apologies to the engineers and experts who might be accidentally reading this text. I cringe too every time I see “AI” on this kind of post, but I needed some sort of click bait to get at least three or four readers from my industry. Let’s call things by their proper name, then: what we are using to make our predictions is not “A.I.” but one of its many facets, a method called machine learning. It is a flavour of intelligence because, after all, the technique involves a form of learning. We have a look at the data and, if it’s good and there’s enough of it, we can create a “model” that will be used to predict outcomes. Then an algorithm is built (or recycled) to “perform” the prediction model, which is tested in several ways and environments. Finally, the outcome is verified: either the prediction was accurate (close enough to real-life results) or wrong. Both results are good, though, because the “machine” will take note and learn from them, so next time it will fare better. And the next time even better, and so on.
Just like you and I should do…
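For the curious, the predict, verify and learn loop described above can be sketched in a few lines. This is only a toy illustration, not the model used in this project: the function name, the learning rate and the monthly figures are all made up for the example.

```python
def learn_and_predict(actuals, learning_rate=0.5):
    """Predict each value before seeing it, then correct the estimate
    by a share of the error (hypothetical toy model, not the real one)."""
    estimate = actuals[0]  # start from the first observation
    errors = []
    for actual in actuals[1:]:
        prediction = estimate            # 1. predict
        error = actual - prediction      # 2. verify against real life
        errors.append(abs(error))
        estimate = prediction + learning_rate * error  # 3. learn from the miss
    return estimate, errors


# Made-up monthly booking counts, purely for illustration.
monthly_bookings = [10, 12, 11, 13, 12, 12]
next_month, errors = learn_and_predict(monthly_bookings)
print(next_month, errors)
```

Every wrong prediction nudges the estimate towards reality, which is the whole point of the “both results are good” remark.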
The subject
Bear in mind that a project like this can be conducted for any business running a booking engine. Our lab rat was an Italian B2B wholesaler, with data collection spanning just over a year (the bare minimum to get acceptable predictions). Their customer backbone is made up of loyal retailers; we focused on a few of them with enough data to check whether the predictions would be reliable. In this case study we’ll present outcomes from a single agency for simplicity; our wholesaler’s dashboard would include the same analysis for all their clients.
The data
We had to combine two datasets coming from the same database, because the vast majority of booking engines don’t provide data from search logs; it must be exported separately. Aside from that, the dataset was rather small and there was no technical challenge in terms of ETL, just a few corrections here and there. Data originating in travel-related systems is usually well structured. Alas, it’s almost never properly stored and managed… Anyway, please note that this study’s dataset runs from May 2017 to May 2018. We also used a dummy dataset as training data. As a further test, we sometimes ran the algorithms “backwards” (I’ll dispense with the explanation), to verify whether the predictions would deliver results similar to the actual bookings made in past months. It turns out they did, with an acceptable error margin.
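I skipped the explanation of the “backwards” runs, but one common way to do this kind of check is a holdout backtest: hide the most recent months, forecast them from the earlier ones, and compare with what actually happened. The sketch below assumes that approach (it is not necessarily the exact procedure we followed), uses a deliberately naive average as a stand-in for the real model, and runs on made-up figures.

```python
from statistics import mean


def backtest(history, holdout=3):
    """Hide the last `holdout` months, forecast them from the earlier
    months, and return the percentage error for each hidden month."""
    train, test = history[:-holdout], history[-holdout:]
    forecast = mean(train)  # naive stand-in for the real prediction model
    return [round(100 * (forecast - actual) / actual, 1) for actual in test]


# Made-up monthly booking counts, purely for illustration.
bookings_per_month = [20, 22, 18, 20, 25, 16, 20]
pct_errors = backtest(bookings_per_month)
print(pct_errors)
```

A negative percentage means the forecast fell short of the actual bookings, that is, it was conservative.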
The goal
We wanted to know what our loyal retailers would be buying next, how much they would spend, when, and for which destinations. The main focus was on “multi” trips, that is, bookings which include more than one city and/or hotel and/or means of transport. Obviously, that is of special interest, being the most profitable type of booking, but we checked all types of services bought, especially hotel + flight.
The tools
Forget it, I’m not going to support your little DIY project. Besides, even if you had infinite resources available to buy all the fancy software and powerful computers, you’d need a proper data scientist to work out the right flow and use the correct algorithms (or adapt existing ones)… And you’d need somebody to help the data scientist come up with the right predictive model and inferences. That would be me (call me a data executive, if you will).
The outcome
Once there’s enough data to play with, correlations and constructs are easy to come by. Not all of them will be useful, though, especially for predictive purposes. In short, whoever does the analysis (the data executive) must know what to look for. See why getting the tools and the data scientists is not enough?
For instructional purposes, we’ll concentrate on three aspects: expectations, timelines and chances of getting “multi” trips booked.
· General expectations: we took total pax as an interesting figure, perhaps not for this case study in particular, but DMCs would definitely find it useful. Even more useful would be the “type of passenger” graph, which was quite accurate: it shows a surge of children to be expected during August, which is normal for the Italian market. Expected net profit, on the other hand, is a sought-after indicator. In this case it shows a negative trend, but that doesn’t mean there’s a problem, as we’ll see in the next analysis.
· Timelines: there are a few fascinating findings here. This agency usually books between one and four months before the travel date, yet it searches for dates well over a year ahead… That’s why the profit expectation from the previous analysis has a negative trend: they rarely book far in advance, which is normal for this agency and its market. Besides, the analysis is a couple of months old and we ran that particular forecast only up to June: had the timeline gone to the end of September, the trend would definitely have been positive, as August is Italy’s main holiday month. The “Bookings predicted and chances” graph proved to be rather conservative. The surge in reservations made in March is due to the Easter holidays, plus special summer offers. The red line there shows the “confidence” the system has in its prediction: the higher the curve, the more confident it is. For past months (before June 2018), as I mentioned, predictions were conservative: the actual number of bookings was 5% to 10% higher (except in March). Lower confidence means that there was not enough data, in general, to be confident. Hey, it’s a newborn algorithm: give the machine time to learn! Next year, with more data, it will do better. Guaranteed!
· “Multi” bookings: all of the above could be performed specifically for this kind of trip, of course. Moreover, here we compared the monthly chances of a “multi” trip being booked instead of a flight+hotel package. Finally, we tried predicting how far in advance, and in which months, “multi” trips would be booked with higher chances.
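To give an idea of what “monthly chances” means in the last point, the crudest possible estimate is the historical share of “multi” bookings in each month. The sketch below shows only that baseline: the function name, the trip-type labels and the sample records are assumptions made for the example, and the real model also draws on search data.

```python
from collections import Counter


def multi_share_by_month(bookings):
    """bookings: iterable of (month, trip_type) pairs.
    Returns {month: share of bookings that were 'multi' trips}."""
    totals, multis = Counter(), Counter()
    for month, trip_type in bookings:
        totals[month] += 1
        if trip_type == "multi":
            multis[month] += 1
    return {month: multis[month] / totals[month] for month in sorted(totals)}


# Made-up booking records, purely for illustration.
sample = [
    ("2018-03", "multi"), ("2018-03", "flight+hotel"),
    ("2018-03", "multi"), ("2018-03", "flight+hotel"),
    ("2018-04", "flight+hotel"), ("2018-04", "multi"),
    ("2018-04", "flight+hotel"), ("2018-04", "flight+hotel"),
]
shares = multi_share_by_month(sample)
print(shares)
```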
I’ll leave the destination demand forecasting results for now: there’s not much sense in anticipating the preferred resorts from a single agency, but the whole customer base offers a pretty clear idea of what the market wants in the near future. A mesmerizing topic that deserves its own post.
Key question: are these predictions reliable?
Key answer: it depends. The main problem here is the amount of data: even if we have a history of over a year of transactions, the actual number of bookings from a single agency isn’t that big, so the error margin is not acceptable in some cases. The number of passengers booked, for instance, is a simple prediction, and if I owned a DMC I could go with the estimated traffic to organize forthcoming transfers. I could also blindly trust sales and demand forecasts done this way, if I were a hotel’s revenue manager. But I wouldn’t bet on charter allotment calculations, not yet. Forecasting which type of trip will be booked several months in advance from historical and search data is tricky. Nobody has tried it as far as we know, and we couldn’t find any academic content to give us guidance.
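To see why the number of bookings matters so much, take the textbook margin of error for an estimated proportion (normal approximation, 95% level). This is a generic statistical illustration with made-up sample sizes, not the confidence procedure used in our study.

```python
import math


def margin_of_error(share, n, z=1.96):
    """Half-width of the 95% interval for a proportion estimated
    from n observations (normal approximation)."""
    return z * math.sqrt(share * (1 - share) / n)


# Same estimated share (50%), very different sample sizes.
small = margin_of_error(0.5, 100)      # a single agency, roughly
large = margin_of_error(0.5, 10_000)   # a much bigger customer base
print(f"n=100: +/-{small:.1%}   n=10000: +/-{large:.1%}")
```

With a hundred bookings the estimate can be off by almost ten percentage points either way; with a hundred times more data the margin shrinks tenfold.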
Booking and search transaction numbers won’t be a problem when we get our dirty hands on a huge bedbank’s dataset: they have hundreds of bookings and billions of searches per day. I am looking forward to starting that project (my data boffins are salivating at the prospect!)
Back to our case study: we used a confidence interval and a procedure that we deemed valid, yet it can be argued that our machine learning algorithms based on training data might not be accurate enough. Perhaps time-series forecasting would work better: that’s something we’ll try next time. My data boffins mentioned arcane methods such as Generalized Autoregressive Conditional Heteroskedasticity (no kidding), Bayesian-based models, and the like. I’ll publish our findings in due time. Bottom line:
Our ML predictions are so far correct qualitatively, slightly off quantitatively. No crystal ball yet but getting closer.
Conclusions & a rant
Anybody working with data has a natural, inherent honesty that prevents them from presenting manufactured results or crappy interpretations. Besides, I run projects for my clients as if the outcomes were vital for me, so I’m not going to ice the cake by affirming that our predictions are a magic window that shows the future exactly as it will be. Even so, this approach is far more accurate than any of the commonly used methods in our industry (namely, Excel). Can you imagine all of the above done in spreadsheets? No, you can’t.
Still, hotel chains and even expensive revenue management systems are grounding their demand and sales forecasts on historical data only, with age-old procedures. It amazes me! Eventually the accommodation industry will update its toolbox, but tour operators and bed banks that are entering the forecasting dome should ditch simple statistical forecasting methods as the weapon of choice. Why use a knife when you have a rail-gun available?
Moreover, the beauty of modern forecasting methodology is that it may not be limited exclusively to endogenous data. How about adding to the mix weather parameters, air traffic, official arrival figures by destination, etc.? I am not speculating here; I find it utterly surprising they’re not doing it already!
Don’t get me wrong, this is not bragging at all. Rather, it’s a half-baked cathartic effort that will not attest I’m a visionary genius… Quite the opposite, it proves I might be making the same mistake I already made three times in my long career. It seems my brain machine is unable to learn that being a tech pioneer with no money to back up the marketing crap equals very limited success, in a best-case scenario.
One day, when A.I.-driven forecasting in travel businesses is mainstream, I’ll laugh at my own silliness. Right now, I am getting the same puzzled looks I got back in the day, when I tried to explain the benefits of an online booking engine to hoteliers. And feeling the same damn frustration again.
Thanks for reading, excuse my rantings.
Marcello Bresin