Machine learning predictions based on smaller sets of data and variables

Machine learning: also for smaller data sets? Thanks to the Devoteam machine learning solution on a small set of variables, municipalities can now realign policy and avoid the costs of rebalancing their budgets.

Devoteam, like the rest of the world, usually applies machine learning to big data. This includes not only many observations but many variables too. When applying machine learning, it is often advantageous to take all available variables into account, because most machine learning methods are algorithms which select the right drivers for prediction purposes by themselves. How about applying machine learning to a small set of variables? Devoteam proved it to be possible!

A Devoteam machine learning solution on small data

The client, a medium-sized Dutch municipality with a particular interest in machine learning, wanted proof that the public budget for one of its social benefits can be predicted reliably. Although the client had many interesting and meaningful socio-economic and demographic features at its disposal, Devoteam had access to only a few very basic, anonymized variables. As machine learning is often associated with big data (made up of many variables), the main challenge was to see whether machine learning would work with a small number of variables.

Devoteam took up this challenge and delivered realistic predictions, in line with what all stakeholders had expected in advance. Model accuracy on the test sets was 95% or higher. All of this was based on a comparatively very small set of original variables, combined with machine learning.
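To make the idea of "accuracy on a test set" concrete, here is a minimal sketch of fitting a forecast on earlier periods and scoring it on held-out later periods. It is not the Devoteam model: a one-variable least-squares line stands in for the machine learning method, accuracy is assumed to mean 1 minus the mean absolute percentage error (the article does not define it), and all names and numbers are invented for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for a single explanatory variable."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction (0.05 means 5%)."""
    return mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


# Invented example data: number of clients and yearly cost (in thousand euros).
clients = [210, 225, 240, 260, 275, 290, 310, 330]
cost = [1010, 1090, 1150, 1260, 1310, 1400, 1480, 1590]

# Fit on the first six periods, evaluate on the last two (the test set).
slope, intercept = fit_line(clients[:6], cost[:6])
predicted = [slope * x + intercept for x in clients[6:]]

# Assumed definition of accuracy: 1 - MAPE on the held-out periods.
accuracy = 1 - mape(cost[6:], predicted)
print(f"test-set accuracy: {accuracy:.1%}")
```

Scoring only on periods the model has not seen is what makes such a figure meaningful: a model with few variables can fit past data well and still forecast poorly.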

Data-driven policy as the municipalities’ future goal

The client’s request followed from a new approach released by the federation of municipalities in the Netherlands. Under this approach, municipalities are expected to make more data-driven policy decisions, with the goal of using public processes and financial resources more effectively. One such improvement was to estimate future budget demands more precisely. This additional forecast information had to come from a sound prediction mechanism based on data already collected in past years. Applying machine learning prediction methods was the innovation that brought more precise budget estimates for this municipality.

The lack of information causes social costs and expensive follow-ups

As a local government entity is bound by law to process requests for social benefits made by local inhabitants, the budget actually needed depends on the real demand for approved benefit requests. Consequently, the budget required may, in the course of the budget period, turn out to differ from the budget plan. When budgets are drawn up without a reliable forecast, budget plans will likely be under- or overshot. With undershot budgets, public financial resources are allocated inefficiently (other important public tasks could have been financed!). Overshot budgets leave a financial gap in meeting the demand for (regular and justified) social benefits. As a consequence, time-consuming and costly political decision-making is needed to rebalance the public budget.
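The under- and overshoot reasoning above comes down to simple arithmetic, sketched below. The function name, the optional tolerance band and the figures are hypothetical and only serve to illustrate the two cases described in the text.

```python
from typing import Tuple


def budget_deviation(planned: float, actual: float, tolerance: float = 0.0) -> Tuple[float, str]:
    """Compare realized demand with the budget plan.

    A positive deviation means demand exceeded the plan (overshot, a financial gap).
    A negative deviation means funds were left unused (undershot, money that could
    have financed other public tasks).
    """
    deviation = actual - planned
    if deviation > tolerance:
        status = "overshot"
    elif deviation < -tolerance:
        status = "undershot"
    else:
        status = "on plan"
    return deviation, status


# Invented figures in euros, for illustration only.
print(budget_deviation(planned=1_500_000, actual=1_620_000))
print(budget_deviation(planned=1_500_000, actual=1_410_000))
```

A more precise forecast narrows this deviation in both directions, which is the efficiency gain the article describes.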

Machine learning solution avoids costs and inefficiencies

To avoid these costs and inefficiencies, municipalities may need a different instrument. As the client’s interest in predicting budgets with machine learning grew, Devoteam developed a solution that not only allows municipalities to avoid the costs of inefficiencies and of rebalancing budgets, but also lets them respond with adaptive policy when the predictions show undesired future developments in budget demand caused by changing conditions.

Devoteam began with a simple analysis of part of one of this municipality’s public budgets: in this case the budget for private household support (“huishoudelijke hulp”). This household support budget is part of all the public tasks related to social projects.

The Devoteam machine learning solution achieved reasonable predictions for private household support and high model accuracies on the test sets. These predictions can be used to draw up the budget for private household support more precisely and thereby avoid the costs and inefficiencies mentioned above. In short, it is a great opportunity to align internal procedures with the possibilities offered by digital and machine learning trends.

The cost of the solution is rapidly amortized through repeated use across several social projects

If a partial budget of one of these social projects can be confidently predicted, the same approach might work for other social project budgets too. Moreover, the machine learning solution developed by Devoteam for household support is applicable to many of the municipality’s social projects, such as youth care and social support. As the developed solution can be reused in several ways within a municipality, the investment in developing machine learning forecast models is more likely to be amortized at an early stage.

Individual privacy is not affected

When governmental organizations can avoid public costs and thereby save public resources, the results of this analysis benefit all members of society. To achieve this social efficiency gain, the analysis needs only fully anonymized and partly aggregated data, while its outcomes and the resulting advice are at a fully aggregated level and will not affect individual privacy. Therefore, privacy is fully guaranteed throughout the whole analysis process.

Machine learning predictions are also feasible with only small sets of features/variables

Although Devoteam obtained high-quality results in the example presented, and proved that machine learning can work on a small set of variables as well, this does not mean that machine learning works on all smaller data sets. If your company faces problems with predictions, Devoteam can assist by applying machine learning to your data, bringing significant improvements to your organization’s processes and boosting decision making in the long run.

What is your challenge? Discussion inspires!

For questions about this article, you can contact Marc Bovy, Manager Data & Analytics (marc.bovy@devoteam.com), or Sjoerd Veen, Account Manager Government (sjoerd.veen@devoteam.com).
