We had the pleasure of chatting with Tim Pijl about his time at Finaps and, in particular, his work with federated learning, a machine learning technique that trains a model across multiple decentralized edge devices.
Could you introduce yourself?
Probably best to start with my name when introducing myself: Tim Pijl. I have been working at Finaps for roughly five years, where I started as the first Data Analytics Engineer. Over the years, we have expanded our expertise in this domain and now have a full team of Data Analytics Engineers. Currently, I’m the team lead of ‘Team Data’, working with six direct colleagues to develop data-driven software solutions.
Why did you decide to work at Finaps?
What got me hooked was an in-house day at the Finaps office in, I think, September 2016. The two main reasons I applied were the work itself and the people. The reasons I am still working at Finaps are many: the projects, the culture, the people, the flexibility, and my current responsibilities. I feel that I have grown tremendously over the last five years, and there is so much more to learn that I’m nowhere near ready to leave.
What sets Finaps apart from other Fintech companies?
To be honest, I’m not entirely sure, as I haven’t worked at other Fintech companies. But if I must guess, it’s the people working at Finaps. It’s hard to put into words, but working in a team of like-minded people who constantly challenge you, with a dose of friendly banter thrown in, is inspiring.
There seems to be a larger focus on Software as a Service (SaaS) from different corners of the market. Considering Finaps’ investment in software, what do you think of this trend?
Software as a Service has its pros and cons, obviously. I would say it is a good solution for projects or organizations that want an out-of-the-box, relatively quick solution to a standardized problem. However, keep in mind that some solutions require more flexibility; not every organization has the same wants and needs. If you want something that perfectly fits your organization, a SaaS solution is not the answer.
Can you explain what a federated learning process is?
Federated learning is a way of performing distributed machine learning. Instead of moving data from the clients’ devices and servers to a central location for model training (which we call centralized learning), the models are trained on the clients’ devices themselves and subsequently combined by a central orchestrator (usually by averaging the individual model weights). This cycle of local training and recombining is repeated until the model converges. The method lets you offload most of the computational and hardware-related costs to the clients, but more importantly it provides a large degree of data privacy, since the data never leaves the clients’ systems.
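The train-and-average loop described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the setup used at Finaps: the tiny linear model, the synthetic client datasets, and all function names (`local_train`, `federated_average`, `run_federated_learning`, `make_client`) are assumptions made for the example. The orchestrator only ever sees model weights, never the clients' data points.

```python
import random


def local_train(weights, data, lr=0.1, epochs=5):
    """Runs on a client: gradient descent for y = w*x + b on its private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Runs on the orchestrator: average of client weights, weighted by dataset size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return tuple(
        sum(cw[i] * size for cw, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    )


def run_federated_learning(clients, rounds=50):
    """Repeats local training and averaging for a fixed number of rounds."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Each client starts from the current global model and trains locally.
        updates = [local_train(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        # Only the resulting weights are sent back and combined.
        global_weights = federated_average(updates, sizes)
    return global_weights


def make_client(rng, n):
    """Hypothetical private dataset: noisy samples of y = 3x + 1."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3 * x + 1 + rng.gauss(0, 0.05)) for x in xs]


rng = random.Random(0)
clients = [make_client(rng, n) for n in (20, 50, 30)]
w, b = run_federated_learning(clients)
print(f"learned w={w:.2f}, b={b:.2f}")
```

A fixed number of rounds stands in for a real convergence check here; in practice one would typically stop once the global model stops improving on a validation metric.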
How do you use the federated learning process?
Federated learning has several advantages, most importantly data privacy for the clients. This makes federated learning a natural choice for performing machine learning with sensitive data, such as in healthcare. People are (rightfully) becoming more and more aware of their privacy rights, and there are strict regulations and guidelines in place that dictate whether people’s data can be used for training machine learning models, especially in the context of medical data. Using federated learning allows the use of data that normally would be off-limits, since the data itself is never actually visible to any outside party, not even the owner of the federated learning process! Another use case lies in “sharing” data across parties that have a vested interest in keeping their data to themselves (think banks, hospitals, and pharmaceutical companies). Federated learning trains a single model using the data of every participant, and that model can outperform any model the participants would have been able to train by themselves: a win-win situation.
Why did you decide to use it?
Federated learning appeared on our radar when we started the MIT R&D project together with Bubl, a company that aims to provide personal, secure, and private ‘data vaults’ with a focus on storing medical data. Part of this project was to investigate whether any data analysis and/or machine learning could be performed on the data in the vaults without compromising privacy. Federated learning was a perfect fit for this use case, and we later enhanced it with other privacy-preserving techniques (e.g., differential privacy).
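To give a rough idea of how differential privacy can be layered on top of federated learning, a common generic recipe is to clip each client's model update to a maximum norm and add Gaussian noise before it is sent to the orchestrator. The sketch below illustrates only that general idea and is not necessarily what was done in the project; the function names and the `max_norm` and `noise_multiplier` parameters are assumptions, and the code does not compute a formal privacy budget.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale the update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]


def privatize_update(update, max_norm=1.0, noise_multiplier=1.0, rng=random):
    """Clip the update, then add Gaussian noise proportional to the clipping bound."""
    clipped = clip_update(update, max_norm)
    sigma = noise_multiplier * max_norm  # noise scale (illustrative choice)
    return [u + rng.gauss(0.0, sigma) for u in clipped]


# Hypothetical client update (difference between local and global weights).
update = [3.0, 4.0]
noisy_update = privatize_update(update, max_norm=1.0, noise_multiplier=0.5,
                                rng=random.Random(42))
print(noisy_update)
```

Clipping bounds how much any single client can influence the averaged model, and the noise masks what remains; a larger `noise_multiplier` gives more privacy at the cost of model accuracy.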
What were the biggest challenges you faced during this project and how did you solve them?
The main challenge was finding an open-source federated learning framework that fit our needs and setting up the required infrastructure around the process. Since we were mostly working towards a proof of concept, we went with a federated learning framework written in Python: Flower (flwr). This allowed us to quickly get up to speed and get something working. Later, we ran into the challenge of the tiny computational budgets available on the vaults, which forced us to spend time minimizing the number of libraries our setup required.
What areas does Finaps think have the most potential for growth when federated learning is deployed?
As mentioned before, federated learning has many advantages when working in an environment where data privacy is of the utmost importance. That’s why we see the most potential for growth in areas that match that description: healthcare and finance.