We believe that data science can play a fundamental role in improving the livelihoods of the poorest. Data for Workforce Nurturing (D4WN) is a project co-created and implemented by a multinational consortium led by Fundación Capital (Latin America - Mozambique), UX Information Technologies (Mozambique), and Data Elevates (United States), who jointly have more than 30 years of experience in ICT4D projects. These implementing organizations exist to create positive social and economic impact in the lives of the poorest by leveraging the most ubiquitous information and communication technology devices available - mobile phones - and applying data science techniques to the information generated by these devices.
D4WN is a purpose-built database and data toolkit that aggregates user generated data from Biscate, a USSD/SMS on-demand work marketplace, and Com-Hector, a WhatsApp virtual assistant for on-demand workers in Mozambique. D4WN analyzes the data inputs from these two platforms using data science techniques including data visualizations, machine learning, and recommendation algorithms, and creates business intelligence and data insights to improve the performance and service delivery of both platforms. Concurrently, it also provides tailor-made insights to help improve decision making, career choices, and personal development for users (informal workers) of these services - it empowers them.
We are a social enterprise advancing economic citizenship globally and at scale with a systems change approach. We work in three ecosystems: financial health, sustainable livelihoods, and natural capital, leveraging our expertise in digital solutions and data science and applying a gender-transformative lens.
We develop interoperable data platforms that connect feature phones, smartphones, and web platforms to deliver inclusive digital services that empower underserved individuals and their communities through access to information and opportunities for personal and professional development in Mozambique.
We use data science to drive the development of underserved communities by training the next generation of data scientists, building demand and capacity for data use in social impact organizations, and connecting data professionals to market opportunities that are in line with their goals.
In Mozambique, 96% of all workers are employed in the informal sector, one of the highest rates of informality in the world. Recent studies have shown that over the past two decades only 18,200 new formal sector jobs have been created each year, while 500,000 young people enter the job market annually. In other words, only 3.64% of those entering the job market can expect a formal sector job, while 96.36% have no alternative but to join the informal labor sector. Nearly 20% of those in the informal sector are service workers, the most common form of employment for low-income, urban youth. However, informal workers face higher rates of poverty, greater economic insecurity, and fewer opportunities for upward mobility. They lack information about trends and opportunities in the labor market, which makes informed decisions difficult. Compounding these challenges, data on the informal labor market is notoriously scarce, hampering any effort at data-driven policy making.
Informal workers live in a survival economy, relying on on-demand services to generate earnings to support their families, so every day is a struggle to put food on the table. They rely mostly on acquaintances to find job opportunities, and roam the streets posting small makeshift ads with their profession and contact information wherever they can. This is a labor force that spends most of its time looking for opportunities, not working, so the majority cannot afford to specialize in any particular skill or trade - a carpenter is often also a plumber, an electrician, and a street vendor - they become a “jack of all trades and a master of none”. Trapped in a cycle of poverty with no actionable information, they survive one day at a time without being able to make career decisions or long-term plans - the future does not exist for those who can only afford to live in the present. These are some of the reasons why we chose Mozambique for our pilot, as we believe data science can disrupt this cycle.
Would you like to be updated on the work we are doing and/or become involved with this project? Subscribe to our mailing list to get in touch!
D4WN was co-created by Fundación Capital, UX and Data Elevates to empower workers by sharing informal labor market insights and personal advice derived from data generated by two local digital platforms, Biscate and Com-Hector. The project is funded by data.org, a platform for partnerships to build the field of data science for social impact. We believe that data science can generate actionable insights for workers, provide data narratives that help stakeholders make data-driven decisions on labor policies, and improve the design and development of digital platforms that serve informal workers and improve their financial outcomes.
D4WN analyzes the data outputs from Biscate and Com-Hector, and translates them into tailor-made insights for workers, to help improve decision making, career choices, and personal development. By increasing access to information that workers themselves are generating, we are removing the blindfold that workers have about the informal labour market in their vicinity. Ultimately, our goal is to improve informal workers’ livelihoods (we seek a 10% increase in their income and to double the participation of women in the platform). We want workers to thrive so that future generations can benefit from more personal and professional development opportunities.
Biscate: on-demand work platform that uses USSD and SMS protocols to connect workers to clients.
• basic phones
• no internet access
• no app required
• zero rated
• mobile coverage
D4WN: server and database that collect user-generated data from Biscate and Com-Hector.
• database and APIs
• data analysis
• machine learning
• recommendation algorithms
• predictive modelling
Com-Hector: virtual assistant that uses WhatsApp to provide custom insights for workers.
• smart phones
• internet access
• WhatsApp required
• data package
• mobile data coverage
Biscate is a USSD/SMS-based on-demand job marketplace in Mozambique. This platform allows workers with no internet access to use their own mobile devices (feature or smart phones) to dial a free shortcode (*777#) and create a profile that includes their profession, location, years of experience, education, and demographic information. Concurrently, clients can dial into this service and request the contact information of registered workers in a specific profession and location. By matching clients to workers, Biscate facilitates work opportunities for the informal labor market while collecting feedback from the users to ensure a more transparent and secure process. Additionally, Biscate stores all the data on supply and demand of informal services and securely shares it with D4WN to be processed into insights that further improve both the platform and the outcomes of the workers who benefit from it.
Com-Hector is a WhatsApp virtual assistant for on-demand workers in Mozambique. With real-time Biscate data processed by D4WN, it gives workers updated insights on labour trends (which trades are in greater demand in their area, so workers can make better decisions on their career pathways) and salary benchmarking (so that workers know how much to charge or expect to earn). After registration, users get a business card with their personal details, which they can use to advertise their services and ask clients to share via WhatsApp with others. Com-Hector also gives users recommendations on how to increase their chances of getting clients on Biscate and how to stay safe at work, and even connects workers who are interested in mentoring with those who want to learn.
At its core, D4WN is a data-sharing platform that enables the creative testing and deployment of advanced analytics and data science techniques. D4WN houses over five years of data, covering the more than 50,000 informal workers on Biscate. All this data is pipelined through an AWS database solution optimised for analytics. Program managers, product teams, analysts, and data scientists can explore the data with real-time insights through custom visualizations, queries, and data views tailored to each audience, with the ultimate goal of better understanding our users.
Particularly relevant findings are curated by our content teams and made available to end users through USSD or WhatsApp menus. We know, for example, at what times informal workers are most likely to be called by clients, and which trades are in greater demand in each location. This data is valuable to workers who do not know when peak hours are, or who have more than one trade and do not know which one is in greater demand in the market. By using free and inclusive digital channels (e.g. USSD and SMS), D4WN ensures that those who cannot afford data costs can also benefit from these data insights.
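As a rough illustration of the kind of aggregation behind such findings, the sketch below counts requests per hour and per trade in a given location. The record fields (`hour`, `trade`, `location`) and the sample records are hypothetical and invented for this example; they are not the actual D4WN schema or data.

```python
from collections import Counter

def peak_hours(requests, top_n=3):
    """Return the `top_n` hours of the day with the most client requests."""
    by_hour = Counter(r["hour"] for r in requests)
    return [hour for hour, _ in by_hour.most_common(top_n)]

def demand_by_trade(requests, location):
    """Count requests per trade within one location."""
    return Counter(r["trade"] for r in requests if r["location"] == location)

# Made-up sample records (assumed schema, not real Biscate data).
sample_requests = [
    {"hour": 9, "trade": "plumber", "location": "Maputo"},
    {"hour": 9, "trade": "plumber", "location": "Maputo"},
    {"hour": 9, "trade": "electrician", "location": "Maputo"},
    {"hour": 14, "trade": "carpenter", "location": "Beira"},
]

print(peak_hours(sample_requests, top_n=1))
print(demand_by_trade(sample_requests, "Maputo").most_common())
```

A worker with several trades could then be shown the trade with the highest count for their own location.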
D4WN is about empowering informal workers. We therefore believe they need to be at the centre of our design if we are to build the best data tools for them, and focus groups and interviews are part of our routine. We have used them to validate both our ideas and our MVPs, and the insights we obtained have helped us pivot our project in the direction that workers need. For example, after a few focus groups we saw how important it was for workers to be able to list more than one trade on Biscate. Even though we had not planned for this modification, we changed our original plans and adjusted the platform accordingly. As much as we believe in the power of data for impact, we also believe in the power of “asking our clients”.
After all, our work is social, human, and relational. Focus groups and interviews put us in touch with our clients, informal workers, so that we can tailor our data tools and all other aspects of our project to them. However, focus groups can only take us so far in increasing the empowerment of our users as they use our systems. We believe that our data tools must also be overseen from an ethics standpoint in order to maximise that empowerment as we democratise access to the data that users themselves create. To that end, we are creating an ethics committee that can hold us accountable for the results of our actions. This external ethical review of our project will help us redress any problems that neither our data analysis nor our “ask-our-client” approach is picking up on.
I ask my customers how much they are willing to pay for my service, that is how I set my prices. I would like to learn how to calculate my prices better.
I have never used Biscate as a customer - only as a worker. I did not think that I could also be a customer - don't you need money to use it as a customer?
Many customers are not serious people. I spend my money on transport to come meet them and they only want a quote, they don't have a job for me.
I struggled to choose my profession. I do many things, because that is the way to find more opportunities. Biscate should allow us to choose multiple professions.
Sometimes I receive a message from Biscate saying that a client requested my number but I never receive a call. This makes me anxious because I wait, and wait, and nothing.
These are some of the interesting things we are finding out during our data analysis; we will update this section regularly. Feel free to get in touch with us through our mailing list subscription if you want to contribute to and/or tackle any of these challenges.
Inequality Analysis - Uncovering Unexpected System Design Biases
After some analyses, we realized that there might be unexpected issues with our on-demand job marketplace, Biscate. Although the number of requests is four times the number of workers, we found that many of the registered workers (around 54%) had never received a request, and even those who had received one had to wait an excessive amount of time (50% of workers with requests waited over 7 months before receiving their first request). These surprising results led to the following research question: “How are the requests distributed among the workers on the platform?”. Through an analysis in Python, we created three graphs that help us understand the root of the above-mentioned issues.
The first graph shows the percentage of total requests received by each percentile group of workers, ranked from the most requested downwards. It reveals a stark reality: the current request distribution system is highly unequal. As we can see, the top 1% of workers receive 28% of all the requests on the platform. The share of total requests received by each subsequent group then falls sharply. This decline continues until the 45% group; every group after that has received 0% of total requests, that is, no requests at all.
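A minimal sketch of how such a share can be computed is shown below. The function name `top_share` and the sample counts are assumptions made for illustration; this is not the project's actual analysis code, and the numbers are not Biscate data.

```python
import math

def top_share(requests_per_worker, top_pct):
    """Fraction of all requests received by the top `top_pct` percent of workers."""
    counts = sorted(requests_per_worker, reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Size of the top group, rounded up so it always contains at least one worker.
    k = max(1, math.ceil(len(counts) * top_pct / 100))
    return sum(counts[:k]) / total

# Synthetic request counts for 10 workers (illustrative only).
sample_counts = [50, 20, 10, 10, 5, 5, 0, 0, 0, 0]

print(top_share(sample_counts, 10))  # share held by the single most-requested worker
print(top_share(sample_counts, 20))  # share held by the two most-requested workers
```

Calling the function for each percentile from 1 to 100 gives the cumulative version of the curve described above.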
In this second analysis, we looked at the average number of requests per worker within each group, as shown in the graph below. Again, the results are striking: there is a high level of inequality in the request distribution. The workers in the top 1% have received on average 120 requests each. The top 1% is therefore receiving an extreme number of requests compared to the global mean of 4 requests per worker and the median of 0 requests per worker. A median of 0 means that at least half of all the workers have never received a request.
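The gap between mean and median is easy to reproduce on a small skewed sample. The sketch below uses made-up counts chosen so that the mean is 4 and the median is 0, mirroring the figures quoted above; the helper name `summarize` is hypothetical.

```python
from statistics import mean, median

def summarize(requests_per_worker):
    """Return the mean, the median, and the share of workers with zero requests."""
    zero_share = sum(1 for c in requests_per_worker if c == 0) / len(requests_per_worker)
    return {
        "mean": mean(requests_per_worker),
        "median": median(requests_per_worker),
        "zero_share": zero_share,
    }

# Synthetic counts for 12 workers: 48 requests in total, 7 workers with none.
skewed_counts = [30, 10, 5, 2, 1] + [0] * 7

print(summarize(skewed_counts))
```

Because more than half of the values are zero, the middle of the sorted list is zero, even though the mean looks healthy.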
The two previous analyses point to the same insight: the current request distribution system is highly unequal, as only a small portion of the workers benefit from the Biscate platform. It is clear, then, that we need to develop a fairer request distribution system. The first step of this process is to be able to measure the level of inequality of the system and monitor its evolution after adjustments are implemented. We therefore decided to adopt the Gini coefficient, an economics concept used to measure inequality. Graphically, the Gini coefficient is the area between the cumulative share of requests (the Lorenz curve) and the line of perfect equality, divided by the total area under that line. For our system, the Gini coefficient is 0.84, close to the perfect inequality level (value of 1). We can also see that 54% of all the workers have 0% of the total requests, while 87% of all the workers receive only 20% of all the requests.
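For readers who want to compute the same metric, here is a minimal sketch using the standard closed-form expression for sorted, non-negative values. The function name `gini` and the toy inputs are assumptions for illustration; the source does not show how the 0.84 figure was computed.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # For ascending x_1..x_n:
    #   G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

print(gini([1, 1, 1, 1]))  # everyone receives the same number of requests
print(gini([0, 0, 0, 1]))  # one worker receives every request
```

With a finite number of workers the maximum of this formula is (n - 1) / n rather than exactly 1, so the value approaches 1 only as the number of workers grows.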
To better understand the reason behind this unequal distribution of opportunities (requests) among the workers registered on Biscate, it is important to analyze which design features led to these results. We believe the culprit is the way clients request a worker’s contact. On Biscate, a client selects a location and profession (e.g. Plumber in Maputo), and is given a list of all workers registered in those two categories. This list is ordered by number of requests and ratings, meaning that the workers with the most requests, and therefore the most ratings, are at the top of the list - they become super workers that receive most of the requests - while new workers with no requests end up at the bottom. To complicate matters further, USSD, the technology that powers the platform, has a character limit for each menu it displays, so only 8 workers are shown on each menu, and clients have to navigate to the second, third, and subsequent menus in order to find more workers. These extra steps inevitably lead to a decline in requests for workers further down the list, and hence to the unequal distribution of opportunities (requests). This is similar to what happens on Google, where page 1 is reported to capture a whopping 95% of all traffic, while page 2 is jokingly referred to by SEO professionals as “the best place to hide a dead body”.
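The ranking and paging behaviour described above can be sketched as follows. The exact sort key used by Biscate is not given in the text, so the `(requests, rating)` ordering, the field names, and the function `ranked_pages` are assumptions; only the limit of 8 workers per menu comes from the description.

```python
PAGE_SIZE = 8  # workers shown per USSD menu, as described above

def ranked_pages(workers, page_size=PAGE_SIZE):
    """Order workers by requests, then rating (both descending), and split into menus."""
    ranked = sorted(workers, key=lambda w: (w["requests"], w["rating"]), reverse=True)
    return [ranked[i:i + page_size] for i in range(0, len(ranked), page_size)]

# Twenty hypothetical plumbers in one city; worker 0 is new and has no requests.
plumbers = [{"id": i, "requests": i, "rating": 4.0} for i in range(20)]

pages = ranked_pages(plumbers)
print([len(page) for page in pages])
print([w["id"] for w in pages[0]])   # first menu: the most-requested workers
print([w["id"] for w in pages[-1]])  # last menu: where new workers end up
```

Under this ordering every new request pushes an already popular worker further up, so the first menu tends to stay the same over time.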
This analysis illustrates the importance of data deep dives for improving the social impact of services like Biscate. Prior to D4WN, the Biscate team was looking only at the global mean of requests per user as a key performance indicator, and was quite pleased that the average worker received 4 requests from the platform. The problem is that averages can hide what you need to know, especially when used as a KPI, and this analysis highlighted how design decisions can lead to unintended consequences - in this case, they created a biased platform. We are now busy developing a matchmaking algorithm that will distribute opportunities more evenly across all workers, in an attempt to remove selection bias and strive for equality in opportunity distribution. Furthermore, the Gini coefficient will now become a Biscate KPI to be monitored over time, because social impact is only achieved if everyone has an equal opportunity in the system. This does not mean that highly rated workers will be penalized; it will simply give new workers more opportunities to provide their services and improve their ratings. In the end, it will be up to the clients to choose which variables are most important in the hiring process, and not for the platform to condition this choice.
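The matchmaking algorithm itself is not described here. Purely as an illustration of one possible approach, and not the team's actual design, the sketch below reserves a few slots on the first menu for the workers with the fewest requests while keeping the top-ranked workers in the remaining slots. The function `mixed_first_page`, the `new_slots` parameter, and the field names are all hypothetical.

```python
def mixed_first_page(workers, page_size=8, new_slots=3):
    """Build a first menu mixing top-ranked workers with the least-exposed ones."""
    ranked = sorted(workers, key=lambda w: (w["requests"], w["rating"]), reverse=True)
    # Workers with the fewest requests so far take the reserved slots.
    least_exposed = sorted(workers, key=lambda w: w["requests"])[:new_slots]
    reserved_ids = {w["id"] for w in least_exposed}
    # Fill the remaining slots with the highest-ranked workers not already reserved.
    top = [w for w in ranked if w["id"] not in reserved_ids]
    top = top[:page_size - len(least_exposed)]
    return top + least_exposed

# Twenty hypothetical workers; lower ids have received fewer requests.
candidates = [{"id": i, "requests": i, "rating": 4.0} for i in range(20)]

first_menu = mixed_first_page(candidates)
print([w["id"] for w in first_menu])
```

Highly rated workers keep most of the first menu in this scheme, and the effect of any such change could be tracked by recomputing the Gini coefficient over time.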
Gender Analysis - Understanding Gender Bias in the Marketplace