9 August 2021
LONDON, UK – Synthesized, the leading all-in-one DataOps platform, has today released the world’s first data-centric open-source library for identifying data bias and driving fairness in decision-making.
FairLens is an open-source Python library which allows data scientists to automatically discover hidden biases and measure fairness in data.
FairLens is available now on GitHub, where Python developers can integrate it into their workflows, collaborate, or help develop new features.
Denis Borovikov, co-founder and chief technology officer at Synthesized, said: “The goal of FairLens is to enable data practitioners to gain a deeper understanding of their data, and to help ensure fair and ethical use of data in analysis and data science tasks.
“FairLens is a mathematical framework that we have developed as a starting point for discovering data bias. Synthesized is now calling on developers to get involved, to expand the ways in which bias and fairness can be tackled, and to enhance and improve FairLens.”
Nicolai Baldin, co-founder and chief executive of Synthesized, said: “Many data science models rely on biased and skewed datasets. What we have created, with FairLens, is a way to highlight the data bias so it’s easily discovered and visualised. While data bias is still a taboo subject for many companies and industries, what FairLens enables is a behind-the-scenes discovery of data bias, which can then be mitigated.
“With the help of the developer and data science communities, and our machine learning technology, we can build and enhance FairLens collectively, driving adoption across organisations and communities. By doing so, we hope to make a valuable impact on society.”
Synthesized’s package provides multiple metrics to measure biases across a range of protected characteristics such as age, race, sex and other groupings. When datasets are limited, poor-quality or skewed, data-driven applications often fail to achieve their intended purpose because they are inherently biased.
FairLens can be tried in Synthesized’s SDK Colab alongside other platform functionalities. It comes with core features which allow users to:
Measure bias – FairLens can be used to measure the extent and significance of biases in datasets using a wide range of metrics.
Identify sensitive attributes – Data scientists can automatically identify and flag sensitive columns, and hidden correlations between columns, to protect sensitive attributes.
Visualise bias – Users can visualise, for example, the distribution of a variable with respect to different sensitive demographics, or a correlation heatmap.
Score fairness – Data scientists can highlight hidden biases and correlations within a dataset by selecting a target variable.
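For readers who want a concrete sense of what "measuring bias" against a target variable can mean, the following is a minimal sketch in plain Python. It does not use or reproduce the FairLens API. The function names, the column names "sex" and "approved", and the toy records are all hypothetical and chosen purely for illustration. It computes the rate of positive outcomes for each group of a sensitive attribute and reports the gap between the highest and lowest rate, one simple and commonly used indicator of disparity.

```python
from collections import defaultdict


def positive_rate_by_group(rows, sensitive, target):
    """Return {group: share of rows in that group whose target is truthy}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        group = row[sensitive]
        totals[group] += 1
        if row[target]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Difference between the highest and lowest group rate (0 means parity)."""
    return max(rates.values()) - min(rates.values())


# Made-up records for illustration only; not real data.
rows = [
    {"sex": "F", "approved": 1},
    {"sex": "F", "approved": 0},
    {"sex": "F", "approved": 0},
    {"sex": "F", "approved": 0},
    {"sex": "M", "approved": 1},
    {"sex": "M", "approved": 1},
    {"sex": "M", "approved": 1},
    {"sex": "M", "approved": 0},
]

rates = positive_rate_by_group(rows, sensitive="sex", target="approved")
gap = parity_gap(rates)
print(rates)  # approval rate per group
print(gap)    # gap between the best- and worst-treated group
```

In this toy dataset the approval rate is 0.25 for one group and 0.75 for the other, a gap of 0.5. A library such as FairLens is described as offering a wider range of metrics, along with significance testing and visualisation, on top of this basic idea.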