After formulating an experimental plan, we needed datasets to run those experiments on. A good dataset for this task has three requirements: it must be non-trivial in size, it must include a protected attribute (such as gender or race), and it must have some true ranking. While it is possible to find a dataset that satisfies two of the three requirements, it is difficult to find one that satisfies all of them.
Drawing on a dataset from Rankit, we added a list of FIFA 2018 players as a candidate. It contains 17,981 players, has protected attributes (age and nationality), and potentially has a true ranking, which could be based on goals scored or on money earned by the player. Some ranking data is also available for this dataset, although it does not cover all of the players.
Hospitals and doctors also come with a significant amount of data, in terms of both attributes and entities. However, finding a true ranking for them might prove to be impossible.
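The three requirements above can be turned into a quick screening check for candidate datasets. The following is only a minimal sketch, not code from our project: the function name `screen_dataset`, the column names, and the `min_rows` threshold are all assumptions made for illustration, and the example rows are made up.

```python
from typing import Any, Dict, Iterable, Mapping, Sequence


def screen_dataset(
    rows: Iterable[Mapping[str, Any]],
    protected_attrs: Sequence[str],
    ranking_col: str,
    min_rows: int = 1000,  # hypothetical cutoff for "non-trivial in size"
) -> Dict[str, Any]:
    """Report how a candidate dataset fares on the three requirements."""
    rows = list(rows)
    n = len(rows)

    # Requirement 1: the dataset is non-trivial in size.
    big_enough = n >= min_rows

    # Requirement 2: at least one protected attribute is present and
    # actually varies (a column with a single value is useless here).
    has_protected = any(
        len({r.get(a) for r in rows if r.get(a) is not None}) > 1
        for a in protected_attrs
    )

    # Requirement 3: share of entities that have a true-ranking value.
    ranked = sum(1 for r in rows if r.get(ranking_col) is not None)
    ranking_coverage = ranked / n if n else 0.0

    return {
        "big_enough": big_enough,
        "has_protected": has_protected,
        "ranking_coverage": ranking_coverage,
    }


# Made-up rows with hypothetical column names, for illustration only.
example_rows = [
    {"name": "Player A", "nationality": "X", "goals": 12},
    {"name": "Player B", "nationality": "Y", "goals": None},
]
report = screen_dataset(example_rows, ["nationality"], "goals", min_rows=2)
```

Reporting ranking coverage as a fraction rather than a yes/no flag matches the situation with the FIFA list, where ranking data exists but does not cover every player.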
Taking into account the feedback received from our mentors, we updated the section analyzing the outcome of the online user study. We also updated the machine learning section to include more references and added more charts throughout the paper.
The team also discussed the next steps and identified new features to implement.
With the Rankit paper submitted, it was time for me to change gears and dive into the research with MaryAnn.
For me, this week revolved around getting up to speed with the fair ranking research. I read over the current in-progress fair ranking paper and attended meetings where MaryAnn and Caitlin helped familiarize me with the code base and the research they have completed so far.