Code of Ethics

What is this project?

Data for Democracy is partnering with Bloomberg and BrightHive to develop a code of ethics for data scientists. This code will aim to define values and priorities for overall ethical behavior, in order to guide a data scientist in being a thoughtful, responsible agent of change. The code of ethics is being developed through a community-driven approach.

By hosting discussions among data scientists, we hope to better capture the diverse interests, needs, and concerns that are at play in the community, and put together a code that is truly created by data scientists, for data scientists.

How can you contribute your ideas?

In order to be as open and transparent in this process as possible, we are leveraging the GitHub open source platform to collect ideas and suggestions from the community as a whole. This provides a quick and easy way for you to do the following:

1. *Join our Ethics Team and submit titles or links of literature/resources* that you have found useful in thinking about ethics and data science. You can do so at this link, or even through the widget embedded in the sidebar of this webpage. A GitHub ID is required; it's free, and you just need an email address to join.

2. *Submit comments on what matters to you* and what you think is important to consider when creating a data science code of ethics. You can do so at this link.

3. *Browse all the suggestions and comments* submitted by your fellow community members. All results of these surveys can be viewed by the public, including you.

4. *Indicate which suggestions and comments you agree with* through voting. The setup of All Our Ideas presents you with a pair of suggestions and then allows you to vote for the one you prefer. However, please keep in mind that none of these suggestions are mutually exclusive; we are not pitting ideas against each other or using the number of votes to eliminate suggestions. We are simply using this as one convenient metric to determine which ideas have the most resonance in the community, or which resources/literature have been useful to a large number of people. If both suggestions presented are important to you, feel free to click the "I can't decide" button. If you have a more detailed response or would like to express your thoughts on someone else's idea, you can submit this comment as a new idea of your own.

Who is a “data scientist”?

Currently, we broadly define “data scientists” to include students, civic technologists, professionals working for data-oriented firms, and others. At this stage of the project, we are interested in scoping out the concerns of anyone who interacts with data, including data users/consumers, creators, and analysts/practitioners. As the code of ethics is actually written, responsibilities will be more clearly defined.

What has been done so far?

We conducted a preliminary scan of the Data for Democracy community by posting discussion questions on Slack and Twitter and collecting feedback and input from our 2,000-plus members. We then identified the recurring themes that community members highlighted as important and arranged them in a systematic framework.

The key areas of concern identified were as follows:

The data itself

Includes overall practices in collecting, storing, and distributing data, as well as understanding and minimizing intrinsic bias in data.

Questions and problems

Includes identifying valuable and relevant problems to work on, and working with pre-existing resources and parties in those fields.

Algorithms and models

Includes understanding and minimizing bias in algorithms and models, and working responsibly with black-box algorithms.

Technological products and applications

Includes taking responsibility for how one’s research is applied, and identifying and guarding against the potential for misuse.

Community

Includes fostering a data science community that is inclusive and deliberately promotes equity and representation, as well as finding ethical, non-invasive ways to track progress toward this goal.

What comes next?

We will use the findings from this preliminary scan to frame and develop more targeted discussion questions, which will be posed to the greater data science community. We hope to reach 100,000 data scientists, and gather more community input and existing examples of similar work.

How can you take part?

As an individual data scientist

Join the conversation by responding to the discussion questions that will be posted on our Twitter in the coming months, and joining the dedicated Slack channel at p-code-of-ethics. Email us at team@datafordemocracy.org to request a Slack invitation.

As an organization/institution

We will be conducting biweekly virtual focus groups throughout October and November. These will be working sessions discussing the five highlighted areas of concern, as well as other questions and concerns that have been raised. You can send a representative to participate in these discussions via conference call. Contact Natalie Evans Harris if you’re interested in partnering with us.

As anyone who has made use of or worked on a code of ethics

Please contact us through any of the aforementioned channels if you would like to share any literature or content that you already have! We’re aware that many tech groups have already been working on codes of conduct, and we’d love to amplify these voices and build on these efforts, rather than trying to reinvent the wheel.

Want to learn more about this project?

Bloomberg press release

Bloomberg blog post

Q and A with D4D lead

Project Leads:
Lilian Huang

