TRUST BUT VERIFY:
A GUIDE TO ALGORITHMS AND THE LAW

Deven R. Desai* and Joshua A. Kroll**

TABLE OF CONTENTS

I. INTRODUCTION
II. ALGORITHMS: THE CONCERNS
   A. Transparency and Accountability: Two Complementary Views
   B. Algorithms, Public Sector Concerns
   C. Algorithms, Private Sector Concerns
III. ALGORITHMS: A PRIMER
IV. TO HALT OR NOT TO HALT
   A. Undecidability and the Halting Problem
   B. The Halting Problem Applied to Algorithmic Transparency
V. PRACTICAL SOLUTIONS FROM COMPUTER SCIENCE
   A. Testing and Evaluating Algorithms
      1. White-Box Testing
      2. Black-Box Testing
      3. A Third Way: Ex-Post Analysis and Oversight
   B. Dynamic Systems and the Limits of Ex-Post Testing
VI. A TAXONOMY OF POTENTIAL SOLUTIONS
   A. Public Systems
   B. Private Systems
      1. Explicitly Regulated Industries
      2. Building Trust: Implicitly Regulated Industries or Activities
      3. The Challenge of Dynamic Systems
   C. Legislative Changes to Improve Accountability
VII. CONCLUSION

* Associate Professor of Law and Ethics, Georgia Institute of Technology, Scheller College of Business; J.D., Yale Law School; Affiliated Fellow, Yale Law Information Society Project; former Academic Research Counsel, Google, Inc. I, and this Article, have benefitted from discussions with and input from Solon Barocas, Ariel Feldman, Brett Frischmann, Andrew Selbst, and Peter Swire, and from attendees at Privacy Law Scholars Conference, 2016 at George Washington University Law and at the Law and Ethics of Big Data Colloquium at University of Indiana, Bloomington, Kelley School of Business. I thank Jason Hyatt for excellent research assistance. This Article was supported in part by summer research funding from the Scheller College of Business and an unrestricted gift to the Georgia Tech Research Institute by Google, Inc. The views expressed herein are those of the author alone and do not necessarily reflect the view of those who helped with and supported this work.

** Postdoctoral Research Scholar, UC Berkeley School of Information.


I. INTRODUCTION

According to my definition, a number is computable if its decimal can be written down by a machine.

— Alan Turing^1

In 1953, Henry Rice proved the following extremely powerful theorem, which essentially states that every interesting question about the language accepted by a Turing machine is undecidable.

— Jeff Erickson^2

The next time you hear someone talking about algorithms, replace the term with “God” and ask yourself if the meaning changes. Our supposedly algorithmic culture is not a material phenomenon so much as a devotional one, a supplication made to the computers people have allowed to replace gods in their minds, even as they simultaneously claim that science has made us impervious to religion.

— Ian Bogost^3

Someone is denied a job.^4 A family cannot get a loan for a car or a house.^5 Someone else is put on a no-fly list.^6 A single mother is denied federal benefits.^7 None of these people knows why that happened, other than that the decision was processed through some software.^8 Someone commandeers a car, controls its brakes, and even drives away.^9 A car

1. A. M. Turing, On Computable Numbers, with an Application to the Entscheidungsproblem, 42 PROC. LONDON MATHEMATICAL SOC’Y 230, 230 (1936).
2. JEFF ERICKSON, MODELS OF COMPUTATION 10 (2015) (ebook).
3. Ian Bogost, The Cathedral of Computation, THE ATLANTIC (Jan. 15, 2015), http://www.theatlantic.com/technology/archive/2015/01/the-cathedral-of-computation/384300/ [https://perma.cc/AA6T-3FWV].
4. See, e.g., FRANK PASQUALE, THE BLACK BOX SOCIETY: THE SECRET ALGORITHMS THAT CONTROL MONEY AND INFORMATION 34–35 (2015) (describing use of software and online data to make hiring decisions).
5. See, e.g., id. at 4–5 (discussing use of predictive analytics in credit scoring and loan decisions).
6. See, e.g., Danielle Keats Citron, Technological Due Process, 85 WASH. U. L. REV. 1249, 1256–57 (2008).
7. See, e.g., id.
8. See, e.g., PASQUALE, supra note 4, at 4–5 (explaining that one “will never understand exactly how [one’s credit score] was calculated”); infra Part II.
9. At least two groups have shown ways to take over a Tesla and open its doors, open its sunroof, and enable keyless driving so the car could be stolen. See Davis Z. Morris, Tesla Stealing Hack Is About Much More than Tesla, FORTUNE (Nov. 26, 2016), http://fortune.


learning (“ML”) (an area of computer science that uses the automated discovery of correlations and patterns to define decision policies) might allow those who use such techniques to wield power in ways society prohibits or should disfavor, but which society would not be able to detect.^18 Further, if a computer yields undesired results, its programmers may say that the system was not designed to act that way.^19

The standard solution to this general problem is a call for transparency, which in this context has been called “algorithmic transparency.”^20 We argue that although the problems are real, the proposed solution will not work for important computer science reasons. Nonetheless there is, and we offer, a way to mitigate these problems so that society can continue to benefit from software innovations.

Put simply, current calls for algorithmic transparency misunderstand the nature of computer systems. This misunderstanding may flow in part from the religious, devotional culture around algorithms, where algorithms might as well be God.^21 Both critics and advocates can stray into uncritical deference to the idea that big data and the algorithms used to process the data are somehow infallible science. We believe this problem is aggravated because although algorithms are decidedly not mystical things or dark magic, algorithms are not well understood outside the technical community.^22

Put differently, transparency is a powerful concept and has its place. After all, who can argue against sunlight? And yet, to an extent, we do exactly that, because from a technical perspective, general calls to expose algorithms to the sun or to conduct audits will not only fail to deliver critics’ desired results but also may create the illusion of clarity

18. See, e.g., FED. TRADE COMM’N, BIG DATA: A TOOL FOR INCLUSION OR EXCLUSION? 1 (2016); CTR. FOR INTERNET & HUMAN RIGHTS, supra note 16, at 1; PASQUALE, supra note 4, at
19. Cf. Solon Barocas & Andrew D. Selbst, Big Data’s Disparate Impact, 104 CALIF. L. REV. 671, 674–75 (2016) (explaining that algorithms may unintentionally increase discrimination because of problems with data mining).
20. See, e.g., Katherine Noyes, The FTC Is Worried About Algorithmic Transparency, and You Should Be Too, PC WORLD (Apr. 9, 2015, 8:36 AM), http://www.pcworld.com/article/2908372/the-ftc-is-worried-about-algorithmic-transparency-and-you-should-be-too.html [https://perma.cc/N3Z2-5M3E] (discussing Christian Sandvig’s view that transparency may not be viable because of the complexity of some algorithms and the data needed to test the algorithms). For Sandvig’s position on algorithmic transparency, see Christian Sandvig et al., Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms (May 22, 2014), http://www-personal.umich.edu/~csandvig/research/Auditing%20Algorithms%20--%20Sandvig%20--%20ICA%202014%20Data%20and%20Discrimination%20Preconference.pdf [https://perma.cc/GJS4-YWP3] (using “social scientific study” auditing to investigate algorithmically driven platforms).
21. See Bogost, supra note 3.
22. See id. (“The next time you hear someone talking about algorithms, replace the term with ‘God’ and ask yourself if the meaning changes. Our supposedly algorithmic culture is not a material phenomenon so much as a devotional one.”); see also Joshua A. Kroll et al., Accountable Algorithms, 165 U. PA. L. REV. 633, 640 n.14 (2016) (“The term ‘algorithm’ is assigned disparate technical meaning in the literatures of computer science and other fields... .”).


in cases where clarity is not possible.^23 For example, as discussed infra, fundamental limitations on the analysis of software meaningfully limit the interpretability of even full disclosures of software source code. This Article thus examines the idea of algorithmic transparency, offers a primer on algorithms as a way to bridge this gap, and presents concrete options for managing problems automated decision-making presents to society.
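
To make the nature of those limitations concrete before Part IV takes them up in detail, consider the standard diagonal argument that no general procedure can decide what an arbitrary program will do. The sketch below is ours and purely illustrative (Python, with invented names); it shows that any candidate analyzer claiming to predict whether a program halts can be turned into a program on which that analyzer is wrong.

    def make_paradox(halts):
        """Given any candidate analyzer halts(f), which claims to report
        whether calling f() eventually returns, build a function that the
        analyzer must get wrong."""
        def paradox():
            if halts(paradox):   # analyzer predicts paradox() halts...
                while True:      # ...so paradox() loops forever instead.
                    pass
            else:                # analyzer predicts paradox() loops...
                return           # ...so paradox() halts immediately.
        return paradox

    # Demonstration with one (necessarily mistaken) analyzer:
    always_says_loops = lambda f: False
    p = make_paradox(always_says_loops)
    p()  # returns at once, contradicting the analyzer's "it loops" verdict.
    # No choice of halts() escapes this trap, so even full access to source
    # code cannot answer every question about how a program will behave.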

This Article begins with a discussion of the law and policy concerns over software systems that have been raised so far and some of the proposed approaches to addressing these concerns. This discussion shows that there are many different issues at play, and many of those issues are proxies for concerns about power and inequality in general, not software specifically. After setting out an understanding of the claimed problems, the Article turns to some fundamental questions about computer science, such as what an algorithm is and whether policy can be general enough to cover all software in the same way.^24 Having set out a brief primer on the underlying computer science, the Article addresses the question of determining what a piece of software will do when it is run. It turns out that it is impossible to determine this reliably and for all programs. With that in mind, the Article reviews the way in which computer scientists have addressed this problem. Using that foundation, we offer recommendations on how to regulate public and private sector uses of software, and propose a legislative change to protect whistleblowers and allow a public interest cause of action as a way to aid in increasing detection of overt misdeeds in designing software. In short, a better understanding of how programs work and how computer scientists address the

23. It is common for open-source advocates to cite “Linus’s Law” — the dictum that “[g]iven enough eyeballs, all bugs are shallow” — meaning that transparency of code to a sufficient number of experts implies that any problem will seem obvious to someone and can be remedied. ERIC S. RAYMOND, THE CATHEDRAL AND THE BAZAAR 9 (1999). However, many pieces of open-source software struggle to get the attention of enough skilled reviewers, creating what has been called a “major eyeball shortage.” Edward W. Felten & Joshua A. Kroll, Heartbleed Shows Government Must Lead on Internet Security, SCI. AM. (July 1, 2014), https://www.scientificamerican.com/article/heartbleed-shows-government-must-lead-on-internet-security1/ [https://perma.cc/X7WD-CUZJ]. The much-publicized “Heartbleed” vulnerability provides an object lesson: a trivially simple coding error in a widely used security tool exposed the private information of the majority of web servers on the Internet, including encryption keys, website passwords, and sensitive user information, for several widely used web applications. For a detailed account, see Zakir Durumeric et al., The Matter of Heartbleed, 14 ACM INTERNET MEASUREMENT CONF. 475 (2014). Contrast the situation in systems that use machine learning, where the rule that is being applied by a model may not be understandable, even if the system’s developer knows that the model is performing well. See Jatinder Singh et al., Responsibility & Machine Learning: Part of a Process (Oct. 27, 2016), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2860048 [https://perma.cc/EDG2-TNTF] (“[A]lgorithmic selection not only impacts the quality of the ML model, but the degree to which the inner workings of the ML algorithm and learned model can be interpreted and controlled depends on the technique used.” (emphasis omitted)).
24. See, e.g., Sandvig et al., supra note 20, at 3.


that use computers. Despite commenters’ different methods and concerns, transparency is often raised as a key part of managing this new order, because one “cannot access critical features of [its] decision-making processes.”^30 And yet consensus on what sort of scrutiny is needed, whether different areas affected by computers require different solutions, and whether software, other factors, or both are the cause of the claimed problems, is lacking. These issues arise in both the public and private sector contexts. We start by examining transparency and accountability from a legal-political view and a computer science view, proceed to some examples of the public sector concerns, and then turn to private sector ones.

A. Transparency and Accountability: Two Complementary Views

Both legal-political and computer science scholars wish to ensure that automated decision systems are not enabling misdeeds or generating undesired outcomes. Both fields use the terms transparency and accountability, but have different meanings and functions for them. This vocabulary clash can muddy the understanding of what is desired as an end result and of how to create systems to realize whatever end result is agreed upon. As such, this section parses the different, yet related, aspects of the terms.

There is a deep, unstated, and powerful view in the law and much of society in general: the builder of the proverbial better mousetrap will know precisely how it was built and what will happen when one presses a trigger or button in the invention.^31 The device will do the same thing over and over until the springs wear out. The related presumption is that if a layperson who did not build the mousetrap has the plans, that person, perhaps with the aid of a hired expert, will be able to understand how the mousetrap works and probe the plans for flaws.^32 The same, so reasons the law, must be true of software. As we shall see, in many ways it is, and in some ways it is not. Nonetheless, the key idea is that seeing the

30. PASQUALE, supra note 4, at 17. The call for or desire to have transparency about the inner workings of automated decision systems as a way to resolve issues around outcomes from those systems can be strong. For example, Professor Latanya Sweeney has done work on racial discrimination and advertising. See Sweeney, supra note 13. As Professor Cynthia Dwork noted when interviewed about that work, “[t]he examples described in that paper raise questions about how things are done in practice.” See Claire Cain Miller, Algorithms and Bias, Q&A with Cynthia Dwork, N.Y. TIMES: THE UPSHOT (Aug. 10, 2015), https://www.nytimes.com/2015/08/11/upshot/algorithms-and-bias-q-and-a-with-cynthia-dwork.html (last visited Dec. 19, 2017). We note this point only to indicate the draw of transparency, not to argue that Professor Sweeney advocates one way or the other on that strategy.
31. Cf. DAVID NYE, TECHNOLOGY MATTERS: QUESTIONS TO LIVE WITH 162 (2006) (“By c. 1900 [people] increasingly demanded scientific explanations.... Engineers could explain how a bridge failed or what component of a steam engine was responsible for a burst boiler.”).
32. Id. (“Formerly inscrutable events became legible to the safety engineer, the tort lawyer, the insurance agent, and the judge.”).


internals of a system leads to understanding of the workings of that system and the consequences associated with the system’s operation. This is the core of the law’s conception of transparency, and it is deployed to serve many goals.

Transparency has been proposed as a solution to mitigate possible undesired outcomes from automated decision-making. Even before today’s fascination with big data, algorithms, and automated systems, legal scholars such as Paul Schwartz and Danielle Citron identified important ways in which data processing and software used in the administrative state can undermine or take away due process rights.^33 A related fear is that the human designer of a program could have bad intent and seek to discriminate, suppress speech, or engage in some other prohibited act.^34 Transparency in this context is the claim that someone “ought to be able to ‘look under the hood’ of highly advanced technologies like... algorithms”^35 as a way to police such behavior.^36 It is the idea that with the plans to the mousetrap, one can identify flaws, willful errors, and perhaps ferret out undesired and possibly unintended results such as discrimination in a hiring decision.^37 Thus, different critics audit parts of systems, decry apparent discrimination, and want to hold someone responsible for bad outcomes.^38 This approach is tantamount to saying that the way to get proof that the algorithm is not designed to engage in, nor has parts of it that lead to, discrimination or other undesired or prohibited acts is to review its internals.^39 From this proof, then, comes the ability to hold the system’s creator or operator accountable.

Put differently, if one can see into the system and identify bad behaviors and outcomes, one can hold someone accountable in the legal-political sense of the word. Accountability as a legal-political concept is about openness regarding the way government operates so that the people may know what its representatives are doing, hold government responsible, and participate in government by voting, expressing their views to representatives and regulators, or influencing policymaking in a

33. Paul Schwartz, Data Processing and Government Administration: The Failure of the American Legal Response to the Computer, 43 HASTINGS L.J. 1321, 1343–74 (1992); Citron, supra note 6, at 1281–88 (discussing how automation threatens citizens’ due process rights such as notice and opportunity to be heard).
34. See infra Part II.B.
35. PASQUALE, supra note 4, at 165; cf. Citron, supra note 6, at 1308 (“Automated systems must be designed with transparency and accountability as their primary objectives, so as to prevent inadvertent and procedurally defective rulemaking.... [V]endors should release systems’ source codes to the public. Opening up the source code would reveal how a system works.”).
36. But see Noyes, supra note 20.
37. Cf. Barocas & Selbst, supra note 19, at 694 (arguing that “[c]urrent antidiscrimination law is not well equipped to address the cases of discrimination stemming from” data mining and related algorithmic practices in employment).
38. See infra Section II.B.
39. See infra Section II.B.


socially important computer systems have a large range of possible inputs and outputs, social science auditing methods can only test “a small subset of those potential inputs.”^50 As a legal matter, using such methods to determine whether a prohibited practice has occurred to an actionable extent presents problems, because it is difficult to measure which inputs are the important ones to test.^51
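
A small sketch may help show both how such audits work and why they cover so little of the input space. The code below is ours, with invented field names and a toy decision function standing in for an opaque system; the method, re-submitting paired inputs that differ only in one attribute, mirrors the auditing studies discussed above.

    import random

    def audit_pairs(decide, applicants, field, values, trials=100):
        """Black-box audit: re-submit each sampled applicant with the chosen
        field set to each value, and count how often the opaque decision
        flips."""
        sample = random.sample(applicants, min(trials, len(applicants)))
        flips = 0
        for person in sample:
            outcomes = {decide(dict(person, **{field: v})) for v in values}
            flips += len(outcomes) > 1
        return flips / len(sample)

    # Toy stand-in for a system the auditor cannot inspect.
    def decide(a):
        return a["income"] > 40000 or (a["gender"] == "M" and a["income"] > 35000)

    applicants = [{"zip": str(30000 + i), "income": 30000 + 200 * i,
                   "gender": random.choice(["F", "M"])} for i in range(1000)]
    print(audit_pairs(decide, applicants, "gender", ["F", "M"]))
    # A few hundred probes reveal some gender-dependent flips here, but they
    # still cover a vanishing fraction of the possible inputs, and they say
    # nothing about which untested inputs matter legally.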

In addition, handing over code often will not enable the political accountability results those in favor of so-called algorithmic transparency desire.^52 For example, simply disclosing or open-sourcing source code does nothing to show that the disclosed software was used in any particular decision unless that decision can be perfectly replicated from the disclosures. And that is rarely the case. With this in mind, we turn to the computer science view of accountability.

Computer science accountability, by contrast, is a technical concept about making sure that software produces evidence allowing oversight and verification of whether it is operating within agreed-upon rules.^53 For example, if one cannot determine whether a credit bureau adhered to the Fair Credit Reporting Act’s restrictions on data gathering and use, one would have trouble detecting whether the credit bureau disobeyed the law.^54 Similarly, if one wishes to ensure that a system for counting votes or allocating visas in a lottery is doing what it is supposed to do, one needs a meaningful technical way to look under the hood.^55 Thus, “technical accountability” requires that systems generate reliable evidence to verify the system functioned correctly. Such evidence must have integrity, which in this context means the production of “a tamper-evident record that provides non-repudiable evidence” of relevant actions by the automated system.^56 Such evidence would provide records of what actions were taken and why, with a focus on how that evidence will be used to hold the system’s creators or operators accountable for those actions. Of course, evidence that satisfies the requirements of technical accountability by itself does not provide legal-political

50. Kroll et al., supra note 22, at 650.
51. See infra Part IV.
52. See infra Part IV.
53. See, e.g., Andreas Haeberlen, Petr Kuznetsov & Peter Druschel, PeerReview: Practical Accountability for Distributed Systems, 41 ACM SIGOPS OPERATING SYS. REV. 175, 175 (2007) (“[A]n accountable system maintains a tamper-evident record that provides non-repudiable evidence of all nodes’ actions. Based on this record, a faulty node whose observable behavior deviates from that of a correct node can be detected eventually. At the same time, a correct node can defend itself against any false accusations.”); Kroll et al., supra note 22, at 662–65.
54. Cf. Daniel J. Weitzner et al., supra note 40, at 86 (advocating for technical architecture that enables detecting whether someone has complied with laws governing data use such as in the credit bureau context).
55. See, e.g., Halderman, supra note 11.
56. Haeberlen et al., supra note 53, at 175.


accountability.^57 Rather, technical accountability enables legal-political accountability by providing a way to understand whether, how, to what extent, and why misdeeds occurred, as well as who (or what part of a system) is responsible for them. It is then up to political and legal processes to use that evidence to hold actors responsible, meting out punishments when warranted. And as Kroll et al. note, where rules cannot be agreed upon in advance because of the necessary ambiguities in policymaking, oversight may be necessary as the mechanism for defining the specific contours of rules ex post, even as system designers do their best to comply with their view of the rules ex ante.^58
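
What a tamper-evident record can look like is easier to see in miniature. The sketch below is ours, not the PeerReview protocol cited above; it hash-chains log entries so that any later edit or deletion is detectable. A production system would add digital signatures to make the record non-repudiable and an external witness to prevent wholesale rewriting.

    import hashlib, json

    def append(log, action):
        """Append an action, binding it to the hash of the prior entry."""
        prev = log[-1]["digest"] if log else "0" * 64
        digest = hashlib.sha256(
            json.dumps({"action": action, "prev": prev},
                       sort_keys=True).encode()).hexdigest()
        log.append({"action": action, "prev": prev, "digest": digest})

    def verify(log):
        """Recompute the chain; any edited or deleted entry breaks it."""
        prev = "0" * 64
        for entry in log:
            expected = hashlib.sha256(
                json.dumps({"action": entry["action"], "prev": prev},
                           sort_keys=True).encode()).hexdigest()
            if entry["prev"] != prev or entry["digest"] != expected:
                return False
            prev = entry["digest"]
        return True

    log = []
    append(log, "pulled credit report for applicant 1142")   # toy entries
    append(log, "denied application 1142: debt-to-income ratio")
    print(verify(log))                        # True: record is intact
    log[0]["action"] = "something innocuous"  # after-the-fact tampering
    print(verify(log))                        # False: overseer can detect it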

In short, technical accountability is a necessary step to enable political accountability. For it is only after one has verifiable evidence, relevant to the inquiry at hand, that one can have the possibility of holding both public and private actors who use automated systems responsible for their actions.^59

None of the above means those who use computerized decision-making are ungovernable, but it does require that we understand what is and is not possible when we seek to regulate or monitor the use of these technologies.^60 Many of the current calls for transparency as a way to regulate automation do not address such limits, and so they may come up short on providing the sort of legal-political accountability they desire, and which we also support.^61 Instead, as software (and especially machine learning systems, which separate the creation of algorithms and rules from human design and implementation) continues to grow in importance, society may find, and we argue, that identifying harms, prohibiting outcomes, and banning undesirable uses is a more promising path.^62 In addition, in some cases, we argue that society may require that software be built to certain specifications that can be tested or verified.
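
That last suggestion can be read as asking for a testable specification: the required property is written down ex ante and conformance is checked mechanically against the running system. The sketch below is ours, with an invented policy and field names, and a deliberately noncompliant toy system so the check has something to find.

    import random

    def satisfies_spec(applicant, decision):
        # Agreed-upon rule: anyone meeting the published criteria is approved.
        qualifies = applicant["income"] >= 30000 and applicant["defaults"] == 0
        return decision or not qualifies

    def decide(applicant):
        # Toy stand-in for the vendor's system; note the extra age cutoff
        # that the published specification does not allow.
        return (applicant["income"] >= 30000 and applicant["defaults"] == 0
                and applicant["age"] >= 21)

    violations = []
    for _ in range(10000):  # property-based check over random inputs
        a = {"income": random.randint(0, 100000),
             "defaults": random.randint(0, 3),
             "age": random.randint(18, 80)}
        if not satisfies_spec(a, decide(a)):
            violations.append(a)

    print(len(violations), "spec violations found")
    # The undisclosed age cutoff surfaces without anyone reading the
    # system's source code.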

57. See Weitzner et al., supra note 40, at 84 (“Transparency and accountability make bad acts visible to all concerned. However, visibility alone does not guarantee compliance.”).
58. Kroll et al., supra note 22, at 678.
59. See Weitzner et al., supra note 40, at 84 (“[W]e are all aware that we may be held accountable through a process that looks back through the records of our actions and assesses them against the rules.”).
60. See infra Parts III, IV.
61. As one group of computer scientists has noted within machine learning, “[s]ome ML algorithms are more amenable to meaningful inspection... and management than others.” Singh et al., supra note 23, at 4 (offering that decision tree, naïve Bayes, and rule learners were the most interpretable, [k-nearest neighbors] was in the middle, and neural networks and support vector machines were the least interpretable).
62. Cf. FED. TRADE COMM’N, supra note 18, at 5–12 (acknowledging potential beneficial and negative outcomes from using data analytics and noting that it is the use of data and data analytics in certain areas such as housing, credit, and employment that triggers concerns and potential liability, not the use of data analytics alone).


and data processing in that one needs to show a “connection between a particular decision, given the factual context, and the accomplishment of one or more of the decision maker’s goals.”^70 The dignity element requires that those who are subject to such a process know or understand what reasons are behind a decision.^71 Without that knowledge or understanding, those subject to the decision-making process lose self-worth, and over time the legitimacy of the system will be in doubt, because of the lack of understanding and loss of dignity.^72

Danielle Citron’s work also calls out the way that computers have been used in the administrative state, focusing on due process concerns.^73 She describes the “automated administrative state”^74 as using software to determine whether someone should receive “Medicaid, food stamp, and welfare” benefits, be on a no-fly list, or be identified as owing child support.^75 According to Citron, “[a]utomation jeopardizes the due process safeguards owed individuals and destroys the twentieth-century assumption that policymaking will be channeled through participatory procedures that significantly reduce the risk that an arbitrary rule will be adopted.”^76

Although these scholars use different metrics to argue that the use of software and computers is a problem, both identify the problem sphere as the administrative state.^77 And both Schwartz and Citron look to transparency as part of how to address whether the state uses data and software-based processes in a way that hinders the ability to know what is happening within the system.^78 To be clear, given the nature of the administrative state, both scholars are correct that transparency is a normal, required part of due process. That is why Citron’s point-by-point examination of the Administrative Procedure Act (“APA”), the

70. MASHAW, supra note 46, at 49.
71. See Schwartz, supra note 33, at 1348–49.
72. See id.
73. Citron, supra note 6, at 1256–57.
74. Id. at 1281.
75. Id. at 1256–57.
76. Id. at 1281.
77. Scholars have examined software and accountability in the computer science context and found that the administrative state is a prime example of where algorithmic governance is needed. See Kroll et al., supra note 22, at 674–76 (using government visa lottery programs as an example where the use of algorithms intersects with the application of specific rules for decision-making that affect individual rights); see also Michael Veale, Logics and Practices of Transparency and Opacity in Real-World Applications of Public Sector Machine Learning, 4 WORKSHOP ON FAIRNESS ACCOUNTABILITY & TRANSPARENCY MACHINE LEARNING, at 2 (2017), https://arxiv.org/pdf/1706.09249.pdf [https://perma.cc/8X2L-RXZE] (explaining how developers of public-sector machine learning tools use transparency and accountability measures to build consensus and approval for those tools among both colleagues and people at large).
78. Schwartz, supra note 33, at 1375 (calling for “[t]he maintenance of transparent information processing systems”); Citron, supra note 6, at 1295 (noting lack of ability for “meaningful review” of rules and system put in place to deliver administrative state services).


Freedom of Information Act (“FOIA”), and similar state laws is powerful.^79 The APA requires that new rules undergo notice and comment,^80 and FOIA requires that the public have access to “basic information about the conduct of agencies.”^81 But opaque code and the process behind developing code challenge the way in which “procedural due process and formal and informal rulemaking provided a common structure for debating and addressing concerns about the propriety of administrative actions.”^82 Thus, these problems force the question of what is, as Citron puts it, “Technological Due Process”? Citron offers that “[a]utomated systems must be designed with transparency and accountability as their primary objectives, so as to prevent inadvertent and procedurally defective rulemaking.”^83 Of late, legislators have started to look to public participation before software is built for public systems and to open code to facilitate “algorithmic audits” so people can test how an algorithm operates.^84

As discussed infra, we hope to add to the idea of accountability, and to what Jenna Burrell has explained are the problems of opacity that arise for algorithms that operate at a certain scale of application and the limits of interpretability that go with that scale.^85 Specifically, we seek to challenge whether disclosure of software source code and data alone facilitates debating the function of that code. In that sense, Mashaw, Schwartz, and Citron, in different, connected ways, raise deep questions about whether and how the use of software by the state is compatible with due process.

Two other examples, voting machines and the auto industry, illustrate a different, but related, public sector concern: verifying that a system is accurate in implementing its goals and works as desired. Voting machines track votes, and so the process in which the machines are used must be accurate about at least four things. First, the process must be accurate about whether someone voted. That is, one might try to hijack an election by making it seem like someone, or many people, voted when in fact they never voted at all. Typically, humans handle this step by asking voters to sign logbooks. Second, voting machines themselves need to verify that the person voting was eligible to vote before record-

79. Citron, supra note 6, at 1288.
80. See id. at 1289–91.
81. Id. at 1291.
82. Id. at 1252.
83. Id. at 1308.
84. See, e.g., Jim Dwyer, Showing the Algorithms Behind New York City Services, N.Y. TIMES (Aug. 24, 2017), https://www.nytimes.com/2017/08/24/nyregion/showing-the-algorithms-behind-new-york-city-services.html (last visited Dec. 19, 2017) (discussing a New York City councilman’s bill to mandate that computer code used for government decision making be open for inspection and testing).
85. Jenna Burrell, How the Machine ‘Thinks’: Understanding Opacity in Machine Learning Algorithms, BIG DATA & SOC’Y, Jan.–June 2016, at 5, 9, http://journals.sagepub.com/doi/pdf/10.1177/2053951715622512 [https://perma.cc/UQG8-NL52].


Second, as Tesla and other car makers offer cars that are networked so that software can be updated after the car is purchased, the integrity and inviolability of software increase in importance. An automaker may claim that a car’s logs show that the car performed as promised, but regulators will need ways to verify that the software and logs have not been altered.^91 For example, Tesla now updates its cars regularly, claiming that these updates improve performance such as the range its cars can drive on a full battery charge.^92 Whether those updates are accurate, whether they comport with the company’s public claims of performance, and whether they adhere to safety regulations all need to be tracked.^93 Furthermore, as self-driving or autonomous cars continue to be put on the road and evolve, regulating their software will be even more important.^94 For example, if there is a standard for the way a self-driving car brakes or avoids another car or avoids a person, what happens when the automaker pushes an update to the fleet? How can regulators be sure that the updated software complies with the standard? The automaker’s change affects not only the user but also others on the road. The automaker may, in good faith, assert that the update is within the standards already approved, but society, and more specifically the regulating agency, needs a way to verify that claim. Further, regulators may want to ensure that only approved, standards-compliant updates can be installed in vehicles already on the road.
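
In outline, that regulator-facing check can be simple. The sketch below is ours; a real over-the-air update system would rely on digital signatures, secure boot, and hardware protections rather than a bare hash allowlist, but the core idea is the same: the vehicle refuses any update whose contents the agency has not already reviewed and approved as standards-compliant.

    import hashlib

    # Digests of update images the regulator has reviewed and approved.
    # (This digest is simply the SHA-256 of the toy payload b"test" below.)
    APPROVED_DIGESTS = {
        "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
    }

    def may_install(update_image: bytes) -> bool:
        """Install gate: only firmware whose digest is on the approved list
        may be flashed to a vehicle already on the road."""
        return hashlib.sha256(update_image).hexdigest() in APPROVED_DIGESTS

    print(may_install(b"test"))        # True: contents match an approved image
    print(may_install(b"unreviewed"))  # False: reject the unapproved update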

C. Algorithms, Private Sector Concerns

Although the private sector is regulated differently than the public sector, calls for transparency as it relates to software-based decision-making in the private sector abound. For example, in light of the importance of recent technologies, Frank Pasquale has argued that the code for important software such as Google’s search algorithm or a broadband carrier’s method for network management “should be transparent to some entity capable of detecting” the potential misdeeds or harms these services may create.^95 In the same vein, other studies and investigations

91. See infra Section V.A.3 (explaining audit logs and verification of such logs).
92. See Davies, supra note 12.
93. Issues with cars’ software updates and security have been revealed in at least two cases. See Morris, supra note 9 (reporting that a security group took over a Tesla, opened its doors, opened its sunroof, and enabled keyless driving so the car could be driven away or stolen); Peterson, supra note 9 (describing researchers able to take over the braking system and more from 12 miles away).
94. See, e.g., Bryant Walker Smith, Automated Driving and Product Liability, 2017 MICH. ST. L. REV. 1, 45–49 (examining product liability and compliance issues raised by software in self-driving cars).
95. Frank Pasquale, Beyond Innovation and Competition: The Need for Qualified Transparency in Internet Intermediaries, 104 NW. U. L. REV. 1, 166 (2010). Faced with the challenges of data processing and computation a quarter century ago, Paul Schwartz argued that a key factor in managing problems from those practices required “the establishment of a government body capable of studying the effects and implications [of software-based decisions].” Schwartz, supra note 33, at 1379. That approach was part of addressing state actions, and the approach looked at transparency as a feature to limit government action and to make the system “open and understandable to the data subject.” Id. at 1376. The connection between Pasquale and Schwartz is conceptual: both seek transparency as a way to enable a third party to aid in scrutiny and to aid the ability to challenge a practice.


have identified a range of examples where software was part of undesired or troubling outcomes and have called for methods to detect such issues.

One important area of concern is whether certain software is enabling or aggravating illegal discrimination on the basis of a protected attribute such as race or gender. A study by Professor Latanya Sweeney looked at online search and advertising to test whether a search for “racially associated names” returned “ads suggestive of an arrest record.”^96 The study rejected the hypothesis “that no difference exists” in the delivery of such ads because searches for “black-identifying first names” yielded an ad for a company that sold public records and included the word “arrest” in the ad text for “a greater percentage of ads... than [searches] for white-identifying first names.”^97 According to Sweeney, this finding intersects with discrimination problems, because when one competes for “an award, a scholarship, an appointment, a promotion, or a new job... or [is] engaged in any one of hundreds of circumstances for which someone wants to learn more about you,” ads appear in online searches.^98 Another study by Datta et al. on searches, webpage visitation history, and advertising found that when ad preference settings were set to female, a user saw “fewer instances of an ad related to high-paying jobs than [when preferences were set]... to male.”^99 The specific ad was for a career coaching service promising to aid someone in obtaining a job that paid more than $200,000 a year.^100 These studies have identified some outcomes that may not meet the legal^101 or normative defini-


96. Sweeney, supra note 13, at 52.
97. Id. at 51.
98. Id. at 44.
99. Amit Datta, Michael Carl Tschantz & Anupam Datta, Automated Experiments on Ad Privacy Settings, PROC. ON PRIVACY ENHANCING TECHS., Apr. 2015, at 92, 92.
100. Id. at 102.
101. As Peter Swire has observed in an initial investigation of online, data-driven marketing, several statutes prohibit discrimination in specific sectors such as lending, housing, and employment. PETER SWIRE, LESSONS FROM FAIR LENDING FOR FAIR MARKETING AND BIG DATA 1–4 (2014). These statutes apply to online practices, but how they apply for each sector and which practices within each sector are prohibited is not settled. See id. at 8–10. Sweeney’s study may not fit into these sectoral approaches as they appear to be about an indirect, yet possibly powerful, way to affect hiring decisions. That is, the ads at issue in Sweeney’s study are not about an employment opportunity; rather they may affect an employer’s impression of or decision about someone without being the explicit criteria on which the decision is made. In contrast, the employment ads in the other study fall under Title VII, which governs employment ads. Yet, as Swire explains, even when an advertisement falls under a statute, “what would meet the statutory requirement that the advertisement ‘indicates any preference, limitation, or discrimination’ concerning a protected class” is unclear. Id. at 9 (quoting 42 U.S.C. § 3604(c)).


critiques about opacity, fairness, and due process as those who currently question the use of algorithms make.

Academics are not the only ones to think about software and society. Journalists have also investigated the use of automation with similar results and conclusions. Rather than investigating questions about ad networks, where several actors are involved, and each may or may not be responsible for outcomes, journalists and computer scientists have looked at personalization of commerce and search features to see whether a single actor’s implementation of an algorithm poses problems.

An investigation by Wall Street Journal reporters found that the e-commerce they examined lends itself to a legal practice known to economists as price discrimination — the practice of trying to match the price for a good or service to specific market segments or people.^109 Several companies “were consistently adjusting prices and displaying different product offers based on a range of characteristics that could be discovered about the user.”^110 For example, Staples, Inc., the office supply company, charged different prices for the same item depending on where Staples thought the consumer was.^111 Although the practice of altering prices based on the geography of the shopper or whether a good is online or in-store is common, the practice can reinforce inequality if it allows retailers to charge lower prices to those who live in ZIP codes with higher weighted average income and charge higher prices to those in ZIP codes with lower weighted average income.^112 Even if one accepts the argument that a retailer accounts for different costs at local, physical stores, costs associated with physical retail stores should not be an issue if the orders are fulfilled and shipped from a central warehouse. Although the outcomes of price discrimination would seem to indicate that inequality could be reinforced, price discrimination is not illegal. If personalization is used, however, by a credit card or other regulated financial company to steer people to more expensive financial services based on race, gender, or other protected class status, price discrimination becomes prohibited discrimination.^113 As such regulation is triggered, it becomes necessary to understand the system.
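
That kind of pattern is also something an outside tester can quantify without any access to the retailer’s code: record the price quoted for the same item from vantage points in different ZIP codes and ask whether it tracks local income. The sketch below is ours, with invented figures; a real study would need far more observations and careful controls before drawing conclusions.

    from statistics import correlation  # available in Python 3.10+

    # Invented observations: (ZIP median income, price quoted for one item).
    observations = [
        (38000, 15.79), (42000, 15.79), (55000, 14.29), (61000, 14.29),
        (72000, 13.99), (88000, 13.49), (95000, 13.49), (103000, 12.99),
    ]

    incomes = [income for income, _ in observations]
    prices = [price for _, price in observations]
    print(f"income/price correlation: {correlation(incomes, prices):.2f}")
    # A strongly negative value (about -0.95 with these made-up numbers) is
    # consistent with higher prices in lower-income ZIP codes; it is a reason
    # to look closer, not proof of intent or of illegality.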

109. Jennifer Valentino-Devries, Jeremy Singer-Vine & Ashkan Soltani, Websites Vary Prices, Deals Based on Users’ Information, WALL ST. J. (Dec. 24, 2012), http://www.wsj.com/news/articles/SB10001424127887323777204578189391813881534 (last visited Dec. 19, 2017).
110. Id. (“The Journal identified several companies, including Staples, Discover Financial Services, Rosetta Stone Inc. and Home Depot Inc., that [engaged in such activities].”).
111. Id.
112. Id.
113. See SWIRE, supra note 101, at 7–8 (discussing Fair Housing Act prohibition on “steering”). Steering is the practice of “deliberately guiding loan applicants or potential purchasers toward or away from certain types of loans or geographic areas because of race.” FED. RESERVE BD., FEDERAL FAIR LENDING REGULATIONS AND STATUTES: FAIR HOUSING ACT 3 (2016), https://www.federalreserve.gov/boarddocs/supmanual/cch/fair_lend_fhact.pdf [https://perma.cc/UB2T-ZGS8].


Another investigation by Nicholas Diakopoulos tried to test the algorithms behind the autocomplete feature for Google and Bing’s search services to see how each one handled searches for sensitive topics such as “illicit sex” and “violence.”^114 At the time of the report, Bing’s autocomplete did not offer autocomplete suggestions for “homosexual,” and both Bing and Google did not offer autocomplete suggestions for a large number of “110 sex-related words.”^115 According to the author, this point raises the specter of censorship, because “we look to algorithms to enforce morality.”^116
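
Diakopoulos’s test is, at bottom, another black-box probe: submit a list of sensitive terms to each autocomplete service and record which ones come back with no suggestions. The sketch below is ours; fetch_suggestions is a hypothetical stand-in for whatever client such a study would use to query a real service.

    def blocked_terms(fetch_suggestions, terms):
        """Return the probe terms for which the service offers no
        autocomplete suggestions at all."""
        return [t for t in terms if not fetch_suggestions(t)]

    # Hypothetical stand-in for a real autocomplete client.
    def fetch_suggestions(term):
        toy_blacklist = {"homosexual"}  # illustrative only
        return [] if term in toy_blacklist else [term + " definition"]

    probes = ["homosexual", "violence", "scandal"]
    print(blocked_terms(fetch_suggestions, probes))  # ['homosexual']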

This position is puzzling because whether society should look to algorithms or to a company’s manual choice over a blacklist for morality enforcement is answered, “No,” by most who discuss the issue, including us.^117 In addition, whether society truly “look[s] to algorithms to enforce morality” is unclear.^118 One may believe algorithms should be constructed to provide moral guidance or enforce a given morality. Alternatively, one may claim that moral choices are vested with a system’s users and that the system itself should be neutral, allowing all types of use.^119 Either position demands certain outcomes from computer systems such as search engines, and so defers to algorithms. That is the mistake. In this context, the platforms at issue exercise discretion in organizing and displaying information, and that organization flows from tools — algorithms, blacklists, human moderators, and more — used in combination. In that sense, looking to algorithms or vesting agency with the range of tools platforms use to enforce morality is a type of “worship” that reverses the skepticism of the Enlightenment.^120 Asking algorithms to enforce morality is not only a type of idolatry; it also presumes we know whose morality they enforce and can define what moral outcomes

114. Nicholas Diakopoulos, Sex, Violence, and Autocomplete Algorithms, SLATE (Aug. 2, 2013, 11:43 AM), http://www.slate.com/articles/technology/future_tense/2013/08/words_banned_from_bing_and_google_s_autocomplete_algorithms.html (last visited Dec. 19, 2017).
115. Id.
116. Id.
117. See, e.g., Bogost, supra note 3; Deven R. Desai, Exploration and Exploitation: An Essay on (Machine) Learning, Algorithms, and Information Provision, 47 LOY. U. CHI. L.J. 541, 578 (2015).
118. Diakopoulos, supra note 114.
119. The debates around hate speech and filter bubbles capture this problem, as one has to choose what to show or not show. See Desai, supra note 117, at 561–62 (discussing difficulties of determining who gets to decide what information users see). The same applies to the move to have platforms such as Facebook regulate fake news or hate speech, as shown by Germany’s recent law that requires firms to remove and block such content or face fines up to fifty million Euros. See Germany Approves Plans to Fine Social Media Firms up to €50m, THE GUARDIAN (June 30, 2017, 7:14 AM), https://www.theguardian.com/media/2017/jun/30/germany-approves-plans-to-fine-social-media-firms-up-to-50m [https://perma.cc/SK9D-V43W].
120. See Bogost, supra note 3.