[Theory] 5/28 IDEAL Workshop: CS+Law: Definitions for Algorithms

Mary Marre mmarre at ttic.edu
Thu May 20 20:24:28 CDT 2021


Upcoming IDEAL Workshop:
CS+Law: Definitions for Algorithms

Please join us for an upcoming IDEAL Workshop: CS+Law: Definitions for
Algorithms
<https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=f7421d5fa0&e=01c4c9f9ea>
on Friday, May 28th, from 11:00 am to 3:00 pm CDT.


*About this series*

The IDEAL workshop series brings in experts on topics related to the
foundations of data science to present their perspective and research on a
common theme. This virtual workshop will feature four talks and discussion
sessions. Details about our speakers and their talks can be found below.

This workshop is part of the Special Quarter on Data Science and Law
<https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=97f8a0863d&e=01c4c9f9ea>,
co-organized by Northwestern Professors Jason D. Hartline
<https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=a78e99606b&e=01c4c9f9ea>
 and Daniel W. Linna Jr.
<https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=050cc0f43c&e=01c4c9f9ea>.


*Synopsis*

There is great potential for, and significant challenge in, the use of
algorithms in legal contexts. Computer scientists and legal scholars need
to collaborate to determine appropriate definitions for algorithms that
capture notions like bias, fairness, justice, accountability,
explainability, verification, certification, auditability, contestability,
etc.  These definitions then need to be integrated into algorithm design
and analysis frameworks.  This workshop focuses on developing technical
definitions for concepts like these and the technical challenges for
algorithm design that they introduce.

*Logistics*

   - Date: Friday, May 28, 11:00 am to 3:00 pm Central Time (Chicago time)
   - Location: Talks and General Q&A on Zoom; small-group discussions
   during intermissions on Gather.Town
   - Free Registration: Attendees must register
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=68ca9ce8ce&e=01c4c9f9ea>
   to receive information for joining. Login information for Gather.Town
   will be provided via email.

*Speakers*

   - Ran Canetti
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=32604809db&e=01c4c9f9ea>
(Boston
   University)
   - Jonathan Shafer
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=4c500739a1&e=01c4c9f9ea>
(University
   of California, Berkeley)
   - Aloni Cohen
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=267c97851f&e=01c4c9f9ea>
(Boston
   University)
   - Omer Reingold
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=e8f9f49949&e=01c4c9f9ea>
    (Stanford University)

*Schedule*

   - *11:00-11:10*: Introduction (Jason Hartline and Dan Linna)
   - *11:10-11:40*: Ran Canetti
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=d73e06a661&e=01c4c9f9ea>

   - *11:40-12:10*: Jonathan Shafer
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=6a26157240&e=01c4c9f9ea>

   - *12:10-1:00*: Extended Q&A on Gather.Town
   - *1:00-1:30*: Aloni Cohen
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=ec3420c69c&e=01c4c9f9ea>
   - *1:30-2:00*: Omer Reingold
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=1dc3b1ff72&e=01c4c9f9ea>
   - *2:00-3:00*: Extended Q&A on Gather.Town

*Abstracts*

*Speaker:* Ran Canetti
*Title:* *Verification Dilemmas, Law, and the Promise of Zero-Knowledge
Proofs*
*Abstract:* Individuals expose personally identifying information to access
a website or qualify for a loan, undermining privacy and security. Firms
share proprietary information in dealmaking negotiations; if the deal
fails, the negotiating partner may use that information to compete.
Regulators that comply with public transparency and oversight requirements
can risk subjecting algorithmic governance tools to gaming that destroys
their efficacy. Litigants might have to reveal trade secrets in court
proceedings to prove a claim or defense. Such “verification dilemmas,” or
costly choices between opportunities that require the verification of some
fact, and risks of exposing sensitive information in order to perform
verification, appear across the legal landscape. Yet, existing legal
responses to them are imperfect. Legal responses often depend on ex post
litigation remedies that are prohibitively expensive for those most in
need, or that fail to address abuses of information entirely.

Zero-knowledge proofs (ZKPs)—a class of cryptographic protocols that allow
one party to verify a fact or characteristic of secret information without
revealing the actual secret—can help solve these verification dilemmas.
ZKPs have recently demonstrated their mettle, for example, by providing the
privacy backbone for the blockchain. Yet they have received scant notice in
the legal literature. This Article fills that gap by providing the first
deep dive into ZKPs’ broad relevance for law. It explains ZKPs’ conceptual
power and technical operation to a legal audience. It then demonstrates
how, and that, ZKPs can be applied as a governance tool to transform
verification dilemmas in multiple legal contexts. Finally, the Article
surfaces, and provides a framework to address, the policy issues implicated
by the potential substitution of ZKP governance tools in place of existing
law and practice.
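
For readers new to the primitive, here is a toy Python sketch (ours, not the talk's) of Schnorr's classic interactive protocol, with deliberately tiny parameters rather than cryptographic ones: the prover convinces the verifier that she knows x satisfying y = g^x (mod p) while revealing nothing about x beyond that fact.

import secrets

p, q, g = 23, 11, 4         # toy group: g has prime order q in Z_p*
x = 7                       # prover's secret
y = pow(g, x, p)            # public claim: "I know log_g(y)"

# 1. Commit: prover picks random r and sends t = g^r (mod p)
r = secrets.randbelow(q)
t = pow(g, r, p)

# 2. Challenge: verifier sends a random c
c = secrets.randbelow(q)

# 3. Response: prover sends s = r + c*x (mod q); since r is uniform,
#    s by itself is uniform and leaks nothing about x
s = (r + c * x) % q

# 4. Verify: g^s == t * y^c (mod p) holds iff the response is
#    consistent with knowledge of x
assert pow(g, s, p) == (t * pow(y, c, p)) % p
print("verifier accepts; the secret x was never transmitted")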

*Speaker:* Jonathan Shafer
*Title:* *Interactive Proofs for Verifying Machine Learning*
*Abstract:* We consider the following question: using a source of labeled
data and interaction with an untrusted prover, what is the complexity of
verifying that a given hypothesis is “approximately correct”? We study
interactive proof systems for PAC verification, where a verifier that
interacts with a prover is required to accept good hypotheses, and reject
bad hypotheses. Both the verifier and the prover are efficient and have
access to labeled data samples from an unknown distribution. We are
interested in cases where the verifier can use significantly less data than
is required for (agnostic) PAC learning, or use a substantially cheaper
data source (e.g., using only random samples for verification, even though
learning requires membership queries). We believe that today, when data and
data-driven algorithms are quickly gaining prominence, the question of
verifying purported outcomes of data analyses is very well-motivated. We
show three main results. First, we prove that for a specific hypothesis
class, verification is significantly cheaper than learning in terms of
sample complexity, even if the verifier engages with the prover only in a
single-round (NP-like) protocol. Moreover, for this class we prove that
single-round verification is also significantly cheaper than testing
closeness to the class. Second, for the broad class of Fourier-sparse
Boolean functions, we show a multi-round (IP-like) verification protocol,
where the prover uses membership queries, and the verifier is able to
assess the result while only using random samples. Third, we show that
verification is not always more efficient. Namely, we show a class of
functions where verification requires as many samples as learning does, up
to a logarithmic factor.
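
To make the sample-complexity gap concrete, here is a minimal sketch of the intuition only (the paper's actual protocols are interactive and handle untrusted provers more carefully; the threshold target and parameter choices below are our invention). Once a prover has committed to a single hypothesis h, the verifier can estimate err(h) from O(log(1/delta)/epsilon^2) fresh random samples via Hoeffding's inequality, independent of how hard the class is to learn.

import math
import random

def draw():                           # one labeled example from "Nature"
    x = random.random()
    return x, x > 0.5                 # unknown target: threshold at 0.5

def verifier_accepts(h, epsilon=0.1, delta=0.05):
    # Hoeffding: m samples estimate err(h) to within ~epsilon/4
    # with probability 1 - delta, whatever the hypothesis class
    m = math.ceil(8 * math.log(2 / delta) / epsilon ** 2)
    mistakes = sum(h(x) != y for x, y in (draw() for _ in range(m)))
    return mistakes / m <= epsilon / 2

print(verifier_accepts(lambda x: x > 0.48))   # honest prover: True w.h.p.
print(verifier_accepts(lambda x: x > 0.9))    # cheating prover: False w.h.p.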

*Speaker:* Aloni Cohen
*Title:* *Towards formalizing the GDPR’s notion of singling out*
*Abstract:* There is a significant conceptual gap between legal and
mathematical thinking around data privacy. The effect is uncertainty as to
which technical offerings meet legal standards. This uncertainty is
exacerbated by a litany of successful privacy attacks demonstrating that
traditional statistical disclosure limitation techniques often fall short
of the privacy envisioned by regulators. We define “predicate singling
out,” a type of privacy attack intended to capture the concept of singling
out appearing in the General Data Protection Regulation (GDPR). An
adversary predicate singles out a dataset x using the output of a
data-release mechanism M(x) if it finds a predicate p matching exactly one
row in x with probability much better than a statistical baseline. A
data-release mechanism that precludes such attacks is “secure against
predicate singling out” (PSO secure). We argue that PSO security is a
mathematical concept with legal consequences. Any data-release mechanism
that purports to “render anonymous” personal data under the GDPR must
prevent singling out and, hence, must be PSO secure. We analyze the
properties of PSO security, showing that it fails to compose. Namely, a
combination of more than logarithmically many exact counts, each
individually PSO secure, facilitates predicate singling out. Finally, we
ask whether differential privacy and k-anonymity are PSO secure. Leveraging
a connection to statistical generalization, we show that differential
privacy implies PSO security. However, and in contrast with current legal
guidance, k-anonymity does not: There exists a simple predicate singling out
attack under mild assumptions on the k-anonymizer and the data distribution.
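
The isolation event at the core of the definition is easy to state in code. Below is a toy sketch of it (ours; the formal definition additionally weighs the adversary's success probability against a statistical baseline, which this snippet omits):

def singles_out(p, rows):
    """True iff predicate p matches exactly one row of the dataset."""
    return sum(1 for row in rows if p(row)) == 1

# toy dataset of (age, zip code) records
rows = [(34, "60637"), (29, "60614"), (34, "60614"), (71, "60637")]

# an adversary who learns "someone over 70 lives in 60637" from a
# data release can now isolate that individual's record
p = lambda r: r[0] > 70 and r[1] == "60637"
print(singles_out(p, rows))   # True: p matches exactly one row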

*Speaker:* Omer Reingold
*Title:* *Outcome Indistinguishability*
*Abstract:* Prediction algorithms assign numbers to individuals that are
popularly understood as individual “probabilities”—what is the probability
of 5-year survival after cancer diagnosis?—and which increasingly form the
basis for life-altering decisions. Drawing on an understanding of
computational indistinguishability developed in complexity theory and
cryptography, we introduce Outcome Indistinguishability. Predictors that
are Outcome Indistinguishable yield a generative model for outcomes that
cannot be efficiently refuted on the basis of the real-life observations
produced by Nature. We investigate a hierarchy of Outcome
Indistinguishability definitions, whose stringency increases with the
degree to which distinguishers may access the predictor in question. Our
findings reveal that Outcome Indistinguishability behaves qualitatively
differently than previously studied notions of indistinguishability. First,
we provide constructions at all levels of the hierarchy. Then, leveraging
recently developed machinery for proving average-case fine-grained
hardness, we obtain lower bounds on the complexity of the more stringent
forms of Outcome Indistinguishability. This hardness result provides the
first scientific grounds for the political argument that, when inspecting
algorithmic risk prediction instruments, auditors should be granted oracle
access to the algorithm, not simply historical predictions.
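
As a toy illustration (ours, not one of the paper's constructions), the sketch below runs a single simple distinguisher: it compares average outcomes produced by Nature with outcomes resampled from a predictor's claimed probabilities, on the subpopulation with feature x > 0.7. A calibrated predictor passes this test; a constant predictor is refuted.

import random

def nature(x):                       # Nature's true outcome process,
    return random.random() < x       # unknown to the auditor

def distinguisher_gap(predictor, n=200_000):
    # mean outcome among individuals with x > 0.7, in the real world
    # vs. in the world simulated from the predictor's probabilities
    grp = [x for x in (random.random() for _ in range(n)) if x > 0.7]
    real = sum(nature(x) for x in grp) / len(grp)
    model = sum(random.random() < predictor(x) for x in grp) / len(grp)
    return abs(real - model)

print(distinguisher_gap(lambda x: x))    # ~0.00: calibrated, survives
print(distinguisher_gap(lambda x: 0.5))  # ~0.35: constant, refuted
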
Hope to see you all there virtually!

Mary C. Marre
Faculty Administrative Support
Toyota Technological Institute
6045 S. Kenwood Avenue
Room 517
Chicago, IL 60637
p: (773) 834-1757
f: (773) 357-6970
mmarre at ttic.edu