[Theory] TODAY: 5/28 IDEAL Workshop: CS+Law: Definitions for Algorithms

Mary Marre mmarre at ttic.edu
Fri May 28 10:40:56 CDT 2021


Upcoming IDEAL Workshop:
CS+Law: Definitions for Algorithms

Please join us for an upcoming IDEAL Workshop: CS+Law: Definitions for
Algorithms
<https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=f7421d5fa0&e=01c4c9f9ea>
on Friday, May 28th, from 11:00 am to 3:00 pm CDT.


*About this series*

The IDEAL workshop series brings in experts on topics related to the
foundations of data science to present their perspective and research on a
common theme. This virtual workshop will feature four talks and discussion
sessions. Details about our speakers and their talks can be found below.

This workshop is part of the Special Quarter on Data Science and Law
<https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=97f8a0863d&e=01c4c9f9ea>,
co-organized by Northwestern Professors Jason D. Hartline
<https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=a78e99606b&e=01c4c9f9ea>
 and Daniel W. Linna Jr.
<https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=050cc0f43c&e=01c4c9f9ea>


*Synopsis*

There is great potential for, and significant challenge to, the use of
algorithms in legal contexts.  Computer scientists and legal scholars need
to collaborate to determine appropriate definitions for algorithms that
capture notions like bias, fairness, justice, accountability,
explainability, verification, certification, auditability, contestability,
etc.  These definitions then need to be integrated into algorithm design
and analysis frameworks.  This workshop focuses on developing technical
definitions for concepts like these and the technical challenges for
algorithm design that they introduce.

*Logistics*

   - Date: Friday, May 28, 11:00 am to 3:00 pm Central Time (Chicago time)
   - Location: Talks and General Q&A on Zoom; small-group discussions
   during intermissions on Gather.Town
   - Free Registration: Attendees must register
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=68ca9ce8ce&e=01c4c9f9ea>
to
   receive information for joining. Login information for Gather.Town
   will be provided via email.

*Speakers*

   - Ran Canetti
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=32604809db&e=01c4c9f9ea>
(Boston
   University)
   - Jonathan Shafer
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=4c500739a1&e=01c4c9f9ea>
(University
   of California, Berkeley)
   - Aloni Cohen
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=267c97851f&e=01c4c9f9ea>
(Boston
   University)
   - Omer Reingold
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=e8f9f49949&e=01c4c9f9ea>
    (Stanford University)

*Schedule*

   - *11:00-11:10*: Introduction (Jason Hartline and Dan Linna)
   - *11:10-11:40*: Ran Canetti
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=d73e06a661&e=01c4c9f9ea>

   - *11:40-12:10*: Jonathan Shafer
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=6a26157240&e=01c4c9f9ea>

   - *12:10-1:00*: Extended Q&A on Gather.Town
   - *1:00-1:30*: Aloni Cohen
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=ec3420c69c&e=01c4c9f9ea>
   - *1:30-2:00*: Omer Reingold
   <https://northwestern.us20.list-manage.com/track/click?u=3fc8e0df393510ea0a5b018e1&id=1dc3b1ff72&e=01c4c9f9ea>
   - *2:00-3:00*: Extended Q&A on Gather.Town

*Abstracts*

*Speaker:* Ran Canetti
*Title:* *Verification Dilemmas, Law, and the Promise of Zero-Knowledge
Proofs*
*Abstract:* Individuals expose personally identifying information to access
a website or qualify for a loan, undermining privacy and security. Firms
share proprietary information in dealmaking negotiations; if the deal
fails, the negotiating partner may use that information to compete.
Regulators that comply with public transparency and oversight requirements
can risk subjecting algorithmic governance tools to gaming that destroys
their efficacy. Litigants might have to reveal trade secrets in court
proceedings to prove a claim or defense. Such “verification dilemmas,” or
costly choices between opportunities that require the verification of some
fact, and risks of exposing sensitive information in order to perform
verification, appear across the legal landscape. Yet, existing legal
responses to them are imperfect. Legal responses often depend on ex post
litigation remedies that are prohibitively expensive for those most in
need, or that fail to address abuses of information entirely.

Zero-knowledge proofs (ZKPs)—a class of cryptographic protocols that allow
one party to verify a fact or characteristic of secret information without
revealing the actual secret—can help solve these verification dilemmas.
ZKPs have recently demonstrated their mettle, for example, by providing the
privacy backbone for the blockchain. Yet they have received scant notice in
the legal literature. This Article fills that gap by providing the first
deep dive into ZKPs’ broad relevance for law. It explains ZKPs’ conceptual
power and technical operation to a legal audience. It then demonstrates
how, and that, ZKPs can be applied as a governance tool to transform
verification dilemmas in multiple legal contexts. Finally, the Article
surfaces, and provides a framework to address, the policy issues implicated
by the potential substitution of ZKP governance tools in place of existing
law and practice.
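To make the ZKP idea concrete for readers less familiar with cryptography, here is a minimal Schnorr-style interactive proof sketch of my own (not from the talk): the prover convinces the verifier it knows a secret exponent x with g^x = y without revealing x. The group parameters are tiny illustrative values, not secure ones.

```python
import random

# Toy Schnorr-style interactive zero-knowledge proof of knowledge of a
# discrete logarithm. p = 2q + 1 with p, q prime; g generates the
# order-q subgroup. Parameters are illustrative only, NOT secure.
p = 1019
q = 509
g = 4              # 2^2 mod p, a generator of the order-q subgroup
x = 137            # prover's secret
y = pow(g, x, p)   # public value; prover claims to know x with g^x = y

def prove_round():
    # Prover commits to a random r, verifier sends a random challenge,
    # prover responds; the verifier learns nothing about x beyond the
    # fact that the algebraic check passes.
    r = random.randrange(q)
    commitment = pow(g, r, p)
    challenge = random.randrange(q)            # verifier's coin flip
    response = (r + challenge * x) % q
    # Verifier's check: g^response == commitment * y^challenge (mod p)
    return pow(g, response, p) == (commitment * pow(y, challenge, p)) % p

assert all(prove_round() for _ in range(100))
```

In a legal setting the same pattern generalizes: the "fact" being verified can be any statement with a short certificate (solvency, regulatory compliance, a property of a trade secret), while the witness itself stays hidden.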

*Speaker: *Jonathan Shafer
*Title:* *Interactive Proofs for Verifying Machine Learning*
*Abstract:* We consider the following question: using a source of labeled
data and interaction with an untrusted prover, what is the complexity of
verifying that a given hypothesis is “approximately correct”? We study
interactive proof systems for PAC verification, where a verifier that
interacts with a prover is required to accept good hypotheses, and reject
bad hypotheses. Both the verifier and the prover are efficient and have
access to labeled data samples from an unknown distribution. We are
interested in cases where the verifier can use significantly less data than
is required for (agnostic) PAC learning, or use a substantially cheaper
data source (e.g., using only random samples for verification, even though
learning requires membership queries). We believe that today, when data and
data-driven algorithms are quickly gaining prominence, the question of
verifying purported outcomes of data analyses is very well-motivated. We
show three main results. First, we prove that for a specific hypothesis
class, verification is significantly cheaper than learning in terms of
sample complexity, even if the verifier engages with the prover only in a
single-round (NP-like) protocol. Moreover, for this class we prove that
single-round verification is also significantly cheaper than testing
closeness to the class. Second, for the broad class of Fourier-sparse
boolean functions, we show a multi-round (IP-like) verification protocol,
where the prover uses membership queries, and the verifier is able to
assess the result while only using random samples. Third, we show that
verification is not always more efficient. Namely, we show a class of
functions where verification requires as many samples as learning does, up
to a logarithmic factor.
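The asymmetry at the heart of the abstract can be sketched in a few lines (my own toy example, not from the paper): checking that a purported hypothesis has low error only requires estimating a single number, its error rate on fresh samples, whereas learning must search the hypothesis class.

```python
import random

# Toy PAC-verification sketch. Target concept (my choice): the parity
# of bits 0 and 3 of an 8-bit input.

def target(x):
    return (x[0] + x[3]) % 2

def sample(n):
    xs = [[random.randint(0, 1) for _ in range(8)] for _ in range(n)]
    return [(x, target(x)) for x in xs]

def verify(hypothesis, eps=0.1, n_samples=500):
    # Estimate the hypothesis's error on fresh random samples and accept
    # iff the empirical error is below eps; Hoeffding's inequality
    # bounds the chance the estimate is far from the true error.
    data = sample(n_samples)
    err = sum(hypothesis(x) != y for x, y in data) / len(data)
    return err < eps

good = lambda x: (x[0] + x[3]) % 2   # the true parity
bad = lambda x: x[1]                 # an unrelated bit; ~50% error

assert verify(good)
assert not verify(bad)
```

This captures only the trivial direction (verification from random samples when the prover hands over the full hypothesis); the results in the talk concern settings where even that is costly or where the prover's interaction must substitute for expensive queries.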

*Speaker: *Aloni Cohen
*Title:* *Towards formalizing the GDPR’s notion of singling out*
*Abstract:* There is a significant conceptual gap between legal and
mathematical thinking around data privacy. The effect is uncertainty as to
which technical offerings meet legal standards. This uncertainty is
exacerbated by a litany of successful privacy attacks demonstrating that
traditional statistical disclosure limitation techniques often fall short
of the privacy envisioned by regulators. We define “predicate singling
out,” a type of privacy attack intended to capture the concept of singling
out appearing in the General Data Protection Regulation (GDPR). An
adversary predicate singles out a dataset x using the output of a
data-release mechanism M(x) if it finds a predicate p matching exactly one
row in x with probability much better than a statistical baseline. A
data-release mechanism that precludes such attacks is “secure against
predicate singling out” (PSO secure). We argue that PSO security is a
mathematical concept with legal consequences. Any data-release mechanism
that purports to “render anonymous” personal data under the GDPR must
prevent singling out and, hence, must be PSO secure. We analyze the
properties of PSO security, showing that it fails to compose. Namely, a
combination of more than logarithmically many exact counts, each
individually PSO secure, facilitates predicate singling out. Finally, we
ask whether differential privacy and k-anonymity are PSO secure. Leveraging
a connection to statistical generalization, we show that differential
privacy implies PSO security. However, and in contrast with current legal
guidance, k-anonymity does not: There exists a simple predicate singling out
attack under mild assumptions on the k-anonymizer and the data distribution.
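The definition's core event, a predicate matching exactly one row, is easy to state in code. The following sketch is my own illustration (toy dataset and columns are invented), showing what "singling out" means before any probability baseline is layered on.

```python
# A predicate p "singles out" dataset x if it isolates exactly one row.
# The PSO-security game additionally requires the adversary, given only
# the mechanism's output M(x), to find such a p much more often than a
# baseline guesser; this sketch shows only the isolation event itself.

def singles_out(predicate, dataset):
    return sum(1 for row in dataset if predicate(row)) == 1

# Toy dataset: 99 unremarkable records plus one unusual individual.
rows = [{"age": 30 + i % 5, "zip": 60637} for i in range(99)]
rows.append({"age": 97, "zip": 60605})

isolating = lambda r: r["age"] > 90      # matches exactly one row
broad = lambda r: r["zip"] == 60637      # matches 99 rows

assert singles_out(isolating, rows)
assert not singles_out(broad, rows)
```

The composition failure in the abstract says, roughly, that enough individually harmless exact counts released about the same dataset let an adversary construct an isolating predicate like the one above.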

*Speaker: *Omer Reingold
*Title:* *Outcome Indistinguishability*
*Abstract:* Prediction algorithms assign numbers to individuals that are
popularly understood as individual “probabilities”—what is the probability
of 5-year survival after cancer diagnosis?—and which increasingly form the
basis for life-altering decisions. Drawing on an understanding of
computational indistinguishability developed in complexity theory and
cryptography, we introduce Outcome Indistinguishability. Predictors that
are Outcome Indistinguishable yield a generative model for outcomes that
cannot be efficiently refuted on the basis of the real-life observations
produced by Nature. We investigate a hierarchy of Outcome
Indistinguishability definitions, whose stringency increases with the
degree to which distinguishers may access the predictor in question. Our
findings reveal that Outcome Indistinguishability behaves qualitatively
differently than previously studied notions of indistinguishability. First,
we provide constructions at all levels of the hierarchy. Then, leveraging
recently-developed machinery for proving average-case fine-grained
hardness, we obtain lower bounds on the complexity of the more stringent
forms of Outcome Indistinguishability. This hardness result provides the
first scientific grounds for the political argument that, when inspecting
algorithmic risk prediction instruments, auditors should be granted oracle
access to the algorithm, not simply historical predictions.
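Operationally, a predictor p(x) induces a generative model (draw the outcome as a coin flip with bias p(x)), and Outcome Indistinguishability asks that no efficient distinguisher accept real (x, outcome) pairs at a noticeably different rate than modeled ones. The sketch below is my own toy rendering of that test, with an invented one-bit feature; it is not the paper's construction.

```python
import random

random.seed(1)

def nature(x):          # true outcome probability, unknown in general
    return 0.8 if x else 0.3

def predictor(x):       # a predictor whose model matches nature exactly
    return 0.8 if x else 0.3

def acceptance_rate(distinguisher, outcome_prob, n=20000):
    # Empirical rate at which the distinguisher accepts (x, y) pairs
    # where y is drawn as a coin flip with bias outcome_prob(x).
    hits = 0
    for _ in range(n):
        x = random.random() < 0.5
        y = random.random() < outcome_prob(x)
        hits += distinguisher(x, y)
    return hits / n

# A distinguisher that looks at the joint (x, y) statistics.
D = lambda x, y: x and y

# The matching predictor passes D; a base-rate-only predictor fails it.
assert abs(acceptance_rate(D, nature) - acceptance_rate(D, predictor)) < 0.03
base_rate_only = lambda x: 0.55   # overall base rate, ignoring x
assert abs(acceptance_rate(D, nature) - acceptance_rate(D, base_rate_only)) > 0.05
```

The hierarchy in the talk varies how much access distinguishers like D get to the predictor itself, which is exactly the lever behind the oracle-access argument for auditors.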

Hope to see you all there virtually!

Mary C. Marre
Faculty Administrative Support
*Toyota Technological Institute*
*6045 S. Kenwood Avenue*
*Room 517*
*Chicago, IL  60637*
*p: (773) 834-1757*
*f: (773) 357-6970*
*mmarre at ttic.edu <mmarre at ttic.edu>*

