Transcript


00:03
alright hello everyone let's get started
00:08
I want to talk about a system called
00:11
certificate transparency today and this
00:15
is a bit of a departure from most of the
00:18
topics we talked about so far we've
00:21
talked about distributed systems that
00:23
are really closed systems where all the
00:24
participants are trustworthy they're all
00:27
maybe being run by the same sort
00:30
of mutually trusting organization like
00:32
Raft is that way you know you just
00:34
assume that the Raft peers do what
00:36
they're supposed to do but there's also
00:40
plenty of systems out there particularly
00:42
systems sort of built at internet scale
00:44
where the systems are open and anyone
00:48
can participate be an active participant
00:50
I mean in some big systems out there and
00:54
if you build systems that are completely
00:56
open in that way there's often no single
01:01
universally trusted Authority that
01:03
everybody is willing to trust to run the
01:06
system or to protect it that is
01:09
everybody is sort of potentially
01:11
mutually suspicious of everyone else and
01:14
if that's the situation you have to be
01:16
able to build useful systems out of
01:18
mutually distrusting pieces and this
01:23
makes in any sort of internet-wide open
01:26
systems trust and security sort
01:27
of top level systems issues when you're
01:30
thinking about designing a distributed
01:31
system so the most basic question when
01:34
you're building an open system is when
01:37
I'm talking to another computer or
01:38
another person you need to know are you
01:41
talking to the right other computer or
01:43
are you talking to the right website and
01:46
this problem is actually close to
01:48
unsolvable it turns out there's really
01:51
there's lots of solutions and none
01:54
really work that well but it is the
01:56
problem that certificate transparency
01:59
today's topic is trying to help with the
02:04
material today ties sort of backwards in
02:06
the course to consistency it turns out
02:08
that a lot of what certificate
02:09
transparency is doing is ensuring that
02:12
all parties see
02:13
the same information about certificates
02:16
that's a real consistency issue and this
02:18
material also ties forward to blockchain
02:21
systems like Bitcoin which is what we
02:23
are talking about next week
02:26
certificate transparency is among the
02:29
relatively few non cryptocurrency uses
02:32
of a blockchain like design alright so
02:37
by way of introduction I want to start
02:39
with the situation on the web with web
02:45
security at any rate as it existed
02:47
before 1995 before certificates so this
02:50
is before 1995 and in particular there was
02:56
a there was a kind of attack in those
02:57
days that people were worried about
03:00
called man-in-the-middle attacks this
03:02
is man in the
03:07
middle and this is a name for a class of
03:10
attacks style of attack so you know the
03:14
set up in those days is you have the
03:16
internet and you have people running
03:20
browsers
03:23
sitting with their computer attached to
03:26
the Internet
03:27
and I'm sitting in front of my computer I
03:29
want to talk to a specific server
03:31
suppose what I want to do is talk to
03:33
gmail.com right and ordinarily I would
03:39
you know maybe contact the DNS system I
03:42
would as a user I maybe type gmail.com I
03:45
would sort of know what it was I wanted
03:47
to talk to namely gmail.com my browser
03:49
would talk to DNS servers say what's
03:51
gmail.com it would reply with an IP
03:54
address I'd connect to that IP address and
03:56
you know I need to authenticate myself
03:58
so I'd probably type my password to
04:00
Gmail to Gmail's website and then Gmail
04:02
would show me my email without some kind
04:08
of story for security this system is
04:10
actually quite easy to attack and turned
04:12
out to be easy to attack and one
04:18
style of attack is what's called a
04:19
man-in-the-middle attack where some evil
04:21
person sets up another web server that
04:25
serves pages that look just like Gmail's
04:28
web server's like it asks for your login
04:30
and password right and then the attacker
04:34
would maybe intercept my DNS packets or
04:39
just guess when I would have sent a DNS
04:41
packet and come up with a fake reply
04:43
that instead of providing the real IP
04:47
address of the real gmail.com server
04:49
would provide the IP address of
04:52
the attacker's fake computer and then the
04:54
user's browser instead of talking to
04:56
Gmail would actually unknown to them be
05:00
talking to the attacker's computer the
05:02
attacker's computer would provide a web
05:04
page that looks just like a login page the user
05:05
types their login and password and
05:08
now the attacker's computer can forward
05:11
that to the real Gmail log in for you of
05:14
course you don't know that and you know get
05:16
your current inbox back to the attacker's
05:18
computer which presumably records it
05:20
along with your password and then sends
05:22
your inbox or whatever to the browser
05:24
and this allows a you know if you can
05:28
execute this kind of man-in-the-middle
05:29
attack the attacker's computer can record
05:32
your password record your email and
05:34
you'll never be the wiser
05:35
and
05:36
before certificates and SSL and HTTPS
05:40
there was really no defense against this
05:42
okay so this is the man in the
05:46
middle attack and this attacker here is
05:48
the man in the middle looks just like
05:50
Gmail to the browser pretends to be the
05:53
user when talking to Gmail so that it
05:54
can actually get the information from
05:57
Gmail required to trick the user into
05:59
thinking it's really Gmail all right so
06:01
this is the attack in the mid-90s people
06:05
came up with certificates with SSL or
06:11
it's also called TLS it's what the
06:14
protocol the security protocol that
06:15
you're using when you use HTTPS links um
06:20
and here the game was that gmail.com
06:24
was gonna have a public/private key pair
06:28
so we'd have a private key that only
06:34
Gmail knows sitting in its server and
06:38
then when you connect well you're the user
06:41
you connect somewhere you ask to connect
06:44
to Gmail you know and in order to verify
06:48
that you're really talking to Gmail the
06:50
user's going to demand Gmail prove that
06:52
it really owns Gmail's private key well
06:55
of course
06:55
where does your browser find out Gmail's
06:58
public key from it needs Gmail's public
07:01
key which is what you need to check that
07:03
it really has the private key there's
07:05
also this notion of certificate
07:07
authorities and certificates so there'd
07:09
be a certificate authority when Gmail
07:11
set up its server it would contact the
07:14
certificate authority maybe on the
07:15
phone or by email or something and say
07:17
look you know I want a certificate for
07:19
the DNS name gmail.com and the
07:24
certificate authority would sort of try
07:25
to verify that oh yes whoever's asking
07:28
for certificate really owns that name
07:30
it really is Google or whoever owns
07:32
gmail.com and if so the certificate
07:35
authority would provide a certificate
07:39
back to
07:40
gmail.com and basically what a certificate
07:43
contains is the name of the web server
07:50
the web server's public key and a
07:57
signature over this certificate made
08:01
with the certificate authority's private
08:04
key so this is sort of a self-contained
08:08
assertion checkable by checking the
08:11
signature an assertion by the
08:12
certificate authority that the public
08:15
key of gmail.com is really this public
08:18
key the gmail.com server would just keep a
08:21
copy of the certificate if you connect
08:23
to the gmail.com server with HTTPS the first
08:27
thing it does is sends you back this
08:28
certificate at this point is just a
08:32
certificate right now of course since
08:33
gmail.com is willing to give it to
08:35
anybody the certificate itself is
08:37
not at all private it's quite public
08:38
and then the browser would send some
08:42
information like a random number for
08:44
example to the server and ask it to sign
08:48
it with its private key and then the
08:53
browser can check using the public key
08:55
in the certificate that the random
08:57
number was really
08:59
signed by the private key that's
09:02
associated with the public key in the
09:04
certificate and therefore that whoever
09:05
it's talking to is really the entity
09:08
that the certificate authority believes
09:10
is gmail.com all right and now the
09:14
reason why this makes man-in-the-middle
09:15
attacks much harder is that yeah you
09:17
know you can set up a rogue server that
09:20
looks just like gmail.com and maybe you
09:23
can even hack the DNS system indeed you
09:25
still can if you're sufficiently clever or
09:27
powerful hack the DNS system to tell
09:32
people's browsers that oh they should go
09:34
to your server instead of gmail.com but
09:36
once somebody's browser contacts your
09:38
server
09:40
you're not presumably going to be able
09:42
to produce a certificate that works
09:46
you can produce Gmail's certificate
09:47
but then Gmail's certificate has Gmail's
09:50
public key your server doesn't have
09:51
their private key so you can't
09:53
sign the challenge the browser sent you
09:55
and presumably since you're not the real
09:58
Google and not the real Gmail you're not
10:01
going to be able to persuade a
10:01
certificate authority to give you a
10:03
certificate associating gmail.com with
10:06
your public key and so this
10:10
certificate scheme made
10:11
man-in-the-middle attacks quite a bit
10:13
harder and you know indeed they are
10:14
quite a bit harder now because of
10:16
certificates okay so it turns out though
10:21
that the certificate scheme as people
10:24
now have a lot of experience with it
10:27
almost 25 years experience with it so we
10:30
now know there's some kind of things
10:32
that go wrong it was originally imagined
10:34
that there would just be a couple of
10:35
trustworthy certificate authorities who
10:38
would do a good job of checking that
10:40
requests really came from who they
10:42
claimed to come from that if somebody
10:43
asked for a certificate for gmail.com
10:45
that the certificate authorities would
10:46
indeed actually verify that the
10:49
request came from the owner of gmail.com
10:50
and not hand out certificates to random
10:53
people for gmail.com but that turns
10:57
out to be very challenging for google
11:00
maybe the certificate
11:02
authority can convince itself that a
11:04
request comes from Google but you know
11:06
for just x.com it's very hard to have a
11:09
certificate authority reliably able to
11:11
say oh yeah gosh this request really
11:14
came from the person who really does own
11:16
the DNS name x.com all right a worse
11:20
problem is that while originally it
11:23
was envisioned there'd be only a few
11:25
certificate authorities there are now
11:26
literally hundreds of certificate
11:28
authorities out there and any
11:30
certificate authority can generate a
11:33
certificate for any name and indeed may
11:38
want to you're allowed to change
11:39
certificate authorities if you're a
11:40
website owner you can change certificate
11:42
authority to whoever you like so there's
11:46
no sense in which certificate
11:48
authorities have limits on their powers
11:49
they can any certificate authority can
11:51
produce any certificate and now browsers
11:56
have you know there's a couple hundred
11:57
certificate authorities and that means
11:59
that each browser has built into it like
12:00
Chrome or Firefox or something has built
12:03
into it a list of the public keys of all
12:05
the certificate all couple hundred certificate
12:07
authorities and if any of them
12:09
has signed a certificate produced by a web
12:11
server the certificate's acceptable the
12:16
result of this is that there have been
12:18
multiple incidents of certificate
12:21
authorities producing bogus certificates
12:23
that is producing certificates that said
12:27
they were certificates for Google or
12:28
Gmail or some other real company but
12:31
were actually issued to someone totally
12:34
else that is a certificate issued
12:37
for one of Google's names but not issued
12:40
to Google issued to someone else like
12:44
and you know sometimes this happens just
12:47
by mistake because the certificate authority
12:50
doesn't realize that they're doing the
12:52
wrong thing and sometimes it's actually
12:53
quite malicious I mean there have
12:55
certainly been certificates issued to
12:57
people who just wanted to snoop on
12:59
people's traffic and mount
13:01
man-in-the-middle attacks and did
13:02
mount man-in-the-middle attacks today's
13:05
readings mention a couple of these
13:07
incidents and they're particularly
13:09
troubling because they're hard to
13:12
prevent because there's so many
13:13
certificate authorities and not all of
13:15
them
13:16
oh sorry the question was what was
13:19
the last line in the cert box it's a
13:21
signature over the certificate by the
13:23
certificate
13:25
authority using the certificate
13:27
authority's private key okay so there
13:32
have been incidents of bogus
13:33
certificates certificates for real
13:35
websites like Google issued to totally
13:38
the wrong people and those certificates
13:40
have been abused and it's not clear how
13:43
to fix the certificate authority system
13:45
itself to prevent them because there's
13:47
so many certificate authorities and they
13:50
really you just can't expect that
13:54
they're going to be completely reliable
13:55
so what can we do about this one
14:00
possibility would be to have a single
14:03
online database of all valid
14:05
certificates so that when a browser
14:07
you know a browser contacts a website the web
14:09
site hands it a certificate that you know
14:11
might or might not be valid then maybe you
14:13
could imagine the browser would contact
14:15
the global valid certificate database
14:18
and ask is this really a valid certificate or
14:20
a bogus certificate issued by a rogue
14:24
certificate authority the problem is
14:28
there's many problems with that approach one
14:32
is it's still not clear how you can how
14:36
anybody can distinguish valid correctly
14:38
issued certificates from bogus
14:40
certificates because typically you just
14:42
don't know who the proper owner of a DNS
14:44
name is furthermore you need to
14:47
allow certificate owners to change
14:49
certificate authorities or renew their
14:51
certificates or they may lose their
14:52
private key and need a new certificate
14:54
to replace their old certificate
14:57
using a new public/private key pair so
15:00
people's certificates change all the
15:02
time and finally even if technically it
15:05
were possible to distinguish correct
15:07
certificates from bogus ones
15:10
there's no entity that everybody would
15:12
trust to do it you know everybody in the
15:14
world you know the Chinese the
15:15
Iranians the Americans you know there's
15:18
not any one outfit that they all trust
15:21
and that's the root reason why there's
15:23
so many certificate authorities so we
15:26
really can't you really can't expect
15:29
there to be a single Clearing House that
15:31
accurately distinguishes between valid
15:33
and invalid certificates however what
15:38
certificate
15:40
transparency is doing is
15:42
essentially trying to do the best that
15:47
it's possible to do you know the longest
15:51
step it can towards a database of
15:54
valid trustworthy certificates so now
15:59
I'm gonna give an overview of the
16:02
general strategy of certificate
16:04
transparency the style of certificate
16:10
transparency is that it's an audit
16:13
system because it's so hard hard to
16:18
impossible to just decide does this
16:21
person own a name a certificate
16:23
transparency isn't building a system
16:25
that prevents bad things from happening
16:27
which would require you to be able to
16:29
detect right away that a
16:32
certificate was bogus instead
16:35
certificate transparency is going to
16:37
enable audit that is it'll it's a system
16:42
to cause all the information to be
16:44
public so that it can be inspected by
16:47
people who care that is it's gonna if
16:49
you know maybe people it'll still allow
16:51
people to issue bogus certificates but
16:53
it's gonna ensure those certificates are
16:55
public and that everybody can see them
16:58
including whoever it is that owns the
17:01
name that's in the bogus
17:06
certificate and so this fixes the
17:07
problem with the pre certificate
17:10
transparency system where certificate
17:12
authorities could issue bogus
17:13
certificates and no one would ever know
17:15
and they could even give them to
17:19
a few victim browsers who would be
17:21
tricked by them and still because
17:23
certificates aren't generally public
17:24
a certificate
17:28
authority could issue a bogus
17:30
certificate for anybody for Google or
17:32
Microsoft and Google or Microsoft might
17:34
never realize it and the incidents that
17:35
have come to light have generally been
17:37
discovered only by accident not because
17:41
they were sort of foredoomed to be
17:43
discovered so instead of relying on
17:46
accidental discovery of bogus
17:48
certificates certificate transparency
17:50
is going to sort of force them into
17:51
the light where it is much easier to
17:54
notice them again so it has a sort of
17:57
audit flavor and not a prevention
17:59
flavor okay so the basic structure again
18:04
we have gmail.com or some other service
18:08
that wants a certificate as usual
18:11
they're gonna ask some one of the
18:12
hundreds of CAs for a certificate
18:15
when the web server's first set
18:18
up so it's gonna ask for a certificate and
18:21
the certificate authority is gonna send
18:23
this certificate back to the web server
18:26
because of course it's the web server that
18:28
gives the certificate to the browser and
18:32
at the same time though the certificate
18:34
authority is going to send a copy of the
18:36
certificate or equivalent information to
18:41
a certificate
18:43
transparency log server in
18:46
the real system there's multiple
18:48
independent certificate transparency
18:50
log servers I'm gonna assume there's just
18:52
one so this is some service that you
18:55
know it turns out we're not
18:56
gonna have to trust the certificate
19:00
authority's gonna send its certificate to
19:01
this certificate log service which has
19:04
been maintaining a log of all issued
19:08
certificates or all ones that
19:10
certificate authorities have told it
19:12
about when it gets a new certificate
19:13
it's gonna append it to its log so this
19:17
you know might have millions of
19:18
certificates in it after a while now
19:22
when the browser and some human wants to
19:26
talk to a website they you know they
19:29
set up an HTTPS connection to
19:32
Gmail Gmail sends them a certificate
19:33
back and the browser's gonna send that
19:38
certificate to the certificate log
19:40
server see is this certificate in the
19:42
log the certificate log server's gonna
19:46
say yes or no is the certificate in
19:48
the log now and if it is then the
19:50
browser will go ahead and use it now the
19:53
fact that it's in the log you know
19:55
doesn't mean it's not bogus right
19:56
because any certificate authority
19:58
including the ones that are out there
20:00
that are malicious or badly run any
20:03
certificate authority can insert a
20:06
certificate into the log system and
20:09
therefore perhaps trick users into using
20:13
it so so far we haven't built a
20:14
system that prevents abuse however it is
20:20
the case that no browser will use a
20:22
certificate unless it's in the log so at
20:25
the same time Gmail is going to run
20:29
what the CT system calls a monitor and
20:34
for now we'll
20:36
just assume that there's a monitor
20:37
associated with every website so this
20:39
monitor periodically also talks to the
20:44
certificate log servers and asks please
20:47
give me a copy of your log or really you
20:49
know please give me a copy of whatever
20:51
new has been added to your log since I
20:52
last asked and that means that the
20:54
monitor is going to build up it's going
20:55
to be aware of every single certificate
20:58
that's in the
21:00
log and but also because the monitor is
21:03
associated with Gmail the monitor knows
21:05
what Gmail's correct certificate is so
21:10
if some rogue certificate authority
21:12
issues a certificate for Gmail that's not
21:14
the one that Gmail itself asked for then
21:18
Gmail's monitor will stumble across it
21:20
in the certificate log because Gmail's
21:24
monitor knows Gmail's correct
21:26
certificate now of course the rogue
21:29
certificate authority doesn't have to
21:30
send its certificate to the certificate
21:32
log system but in that case when
21:34
browsers you know maybe accidentally
21:37
connect to the attacker's web server and
21:40
the attacker's web server gives
21:42
them the bogus certificate if they
21:43
haven't put it in the log then the
21:45
browser won't believe it and will abort
21:47
the connection because it's not
21:48
in the log
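The flow just described can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not the real CT protocol: `CertLog`, `submit`, `contains`, `entries_since`, and `browser_accepts` are hypothetical names, certificates are plain strings, and the log server is assumed to be honest.

```python
class CertLog:
    """Toy append-only certificate log (hypothetical; assumed honest here)."""

    def __init__(self):
        self._entries = []

    def submit(self, cert):
        # Any CA, honest or rogue, can append a certificate to the log.
        self._entries.append(cert)

    def contains(self, cert):
        # What a browser asks: is this certificate in the log?
        return cert in self._entries

    def entries_since(self, index):
        # What a monitor asks: everything added since it last looked.
        return self._entries[index:]


def browser_accepts(log, cert):
    # The browser uses a certificate only if the log reports it as present.
    return log.contains(cert)


log = CertLog()
real_cert = "gmail.com|key=GOOGLE"
logged_bogus = "gmail.com|key=ATTACKER"      # rogue CA that does log its cert
unlogged_bogus = "gmail.com|key=ATTACKER2"   # rogue CA that skips the log
log.submit(real_cert)
log.submit(logged_bogus)
```

In this sketch the browser rejects `unlogged_bogus` and accepts `logged_bogus`, which is the point of the design: the log does not prevent a bogus certificate, it only guarantees that a usable one is visible to whoever is watching the log.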
21:49
so the log sort of forces because
21:53
browsers require certificates be in the
21:55
log the log forces all certificates to
21:58
be public where they can be audited and
22:00
checked by monitors who know what the
22:03
proper certificates are and so some
22:05
monitors are run by big companies and
22:07
companies know their own certificates
22:10
some monitors are run by certificate
22:12
authorities on behalf of their customers
22:14
and again those certificate authorities
22:15
know what certificates they've issued to
22:17
their customers and they can at least
22:19
alert their customers if they see a
22:21
certificate they didn't issue for one of
22:23
their customers' names in addition
22:26
there's some totally third-party monitor
22:28
systems where you give the third-party
22:30
monitor your names and your
22:34
valid certificates and it checks for
22:37
unexpected certificates for your names
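A monitor's check might look roughly like the sketch below. It is a minimal illustration, assuming log entries are simple `(name, public_key)` pairs; the function name `find_unexpected` and the sample keys are made up for the example and are not part of any real CT tooling.

```python
def find_unexpected(log_entries, my_names, my_valid_certs):
    """Return logged certificates for our names that we never asked for."""
    unexpected = []
    for name, public_key in log_entries:
        # Only names we own matter; anything for them that is not on our
        # list of valid certificates is suspicious.
        if name in my_names and (name, public_key) not in my_valid_certs:
            unexpected.append((name, public_key))
    return unexpected


# Hypothetical log contents fetched from a log server.
log_entries = [
    ("gmail.com", "GOOGLE-KEY"),
    ("example.org", "EXAMPLE-KEY"),
    ("gmail.com", "ATTACKER-KEY"),
]
alerts = find_unexpected(
    log_entries,
    my_names={"gmail.com"},
    my_valid_certs={("gmail.com", "GOOGLE-KEY")},
)
```

Here `alerts` would contain only the `gmail.com` entry with the attacker's key; certificates for other people's names are ignored because this monitor has no way to judge them.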
22:41
alright this is the overall scheme but
22:47
it depends very much on browsers seeing
22:51
the very same log contents that monitors
22:54
see and but remember we were up against
22:59
this problem that we're not sure that we
23:00
can trust any component in this system
23:02
so indeed we found the certificate
23:04
authorities some of them are malicious
23:06
or have employees who can't be trusted
23:07
or are sloppy and don't follow the rules
23:11
so we're going to assume we have to
23:13
assume that the same will be true of the
23:14
certificate log servers that some of
23:16
them will be malicious some of them may
23:18
conspire with rogue certificate
23:21
authorities and intentionally try to
23:23
help them issue bogus certificates some
23:27
of them may be sloppy some of them may
23:29
be legitimate but maybe some of their
23:31
employees are corruptible you pay
23:33
them a bribe and they'll do something
23:36
funny to the log delete something or add
23:38
something to it so what we need to build
23:41
is a log that even though the log
23:43
operator may be not cooperating not
23:47
trustworthy we can still be sure or at
23:50
least know if it's not the case that
23:52
browsers are seeing the same log contents
23:54
as monitors so if our browser uses a
23:56
certificate that was in the log the
23:59
monitor who owns that name will
24:01
eventually see it so what we need to do
24:05
is we need to build a log system that is
24:13
append-only so that it can't show a
24:16
certificate to a browser then delete it
24:20
before monitors see it so append-only
24:27
no Forks
24:28
in the sense that we don't want the log
24:33
system to basically keep two logs one of
24:36
which it shows to browsers and one of
24:38
which it shows to monitors so we need no
24:41
forks and the servers are untrusted we can't be
24:53
sure that the certificate servers are
24:56
correct so just to back up a bit the
25:02
critical properties we need for the log
25:05
system so larger than just the log servers
25:08
but the entire system of the log servers
25:10
plus the various checks is we have to
25:14
prevent deletion that is we need the
25:16
logs to be append only because if a log
25:19
server could delete items out of its log
25:24
then they could effectively show a bogus
25:26
certificate to a browser claim it's in
25:29
the log and maybe it is in the log at that
25:31
time the browser uses it but then maybe
25:34
the log server could delete
25:35
that certificate from its log so that by
25:38
the time the monitors came to look at
25:40
the log the bogus certificate wouldn't
25:42
be there so we need to have a system
25:44
that either prevents deletion or at
25:46
least detects if deletion occurred so
25:49
that's the sense in which the system
25:52
needs to be append-only and we also have
25:56
to prevent what's called equivocation or
26:00
rather we have to prevent forks or
26:02
equivalently equivocation
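To make the fork problem concrete, here is a small hypothetical sketch of a misbehaving server. `ForkingLogServer` and its methods are invented for illustration; the only point is that each view is perfectly append-only on its own, yet the two audiences see different logs.

```python
class ForkingLogServer:
    """Hypothetical malicious server keeping one log per audience."""

    def __init__(self):
        self.browser_view = []
        self.monitor_view = []

    def submit(self, cert, hide_from_monitors=False):
        # Both lists only ever grow, so each one looks append-only.
        self.browser_view.append(cert)
        if not hide_from_monitors:
            self.monitor_view.append(cert)

    def log_for(self, client_kind):
        # Equivocation: the answer depends on who is asking.
        if client_kind == "browser":
            return self.browser_view
        return self.monitor_view


server = ForkingLogServer()
server.submit("gmail.com|GOOGLE-KEY")
server.submit("gmail.com|ATTACKER-KEY", hide_from_monitors=True)
```

In this sketch a browser asking about the attacker's certificate finds it in the log, while Gmail's monitor never sees it, so an append-only check alone would not catch the attack.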
26:08
so you know it's
26:12
maybe the certificate log servers could
26:15
be implementing append-only logs but if
26:17
it implemented two different
26:21
append-only logs and showed one to
26:23
browsers and showed the other append-only
26:25
log to monitors then we could be in a
26:27
position where yeah you know the
26:30
log we showed
26:31
the browsers contains the bogus
26:33
certificate but the log we showed the
26:36
monitors doesn't contain the
26:39
bogus certificate so we have to rule out
26:42
equivocation too all without trusting the
26:45
servers so how can we do this now we're
26:50
getting into the kind of details that
26:53
the last of the assignments was talking
26:56
about the first step is this thing
27:00
called a Merkle tree and this is
27:05
something that's sort of that the log
27:08
servers are expected to build on top of
27:10
the log so the idea is that there's the
27:12
actual log itself which is a sequence of
27:14
certificates you know certificate one
27:17
certificate two presumably in the order
27:19
that the
27:24
certificates were added to the system
27:26
and there may be millions I'm just going to
27:28
assume there's a couple now it's gonna
27:33
turn out you know we don't want to have
27:35
the browser's have to download the whole
27:36
log and so we need tools so that we
27:40
can allow the logging system to
27:42
basically send trustworthy summaries or
27:48
unambiguous summaries of what's in the
27:50
log to the the browsers and I'll talk in
27:53
a bit about exactly what those
27:54
summaries are used for but the basic
27:57
scheme is that the log servers are gonna
28:03
use cryptographic hashes to sort of hash
28:07
up the complete set of records that are
28:10
in the log and produce a single
28:11
cryptographic hash which is typically
28:14
these days about 256 bits long so the
28:16
cryptographic hash summarizes the
28:19
contents of the log and the way
28:23
that's done is as
28:25
basically a tree structure of pairs of
28:28
hashes always hashing together pairs of
28:30
numbers at the zeroth level so I'm
28:35
gonna write H for a hash each one of
28:38
the log entries has a hash so we're
28:40
gonna have sort of at the base level we
28:42
have the hash of each log entry each
28:46
certificate and then we're going to hash
28:50
up pairs so that at the next level we're
28:55
gonna have a hash of this and
28:59
concatenated with this and a hash of
29:04
this concatenated with this these two
29:07
hashes and then at the top level sort of
29:12
what we're doing is hashing these
29:14
two the concatenation of these two
29:16
hashes and this single hash here is an
29:21
unambiguous sort of stand-in for the
29:26
complete log one of the properties of
29:28
these cryptographic hashes like sha-256
29:31
is that it's not feasible to find two
29:33
inputs to the hash function that produce
29:35
the same output and that means if you
29:37
tell somebody the output of the hash
29:39
function there's only one input you're
29:43
ever going to be able to find that
29:44
produces that output so if the log server
29:48
does hash up in this way the contents of
29:51
its logs only this sequence of these log
29:54
records will ever be able to produce
29:56
that hash we're guaranteed effectively that
29:59
the log server is not going to be able
30:02
to find some other log that produces the
30:05
same final tree hash as this sequence of
30:09
log entries all right so this is the
30:12
Merkle tree this is the sort of tree
30:14
hash that summarizes the entire log at
30:18
the top of the Merkle tree there's what we
30:23
will actually call a signed tree head
30:27
because in fact the log servers take
30:29
this hash this at the top of the tree
30:32
and sign it with their private key and
30:33
give that to clients to browsers and
30:36
monitors and the fact that they've
30:40
signed it means that they can't
30:42
disavow it later
30:43
it was really them that produced it so
30:45
that's you know just to be able to catch
30:47
lying log servers and so the point
30:53
here is that once a log server has
30:55
revealed a particular signed tree head to
30:59
a browser or monitor it's committed to
31:03
some specific log contents because it
31:05
won't be able to ever produce a
31:06
different log contents that produce the
31:08
same hash so these hashes really
31:10
function as kind of commitments okay so
31:14
this is what the Merkle
31:17
tree looks like for a particular log now
31:20
the third reading today sort of outlined
31:23
how to
31:25
extend the log how to add records to the
31:27
log for arbitrary numbers of Records I'm
31:32
just going to assume that the log always
31:34
grows by factors of 2 which is
31:37
impractical but makes it easier to
31:39
explain so that means that as
31:41
certificate authorities send in new
31:43
certificates to add to the log the log
31:45
server will wait until it has as many
31:48
new records as it has old records and
31:50
then produce another tree head and the
31:54
way it does that is it's gonna in order
31:56
to extend the log the log servers going
32:01
to wait until it has another four records and
32:02
then it's gonna hash them pairwise just
32:05
as before and then it'll produce a new
32:10
tree head that is the hash of the
32:14
concatenation of these two hashes and
32:21
this is the new tree head for the new
32:26
expanded log and so that means as time
32:28
goes on and a log server's log grows
32:33
longer and longer it produces sort of
32:34
a sequence of higher
32:37
and higher tree heads as the log grows
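The tree hash and the doubling step can be sketched as follows. This is a simplified illustration under the lecture's own assumption that the log size is always a power of two; the names `H` and `tree_head` and the sample entries are made up, the tree head is left unsigned, and real CT logs (RFC 6962) additionally prefix leaf and interior hashes with distinct bytes, which this sketch omits.

```python
import hashlib


def H(data: bytes) -> bytes:
    # SHA-256 gives a 256-bit (32-byte) digest.
    return hashlib.sha256(data).digest()


def tree_head(entries):
    """Merkle root of a list of byte strings whose length is a power of two."""
    assert entries and len(entries) & (len(entries) - 1) == 0
    # Level zero: the hash of each log entry.
    level = [H(e) for e in entries]
    # Each higher level hashes the concatenation of adjacent pairs.
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


old = [b"cert1", b"cert2", b"cert3", b"cert4"]
new = [b"cert5", b"cert6", b"cert7", b"cert8"]
old_head = tree_head(old)
new_head = tree_head(old + new)
```

With this construction the head of the doubled log equals `H(old_head + tree_head(new))`, so the old tree head survives as the left child of the new root, and changing any single entry changes the head.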
32:44
okay so this is the structure that we're
32:50
expecting log servers to maintain of
32:53
course who knows what they're actually
32:54
doing especially if they're malicious
32:57
but the protocol the certificate
32:59
transparency protocol sort of is written
33:01
you know as if the log server was was
33:03
actually doing this all right so what do
33:06
we need to do the point of these
33:08
Merkle trees is to use them to force log
33:14
servers to prove certain things about
33:16
the logs that is about the log
33:18
that they're maintaining we're going to
33:21
want to know what those those proofs
33:24
look like the first kind of proof
33:27
is what I'll call a proof of inclusion
33:33
and this is what a browser
33:40
needs when it wants to find out
33:42
if a certificate that has just been
33:44
given by a web server if that
33:46
certificate is really in the log it's
33:49
gonna ask the certificate it's gonna ask
33:54
the log server look here's a certificate
33:57
you know is it in your log and
33:59
the log server is gonna send
34:01
back a proof of actually not just that
34:05
the certificate is in the log but
34:07
actually where it is what its position
34:08
is in the log and of course the browser
34:14
wants this proof because it doesn't want
34:16
to use the certificate if it's not in
34:17
the log because if it's not in the log
34:19
then monitors won't see it and there's
34:21
no protection against the
34:23
certificate being bogus and it needs to
34:27
be a proof because we we can't afford to
34:33
let a malicious log
34:35
server change its mind we don't want to
34:37
take the log servers word for it because
34:39
then they might a malicious log server
34:40
might say yes and this proof is gonna
34:44
help us catch it you know if a log
34:46
server does lie these proofs are gonna
34:49
help us catch the fact that the log
34:50
servers lied and produce evidence that
34:54
the log server is malicious and should
34:56
be ignored from now on that's sort of
34:59
the ultimate sanction against the log
35:01
servers which is that the browsers actually
35:03
have a list of acceptable log servers
35:05
and these proofs would be part of the
35:10
evidence to cause one of the log servers
35:14
to be taken out of that list if it was
35:16
malicious okay so we need a proof we
35:18
want the log server to produce a proof
35:20
that a given certificate is in its log
35:24
so actually the first step is that the
35:29
browser asks the log server for the
35:31
current signed tree head so what the
35:35
browser's really asking is is this
35:37
certificate in the log that's summarized
35:41
by this current by this signed tree head
35:45
and the log server may lie about the
35:47
signed tree head right the browser asks it
35:49
for the current signed tree head and then
35:52
for a proof that the certificate is in
35:54
the log the log server could lie about
35:56
the signed tree head and we'll deal with
35:58
that we'll consider that later but for
36:01
now let's assume that the browser
36:06
has the correct signed tree head and is
36:09
demanding a proof okay so for simplicity
36:12
I'm just gonna explain how to do this
36:15
for a log with two records and it turns
36:16
out that extending that to a log with
36:18
with some higher power of two
36:21
records is relatively easy um so the
36:26
browser actually has a particular signed
36:27
tree head let's suppose the correct log
36:32
that sits under that signed tree head is
36:35
the two-element log a b for particular
36:39
certificates a and B and that means that
36:44
the correct
36:46
Merkle tree for that log has at
36:49
the bottom the hashes of a and b and
36:52
then the signed tree head is actually the
36:56
hash of the hash of a concatenated with the
37:01
hash of b so let's suppose this is
37:06
the signed tree head
37:09
that the log server actually gave to the
37:11
client of course the client doesn't this
37:16
client only knows this value this
37:20
final hash value doesn't actually know
37:21
what is in the log the proof if the if
37:26
the browser asked for a proof that a is
37:28
in the log then the proof that the log
37:33
server can return is simply the proof
37:38
of whether a is in the log is simply a's position in
37:42
the log and the hash of the other
37:50
element in the log so zero and the hash
37:55
of b
37:56
and that is enough information for a to
38:00
convince itself that for sorry for the
38:03
client to convince itself that a really
38:05
is at position zero because it can take
38:08
it knows the certificate it's interested
38:10
in it can hash it part of the proof was
38:13
the hash of the other element in this
38:16
lowest level hash so the browser
38:21
now knows H(a) and H(b) it can hash
38:23
them together can execute this hash and
38:26
see if the result is the same as the
38:27
signed tree head that it has and if it
38:29
is then that means that the
38:34
log server has actually produced a valid proof
38:35
that certificate a is at position b
38:39
sorry at position zero in
38:42
the log summarized by this signed tree
38:45
head and it turns out that in larger
38:50
larger logs you know if you're looking
38:55
for if you need a proof that a is really
38:57
here all you need is the sequence of
38:59
hashes of the other branch of each hash
39:05
up to the signed tree head that you have
39:07
so in a four-element log if you
39:11
need a proof that a is at position zero you
39:13
need this hash and then you need this
39:15
hash and if the log is bigger you know
39:17
eight elements then you also need this
39:19
hash assuming that you have the signed
39:22
tree head so you can take the element you
39:23
know and hash it together with each of
39:25
these other hashes and see if it's equal to
39:28
the signed tree head okay so if the
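The inclusion check described above can be sketched as follows. This assumes SHA-256, a power-of-two log, and plain pairwise hashing as in the lecture; `build_levels`, `inclusion_proof`, and `verify_inclusion` are hypothetical names, not part of any certificate transparency library.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(records):
    """All levels of the tree, leaf hashes first, root last (power-of-two logs only)."""
    levels = [[h(r) for r in records]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def inclusion_proof(records, index):
    """Log server side: the 'other' hash at each level on the way up to the root."""
    proof = []
    for level in build_levels(records)[:-1]:
        proof.append(level[index ^ 1])   # sibling of the current node
        index //= 2
    return proof

def verify_inclusion(cert, index, proof, head):
    """Browser side: rehash upward and compare with the signed tree head."""
    node = h(cert)
    for sibling in proof:
        # The position tells us whether our node is the left or right input.
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == head

log = [b"a", b"b", b"c", b"d"]
head = build_levels(log)[-1][0]
proof = inclusion_proof(log, 0)          # H(b), then H(H(c) || H(d))
assert verify_inclusion(b"a", 0, proof, head)
assert not verify_inclusion(b"x", 0, proof, head)
```

The browser needs only the certificate, its claimed position, the proof hashes, and the tree head it already holds. It never sees the rest of the log.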
39:32
browser asks is supposing the browser
39:34
asks whether X is in the log at position
39:37
zero well X isn't in the log right so
39:41
hopefully there's no easy way for the
39:44
log server to produce the proof that X
39:46
is in the log in position zero but
39:48
suppose the log server wants to lie and
39:50
it's in the position where it already
39:52
exposed a signed tree head for a log that
39:55
contains a and then b the browser doesn't
39:59
know it was a and b doesn't know what's in
40:01
the log and the log server wants to
40:03
trick the client the browser into
40:06
thinking that x is really
40:07
at position zero well it turns out that
40:11
in order to do that for this small
40:17
log the log server has to
40:20
produce some y it needs to find a
40:31
y such that if it takes the hash of x
40:36
concatenated with y and hashes that you know
40:38
the result is equal to the signed tree
40:41
head right because the client we're
40:44
assuming the client already has the signed
40:45
tree head we need to find some number
40:48
here that when hashed together with the
40:50
hash of x that the client's asking about
40:52
produces that same signed tree head well we
40:55
know the signed tree head or the
40:57
assumption is the signed tree head was
40:58
actually for some other log right
40:59
because we're trying to rule out the
41:00
possibility that the log server can give
41:04
you a signed tree head for one log but
41:06
then convince you that something else is
41:09
in that log that's not there so the signed
41:10
tree head really was produced from the
41:14
hashes of the records that really were
41:17
in the log and now we need and since you
41:22
know X is definitely different from a
41:24
that means the hash of X is different
41:26
from the hash of a and that means that
41:28
the log server needs to find two
41:32
different inputs to the hash function
41:35
that produced the same output and the
41:38
Assumption widely believed to be true
41:41
for practical purposes is that that's
41:43
not possible for cryptographic hashes
41:46
therefore if the signed tree head was
41:50
produced by hashing up one log then it
41:53
will not be possible to find these sort
41:56
of other hash values that would be
42:00
required to produce a proof that some
42:04
other element was in the log that wasn't
42:06
really there
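Written out for the two-record log, the argument looks roughly like this. The notation is an assumption for illustration: $H$ is the cryptographic hash, $\|$ is concatenation, and $\mathrm{STH}$ is the signed tree head the client already holds.

```latex
\[
\mathrm{STH} = H\bigl(H(a) \,\|\, H(b)\bigr)
\]
\[
\text{A forged proof that } x \text{ is at position } 0 \text{ needs some } y \text{ with }
H\bigl(H(x) \,\|\, y\bigr) = \mathrm{STH} = H\bigl(H(a) \,\|\, H(b)\bigr).
\]
\[
x \neq a \;\Longrightarrow\; H(x) \neq H(a) \ \text{(unless } x, a \text{ already collide)}
\;\Longrightarrow\; H(x) \,\|\, y \;\neq\; H(a) \,\|\, H(b).
\]
```

So the server would have to exhibit two different inputs to $H$ with the same output, which is exactly a collision, and that is assumed to be infeasible for a cryptographic hash.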
42:07
any questions about this about anything
42:18
interesting a nice thing about this is
42:20
that the proofs are the proofs consist
42:24
of just the sort of other hashes on the
42:27
way up to the root if there's n
42:29
certificates there's only log n other
42:32
hashes and so the proofs are reasonably
42:34
concise in particular they are much much
42:36
smaller than the full log and since you
42:39
know every browser that needs to connect
42:40
to a website is going to need one of
42:42
these proofs it's good if they're small
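To put rough numbers on "log n other hashes", here is a small sketch. It assumes a power-of-two log and 32-byte SHA-256 hashes, and the function name `proof_hashes` is made up for this example.

```python
HASH_BYTES = 32  # size of one SHA-256 hash (assumed hash function)

def proof_hashes(n_certs: int) -> int:
    """Number of sibling hashes in an inclusion proof for a power-of-two log."""
    assert n_certs > 0 and n_certs & (n_certs - 1) == 0
    return n_certs.bit_length() - 1      # exactly log2(n) for powers of two

for n in (2, 8, 1 << 20, 1 << 30):
    k = proof_hashes(n)
    print(f"{n} certificates: {k} hashes, {k * HASH_BYTES} bytes")
```

Under these assumptions a log of about a million certificates needs a 20-hash proof, which is 640 bytes.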
42:48
okay well this whole discussion was
42:51
assuming that the signed tree head the
42:55
um
42:58
browser had was the correct signed tree head if
43:04
the but no there's no immediate reason
43:07
to believe that the log server would
43:09
have given if the log server is malicious
43:11
and it wants to trick a client you know
43:13
why would it give the client the correct
43:14
signed tree head why doesn't it
43:16
just give it the signed tree head for
43:18
the bogus log that it wants to trick the
43:20
client into using so we have to be
43:24
prepared for the possibility that the
43:26
log server has cooked up a
43:28
completely different log for the browser
43:29
that's not like anybody else's log and
43:31
it just contains the bogus certificates
43:33
that a malicious log server wants to
43:36
trick this client into believing so what
43:43
do we do about that well it turns out
43:47
that this is at least in the first
43:50
instance this is totally possible
43:52
you know usually what's gonna happen
43:55
usually the way this will play out is
43:57
that we'd have some browser that was you
44:00
know seeing the correct logs until some
44:03
point in time when when somebody wanted
44:06
to attack it and you know you want the
44:10
browser to still be able to use all the
44:11
websites that it's ordinarily seeing
44:13
plus a sort of different log with bogus
44:18
certificates that the log server wants
44:21
to trick just that client just that
44:23
victim browser into using so now this is
44:25
called a fork attack or more broadly
44:31
equivocation and the reason why people
44:35
call this kind of attack
44:39
a fork attack is that if we just never
44:41
mind the Merkle tree for a moment if we
44:42
just consider the log usually the log
44:45
already has you know millions of
44:47
certificates in it and everybody's seen
44:50
the beginning part of the log then at
44:52
some point in time we want to attack we
44:57
want to persuade our victim to use some
45:00
bogus certificate B but we don't want to
45:04
show B to anybody else certainly not to
45:05
the monitor so we're gonna sort of cook
45:07
up this other log that sort of continues
45:10
as usual and contains new submissions
45:12
but definitely doesn't contain the bogus
45:14
certificate B and you know what this
45:18
looks like is a fork because both the
45:20
sort of main log that monitors are shown
45:23
is kind of off on one fork and then this
45:26
log we're cooking up especially to
45:28
trick a victim is a different fork this
45:31
is the construction that the malicious
45:33
log server would have to produce if it
45:35
wants to trick a browser into using a
45:37
bogus certificate and again these are
45:42
possible it's possible to do this at
45:45
least briefly with certificate
45:48
transparency
45:52
luckily though that's not the end of the
45:54
story and certificate transparency contains
45:57
some tools that make forks
46:01
much more difficult so the basic scheme
46:06
is that well this is the way
46:15
the system is sort of intended
46:16
to work the way certificate transparency is
46:18
intended to work but doesn't quite
46:20
what's going on here is that the the the
46:24
monitors and people who are not being
46:26
attacked are gonna see a
46:30
particular signed tree head let's say
46:32
signed tree head one which of course is gonna
46:33
change as the log extends and the victim
46:37
we know must see some other signed tree
46:39
head because this is a signed tree head
46:41
that is hashed over this bogus
46:44
certificate so it's guaranteed to be different
46:46
from the signed tree heads that the
46:48
malicious server is showing to monitors
46:51
if only the browsers and monitors could
46:53
compare notes they would maybe instantly
46:56
realize that they were seeing different
46:58
trees and all it takes is comparing you
47:00
know if we play our cards right all it
47:02
takes is comparing the signed tree heads
47:04
they've gotten from the log server to
47:06
realize wait a minute we're seeing
47:08
different logs now something's terribly
47:10
wrong so the critical thing we need to
47:15
do is have have the different
47:18
participants in the system be able to
47:21
compare signed tree heads and
47:24
certificate transparency has a provision
47:27
for this called gossip and the way it's
47:30
intended to work is that browsers well the
47:33
details don't really matter but what it
47:36
really amounts to is that all the
47:38
participants sort of drop off the recent
47:41
signed tree heads they've seen into a big
47:43
pool that they all inspect to try to
47:47
figure out if there are inconsistent signed
47:50
tree heads that clearly indicate
47:52
divergent logs that have forked so we're
47:55
going to gossip which really means
47:58
exchange
48:02
signed tree heads and compare them it turns
48:07
out that current certificate
48:09
transparency implementations don't do
48:12
this but they ought to and they'll
48:16
figure it out at some point
48:17
all right okay so the question is given
48:21
two signed tree heads how do we decide if
48:25
they're evidence that the log has been
48:27
forked the thing that makes this hard is
48:33
that even if a log hasn't been forked as
48:36
it's appended to new signed tree heads
48:40
will become current so you know maybe
48:42
signed tree head one was the legitimate
48:46
head of the log at this point and then some
48:47
more certificates are added and signed
48:50
tree head 3 becomes the correct head of
48:54
the log and then signed tree head four
48:55
etc so really what this gossip
48:59
comparison needs to do is distinguish
49:04
situations where one signed tree head
49:07
really describes a log that's a
49:09
prefix of the log described by another
49:11
signed tree head because this is the
49:13
legitimate situation where
49:15
these two signed tree heads are
49:17
different but the second one really does
49:20
subsume the first one we want to
49:21
distinguish that from two signed tree heads
49:24
that are different where neither
49:26
describes a log that's a prefix of the
49:28
other one's log we want to tell these two cases
49:31
apart this telling that situation apart
49:40
is the purpose of the consistency proof
49:44
the log or Merkle consistency proof that
49:47
the readings talked about so this is
49:49
the
49:52
log consistency proof
50:05
so the game here is that we're given two
50:08
signed tree heads H1 and H2 and we're
50:12
asking is H1 a prefix of H2 really it's
50:22
not these are hashes so it's
50:24
really asking about the log that the
50:26
hashes represent and you know we're
50:38
hoping the answer is yes and if the
50:40
answer's no that means that the log
50:41
server has forked us and is hiding something
50:43
from one party or the other okay well it
50:51
turns out that um as we as I mentioned
50:54
before as the log
50:57
grows the Merkle tree also grows and
50:59
what we see is a sequence of signed
51:03
tree heads each one as the log doubles in
51:11
size each one has as its left thing
51:14
let me draw in the actual hash functions
51:17
of this hash function is hashing up two
51:20
things the result of this hash function
51:24
is one of the inputs to the next signed
51:27
tree head the result of this hash
51:28
function is one of the inputs to the
51:30
next signed tree head and so we get this
51:34
kind of tree of signed tree heads all
51:42
right and any two signed tree heads if
51:45
they're legitimate you know if H1's
51:47
log is a prefix of H2's that means
51:49
that maybe this one's H 1 and this one's
51:50
H 2 and they're gonna have this
51:52
relationship you know if H1
51:55
is a prefix of H2 then they must have
51:57
this relationship where H2 was
51:59
produced by taking H1 hashing it
52:02
with some other thing and maybe hashing
52:04
that with some other thing until we get
52:06
to the point where we reach H2 and what that
52:10
means is that if a browser or monitor
52:14
challenges a log a log server to prove
52:20
that H1's log is really a prefix
52:23
of H2's log what the log server has to
52:27
produce is this sequence of the
52:31
other side of each of the signed tree
52:34
head hashes on the way from H1 to H2 and
52:39
this is the proof and then again you
52:43
know this is reminiscent of the
52:46
inclusion proofs then to check the proof
52:51
you need to take H1 hash it with
52:54
the first other thing you know hash that
52:57
along with the second other thing until
52:58
you get to the last one of these and
53:00
that had better be equal to H2 if it is
53:03
it's a proof that H2's log is an extension of H1's
53:08
log otherwise the log server has evidently
53:13
tried to fork you and again you know the
53:18
basis of this is that there's no other
53:22
you know h2 really isn't as supposing h1
53:25
isn't a prefix of h2 there's no way that
53:29
uh since h2 was created from some actual
53:34
log that's not the same as h1 there's no
53:36
way that the log server could cook up
53:40
these values that are required to cause
53:44
the hashes this sort of repeated hash of
53:47
H1 to equal H2 unless H2 really does encompass
53:50
H1 assuming that the cryptographic hash does
53:53
prevent you from finding two different
53:55
inputs that produce the same output
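The consistency check can be sketched the same way. This assumes SHA-256 and the lecture's simplification that the log only ever doubles, so each step from H1 to H2 hashes the running head with the head of the newly added half. The function names are illustrative, and the proofs real certificate transparency logs use for arbitrary log sizes are more involved.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def tree_head(records):
    """Root hash of a power-of-two log."""
    level = [h(r) for r in records]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def consistency_proof(records, old_size):
    """Log server side: the 'other' input at each doubling from old_size up to len(records)."""
    proof, size = [], old_size
    while size < len(records):
        proof.append(tree_head(records[size:2 * size]))
        size *= 2
    return proof

def verify_consistency(h1, proof, h2):
    """Checker side: repeatedly hash h1 with each proof element; must end at h2."""
    node = h1
    for other in proof:
        node = h(node + other)
    return node == h2

log = [b"a", b"b", b"c", b"d", b"e", b"f", b"g", b"h"]
h1, h2 = tree_head(log[:2]), tree_head(log)
proof = consistency_proof(log, 2)
assert verify_consistency(h1, proof, h2)

# A head from a forked log (different second record) cannot reuse the proof.
forked_h1 = tree_head([b"a", b"BOGUS"])
assert not verify_consistency(forked_h1, proof, h2)
```

As with inclusion proofs, the proof is only a handful of hashes, one per doubling between the two heads.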
54:02
alright ok so this is the log
54:06
consistency proof okay so the question
54:12
is who usually challenges the log server
54:14
so I'll actually talk about that in a
54:15
minute but it turns out that um both
54:19
browsers and monitors
54:25
well both browsers and monitors
54:28
challenge the log server it's
54:30
actually usually the browsers
54:31
challenging the log server that's the
54:33
most important thing but there's two
54:35
points in time at which you need to
54:36
challenge the log server to produce
54:37
these proofs and I'll talk about both of
54:41
them all right okay actually so the
55:00
first place at which one point at which
55:06
these proofs are used is for gossip as
55:07
part of gossip as I outlined and the the
55:11
scheme that's intended for gossip is
55:12
that browsers will periodically talk to
55:16
some central repository or some set of
55:18
central repositories and just contribute
55:22
to a pool of signed tree heads the signed
55:24
tree heads they've recently seen from the log
55:28
server and the browsers will also
55:30
periodically pull out random elements
55:34
signed tree heads that other browsers have
55:36
seen just at random they pull them out
55:37
of the pool and there'll be multiple of
55:39
these pools run by
55:41
different people so that if one of them
55:43
is cheating that will be proof against
55:46
that and then the browser will for
55:51
whatever just any random signed tree head
55:53
it pulls out of the pool it will ask
55:56
the log server to produce the log
55:59
consistency proof for that pair of signed
56:01
tree heads and you know if nobody's
56:03
cheating then it should always be easy
56:06
for the log server to produce you know
56:09
any consistency proof that's demanded of
56:12
it but if it's forked somebody suppose
56:15
the log server has forked somebody and given
56:18
them a signed tree head that really
56:19
describes a totally different log or
56:21
even a log that differs in one
56:22
element from the logs that everybody
56:25
else is seeing eventually that browser
56:27
will contribute that signed tree
56:30
head to the pool the gossip pool then
56:34
eventually somebody else
56:36
will pull that signed tree head out of
56:38
the pool and ask for a proof for you
56:41
know some other signed tree head that
56:42
presumably is on a different Fork and
56:43
then the log server will not be able to
56:46
produce the proof and since they're
56:49
signed since the signed tree heads are signed by
56:52
the log server that's just absolute
56:55
proof that the log server has forked two
56:59
of its clients presumably with intent to
57:02
reveal a bogus certificate to one of
57:05
them and hide it from the other okay but
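A toy model of the gossip audit, under several loud assumptions: SHA-256, doubling-only logs, no signatures (real signed tree heads carry the log server's signature, which is what turns a failed proof into evidence), and a hypothetical `LogServer` class that may secretly hold more than one log. `audit` returns the first pair of pooled heads for which the server cannot produce a valid consistency proof in either direction.

```python
import hashlib
import itertools

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def tree_head(records):
    level = [h(r) for r in records]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def verify_consistency(h1, proof, h2):
    node = h1
    for other in proof:
        node = h(node + other)
    return node == h2

class LogServer:
    """Hypothetical server; holding more than one log models a fork."""
    def __init__(self, *logs):
        self.logs = logs

    def prove(self, h1, h2):
        """Return a consistency proof from h1 to h2 if some single log contains both."""
        for log in self.logs:
            heads, size = {}, 1
            while size <= len(log):              # heads of every power-of-two prefix
                heads[tree_head(log[:size])] = size
                size *= 2
            if h1 in heads and h2 in heads and heads[h1] <= heads[h2]:
                proof, size = [], heads[h1]
                while size < heads[h2]:
                    proof.append(tree_head(log[size:2 * size]))
                    size *= 2
                return proof
        return None                              # no honest proof exists

def audit(pool, server):
    """Return a pair of pooled heads that cannot be proved consistent, or None."""
    for x, y in itertools.combinations(pool, 2):
        consistent = False
        for a, b in ((x, y), (y, x)):
            proof = server.prove(a, b)
            if proof is not None and verify_consistency(a, proof, b):
                consistent = True
        if not consistent:
            return (x, y)
    return None

main = [b"a", b"b", b"c", b"d"]
evil = [b"a", b"b", b"c", b"BOGUS"]              # fork shown only to the victim

honest_pool = [tree_head(main[:2]), tree_head(main)]
assert audit(honest_pool, LogServer(main)) is None

gossip_pool = honest_pool + [tree_head(evil)]    # the victim contributes its head
assert audit(gossip_pool, LogServer(main, evil)) == (tree_head(main), tree_head(evil))
```

The head of the shared two-record prefix is consistent with both forks. Only the pair of heads taken from different forks exposes the server.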
57:09
there's actually another place where it
57:11
turns out you need the these consistency
57:15
proofs not just during gossip but
57:18
actually also during the ordinary
57:19
operation of the browsers so the the
57:28
difficulty is that suppose you know
57:31
suppose the browser is kind of
57:33
seeing a consistent version of the log that's
57:36
the same as everybody else but then the log
57:39
server wants to trick it into using this
57:41
bogus certificate so the log server
57:48
sends it a signed tree head you know makes a
57:52
signed tree head that's different from
57:53
everybody else's that refers to a you know
57:56
malicious log that contains this bad
57:58
certificate and since it
57:59
doesn't want other people to notice
58:00
certainly doesn't want you know the
58:02
monitors to notice you know it cooks up
58:04
this other log that is what everybody
58:07
else is seeing all right so now the you
58:12
know the browser checks and sees you
58:16
know asks for an inclusion proof and the
58:18
log server will be able
58:20
to produce the inclusion proof because
58:21
this signed tree head that the browser has
58:23
really does refer to this bad log the
58:25
browser will go ahead and use this bogus
58:27
certificate and maybe get tricked and
58:30
give away the user's password
58:31
you know who knows what but depending on
58:36
the details of how browsers work we're
58:38
at risk of the next time the browser
58:40
which it doesn't realize anything's gone
58:42
wrong talks to the log server the log
58:44
server might then say you know there's a
58:46
new log with a bunch of new stuff on it
58:47
and here is the signed tree head
58:49
of the current log why don't you switch
58:52
to using that as your signed tree head and
58:54
so now if that were allowed to happen
58:59
then the browser now would have completely
59:01
lost the evidence that anything went
59:03
wrong because now the browser is using
59:04
the same tree as everybody else now it's
59:06
going to contribute this signed tree head
59:08
to the gossip pool it's all gonna look
59:10
good and we had this sort of brief evil
59:15
tree that was evil log that was revealed
59:17
evil log fork but if the browsers are
59:20
willing to accept a new signed tree head
59:22
then we can basically have the browser
59:25
forget about it so what we want is
59:30
this what we want is if
59:34
the log server shows a particular log
59:38
to the browser that the browser that
59:41
they can't trick the browser into
59:43
switching away from that log that is
59:46
that we want to be able to enforce that
59:48
the browser sees only strict extensions
59:52
to the log that it's seen already and
59:55
doesn't simply get switched to a log
59:57
that is not compatible with the log the
60:00
browser seen before it's the property
60:01
that we're looking for it's actually
60:03
called fork consistency and what it
60:12
refers to is that if the browser's been
60:14
forked onto a different fork from other
60:16
people then they must stay on that fork
60:18
and it should never be able to switch
60:22
to the main fork and the reason for that
60:25
is we want to preserve you need to
60:27
preserve this bad signed tree head and its
60:29
successors so that when the browser
60:33
participates in the gossip protocol it's
60:36
contributing signed tree heads that nobody
60:41
else has and that cannot be proved to be
60:44
compatible using the log consistency
60:46
proof okay so how do we achieve fork
60:48
consistency well um it's actually easy
60:52
with the tools we have now every time
60:53
the log server tells a browser oh here's
60:56
a new signed tree head for a longer log
60:58
the browser will require the will not
61:01
accept the new signed tree head until the
61:04
log server has produced a log
61:08
consistency proof that the new signed tree
61:10
head describes an extension of the old signed
61:15
tree head that is that the log of the old
61:17
signed tree head is a prefix of the log of the
61:19
new signed tree head and of course if a log
61:21
server has forked the browser and it's
61:24
keeping the browser on that same Fork it
61:26
can produce the proofs but of course you
61:28
know it's digging its grave even deeper
61:30
because it's producing more and more
61:33
signed tree heads for a fork which will
61:35
eventually be caught by the gossip
61:37
protocol whereas if the log server
61:40
tries to cause the browser to switch to
61:43
a signed tree head that describes the same
61:45
log everybody else has been seeing the
61:48
browser will demand a consistency proof
61:50
and the log server will not be able to
61:52
produce it because indeed the log
61:55
described by the first signed tree head is
61:57
not a prefix of the log described by the
62:00
second signed tree head
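The browser-side rule can be sketched as a small state machine. Assumptions: SHA-256, doubling-only logs, and a hypothetical `Browser` class whose `offer` method models the log server proposing a new signed tree head along with a consistency proof. Signature checking is omitted.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def tree_head(records):
    level = [h(r) for r in records]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

class Browser:
    """Remembers the last tree head it accepted and only moves to proven extensions."""
    def __init__(self, head):
        self.head = head

    def offer(self, new_head, proof):
        node = self.head
        for other in proof:              # rehash from the remembered head
            node = h(node + other)
        if node != new_head:
            return False                 # reject, keep the old head as evidence
        self.head = new_head
        return True

main = [b"a", b"b", b"c", b"d"]
evil = [b"a", b"b", b"c", b"BOGUS"]
more = [b"e", b"f", b"g", b"h"]

victim = Browser(tree_head(evil))        # victim was forked onto the evil log

# Switching the victim back to the main log fails: no proof can link the heads.
assert not victim.offer(tree_head(main + more), [tree_head(more)])
assert victim.head == tree_head(evil)

# Extending the evil fork works, but yields yet another head that gossip can catch.
assert victim.offer(tree_head(evil + more), [tree_head(more)])
```

Because a rejected offer leaves `head` unchanged, the victim keeps contributing heads from the bad fork to the gossip pool.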
62:05
okay okay so the system these these log
62:11
consistency proofs provide fork
62:13
consistency and fork consistency plus
62:15
gossiping and requiring these log
62:20
consistency proofs for the signed tree heads found
62:23
by gossiping
62:24
the two of them together make it
62:27
likely that all the participants are
62:31
seeing the same log and that if they're
62:33
not seeing the same log they'll be able
62:34
to detect that fact by the failure of a
62:38
log consistency proof
62:45
any questions
62:53
okay so how many log servers are
62:58
there that is a great question
62:59
so I describe the system as if there was
63:02
just one log server it turns out in the
63:03
real system there's lots of log servers
63:05
at least dozens so this is a deployed
63:07
system which you can programmed in that
63:09
is actually used by Chrome and I think
63:12
Safari there are at least dozens of
63:15
these log servers
63:17
and certificate authorities are actually
63:19
required by chrome to submit all their
63:21
certificates to the to the log servers
63:25
to multiple log servers the different
63:29
log servers don't actually keep
63:30
identical logs the convention is that a
63:32
certificate authority will submit a new
63:34
certificate to say you know a couple
63:37
maybe five different log servers and
63:41
actually in the certificate information
63:44
that a website tells your browser it
63:46
includes the identities of log servers
63:50
of the certificate transparency log
63:52
servers that have the certificate in
63:54
their log so your browser knows which
63:56
log servers to talk to and the reason
64:01
why there's more than one of them is of
64:03
course some of them may go bad some of
64:05
them may turn out to be malicious or go
64:06
out of business or who knows what and in
64:09
that case you still want to have a
64:10
couple more to fall back on they don't
64:15
have to be identical because they don't
64:17
as long as the certificate is in at
64:20
least one log that's you know as far as
64:23
anybody knows is trustworthy that's
64:25
sufficient because you know the issue
64:32
here
64:33
not really necessarily the fact that the
64:36
log had the certificate in it because
64:37
that's not proof that the certificate is
64:40
good all we're looking for is log
64:43
servers that aren't forking the monitors
64:47
and browsers that use them so it's
64:50
enough for a certificate to be in even a
64:52
single log server that's not forking
64:56
people because then the monitors are
64:58
guaranteed to see it because the
64:59
monitors check all the log servers so if
65:04
a bogus certificate shows up in even a
65:06
single log server the monitors will
65:07
eventually notice because all the
65:10
monitors look at all the log servers
65:15
that the browsers are willing to accept
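A rough sketch of what a monitor does across several logs. Everything here is an assumption made for illustration: log entries are modeled as `(dns_name, fingerprint)` pairs, the log names are made up, and `suspicious` is a hypothetical helper. A real monitor would fetch and parse actual certificates and verify the logs' proofs.

```python
def suspicious(logs, domain, known_fingerprints):
    """Return (log_name, fingerprint) for every entry naming `domain` that the owner did not request."""
    found = []
    for log_name, entries in logs.items():       # check every acceptable log
        for name, fingerprint in entries:
            if name == domain and fingerprint not in known_fingerprints:
                found.append((log_name, fingerprint))
    return found

# Toy data: the logs need not be identical, and a certificate may appear in only one.
logs = {
    "log-1": [("mit.edu", "fp-good"), ("example.com", "fp-1")],
    "log-2": [("example.com", "fp-1")],
    "log-3": [("mit.edu", "fp-bogus")],
}

assert suspicious(logs, "mit.edu", {"fp-good"}) == [("log-3", "fp-bogus")]
```

Since the monitor scans every log that browsers accept, a bogus certificate is found even if it was submitted to only one of them.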
65:18
all right another question what prevents
65:22
a log server from going down and issuing
65:25
bogus certificates before they get
65:28
caught you know nothing actually if
65:31
you're willing to that's definitely a
65:34
defect in the system that at least for a
65:36
while you can have a
65:38
malicious log server contributing bogus
65:43
certificates so if you have a
65:44
certificate authority that's become
65:47
malicious and is issuing bogus
65:49
certificates they look correct but
65:51
they're bogus and a log server then that
65:59
that's willing to serve these it's
66:00
willing to put these certificates in the
66:01
log and of course they all are then at
66:04
least for a while browsers will be
66:05
willing to use them the thing is though
66:07
that the you know they will be caught
66:09
and the system's intent is
66:12
to improve the situation in the
66:15
pre-certificate-transparency system if
66:17
somebody was issuing bogus certificates
66:19
and browsers were being tricked into
66:21
using them you might never find out ever
66:23
in the certificate transparency world
66:26
you may not find out right away and so
66:28
some some people may use them but then
66:31
relatively quickly you know a few days
66:32
or something the monitors will start to
66:35
notice that there's bad certificates in
66:37
the logs and somebody will go and track
66:39
it down and figure out who is malicious
66:41
or who is making mistakes
66:52
yeah so I guess a certificate a
66:56
certificate transparency log could
66:58
refuse to talk to the monitors yeah I'm
67:02
not sure I think ultimately the if you
67:08
know we're now treading into a kind of
67:09
non-technical region you know what to do
67:11
if there's evidence that something's
67:13
gone wrong this is actually quite hard
67:15
because much of the time if something
67:18
seems to go wrong even bogus
67:20
certificates often often the reason it's
67:22
just somebody made a mistake it was a
67:24
legitimate mistake you know somebody
67:26
blew it and it's not evidence of malice
67:28
is just that somebody made a mistake I
67:31
think what would happen if a log server was
67:33
misbehaving in almost any way like not
67:35
answering requests if it was doing it
67:37
consistently people would notice and either
67:41
ask them to shape up or take them out of
67:43
the list
67:44
stop using them the browser vendors
67:46
would take that log server out of the list
67:48
of acceptable log servers after a while
67:50
but yeah there's like a gray area of bad
67:53
behavior that's not bad enough to
67:56
warrant being taken out of the
67:57
acceptable list I think if a log server
68:00
has been found to fork the question is
68:01
what if the log server has been found
68:03
to fork what happens then I think I think
68:07
what would happen is the people who were
68:09
run you know the people who the browser
68:11
vendors would talk to the log server and
68:15
ask them the people running the log
68:17
server and ask them what happened and if
68:19
they came up with a convincing
68:21
explanation that they just made a
68:22
mistake you know which maybe they
68:24
couldn't maybe I don't know they their
68:26
machine crashes it loses part of their
68:28
log they restart you know starting from
68:31
a prefix of the log and start growing a
68:34
different log if it seems like a mistake
68:37
honest mistake then well it was a
68:41
mistake but if it if the log server
68:44
operators can't provide a convincing
68:46
explanation of what happened then I
68:48
think the browser vendors would just
68:49
delete them from the list of acceptable
68:53
log servers okay but these are you know
69:02
these are sort of problems with the
69:06
system because you can you know the
69:09
definitions of like who owns a name or
69:11
what's acceptable or you know whether
69:13
it's okay for your server to be down or
69:14
not these are very hard to pin down
69:18
properties you know I think the system
69:24
is not foolproof you could definitely get
69:26
away with bad behavior at least for a
69:28
while but the hope is that there's
69:32
strong enough auditing here that if some
69:35
certificate authority or log server was
69:39
persistently badly behaved that people
69:42
would notice the monitors would notice
69:44
they may not do anything for a while but
69:45
eventually they would decide that you
69:50
know you're either too much of a pain or
69:52
too malicious to be part of the system
69:54
and delete you from the browser lists. Of
69:58
course this puts the browser vendors in
69:59
a position of quite strong power, so while
70:03
the system is in general pretty
70:04
decentralized yeah there can be lots of
70:06
certificate authorities and lots of
70:08
certificate transparency log servers
70:10
there's only a handful of browser
70:12
vendors, and because they
70:15
maintain the lists of acceptable
70:17
certificate authorities and log servers
70:21
they do have a lot of power and you know
70:26
it's the way it is unfortunately okay so
70:31
things to take away from the certificate
70:35
transparency design so one thing is the
70:38
key property it has, which is super important, is
70:40
just that everyone sees the same log
70:43
even if some of the parties are
70:46
malicious either everyone sees the same
70:48
log or they can accumulate evidence
70:50
from failed proofs that something
70:53
funny is going on and because both
70:55
browsers who are using those
70:56
certificates and the owners of the DNS
70:59
names who are running monitors see the
71:01
same log because of these proofs
71:05
the monitors can detect problems and
71:08
therefore the browsers, even though the
71:10
browsers can't actually detect bogus
71:11
certificates they can at least be
71:13
confident that if there are bogus
71:14
certificates out there that monitors
71:16
will detect them and possibly put them
71:19
on revocation lists actually that's
71:20
something I didn't mention: if
71:23
a monitor spots what must be
71:26
a bogus certificate like MIT sees
71:29
somebody they don't know about being
71:32
issued a certificate for mit.edu, it
71:34
turns out there's a pre-existing
71:35
revocation service that you can put bad
71:38
certificates on that the browsers check
71:41
so if a monitor sees a bogus certificate
71:44
it can actually be effectively disabled
71:46
by putting it in the
71:49
certificate revocation system that's not
71:51
part of certificate transparency it's
71:53
been around for a long time okay so the
71:57
key property is everyone sees the same
71:58
log of certificates another thing to
72:02
take away from this is that if you can't
72:04
figure out a way to prevent bad behavior
72:07
maybe you can build something that's
72:10
usable that relies on auditing instead
72:13
of preventing, that is, one that can detect bad
72:16
things after the fact that might be good
72:19
enough it's often much easier than
72:21
preventing the bad things some technical
72:24
ideas are here in this work. One is
72:27
this idea of equivocation, that a big
72:30
danger is the possibility that a
72:33
malicious server will sort of provide
72:35
split views, one view to one set of people,
72:38
another view to another set of people
72:39
it's usually called a fork or
72:42
equivocation it's an important kind of
72:43
attack. Another property is this fork
72:46
consistency property it turns out it's
72:48
often valuable to when you're worried
72:50
about forks to build a system that
72:52
forces the malicious server once it has
72:55
forked somebody to keep them on that
72:57
fork so it can't erase evidence by
73:00
erasing a fork. And the final technical
73:03
trick is the notion of gossiping in
73:06
order to detect forks, because
73:08
in general, if the participants don't
73:10
communicate with each other it's
73:13
actually typically not possible to
73:14
notice that there has been a fork so if
73:17
you want to detect forks there has to be,
73:18
one way or another
73:19
some kind of gossip some kind of
73:22
communication between the parties so
73:23
they can compare notes and detect forks
73:26
and we'll see most of these things again
73:30
next week when we look at Bitcoin and
73:36
that's all I had to say
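
The fork, fork consistency, and gossip ideas from the takeaways above can be illustrated with a small sketch. This is only a toy model under stated assumptions: it uses a plain hash chain in place of the Merkle tree that real certificate transparency logs use, it omits signatures entirely, and the names `chain_head`, `tree_head`, and `gossip_detects_fork`, as well as the certificate strings, are hypothetical and chosen just for this example.

```python
import hashlib

def chain_head(entries):
    """Head hash of a hash chain: each step commits to all earlier entries."""
    head = b"\x00" * 32
    for entry in entries:
        head = hashlib.sha256(head + entry.encode()).digest()
    return head

def tree_head(entries):
    """Unsigned stand-in for a log's tree head: (log size, head hash)."""
    return (len(entries), chain_head(entries))

def gossip_detects_fork(head_a, head_b):
    """Two heads for the same log size with different hashes prove a fork."""
    size_a, hash_a = head_a
    size_b, hash_b = head_b
    return size_a == size_b and hash_a != hash_b

# A malicious log shows a browser and a monitor different views (equivocation).
shared = ["cert:example.com", "cert:example.org"]
view_browser = shared + ["cert:mit.edu (bogus)"]
view_monitor = shared + ["cert:other.example"]

# Gossip: the two parties compare the heads they were each shown.
forked = gossip_detects_fork(tree_head(view_browser), tree_head(view_monitor))

# Honest case: identical views give identical heads, so no fork is reported.
same = gossip_detects_fork(tree_head(shared), tree_head(shared))

# Fork consistency: appending the same later entries to both views cannot
# merge them, because every later head still depends on the differing entry.
later = ["cert:example.net"]
still_forked = gossip_detects_fork(
    tree_head(view_browser + later), tree_head(view_monitor + later)
)

print(forked, same, still_forked)
```

Without the comparison step neither party could tell it had been forked, since each view is internally consistent on its own; that is why some form of gossip is needed.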