QUESTIONS

LAST TIME:
 - Ontime quiz, design discussion: SiteRiter, Dave's SiteRiter.

TODAY:
 - Memory & pointers quiz results :(
 - More hashing; amortized O(1)
 - Quiz 2 return

PROJECT 1 ALL DONE! PROJECT BREAK THIS WEEK! DO EVERYTHING ELSE!

DEFINITION OF THE DAY

OUT-OF-BAND adj. [telecommunications]
 1. In software, describes values of a function which are not in its
    'natural' range of return values, but are rather signals that some
    kind of exception has occurred. Many C functions, for example,
    return a nonnegative value, and indicate failure with an
    out-of-band return of -1.
 ...
 3. In personal communication, using methods other than email, such as
    telephones or snail-mail.

ON-TIME QUIZ POST-MORTEM

BLECCH!
   n   = 42
   max = 9.0
   avg = 6.7  ( < 75/100 )
   min = 2.0

We Will Try This Again Next Week, in Ontime Quiz 3

   OntimeQuiz2FinalAdjusted = max(OntimeQuiz2,OntimeQuiz3)

(Note there will be no adjustments to OntimeQuiz3.)

ON-TIME QUIZ POST-MORTEM

1. (2 points) Draw a 'memory and pointers' diagram as of the point
   marked '/* HERE */' in the following program. Omit 'args'.

public class Test {
  public static void main(String[] args) {
    int x1 = 3;
    Integer x2 = x1;
    Integer x3 = x2;
    /* HERE */
    System.out.println(x1+x2+x3);
  }
}

  x1
+-----+
|     |
|  3  |
+-----+
                 x2
              +-------+      Integer
              |       |    +---------+
 x3           |  --------->|         |
+---------+   +-------+    |    3    |
|         |                |         |
|  ----------------------->+---------+
|         |
+---------+

ON-TIME QUIZ POST-MORTEM

2. (3 points) Draw a 'memory and pointers' diagram as of the point
   marked '/* HERE */' in the following program. Omit 'args'.
public class Test {
  int num = 6;
  Test(int n) { num = n; }
  public static void main(String[] args) {
    Test t1 = new Test(4);
    Test t2 = new Test(1);
    t1 = t2;
    /* HERE */
    System.out.println(t1.num+t2.num);
  }
}

  t1
+------+
|  -----\
|      | |
+------+ |
  t2     |
+------+ |   +--------+
|  -----\--->|        |
|      |     |   1    |
+------+     |        |
             +--------+

ON-TIME QUIZ POST-MORTEM

3. (4 points) Draw a 'memory and pointers' diagram as of the point
   marked '/* HERE */' in the following program. Omit 'args'.

public class Test {
  int a;
  Test b, c;
  Test(int d, Test e, Test f) { a = d; b = e; c = f; }
  public static void main(String[] args) {
    Test t1 = new Test(7,null,null);
    Test t2 = new Test(3,null,t1);
    t1.b = t2;
    t1 = t2.c.b;
    /* HERE */
    System.out.println(t1.a+" "+t2.a);
  }
}

        t1                    Test
        +--------+            +------+
        |        |            |  7   |
        |     |  |            |------|
        +-----|--+      /---------   |
              |         |     |------|
 t2     Test  v         |     |null  |
+----+  +------+<-------/     |      |
| ----->|  3   |              +------+
+----+  |------|                  ^
        |null  |                  |
        |------|                  |
        |  -----------------------/
        +------+

SITERITER DISCUSSION

 - What was harder/slower than you expected?
 - What was easier/faster than you expected?
 - What was the first bit of code you wrote?
 - What was the last bit of code you threw out?
 - Design issues:
   = Lexing, reading, marking, tokens
   = Parsing strategy
   = Symbol table design

Dave's choices:
   = Lexing: YES, reading: BY CHAR, marking: YES, tokens: String
   = Parsing strategy: Top down parser
   = Symbol table design: Map
       Symbol has ArrayList
       Sequence has ArrayList for tokens

[ Eclipse code peeks ]

CLASS HASHMAP - UPSHOTS

 - Associates arbitrary 'keys' and 'values'
 - Built on
   = a hash function, that converts key objects to (hopefully usually
     different) integer values, and on
   = an equality function to specify when two objects should be
     considered as the same key.
 - So, objects used as keys must provide two essential properties:

   P1: boolean equals(Object other)
         Returns true iff the 'other' object is equal to the 'this'
         object in all ways that matter.
   P2: int hashCode()
         Returns a numeric code derived from the object 'this' such
         that for any Objects A and B,
           if A.equals(B), then A.hashCode() == B.hashCode()

 - WARNING: You can get way screwed if P1 or P2 isn't true!
 - When everything works: Approximately O(1) lookup! Sweet!

CLASS HASHMAP - DANGERS

import java.util.Map;
import java.util.HashMap;
class Test {
  private Map map = new HashMap();
  private class Key {
    int x, y;
    /*
    public int hashCode() { return x*1771+y; }
    */
    public Key(int x, int y) { this.x = x; this.y = y; }
    public boolean equals(Object o) {
      if (!(o instanceof Key)) return false;
      Key k = (Key) o;
      return k.x == x && k.y == y;
    }
  }
  private void insert(int x, int y, Object value) {
    map.put(new Key(x,y), value);
  }
  private Object lookup(int x, int y) {
    return map.get(new Key(x,y));
  }
  public static void main(String[] args) {
    Test t = new Test();
    t.insert(8,3,"boom");
    System.out.println(t.lookup(8,3));
  }
}

$ javac Test.java
$ java Test
null

CLASS HASHMAP - DANGERS

import java.util.Map;
import java.util.HashMap;
class Test {
  private Map map = new HashMap();
  private class Key {
    int x, y;
    public int hashCode() { return x*1771+y; }
    public Key(int x, int y) { this.x = x; this.y = y; }
    /*
    public boolean equals(Object o) {
      if (!(o instanceof Key)) return false;
      Key k = (Key) o;
      return k.x == x && k.y == y;
    }
    */
  }
  private void insert(int x, int y, Object value) {
    map.put(new Key(x,y), value);
  }
  private Object lookup(int x, int y) {
    return map.get(new Key(x,y));
  }
  public static void main(String[] args) {
    Test t = new Test();
    t.insert(8,3,"boom");
    System.out.println(t.lookup(8,3));
  }
}

$ javac Test.java
$ java Test
null

 - The default implementations in Object.hashCode and Object.equals
   mean
     (1) No compiler errors, but
     (2) Incorrect behavior.
   Thanks a lot, class Object!

HASH TABLES - HOW DO THEY REALLY WORK?
Getting an index from a name

IDEA: 'Hash' the name up into a reasonably small number to use as an
      array index, with a HASH FUNCTION

Upside:   Can use a reasonably small array
Downside: Have to deal with COLLISIONS: When two different names get
          hashed to the same number

Issues:
 - What hash function?
   -> Speed
   -> Spread
 - How to deal with collisions?
   -> 'Open addressing' - put collided entries somewhere else in the
      table
   +----------------------------------------------+
 ->|'Separate chaining' - make a linked list of   |
   | collided entries at each index in the array  |
   +----------------------------------------------+

HASH TABLES - SEPARATE CHAINING

import com.remain.always.MyHash;
class Whatever {
  public static void main(String[] args) {
    MyHash h = new MyHash();
    h.insert("foo",1);
    h.insert("bar",7);
    h.insert("bletch",2);
    h.insert("mumble",3);
    h.insert("chaining",0);
  }                            Inside MyHash somewhere..
}
     +----+     +--------+       +--------+
  [0]|    |     |"mumble"|  /--->|"bletch"|
     |----| /-->|--------| /     |--------|
  [1]|  ---/    |   ------/      |  null  |
     |----|     +--------+       +--------+
  [2]|    |
     |----|     +----------+     +------+       +-------+
  [3]|  ---\    |"chaining"|     |"bar" |  /--->|"foo"  |
     |----| \-->|----------| /-->|------| /     |-------|
  [4]|    |     |    -------/    |   ----/      | null  |
     +----+     +----------+     +------+       +-------+

HASH TABLES - SEPARATE CHAINING

import com.remain.always.MyHash;
class Whatever {
  public static void main(String[] args) {
    MyHash h = new MyHash();
    h.insert("foo",1);         class MyHash { ...
    h.insert("bar",7);           int hash(String s) { return s.length(); }
    h.insert("bletch",2);        // ^^^ AWFUL HASH FUNCTION! ^^
    h.insert("mumble",3);        .. };
    h.insert("chaining",0);
  }                            Inside MyHash somewhere..
}                              (Wow, 5 inserts, 3 collisions!
     +----+     +--------+       +--------+      How did that happen?)
  [0]|    |     |"mumble"|  /--->|"bletch"|
     |----| /-->|--------| /     |--------|
  [1]|  ---/    |   ------/      |  null  |
     |----|     +--------+       +--------+
  [2]|    |
     |----|     +----------+     +------+       +-------+
  [3]|  ---\    |"chaining"|     |"bar" |  /--->|"foo"  |
     |----| \-->|----------| /-->|------| /     |-------|
  [4]|    |     |    -------/    |   ----/      | null  |
     +----+     +----------+     +------+       +-------+

HASH TABLES - SEPARATE CHAINING

import com.remain.always.MyHash;
class Whatever {
  public static void main(String[] args) {
    MyHash h = new MyHash();
    h.insert("foo",1);         class MyHash { ...
    h.insert("bar",7);           int hash(String s) { return s.length(); }
    h.insert("bletch",2);        // ^^^ AWFUL HASH FUNCTION! ^^
    h.insert("mumble",3);        .. };
    h.insert("chaining",0);
  }                            Inside MyHash somewhere..
}                              (Wow, 5 inserts, 3 collisions!
     +----+     +--------+       +--------+      How did that happen?)
  [0]|    |     |"mumble"|  /--->|"bletch"|
     |----| /-->|--------| /     |--------|      Hmm, better store the
  [1]|  ---/    | 3 |  ---/      | 2 |null|      values here too.. Might
     |----|     +--------+       +--------+      want the hashcodes too..
  [2]|    |
     |----|     +----------+     +------+       +-------+
  [3]|  ---\    |"chaining"|     |"bar" |  /--->|"foo"  |
     |----| \-->|----------| /-->|------| /     |-------|   sample
  [4]|    |     | 0 |   ----/    |7 | ---/      |1 |null|   code
     +----+     +----------+     +------+       +-------+

HASH TABLES

Getting an index from a name

IDEA: 'Hash' the name up into a reasonably small number to use as an
      array index, with a HASH FUNCTION

Upside:   Can use a reasonably small array
Downside: Have to deal with COLLISIONS: When two different names get
          hashed to the same number

Issues:
 - What hash function?
   -> Speed
   -> Spread
 - How to deal with collisions?
   -> 'Open addressing' - put collided entries somewhere else in the
      table
   -> 'Separate chaining' - make a linked list of collided entries at
      each index in the array
 - What happens if our reasonably small table fills up?
   -> "With separate chaining, it never will!" Except for what?
   -> Need to rehash into a larger array

HASH TABLES - SEPARATE CHAINING

     +----+     +------------+       +------------+
  [0]|    |     |"mumble"| 6 |  /--->|"bletch"| 6 |
     |----| /-->|------------| /     |------------|
  [1]|  ---/    | 3 |  -------/      | 2 |  null  |
     |----|     +------------+       +------------+
  [2]|    |
     |----|     +-------------+     +---------+       +---------+
  [3]|  ---\    |"chaining"| 8|     |"bar" | 3|  /--->|"foo" | 3|
     |----| \-->|-------------| /-->|---------| /     |---------|
  [4]|    |     | 0 |  --------/    | 7 |  ----/      | 1 | null|
     +----+     +-------------+     +---------+       +---------+

   0 .
   1 .
   2 .
   3 .
   4 .
   5 .
   6 .
   7 .
   8 .
   9 .
  10 .

HASH TABLES - SEPARATE CHAINING

     +----+                          +------------+
  [0]|    |                     /--->|"bletch"| 6 |
     |----|                    /     |------------|
  [1]|  ----------------------/      | 2 |  null  |
     |----|                          +------------+
  [2]|    |
     |----|     +-------------+     +---------+       +---------+
  [3]|  ---\    |"chaining"| 8|     |"bar" | 3|  /--->|"foo" | 3|
     |----| \-->|-------------| /-->|---------| /     |---------|
  [4]|    |     | 0 |  --------/    | 7 |  ----/      | 1 | null|
     +----+     +-------------+     +---------+       +---------+

   0 .
   1 .
   2 .
   3 .
   4 .
   5 .      +------------+
   6 .----->|"mumble"| 6 |
   7 .      |------------|
   8 .      | 3 |  null  |
   9 .      +------------+
  10 .
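
The rehash that these diagrams step through can be sketched in Java.
This is only an illustration under assumptions: ChainDemo, Node,
chain() and rehash() are made-up names, not the real MyHash, and the
deliberately awful length-based hash is kept so that the buckets come
out the same as in the pictures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for MyHash: a tiny separate-chaining table.
class ChainDemo {
    // One chain node: key, cached hash code, value, and next pointer.
    static final class Node {
        final String key;
        final int hash;
        int value;
        Node next;

        Node(String key, int hash, int value, Node next) {
            this.key = key;
            this.hash = hash;
            this.value = value;
            this.next = next;
        }
    }

    private Node[] buckets;

    ChainDemo(int capacity) {
        buckets = new Node[capacity];
    }

    // The awful hash function from the slides: many strings share a length.
    static int hash(String s) {
        return s.length();
    }

    // Insert or update; a new node goes on the front of its chain.
    void insert(String key, int value) {
        int h = hash(key);
        int i = h % buckets.length;
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.hash == h && n.key.equals(key)) {
                n.value = value;
                return;
            }
        }
        buckets[i] = new Node(key, h, value, buckets[i]);
    }

    // Returns the stored value, or null if the key is absent.
    Integer lookup(String key) {
        int h = hash(key);
        for (Node n = buckets[h % buckets.length]; n != null; n = n.next) {
            if (n.hash == h && n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    // Relink every node into a larger array, reusing the cached hash codes.
    void rehash(int newCapacity) {
        Node[] old = buckets;
        buckets = new Node[newCapacity];
        for (Node first : old) {
            Node n = first;
            while (n != null) {
                Node next = n.next;          // remember the rest of the old chain
                int i = n.hash % newCapacity;
                n.next = buckets[i];         // push onto the front of the new chain
                buckets[i] = n;
                n = next;
            }
        }
    }

    // Keys in bucket i, front of the chain first.
    List<String> chain(int i) {
        List<String> keys = new ArrayList<>();
        for (Node n = buckets[i]; n != null; n = n.next) {
            keys.add(n.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        ChainDemo t = new ChainDemo(5);
        t.insert("foo", 1);
        t.insert("bar", 7);
        t.insert("bletch", 2);
        t.insert("mumble", 3);
        t.insert("chaining", 0);
        System.out.println(t.chain(1) + " " + t.chain(3));
        t.rehash(11);
        System.out.println(t.chain(3) + " " + t.chain(6) + " " + t.chain(8));
    }
}
```

Running main prints the chains before and after the rehash:

[mumble, bletch] [chaining, bar, foo]
[foo, bar] [bletch, mumble] [chaining]

Caching the hash code in each node is what lets rehash() relink nodes
without calling hash() again; with a bigger array the length-based hash
still leaves two collisions.
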
HASH TABLES - SEPARATE CHAINING

     +----+
  [0]|    |
     |----|
  [1]|    |
     |----|
  [2]|    |
     |----|     +-------------+     +---------+       +---------+
  [3]|  ---\    |"chaining"| 8|     |"bar" | 3|  /--->|"foo" | 3|
     |----| \-->|-------------| /-->|---------| /     |---------|
  [4]|    |     | 0 |  --------/    | 7 |  ----/      | 1 | null|
     +----+     +-------------+     +---------+       +---------+

   0 .
   1 .
   2 .
   3 .
   4 .
   5 .      +------------+     +------------+
   6 .----->|"bletch"| 6 |     |"mumble"| 6 |
   7 .      |------------| /-->|------------|
   8 .      | 2 |  -------/    | 3 |  null  |
   9 .      +------------+     +------------+
  10 .

HASH TABLES - SEPARATE CHAINING

     +----+
  [0]|    |
     |----|
  [1]|    |
     |----|
  [2]|    |
     |----|                         +---------+       +---------+
  [3]|  ---\                        |"bar" | 3|  /--->|"foo" | 3|
     |----| \---------------------->|---------| /     |---------|
  [4]|    |                         | 7 |  ----/      | 1 | null|
     +----+                         +---------+       +---------+

   0 .
   1 .
   2 .
   3 .
   4 .
   5 .      +------------+     +------------+     +-------------+
   6 .----->|"bletch"| 6 |     |"mumble"| 6 |     |"chaining"| 8|
   7 .      |------------| /-->|------------|     |-------------|
   8 .-\    | 2 |  -------/    | 3 |  null  |     | 0 |   null  |
   9 .  \   +------------+     +------------+     +-------------+
  10 .   \-----------------------------------------/

HASH TABLES - SEPARATE CHAINING

     +----+
  [0]|    |
     |----|
  [1]|    |
     |----|
  [2]|    |
     |----|                         +---------+
  [3]|  --------------------------->|"foo" | 3|
     |----|                         |---------|
  [4]|    |                         | 1 | null|
     +----+                         +---------+

   0 .      +---------+
   1 .      |"bar" | 3|
   2 .  /-->|---------|
   3 .-/    | 7 | null|
   4 .      +---------+
   5 .      +------------+     +------------+     +-------------+
   6 .----->|"bletch"| 6 |     |"mumble"| 6 |     |"chaining"| 8|
   7 .      |------------| /-->|------------|     |-------------|
   8 .-\    | 2 |  -------/    | 3 |  null  |     | 0 |   null  |
   9 .  \   +------------+     +------------+     +-------------+
  10 .   \-----------------------------------------/

HASH TABLES - SEPARATE CHAINING

     +----+
  [0]|    |
     |----|
  [1]|    |
     |----|
  [2]|    |
     |----|
  [3]|    |
     |----|
  [4]|    |
     +----+

   0 .      +---------+     +---------+
   1 .      |"foo" | 3|     |"bar" | 3|
   2 .  /-->|---------| /-->|---------|
   3 .-/    | 1 |  ----/    | 7 | null|
   4 .      +---------+     +---------+
   5 .
            +------------+     +------------+     +-------------+
   6 .----->|"bletch"| 6 |     |"mumble"| 6 |     |"chaining"| 8|
   7 .      |------------| /-->|------------|     |-------------|
   8 .-\    | 2 |  -------/    | 3 |  null  |     | 0 |   null  |
   9 .  \   +------------+     +------------+     +-------------+
  10 .   \-----------------------------------------/

HASH TABLES - SEPARATE CHAINING

Now 5 elements in an array of 11, with already 2 collisions..
A big array won't fix a pitiful hash function..

   0 .      +---------+     +---------+
   1 .      |"foo" | 3|     |"bar" | 3|
   2 .  /-->|---------| /-->|---------|
   3 .-/    | 1 |  ----/    | 7 | null|
   4 .      +---------+     +---------+
   5 .      +------------+     +------------+     +-------------+
   6 .----->|"bletch"| 6 |     |"mumble"| 6 |     |"chaining"| 8|
   7 .      |------------| /-->|------------|     |-------------|
   8 .-\    | 2 |  -------/    | 3 |  null  |     | 0 |   null  |
   9 .  \   +------------+     +------------+     +-------------+
  10 .   \-----------------------------------------/

HASH FUNCTIONS

..
private static int hash1(String p) {
  int val = 0;
  for (int i = 0; i