[Data_Structer] Hash Table

Hash table & BST(Binary Search Tree)


โญโญBST

[ํ•ต์‹ฌ]

์ด์ง„ํƒ์ƒ‰ํŠธ๋ฆฌ(Binary Search Tree; BST)๋Š” ์ •๋ ฌ๋œ tree.

์–ด๋Š node๋ฅผ ์„ ํƒํ•˜๋“  ํ•ด๋‹น node์˜ left subtree์—๋Š” ๊ทธ node์˜ ๊ฐ’๋ณด๋‹ค ์ž‘์€ ๊ฐ’๋“ค์„ ์ง€๋‹Œ node๋“ค๋กœ๋งŒ ์ด๋ฃจ์–ด์ ธ ์žˆ๊ณ , node์˜ right subtree์—๋Š” ๊ทธ node์˜ ๊ฐ’๋ณด๋‹ค ํฐ ๊ฐ’๋“ค์„ ์ง€๋‹Œ node๋“ค๋กœ๋งŒ ์ด๋ฃจ์–ด์ ธ ์žˆ๋Š” binary tree

๊ฒ€์ƒ‰๊ณผ ์ €์žฅ, ์‚ญ์ œ์˜ ์‹œ๊ฐ„๋ณต์žก๋„๋Š” ๋ชจ๋‘ O(logn)์ด๊ณ , worst case๋Š” ํ•œ์ชฝ์œผ๋กœ ์น˜์šฐ์นœ tree๊ฐ€ ๋์„ ๋•Œ O(n)

[ ๐Ÿฏย TIP]

๐Ÿ’ก BST๋Š” ์ €์žฅ๊ณผ ๋™์‹œ์— ์ •๋ ฌ์„ ํ•˜๋Š” ์ž๋ฃŒ๊ตฌ์กฐ ์ƒˆ๋กœ์šด ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•  ๋•Œ ์ผ์ •ํ•œ ๊ทœ์น™์— ๋”ฐ๋ผ ์ €์žฅ.

BST

image

image

BST ์กฐ๊ฑด

  1. root node์˜ ๊ฐ’๋ณด๋‹ค ์ž‘์€ ๊ฐ’์€ left subtree์—, ํฐ ๊ฐ’์€ right subtree์— ์žˆ๋‹ค.
  2. subtree๋„ 1๋ฒˆ ์กฐ๊ฑด์„ ๋งŒ์กฑํ•œ๋‹ค.(Recursive)

image

image

search O(logn)

โญโญ Hash table

hash table์€ ํšจ์œจ์ ์ธ ํƒ์ƒ‰(๋น ๋ฅธ ํƒ์ƒ‰)์„ ์œ„ํ•œ ์ž๋ฃŒ๊ตฌ์กฐ๋กœ์จ key-value์Œ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์ž…๋ ฅ. hash function $h$์— key๊ฐ’์„ ์ž…๋ ฅ์œผ๋กœ ๋„ฃ์–ด ์–ป์€ ํ•ด์‹œ๊ฐ’ h(k)๋ฅผ ์œ„์น˜๋กœ ์ง€์ •ํ•˜์—ฌ key- value ๋ฐ์ดํ„ฐ ์Œ์„ ์ €์žฅ. ์ €์žฅ, ์‚ญ์ œ, ๊ฒ€์ƒ‰์˜ ์‹œ๊ฐ„๋ณต์žก๋„๋Š” ๋ชจ๋‘ O(1)

[ ๐Ÿฏย TIP]

์ข‹์€ hash function์˜ ์กฐ๊ฑด.

์ข‹์€ hash function์˜ ํ•ต์‹ฌ์ ์ธ ์กฐ๊ฑด์€ ํ•ด์‹œ๊ฐ’์ด ๊ณ ๋ฅด๊ฒŒ ๋ถ„ํฌ๋˜๊ฒŒ ํ•˜๋Š” ๊ฒƒ

Direct-address Table

Direct-address Table(์ง์ ‘ ์ฃผ์†Œํ™” ํ…Œ์ด๋ธ”)์ด๋ž€, key ๊ฐ’์œผ๋กœ k๋ฅผ ๊ฐ–๋Š” ์›์†Œ๋Š” index k์— ์ €์žฅํ•˜๋Š” ๋ฐฉ์‹

key: ์ถœ์„๋ฒˆํ˜ธ, value: ์ด๋ฆ„

(3, ๋…ธ์ •ํ˜ธ)
(5, ๋ฐฐ์ค€์„)
(6, ์ •์žฌํ—Œ)
(7, ๋‚จ์˜์šฑ)

์ง์ ‘ ์ฃผ์†Œํ™” ๋ฐฉ๋ฒ•์œผ๋กœ ํ†ตํ•ด key-value ์Œ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•˜๊ณ ์ž ํ•˜๋ฉด ๋งŽ์€ ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒ

  • ๋ถˆํ•„์š”ํ•œ ๊ณต๊ฐ„ ๋‚ญ๋น„
key: ํ•™๋ฒˆ, value: ์ด๋ฆ„

(2022390, ๋…ธ์ •ํ˜ธ)
(2022392, ๋ฐฐ์ค€์„)
(2022393, ์ •์žฌํ—Œ)
(2022401, ๋‚จ์˜์šฑ)

  • key๊ฐ€ ๋‹ค์–‘ํ•œ ์ž๋ฃŒํ˜•์„ ๋‹ด์„ ์ˆ˜ ์—†๊ฒŒ ๋จ
key: ID, value: ์ด๋ฆ„

(nossi8128, ๋…ธ์ •ํ˜ธ)
(js9876, ๋ฐฐ์ค€์„)
(zebra001, ์ •์žฌํ—Œ)
(nam1234, ๋‚จ์˜์šฑ)

Hash table

(key, value) ๋ฐ์ดํ„ฐ ์Œ์„ ์ €์žฅํ•˜๊ธฐ ์œ„ํ•œ ๋ฐฉ๋ฒ•.

hash table์€ hash function h๋ฅผ ์ด์šฉํ•ด์„œ (key, value)๋ฅผ index: h(k)์— ์ €์žฅ.

์ด ๋•Œ, โ€œํ‚ค k๊ฐ’์„ ๊ฐ–๋Š” ์›์†Œ๊ฐ€ ์œ„์น˜ h(k)์— hash๋œ๋‹ค.โ€ ๋˜๋Š” โ€œh(k)๋Š” ํ‚ค k์˜ ํ•ด์‹œ๊ฐ’์ด๋‹คโ€๋ผ๊ณ  ํ‘œํ˜„. key๋Š” ๋ฌด์กฐ๊ฑด ์กด์žฌํ•ด์•ผ ํ•˜๋ฉฐ, ์ค‘๋ณต๋˜๋Š” key๊ฐ€ ์žˆ์–ด์„œ๋Š” ์•ˆ๋œ๋‹ค.

ํ•œํŽธ, hash table์„ ๊ตฌ์„ฑํ•˜๊ณ  ์žˆ๋Š”, (key, value)๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•  ์ˆ˜ ์žˆ๋Š” ๊ฐ๊ฐ์˜ ๊ณต๊ฐ„์„ slot ๋˜๋Š” bucket์ด๋ผ ํ•œ๋‹ค.

Collision

collision์ด๋ž€ ์„œ๋กœ ๋‹ค๋ฅธ key์˜ ํ•ด์‹œ๊ฐ’์ด ๋˜‘๊ฐ™์„ ๋•Œ๋ฅผ ๋งํ•œ๋‹ค.

์ฆ‰, ์ค‘๋ณต๋˜๋Š” key๋Š” ์—†์ง€๋งŒ ํ•ด์‹œ๊ฐ’์€ ์ค‘๋ณต๋  ์ˆ˜ ์žˆ๋Š”๋ฐ ์ด ๋•Œ collision์ด ๋ฐœ์ƒํ–ˆ๋‹ค๊ณ  ํ•œ๋‹ค.

๋”ฐ๋ผ์„œ collision์ด ์ตœ๋Œ€ํ•œ ์ ๊ฒŒ ๋‚˜๋„๋ก hash function์„ ์ž˜ ์„ค๊ณ„ํ•ด์•ผํ•˜๊ณ , ์–ด์ฉ” ์ˆ˜ ์—†์ด collision์ด ๋ฐœ์ƒํ•˜๋Š” ๊ฒฝ์šฐ seperate chaining ๋˜๋Š” open addressing๋“ฑ์˜ ๋ฐฉ๋ฒ•์„ ์‚ฌ์šฉํ•˜์—ฌ ํ•ด๊ฒฐํ•ด์•ผ ํ•œ๋‹ค.

์‹œ๊ฐ„๋ณต์žก๋„์™€ ๊ณต๊ฐ„ํšจ์œจ

์‹œ๊ฐ„๋ณต์žก๋„๋Š” ์ €์žฅ, ์‚ญ์ œ, ๊ฒ€์ƒ‰ ๋ชจ๋‘ ๊ธฐ๋ณธ์ ์œผ๋กœ O(1)์ด์ง€๋งŒ, collision์œผ๋กœ ์ธํ•˜์—ฌ ์ตœ์•…์˜ ๊ฒฝ์šฐ O(n)์ด ๋  ์ˆ˜ ์žˆ๋‹ค.

๊ณต๊ฐ„ํšจ์œจ์„ฑ์€ ๋–จ์–ด์ง„๋‹ค. ๋ฐ์ดํ„ฐ๊ฐ€ ์ €์žฅ๋˜๊ธฐ ์ „์— ๋ฏธ๋ฆฌ ์ €์žฅ๊ณต๊ฐ„(slot, bucket)์„ ํ™•๋ณดํ•ด์•ผ ํ•˜๊ธฐ ๋•Œ๋ฌธ.

๋”ฐ๋ผ์„œ ์ €์žฅ๊ณต๊ฐ„์ด ๋ถ€์กฑํ•˜๊ฑฐ๋‚˜ ์ฑ„์›Œ์ง€์ง€ ์•Š์€ ๋ถ€๋ถ„์ด ๋งŽ์€ ๊ฒฝ์šฐ๊ฐ€ ์ƒ๊ธธ ์ˆ˜ ์žˆ๋‹ค.

โญโญโญโญ Hash table์—์„œ collision์ด ๋ฐœ์ƒํ•  ๋•Œ ํ•ด๊ฒฐ๋ฐฉ๋ฒ•

[ํ•ต์‹ฌ ๋‹ต๋ณ€]

collision์ด ๋ฐœ์ƒํ•  ๊ฒฝ์šฐ ๋Œ€ํ‘œ์ ์œผ๋กœ 2๊ฐ€์ง€ ๋ฐฉ๋ฒ•์œผ๋กœ ํ•ด๊ฒฐ

์ฒซ ๋ฒˆ์งธ, open addressing ๋ฐฉ์‹์€ collision์ด ๋ฐœ์ƒํ•˜๋ฉด ๋ฏธ๋ฆฌ ์ •ํ•œ ๊ทœ์น™์— ๋”ฐ๋ผ hash table์˜ ๋น„์–ด์žˆ๋Š” slot์„ ์ฐพ๋Š”๋‹ค.

๋นˆ slot์„ ์ฐพ๋Š” ๋ฐฉ๋ฒ•์— ๋”ฐ๋ผ ํฌ๊ฒŒ Linear Probing, Quadratic Probing, Double Hashing์œผ๋กœ ๋‚˜๋ˆˆ๋‹ค.

๋‘ ๋ฒˆ์งธ, separete chaining ๋ฐฉ์‹์€ linked list๋ฅผ ์ด์šฉํ•œ๋‹ค. ๋งŒ์•ฝ์— collision์ด ๋ฐœ์ƒํ•˜๋ฉด linked list์— ๋…ธ๋“œ(slot)๋ฅผ ์ถ”๊ฐ€ํ•˜์—ฌ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•œ๋‹ค.

๐Ÿ’ก hashtable์—์„œ collision์ด ๋ฐœ์ƒํ–ˆ์„ ๋•Œ, seperate chaining๊ณผ open addressing ๋‘ ๊ฐ€์ง€ ๋ฐฉ์‹์œผ๋กœ ํ•ด๊ฒฐ์„ ํ•œ๋‹ค.

๋‘ ๊ฐ€์ง€ ๋ฐฉ๋ฒ•์ด ์–ด๋–ค ๋ฉ”์ปค๋‹ˆ์ฆ˜์œผ๋กœ ์ž‘๋™ํ•˜๋Š”์ง€, ์–ด๋–ค ์ฐจ์ด์ ์ด ์žˆ๋Š”์ง€ ์ดํ•ด

๋˜ํ•œ seperate chaining์˜ ๊ฒฝ์šฐ worst case์˜ ์‹œ๊ฐ„๋ณต์žก๋„์— ๋Œ€ํ•ด์„œ ์„ค๋ช…ํ•  ์ค„ ์•Œ์•„์•ผ ํ•œ๋‹ค.

์‚ฝ์ž…๊ณผ ๊ฒ€์ƒ‰, ์‚ญ์ œ์˜ ์‹œ๊ฐ„๋ณต์žก๋„๋Š” O(1)์ด์ง€๋งŒ, worst case์˜ ๊ฒฝ์šฐ์— O(n)๊ฐ€ ๋  ์ˆ˜ ์žˆ๋‹ค.

Open addressing

open addressing ๋ฐฉ์‹์€ collision์ด ๋ฐœ์ƒํ•˜๋ฉด ๋ฏธ๋ฆฌ ์ •ํ•œ ๊ทœ์น™์— ๋”ฐ๋ผ hash table์˜ ๋น„์–ด์žˆ๋Š” slot์„ ์ฐพ๋Š”๋‹ค.

์ถ”๊ฐ€์ ์ธ ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ์‚ฌ์šฉํ•˜์ง€ ์•Š์œผ๋ฏ€๋กœ linked list ๋˜๋Š” tree์ž๋ฃŒ๊ตฌ์กฐ๋ฅผ ํ†ตํ•ด ์ถ”๊ฐ€๋กœ ๋ฉ”๋ชจ๋ฆฌ ํ• ๋‹น์„ ํ•˜๋Š” separate chaining๋ฐฉ์‹์— ๋น„ํ•ด ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ์ ๊ฒŒ ์‚ฌ์šฉํ•œ๋‹ค.

open addressing์€ ๋นˆ slot์„ ์ฐพ๋Š” ๋ฐฉ๋ฒ•์— ๋”ฐ๋ผ ํฌ๊ฒŒ Linear Probing, Quadratic Probing, Double Hashing์œผ๋กœ ๋‚˜๋‰œ๋‹ค.

Linear Probing(์„ ํ˜• ์กฐ์‚ฌ๋ฒ•)& Quadratic Probing(์ด์ฐจ ์กฐ์‚ฌ๋ฒ•) : ์„ ํ˜• ์กฐ์‚ฌ๋ฒ•์€ ์ถฉ๋Œ์ด ๋ฐœ์ƒํ•œ ํ•ด์‹œ๊ฐ’์œผ๋กœ ๋ถ€ํ„ฐ ์ผ์ •ํ•œ ๊ฐ’๋งŒํผ(+1, +2, +3, โ€ฆ) ๊ฑด๋„ˆ ๋›ฐ์–ด, ๋น„์–ด ์žˆ๋Š” slot์— ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•œ๋‹ค.

์ด์ฐจ ์กฐ์‚ฌ๋ฒ•์€ ์ œ๊ณฑ์ˆ˜(+1^2, +2^2, +3^2, โ€ฆ)๋กœ ๊ฑด๋„ˆ ๋›ฐ์–ด, ๋น„์–ด ์žˆ๋Š” slot์„ ์ฐพ๋Š”๋‹ค.

์ถฉ๋Œ์ด ์—ฌ๋Ÿฌ๋ฒˆ ๋ฐœ์ƒํ•˜๋ฉด ์—ฌ๋Ÿฌ๋ฒˆ ๊ฑด๋„ˆ ๋›ฐ์–ด ๋นˆ slot์„ ์ฐพ๋Š”๋‹ค.

์„ ํ˜• ์กฐ์‚ฌ๋ฒ•๊ณผ ์ด์ฐจ ์กฐ์‚ฌ๋ฒ•์˜ ๊ฒฝ์šฐ ์ถฉ๋Œ ํšŸ์ˆ˜๊ฐ€ ๋งŽ์•„์ง€๋ฉด ํŠน์ • ์˜์—ญ์— ๋ฐ์ดํ„ฐ๊ฐ€ ์ง‘์ค‘์ ์œผ๋กœ ๋ชฐ๋ฆฌ๋Š” ํด๋Ÿฌ์Šคํ„ฐ๋ง(clustering)ํ˜„์ƒ์ด ๋ฐœ์ƒํ•˜๋Š” ๋‹จ์ 

ํด๋Ÿฌ์Šคํ„ฐ๋ง ํ˜„์ƒ์ด ๋ฐœ์ƒํ•˜๋ฉด, ํ‰๊ท  ํƒ์ƒ‰ ์‹œ๊ฐ„์ด ์ฆ๊ฐ€

  • Double Hashing(์ด์ค‘ํ•ด์‹œ, ์ค‘๋ณตํ•ด์‹œ) : ์ด์ค‘ ํ•ด์‹ฑ์€ open addressing ๋ฐฉ์‹์„ ํ†ตํ•ด collision์„ ํ•ด๊ฒฐํ•  ๋•Œ, probingํ•˜๋Š” ๋ฐฉ์‹์ค‘์— ํ•˜๋‚˜

linear probing์ด๋‚˜ quadratic probing์„ ํ†ตํ•ด ํƒ์‚ฌํ•  ๋•Œ๋Š” ํƒ์‚ฌ์ด๋™ํญ์ด ๊ฐ™๊ธฐ ๋•Œ๋ฌธ์— ํด๋Ÿฌ์Šคํ„ฐ๋ง ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ๋‹ค.

ํด๋Ÿฌ์Šคํ„ฐ๋ง ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒํ•˜์ง€ ์•Š๋„๋ก 2๊ฐœ์˜ ํ•ด์‹œํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ์‹์„ ์ด์ค‘ ํ•ด์‹ฑ์ด๋ผ๊ณ  ํ•œ๋‹ค. ํ•˜๋‚˜๋Š” ์ตœ์ดˆ์˜ ํ•ด์‹œ๊ฐ’์„ ์–ป์„ ๋•Œ ์‚ฌ์šฉํ•˜๊ณ  ๋˜ ๋‹ค๋ฅธ ํ•˜๋‚˜๋Š” ํ•ด์‹œ ์ถฉ๋Œ์ด ๋ฐœ์ƒํ•  ๋•Œ ํƒ์‚ฌ ์ด๋™ํญ์„ ์–ป๊ธฐ ์œ„ํ•ด ์‚ฌ์šฉํ•œ๋‹ค.

Separate chaining

Separate chaining ๋ฐฉ์‹์€ linked list(๋˜๋Š” Tree)๋ฅผ ์ด์šฉํ•˜์—ฌ collision์„ ํ•ด๊ฒฐํ•ฉ๋‹ˆ๋‹ค. ์ถฉ๋Œ์ด ๋ฐœ์ƒํ•˜๋ฉด linked list์— ๋…ธ๋“œ(slot)๋ฅผ ์ถ”๊ฐ€ํ•˜์—ฌ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅ

  • ์‚ฝ์ž…: ์„œ๋กœ ๋‹ค๋ฅธ ๋‘ key๊ฐ€ ๊ฐ™์€ ํ•ด์‹œ๊ฐ’์„ ๊ฐ–๊ฒŒ ๋˜๋ฉด linked list์— node๋ฅผ ์ถ”๊ฐ€ํ•˜์—ฌ (key, value) ๋ฐ์ดํ„ฐ ์Œ์„ ์ €์žฅ. ์‚ฝ์ž…์˜ ์‹œ๊ฐ„๋ณต์žก๋„๋Š” O(1)
  • ๊ฒ€์ƒ‰: ๊ธฐ๋ณธ์ ์œผ๋กœ O(1)์˜ ์‹œ๊ฐ„๋ณต์žก๋„ ์ด์ง€๋งŒ ์ตœ์•…์˜ ๊ฒฝ์šฐ O(n)์˜ ์‹œ๊ฐ„๋ณต์žก๋„๋ฅผ ๊ฐ–๋Š”๋‹ค.
  • ์‚ญ์ œ: ์‚ญ์ œ๋ฅผ ํ•˜๊ธฐ ์œ„ํ•ด์„  ๊ฒ€์ƒ‰์„ ๋จผ์ € ํ•ด์•ผํ•˜๋ฏ€๋กœ ๊ฒ€์ƒ‰์˜ ์‹œ๊ฐ„๋ณต์žก๋„์™€ ๋™์ผ

๊ธฐ๋ณธ์ ์œผ๋กœ $O(1)$์ด์ง€๋งŒ ์ตœ์•…์˜ ๊ฒฝ์šฐ O(n)์˜ ์‹œ๊ฐ„๋ณต์žก๋„ ๊ฐ€์ง„๋‹ค.

worst case์˜ ๊ฒฝ์šฐ n๊ฐœ์˜ ๋ชจ๋“  key๊ฐ€ ๋™์ผํ•œ ํ•ด์‹œ๊ฐ’์„ ๊ฐ–๊ฒŒ ๋˜๋ฉด ๊ธธ์ด n์˜ linked list๊ฐ€ ์ƒ์„ฑ๋˜๊ฒŒ ๋œ๋‹ค. ์ด ๋•Œ, ๊ฒ€์ƒ‰์˜ ์‹œ๊ฐ„๋ณต์žก๋„๊ฐ€ O(n)

์ฐธ๊ณ : seperate chaining์€ ๊ธฐ๋ณธ์ ์œผ๋กœ linked list๋ฅผ ์ด์šฉํ•˜์—ฌ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•˜์ง€๋งŒ, collision์ด ๋งŽ์ด ๋ฐœ์ƒํ•˜์—ฌ linked list์˜ ๊ธธ์ด๊ฐ€ ๊ธธ์–ด์ง€๋ฉด Binary Search Tree(BST)์ž๋ฃŒ๊ตฌ์กฐ๋ฅผ ์ด์šฉํ•˜์—ฌ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•˜๊ธฐ๋„ ํ•จ.

BST๋ฅผ ์‚ฌ์šฉํ•จ์œผ๋กœ์จ ๊ฒ€์ƒ‰์˜ worst case ์‹œ๊ฐ„๋ณต์žก๋„๋ฅผ O(n) ์—์„œ O(logn)์œผ๋กœ ๋‚ฎ์ถœ ์ˆ˜ ์žˆ๋‹ค.

์ฐธ์กฐ : ๊ฐœ๋ฐœ๋‚จ๋…ธ์”จ ์ž๋ฃŒ๊ตฌ์กฐ

Leave a comment