data_structure) implementing binary search tree and searching word algorithm ~!!
Hello~~ I'm lulu!
Today, I'm going to show how to implement binary search tree and searching word algorithm using a hundred thousand value which is from Shakespeare's works and lines.
Binary search tree is effective when we have to insert, delete, search a lot. I think it is really well balanced because the left side of the tree is smaller than current node, and right side is bigger than current node.
Let's see how we can implement this beautiful algorithm using python.
Today, I'm going to show how to implement binary search tree and searching word algorithm using a hundred thousand value which is from Shakespeare's works and lines.
Binary search tree is effective when we have to insert, delete, search a lot. I think it is really well balanced because the left side of the tree is smaller than current node, and right side is bigger than current node.
Let's see how we can implement this beautiful algorithm using python.
import random
import pandas as pd
class BinarySearchTree:
class Node:
def __init__(self, key, value, left, right):
self.key = key
self.value = [value]
self.left = left
self.right = right
def __init__(self):
self.root = None
self.key_count = 0
self.value_count = 0
def insert(self, key, value):
self.root = self.__insert(self.root, key, value)
self.value_count += 1
def __insert(self, node, key, value):
if node:
if node.key > key:
node.left = self.__insert(node.left, key, value)
elif node.key < key:
node.right = self.__insert(node.right, key, value)
elif node.key == key:
node.value.append(value)
else:
node = self.Node(key, value, None, None)
self.key_count += 1
# Tree์ ๋
ธ๋์ key, value๋ฅผ insertํ๋ ์๊ณ ๋ฆฌ์ฆ์
๋๋ค. / we are inserting
key and value into Tree's node.
# ์๋ก์ด ํค๊ฐ ์ถ๊ฐ๋๋ฉด key_count๋ ์ฆ๊ฐ์ํต๋๋ค. (๋์ผํ ํค๊ฐ ๋ค์ด์ค๋ ๊ฒฝ์ฐ๋
์ฆ๊ฐ ์ํจ) when new key is added,
#key_count also increase. (when same key is added, key_count
does not increase.)
return node
def in_order(self):
for x in self.__in_order(self.root):
yield x
def __in_order(self, node):
if not node:
return
if node.left:
for n in self.__in_order(node.left):
yield n
yield node
if node.right:
for n in self.__in_order(node.right):
yield n
def find(self, key):
return self.__find(self.root, key)
def __find(self, node, key):
if node:
if node.key == key:
return node.value
elif node.key > key:
return self.__find(node.left, key)
elif node.key < key:
return self.__find(node.right, key)
else:
return []
# Tree์ ๋
ธ๋์ key, value๋ฅผ insertํ๋ ์๊ณ ๋ฆฌ์ฆ์
๋๋ค. we are inserting key
and value into a Tree's node.
# ํค๊ฐ ์กด์ฌํ์ง ์์ผ๋ฉด [] (๋น ๋ฆฌ์คํธ)๋ฅผ ๋ฐํํฉ๋๋ค. when the key does not exist,
it returns [](empty list)
# ํค๊ฐ ์กด์ฌํ๋ฉด node.value๋ฅผ ๋ฐํํฉ๋๋ค. when the key does exist,
it returns node.value
def height(self):
return self.__height(self.root)
def __height(self, node):
if node is None:
return 0
else:
return max(self.__height(node.left), self.__height(node.right)) + 1
# ๋
ธ๋์ ๋์ด๋ฅผ ๋ฐํํ๋ ์๊ณ ๋ฆฌ์ฆ์
๋๋ค. This is a algorithm which returns
node's height
def print_stat(tree):
print(f"Height = {tree.height()}")
print("{:,} keys / {:,} values".format(tree.key_count, tree.value_count))
if __name__ == "__main__":
#-------------------------------------------------
# # ๊ฐ๋จํ BST ํ
์คํธ a simple BST(Binary Search Tree) Test
tree = BinarySearchTree()
print_stat(tree)
tree.insert(5, "F")
print_stat(tree)
tree.insert(3, "E")
tree.insert(8, "D")
tree.insert(6, "C")
print_stat(tree)
tree.insert(7, "B")
tree.insert(1, "A")
for x in tree.in_order():
print(f"{x.key}:{x.value}", end="->")
print()
print("Key=6", tree.find(6))
print("Key=4", tree.find(4))
print("Key=1", tree.find(1))
print("Key=5", tree.find(5))
print("Key=9", tree.find(9))
print_stat(tree)
#-------------------------------------------------
# # 10๋ง๊ฐ ๋ฐ์ดํฐ ๋ฃ๋ ์คํ This is a experiment inserting a hundred
thousand data
t = BinarySearchTree()
for i in range(100000):
k = random.randint(0, 100000)
t.insert(k, f"DATA {i}")
print_stat(t)
#-------------------------------------------------
# ์ธ์ต์คํผ์ด์ ํฌ๊ณก์์ ๋จ์ด๋ฅผ ๊ฒ์ํ๋ ์คํ Searching a word in a
Shakespeare's works and lines
# ๋จ์ด๋ฅผ ์
๋ ฅํ๋ฉด ํด๋น ๋จ์ด๊ฐ ๋์ค๋ ํฌ๊ณก์ ๋ผ์ธ์ ๊ฒ์ํด์ค. When we input a word,
it searches the word and lines
bst = BinarySearchTree()
#csv ํ์ผ์ ์ฝ์ด๋ค์ธ๋ค. read csv file
df = pd.read_csv('./Shakespeare_data.csv', dtype=str)
print("Indexing all data ...")
for index, row in df.iterrows():
#๋์ฌ๋ฅผ ๋จ์ด๋จ์๋ก ์๋ผ keys ๋ฆฌ์คํธ์ ์ ์ฅํ๊ณ , ์ ์ฒด ํ์ ํฉ์ณ value์ ์ ์ฅํ๋ค.
break lines into word
# and store it in a keys list, and then bring all the lines into value
keys = [x.upper() for x in row[5].split(" ")]
value = " ".join([x for x in row if str(x) != 'nan'])
for key in keys:
# ๊ฐ key, value๋ฅผ ํธ๋ฆฌ์ ์ ์ฅํ๋ค. Store each key and value in a Tree
bst.insert(key, value)
print("Done")
print_stat(bst)
while True:
#์ฌ์ฉ์์๊ฒ keyword๋ฅผ ์
๋ ฅ๋ฐ์์ ํธ๋ฆฌ์์ ์ฐพ์์ค๋ค. Let user
input a keyword and then find in a Tree
keyword = input(">> Enter keyword ('\\q' for quit) : ")
if keyword == '\\q':
break
result = bst.find(keyword.upper())
print("\n".join(result))
print("--- Found {} lines".format(len(result)))
If you have any questions or any opinions, tell me you'll be welcomed :-)
Thank you :-)
Titanium Wok Poker - Play Online at TITaniumArt
ReplyDeleteTITanium Wok is a great titanium white octane blueprint poker table with revlon hair dryer brush titanium a nice titanium symbol layout, and is very similar medical grade titanium earrings to many others made in the titanium alloy same design.